prython in a nutshell
What's prython for?
It's an IDE that allows you to code in R or Python using panels that can be connected in a canvas.
You can use it in two ways:
- In the connected mode, each panel can accept multiple IN and OUT connections. You can run each panel in three ways: just that panel, everything up to that panel (running everything that serves as an input to it), or everything after that panel (running everything that uses the code from this panel). Every time you press run in any of these three ways, a new Python/R session is created and all variables are evaluated again.
- In the free mode, each panel becomes independent of the rest. However, all panels belong to the same R or Python session. This makes development faster, as you don't need to rerun each panel every time you need to test something. However, it makes your project less organised, and it can be hard to remember the order in which the panels were executed.
Here you can see how a project looks in connected mode (left), with all its connections. On the right, you can see the same project with the free mode activated.
Data professionals need to experiment with their data, build multiple plots, and separate the code into different areas. They rarely want to have a single linear script that runs from start to end.
This almost inevitably leads to very messy scripts, unclear outputs, multiplicity of confusing plots, and users needing to remember what needs to be commented out to test something. No other IDE is well suited for this.
Check our video tutorials here
Why do you want to use it?
- To track and describe experiments and tests. Instead of remembering what needs to be commented out in a script to test a given change, you can keep each test in its own panel.
- To display your results and plots on a canvas where they can all be seen at the same time.
- To run complex tests on different models with a single click (for example, when you want to test several scikit-learn models at the same time).
- To split your code into different areas of the canvas, e.g. input loading in one part of the canvas, model training in another area, and plots/analysis in a different area.
- To mix Python and R code within the same project.
- To visualize how dataframes change and evolve in a script. prython computes all changes made to dataframes (in both R and Python) across panels, and whenever a dataframe is altered, it is shown as a table next to the corresponding panel.
Is it compatible with R or Python?
prython is currently only available for Windows. It needs R > 4.0 and Python > 3.7, and it is fully compatible with any R or Python package/library.
Workspace extras
Cleaning the outputs
Keeping a clean workspace is essential. You can clean the outputs (including all plots, run times, logs) by clicking on this
Cleaning the list of processes
The list of processes (which appears on the lower right part of the screen) can be cleaned by clicking on this or pressing F8.
The resulting list of processes will look like this
Installing prython
Just download the latest version from the downloads page here
Configuring R
Once R is installed, open prython and set the folder path to your R kernel. You can do this by clicking on the R button here
The following window will appear, and you will need to choose the path to your R binaries
Note that we need to select the folder where the R executable is located, usually the bin/x64 or bin/i386 folders
Configuring Python
If you plan to use Python panels, you also need to install pandas and numpy using pip install pandas and pip install numpy.
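Before pointing prython at a Python installation, it can help to confirm that both packages are importable from that interpreter. The following is a minimal sketch using only the standard library; the helper name missing_packages is hypothetical and is not part of prython.

```python
import importlib.util


def missing_packages(names):
    """Return the package names that cannot be found by the current interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Check the two packages the Python panels rely on.
missing = missing_packages(["pandas", "numpy"])
if missing:
    print("Install with: pip install " + " ".join(missing))
else:
    print("pandas and numpy are available")
```

Run it with the same interpreter (or Anaconda environment) that you later select in prython, since each environment has its own set of installed packages.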
Once Python or Anaconda is installed, open prython and set the folder path to your Python kernel. You can do this by clicking on the Python button here
The following Python window will appear, allowing you to choose the folder where the python executables are located. You need to define whether
you want to use an Anaconda based installation or plain Python. If you choose plain Python, a window will appear and you need to indicate the path to that installation. After that is done,
the process will finish.
However, if you choose Anaconda, a list of the available environments will appear. You will need to choose the one that you want to use.
Finally you will be prompted with a message telling you that Anaconda has been configured
Once you finish
Once this is done, you will see this. The corresponding R and Python versions are shown here.
Connectors
You can add connectors by simply clicking on the toolbox or pressing F3. Once you add one,
just click and drag either one of its circles and place it over the IN or OUT part of a panel. If you right-click on the connector, a cross will appear allowing you to remove it.
Markers
Markers are used to define areas of your project; you can drag them across the canvas to flag different areas such as input processing, model training, etc.
When you add one, two things are added: a white label that you can place on your canvas, and a label with a green caret that is fixed
in the lower left part of the screen. If you click on the green caret, the view jumps to the label's position.
If you right-click on the text label, an input box will appear that allows you to edit the text. A cross will appear as well, allowing you to remove the marker.
Screenshots
You can take a screenshot over a specific area of your project. You can then drag it wherever you want. To remove them, click on the red cross.
Notes
You can add notes and place them wherever you want in your canvas. They don't have any functionality.
Brackets
You can add brackets by simply clicking on the toolbox or pressing F7. These don't have any functionality, and are only used to organise your project.
On the left we have the bracket when we add it; if we right click on it, a cross appears that allows us to remove it. Also, two squares are displayed that allow you
to control its height
Frames
Frames can be added by clicking on the toolbox or by pressing F7. They are used to separate areas of your code. They have an embedded label that appears on top of it. You can adjust their size as you want. On the left, we created one that is wide enough to
keep two panels inside. In this case we are simply isolating two panels.
On the right you can see what happens when we do a right click: two squares appear that allow us to adjust the height and width;
a text input box also appears allowing us to change the text label. When you move these frames, whatever is inside them will also move with them
Panels
Panels are the fundamental element used to build prython pipelines.
You can choose whether you want a panel to be Python or R 1.
Panels are usually used in conjunction with connectors that link them to other panels. Each panel can be connected to another as an input or output.
Panels connected to the input part are executed before this panel runs. Panels connected to the output are executed after this panel runs.
You can then choose whether you want to run just that panel 2, everything leading to that panel 3, or everything after it (including the panel itself) 4.
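The three run options can be thought of as selections on a directed graph of panels. The sketch below is only an illustration of that idea, not prython's implementation: the function names, the mode labels ("single", "upstream", "downstream") and the panel names are all hypothetical.

```python
from collections import defaultdict


def build_graph(connections):
    """connections: (source, target) pairs, where source's OUT feeds target's IN."""
    children, parents = defaultdict(set), defaultdict(set)
    for src, dst in connections:
        children[src].add(dst)
        parents[dst].add(src)
    return children, parents


def reachable(start, neighbours):
    """All panels reachable from start by following neighbours (start excluded)."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for nxt in neighbours[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def panels_to_run(panel, mode, connections):
    """Select the panels involved in a run, for one of three hypothetical mode labels."""
    children, parents = build_graph(connections)
    if mode == "single":
        return {panel}
    if mode == "upstream":      # everything leading to the panel, plus the panel
        return reachable(panel, parents) | {panel}
    if mode == "downstream":    # the panel plus everything using its outputs
        return reachable(panel, children) | {panel}
    raise ValueError(f"unknown mode: {mode}")


# Hypothetical layout: A feeds B and D, B feeds C.
connections = [("A", "B"), ("B", "C"), ("A", "D")]
print(sorted(panels_to_run("B", "upstream", connections)))    # ['A', 'B']
print(sorted(panels_to_run("B", "downstream", connections)))  # ['B', 'C']
```

The sketch only selects which panels take part in a run; the order of execution still follows the IN/OUT connections.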
When plotting in Python using matplotlib, you need to call plt.show() in order to show the image in prython.
Editor
Next to it, you can activate the editor 5 that will appear on the right. It will do syntax highlighting based on whether the panel is R or Python
Hiding and showing panels
With 6 you can hide or show a panel. Hiding is almost equivalent to commenting out the code in that panel, except that the panel's connections are also temporarily disabled, meaning that everything that depends on its outputs won't be executed.
Freezing panels
You can use 7 to freeze whatever was printed as a result. This is especially useful when testing different things on a panel,
and you don't want to copy and paste the result into an external text editor in order to remember what was created. For example, here we saved two snapshots: the iris dataset, and the iris dataset after removing one observation. Finally, we printed the same iris dataset with two observations removed.
Console
Consoles can be attached by clicking on 9. Naturally, R panels will spawn an R console, and Python ones will spawn a Python one.
Inside each console you can access the environment up to the moment the script ran.
For example, suppose panel A is connected to the IN part of panel B (meaning that A "outputs" objects into B). If you attach a console to panel A, the environment will only
contain whatever exists in A. On the other hand, in panel B, you will have everything that was defined in both A and B.
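The relationship between the two environments can be modelled as snapshots of a growing namespace. This is a conceptual sketch only, assuming panels run in connection order and each panel's environment is captured right after it finishes; run_chain and the panel sources are hypothetical, and prython's actual storage mechanism is not described here.

```python
def run_chain(panels):
    """Run (name, source) pairs in order, snapshotting the namespace after each panel."""
    namespace = {}
    snapshots = {}
    for name, source in panels:
        exec(source, namespace)
        # Keep user-defined names only; exec adds __builtins__ to the namespace.
        snapshots[name] = {k: v for k, v in namespace.items() if not k.startswith("__")}
    return snapshots


# Panel A defines d; panel B consumes it and defines total.
snapshots = run_chain([("A", "d = [1, 2, 3]"), ("B", "total = sum(d)")])
print(sorted(snapshots["A"]))  # ['d']
print(sorted(snapshots["B"]))  # ['d', 'total']
```

A console attached to A corresponds to the first snapshot, and a console attached to B corresponds to the second.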
On the left you can see an R console that was attached. Note that the variable d
that was defined in it can be accessed via the console.
On the right we have a Python console that was attached. In a similar vein, every object created in that panel (or before it) can be accessed through it.
Not every object is saved for Python sessions. Only pandas dataframes, numpy arrays, lists, strings, numbers, dictionaries, sets, and dates are saved.
Future releases might extend this to other object types.
There is a natural overhead in saving the environments for each panel. In certain situations, this can also require a substantial
amount of disk space (for example when large dataframes are created). Hence, you can turn it off by clicking on the SAVE PANEL STATES button.
Dataframes and plots
Every data-frame created or modified within a panel (both for R and Python) is displayed next to the corresponding panel.
The same occurs for plots, and is done for both matplotlib in Python and ggplot/standard plots in R. You can click on 10 to activate or deactivate this capability.
For example, here we are creating a dataframe with random numbers in R, and doing one plot. The dataframe appears on the left. This panel is connected to
the IN part of the panel below, so we can access and modify the dataframe. Here we are creating a new column based on the existing column in that dataframe.
Note that the dataframe appears again on the left, because it was modified in the panel. The same is done for pandas dataframes.
When using python panels, numpy arrays are also tracked (shown on the right of each panel). You can easily see the dimensions for each one of them.
In the pro version, you can adjust the size of the images
Copying the outputs
You can create a live copy of the output window that can be placed anywhere 8. It can be used to see what's inside objects defined in other panels (for example after using a print() statement in R/Python) while you are coding in another panel.
It can also be used to compare multiple outputs from many models in the same place.
Progressbars
You can create progressbars when running R or Python code by calling a print statement with the following structure
progress=0.3 / progress_bar_name
It is automatically parsed to generate them. Check the example below in R
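In a Python panel, the same idea might look like the sketch below. It assumes the line must be printed exactly in the form shown above, with the value being a fraction between 0 and 1 followed by the bar name; the helper progress_line and the bar name "training" are hypothetical.

```python
import time


def progress_line(fraction, bar_name):
    """Format a progress message in the 'progress=<value> / <name>' structure."""
    return f"progress={fraction} / {bar_name}"


steps = 5
for step in range(1, steps + 1):
    time.sleep(0.01)  # stand-in for the real work done in each step
    print(progress_line(round(step / steps, 2), "training"))
```

Each printed line updates the bar with the given name, so several bars can be driven from the same panel by using different names.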
Extending the output window
If you hover the cursor over an output window it gets extended
Linked panels
Linked panels can be created by clicking on 11. They will contain the exact same code defined in the panel used to create them.
This is particularly useful when you want to test different inputs for a panel.
For example, you can create a series of plots, and even a machine learning model, inside a panel using raw data originating from a csv file.
Suppose you then want to test what happens when the input changes, coming either from another file or from a SQL database.
Instead of maintaining two copies of the code containing the plots and the model, you can create a linked panel and connect both panels to their input panels.
Note that a green dotted line is used to indicate that the panels are linked. All modifications made to the panel used to create the linked one will
be propagated to the linked one.
Removing panels
Panels can be removed by clicking on 12. It is not possible to revert this operation
Running
Free mode
In the free mode, there is only one run button. When pressed, the panel is executed and its outputs are appended to the current panel output. Any variable that was previously available in the R or Python session will continue to be available.
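The effect of a shared session can be illustrated with a single namespace that every panel writes into. This is a conceptual sketch, not how prython runs code: run_panel and the panel sources are hypothetical, and a plain dictionary stands in for the session.

```python
def run_panel(source, namespace):
    """Execute one panel's source inside the given namespace."""
    exec(source, namespace)


# Free mode: all panels share one session, so earlier variables stay available.
session = {}
run_panel("x = 10", session)
run_panel("y = x * 2", session)
print(session["y"])  # 20

# A fresh session (as created on each run in connected mode) has no x.
fresh = {}
try:
    run_panel("y = x * 2", fresh)
except NameError as err:
    print("fresh session:", err)
```

This is also why the order of execution matters in free mode: the second panel only works because the first one was run before it in the same session.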
Connected mode
In order to run a panel (see below - right), you can choose any one of the three modes (running only one panel 1, running everything up to that panel 2, running that
panel and everything using its outputs 3).
Let's now assume that we have the following (see below - left): if we go to Panel1 (top) and we click on run everything 3 , three scripts will be created
and three sub-processes will be spawned.
All panels connected together need to be either all Python or all R. If you mix them, an error will appear when trying to execute them. In this case, the three processes will not run in parallel.
If we had three separate paths with no connection between them, and we ran the three of them separately, they would run in parallel.
The aforementioned processes will appear on the lower right part of the screen. For each one you can see the number of panels involved (2,4 and 3 in this case). As they complete, they will increase the counter in the column Completed. The time when the last panel was finished appears on the right.
If we were instead in panel7, and we clicked on 2 things would have been different: only what is marked in blue as script 2 would have been processed. If we were on panel6 and clicked on 2 as well, only panels 1,3,6 would have been processed.
You will also notice a process tracker appearing on the right of the screen. Here we are running only one panel. Panels finished will appear in green and their execution time will be shown inside
What happens to panels when you run your code?
There are four states for a panel. The first one appears when you press any one of the three run buttons. Each panel will
get executed as indicated by the IN/OUT connections. Before any panel runs, it is flagged as scheduled.
The second one appears as next to run... to indicate that it is the next panel to be executed.
When a panel is actually running, you can see when it started and a green animation next to it.
When a panel finishes, its start and end times are displayed.
Stopping processes
If you look at the bottom right of the screen, you will see the process tracker. Clicking on Cancel all processes stops all processes. However, some functions using C++ libraries in Python or R cannot be stopped unless you close prython.
Python (pandas) and R dataframes are tracked for each panel. Whenever a dataframe is modified, it gets shown next to the corresponding panel. You can change their width, and open them in Excel directly.
It's important to highlight that each dataframe consumes some RAM, and the bigger the dataframe, the bigger the memory footprint. You can open the preferences and click on "Trim dataframes to 1k records" to reduce the RAM footprint.
In the connected mode, all dataframes modified by a panel get shown next to it, as stated above. In the free mode, the same holds. However, there is a small extra advantage here. If you type the dataframe name at the end of your code, that dataframe will be shown next to the
corresponding panel.
On the pro version, you get an extra button that allows you to see all the dataframes. If you click on any of the column names: the distinct values are shown on the right.
Text
You can add text with a colorful background. You can right click on them to edit their content and color. To move them, just left drag them.
Saving and exporting
You can save and load your projects by clicking on File>Open Project and File>Save As. In addition to this, whenever you run
a panel or a series of panels, every combination of panels gets exported.
For example, if you have a panelA connected to panels panelB and panelC (both consuming the outputs of panelA)
two sub-scripts will get exported: panelA+panelB and panelA+panelC.
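One way to picture these combinations is as every path from a starting panel down to a panel with no outgoing connections. The sketch below enumerates such paths; it assumes that each exported sub-script corresponds to one such path, which matches the two-branch example above but is not confirmed for more complex layouts. The function name export_paths is hypothetical.

```python
def export_paths(connections):
    """List every root-to-leaf path in a set of (source, target) connections."""
    children = {}
    targets = set()
    for src, dst in connections:
        children.setdefault(src, []).append(dst)
        targets.add(dst)

    # Roots are panels that feed others but receive no input themselves.
    roots = [node for node in children if node not in targets]
    paths = []

    def walk(node, path):
        path = path + [node]
        if node not in children:  # no outgoing connections: the path ends here
            paths.append(path)
            return
        for nxt in children[node]:
            walk(nxt, path)

    for root in roots:
        walk(root, [])
    return paths


print(export_paths([("panelA", "panelB"), ("panelA", "panelC")]))
# [['panelA', 'panelB'], ['panelA', 'panelC']]
```

Under this assumption, the number of exported scripts grows with the number of branches, not with the number of panels.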
In order to turn off/on this automatic saving, you can click on the following. It will also allow you to specify the folder where you want to dump the scripts to.
What is saved when we save a project?
When you save a project, everything is saved including all connections, outputs, plots, etc. This also includes all the environments created for each
panel. This means that when you load a file, you can attach consoles to panels and load the corresponding environments.
Customizing
The following window will appear. A few of these options deserve special comment. The "Trim dataframes to 1k records" option is useful when dealing with large dataframes
that might require a lot of RAM. The minimap can require substantial computing power depending on the size of the project, so you can turn it off if it causes performance issues.
The error detection and code suggestions can sometimes introduce performance issues for large panels, and they can be turned off here.
The theme/color for each panel can be chosen here