1 ∙ Requirements and set up
What you will need, and how to get your computer ready.
The words library, module and package are often used interchangeably, and they mean very similar things; it is entirely natural to be confused by this!
- Library : a collection of Python files that expand the abilities of Python, brought in using the import command. These are like accessories or modifications, typically giving you access to powerful, professional functions and classes written by collaborations of expert coders.
  - A library can be standard (it comes built into Python), for example time or math, or
  - 3rd party, that is, written by people outside of the core Python developers, in which case it will need to be installed using a package manager. numpy and pandas are 3rd party libraries with millions of users.
- Package : a library that is available for delivery to a package manager, such as pip or conda. PyPI (“Python Package Index”) is the main package repository for Python.
- Module : anything that is imported to a running script. Libraries and packages are made of modules. If you write your own Python file, and import it, that is also a module.
- Environment : this has two meanings.
  - A “Python Environment” is a specific, isolated installation of Python, available on your computer. Typically, users have a default install, called base, which we can create duplicates of. Think of each environment as a minimal, clean version of Python, where we can install functionality just for the project we are working on. This avoids conflicts, and prevents your base install of Python becoming burdened with libraries that you don’t need all the time. It is good software engineering practice to create a new environment for every project.
  - The other meaning for “environment” is the interface that you are using to develop code - for example IDEs like VS Code, PyCharm, Jupyter Lab, even the command line itself are interfaces, also known as environments. Confusing, I know!
- Dependency : anything that is required to get software to run. For example, here we need some libraries to be installed, so we are dependent on those being in place and working. In other contexts, “dependencies” might be single files, settings, or whole software packages.
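To make the glossary terms a little more concrete, here is a minimal sketch using only the standard library. The helper name is_installed is made up for this illustration, and the names it checks are just examples; what it reports depends on the environment you run it in.

```python
import importlib.util
import math
import types

# A standard library module is available without installing anything,
# and once imported it is simply a "module" object.
print(isinstance(math, types.ModuleType))


def is_installed(name):
    """Return True if a top-level module or package called `name` can be found.

    Hypothetical helper for illustration; it looks the name up without importing it.
    """
    return importlib.util.find_spec(name) is not None


# Standard libraries are always present; 3rd party ones only once installed.
for name in ["time", "math", "numpy", "pandas"]:
    status = "available" if is_installed(name) else "not installed in this environment"
    print(name, "-", status)
```

If numpy or pandas show as not installed, that is expected in a fresh environment: they become dependencies only once a project needs them.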
Required experience and software
We try to be as inclusive as possible regarding your coding level, but to get the most out of this course, you should have:
- experience of Python to an intermediate level; for example, you should have experience of functions and conditionals. If you have attended any of our courses on data analysis, you are ready to approach this course. If you have only done Intermediate Python you should also be fine!
- Anaconda Navigator installed on your computer.
Setting up
To begin, we will need to create a clean version of Python with just our requirements for this session. This will run Jupyter Lab, so that we have an interface in the browser. We provide instructions here for both Windows and MacOS (+ Linux), so follow the instructions relevant to you:
On Windows, click the Start button and type Anaconda Prompt (if you don’t see this, ensure Anaconda Navigator is installed) - this will open a command line for you. First, we make a new folder (a new directory) by typing, in the Anaconda Prompt window:
mkdir graphical-data-apps
then we move into that folder using the command
cd graphical-data-apps
We then create a new Python virtual environment (see the glossary box above for an explanation of what this is), with the command
python -m venv data_apps_env
which means, “run python, using the module called venv, and create a new environment called data_apps_env”.
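The same step can also be run from inside Python, which may help show what the venv module actually does. This is a minimal sketch and not part of the workshop setup: it builds the environment in a temporary folder (an assumption made here so that nothing in your project is changed) and skips installing pip to keep it quick, whereas the command line version installs pip as well.

```python
import os
import tempfile
import venv
from pathlib import Path

# Work in a throwaway folder so this sketch does not touch your project.
workdir = Path(tempfile.mkdtemp())
env_dir = workdir / "data_apps_env"

# with_pip=False keeps the sketch fast; the command line version also installs pip.
# symlinks mirrors the command line default (links on MacOS/Linux, copies on Windows).
venv.create(env_dir, with_pip=False, symlinks=(os.name != "nt"))

# Every environment made by venv contains a small config file describing it.
print((env_dir / "pyvenv.cfg").read_text())

# The folder holding the activate script differs between operating systems.
scripts_dir = env_dir / ("Scripts" if os.name == "nt" else "bin")
print(scripts_dir, scripts_dir.exists())
```

The environment is just a folder: deleting it removes the environment without affecting any other Python install.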
We then tell the terminal to use this environment:
data_apps_env\Scripts\activate
Finally, we install the required Python packages for this workshop:
pip install jupyterlab streamlit plotly
On MacOS, we can get a command line interface by opening Spotlight (command + space) and typing Terminal. In this command line window, first, we create a new folder to work in:
mkdir graphical-data-apps
then we move into that folder
cd graphical-data-apps
We then create a new Python virtual environment (see the glossary box above for an explanation of what this is), with the command
python3 -m venv ./data_apps_env
which means, “run python, using the module called venv, and create a new environment here (.) called data_apps_env”.
We then tell the terminal to use this environment as the Python install it is using.
source ./data_apps_env/bin/activate
Finally, we install the required Python packages for this workshop (this might take a couple of minutes):
pip install jupyterlab streamlit plotly
Whichever operating system you are using, you are now ready to start, so finally run this command to open a Jupyter Lab session in your browser:
jupyter lab
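Once Jupyter Lab is open, one way to confirm that it is running from the new environment is to run a short check in a Console or a Python file. This is a sketch that assumes the environment was made with venv as above; the function name is made up for this illustration, and a conda environment would not be detected this way.

```python
import sys


def in_virtual_environment():
    """Return True when Python is running from a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment folder,
    # while sys.base_prefix still points at the Python it was created from.
    return sys.prefix != sys.base_prefix


print("Python executable:", sys.executable)
print("Environment folder:", sys.prefix)
print("Running inside a virtual environment:", in_virtual_environment())
```

If the environment is active, the environment folder printed should end in data_apps_env.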
The above steps set things up for the first time, but in future you will not need to go through all the steps. For Windows, the steps are:
- Start Anaconda Prompt
- Move into the folder you are working in, for example cd graphical-data-apps
- Activate the Python environment in that folder, with <environment name>\Scripts\activate
- Run jupyter lab, which will open a new tab in your browser with Jupyter Lab.
- Open a Terminal in your Jupyter Lab tab, and enter the command streamlit run <name of python file>
On Linux / MacOS, the steps are:
- Start Terminal
- Move into the folder you are working in, for example cd graphical-data-apps
- Activate the Python environment in that folder, with source ./<environment name>/bin/activate
- Enter the command jupyter lab in Terminal to start a browser tab with Jupyter Lab
- Open a Terminal in your Jupyter Lab tab, and enter the command streamlit run <name of python file>
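The only real difference between the two sets of steps is where the activate script lives. As a small illustration, the sketch below builds the text of the activation command for either system; activate_command is a hypothetical helper written for this example, and it only returns the command as text rather than running it.

```python
import os


def activate_command(env_name, windows=None):
    """Return the activation command for a venv folder in the current directory.

    Hypothetical helper for illustration only; it does not run anything.
    """
    if windows is None:
        # Default to whichever system this code is running on.
        windows = os.name == "nt"
    if windows:
        return f"{env_name}\\Scripts\\activate"
    return f"source ./{env_name}/bin/activate"


print(activate_command("data_apps_env", windows=True))
print(activate_command("data_apps_env", windows=False))
```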
Jupyter Lab
In the Jupyter Lab file navigator in the sidebar to the left, find and move into the folder we made called graphical-data-apps (this should be in your home folder, or another default location).
We are going to have two panes open: one Python file (a text editor), and one Terminal. You can open these either from the launch screen (with icons), or from the menu bar (in the Jupyter Lab tab menu, not the browser menu!), going File → New → Python File and File → New → Terminal.
For this session, a useful layout is with the Python file editor at the top, and Terminal below. This allows us to have a wide pane for the editor, and the terminal will only be running Streamlit (and occasionally reporting what is happening, and any errors), so we don’t need it to take up too much space. In any case, arrange the layout as suits your screen best, and we are now ready to start!
There are three panes in Jupyter Lab that we commonly use in our teaching:
- Text editor : this is a basic word-processor designed for code, that is opened when you create a new Python or plain text file.
- Terminal : this is a command-line (text based) version of your file manager (i.e. Finder in MacOS, File Explorer in Windows). This is used to navigate the folders on your computer, and run commands.
- Console : in the context of Jupyter Lab, the console runs Python interactively. This means you enter Python code line by line, and each line is run immediately - but it does not create or edit a script file. This is useful for understanding how code works, and prototyping a script.
Please skip this section if you are using Jupyter Lab / Anaconda.
You are welcome to use another IDE, but please be confident with installing new packages into your Python environment. You will need to be running both an editor and a terminal, as in the previous section. If you use the package manager pip, we have provided a requirements.txt file here, which will install the required packages. (environment.yaml and requirements.txt are essentially identical, for conda and pip respectively.) Create a suitable new folder to work in, and move the requirements.txt file into that folder. A typical series of commands would be:
Create a new environment here, called venv:
python -m venv ./venv
Activate this new environment:
source ./venv/bin/activate
Use pip to acquire and install our dependencies:
pip install -r requirements.txt
If you are interested, or if you need to do things manually, the libraries this installs are:
- pandas - allows us to organise data into powerful formats, most notably the dataframe
- plotly - an open-source graphing, charting and data vis library
- streamlit - the data app interface builder
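If you have installed things manually, a quick way to see whether these three libraries are in place is to ask Python for their installed versions. This is a minimal sketch using the standard library; the function name installed_versions is made up for this illustration, and the versions reported will depend on your own environment.

```python
from importlib.metadata import PackageNotFoundError, version

# The three libraries listed above.
REQUIRED = ["pandas", "plotly", "streamlit"]


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions(REQUIRED).items():
    print(f"{name}: {ver if ver is not None else 'not installed in this environment'}")
```

Any library reported as not installed can be added with pip while the environment is active.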