Download and install Anaconda by choosing the installer corresponding to your OS. Select the Python 3.x version.
After the installation is finished, you can check it by typing conda list in your terminal. If Anaconda is successfully installed, this command will display a list of the installed packages.
You may need to add Anaconda to your PATH environment variable by running
export PATH="$HOME/anaconda3/bin:$PATH"
in your terminal. To make the change permanent, add the same line to your .bashrc (to edit it, e.g. with nano, type nano ~/.bashrc in your terminal). Use $HOME rather than ~ here, because ~ is not expanded inside double quotes.
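To see whether the PATH setting took effect, a small Python check can look for the conda executable. This is only a sketch: the helper name conda_status is made up for illustration, and the fallback directory ~/anaconda3/bin is the default install location assumed above (on Linux and OS X).

```python
import os
import shutil

# Assumption: Anaconda was installed into the default directory used above.
ASSUMED_BIN_DIR = os.path.expanduser("~/anaconda3/bin")


def conda_status(bin_dir=ASSUMED_BIN_DIR):
    """Describe whether conda is reachable via PATH (hypothetical helper)."""
    # shutil.which searches PATH the same way the terminal does.
    on_path = shutil.which("conda")
    if on_path:
        return "on PATH: " + on_path
    # Not on PATH: check whether it at least exists in the assumed directory.
    if os.path.exists(os.path.join(bin_dir, "conda")):
        return "installed in " + bin_dir + " but not on PATH"
    return "not found"


print("conda is", conda_status())
```

If the output reports that conda is installed but not on PATH, the export line has probably not been picked up yet; opening a new terminal usually helps.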
In Python, it is good practice to work with environments, in order to keep the dependencies clean and the required modules separated. To create a new environment for the Data Science Course, type
conda create -n datsci python=3 scikit-learn jupyter matplotlib pandas
in your terminal.
This will create a new environment named datsci and also install some of the packages required during the course (scikit-learn, jupyter, ...). Those packages have further dependencies, e.g. numpy (which we will need anyhow), which are installed during the process as well.
Finally, you need to activate the environment with:
activate datsci
(Windows)
source activate datsci
(OSX & Linux)
With conda 4.4 or later, conda activate datsci works on all platforms and is the recommended form.
The command prompt indicates the activated environment by prepending (datsci).
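Besides looking at the prompt, you can ask Python itself which environment it is running in. The following is a minimal sketch, assuming that conda exports the variable CONDA_DEFAULT_ENV when an environment is activated; the function name active_conda_env is hypothetical.

```python
import os
import sys


def active_conda_env(environ=None):
    """Return the name of the active conda environment, or None."""
    if environ is None:
        environ = os.environ
    # conda sets CONDA_DEFAULT_ENV when an environment is activated.
    return environ.get("CONDA_DEFAULT_ENV")


print("active environment:", active_conda_env())
print("Python interpreter:", sys.executable)
```

With the course environment active, the first line should name datsci and the interpreter path should point into that environment's folder.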
Further information about environments can be found in the conda documentation.
To install further packages, use conda install. For example, to install the scipy package, simply type conda install scipy.
By default, packages are installed into the currently active environment, so make sure to activate your desired environment beforehand. Alternatively, you can explicitly specify the environment by typing conda install --name datsci scipy (assuming you want to install into the environment datsci).
In case you stumble across a missing module in one of our notebooks, you can install it via this procedure.
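To find missing modules before a notebook fails, you can check them from Python. This is a sketch under assumptions: the list COURSE_MODULES and the helper missing_modules are made up for illustration and only cover the packages mentioned above, not necessarily everything the notebooks need.

```python
import importlib.util

# Import names of the packages mentioned above; note that scikit-learn
# is imported under the name "sklearn". The list is illustrative only.
COURSE_MODULES = ["numpy", "pandas", "matplotlib", "sklearn", "scipy"]


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    # find_spec returns None for a top-level module that is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(COURSE_MODULES)
if missing:
    print("missing:", ", ".join(missing))
else:
    print("all listed modules are available")
```

Keep in mind that the import name and the package name can differ: a missing sklearn is fixed with conda install scikit-learn.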
IMPORTANT: The keras module does not seem to be available on the official conda channels. Therefore, install it by typing
pip install keras
in your terminal (make sure your desired environment is active).
Further information about package management can be found in the conda documentation.
Git is a version control system, which we use in this course merely to distribute the course material. All the (probably) relevant commands are listed below, but you may also want to take a look at the Quick Guide or the full documentation.
sudo apt-get install git
(Linux, Debian-based distributions such as Ubuntu)
Download and install (Windows)
Download and install (OS X)
Windows: Open a terminal with Git support via the right-click context menu or the start menu entry (Github -> Git Shell).
To clone a repository, run git clone. During the Data Science Course, we will use two repositories, so you have to clone both of them by typing:
git clone https://github.com/zieglerk/python-tutorials.git
git clone https://github.com/zieglerk/data-science-tutorials.git
in your terminal. These commands will clone the repositories into your current folder and create the folders python-tutorials and data-science-tutorials. You might need to register a GitHub account in order to be able to clone the repositories.
In case a notebook is updated during the course, you can get the latest version by running git pull in the respective directory (either python-tutorials or data-science-tutorials).
If you have already edited a notebook, you might be required to either commit or stash your changes before pulling. When you commit your changes and pull the latest version, Git will try to merge the changes automatically.
After you have completed all the previous steps, you can run the notebooks by navigating to one of the repositories (e.g. data-science-tutorials) and typing
jupyter notebook
in your terminal. This will open the notebooks in a browser. To execute a cell in a notebook, press Shift+Enter.
For those who cannot install the required tools, we provide a ready-to-use web environment. You can find the link in the Ilias course. In order to use the notebooks, you need to download them from GitHub and upload them to the JupyterHub. GitHub offers an option to download the whole repository as a ZIP file.
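If you prefer to fetch the ZIP file from a script instead of the browser, the download could be sketched as follows. The helper names archive_url and download_repository are hypothetical, the URL pattern is the one GitHub currently uses for branch archives, and the branch name in the example call is an assumption that you should check on the repository page.

```python
import urllib.request


def archive_url(owner, repo, branch):
    """Build the URL of GitHub's ZIP archive for one branch of a repository."""
    return "https://github.com/{}/{}/archive/refs/heads/{}.zip".format(
        owner, repo, branch
    )


def download_repository(owner, repo, branch):
    """Download the ZIP archive into the current folder and return its file name."""
    target = "{}-{}.zip".format(repo, branch)
    urllib.request.urlretrieve(archive_url(owner, repo, branch), target)
    return target


# Example call (needs network access); the branch name "master" is an assumption:
# download_repository("zieglerk", "data-science-tutorials", "master")
```

The example call is commented out so that nothing is downloaded until you have confirmed the branch name.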