Tutorial 0a: Setting Up Python For Scientific Computing

© 2017 Griffin Chure and Muir Morrison. This work is licensed under a Creative Commons Attribution License CC-BY 4.0. All code contained herein is licensed under an MIT license.

Why Python?

As is true in human language, there are hundreds of computer programming languages. While each has its own merits, the major languages for scientific computing are C, C++, R, MATLAB, Python, Java, and Fortran. MATLAB and Python are similar in syntax and often read almost like plain English. This makes both languages useful tools for teaching, but they are also powerful languages that are actively used in real-life research. MATLAB is proprietary, while Python is open source. A benefit of being open source is that anyone can write and release Python packages. For science, there are many wonderful community-driven packages such as NumPy, SciPy, and Pandas, to name just a few.

Installing Python

Python 3.6 vs Python 2.7

There are two dominant versions of Python used for scientific computing, Python 2.7.x and Python 3.x.x. The most recent release (Python 3.6.0 as of March 2017), like every Python 3 release, is not backwards compatible with Python 2. While there are still some packages written for Python 2.7 that have not been modified for compatibility with Python 3, most have transitioned, including everything we will use in the course. Nowadays any new code should probably be written in Python 3, unless there is a compelling reason against doing so.
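To see which version you are running, and to get a feel for why the two versions are incompatible, a short check like the following may help. This is only an illustrative sketch: the helper name is_python3 is made up for this example, and the two differences shown (print as a function, true division with /) are just a small sample of the changes between Python 2 and Python 3.

```python
import sys


def is_python3(version_info=sys.version_info):
    """Return True if the given version tuple belongs to Python 3 or later."""
    return version_info[0] >= 3


# Two well-known differences between the major versions:
# in Python 3, print is a function and / always performs true division.
print("Running Python", sys.version.split()[0])
print("Python 3 or later:", is_python3())
print("7 / 2 =", 7 / 2)    # true division in Python 3
print("7 // 2 =", 7 // 2)  # floor division, same result in both versions
```

If the second line reports False, you are running a Python 2 interpreter and should install Python 3 as described below.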

Anaconda

There are several scientific Python distributions available for MacOS, Windows, and Linux. The two most popular, Enthought Canopy and Anaconda, are specifically designed for scientific computing and data science work. For this course, we will use the Anaconda Python 3.6 distribution. To install the correct version, follow the instructions below.

  1. Navigate to the Anaconda download page and download the Python 3.6 graphical installer.

  2. Launch the installer and follow the onscreen instructions.

Congratulations! You now have the beginnings of a scientific Python distribution.

(If you already have a working Python 3.x.x installation, that should work fine. If you are not using Anaconda, be sure you have pip installed to add any missing packages as we go along.)

Installing GitBash for Windows Users

It will be useful to have access to a UNIX command line to run Python scripts, make and move directories, as well as to install extra packages for Python through the conda package manager. For those on OSX, we will use the built-in Terminal.app program. If you are using Linux, we will assume you already know what you are doing.

Windows does not come with a UNIX command line interface. To install such an interface, we recommend using GitBash. To install it, navigate to the GitBash download page and follow the download instructions. Please choose the following settings during installation.

  • Adjusting your PATH environment -> Use Git from Windows Command Prompt.
  • Configuring the line ending conversions -> Checkout Windows-style, commit Unix-style line endings.

We will not be using the Git version control system in this course, so these preferences matter little for our purposes.

Once it is installed, you will be able to launch a UNIX-compatible terminal interface wherever you are in your operating system by right-clicking on the desktop or in a Windows Explorer window and selecting "Run GitBash here".

Installing extra packages using Conda

With the Anaconda Python distribution, you can install verified packages (scientific and non-scientific) through the Conda package manager. Note that you do not have to download Conda separately; it comes packaged with Anaconda. To install packages through Conda, we must manually enter their names on the command line. For the purposes of these tutorials, we will only need to install one package: Seaborn, for plot styling.

To open a terminal on OSX, click on the search icon in the upper right-hand corner of your menu bar and type "Terminal". This application is installed by default on your computer. Windows users can simply right-click on the desktop and select "Run GitBash here". Once you have a terminal window open, type the following command:

conda install seaborn

Note that you will have to type y in your terminal when prompted. This is to ensure that you are aware of what is being installed and what dependencies it requires.
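Once the installation finishes, you may want to confirm that Python can actually find the packages mentioned in this tutorial. The snippet below is a minimal sketch of such a check; the helper name missing_packages and the list course_packages are made up for this example, and the lowercase names are the import names of NumPy, SciPy, Pandas, and Seaborn.

```python
import importlib.util


def missing_packages(names):
    """Return the names from `names` that cannot be imported here."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Import names (lowercase) of the packages mentioned in this tutorial.
course_packages = ["numpy", "scipy", "pandas", "seaborn"]
missing = missing_packages(course_packages)
if missing:
    print("Not installed yet:", ", ".join(missing))
else:
    print("All course packages are installed.")
```

Any package reported as missing can be installed with conda install followed by its name (or with pip if you are not using Anaconda).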

Installing Atom text editor

Most of our coding in this course will be done in Python scripts, which we subsequently run from a terminal. If you are familiar with a specific text editor (vim, emacs, Sublime Text, Visual Studio Code, etc.), feel free to use it. For this course, we recommend downloading and configuring the Atom text editor. To install and configure it, follow the instructions below.

  1. Navigate to the Atom homepage and follow the instructions for installation.
  2. Once installed, launch Atom and navigate to Packages -> Settings View -> Open and scroll to the bottom of the page. Select Editor and make sure the setting Tab Length is set to 4. Below that, make sure Tab Type is set to soft. This is important because indentation and white space are meaningful in Python.
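To illustrate why the tab settings matter, the sketch below asks Python to compile two tiny programs that look identical in most editors. The helper name compiles_cleanly is made up for this example. The first program indents with four spaces throughout and should compile; the second indents one line with a tab character, and Python 3 should reject it because the indentation is inconsistent.

```python
def compiles_cleanly(source):
    """Return True if `source` compiles without an indentation error."""
    try:
        compile(source, "<example>", "exec")
    except IndentationError:
        return False
    return True


# Body indented consistently with four spaces.
spaces_only = "if True:\n    x = 1\n    y = 2\n"

# Same body, but the second line is indented with a tab character instead.
mixed = "if True:\n    x = 1\n\ty = 2\n"

print("spaces only:", compiles_cleanly(spaces_only))
print("tabs and spaces mixed:", compiles_cleanly(mixed))
```

With Tab Type set to soft, pressing the Tab key inserts spaces, so this kind of mix-up is much less likely to happen.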

Setting up the directory structure

For this course (and your coding in 'real life'), it will help if you follow a specific directory structure for your code and data. During this course, we will be writing a lot of Python scripts that load in data. So that you can follow along directly in class, it is important that you and the instructors have the same directory structure. To make this structure, open Atom and follow the instructions below.

  1. Navigate to File -> Add Project Folder and make a new folder in your home directory. On OSX, this will be /Users/YOUR_USERNAME/; on Linux, it is typically /home/YOUR_USERNAME/. On Windows, this will be C:/Users/YOUR_USERNAME/.
  2. Name this project pboc.
  3. Now pboc should appear on the left-hand side of your editor. Right-click on pboc and make a new folder called data. This is where all of our data from the class will live.
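If you would rather build the same structure from Python than through Atom's menus, something like the following should work. It is a minimal sketch, not part of the course material: the function name make_course_dirs is made up for this example, and the demonstration at the bottom runs in a temporary scratch directory so that nothing is written to your home directory.

```python
import tempfile
from pathlib import Path


def make_course_dirs(root):
    """Create root/pboc/data (if needed) and return the path to data."""
    data_dir = Path(root) / "pboc" / "data"
    # parents=True also creates pboc; exist_ok=True makes reruns harmless.
    data_dir.mkdir(parents=True, exist_ok=True)
    return data_dir


# Demonstration in a throwaway directory that is deleted afterwards.
with tempfile.TemporaryDirectory() as scratch:
    created = make_course_dirs(scratch)
    print("Created:", created.relative_to(scratch))
    print("Is a directory:", created.is_dir())
```

To create the real folders, call make_course_dirs(Path.home()), which places pboc and pboc/data in your home directory on any of the three operating systems.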