Getting Started

Local installation

Please read these instructions completely before you begin the installation, because having multiple Python installations on one machine can cause conflicts.

You will need to install the following Python libraries and packages to run the assignments on your own computer:

  • pandas
  • NumPy
  • SciPy
  • statsmodels
  • ggplot
  • Matplotlib
  • seaborn
  • pandasql
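
Once everything is installed, a quick way to confirm that the libraries listed above are visible to your Python interpreter is a short check like the one below. This is only a sketch: the helper name find_missing is made up for illustration, and the import names are assumed to be the lowercase forms of the library names.

```python
import importlib.util

# Import names for the libraries listed above. These are assumed to be the
# lowercase forms of the library names.
REQUIRED = [
    "pandas", "numpy", "scipy", "statsmodels",
    "ggplot", "matplotlib", "seaborn", "pandasql",
]


def find_missing(modules):
    """Return the names in `modules` that the current interpreter cannot locate.

    Hypothetical helper for illustration. It only looks the modules up;
    it does not import them.
    """
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED)
    if missing:
        print("Not found: " + ", ".join(missing))
    else:
        print("All listed libraries were found.")
```

Any name reported as not found still needs to be installed for the interpreter that ran the script.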

We highly recommend that you install the Anaconda distribution of Python. This distribution is bundled with most of the libraries and packages that you need to work on the assignments. Note that some operating systems (Mac, for example) come with a system version of Python already installed. This version should not be removed, and Anaconda can be installed in addition to this version.
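
Because the system Python and Anaconda can coexist on one machine, it helps to check which interpreter actually runs when you type python. The following is a minimal sketch; the function name interpreter_info is hypothetical, and the conda-meta check is only a heuristic based on the directory that conda-managed installations normally contain.

```python
import os
import sys


def interpreter_info():
    """Describe the Python interpreter that is running this script.

    Hypothetical helper for illustration only.
    """
    return {
        # Full path of the running interpreter.
        "executable": sys.executable,
        # Version number only, without build details.
        "version": sys.version.split()[0],
        # Root directory of this Python installation or environment.
        "prefix": sys.prefix,
        # Heuristic: conda-managed installations normally contain a
        # "conda-meta" directory under their prefix.
        "looks_like_conda": os.path.isdir(os.path.join(sys.prefix, "conda-meta")),
    }


if __name__ == "__main__":
    for key, value in interpreter_info().items():
        print(key + ": " + str(value))
```

If the executable path points to the system Python rather than to your Anaconda directory, packages installed through Anaconda will not be visible to that interpreter.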

Two of these packages, pandasql and ggplot, are not included in Anaconda. We use them in the course (in the Udacity online editor), but they are not required for completing the final project. If you would like to install them, follow these instructions:

Pandasql can be added to your installation (after you install Anaconda) by running the following from a command prompt:

conda install pandasql

ggplot is not part of Anaconda but can be installed by running:

pip install ggplot

Anaconda comes with IPython Notebook, a great environment for exploratory data analysis that is very popular among professional data scientists. We strongly encourage you to try it out. If you prefer an IDE, you can try Spyder, which also comes with Anaconda.