Asim is a software engineer from the U.K., and he recently graduated from Udacity’s Intro to Data Science course with a completely Udacious final project that explores NYC subway data to determine the relationship between weather and ridership. Asim uses IPython Notebook to share his methodology and conclusions – check it out!
We asked Asim a few questions about data science and lifelong learning:
What are your biggest lessons from your time learning with Udacity?
How important it is to treat online classes as a springboard to continuous learning. Often I found that the courses armed me with a great set of questions and vocabulary to dig further into the subject matter.
However, it was only while writing up the final project for Intro to Data Science that I got to unleash all the pent-up curiosity and research I had built up around the subject. Without the course I wouldn’t have been prepared to ask the right sets of questions, nor would I have had an outlet to apply my knowledge.
How are you using data science in your career and hobbies?
My experience with Intro to Data Science is giving me confidence and a richer set of tools to ask and answer important questions in meaningful ways.
As a software engineer, I often run into masses of performance and numeric data that I want to interrogate intelligently, and now I can! Even more important is knowing what types of visualizations to use to convince people of my findings.
I’m also working on an open-source hobby project that aims to predict news articles and blog posts I’d like to read based on previous ratings.
Any advice for other data science students?
When it comes to data science and Python, the tooling is constantly changing and always improving. Try to attend local meetups and conferences, and keep up to date with general-interest newsletters like Python Weekly and the Hacker News digest to stay in the fast lane. Or wait until my news recommendation app works and have it do this for you! :)
Be curious, but be systematic. Every time you write tools to do data analysis and visualization, I strongly urge you to put them under version control, e.g. on GitHub, or at least keep a backup, e.g. on Dropbox. One day, years down the line, you’ll find you face the same questions over and over again, and you’ll be thankful you saved some code!
What was your inspiration for your final project? Will you share it with others?
My goal was to answer a simple question for my friends and family: “What effect does the weather have on the MTA?”, but without hiding the code or the rigor of the analysis. Although it was challenging, I tried to draw meaningful conclusions while showing the full code behind them and explaining ideas clearly without hiding behind jargon.
I found that using an IPython Notebook as the medium for my final project was perfect. I’m a big believer in reproducible science, and showing my code inline with my prose gives my audience confidence that I’m not hiding my methods, and even empowers them to reproduce my results or re-use my techniques elsewhere. Also, it’s just really fun playing around in an IPython Notebook with data, quickly darting in and out of different charts and models. My final project would not have been the same without IPython Notebook. I was convinced to write it in one after attending the PyData 2014 conference in London and seeing how completely in love with IPython Notebook data scientists are.