Are top musicians born, or are they a product of large cities? Data Analysis with R student Stefan Z. decided to address this question and many others while exploring the geography of American music.

First, he collected 5,280 of the “hottest” songs using The Echo Nest’s API, or application programming interface. If you’re not familiar with APIs, think of one as a way of requesting information from a service. Next, he added location data for each artist’s birthplace using the Google Maps API.
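To make the idea of "requesting information" concrete, here is a minimal Python sketch of how an API request is usually assembled: a base address plus a set of named parameters. The endpoint `api.example.com` and the parameter names `sort` and `results` are hypothetical placeholders, not the actual Echo Nest or Google Maps interfaces, and this is not Stefan's code.

```python
from urllib.parse import urlencode

def build_request_url(base_url, **params):
    """Return base_url with params appended as a URL query string."""
    return f"{base_url}?{urlencode(params)}"

# Hypothetical endpoint and parameter names, for illustration only.
url = build_request_url(
    "https://api.example.com/v1/songs",
    sort="hotness-desc",
    results=100,
)
print(url)
```

Sending that URL to a real service would return structured data (often JSON) that a script can then load and analyze.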

From there, Stefan combined his song and artist data with US census data. Finally, he used MongoDB’s geospatial features to find artists born near one another and identify clusters of top musicians.
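Stefan relied on MongoDB for the "nearby artists" step. The underlying idea can be sketched in plain Python, assuming a simple great-circle (haversine) distance and a fixed search radius. The function names, the 400 km radius, and the placeholder artist labels are all assumptions made for illustration. The coordinates are approximate city locations, not data from the report.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))

def nearby(origin, artists, max_km):
    """Return the artists (other than origin) born within max_km of origin."""
    return [
        a for a in artists
        if a is not origin
        and haversine_km(origin["lat"], origin["lon"], a["lat"], a["lon"]) <= max_km
    ]

# Placeholder records: labels are made up, coordinates are approximate
# locations of Nashville, Memphis, and Seattle.
artists = [
    {"name": "artist_a", "lat": 36.16, "lon": -86.78},
    {"name": "artist_b", "lat": 35.15, "lon": -90.05},
    {"name": "artist_c", "lat": 47.61, "lon": -122.33},
]

neighbors = nearby(artists[0], artists, max_km=400)
print([a["name"] for a in neighbors])
```

A database with a geospatial index does the same kind of lookup far more efficiently than this pairwise scan, which is one reason to reach for it on thousands of records.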

[Figure: Clusters of top musicians]

Stefan didn’t stop there. He researched how to create maps using a blog post and this Stack Overflow question. With a little more code, he created this map.

[Figure: Birthplace of top musicians alongside cities with large populations]

Red dots represent cities with populations above 500,000, and yellow dots represent the birthplaces of top musicians. What observations do you have about the map?

You can check whether your observations match Stefan’s, and see the rest of his work, in his full report, Geography of American Music.

Want to learn by doing? Exciting projects like this one are within your reach! You can visualize data and learn the building blocks of code with the R programming language in Data Analysis with R.

Chris Saden, Course Instructor, Data Analysis with R