The excitement around our Machine Learning Nanodegree program has been amazing to witness, and the vitality and dynamism in the space right now are pretty incredible. There are so many fascinating storylines in the world of Machine Learning that it’s sometimes hard to even know what to focus on. But unquestionably, the people working in this field, those individuals at the cutting edge of these new technologies, are a critical part of the Machine Learning narrative. One of the things I personally find really exciting is how many women are shaping the future of Machine Learning. My former colleague Katie Malone is a wonderful example of this, and I’m very grateful she was able to take some time recently to talk Machine Learning with us!
– Can you “officially” introduce yourself, and tell our readers a little bit about what you do?
My name is Katie Malone, and I’m a data scientist in the research and development department at Civis Analytics. Civis is a startup that specializes in data science software and consulting. My job is threefold: I help our consulting data scientists find the best way to solve new problems, I build general-purpose tools that enable us to solve those problems quickly, robustly, and repeatedly, and I mess around with blue-sky research projects that everyone agrees sound really interesting but whose applications might be a little farther away. It’s an extraordinarily fun job and I love it.
– How did you originally get started in programming? What inspired you to learn to program?
My background is in particle physics, which is very programming-heavy. I knew that the only way to do physics was to learn to code, so I learned to code as an undergraduate. For a while I only knew C++ (it was the language of choice in my undergraduate program), but then in grad school I decided to invest some effort into learning Python. I sat down for an hour a week every week for a quarter and worked my way through an online course, which gave me enough know-how to start programming in Python on my own. It was a small investment at the time, but probably the single highest-return investment I’ve ever made: I now use Python every day for the bulk of my work.
– Can you briefly describe your educational background?
I majored in engineering physics at Ohio State for my undergrad, and then went to Stanford for a PhD in physics. My graduate work was searching for new particles at CERN in Switzerland, which was extremely cool and (lucky me) generalizes well to my current career in data science.
– You began your career as an intern in data science. How did that come about?
I knew I wanted to do something other than physics after finishing grad school, and data science seemed like the natural choice. It was a big choice to make, though, leaving physics, and an internship seemed like a great way to dip my toe in the water and make sure it would be a good fit (I remember thinking that doing an internship was like running an experiment, which made me more comfortable with it as a scientist). It was also a great asset when I went on the job market for a full-time position, since it demonstrated a real interest in data science and probably helped distinguish my resume.
– When did you first become interested in machine learning?
My research group did some machine learning in physics when I was an undergrad, so I had some exposure to the methods very early on (before it was cool!). I got much more serious about it in grad school, taking online courses while I was at CERN and even playing around with machine learning on my physics projects sometimes.
– We heard you attempted to apply machine learning to wine. Cool! Can you tell us about that?
Ha! Yeah, this is something that is always further along in my head than on the keyboard. I had a friend who was really into wine and would host wine tastings for a bunch of people, which were really, really fun. Eventually I moved to another city and couldn’t go to wine class anymore, but I wanted to keep trying out new wines, and in particular have a new bottle every week with dinner. I didn’t know enough about pairing wine with food to be very effective at knowing what wine to buy or what food to cook, but that did seem like something that a machine learning algorithm might be pretty good at. It’s a fun project, kind of like a recommendation engine but with some twists, although (as with most projects like this) I’ve found that the hardest part, by far, is finding good data and getting it formatted properly.
– Where do you think the field of machine learning is headed?
Oh man, I wish I knew. Neural nets certainly seem like the new hotness right now, and I’ll be curious to see what new innovations people get out of those in the years to come. I’m also excited about some recent work in making machine learning models easier to interpret, which makes them more accessible to non-experts and (eventually, I expect) significantly easier to use. It’s tough to say, though, because every week I hear about really exciting new work that changes the way I think about machine learning. It’s a pretty incredible place to be right now.
– Do you think anyone with basic coding skills can become a machine learning engineer?
Hmmm, I would say there are a couple of other things that are necessary, but then the answer is yes. The first thing is a willingness to deal with some math: being good at machine learning means thinking about math and statistics a lot (it’s what the algorithms are based on). The second, and maybe more important, thing is being happy getting your hands dirty with real problems. A lot of machine learning and data science isn’t about knowing a formula or being able to code an algorithm from scratch, it’s more about seeing lots of problems and thinking about them from many directions. That comes with experience. But getting that experience is easier now than ever, with lots of great online courses, competitions, and collaboration that can be done online.
– Do you feel like the number of women in the field of machine learning and/or data science is increasing?
Machine learning and data science are still fairly young fields, so I don’t have much historical data on this. Certainly many of the STEM feeder fields, like physics in my case, are still far from seeing gender equity, and anecdotally it seems like this propagates through to data science. That said, I work with a ton of incredibly smart and dedicated female data scientists every day, so I certainly don’t feel lonely.
– Why is it important for more women to get involved, and what resources out there would you recommend for women (or anyone!) interested in machine learning?
From my perspective, data science and machine learning are fields where someone can have a really huge impact. That’s part of the reason I went into data science, because I could see that the work makes a real difference in people’s lives. I think that’s an important reason for women to consider data science—the world needs the skills that they bring to the table. Of course, that’s true for men too, but anecdotally (from talking to lots of women at career fairs, or during interviews) I’ve found that a lot of women are especially excited to work on projects that improve the lives of others.
For resources, the single best thing you can do is find people who can challenge you and make you think. These can be collaborators that you work with in “real life,” or folks online (say, for example, through contributing to open source projects). I’ve also found that the projects that turn out the best for me are the ones that I find most interesting or exciting, so I’ve grown to put a lot of effort into reading about many different things so I can find out what seems most cool or fun and then go after that. At first it felt a little backward, like I should instead be reading up to find out what I “should” be excited about and then letting that guide my choices. But I’ve found that asking “what makes me excited, and how can I apply machine learning or data science to that?” is way more fun for me. That’s not really a resource, sorry, but I think it’s important. As for actual resources, I love online courses (like Udacity of course, but there are lots of good ones out there) and podcasts (I have to say that, since I host one, Linear Digressions, as a side project), and there are some excellent blogs out there too.
Katie Malone, thank you! Thank you so much for sharing your knowledge, your experience, and your insights. You’ve given our readers—and me—so much to think about. It’s an exciting time to be in Machine Learning!