One of the most fascinating things about Machine Learning—and those who work in the field—is the remarkable scope of what’s so rapidly becoming possible because of this technology. The applications are almost limitless, and we witness this every time we talk to a Machine Learning specialist.
Lauren Edelson is a perfect example, as you’ll see from her responses to our questions below. Lauren was gracious enough to share a great deal of insight and experience with us, and her answers cover a wide range of subjects, including computer science, the relationship of bioinformatics to Machine Learning, and Machine Learning impacts on fraud detection. When queried specifically about all the different ways Machine Learning informs our modern lives, she mentioned everything from healthcare and Uber, to Spotify and the tracking of presidential election swing votes!
If you have any lingering doubts as to whether Machine Learning is an exciting field, banish those doubts, and read on!
– Can you introduce yourself and tell us what you do?
I’m the Founding Data Scientist for Simility, an intelligent fraud detection and analytics startup in Palo Alto, California. Working with a relatively small but growing startup means that in addition to creating Machine Learning (ML) models, my role also includes aspects of data engineering, customer management, product design and product management.
– How did you originally get started in programming? What inspired you to learn to program?
I took my first programming class as a freshman at Stanford because a friend told me that, with my logical approach to problem-solving, I’d probably love coding. I’ve always been good at math, but I immediately preferred computer science. After struggling for hours on a difficult problem, instead of just getting some value for x, I was able to create tangible results!
After that, I found programming incredibly addicting, and eventually decided to pursue a minor in Computer Science. These were by far my most difficult and incredibly frustrating classes of college, but the satisfaction of solving a difficult logic puzzle kept bringing me back for more. It was definitely a love-hate relationship.
As my skills progressed, I found myself able to understand how technologies that I use every day function under the hood, all the way down to the level of 0s and 1s, which is incredibly intriguing. Living in a world where, with some training, anyone can learn to translate ideas into code is an amazing thing. It’s been one of humanity’s greatest breakthroughs and it is inspiring to be a part of it.
– Can you briefly describe your educational background?
I’ve always been captivated by the human body and how such a precise, well-balanced system could come to exist. So, in my undergraduate studies, I majored in Human Biology at Stanford with a focus on Neuroscience. As I mentioned, I separately pursued a minor in C.S. As I was finishing up my degree, I discovered the field of bioinformatics, which uses computer science techniques to solve problems in biology and medicine on a scale that has never before been possible. I stayed at Stanford for an extra year to get my Master’s degree in Biomedical Informatics. I finally felt like I had found my passion because my separate academic interests all converged on this incredible field.
After graduating, I quickly learned that I could apply the exact same data mining and statistical analysis techniques I’d learned in the context of bioinformatics to datasets in totally different domains. That’s how I found myself using data science to fight fraud.
– When did you first become interested in Machine Learning?
“Machine Learning” was always one of those buzzwords I heard all the time in my CS classes but knew little about practically. When I first started out in CS, I distinctly remember reading the Random Forest Wikipedia page in my free time because I thought that sounded like such a cool phrase and had no idea what it was. It wasn’t until I began implementing ML algorithms and applying them in depth during my graduate studies that I fully appreciated the power of these techniques. Similarly to programming, my relationship with ML has been one of both incredible frustration and incredible reward—you can never build a “perfect” model, but learning how to optimize various parameters and take full advantage of ML as a tool is a powerful experience.
– What does a typical work day look like for you?
One of the reasons I love my job is because no day is the same! Working at a small startup where data is integral to our product means that I get to touch every side of our business.
I’ll spend time talking in detail with our customers about what types of fraud problems they are facing on a broad scale and how our tool can be most effectively applied to solve them. Then I’ll go code up the solutions we discussed. Based on these customer-facing conversations, I’ll work with our engineering team to design new features or processes within our tools to make them even more effective. I’m always brainstorming new potential patterns within our datasets that could be indicative of fraud; I’ll then spend time quantifying these insights into new features that can feed into various ML algorithms. I’ll iterate over these algorithms in order to build the most effective ML model possible. Once I’ve productionized a model, I’m constantly monitoring its precision, recall, and various other metrics in order to pinpoint when its predictive ability begins to slip.
If it sounds crazy busy, that’s because it is. As the central “data” person in a small startup, I get to influence so many aspects of our product!
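For readers unfamiliar with the metrics Lauren mentions, here is a minimal Python sketch of how precision and recall can be computed for a fraud model and checked against a baseline. It is an illustration only, not Simility's monitoring code: the function names `precision_recall` and `has_slipped`, the sample labels, and the 0.05 tolerance are all assumptions made for this example.

```python
from typing import Sequence, Tuple


def precision_recall(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (precision, recall), where 1 means fraud and 0 means legitimate."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)  # fraud caught
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)  # fraud missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def has_slipped(baseline: float, current: float, tolerance: float = 0.05) -> bool:
    """Flag a metric that fell more than `tolerance` below its baseline (hypothetical rule)."""
    return current < baseline - tolerance


# Made-up labels for eight transactions: actual outcomes vs. model predictions.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]

precision, recall = precision_recall(y_true, y_pred)
print(f"precision={precision:.2f}, recall={recall:.2f}")  # precision=0.75, recall=0.75
```

Precision answers "of the transactions flagged as fraud, how many really were?", while recall answers "of the real fraud, how much was flagged?". Tracking both over time is one simple way to notice a model drifting.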
– The company you work for, Simility, is rooted in fraud protection. Can you talk a little about how Machine Learning/Data Science can be applied to protecting everyday people from fraud?
The founders of Simility noticed that despite fraudsters’ best efforts, they inevitably leave behind small traces of their activity while committing “bad” behavior. Using this knowledge, we can apply data science concepts to aggregate these patterns across massive datasets and predict which users or transactions within a given network are most likely to be fraudulent.
For example, Simility’s “Device Recon” technology collects 300+ dimensions of data on any device browsing our customers’ websites (for example, we’ll track browser font size, screen pixel dimensions, whether a user is on mobile versus desktop, etc.). We then use ML algorithms to determine which device fingerprints are more likely to be associated with fraud than others and apply this knowledge to score all incoming devices.
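To make the idea of scoring devices from their attributes concrete, here is a toy Python sketch. It is not how Device Recon works; the interview does not describe the actual algorithm. The sketch assumes a simple stand-in technique: estimate a smoothed historical fraud rate for each attribute value, then average those rates for an incoming device. The attribute names, the sample history, and the functions `fit_fraud_rates` and `score_device` are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Device = Dict[str, str]  # attribute name -> observed value (hypothetical schema)


def fit_fraud_rates(
    history: List[Tuple[Device, bool]], prior: float = 0.5, weight: float = 1.0
) -> Dict[Tuple[str, str], float]:
    """Estimate a smoothed fraud rate for every (attribute, value) pair seen in history."""
    counts = defaultdict(lambda: [0, 0])  # (attribute, value) -> [fraud count, total count]
    for device, is_fraud in history:
        for key, value in device.items():
            entry = counts[(key, value)]
            entry[0] += int(is_fraud)
            entry[1] += 1
    # Smoothing pulls rarely seen values toward the prior instead of 0 or 1.
    return {k: (fraud + prior * weight) / (total + weight) for k, (fraud, total) in counts.items()}


def score_device(device: Device, rates: Dict[Tuple[str, str], float], prior: float = 0.5) -> float:
    """Average the per-attribute fraud rates; unseen values fall back to the prior."""
    if not device:
        return prior
    values = [rates.get((key, value), prior) for key, value in device.items()]
    return sum(values) / len(values)


# Made-up labeled history: (device attributes, was it later linked to fraud?)
history = [
    ({"platform": "mobile", "font_size": "16"}, False),
    ({"platform": "desktop", "font_size": "12"}, True),
    ({"platform": "desktop", "font_size": "12"}, True),
    ({"platform": "mobile", "font_size": "12"}, False),
]

rates = fit_fraud_rates(history)
risky = score_device({"platform": "desktop", "font_size": "12"}, rates)
safe = score_device({"platform": "mobile", "font_size": "16"}, rates)
print(f"risky={risky:.3f}, safe={safe:.3f}")
```

A production system with hundreds of dimensions would more likely feed such features into a trained classifier, but the sketch shows the basic flow: learn from labeled history, then score each new device.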
– We know Machine Learning can have applications in many different fields and industries. Where do you think it’s having some real impact, or can you think of any areas where there is tons of untapped potential?
Because of my background, I’m immediately inclined to say healthcare: the opportunities in bioinformatics right now are so omnipresent that it’s hard to quantify their potential impact. Running predictive algorithms over existing datasets can predict with high accuracy who will get which diseases and when, thus increasing opportunities for intervention, improving diagnostics, and enabling more effective treatment choices.
But ML is being used everywhere you look to solve some very interesting problems nowadays! I find the data science behind the now-ubiquitous “Uber Pool” in San Francisco fascinating; it calculates probabilities that are then used to increase carpooling in the city and reduce transportation costs. I suspect that Spotify uses ML to generate its “Discover Weekly” personalized recommendation playlists that people swear by. A couple of my closest friends are using ML to optimize ship routes in the shipping container industry at a brand new startup called ClearMetal. By aggregating multiple data sources, ML can even be used to predict who a given swing voter will support in the upcoming presidential election, sometimes even before that voter has fully decided himself or herself. The widespread availability of data in 2016 for the first time in human history means that we are just beginning to consider the potential applications of these techniques.
– Do you think anyone with basic coding skills can become a Machine Learning Engineer?
Nowadays there are so many software libraries and services that make it easy for anyone with a dataset and some basic programming knowledge to “build models” in a matter of minutes. However, knowing when to deploy the right model is its own beast.
Mastering a Machine Learning algorithm is like cooking with a brand new frying pan. It’s relatively easy to buy a pan, but the real magic happens when it’s used correctly. Knowing when to add olive oil to the frying pan and when to combine it with a high- versus low-heat stove is an acquired skill and may make all the difference in how the food turns out, depending on what meal you’re trying to make. Similarly, each ML model works best when deployed in a highly specific scenario, and it takes practice to learn where and how to apply each type of model. Furthermore, it’s possible to build an entire career out of feature extraction, which is 100% essential to a good ML algorithm but takes an entirely different skill set and creative mindset.
I think anyone can pick up the coding skills necessary to build ML models, but learning how and when to use them appropriately simply comes with practice. Some of the best data scientists I know can’t even articulate why they choose to apply certain techniques in certain scenarios.
– Do you feel like the number of women in the field of Machine Learning and/or Data Science is increasing?
I have no idea! I’ll have to find some data on that and get back to you with a p-value 😛
In all seriousness though, I’m sure it’s increasing, but it seems to be at a glacial pace. There are many factors that play a role in this, but ultimately I think it comes down to girls needing the confidence to say “yes, I can study data science and become an engineer or a programmer if I want to,” even if society has been telling them for years that they are more likely to fail at these professions. Who cares?! Everybody fails at something every day. It’s just a matter of not letting that get to you.
– Any advice you can give to other women who might be considering a career in Machine Learning, or similar fields?
I think data science and programming can be inherently frustrating fields, and I would encourage women who are interested to keep that in mind as they pursue data science as a career option. Just like any field, there are going to be good days and bad days. But if you love what you’re working on, it’s absolutely worth persevering through the not-so-great days because it makes the moments when you do crack some tough puzzle all the more rewarding. I love what I do and I think that for analytically minded people who want to tackle some of the world’s hardest, most important problems, ML & data science is an incredible career path that we are lucky to have as an option in 2016 (even 10 years ago data science was so much less accessible!).
– Final thoughts?
For anyone who wants to learn “Data Science,” I want to emphasize that it’s such a fluid term. I would recommend starting out by analyzing a dataset in an area that is close to your heart, whether it’s real estate pricing data for your neighborhood, or educational curriculum & outcome data. For me, it just happened to be genomic and patient health data, because that was a problem space I felt comfortable playing around in. If you explore the myriad technologies out there to help you find patterns and predict things from that data you’re passionate about, before you know it you’ll wake up one day and realize that you’re actually a data scientist! It’s all a matter of learning how to use different tools and when to apply them, and this just comes with practice. I’ve also heard of many semester-long courses that promise to teach adults data science; although I haven’t done any of these myself, they do seem like a good way to get your foot in the door and learn some of the essential tools used by data scientists.
We hope you enjoyed reading Lauren’s thoughtful observations, and that you’ll continue to follow our Women in Machine Learning series. The work these outstanding individuals are doing is ground-breaking in every sense of the term, and their accomplishments deeply inform our own work in the field. The process of developing content for our Machine Learning Nanodegree program is regularly enriched by the contributions of experts like Lauren, and we hope you’ll consider enrolling, so that you too can become a part of this very exciting and dynamic world.
Lauren, on behalf of all at Udacity, thank you for taking the time to share your insights and experience with us!