Lyft is a ridesharing company that spun out of a hackathon at Zimride. Unlike its spiritual predecessor, which focused on long-distance ridesharing trips for college campuses, Lyft is primarily targeted at shorter commutes within cities. As a passenger, like most folks who interact with the app, I like to think about Lyft’s product as a collection of services on a spectrum, as it was once described to me by someone we’re lucky to have on my team. In selecting a specific feature, I’m trading off between having greater comfort and control over my ride on one end with our Lux services and enjoying greater economy and a more communal experience on the other with shared rides like Line and Shuttle.

- Hao Yi Ong, Research Scientist at Lyft

In the old worldview of Lyft’s business platforms, models were simply extensions of simple business rules handcrafted by analysts. Their infrastructure, operations, and product had to be re-tailored for a world where decisions are made automatically by machine learning (ML) models. At the Applied AI Summit in Houston this November 29 - 30, Hao Yi will present his work on operationalising machine learning for Lyft's business platforms. The presentation will focus on the operational realities of scaling ML-based decisioning beyond the technical infrastructural challenges, in the context of fraud, enterprise, and customer support. Within these “business platforms,” Hao Yi and his team explore the unique challenges of working with messy data sources that suffer from feedback loops and adaptive adversaries, building ML-based workflows that improve support agent efficiency, and developing scalable infrastructure that can serve the needs of diverse teams. In advance of the summit, we spoke with Hao Yi to learn more about his current work at the company.

How are you using AI at Lyft?

As a result of Lyft’s spectrum of product features, we get many interesting problems: from the more traditional consumer growth and ad-targeting problems that most tech companies are familiar with, to supply and demand modeling and pricing decisions in the marketplace, to uncertainty estimation for locations and routing. The sheer diversity of problems, and the even larger variety of mathematical approaches (what we might call “AI”) for each of them, is what drew me to Lyft in the first place. Machine learning is a big part of our product features across the company. But there are many other problems that aren’t obviously amenable to machine learning, and we rely on other methods to tackle them.
It’s hard to give an exhaustive list of how Lyft uses AI. As a teaser, consider the dispatch problem that arises whenever you request a ride. In this case, we have what we call a “bipartite matching problem” to determine the best set of matches between passengers on the one hand and drivers on the other. But deciding which passenger-driver pair to prioritize depends on market efficiency and ride-satisfaction considerations, which are estimated using statistical machine learning models, as well as longer-term budgeting considerations.
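To make the matching idea concrete, here is a minimal, purely illustrative sketch of bipartite matching solved with the Hungarian algorithm via SciPy. The cost matrix is made up; Lyft’s actual dispatch system deals with streaming requests, market efficiency, and budgeting considerations that this toy omits.

```python
# Purely illustrative: a toy passenger-driver matching using the Hungarian
# algorithm. The cost matrix below is invented; in practice each entry would
# blend model-estimated quantities such as pickup ETA and ride satisfaction.
import numpy as np
from scipy.optimize import linear_sum_assignment

# cost[i, j]: estimated cost of pairing passenger i with driver j
cost = np.array([
    [4.0, 1.5, 9.0],  # passenger 0
    [2.0, 6.0, 3.5],  # passenger 1
    [7.0, 8.0, 2.0],  # passenger 2
])

rows, cols = linear_sum_assignment(cost)  # minimizes total assignment cost
for p, d in zip(rows, cols):
    print(f"match passenger {p} with driver {d} (cost {cost[p, d]:.1f})")
```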
It’s easy to read a few Medium blog posts and point out each and every small part of the business and product that we can improve using AI, such as ranking models that help our support teams by recommending the most likely issue type. What’s really important, however, is thinking about the overall AI framework that encompasses a larger product feature beyond a few automation tools, and getting the right people with the right backgrounds onboard to develop it. That’s how we’re striving to become an AI-first company at Lyft.

Give us an overview of your role as a research scientist at Lyft

It’s hard to pin down exactly what a research scientist is given the alphabet soup of titles across the industry: DS (data scientist), MLE (machine learning engineer), SWE ML (software engineer — machine learning), ARS (applied research scientist), RE (research engineer), BI (business intelligence) analyst… You get the point. Different companies have different expectations for the roles. Often, different managers and recruiters within the same company have different expectations of the same roles, and of course, these expectations change from team to team and as the company grows. At Lyft, research science can be described as applied research that generates product impact, with conference talks and papers being a happy side effect.

For the past two years at Lyft, I was a research scientist on what is called the Business Platform “Science vertical.” In this vertical, I developed machine learning-based solutions to problems faced by the Fraud, Support Experience (XP), Enterprise, and Driver Quality teams. As you might expect, a main part of my role consists of translating abstract descriptions of the various problems these teams face into clear business metrics and mathematical formulations. But that’s not all. Merely generating a classification model by itself isn’t ultimately going to solve the problem.
Success in my role includes driving and managing the project from a research science perspective. From the inception of the project, I’m thinking about who the relevant stakeholders are across different teams and organizations, what needs to be built, how we can work together, when they should be brought in, what are the risks and challenges, and how to set objectives and measure our progress against them.
For instance, in Q2 this year, I led the development of a machine learning-based workflow for our Business Operations team in the Philippines to solve a fraud problem faced by our enterprise partners on our Concierge product. I worked closely with the team’s manager to size staffing needs based on early offline evaluations of the model’s performance and developed a technical roadmap with engineering managers on the Fraud and Enterprise teams that gelled well with the machine learning model. The execution of the technical roadmap was planned in a way that aligned with how quickly the new agents could ramp up and start helping us solve the problem. At the same time, I worked with our corporate counsel to ensure that we were careful with our approach and the counter-fraud actions we took to curb the problem.
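As a toy illustration of that kind of sizing exercise, here is a back-of-the-envelope sketch of translating an offline model evaluation into a staffing estimate; every number below is invented for illustration and is not a Lyft figure.

```python
# Hypothetical staffing estimate derived from an offline model evaluation.
# All inputs are made-up placeholders, not Lyft data.

weekly_requests = 50_000            # Concierge requests per week (assumed)
flag_rate = 0.02                    # share the model flags for manual review (assumed)
precision_at_threshold = 0.6        # from offline evaluation (assumed)
reviews_per_agent_per_week = 400    # one agent's review throughput (assumed)

flagged = weekly_requests * flag_rate                    # cases routed to agents
agents_needed = flagged / reviews_per_agent_per_week     # headcount to keep up
true_fraud_reviewed = flagged * precision_at_threshold   # expected true positives

print(f"~{flagged:.0f} flagged cases/week -> ~{agents_needed:.1f} agents")
print(f"~{true_fraud_reviewed:.0f} likely-fraud cases reviewed per week")
```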
At the end of the day, the machine learning model, and even the infrastructure, is but one small piece in a bigger puzzle we had to put together. We wouldn’t have severely reduced the fraud rate and hit our Q2 OKR (objective and key result) without someone who understood how the machine learning components fit in with all these other moving parts and could drive the project.
How did you begin your work in machine learning?

In all honesty, I was thrust into it. I had some background, of course, acquired through the immensely popular CS 229 course taught by Andrew Ng at Stanford. As a reward for having done well in his class, a bunch of us were invited to the Baidu Research office in Sunnyvale and exposed to some real-world applications of deep learning. But beyond that, I had never really used machine learning before Lyft.
Prior to Lyft, my work centred on optimization and automating decision making under uncertainty for large problems. I was a graduate student working under a fellowship from NASA Ames on their drone traffic management program — very different from the machine learning projects that I focused on when I first stepped into industry. That said, there was a lot of underlying math background that helped me get up to speed really quickly with the machine learning literature and apply it in practice.
In any case, I’m glad I was recognized and given the opportunity; it helped accelerate my learning in this field and helped me grow as a research scientist in Lyft’s context. You can read more about it in a pretty substantive blog post I’ve written about my journey. In brief, we had really immature machine learning infrastructure when I first joined Lyft. But rather than bogging me down, it helped me learn really quickly. Being exposed to many of the issues early on forced our team to constantly think about improving it just to keep our heads above water. To borrow a cliche from a certain technical vocational institute, I took a sip from the firehose.

How are you using AI for good in your work?

Ultimately, my perspective on Lyft is that it’s about the drivers that we serve and the passengers that they, in turn, serve. The impact isn’t always obvious and direct. But what drives my work is also my team’s goal of improving the driver earning experience on our platform. (I’m currently working on Personal Power Zones on the Driver Positioning team.) That doesn’t simply mean improving their earnings but also improving the process by which they make their earnings. For instance, is Prime Time really the best experience for them? If not, what is better?
I’ve also worked a little on establishing best practices with regard to algorithmic fairness in internal settings, though it’s unfortunately a sensitive topic to bring up in a public context. What I can say is that at Lyft, we’re committed to making sure that our service is accessible and effective for everyone, no matter who they are. This line of work is not dissimilar from broader Lyft initiatives such as our recent Ride to Vote, which provides ride promotions aimed at reducing obstacles to voting for underserved communities.
What’s next for you?

I’m currently focused on our Personal Power Zones product on the Driver Positioning team. A big part of my work is developing a comprehensive framework that gracefully handles the system-level interactions between the optimization workflow and the larger systems around it. This framework is particularly important to our team’s product, where we know hidden technical debt can rapidly accumulate not just at the infrastructural level but also at the algorithmic level.
In this project, I’m trying to build a forward-looking framework that not only solves the immediate product problem but also sets us up for success when we try to incorporate various product features, both internal and external. While there are many machine learning components here (forecasting, driver preference models, etc.), they’re but a small piece in a much larger optimization framework that has to encompass a variety of product considerations. For instance, how do we let general managers in each city tell our algorithm that it’s okay to spend more on driver bonuses for big events? How do we let them tell our algorithm where these events are going to happen?
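As a purely hypothetical sketch of what such inputs might look like, imagine general managers declaring events as structured budget overrides that an incentive optimizer reads. The names, types, and numbers below are invented for illustration and are not Lyft’s actual system.

```python
# Hypothetical representation of GM-declared events as budget overrides for a
# driver-bonus optimizer. Everything here is illustrative, not Lyft's system.
from dataclasses import dataclass
from datetime import datetime
from typing import List

@dataclass
class EventBudgetOverride:
    region_id: str           # e.g., a zone or geohash identifier
    start: datetime          # when the event window opens
    end: datetime            # when the event window closes
    extra_budget_usd: float  # additional bonus spend the GM allows

def bonus_budget(region_id: str, when: datetime, base_budget_usd: float,
                 overrides: List[EventBudgetOverride]) -> float:
    """Base bonus budget plus any GM-declared overrides in effect."""
    extra = sum(o.extra_budget_usd for o in overrides
                if o.region_id == region_id and o.start <= when <= o.end)
    return base_budget_usd + extra

# Example: a big game downtown on a Saturday evening.
overrides = [
    EventBudgetOverride("downtown", datetime(2018, 11, 10, 17, 0),
                        datetime(2018, 11, 10, 23, 0), extra_budget_usd=5000.0),
]
print(bonus_budget("downtown", datetime(2018, 11, 10, 19, 30), 2000.0, overrides))
```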
It’s an exciting place to be, and I hope to make a big impact in improving our product for drivers.

Keen to learn more from Hao Yi? Join us at the Applied AI Summit in Houston this November 29 - 30. Additional confirmed speakers include AI experts from Coursera, Wayfair, Reddit, Netflix, PipelineAI & more.