5-minute interview Charles Frye

“I think there’s this kind of overhang between what's been achieved in research and what's been achieved at a demo level and what can actually be turned into products. And we're in this fun and exciting phase now of productizing and productionizing that research.”

This week we’re interviewing Charles Frye from Modal. Charles is an AI Engineer working largely with developer advocacy and platform experimentation.

Could you tell us about Modal and what you do there?

Modal is a serverless infrastructure platform for data, ML, and AI teams that need flexible infrastructure that makes it straightforward to scale up Python workloads and distribute them onto GPUs. It's a kind of infrastructure-as-code, or infrastructure-from-code, setup so that a data scientist or an ML researcher can actually own their infrastructure and deployments more easily. Modal was co-founded by Erik Bernhardsson and Akshat Bubna, who were building the kind of tooling that they had found themselves building internally, or had seen other people building internally, at large companies over and over again.

I was actually an early user of Modal. I found it because I was trying to make some deployable AI and ML demos about a year or two ago (just before ChatGPT dropped and the world changed). I was looking at how I could better deploy ML models, and I found myself needing infrastructure. I was struggling with eight different types of YAML files, writing Dockerfiles by hand, and permissions and VPCs on AWS. When I found Modal, it was like a breath of fresh air to be able to deploy stuff myself. That's also around the time that I found out about Hopsworks, because Jim Dowling was also teaching courses about how to deploy ML, which is what I was doing at the time. He was using Modal along with Hopsworks to make an actually maintainable, usable ML pipeline for your Python data scientist or Python ML researcher persona. I got promoted from user champion on social media to internal AI engineer. So I continue to use the platform for crazy AI projects, and I also do developer relations: I figure out what people are doing with the platform and help advocate for those uses internally.

How did you get into the ML field?

I entered the field of machine learning and AI about ten years ago when I came out to the University of California, Berkeley, for a PhD. I actually came in under the banner of neuroscience. I'd been doing neurobiology and psychology experiments, but when I started grad school, I was thinking that this ML thing looked like it was really taking off. AlexNet had had its big success just about a year before, and there had been continued successes for neural networks. I was like, well, neural networks have “neuro” in the name, and it looks like this is going to be a successful technology. So that's what I should study for my PhD, and that's how I got started, trying to bridge the gap between what computers and brains can do.

Why do you think ML is important?

From the beginning, there were all these things that brains could do that computers couldn't do. At the first layer it's just perception: machine perception was the first thing to tackle. Can a machine tell what's in an image? A machine is currently listening to this audio and turning it into text. That's this machine perception layer. And it seems obvious that making machines able to do just that is going to be beneficial, because we have to spend a lot of human labor on doing these things. It's effortful to transcribe a podcast like this manually, and if it's harder to make transcripts, that will ultimately make podcasts less accessible. It felt like there was this huge array of potential applications if we could get this (ML) to work. And then we did in fact get it to work over the last decade, and much faster than I thought we would. I think there's this kind of overhang between what's been achieved in research and what's been achieved at a demo level and what can actually be turned into products. And we're in this fun and exciting phase now of productizing and productionizing that research.

Do you see more GenAI and LLM related use cases at Modal?

Yes, absolutely. I think the goal is to make a very generic infrastructure platform that's just serverless computing that is actually good and usable. But the place where we've seen our initial product-market fit, where something has really taken off with a bunch of use cases, is in running inference for mid-sized generative models. So something on the scale of 1 billion to 13 billion parameters, or something like Whisper, which is smaller than the 13-billion-parameter models but still very useful for something like transcribing a podcast. In that case, it's actually a very parallelizable task. You can run 1 minute at a time on 100 workers, and that means you can finish 100 minutes of transcription in however long it takes you to transcribe 1 minute. We've seen a wide array of those things, as well as just classic ETL, the kind of thing that you might run to create a feature in a feature store, cron jobs and triggered functions, or just hosting a tiny little Python web server. All this stuff runs on the same basic infrastructure, and the same core tooling works for it.
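As a rough illustration of that fan-out pattern, here is a minimal sketch in plain Python. It uses the standard library's thread pool rather than Modal's own API, and `split_into_chunks`, `transcribe_chunk`, and `transcribe_parallel` are hypothetical names. The transcription step is a stub standing in for a real speech-to-text model such as Whisper.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(total_seconds, chunk_seconds=60):
    """Return (start, end) offsets in seconds covering the whole recording."""
    return [
        (start, min(start + chunk_seconds, total_seconds))
        for start in range(0, total_seconds, chunk_seconds)
    ]


def transcribe_chunk(chunk):
    """Hypothetical stub: a real version would run a model on this slice."""
    start, end = chunk
    return f"[text for {start}s-{end}s]"


def transcribe_parallel(total_seconds, max_workers=100):
    """Transcribe one-minute chunks concurrently and join them in order."""
    chunks = split_into_chunks(total_seconds)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in input order, so the pieces line up.
        pieces = list(pool.map(transcribe_chunk, chunks))
    return " ".join(pieces)


if __name__ == "__main__":
    # A 100-minute recording becomes 100 independent one-minute jobs.
    print(len(split_into_chunks(100 * 60)))
```

Because the chunks are independent, the total wall-clock time is roughly the time for one chunk plus scheduling overhead, provided there is one worker available per chunk. Cutting on fixed one-minute boundaries can split a word in half, so a real pipeline would likely cut on silences or overlap the chunks slightly.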

Do you have any interesting resources to recommend?

On the Modal side, I work on documentation and examples, and we pride ourselves on the quality of both. We maintain and monitor their quality the same way we do for the rest of our software. There's actually a really cool example there, which I wrote when I was still just a user, of how to make a customized Stable Diffusion that will make pictures of your pet. In a more recent example, I set up TensorRT-LLM, which is a notoriously hard tool to get working, but once you get it working it can double or triple the throughput of your LLM application. So those are two different little examples that show off what you can do with Modal.

At a higher level, for AI engineering right now, there aren't a ton of great books to recommend. Right now it's mostly people hacking, and you can find out about it on Twitter or Discord, but there are some events I would recommend. In the month of June, the MLOps Community is having an event on Quality Assurance for AI. I'll be giving a talk there. Then there is the AI Engineer World's Fair, put on by the people behind the Latent Space podcast. Both events, I think, are in San Francisco, and I'll do a little workshop there as well.

Listen to the full episode: