We will build an operational ML system to predict air quality in London. Instead of a single monolithic ML pipeline, we will build a more manageable system as three FTI pipelines: a Feature pipeline, a Training pipeline, and an Inference pipeline, connected by a feature store. The feature pipeline scrapes new data and backfills historical data (air quality observations and weather forecasts). The training pipeline produces a model from the air quality observations and the weather features. The inference pipeline takes weather forecasts and predicts air quality for London, and the predictions are visualized in a UI. The system will be hosted on free serverless services: Modal, Hugging Face Spaces, and Hopsworks. It will be a continually improving ML system that keeps collecting more data, makes better predictions over time, and provides a hindcast with insights into its historical performance.
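To make the FTI split concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the system built in the talk: the `FeatureStore` class is a hypothetical in-memory stand-in for a real feature store, the model is a one-variable least-squares fit of PM2.5 on wind speed, and all data values are made up.

```python
from dataclasses import dataclass, field


@dataclass
class FeatureStore:
    """Hypothetical in-memory stand-in for a real feature store."""
    weather: dict = field(default_factory=dict)      # date -> wind speed
    air_quality: dict = field(default_factory=dict)  # date -> observed PM2.5


def feature_pipeline(store, weather_rows, air_quality_rows):
    """Write new or historical rows into the store (the only writer)."""
    for date, wind_speed in weather_rows:
        store.weather[date] = wind_speed
    for date, pm25 in air_quality_rows:
        store.air_quality[date] = pm25


def training_pipeline(store):
    """Fit PM2.5 ~ slope * wind_speed + intercept on dates that have both."""
    dates = sorted(store.weather.keys() & store.air_quality.keys())
    xs = [store.weather[d] for d in dates]
    ys = [store.air_quality[d] for d in dates]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def inference_pipeline(store, model):
    """Predict PM2.5 for dates that have a forecast but no observation yet."""
    slope, intercept = model
    forecast_dates = sorted(store.weather.keys() - store.air_quality.keys())
    return {d: slope * store.weather[d] + intercept for d in forecast_dates}


# Made-up example data: three observed days and one forecast day.
store = FeatureStore()
feature_pipeline(
    store,
    weather_rows=[("d1", 2.0), ("d2", 4.0), ("d3", 6.0), ("d4", 8.0)],
    air_quality_rows=[("d1", 20.0), ("d2", 16.0), ("d3", 12.0)],
)
model = training_pipeline(store)
predictions = inference_pipeline(store, model)
print(predictions)
```

The three functions share no state except the store, so each one could run on its own schedule. Once an observation for a predicted date arrives, comparing it with the stored prediction gives the hindcast.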
More info on PyData London.




