The Hopsworks feature store can be configured to leverage the content of data warehouses to simplify the data science workflow. For data scientists, using data directly from a data warehouse presents three challenges. First, data in the warehouse is often updated, which makes it impossible to reproduce previously generated training data and previous experiments. Second, data warehouses often lack a historical view of the data, leaving data scientists the chore of building it. Finally, productionizing a model often requires building additional pipelines to make the same data available in a low-latency database for online serving.
In this talk, we will discuss how Hopsworks can be connected to existing cloud-native data warehouses such as Snowflake, Redshift and BigQuery. We will show how to use data warehouses as a source of data for building historical and reproducible training datasets. We will also show how to leverage the core functionalities of Hopsworks (Python-centric APIs, time travel, statistics, search and data validation) to build historical, clean and reproducible datasets for training and productionizing machine learning models.
In this webinar, Fabio Buso, VP of Engineering at Hopsworks, will present the new 3.0 release: how our new Feature View and write APIs have evolved to help data scientists bring their models to production, along with other state-of-the-art improvements.
In this webinar, we will introduce the challenges of building online feature pipelines, including integration with a feature store, online model serving infrastructure, and real-time feature computation using stream processing and on-demand features.