Creating Training Data in Hopsworks

Hopsworks Live Coding

February 24, 2022
10:00 am
10:30 am

Once data scientists have found the features needed for their model, they can create training data to train it. Before including features in your training data, there are several choices to make. Do you want to filter out some data (e.g., keep only users older than 18)? Do you want to attach online transformation functions to features (e.g., normalize a numerical feature)? How do you want to receive your training data: as a Pandas DataFrame, or as files in CSV, TFRecord, or Petastorm format? During this 60-minute workshop, we will walk through how to create and work with training data.

This session requires that you have created some feature groups in your Hopsworks Feature Store. It is enough to run the feature store demo project. If you haven’t done so, here is a video to catch you up before attending this live session.

What you will learn:

  • How to create training data from feature groups;
  • How to join features, filter out features, and apply transformation functions to selected features; 
  • How to re-create previously deleted training datasets.
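
To make the join, filter, and transform steps concrete ahead of the session, here is a minimal sketch in plain pandas. It does not use the Hopsworks API. The two small DataFrames stand in for feature groups, and the names `profiles`, `activity`, `build_training_data`, and the columns `user_id`, `age`, and `purchases` are hypothetical, chosen only for illustration. Min-max scaling is used as one possible way to normalize a numerical feature.

```python
import io

import pandas as pd

# Hypothetical stand-ins for two feature groups that share the key "user_id".
profiles = pd.DataFrame({"user_id": [1, 2, 3, 4], "age": [17, 25, 40, 31]})
activity = pd.DataFrame({"user_id": [1, 2, 3, 4], "purchases": [3, 10, 0, 5]})


def build_training_data(profiles, activity, min_age=18):
    """Join two feature tables, filter rows, and normalize one feature."""
    # Join: combine features from both tables on the shared key.
    joined = profiles.merge(activity, on="user_id", how="inner")

    # Filter: keep only users older than min_age.
    filtered = joined[joined["age"] > min_age].copy()

    # Transform: min-max scale "purchases" using the filtered rows only.
    lo = filtered["purchases"].min()
    hi = filtered["purchases"].max()
    span = hi - lo
    # If every value is identical, the range is zero, so fall back to 0.0.
    filtered["purchases"] = (filtered["purchases"] - lo) / span if span else 0.0

    return filtered.reset_index(drop=True)


# Receive the result as a DataFrame ...
train_df = build_training_data(profiles, activity)

# ... or serialize it to CSV (written to an in-memory buffer here).
buffer = io.StringIO()
train_df.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
```

In a feature store these steps are declared once and stored with the training dataset rather than re-written by hand. The sketch only shows what each step does to the data.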


Prerequisite: an account on, or access to, a Hopsworks cluster.




Feature Store

