When you create a feature group, you can select one or more features (columns) as the partition key, so that rows with the same partition key values are stored in the same directory. Partitioning can speed up queries through a feature store’s Offline API: when you pass partition key values as a filter, the query reads only the partitions that match those values. For example, in Hopsworks, the offline store uses Hive-style partitioning to store feature group partitions in directories. If you only want to create training data for users in the location “USA”, you can pass a filter to your feature view, and only data in the feature group’s subdirectory “USA” will be read, skipping the rest of the data in the feature group.
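The partition pruning described above can be illustrated with a minimal, standard-library-only sketch. This is not the Hopsworks API: the helper names `write_partitioned` and `read_partition`, the `user_id` and `total_spend` columns, and the use of CSV files (rather than a columnar format) are assumptions made purely for illustration. It shows only the general idea of a Hive-style layout, where each partition key value gets its own `key=value` directory and a filtered read opens just the matching directory.

```python
import csv
import tempfile
from pathlib import Path


def write_partitioned(rows, root, partition_key):
    """Write rows into Hive-style directories: <root>/<key>=<value>/part-0.csv."""
    root = Path(root)
    by_value = {}
    for row in rows:
        by_value.setdefault(row[partition_key], []).append(row)
    for value, group in by_value.items():
        part_dir = root / f"{partition_key}={value}"
        part_dir.mkdir(parents=True, exist_ok=True)
        # The partition value is encoded in the directory name,
        # so it is not repeated inside the data file.
        fields = [name for name in group[0] if name != partition_key]
        with open(part_dir / "part-0.csv", "w", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=fields, extrasaction="ignore")
            writer.writeheader()
            writer.writerows(group)


def read_partition(root, partition_key, value):
    """Read only the files under <root>/<key>=<value>/.

    Returns (rows, files_opened) so the caller can see how little was read.
    """
    part_dir = Path(root) / f"{partition_key}={value}"
    rows, files_opened = [], 0
    for path in sorted(part_dir.glob("*.csv")):
        files_opened += 1
        with open(path, newline="") as f:
            for row in csv.DictReader(f):
                row[partition_key] = value  # restore the key from the directory name
                rows.append(row)
    return rows, files_opened


# Hypothetical user features, partitioned by location.
users = [
    {"user_id": "1", "location": "USA", "total_spend": "120.5"},
    {"user_id": "2", "location": "Sweden", "total_spend": "80.0"},
    {"user_id": "3", "location": "USA", "total_spend": "42.0"},
]

root = tempfile.mkdtemp()
write_partitioned(users, root, "location")

# Only the location=USA directory is opened; location=Sweden is never touched.
usa_rows, files_opened = read_partition(root, "location", "USA")
```

In a real offline store the query engine does this pruning for you when the filter is on the partition key. Choosing a partition key that matches your most common filters is what makes the skipped directories add up to a meaningful saving.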
Data modeling for feature stores involves organizing your entities and features into feature groups. In data warehousing, dimensional modeling is a data modeling technique that identifies entities and then decomposes your data into “facts” and “dimensions” related to those entities.