Hopsworks Feature Store with Spark (EMR, Databricks, Cloudera, HDInsight, DataProc)

If your feature pipelines take large volumes of data as input, Spark can compute features as DataFrames and write them directly to the Hopsworks Feature Store. Spark can write features in both batch and streaming mode.
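The page describes this flow but shows no code. Below is a minimal, self-contained sketch of the batch pattern, compute keyed feature rows from raw events, then upsert them under a primary key, using plain Python dicts as stand-ins for Spark DataFrames and the feature group (in the real Hopsworks client the final step would be a call along the lines of `feature_group.insert(df)`; all names and data here are illustrative).

```python
# Batch stand-in: aggregate raw events into keyed feature rows, then
# upsert them into a store keyed by primary key. Dicts replace Spark
# DataFrames and the Hopsworks feature group (both hypothetical here).
from collections import defaultdict


def compute_avg_amount(events):
    """Aggregate raw transaction events into per-account average amounts."""
    totals = defaultdict(lambda: [0.0, 0])
    for e in events:
        t = totals[e["account_id"]]
        t[0] += e["amount"]
        t[1] += 1
    return {key: s / n for key, (s, n) in totals.items()}


def upsert_features(store, feature_group, keyed_values):
    """Upsert keyed feature values, mimicking insert-by-primary-key semantics."""
    fg = store.setdefault(feature_group, {})
    fg.update(keyed_values)
    return fg


events = [
    {"account_id": "a1", "amount": 10.0},
    {"account_id": "a1", "amount": 30.0},
    {"account_id": "a2", "amount": 5.0},
]
store = {}
features = compute_avg_amount(events)
upsert_features(store, "account_avg_amount", features)
# store["account_avg_amount"] == {"a1": 20.0, "a2": 5.0}
```

Re-running the pipeline with fresh events overwrites rows for existing keys and adds rows for new ones, which is the upsert behavior a feature store write needs.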

Hopsworks Integrations

Spark can be used to implement both batch and streaming feature pipelines, as well as batch and streaming inference pipelines, for features written to the Hopsworks Feature Store.
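For the streaming case, features are maintained incrementally as events arrive rather than recomputed over a full batch. The sketch below keeps running per-key state in plain Python as a stand-in for Spark Structured Streaming state; in a real pipeline each updated value would be streamed into a Hopsworks feature group (the class, keys, and values are illustrative).

```python
# Streaming stand-in: per-key running averages updated one event at a time,
# mimicking stateful aggregation in a streaming feature pipeline.
class RunningAvg:
    """Maintains a running average per key, updated incrementally."""

    def __init__(self):
        self._state = {}  # key -> (running_sum, count)

    def update(self, key, value):
        s, n = self._state.get(key, (0.0, 0))
        s, n = s + value, n + 1
        self._state[key] = (s, n)
        return s / n  # the freshest feature value for this key


# Simulated event stream; later events for a key refine its feature value.
stream = [("a1", 10.0), ("a2", 5.0), ("a1", 30.0)]
avg = RunningAvg()
latest = {k: avg.update(k, v) for k, v in stream}
# latest == {"a1": 20.0, "a2": 5.0}
```

The key design difference from the batch sketch is that state persists across events, so each new event costs O(1) work instead of a full re-aggregation.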

Other integrations

Parquet (Athena, S3, ADLS, GCS)
AWS SageMaker
Dagster

© Hopsworks 2024. All rights reserved. Various trademarks held by their respective owners.
