2026-03-31
Hopsworks 4.8: Trino Analytics, Apache Superset & Data Sources
Hopsworks 4.8 is now generally available. This release introduces native Trino and Apache Superset integration, dltHub integration for data ingestion, and a unified Data Sources API that replaces storage connectors.
Trino Query Engine
As of 4.8, Trino is a first-class component of the Hopsworks platform. The integration includes a Python API for submitting queries directly from notebooks and pipelines, an external load balancer for Trino endpoints, a Trino event listener for audit and observability, and Prometheus metrics and Grafana dashboards for Trino query performance. Together, these enable large-scale analytical queries across the Feature Store and connected data sources without leaving the Hopsworks environment.
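The release notes do not show the Python API itself, so the following is only a sketch of the query-from-a-notebook pattern. It uses the generic Python DB-API 2.0 (PEP 249) interface, which the open-source `trino` client also implements. The in-memory SQLite connection is a stand-in so the snippet runs anywhere, and the `transactions` table and the `run_query` helper are hypothetical names, not part of Hopsworks.

```python
import sqlite3
from typing import Any


def run_query(conn: Any, sql: str, params: tuple = ()) -> list[dict[str, Any]]:
    """Run a query on any DB-API 2.0 connection and return rows as dicts."""
    cur = conn.cursor()
    try:
        cur.execute(sql, params)
        # cursor.description holds one tuple per column; item 0 is the name.
        columns = [col[0] for col in cur.description]
        return [dict(zip(columns, row)) for row in cur.fetchall()]
    finally:
        cur.close()


# Stand-in connection (assumption): SQLite in memory instead of a Trino endpoint.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (customer_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?)",
    [(1, 10.0), (1, 5.5), (2, 7.0)],
)

rows = run_query(
    conn,
    "SELECT customer_id, SUM(amount) AS total FROM transactions "
    "GROUP BY customer_id ORDER BY customer_id",
)
print(rows)
```

Against a real deployment, the connection object would come from the Trino client or the Hopsworks Python API rather than SQLite; the 4.8 documentation is the place to check for the actual entry point.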
Apache Superset
Hopsworks 4.8 includes Apache Superset as an integrated business intelligence layer. Teams can create interactive dashboards and charts over Feature Store data, online store metrics, and connected data sources directly within the platform. Superset is deployed as a managed component with Hopsworks authentication, and the release ships with pre-built dashboards for the Hopsworks platform, including Grafana dashboard templates for GPU operator metrics.
dltHub Integration
Hopsworks 4.8 integrates dlt (data load tool), the open-source Python library for building declarative data pipelines. dltHub connects Hopsworks to hundreds of pre-built sources (CRM systems, SaaS platforms, REST APIs, and databases) and enables data to be loaded directly into the Feature Store without writing custom extraction code. Pipelines are defined as Python functions, run as Hopsworks Jobs, and benefit from automatic schema inference, incremental loading, and full lineage tracking within the platform.
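To make "schema inference" and "incremental loading" concrete, here is a minimal standard-library sketch of the two ideas. It is not dlt's API or implementation: the helpers `infer_schema` and `incremental`, the `updated_at` cursor field, and the sample records are all hypothetical and chosen for illustration.

```python
from typing import Any, Iterable, Iterator


def infer_schema(rows: Iterable[dict[str, Any]]) -> dict[str, str]:
    """Map each column to the Python type name of its first non-null value."""
    schema: dict[str, str] = {}
    for row in rows:
        for key, value in row.items():
            if value is not None and key not in schema:
                schema[key] = type(value).__name__
    return schema


def incremental(
    rows: Iterable[dict[str, Any]], cursor_field: str, last_value: Any
) -> Iterator[dict[str, Any]]:
    """Yield only rows whose cursor field is newer than the last loaded value."""
    for row in rows:
        if last_value is None or row[cursor_field] > last_value:
            yield row


# Hypothetical source records; ISO dates compare correctly as strings.
source = [
    {"id": 1, "email": "a@example.com", "updated_at": "2026-03-01"},
    {"id": 2, "email": None, "updated_at": "2026-03-15"},
    {"id": 3, "email": "c@example.com", "updated_at": "2026-03-30"},
]

# First run loads everything and remembers the highest cursor value.
first_batch = list(incremental(source, "updated_at", last_value=None))
state = max(row["updated_at"] for row in first_batch)

# A later run sees one new record and loads only that one.
source.append({"id": 4, "email": "d@example.com", "updated_at": "2026-03-31"})
second_batch = list(incremental(source, "updated_at", last_value=state))

print(infer_schema(first_batch))
print([row["id"] for row in second_batch])
```

A real pipeline would persist the cursor state between job runs and reconcile type changes over time; the sketch keeps both in memory to stay short.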
Data Sources API
Storage connectors have been unified and renamed to Data Sources in 4.8. The new API provides a consistent interface for reading from and writing to external storage regardless of the underlying system: S3, GCS, ADLS, Snowflake, BigQuery, Redshift, JDBC, and more. Each project is pre-configured with three default Data Sources covering the online feature store, the offline feature store, and training dataset storage. The previous direct storage connector API is deprecated but remains functional for backwards compatibility. MySQL is now supported as a SQL data source alongside PostgreSQL.
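The value of a single interface over many backends can be illustrated with a small sketch. Everything here is hypothetical and is not the Hopsworks Data Sources API: the `DataSource` protocol, the `InMemorySource` stand-in backend, and the `copy` helper only show how calling code stays the same regardless of which system sits behind the interface.

```python
from dataclasses import dataclass, field
from typing import Protocol


class DataSource(Protocol):
    """Hypothetical common interface; not the Hopsworks Data Sources API."""

    name: str

    def read(self, path: str) -> list[dict]: ...

    def write(self, path: str, rows: list[dict]) -> None: ...


@dataclass
class InMemorySource:
    """Stand-in backend; a real one would wrap S3, Snowflake, JDBC, and so on."""

    name: str
    _store: dict[str, list[dict]] = field(default_factory=dict)

    def read(self, path: str) -> list[dict]:
        # Return a copy so callers cannot mutate the stored rows.
        return list(self._store.get(path, []))

    def write(self, path: str, rows: list[dict]) -> None:
        self._store.setdefault(path, []).extend(rows)


def copy(src: DataSource, dst: DataSource, path: str) -> int:
    """Move rows between any two sources through the shared interface."""
    rows = src.read(path)
    dst.write(path, rows)
    return len(rows)


object_store = InMemorySource("object_store")
warehouse = InMemorySource("warehouse")
object_store.write("events", [{"id": 1}, {"id": 2}])

copied = copy(object_store, warehouse, "events")
print(copied, warehouse.read("events"))
```

Because `copy` depends only on the protocol, adding a new backend means adding one class with `read` and `write`, with no changes to the calling code.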
Bug Fixes
Featurestore
FSTORE-1965: A shadowed variable caused incorrect feature values in transformation pipelines.
FSTORE-1987: Feature group insert failed in certain schema configurations.
FSTORE-1994: Duplicate column error when inserting into a feature group with event_time as part of a composite primary key.
FSTORE-2008: Similarity search returned incorrect results due to a timezone handling issue.
Ancillary Services
HWORKS-2340: Model name was not validated before moving or copying model files to the registry, leading to silent failures.
HWORKS-2587: MySQL node-down alert fired incorrectly during HPA scaling events.
HWORKS-2630: Consul CoreDNS autoconf job was not always re-run on upgrade when enabled.
HWORKS-2527: Ivy cache directory was not explicitly set, causing dependency resolution failures in air-gapped environments.