We describe the capabilities that need to be added to a Lakehouse to make it an AI Lakehouse that can support building and operating AI-enabled batch and real-time applications as well as LLM applications.
We present how Hopsworks leverages its time-travel capabilities for feature groups to support reproducible creation of training datasets using metadata.
In this article, we cover the added value of a feature store over a data warehouse when managing offline data for AI.
In this article, we introduce the snowflake schema data model for feature stores and show how it helps you include more features to make better predictions.
We present a unified software architecture for batch, real-time, and LLM AI systems that is based on a shared storage layer and a decomposition of machine learning pipelines.
In this post, we will look at how to put feature pipelines into production using Hopsworks.
This article covers the different aspects of Job Scheduling in Hopsworks, including how simple jobs can be scheduled through the Hopsworks UI by non-technical users.
The decision to build or buy a feature store has both strategic and technical components to consider, as it affects both cost and technical debt.
Redis will no longer be open source. Our own project, RonDB, will continue being open source in order to uphold the principles that keep the technology advancing.
In this article we describe the software factory approach to building and maintaining AI systems.
Hopsworks has added support for Delta Lake to accelerate our mission to build the Python-Native Data for AI platform.
A tutorial on the Hopsworks Feature Query Service, which efficiently queries and joins features from multiple platforms such as Snowflake, BigQuery and Hopsworks without any data duplication.
The rapid pace of development in AI has caused a lot of misconceptions surrounding ML and MLOps. In this post, we debunk a few common myths about MLOps, LLMs and machine learning in production.
A comparison of the online feature serving performance for Hopsworks and Feast feature stores, contrasting the approaches to building a feature store.
This blog explores MLOps principles, with a focus on versioning, and provides a practical example using Hopsworks for both data and model versioning.
A tutorial on how to use our latest Bring Your Own Kafka (BYOK) capability in Hopsworks. It allows you to connect your existing Kafka clusters to your Hopsworks cluster.
We explain a new framework for ML systems as three independent ML pipelines: feature pipelines, training pipelines, and inference pipelines, creating a unified MLOps architecture.
Unlock the power of Apache Airflow in the context of feature engineering. We will delve into building a feature pipeline using Airflow, focusing on two tasks: feature binning and aggregations.
An ML model’s ability to learn and read data patterns largely depends on feature quality. With frameworks such as FeatureTools, ML practitioners can automate the feature engineering process.
Discover the power of feature stores in modern machine learning systems and how they bridge the gap between model development and production.
In this article, we outline how we leveraged ArrowFlight with DuckDB to build a new service that massively improves the performance of Python clients reading from lakehouse data in the Feature Store.
Find out how to use Flink to compute real-time features and make them available to online models within seconds using Hopsworks.
Explore the power of feature engineering for categorical features using Pandas. Learn essential techniques for handling categorical variables, and creating new features.
In this blog, we discuss the state-of-the-art in data management and machine learning pipelines (within the wider field of MLOps) and present the first open-source feature store, Hopsworks.
In this blog, we present an end-to-end Git-based workflow to test and deploy feature engineering, model training and inference pipelines.
In this blog, we introduce the Hopsworks Connector API, which is used to mount a table in an external data source as an external feature group in Hopsworks.
Discover how you can easily make the journey from ML models to putting prediction services in production by choosing best-of-breed technologies.
Learn how the Hopsworks feature store APIs work and what it takes to go from a Pandas DataFrame to features used by models for both training and inference.
Hopsworks Serverless is the first serverless feature store for ML, allowing you to manage features and models seamlessly without worrying about scaling, configuration or management of servers.
Hopsworks is the first feature store to extend its support from the traditional Big Data platforms to the Pandas-sized data realm, where Python reigns supreme. A new Python API is also provided.
Hopsworks 3.0 is a new release focused on best-in-class Python support, Feature Views unifying Offline and Online read APIs to the Feature Store, Great Expectations support, KServe and a Model serving
Operational machine learning requires the offline and online testing of both features and models. In this article, we show you how to design, build, and run tests for features.
Learn how to connect Hopsworks to Snowflake, create features, and make them available both offline in Snowflake and online in Hopsworks.
Learn how to set up customized alerts in Hopsworks for different events that are triggered as part of the ingestion pipeline.
With support for Apache Hudi, the Hopsworks Feature Store offers lakehouse capabilities to improve automated feature pipelines and training pipelines (MLOps).
Read about how the Hopsworks Feature Store abstracts away the complexity of a dual database system, unifying feature access for online and batch applications.
Recently, one of Sweden’s largest banks trained generative adversarial neural networks (GANs) using NVIDIA GPUs as part of its fraud and money-laundering prevention strategy.
Seeing how Redis is a popular open-source feature store with features significantly similar to RonDB, we compared the innards of RonDB’s multithreading architecture to the commercial Redis products.
Learn how to design and ingest features, browse existing features, and create training datasets as DataFrames or as files on Azure Blob storage.
Connect the Hopsworks Feature Store to Amazon Redshift to transform your data into features to train models and make predictions.
Use JOINs for feature reuse to save on infrastructure and the number of feature pipelines needed to maintain models in production.
The feature store is a data warehouse of features for machine learning (ML). Architecturally, it differs from the traditional data warehouse in that it is a dual-database.
Integrate with third-party security standards and take advantage of our project-based multi-tenancy model to host data in a single shared cluster.
This blog introduces the feature store as a new element in automotive machine learning (ML) systems and as a new data science tool and process for building and deploying better machine learning models.
Learn how to integrate Kubeflow with Hopsworks and take advantage of its Feature Store and scale-out deep learning capabilities.
We have many conversations with companies and organizations who are deciding between building their own feature store and buying one. We thought we would share our experience of building one.
Integrate AWS SageMaker with Hopsworks to manage, discover and use features for creating training datasets and for serving features to operational models.
This blog introduces the Hopsworks Feature Store for Databricks, and how it can accelerate and govern your model development and operations on Databricks.
Introducing the feature store, a new data science tool for building and deploying better AI models in the gambling and casino business.
This blog introduces platforms and methods for continuous integration (CI), delivery (CD), and training (CT) with ML platforms, with details on how to do CI/CD MLOps with a Feature Store.
Deep learning is now the state-of-the-art technique for identifying financial transactions suspected of money laundering. It delivers fewer false positives with higher accuracy.