In Hopsworks, ML pipelines and feature engineering can be written entirely in Python, using the Hopsworks library together with any framework or library of your choice. You can work from an external environment (e.g., Google Colab or any other notebook environment) or from within a Hopsworks enterprise environment.
Dataframes offer intuitive data manipulation, making it easy to transform and clean data for feature engineering. With the rich set of APIs and functions available in Python, they provide a versatile and powerful way to explore and preprocess data.
A Python-native environment lets data scientists apply their existing Python skills to build robust operational ML pipelines and feature engineering workflows, using their preferred Python frameworks and libraries.
Hopsworks' flexible platform lets users bring their existing Python code, frameworks, and pipelines into the environment, enabling seamless integration and collaboration across teams.
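As a rough illustration of the dataframe-based feature engineering described above, here is a minimal sketch using pandas. The data, the column names, and the `engineer_features` helper are hypothetical and chosen only for illustration. The sketch uses no Hopsworks API; it shows only the kind of cleaning and aggregation code that could be brought into a feature pipeline.

```python
import pandas as pd

# Hypothetical raw transactions; column names are illustrative only.
raw = pd.DataFrame(
    {
        "customer_id": [1, 1, 2, 2, 3],
        "amount": [20.0, None, 15.0, 45.0, 30.0],
        "country": ["SE", "SE", "us", "US", None],
    }
)


def engineer_features(df: pd.DataFrame) -> pd.DataFrame:
    """Clean raw rows and aggregate them into per-customer features."""
    cleaned = df.assign(
        # Fill missing amounts with the column median.
        amount=df["amount"].fillna(df["amount"].median()),
        # Normalise country codes and mark missing ones explicitly.
        country=df["country"].str.upper().fillna("UNKNOWN"),
    )
    return (
        cleaned.groupby("customer_id")
        .agg(
            total_amount=("amount", "sum"),
            avg_amount=("amount", "mean"),
            n_transactions=("amount", "count"),
            country=("country", "first"),
        )
        .reset_index()
    )


features = engineer_features(raw)
print(features)
```

The result has one row per customer, which is the usual shape for features keyed by an entity id.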
Scripts and notebooks: Hopsworks’ Python-first approach is designed to improve usability and collaboration across all data teams.
Practical and versatile: with the vast number of libraries available, Python is the preferred language for data science, ML engineering, and MLOps.
Ease of use and community: leverage the vast ecosystem and communities of developers. Learn more about the community at NumFOCUS.
The Feature, Training, and Inference (FTI) pipeline pattern is a powerful concept for building scalable and maintainable ML pipelines. Hopsworks fully supports the FTI pattern, allowing users to move seamlessly from feature engineering through training to inference. Breaking the pipeline into three distinct stages lets users develop and optimize each stage separately and avoid costly mistakes. The Hopsworks feature store supports the pattern by providing a centralized place to store and manage features, which makes it easy to reuse and share them across pipelines.
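The separation of stages in the FTI pattern can be sketched in plain Python. This is a toy example under stated assumptions: `feature_store` and `model_registry` are ordinary dictionaries standing in for a real feature store and model registry, the three function names are hypothetical, and the "model" is a simple threshold. None of it is the Hopsworks API. The point is only that each stage talks to shared storage rather than to the other stages.

```python
from statistics import mean

# Stand-ins for shared storage: plain dicts, NOT the Hopsworks API.
feature_store: dict[int, dict[str, float]] = {}
model_registry: dict[str, float] = {}


def feature_pipeline(raw_rows: list[dict]) -> None:
    """Turn raw events into features and write them to the shared store."""
    amounts: dict[int, list[float]] = {}
    for row in raw_rows:
        amounts.setdefault(row["customer_id"], []).append(row["amount"])
    for customer_id, values in amounts.items():
        feature_store[customer_id] = {"avg_amount": mean(values)}


def training_pipeline(labels: dict[int, int]) -> None:
    """Read features from the store and fit a toy threshold model."""
    positives = [feature_store[c]["avg_amount"] for c, y in labels.items() if y == 1]
    negatives = [feature_store[c]["avg_amount"] for c, y in labels.items() if y == 0]
    # Toy "model": the midpoint between the two class means.
    model_registry["threshold"] = (mean(positives) + mean(negatives)) / 2


def inference_pipeline(customer_id: int) -> int:
    """Look up precomputed features and apply the trained model."""
    avg_amount = feature_store[customer_id]["avg_amount"]
    return int(avg_amount > model_registry["threshold"])


raw_rows = [
    {"customer_id": 1, "amount": 10.0},
    {"customer_id": 1, "amount": 20.0},
    {"customer_id": 2, "amount": 90.0},
    {"customer_id": 3, "amount": 70.0},
]
feature_pipeline(raw_rows)
training_pipeline({1: 0, 2: 1})
prediction = inference_pipeline(3)
print(prediction)
```

Because training and inference both read `avg_amount` from the same store, the feature logic is written once and reused, which is the property the pattern relies on.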
Hopsworks is the only serverless feature store and ML platform available for free without any cloud requirements, allowing everyone to create their own prediction services.
Hopsworks supports the Apache Flink framework for building streaming feature pipelines that can be used for real-time feature engineering.