Hopsworks 4.0 is now generally available. With this release, Hopsworks can be installed and run inside a Kubernetes cluster, making all of the platform's services available across a wider range of cloud providers and on-premises, including air-gapped environments. The new multi-environment Docker images feature lets users customize the Docker images they use for feature engineering, training, and inference. On-Demand Transformation Functions give users further flexibility in how their data is transformed. Hopsworks 4.0 also enables A/B testing when serving models.
Kubernetes
With the Hopsworks 4.0 release, users can install and run an enterprise-ready machine learning operations platform on Kubernetes. This lets them take advantage of the operational efficiencies Kubernetes provides in resource utilization, ease of deployment, and monitoring. Hopsworks 4.0 is deployed and managed using Helm charts, which enable users to scale the platform up and down as needed across development, testing, and production environments.
Multi-environment Docker Images
As part of the Hopsworks 4.0 release, an engineering team using Hopsworks can customize the Docker images used for its feature, training, and inference pipelines, so that each job runs with an image built specifically for it. Hopsworks provides a set of default Docker images for these pipelines, which users can then customize. This makes machine learning pipelines easier to maintain and manage.
On-Demand Transformation Functions
A key component of building machine learning applications is the ability to query external data sources for real-time, up-to-the-minute information that can be incorporated into the application's model. Hopsworks 4.0 supports the creation and execution of on-demand transformation functions for querying such external data sources. On-Demand Transformation Functions expand the ability of Hopsworks users to transform their data as needed in feature, training, and inference pipelines.
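The idea can be illustrated with a minimal sketch in plain Python. It does not use the Hopsworks API: `amount_in_usd`, `stub_rate_lookup`, and the exchange rates are hypothetical names and values chosen for illustration, and the stub stands in for a call to a real external data source. The point is that a single function computes the feature from request-time information and can be reused unchanged for batch rows and for a single inference request.

```python
from typing import Callable, Dict, List


def stub_rate_lookup(currency: str) -> float:
    """Hypothetical stand-in for a query against an external data source."""
    rates = {"USD": 1.0, "EUR": 1.25}  # illustrative values, not real rates
    return rates[currency]


def amount_in_usd(amount: float, currency: str,
                  get_rate: Callable[[str], float]) -> float:
    """On-demand feature: computed when requested, using a freshly looked-up rate."""
    return amount * get_rate(currency)


def transform_batch(rows: List[Dict], get_rate: Callable[[str], float]) -> List[Dict]:
    """Apply the same function to historical rows (feature or training pipeline)."""
    return [
        {**row, "amount_usd": amount_in_usd(row["amount"], row["currency"], get_rate)}
        for row in rows
    ]


def transform_request(request: Dict, get_rate: Callable[[str], float]) -> Dict:
    """Apply the same function to one incoming request (inference pipeline)."""
    return transform_batch([request], get_rate)[0]
```

Because the lookup is passed in as a parameter, the external source can be swapped for a stub in tests without changing the transformation itself.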
Version Upgrades
A number of libraries used by Hopsworks have been upgraded as part of the 4.0 release. PyTorch has been upgraded to 2.2.2, TensorFlow to 2.16.1, and KServe to 0.13.0.
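To confirm that a Python environment matches these versions, a small check along the following lines may help. It is a sketch under assumptions: it assumes the libraries are installed under the PyPI distribution names `torch`, `tensorflow`, and `kserve`, and `check_versions` is a hypothetical helper, not part of Hopsworks.

```python
from importlib import metadata

# Versions named in the 4.0 release notes, keyed by assumed PyPI distribution name.
EXPECTED = {"torch": "2.2.2", "tensorflow": "2.16.1", "kserve": "0.13.0"}


def check_versions(expected, get_version=metadata.version):
    """Return {name: (expected, installed, matches)} for each library."""
    report = {}
    for name, want in expected.items():
        try:
            have = get_version(name)
        except metadata.PackageNotFoundError:
            have = None  # library is not installed in this environment
        # Ignore a local build suffix such as "+cu121" when comparing.
        base = have.split("+")[0] if have is not None else None
        report[name] = (want, have, base == want)
    return report


if __name__ == "__main__":
    for name, (want, have, ok) in check_versions(EXPECTED).items():
        status = "ok" if ok else "MISMATCH"
        print(f"{name}: expected {want}, installed {have} [{status}]")
```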
Major Release Breaking Changes
With the release of Hopsworks 4.0, a number of necessary breaking changes have been made to improve the overall developer experience. The list of breaking changes can be accessed here.
Bug Fixes
A number of bugs have been addressed with the release of 4.0. An exhaustive list can be provided upon request. Below are the main fixes, covering the Hopsworks Featurestore and Ancillary Services.
Featurestore
HFRONT-1242 - Load services token right before clicking the services logs
HFRONT-1239 - Remove Compute (Yarn) usage and quota from the Admin page of a Project
FSTORE-1484 - Make project ID acquisition consistent
FSTORE-1481 - Move basic things from hopsworks to hopsworks_common
FSTORE-1479 - Deduplicate project_api
FSTORE-1463 - Refactor async/aiomysql/sqlalchemy out of hsfs.util
FSTORE-1443 - Onlinefs re-balance taking long time when there are multiple topics
FSTORE-1105 - Query constructor for snowflake schema
FSTORE-812 - Enable Kafka topic auto creation
Ancillary Services
HWORKS-1619 - Create a separate API key for Airflow instead of reusing certs-operator key
HWORKS-1610 - Set termination grace period for python jobs and make it configurable
HWORKS-1589 - RonDB: Make ndbmtd memory options optional
HWORKS-1565 - Make sure the spark program exits after completion
HWORKS-1556 - Slow read times when using Hudi 0.12
HWORKS-1516 - Arrowflight logs not visible in opensearch dashboard
HWORKS-1514 - Expired JWT are not evicted from datagrid and then renewed
HWORKS-1402 - We should let users know which scope they are missing from the API key