Hopsworks 4.1 is now generally available. This release adds support for the vLLM inference server, the Python Ray framework for building and scaling distributed applications, and improved support for scheduling Hopsworks jobs running on Kubernetes.
vLLM Server
With the release of 4.1, Hopsworks integrates the popular vLLM library for LLM inference and serving. Through vLLM, Hopsworks exposes many of the important features vLLM provides, including efficient management of attention key and value memory, batching of incoming requests, quantization, tensor parallelism, pipeline parallelism, and much more.
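To give an intuition for the batching mentioned above, the pure-Python toy below groups queued prompts into fixed-size batches so that several requests share one model forward pass. This is only a conceptual stand-in: the real vLLM engine batches continuously at the token level and manages attention memory itself, and the function and names here are illustrative, not part of any vLLM or Hopsworks API.

```python
from collections import deque

def serve_batches(requests, max_batch_size=4):
    """Toy sketch of request batching in an inference server:
    queued prompts are grouped so the model can process several
    requests per forward pass. (Illustrative only; vLLM itself
    performs continuous, token-level batching.)"""
    queue = deque(requests)
    batches = []
    while queue:
        batch = [queue.popleft() for _ in range(min(max_batch_size, len(queue)))]
        batches.append(batch)
    return batches

# Eight queued prompts are served in two batches of four.
prompts = [f"prompt-{i}" for i in range(8)]
print(serve_batches(prompts))
```

Larger batches amortize the cost of each forward pass across more requests, which is one reason batching is central to serving throughput.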
Ray Framework
Hopsworks now supports Ray, the Python framework for building distributed applications, which provides the compute layer for parallel processing. Ray reduces the complexity of running distributed machine learning workflows end to end, including data preprocessing, distributed training, hyperparameter tuning, reinforcement learning, and model serving. With the Kubernetes support in Hopsworks, integrating a Ray cluster with the existing Hopsworks Kubernetes tools and infrastructure is greatly simplified.
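The core pattern Ray offers is fanning tasks out to remote workers and gathering their results (via `@ray.remote` and `ray.get`). As a rough standard-library analogy of that fan-out/gather shape, without Ray itself, the sketch below parallelizes a hypothetical per-shard preprocessing step; with Ray, the same pattern scales across a whole cluster rather than local threads.

```python
from concurrent.futures import ThreadPoolExecutor

def preprocess(shard):
    """Hypothetical preprocessing step applied to one data shard."""
    return sum(shard)

shards = [[1, 2], [3, 4], [5, 6]]

# Fan the shards out to parallel workers and gather the results in
# order, mirroring Ray's remote-task / ray.get pattern locally.
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(preprocess, shards))

print(results)  # [3, 7, 11]
```

In Ray the worker pool is the cluster itself, so the same workflow moves from a laptop to a Kubernetes-managed Ray cluster without restructuring the code.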
Job Scheduling
With the Hopsworks 4.1 release, Kubernetes labels and priority classes provide powerful tools for efficient and organized job scheduling in Hopsworks, especially in environments where multiple teams or workloads share the same cluster. These capabilities are invaluable for targeted scheduling and management: they allow teams to define custom job groups, manage workloads based on labels, and enforce specific scheduling policies.
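As a sketch of what this looks like at the Kubernetes level, the fragment below defines a PriorityClass and a pod labeled by team that references it. All names (`team-a-high`, the `team` and `workload` labels, the image) are illustrative assumptions, not values Hopsworks prescribes; in practice Hopsworks applies such labels and priority classes to the pods it launches for jobs.

```yaml
# An illustrative PriorityClass that team jobs can reference.
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: team-a-high
value: 100000
globalDefault: false
description: "High-priority batch jobs for team A"
---
# A pod labeled by team and workload type so scheduling policies
# and quotas can target it; priorityClassName picks up the class above.
apiVersion: v1
kind: Pod
metadata:
  name: feature-engineering-job
  labels:
    team: team-a
    workload: batch
spec:
  priorityClassName: team-a-high
  containers:
    - name: job
      image: example/job:latest
```

Under resource pressure, the scheduler favors pods with higher-priority classes, while the labels let operators group, select, and constrain each team's workloads.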
Bug Fixes
Featurestore
FSTORE-1572 - Add weakref.finalize to Connection
FSTORE-1496 - Return online table information to user using OnlineConfigDTO
FSTORE-1539 - Issue reading s3 bucket when using pyspark
Ancillary Services
HWORKS-1715 - Speed up environment cloning
HWORKS-1722 - Use Persistent Volume in remote shuffle servers
HWORKS-1793 - Add option to set custom JVM Options for namenode/datanode.