Hopsworks Product Capabilities

Integrated GPU Workload Orchestration

Maximize your GPU utilization for LLMs and deep learning, on-premises or in the cloud, with a scalable AI platform for model training and inference - built on Ray, PyTorch, KServe, Airflow, and more.

Hopsworks is an AI platform on Kubernetes that brings together your teams, data, and compute in a single system, enabling better sharing of resources and knowledge. Hopsworks scales to tens of thousands of users and GPUs, and to petabytes of data, stored both in object storage and on tiered NVMe storage for high-performance workloads. You don’t need to integrate a separate GPU scheduling platform with your data and AI platforms - you can develop and run your AI workloads, from feature engineering to training and inference, on Hopsworks. Whether it’s batch AI, real-time AI, or LLMs, Hopsworks has you covered.

Optimize your GPU utilization with advanced scheduling support.

  • GPU Resource Sharing
    Share GPUs between teams and developers to optimize GPU utilization. Hopsworks enforces queues, quotas and time-sharing policies so that multiple teams or users can use GPUs from a shared pool for only the time needed. The same GPUs can be used for model training or model inference, depending on need.
  • Priority Scheduling for GPUs
    GPU usage is typically a mix of experimentation and production jobs. Priority scheduling ensures that production jobs are allocated GPUs when they need them; when the GPUs are not being used by production jobs, teams can use them for lower-priority jobs and experimentation. You can even dedicate GPUs or GPU servers to teams if needed.
  • Data Processing, Training, and Inference Workloads on a Single Cluster
    From feature pipelines to model training and batch serving, Hopsworks handles full-stack ML operations with native support for GPU-heavy workflows — no more stitching together platforms. You only need a single cluster for your CPU and GPU workloads. Mix Spark/Flink workloads with training and inference workloads, all on a single cluster.
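The queueing and priority behavior described above can be sketched as a toy in-memory scheduler. This is an illustrative sketch only, not Hopsworks's actual scheduler (which runs on Kubernetes); all names and numbers here are hypothetical.

```python
import heapq
from dataclasses import dataclass, field

# Toy priority scheduler over a shared GPU pool: production jobs are
# admitted first, and experiments soak up whatever capacity is idle.
# Hypothetical sketch - not Hopsworks's implementation.

PRODUCTION, EXPERIMENT = 0, 1  # lower value = higher priority


@dataclass(order=True)
class Job:
    priority: int
    name: str = field(compare=False)
    gpus_needed: int = field(compare=False)


class GpuPool:
    def __init__(self, total_gpus: int):
        self.free = total_gpus
        self.queue: list[Job] = []  # min-heap: production jobs pop first
        self.running: list[Job] = []

    def submit(self, job: Job) -> None:
        heapq.heappush(self.queue, job)
        self._schedule()

    def release(self, job: Job) -> None:
        # A finished job returns its GPUs to the shared pool.
        self.running.remove(job)
        self.free += job.gpus_needed
        self._schedule()

    def _schedule(self) -> None:
        # Admit queued jobs in priority order while GPUs remain free.
        while self.queue and self.queue[0].gpus_needed <= self.free:
            job = heapq.heappop(self.queue)
            self.free -= job.gpus_needed
            self.running.append(job)


pool = GpuPool(total_gpus=8)
pool.submit(Job(EXPERIMENT, "hyperparam-sweep", gpus_needed=6))
pool.submit(Job(PRODUCTION, "nightly-retrain", gpus_needed=4))
# The experiment took the idle GPUs first; the production job waits for
# capacity, then is admitted ahead of any newly queued experiments.
pool.release(pool.running[0])
```

Time-sharing falls out of the `release` step: as soon as a job gives its GPUs back, the highest-priority waiting job is admitted, so the same devices alternate between training, inference, and experimentation as demand shifts.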

Observability & Control
Get real-time insights into GPU usage, job metrics, and system load. Set quotas and charge business units back for their consumption through usage reporting.
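A chargeback report of the kind described above boils down to aggregating GPU-hours per business unit. The records and unit names below are hypothetical, purely to illustrate the arithmetic:

```python
from collections import defaultdict

# Toy usage-reporting sketch (hypothetical data): aggregate GPU-hours
# per business unit so a shared GPU pool can be charged back to the
# teams that consumed it.
job_records = [
    {"unit": "risk", "gpus": 4, "hours": 2.5},
    {"unit": "fraud", "gpus": 2, "hours": 8.0},
    {"unit": "risk", "gpus": 1, "hours": 3.0},
]

gpu_hours: dict[str, float] = defaultdict(float)
for rec in job_records:
    gpu_hours[rec["unit"]] += rec["gpus"] * rec["hours"]

# risk: 4*2.5 + 1*3.0 = 13.0 GPU-hours; fraud: 2*8.0 = 16.0 GPU-hours
```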

Hopsworks is built to integrate seamlessly with your existing infrastructure and is already working with industry leaders like NVIDIA, Supermicro, and OVHcloud.

Other Hopsworks Capabilities