Slurm Workload Manager: The go-to scheduler for HPC and AI workloads
Slurm Workload Manager is a cornerstone of high-performance computing (HPC) infrastructure, trusted by supercomputing centers worldwide for its scalability and flexibility. As AI workloads grow in size and complexity, Slurm is gaining traction among ML teams as well. In this article, we will look at why it remains relevant, how it supports GPU clusters and what to consider when using it in AI workflows.
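For readers who have not used Slurm before, here is a minimal sketch of what submitting a multi-GPU training job can look like; the script contents, resource counts and file names are illustrative assumptions, not details from the article.

```python
import subprocess
import tempfile

# Hypothetical Slurm batch script: 2 nodes with 8 GPUs each (--gres=gpu:8).
# The job name, script name and time limit are placeholder assumptions.
job_script = """#!/bin/bash
#SBATCH --job-name=llm-train
#SBATCH --nodes=2
#SBATCH --gres=gpu:8
#SBATCH --time=04:00:00

srun python train.py
"""

with tempfile.NamedTemporaryFile("w", suffix=".sbatch", delete=False) as f:
    f.write(job_script)
    path = f.name

# sbatch prints "Submitted batch job <id>" on success.
result = subprocess.run(["sbatch", path], capture_output=True, text=True, check=True)
print(result.stdout.strip())
```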
Agent 101: Launching production-grade agents at scale
To go from prototype to production, AI agents need more than just a good model. In this guide, we break down the four components that matter most: reliable LLMs, orchestration frameworks, evaluation tools, and memory systems. We cover how teams are using Nebius AI Studio with CrewAI, ADK, LangChain, and more to ship scalable, observability-friendly agent workflows, all powered by fast, cost-efficient inference.
From genome analysis to quantum chemistry: Nebius powers the next generation of biotech research with NVIDIA
As part of NVIDIA GTC Paris at VivaTech, Nebius announced deeper integration of the NVIDIA AI Enterprise software suite into its platform. This includes NVIDIA BioNeMo, a collection of tools, applications, generative AI solutions and pre-trained microservices (NVIDIA NIM) designed specifically for the biopharma sector.
What is object storage: Key differences from traditional storage explained
Learn the fundamentals of object storage, how it differs from traditional block storage and why it is becoming the go-to choice for modern data management. As data volumes explode and cloud deployments become the norm, traditional storage struggles to keep up with today’s largely unstructured data. That’s where object storage comes in — built for the cloud and designed to scale horizontally. This guide cuts through the jargon to show how object storage outperforms block storage where it counts.
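To make the access model concrete, here is a minimal sketch of reading and writing objects through an S3-compatible API with boto3; the endpoint URL, bucket and key are placeholder assumptions.

```python
import boto3

# Object storage is addressed by bucket + key over HTTP, not by file paths.
# The endpoint URL and bucket name below are placeholder assumptions.
s3 = boto3.client("s3", endpoint_url="https://storage.example.com")

# Write an object: the key is a flat identifier, not a directory path.
s3.put_object(Bucket="demo-bucket", Key="datasets/train.jsonl",
              Body=b'{"text": "hello"}')

# Read it back; metadata travels with the object.
obj = s3.get_object(Bucket="demo-bucket", Key="datasets/train.jsonl")
print(obj["Body"].read())
```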
Kubernetes: How to use it for AI workloads
Building and deploying AI systems at scale means juggling complex infrastructure — and that’s where Kubernetes shines. From managing GPU resources to scaling inference endpoints, Kubernetes brings structure and automation to the chaos of machine learning pipelines. In this article, we’ll break down how Kubernetes works, why it’s a natural fit for AI workloads and what best practices help keep things resilient, reproducible and production-ready.
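As a taste of what this looks like in practice, the sketch below requests a single GPU for a pod through the official Kubernetes Python client; it assumes a reachable cluster with the NVIDIA device plugin installed, and the image and namespace are placeholders.

```python
from kubernetes import client, config

# Load credentials from the local kubeconfig (assumes a configured cluster).
config.load_kube_config()

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="gpu-inference"),
    spec=client.V1PodSpec(
        restart_policy="Never",
        containers=[
            client.V1Container(
                name="server",
                image="my-registry/inference-server:latest",  # placeholder image
                # GPUs are scheduled like any other resource, via limits.
                resources=client.V1ResourceRequirements(
                    limits={"nvidia.com/gpu": "1"}
                ),
            )
        ],
    ),
)

client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```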
The role of compute cluster networking for AI training and inference
While earlier machine learning models could be trained on CPU servers with one or two GPUs, today’s generative AI models have billions of parameters — orders of magnitude more than their predecessors. Such models require terabytes of training data that can only be processed in parallel across multiple GPU servers. These servers work together in clusters to run the underlying computations that make the models work. This article explores GPU cluster networking technologies and their critical role in accelerating AI workloads.
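To illustrate why the network matters, here is a minimal sketch of the gradient all-reduce that data-parallel training performs on every step; it assumes a PyTorch job launched with torchrun and NCCL as the backend, and the tensor size is an arbitrary stand-in for a gradient buffer.

```python
import torch
import torch.distributed as dist

# This collective is what actually crosses the cluster network during
# data-parallel training. Assumes a launch via torchrun, which sets
# RANK/WORLD_SIZE/MASTER_ADDR environment variables for us.
dist.init_process_group(backend="nccl")
rank = dist.get_rank()
torch.cuda.set_device(rank % torch.cuda.device_count())

# Stand-in for a gradient tensor; 1 GiB of fp16 to make bandwidth visible.
grad = torch.ones(512 * 1024 * 1024, dtype=torch.float16, device="cuda")

dist.all_reduce(grad, op=dist.ReduceOp.SUM)  # sums the tensor across all ranks
dist.destroy_process_group()
```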
Introduction to model distillation: Efficient knowledge transfer for AI applications
Model distillation is a powerful technique in machine learning where a compact “student” model learns to replicate the behavior of a larger, more complex “teacher” model on a given task. In this tutorial, we demonstrate how to perform distillation by using Nebius AI Studio to create a grammar-correcting model.
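As background for the tutorial, the sketch below shows the classic soft-label distillation loss in PyTorch, where the student matches the teacher’s softened output distribution; the temperature and weighting values are illustrative defaults, not settings from the tutorial.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Blend KL divergence to the teacher's softened distribution with
    ordinary cross-entropy on the ground-truth labels. T and alpha are
    illustrative defaults, not values from the tutorial."""
    soft_targets = F.softmax(teacher_logits / T, dim=-1)
    log_student = F.log_softmax(student_logits / T, dim=-1)
    # T**2 rescales the soft-label gradients to match the hard-label term.
    kd = F.kl_div(log_student, soft_targets, reduction="batchmean") * (T ** 2)
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kd + (1 - alpha) * ce
```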
Incident post-mortem analysis: Outage of the S3 service in the eu-north1 region
A detailed analysis of the incident on May 5, 2025, that led to an outage of the S3 service in the eu-north1 region.
Serving Qwen3 models on Nebius AI Cloud by using SkyPilot and SGLang
Explore how to get Qwen3 running on Nebius AI Cloud with SkyPilot and SGLang. This setup lets you deploy both the massive 235B MoE model and the efficient 32B variant, combining high throughput, cost-effective scaling and robust multilingual support.
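Once the SkyPilot service is up, SGLang exposes an OpenAI-compatible endpoint, so querying the deployment can be as simple as the sketch below; the endpoint address and model name are placeholders for whatever your deployment reports.

```python
from openai import OpenAI

# SGLang serves an OpenAI-compatible API; the base URL here is a placeholder
# for the address your SkyPilot deployment reports (port 30000 is SGLang's default).
client = OpenAI(base_url="http://<your-endpoint>:30000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="Qwen/Qwen3-32B",  # or the 235B MoE variant, depending on the deployment
    messages=[{"role": "user", "content": "Summarize what SGLang does."}],
)
print(response.choices[0].message.content)
```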
Serving Llama 4 models on Nebius AI Cloud with SkyPilot and SGLang
Let’s walk through how to get Llama 4 running on Nebius AI Cloud (recently integrated with SkyPilot) by using SGLang as the serving framework. This combo provides high throughput, efficient memory usage and none of the typical deployment headaches.
Understanding the Model Context Protocol: Architecture
As LLM-powered agents become more complex, integrating them with tools, APIs, and private data sources remains a major challenge. Model Context Protocol (MCP) offers a clean, open standard for connecting language models to real-world systems through a modular, plug-and-play interface. In this article, we explore how MCP works.
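To give a feel for the plug-and-play interface, here is a minimal MCP server sketched with the official Python SDK; the server name and tool are invented for illustration.

```python
from mcp.server.fastmcp import FastMCP

# Minimal MCP server sketch using the official Python SDK.
mcp = FastMCP("weather-demo")

@mcp.tool()
def get_forecast(city: str) -> str:
    """Return a (stubbed) forecast for the given city."""
    return f"Sunny in {city}"  # a real server would call a weather API here

if __name__ == "__main__":
    mcp.run()  # speaks MCP over stdio, so any compatible client can connect
```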
What is Apache Spark and how can it help with LLMs?
Large language models (LLMs) rely on fast data processing and distributed computing, making the efficiency of data processing tools a critical factor. Apache Spark streamlines text data preparation, enables parallel processing of massive datasets and simplifies the development of scalable ML workflows. This article explores Spark’s architecture, its advantages for data preparation and solutions to common limitations when working with LLM-scale models.
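As a small illustration of the kind of preparation Spark handles well, the sketch below loads a raw text corpus, filters out short documents and deduplicates in parallel; the paths and length threshold are placeholder assumptions.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Sketch of a typical LLM data-prep step in Spark: load raw text, drop
# trivially short documents and remove exact duplicates. Paths and the
# length threshold are placeholder assumptions.
spark = SparkSession.builder.appName("llm-data-prep").getOrCreate()

docs = spark.read.text("s3a://my-bucket/raw-corpus/*.txt")  # one row per line

cleaned = (
    docs.withColumn("length", F.length("value"))
        .filter(F.col("length") > 200)   # drop very short documents
        .dropDuplicates(["value"])       # exact deduplication, done in parallel
)

cleaned.write.mode("overwrite").parquet("s3a://my-bucket/clean-corpus/")
```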