
Kvax: Fast and easy-to-use FlashAttention implementation for JAX

Today, we’re open-sourcing Kvax, our FlashAttention implementation built on JAX. Designed for efficient training on long sequences, Kvax supports context parallelism and optimized computation of document masks. It outperforms many other FlashAttention implementations in long-context training with dense sequence packing, achieving state-of-the-art performance.
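To make the document-mask idea concrete: when several documents are densely packed into one sequence, attention must be restricted so tokens only attend within their own document. Kvax's actual API is not shown here; the following is a plain JAX sketch of that masking, with hypothetical helper names (`document_mask`, `masked_attention`) chosen for illustration.

```python
import jax
import jax.numpy as jnp
from jax import random

def document_mask(segment_ids):
    # segment_ids: (batch, seq) integer document ids of packed tokens.
    # Returns a boolean (batch, seq, seq) mask that is True where the
    # query and key positions belong to the same packed document.
    return segment_ids[:, :, None] == segment_ids[:, None, :]

def masked_attention(q, k, v, mask):
    # q, k, v: (batch, seq, heads, head_dim); plain scaled dot-product
    # attention with disallowed pairs set to -inf before the softmax.
    scale = q.shape[-1] ** -0.5
    scores = jnp.einsum("bqhd,bkhd->bhqk", q, k) * scale
    scores = jnp.where(mask[:, None, :, :], scores, -jnp.inf)
    weights = jax.nn.softmax(scores, axis=-1)
    return jnp.einsum("bhqk,bkhd->bqhd", weights, v)

key = random.PRNGKey(0)
B, S, H, D = 1, 4, 2, 8
kq, kk, kv = random.split(key, 3)
q = random.normal(kq, (B, S, H, D))
k = random.normal(kk, (B, S, H, D))
v = random.normal(kv, (B, S, H, D))

# Two documents of two tokens each, packed into one sequence.
segment_ids = jnp.array([[0, 0, 1, 1]])
out = masked_attention(q, k, v, document_mask(segment_ids))
```

Because the mask is block-diagonal over documents, perturbing the second document's keys and values leaves the first document's outputs unchanged; a FlashAttention kernel that is aware of this structure can skip the fully masked blocks entirely, which is where the speedup on packed long-context batches comes from.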