The Emissions Problem No One’s Talking About
A recent report from Accenture titled Powering Sustainable AI laid out a pretty sobering forecast: by 2030, AI data centers could produce more CO2 emissions than some entire countries. The rapid rise of generative AI and the GPU infrastructure powering it is on track to create one of the biggest climate challenges in tech.
AI Is Booming - And It’s Not Slowing Down
AI isn’t slowing down. If anything, we’re still in the early innings. Models are getting bigger. Use cases are multiplying. Workloads are exploding. The question isn’t how to stop AI; it’s how to make it sustainable.
And for that, we need to have a serious conversation about waste.
The Hidden Cost of Isolation
When you peel back the layers of most AI infrastructure, what you often find is a whole lot of idle compute. Clusters built for just one team. Nodes spun up just to create basic isolation. GPUs sitting untouched while developers wait for test jobs to run. In a world where energy use is climbing and infrastructure has a real environmental cost, this isn’t just inefficient; it’s irresponsible.
Multi-Tenancy as a Design Principle
That’s why I believe multi-tenancy is about to become one of the most important design patterns in cloud infrastructure. Not just for saving money (though it helps). Not just for giving teams faster access (which it also does). But because it lets us do more with less.
Why vCluster Helps
At LoftLabs, we’ve been thinking about this problem for a while. With vCluster, we built a tool that lets platform teams run virtual clusters inside real Kubernetes clusters. It looks and feels like a real cluster to each tenant, but it doesn’t require provisioning an entirely separate control plane or extra nodes to create separation. That’s a big deal for performance. And it’s an even bigger deal for sustainability.
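To make that concrete, here’s roughly what the open source vcluster CLI workflow looks like. The commands below reflect common usage, though exact flags can differ between CLI versions:

```shell
# Create a virtual cluster named "team-a" inside the host cluster.
# It lives in a namespace of the existing cluster, so no separate
# control-plane machines or extra nodes are provisioned.
vcluster create team-a --namespace team-a

# Point kubectl at the virtual cluster; to the tenant it looks and
# feels like a full Kubernetes API server.
vcluster connect team-a

# Tear it down when the work is done, returning capacity to the pool.
vcluster delete team-a
```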
One Size Doesn’t Fit All
What makes this approach really powerful is the flexibility. Not every workload needs the same level of isolation. Not every team needs a fully separate environment. Some use cases might be fine with namespace-level separation. Others might need virtual clusters. And for the most sensitive GPU jobs, full-on node-level isolation might make sense.
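For the lighter end of that spectrum, plain Kubernetes namespaces plus a ResourceQuota already go a long way. A minimal sketch (the names are illustrative):

```yaml
# Namespace-level separation: one namespace per team, with a quota
# capping how much of the shared cluster (including GPUs) it can claim.
apiVersion: v1
kind: Namespace
metadata:
  name: ml-experiments        # illustrative team namespace
---
apiVersion: v1
kind: ResourceQuota
metadata:
  name: ml-experiments-quota
  namespace: ml-experiments
spec:
  hard:
    requests.cpu: "16"
    requests.memory: 64Gi
    requests.nvidia.com/gpu: "2"   # GPU quota via the device plugin resource name
```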
That’s why we built vCluster with multiple tenancy models: Shared Nodes, Dedicated Nodes, and Private Nodes, each designed to meet different security and isolation needs without sacrificing efficiency. Whether you're running lightweight experiments or high-stakes production inference, vCluster lets you make smart trade-offs between isolation and resource sharing. And since all these tenants run on the same underlying infrastructure, teams can maximize utilization across their Kubernetes stack without spinning up silos.
The point is, with vCluster and our tenancy model options, you can make those decisions intelligently based on actual risk and actual needs, not just habit or fear.
Better Utilization, Less Waste
This flexibility means organizations can avoid spinning up dozens or hundreds of extra clusters just to keep workloads apart. It means GPUs and other expensive resources can be shared more safely and efficiently. It means infrastructure can be used more intentionally, with less idle time and more purposeful compute.
vCluster also supports features like Sleep Mode, which can automatically scale down workloads and worker nodes when they’re not in use. This helps avoid the power draw and cost of always-on infrastructure, especially in dev and test environments where usage is bursty or intermittent.
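As a sketch, in recent vCluster versions Sleep Mode is configured in vcluster.yaml. Field names vary by version and edition, so treat this as illustrative rather than authoritative:

```yaml
# Illustrative vcluster.yaml snippet: put the virtual cluster to sleep
# after a period of inactivity so dev/test tenants don't hold capacity
# overnight; it wakes again on the next request.
sleepMode:
  enabled: true
  autoSleep:
    afterInactivity: 3h   # scale workloads down after 3 idle hours
```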
And it means AI doesn’t have to be at odds with sustainability goals.
Taking Action on Sustainability
The Accenture report also outlined four key strategies for powering sustainable AI:
1. Smarter silicon (like compute-in-memory hardware)
2. Cleaner data centers (through better location choices, dynamic scaling, and low-carbon energy)
3. Strategic AI use (right-sizing models and jobs to avoid unnecessary resource consumption)
4. Governance-as-code (embedding sustainability in AI governance frameworks)
vCluster directly contributes to progress on points 2 and 3. By enabling dynamic, on-demand provisioning of virtual clusters, it helps reduce unnecessary cluster sprawl and idle infrastructure, two key aspects of cleaner data center operations. And by letting teams spin up just enough infrastructure for the task at hand, vCluster encourages a more intentional and strategic use of compute.
The Road Ahead
We’re going to need a lot more innovation like this. But I’m convinced that smarter multi-tenancy, especially for Kubernetes-based AI platforms, is one of the best ways to cut waste and get more out of the infrastructure we already have.
In the end, it’s not about stopping AI. It’s about stopping the inefficiency that’s tagging along for the ride.