Bare Metal GPU
Silicon, entirely yours.

Dedicated physical GPU servers with no virtualisation layer between your models and the hardware. Built for training runs that can't tolerate shared bandwidth, noisy neighbours, or hypervisor overhead.
Who this is for

Built for ML infra teams that have outgrown shared GPU VMs.

Four patterns where the hypervisor tax starts to hurt — and where dedicated hardware pays back the price difference.

Distributed multi-node training

Run 70B+ parameter models across dozens of nodes over InfiniBand. No shared-tenancy variance, no NCCL timeouts caused by another customer's traffic.
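
For a concrete sense of the setup, here is a minimal sketch of multi-node start-up on this class of hardware, assuming PyTorch with the NCCL backend and a torchrun launcher; the node count, GPU count, and the node0 rendezvous endpoint are illustrative, not a prescribed configuration:

    import os

    import torch
    import torch.distributed as dist

    # Launched on every node via torchrun, e.g. for 4 nodes x 8 GPUs:
    #   torchrun --nnodes=4 --nproc-per-node=8 \
    #            --rdzv-backend=c10d --rdzv-endpoint=node0:29500 train.py
    # The launcher sets RANK, LOCAL_RANK, and WORLD_SIZE for each process.

    def init_distributed() -> None:
        local_rank = int(os.environ["LOCAL_RANK"])
        torch.cuda.set_device(local_rank)
        # NCCL uses NVLink within a node and InfiniBand across nodes.
        dist.init_process_group(backend="nccl")

    if __name__ == "__main__":
        init_distributed()
        # Sanity check: all-reduce one value across every GPU in the job.
        t = torch.ones(1, device="cuda")
        dist.all_reduce(t)
        if dist.get_rank() == 0:
            print(f"world_size={dist.get_world_size()}, sum={t.item()}")
        dist.destroy_process_group()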

Sustained high-utilisation inference

Serve production models on reserved GPU compute. Your p95 latency stays flat regardless of what workloads run elsewhere in the cluster.

Checkpoint-heavy training runs

Local NVMe with direct PCIe lanes means a 70B checkpoint saves in minutes — not an hour of hypervisor and network I/O contention.
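
As a rough sketch of the pattern, assuming PyTorch; the /mnt/nvme mount point is a placeholder for wherever your local drives are mounted:

    import os

    import torch

    # Hypothetical local-NVMe mount point; substitute your server's path.
    CKPT_DIR = "/mnt/nvme/checkpoints"

    def save_checkpoint(model, optimizer, step: int) -> None:
        # Writing to local NVMe keeps the save off the network; replication
        # to durable object storage can happen asynchronously afterwards.
        os.makedirs(CKPT_DIR, exist_ok=True)
        path = os.path.join(CKPT_DIR, f"step_{step:08d}.pt")
        torch.save(
            {
                "step": step,
                "model": model.state_dict(),
                "optimizer": optimizer.state_dict(),
            },
            path,
        )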

HPC, simulation, and classical CUDA

CFD, genomics, rendering, scientific simulation — any CUDA or OpenCL workload where consistent clock cycles matter more than shared flexibility.

You own the stack, down to the boot loader

Bare Metal GPU gives AI teams full ownership of every layer — kernel, driver, CUDA toolkit, NCCL config, and network fabric. Nothing is abstracted; nothing is shared.

True single-tenant hardware

One physical server, one tenant. Every core, every GB of HBM, every watt of power draw is yours. No hypervisor on the hot path.

NVLink + InfiniBand fabric

Intra-server NVLink peer-to-peer, inter-node InfiniBand. Interconnect-bound training scales linearly instead of flattening out as node count grows.

Direct kernel & driver access

Install your CUDA version. Patch the kernel. Tune NCCL environment variables. The stack is yours from the boot loader up.
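
A hedged example of the kind of NCCL tuning this enables, set before the process group initialises; the interface and adapter names are illustrative placeholders for your own fabric, not recommended settings:

    import os

    # Illustrative values only; tune for your own fabric and topology.
    os.environ.setdefault("NCCL_DEBUG", "INFO")          # log topology and transport choices at init
    os.environ.setdefault("NCCL_SOCKET_IFNAME", "eth0")  # interface used for NCCL bootstrap traffic
    os.environ.setdefault("NCCL_IB_HCA", "mlx5")         # restrict NCCL to the InfiniBand adapters
    os.environ.setdefault("NCCL_NET_GDR_LEVEL", "PHB")   # permit GPUDirect RDMA up to the host bridge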

Predictable thermal headroom

Guaranteed cooling and power budget — cards hold sustained boost clocks, not the throttled numbers you get on oversubscribed shared racks.

Private VLAN + dedicated uplinks

Inter-node traffic never touches shared infrastructure. Segregate training, inference, and data-loader tiers by VLAN from day one.

Data residency by default

Pick the jurisdiction. Training data, model weights, and logs all stay inside the boundary your compliance team has signed off on.

IBEE AI Cloud

Mix bare metal with the rest of the stack.

Bare Metal clusters share VPC networking, storage namespaces, and access policies with every other IBEE AI Cloud service.

Dedicated GPU hardware, reserved for your models.

IBEE Bare Metal GPU is rolling out to early-access teams now. Register interest to get a hardware spec, a reserved-capacity quote, and a benchmarking window for your workload.

Frequently Asked Questions

How is Bare Metal GPU different from GPU Cloud?

GPU Cloud gives you virtualised GPUs — faster to spin up, billed by the hour, convenient for experimentation. Bare Metal gives you the physical server with no hypervisor in the path: higher sustained utilisation, lower jitter, and direct access to every driver-level tunable. Training runs that span many GPU-days and inference tiers with strict p99 targets usually benefit from bare metal.

When will Bare Metal GPU be available?

Bare Metal GPU is in active build-out as part of IBEE AI Cloud. Early access invitations are going out to teams with defined training workloads. Register interest via the contact form to join the waitlist.

What hardware configurations will be offered?

Launch configurations are being finalised around current NVIDIA Hopper-class and next-generation cards, paired with AMD EPYC or Intel Xeon Scalable CPUs, local NVMe, and InfiniBand interconnects. Cluster topologies will range from single-server to multi-node rack pods. Exact SKUs will be published ahead of GA.

Do I get root access to the hardware?

Yes. Full root access to the physical host. You choose the OS image, the CUDA version, the kernel patches, and the NCCL configuration. IBEE provisions the hardware and walks away — the operating stack is entirely yours.

Does Bare Metal GPU integrate with the rest of IBEE AI Cloud?

Yes. Bare Metal GPU clusters are being designed to share private VPC networking with GPU Cloud instances and mount the same AI Storage namespaces. You can mix fleet types and share datasets, monitoring, and access policies across the whole deployment.

Are flexible pricing options available?

Yes. Hourly on-demand for evaluation, multi-month reserved contracts with discounted rates for long-running training programmes, and burst capacity arrangements for teams with predictable scaling windows. Pricing specifics will be shared closer to GA.

Have more questions?

Contact Our Technical Team