The Platform

Autonomy Cloud.
One rack for Everything.

AI inference, cloud workloads, and sovereign governance — converged on a single platform, in a single rack, under your legal jurisdiction. Not a product category that existed before. Built because it needed to exist.

The problem with conventional

Two estates. Two bills. Two points of failure.

The conventional approach to sovereign AI requires two completely separate infrastructures. A GPU cluster for AI inference — power-hungry, physically large, and dependent on specialist HPC engineering. And separately, a cloud platform for everything else — VMware or equivalent, with its own management stack, licensing costs, and compliance posture.

These two estates cannot share hardware. They can't share a control plane. They can't share a power budget. You pay for both, staff both, assure both, and get sovereignty guarantees from neither — because both rely on US-incorporated software stacks subject to CLOUD Act and FISA extraterritorial reach.

Autonomy Cloud was built on a single premise: that the separation is unnecessary, and the cost of maintaining it is borne entirely by the customer.

Conventional approach

GPU Estate: AI only
Cloud Estate: VMs only
10 AI racks + 6 cloud racks
~256 kW · 18 racks · £9.4m+ in software licences

Autonomy Cloud

Single Converged Platform: AI inference + cloud workloads + governance
23.1 kW max · 1 rack · £0 in software licences

Based on a comparison of 1 × 42RU Autonomy Rack, hosting 70 AI accelerators and providing capacity to run 4,769 mixed VM workloads, with a conventional server estate alternative running NVIDIA H200 GPUs plus a VMware-based hypervisor. 30 kW rack power. All vendor-recommended build-out instructions and blueprinted builds were followed; all power quoted is maximum load. Full methodology available on request.

The technology

Purpose-built sovereign hardware. Open source software stack.
Zero proprietary lock-in.

Every layer of the Autonomy Cloud stack has been chosen to eliminate dependency on US-domiciled proprietary software. No VMware. No Broadcom. No licensing agreements subject to foreign jurisdiction. The software is open source — binaries compiled from published source code, with API exposure for all key services.

Compute

Intelligence Nodes

Each compute node carries an AMD EPYC 64-core CPU and two Qualcomm AI100 Ultra programmable accelerator cards.
A full 42RU rack holds 35 nodes — 70 accelerator cards in total.
AI also needs cloud to deliver business services: in Autonomy, the same 42RU rack can concurrently support nearly 5,000 performance VMs.
HPC AI + Cloud in a single platform.

CPU per node: AMD EPYC 64-core
Accelerator cards: 2× QAI100 Ultra
RAM per node: 2,048 GB
Total vCPU pool (42RU): 12,318
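
As a quick sanity check, the rack-level totals follow directly from the per-node figures quoted above. The sketch below is illustrative only: the names are hypothetical, and the vCPU-per-core ratio is derived here rather than being a published specification.

```python
# Per-node and rack figures as quoted on this page; names are illustrative.
NODES_PER_RACK = 35
CORES_PER_NODE = 64
CARDS_PER_NODE = 2
RAM_GB_PER_NODE = 2048
VCPU_POOL = 12318


def rack_totals(nodes=NODES_PER_RACK):
    """Scale the per-node figures up to a whole rack."""
    return {
        "accelerator_cards": nodes * CARDS_PER_NODE,
        "ram_gb": nodes * RAM_GB_PER_NODE,
        "physical_cores": nodes * CORES_PER_NODE,
    }


totals = rack_totals()

# Derived ratio (not a published figure): vCPUs offered per physical core.
vcpu_per_core = VCPU_POOL / totals["physical_cores"]

print(totals)
print(f"vCPU per physical core: {vcpu_per_core:.2f}")
```

The card and RAM totals match the 70 cards and 71,680 GB quoted for the full rack elsewhere on this page.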

AI Accelerator

Qualcomm AI100 Ultra

Not a GPU. A programmable AI accelerator, purpose-built for inference at scale. Benchmarked independently by UC San Diego against NVIDIA A100, H100, and H200: superior energy efficiency across most tested LLMs. Supports LLM inference, computer vision, video analysis, signal processing, and model fine-tuning via an open SDK supported by the manufacturer.


TDP per card: 150 W
vs NVIDIA H100 TDP: 4.7× more efficient
Export classification: 4A090
Concurrent 70B models (42RU): 70
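
The 4.7× figure above can be reproduced as a simple ratio of board power. The sketch below assumes a 700 W TDP for the NVIDIA H100 (the SXM variant); that value is not stated on this page and is an assumption of the example. A TDP ratio is only a proxy, and says nothing on its own about throughput per watt.

```python
CARD_TDP_W = 150      # per-card TDP, from the list above
CARDS_PER_RACK = 70   # full 42RU rack, from this page
H100_TDP_W = 700      # ASSUMPTION: H100 SXM board power, not stated on this page


def tdp_ratio(reference_w, card_w):
    """How many times more power the reference part draws at TDP."""
    return reference_w / card_w


ratio = tdp_ratio(H100_TDP_W, CARD_TDP_W)

# Combined accelerator TDP for a full rack, in kW (cards only, not whole nodes).
rack_accel_tdp_kw = CARDS_PER_RACK * CARD_TDP_W / 1000

print(f"H100 TDP / card TDP: {ratio:.1f}x")
print(f"Accelerator TDP per full rack: {rack_accel_tdp_kw} kW")
```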

Storage & Network

Ceph / S3 / Full Mesh

Ceph distributed storage with S3-compatible buckets. Full software-defined networking. IP, DNS, and IAM can be operated internally or externalised to existing customer infrastructure.
No proprietary storage appliances means no vendor lock-in at the storage layer.
Your data is your business lifeblood. Based on K=1, M=2 erasure coding (EC), our storage achieves a measured 99.999% data resilience. Safe and performant.

Storage: 150 TB usable (42RU)
Object storage: S3-compatible
IAM: Internal or external
Kubernetes: Supported natively
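
To make the erasure-coding parameters concrete, here is a minimal sketch of what a K/M profile implies, assuming the usual convention that K is the number of data chunks and M the number of coding chunks. The raw-capacity figure it prints is derived from the 150 TB usable figure and is not a published specification; the function name is hypothetical.

```python
def ec_profile(k, m):
    """Basic properties of a k+m erasure-coded layout."""
    total = k + m
    return {
        "chunks_per_object": total,
        "tolerated_failures": m,        # chunks that can be lost without data loss
        "usable_fraction": k / total,   # share of raw capacity that is usable
        "raw_per_usable": total / k,    # raw TB needed per usable TB
    }


profile = ec_profile(k=1, m=2)

USABLE_TB = 150  # usable capacity quoted for a 42RU rack
raw_tb = USABLE_TB * profile["raw_per_usable"]  # derived, not a published figure

print(profile)
print(f"Raw capacity implied by {USABLE_TB} TB usable: {raw_tb:.0f} TB")
```

Under these assumptions, a K=1, M=2 layout survives the loss of any two chunks at the cost of three units of raw capacity per usable unit.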

Software stack — open source throughout

Hypervisor

KVM — open source. Zero licensing. Replaces VMware entirely.

Containers

LXC / Kubernetes — native container support out of the box.

Control Plane

OpenNebula — single unified management for VMs, containers, and AI workloads.

Compatibility

Hypervisor integration is available for organisations with legacy VMware assets.


We use Qualcomm AI100 Ultra accelerators as our primary AI card of choice, and in all EAR 4A090 markets. We can also support many other GPUs with a low power draw, on request or in our SKUs, to provide AI capabilities in markets constrained by US EAR export restrictions.

Where we operate

Not on the device.
Not in a hyperscale DC. The space in between.

The term "Edge AI" gets used loosely — sometimes to mean AI running on a smartphone or sensor (on-device inference), sometimes to mean anything that isn't a hyperscale data centre. Technically informed readers will rightly point out that these are wildly different things. They are.

True far-edge AI runs directly on the end device. That's not what Autonomy Cloud is. What we do is more accurately described as Near Edge, Fog AI, or MEC AI — depending on context.
We typically run sovereign AI inference on infrastructure at the network edge: in PoP rooms, exchange facilities, base station hubs, and local data centres. Close enough to deliver low latency. Powerful enough to run enterprise LLMs. Sovereign enough for regulated and sensitive data. Small enough to fit where conventional AI infrastructure can't go.

This intermediary layer — between the user's device and the hyperscale core — is where the most interesting and most consequential AI deployments happen. It's also where the conventional infrastructure model fails most completely: too power-hungry, too large, too jurisdictionally exposed, and too dependent on hyperscale connectivity to operate at the network edge.

When we say "Edge", we mean the infrastructure edge: the sovereign, distributed, near-edge layer where Autonomy Cloud is uniquely deployable. That's a real-world benefit, not marketing.

Far Edge / On-Device AI

Model inference runs on the end device — phone, sensor, vehicle.
Constrained by device battery and compute. Not what we do.

Near Edge / Fog AI / MEC AI ← Autonomy Cloud

Sovereign AI inference on local infrastructure at the network edge.
PoP rooms, exchange facilities, base station hubs.
Low latency, enterprise-grade compute, with full data sovereignty. This is where Autonomy Cloud excels.

Near Edge · Fog AI · MEC AI · Edge Server AI

Cloud Core / Hyperscale

Centralised hyperscale data centres.
High latency to mass consumer end users.
Foreign jurisdictions. Power-intensive. No sovereignty guarantees.
This is where conventional cloud and AI infrastructure lives.

In telecoms contexts, the near-edge layer is commonly referred to as Multi-access Edge Computing (MEC) — AI placed at the cellular network edge for real-time analytics without device-side compute or hyperscale latency.
Autonomy Cloud is fully MEC-deployable from its 8 RU, 4-card, AI-enabled entry-level configuration.

Deployment scale

PoP room to national platform. Same architecture throughout.

Autonomy Cloud scales from an 8 RU minimum entry point (deployable in any standard 19″ rack environment and scalable 1 RU at a time) through to full 42 RU racks and multi-rack federated deployments. The architecture is identical at every scale.
What changes is the platform capacity, not its complexity.

EDGE ENTRY

8 RU - 21RU Half-Rack

Deployable in PoP rooms, exchange facilities, mobile or vehicle-mounted configurations, and any constrained environment with a standard 19″ rack. Quiet enough to sit in an office corner.
At entry level (8RU:4GPU) it draws under 2 kW, suitable for a domestic power circuit, while a 21RU:30GPU Half-Rack consumes just 10.5 kW.


Telecoms edge · emergency services · remote government · island territories
Talk to us

FULL RACK

42 RU — reference model

The numbers are impressive:
35 compute nodes. 70 Qualcomm AI100 Ultra accelerator cards.
12,318 vCPU workload pool. 71,680 GB RAM. 150 TB usable storage.
Run AI inference alongside up to 4,700 mixed VMs. A maximum verified draw of 23.1 kW fits any standard data centre power envelope globally.
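
For a rough feel of how power scales across the three configurations, the sketch below divides the quoted platform power by the number of accelerator cards at each size. It uses only figures from this page; the entry-level value is an upper bound ("under 2 kW"), and the result is whole-platform power per card (CPUs, storage, and networking included), not accelerator power alone.

```python
# (name, rack units, accelerator cards, platform power in kW) as quoted on this page.
TIERS = [
    ("Edge entry", 8, 4, 2.0),    # "under 2 kW": treated here as an upper bound
    ("Half-Rack", 21, 30, 10.5),
    ("Full rack", 42, 70, 23.1),
]


def watts_per_card(power_kw, cards):
    """Whole-platform power divided by accelerator count, in watts."""
    return power_kw * 1000 / cards


per_card = {name: round(watts_per_card(kw, cards)) for name, _ru, cards, kw in TIERS}

for name, watts in per_card.items():
    print(f"{name}: ~{watts} W of platform power per accelerator card")
```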


Enterprise · regulated sectors · national digital infrastructure

FEDERATED MESH

Multi-rack · multi-site

Multiple Autonomy deployments (across sites, cities, or countries) federated into a mesh. Each cluster operates independently, including when connectivity to the rest of the mesh is lost. When connectivity is restored, clusters re-join automatically.
No manual intervention.
No central point of failure.
No blast radius between clusters.

National cloud · sovereign AI networks · telco sovereign infrastructure



Resilience architecture

ASSURED DECENTRALISED INFRASTRUCTURE (ADI)

ADI is the operational model that makes federated Autonomy deployments genuinely resilient rather than theoretically resilient.
Each cluster holds its own state, serves its own workloads, and makes its own routing decisions. A failure in one cluster (hardware, software, or physical) doesn't propagate. The failure domain is the single cluster.
The broader mesh continues operating. When connectivity to a failed cluster is restored, that cluster restarts, self-checks, then re-federates and synchronises automatically, without any manual intervention.
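
The recovery sequence described above can be pictured as a small state machine. The following is a conceptual sketch only, not the ADI implementation: every class, state, and method name is hypothetical, and the self-check is a stand-in callable.

```python
from enum import Enum, auto


class ClusterState(Enum):
    FEDERATED = auto()       # serving workloads as part of the mesh
    FAILED = auto()          # isolated failure domain
    RESTARTING = auto()
    SELF_CHECKING = auto()
    REFEDERATING = auto()
    SYNCHRONISING = auto()


class Cluster:
    """Hypothetical model of one cluster in a federated mesh."""

    def __init__(self, name):
        self.name = name
        self.state = ClusterState.FEDERATED
        self.history = [self.state]

    def _move(self, new_state):
        self.state = new_state
        self.history.append(new_state)

    def fail(self):
        # A failure changes only this cluster's state; nothing propagates.
        self._move(ClusterState.FAILED)

    def connectivity_restored(self, self_check=lambda: True):
        """Walk the recovery steps automatically; stay failed if the check fails."""
        if self.state is not ClusterState.FAILED:
            return
        self._move(ClusterState.RESTARTING)
        self._move(ClusterState.SELF_CHECKING)
        if not self_check():
            self._move(ClusterState.FAILED)
            return
        self._move(ClusterState.REFEDERATING)
        self._move(ClusterState.SYNCHRONISING)
        self._move(ClusterState.FEDERATED)


class Mesh:
    def __init__(self, clusters):
        self.clusters = {c.name: c for c in clusters}

    def serving(self):
        """Names of clusters currently operating as part of the mesh."""
        return sorted(n for n, c in self.clusters.items()
                      if c.state is ClusterState.FEDERATED)


mesh = Mesh([Cluster("site-a"), Cluster("site-b"), Cluster("site-c")])
mesh.clusters["site-b"].fail()
during_failure = mesh.serving()   # the rest of the mesh keeps operating
mesh.clusters["site-b"].connectivity_restored()
after_recovery = mesh.serving()
print(during_failure, after_recovery)
```

In this model the failure of one cluster never touches the state of the others, which is the property the text refers to as a contained failure domain.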

Go deeper

The architecture briefing for Autonomy goes further than this page.

If you want the full technical picture — component specifications, deployment configurations, assurance posture — we do that in conversation, not in a brochure.

Request a technical briefing