Early Access
Alloy

One gateway. Every provider.

Centralised LLM control plane for engineering teams using multiple providers. Manage API keys, enforce budgets, track costs and audit usage across OpenAI, Anthropic, Azure OpenAI, AWS Bedrock and 12 more. Self-hosted single binary with an embedded admin console and the Prism self-service portal.

16 providers · sub-2ms overhead · sk-alloy-... keys

Everything Your Team Needs

Alloy puts your organisation in control of every LLM interaction: cost, access, compliance and performance.

Centralised API Key Lifecycle

Replace per-developer provider keys with sk-alloy-... prefixed keys. Revoke, rotate or audit any key from a single console. No more shared credentials buried in CI pipelines.

Budget Enforcement

Set per-key and per-team budgets with hard or soft limits. Automatic rate limiting kicks in before overspend occurs. Daily, monthly and total budget windows supported.
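To make the difference between hard and soft limits concrete, here is a minimal Python sketch. It is an illustration only, not Alloy's implementation or API: the names `Budget` and `check_spend` and the "allow" / "warn" / "reject" outcomes are hypothetical, and it assumes a hard limit rejects a request that would overspend while a soft limit lets it through with a warning.

```python
from dataclasses import dataclass


@dataclass
class Budget:
    """Hypothetical budget for one key or team over one window."""
    limit_usd: float
    window: str            # "daily", "monthly" or "total"
    hard: bool             # True: reject on overspend; False: allow but warn
    spent_usd: float = 0.0


def check_spend(budget: Budget, estimated_cost_usd: float) -> str:
    """Decide what to do with a request before it is sent to a provider."""
    projected = budget.spent_usd + estimated_cost_usd
    if projected <= budget.limit_usd:
        return "allow"
    # The request would take the key or team over its limit.
    return "reject" if budget.hard else "warn"


def record_spend(budget: Budget, actual_cost_usd: float) -> None:
    """Add the real cost of a completed request to the running total."""
    budget.spent_usd += actual_cost_usd
```

Checking the projected spend before the request goes out is what lets a limit act before overspend occurs rather than after it.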

16 Provider Adapters

OpenAI, Anthropic, Azure OpenAI, AWS Bedrock, Google Vertex, Gemini, Ollama, OpenRouter, DeepSeek, Groq, Perplexity, Mistral, Cohere, HuggingFace, Replicate and any OpenAI-compatible endpoint. Each provider holds a pool of API keys, so Alloy fails over when a key is rate-limited and rotates keys with zero downtime.
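The idea of a key pool with failover can be sketched in a few lines of Python. This is a hedged illustration, not Alloy's code: the `KeyPool` class, its method names and the fixed cooldown are all assumptions made for the example.

```python
import time


class KeyPool:
    """Hypothetical pool of provider API keys with rate-limit failover."""

    def __init__(self, keys, cooldown_seconds=60.0, clock=time.monotonic):
        self._keys = list(keys)
        self._cooldown = cooldown_seconds
        self._clock = clock
        self._blocked_until = {}  # key -> time at which it may be used again

    def acquire(self):
        """Return the first key that is not cooling down, or None."""
        now = self._clock()
        for key in self._keys:
            if key not in self._blocked_until or self._blocked_until[key] <= now:
                return key
        return None

    def mark_rate_limited(self, key):
        """Skip this key until the cooldown has elapsed."""
        self._blocked_until[key] = self._clock() + self._cooldown

    def rotate(self, old_key, new_key):
        """Swap a key in place so the pool is never empty during rotation."""
        index = self._keys.index(old_key)
        self._keys[index] = new_key
        self._blocked_until.pop(old_key, None)
```

A caller would `acquire()` a key per request and call `mark_rate_limited()` when the provider answers with a rate-limit error, so the next request moves to another key in the pool.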

Virtual Models & Routing

Define virtual model names that map to one or more provider deployments. Route by priority order, round-robin distribution or percentage weight. Circuit breakers exclude failing deployments, so you can add fallbacks or swap providers without changing client code.
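The three routing strategies can be illustrated with a short Python sketch. Everything here is hypothetical (the `Deployment` and `VirtualModel` names, the `healthy` flag standing in for a circuit breaker, the strategy strings) and is meant only to show the idea, not how Alloy is built or configured.

```python
import itertools
import random
from dataclasses import dataclass


@dataclass
class Deployment:
    name: str
    priority: int = 0      # lower value is tried first
    weight: float = 1.0    # relative share for weighted routing
    healthy: bool = True   # False while a circuit breaker is open


class VirtualModel:
    """Hypothetical virtual model that maps to several deployments."""

    def __init__(self, deployments, strategy="priority", rng=None):
        self.deployments = list(deployments)
        self.strategy = strategy
        self._rng = rng or random.Random()
        self._counter = itertools.count()

    def pick(self):
        # Failing deployments are excluded before any strategy is applied.
        candidates = [d for d in self.deployments if d.healthy]
        if not candidates:
            raise RuntimeError("no healthy deployment available")
        if self.strategy == "priority":
            return min(candidates, key=lambda d: d.priority)
        if self.strategy == "round_robin":
            return candidates[next(self._counter) % len(candidates)]
        if self.strategy == "weighted":
            weights = [d.weight for d in candidates]
            return self._rng.choices(candidates, weights=weights, k=1)[0]
        raise ValueError(f"unknown strategy: {self.strategy}")
```

Because clients only ever name the virtual model, adding a fallback or swapping a provider is a change to the deployment list, not to client code.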

Prism Self-Service Portal

Every instance ships with Prism, a browser portal where team members create and rotate their own API keys, watch their spend and token usage, and see which models they can reach. Served straight from the binary, with no admin in the loop.

Enterprise Identity

OIDC single sign-on for teams on Professional and Enterprise plans. Role-based access, group membership and fine-grained control over who can reach which models.

Guardrails & Audit

Regex content guardrails detect PII patterns such as card numbers, SSNs and email addresses before requests reach providers, with configurable block, redact or log actions. A full audit log records every administrative change, and circuit breakers protect against provider outages.
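As a rough illustration of regex guardrails with block, redact and log actions, here is a small Python sketch. The patterns are deliberately simplified, and the `apply_guardrail` function and its return shape are hypothetical; they are not Alloy's rules or API.

```python
import re

# Simplified example patterns; production rules would need to be stricter.
PATTERNS = {
    "card": re.compile(r"\b(?:\d{4}[ -]?){3}\d{4}\b"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
}

ACTIONS = {"block", "redact", "log"}


def apply_guardrail(text, action="redact"):
    """Return (allowed, text_to_forward, names_of_matched_patterns)."""
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action}")
    matched = [name for name, pattern in PATTERNS.items() if pattern.search(text)]
    if not matched or action == "log":
        # Nothing found, or log-only: forward the request unchanged.
        return True, text, matched
    if action == "block":
        return False, "", matched
    # action == "redact": replace each match with a placeholder.
    for name in matched:
        text = PATTERNS[name].sub(f"[{name.upper()}]", text)
    return True, text, matched
```

With the redact action, a prompt containing an email address would be forwarded with `[EMAIL]` in its place, while the block action would stop the request before it reaches the provider.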

Plans

Start free with Community. Upgrade when your team needs identity integration, hosted infrastructure or enterprise-grade SLAs.

Community

Free
Container deployment
  • Single container deployment
  • All 16 provider adapters
  • sk-alloy-... key management
  • Provider key pools with failover
  • Basic budget controls
  • Admin console included
  • Community support
Learn More

Professional

Contact
Self-hosted Postgres + OIDC
  • Everything in Community
  • Postgres persistent storage
  • OIDC single sign-on
  • Prism self-service portal
  • Advanced budget policies
  • Flux CLI for CI/CD
  • Priority support
Join Early Access

Professional Hosted

Contact
TensorFoundry managed SaaS
  • Everything in Professional
  • TensorFoundry managed
  • Automatic updates
  • SLA uptime guarantee
  • Dedicated support channel
Learn More

Enterprise

Contact
SCIM, dedicated infra
  • Everything in Professional Hosted
  • IP allowlist enforcement
  • Dedicated infrastructure
  • Custom SLA
  • Architecture review
  • Dedicated account team
Learn More

How Alloy Fits the Stack

Alloy manages access to external LLM providers. Other TensorFoundry products run or route to local inference.

Alloy vs Olla

Enterprise vs Open Source

Olla is open source, local-first and proxies across local backends. Alloy is the enterprise control plane: managed API keys, team budgets, OIDC identity and audit logging across cloud providers.

Alloy vs FoundryOS

API Providers vs Node Fleet

Alloy manages API usage and costs across external LLM providers. FoundryOS orchestrates a fleet of on-premise inference nodes. They serve different layers of the AI stack.

Alloy vs Forge

Gateway vs Inference Engine

Alloy routes to external providers and controls access. Forge runs models locally using CUDA on NVIDIA hardware. Use Alloy when you consume external APIs; use Forge when you host your own models.

Frequently Asked Questions

What is Alloy?

Alloy is an enterprise LLM gateway and control plane. It provides centralised API key lifecycle management, team budget enforcement, cost tracking and policy enforcement across 16 LLM providers including OpenAI, Anthropic, Azure OpenAI, AWS Bedrock, Google Vertex, Gemini and more. Alloy ships as a self-hosted single binary with an embedded admin console and the Prism self-service portal for team members.

Who is Alloy designed for?

Alloy is designed for engineering teams and organisations that use multiple LLM providers and need centralised control over API keys, budgets and access policies. It suits teams where shared credentials buried in CI pipelines are a risk, or where finance teams need accurate cost attribution and budget guardrails across departments.

When will Alloy be available?

Early Access runs from Q2 2026, with RTM targeted for Q3 2026. Join the waitlist to get the current Early Access build and to give feedback that shapes the final release.

What LLM providers does Alloy support?

Alloy supports 16 provider adapters: OpenAI, Anthropic, Azure OpenAI, AWS Bedrock, Google Vertex, Gemini, Ollama, OpenRouter, DeepSeek, Groq, Perplexity, Mistral, Cohere, HuggingFace, Replicate and any OpenAI-compatible endpoint. Each provider holds a pool of API keys, so Alloy automatically fails over when a key is rate-limited and rotates keys with zero downtime.

How does Alloy differ from Olla?

Olla is open source and local-first: it proxies across local inference backends like Ollama, LM Studio and vLLM. Alloy is the enterprise control plane for external cloud LLM providers: managed API keys, team budgets, OIDC single sign-on and audit logging. They serve different layers of the AI stack and can be deployed together.

How do I get early access to Alloy?

Join the Alloy Early Access waitlist to register for access and be notified of updates.

Get Early Access to Alloy

Alloy v5 is in Early Access. Join the waitlist and be first to take control of your team's LLM spend.