Early Access

Alloy

One gateway. Every provider.

Centralised LLM control plane for engineering teams using multiple providers. Manage API keys, enforce budgets, track costs and audit usage across OpenAI, Anthropic, Azure OpenAI, AWS Bedrock and 12 more. Self-hosted single binary with embedded admin console.

16 providers · 11,400 RPS · sk-alloy-... keys

Join Early Access · See Features

Everything Your Team Needs

Alloy puts your organisation in control of every LLM interaction: cost, access, compliance and performance.

Centralised API Key Lifecycle

Replace per-developer API keys with sk-alloy-... prefixed keys. Revoke, rotate or audit any key from a single pane. No more shared credentials buried in CI pipelines.

Budget Enforcement

Set per-key and per-team budgets with hard or soft limits. Automatic rate limiting kicks in before overspend occurs. Daily, weekly and monthly budget windows supported.
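To make the hard versus soft distinction concrete, here is a minimal sketch of a pre-request budget check. It is not Alloy's implementation: the `Budget` and `Decision` names, the USD fields and the idea of an estimated request cost are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    WARN = "warn"    # soft limit crossed: request proceeds but is flagged
    BLOCK = "block"  # hard limit crossed: request is rejected


@dataclass
class Budget:
    """Hypothetical budget for one key or team within one window."""
    limit_usd: float
    hard: bool               # True = hard limit, False = soft limit
    spent_usd: float = 0.0

    def check(self, estimated_cost_usd: float) -> Decision:
        # Evaluated before the request is forwarded, so a hard limit
        # prevents overspend instead of reporting it afterwards.
        if self.spent_usd + estimated_cost_usd <= self.limit_usd:
            return Decision.ALLOW
        return Decision.BLOCK if self.hard else Decision.WARN

    def record(self, actual_cost_usd: float) -> None:
        # Called once the provider reports the real cost of the request.
        self.spent_usd += actual_cost_usd
```

In this sketch a daily, weekly or monthly window would simply reset `spent_usd` at the window boundary.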

16 Provider Adapters

OpenAI, Anthropic, Azure OpenAI, AWS Bedrock, Google Vertex, Gemini, Ollama, OpenRouter, DeepSeek, Groq, Perplexity, Mistral, Cohere, HuggingFace, Replicate and any OpenAI-compatible endpoint.

Virtual Models & Routing

Define virtual model names that map to one or more provider deployments. Route by latency, cost or availability. A/B test providers transparently without changing client code.
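The routing idea can be sketched in a few lines. Everything here is hypothetical: the `Deployment` fields, the `route` function, the `team-default` virtual model and the placeholder provider names and numbers are illustrative assumptions, not Alloy's configuration format or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deployment:
    provider: str
    model: str
    latency_ms: float          # assumed: a recently observed latency
    cost_per_1k_tokens: float  # assumed: a blended price in USD
    available: bool = True


def route(virtual_models: dict[str, list[Deployment]],
          name: str, strategy: str = "cost") -> Deployment:
    """Resolve a virtual model name to one concrete deployment."""
    # Availability acts as a filter; latency or cost then picks the winner.
    candidates = [d for d in virtual_models[name] if d.available]
    if not candidates:
        raise LookupError(f"no available deployment for {name!r}")
    if strategy == "latency":
        return min(candidates, key=lambda d: d.latency_ms)
    if strategy == "cost":
        return min(candidates, key=lambda d: d.cost_per_1k_tokens)
    raise ValueError(f"unknown strategy {strategy!r}")


# Made-up data: one virtual model backed by three deployments.
VIRTUAL_MODELS = {
    "team-default": [
        Deployment("provider-a", "model-a", latency_ms=420.0, cost_per_1k_tokens=0.010),
        Deployment("provider-b", "model-b", latency_ms=380.0, cost_per_1k_tokens=0.015),
        Deployment("provider-c", "model-c", latency_ms=90.0, cost_per_1k_tokens=0.020,
                   available=False),
    ],
}
```

Clients only ever send the virtual name (`team-default` here), which is why the backing deployments can change without touching client code.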

Enterprise Identity

OIDC single sign-on for teams on Professional plans and above, plus SCIM user provisioning on Enterprise. Includes role-based access, group membership and automated deprovisioning.

Guardrails & Audit

Regex content guardrails catch policy violations before requests reach providers. Full audit log of every request with latency, token counts and cost attribution. Circuit breakers protect against provider outages.
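A regex guardrail of this kind can be sketched as a set of named patterns checked against each prompt before it is forwarded. The rule names, the patterns and the `violations` helper below are illustrative assumptions, not Alloy's built-in rule set.

```python
import re

# Hypothetical rules: each name maps to a compiled pattern.
RULES = {
    # AWS access key IDs: "AKIA" followed by 16 uppercase letters or digits.
    "aws_access_key": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    # Deliberately loose email matcher, good enough for a sketch.
    "email_address": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
}


def violations(prompt: str) -> list[str]:
    """Return the names of all rules that the prompt matches."""
    return [name for name, pattern in RULES.items() if pattern.search(prompt)]
```

A gateway would reject or redact the request when `violations(prompt)` is non-empty, and could write the matched rule names to the audit log.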

Plans

Start free with Community. Upgrade when your team needs identity integration, hosted infrastructure or enterprise-grade SLAs.

Community

Free

Container deployment

Single container deployment
All 16 provider adapters
sk-alloy-... key management
Basic budget controls
Admin console included
Community support

Learn More

Professional

Contact

Self-hosted Postgres + OIDC

Everything in Community
Postgres persistent storage
OIDC single sign-on
User self-service portal
Advanced budget policies
Flux CLI for CI/CD
Priority support

Join Early Access

Professional Hosted

Contact

TensorFoundry managed SaaS

Everything in Professional
TensorFoundry managed
Automatic updates
SLA uptime guarantee
Dedicated support channel

Learn More

Enterprise

Contact

SCIM, dedicated infra

Everything in Professional Hosted
SCIM user provisioning
Dedicated infrastructure
Custom SLA
Architecture review
Dedicated account team

Learn More

How Alloy Fits the Stack

Alloy manages access to external LLM providers. Other TensorFoundry products run or route to local inference.

Alloy vs Olla

Enterprise vs Open Source

Olla is open source and local-first, and proxies across local backends. Alloy is the enterprise control plane: managed API keys, team budgets, OIDC identity and audit logging across cloud providers.

Alloy vs FoundryOS

API Providers vs Node Fleet

Alloy manages API usage and costs across external LLM providers. FoundryOS orchestrates a fleet of on-premise inference nodes. They serve different layers of the AI stack.

Alloy vs Forge

Gateway vs Inference Engine

Alloy routes to external providers and controls access. Forge runs the model locally using CUDA on NVIDIA hardware. Use Alloy when you consume external APIs; use Forge when you host your own models.

Get Early Access to Alloy

Alloy v5 is in early access. Join the waitlist and be first to take control of your team's LLM spend.

Join Early Access · Talk to Us