Centralised API Key Lifecycle
Replace per-developer API keys with sk-alloy-... prefixed keys. Revoke, rotate or audit any key from a single pane. No more shared credentials buried in CI pipelines.
Alloy
Centralised LLM control plane for engineering teams using multiple providers. Manage API keys, enforce budgets, track costs and audit usage across OpenAI, Anthropic, Azure OpenAI, AWS Bedrock and 12 more. Self-hosted single binary with an embedded admin console and the Prism self-service portal.
Alloy puts your organisation in control of every LLM interaction: cost, access, compliance and performance.
Set per-key and per-team budgets with hard or soft limits. Automatic rate limiting kicks in before overspend occurs. Daily, monthly and total budget windows supported.
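As a rough illustration of the difference between hard and soft limits, the sketch below models a single budget window. It is not Alloy's implementation: the `Budget` class, the `check` and `record` methods and the "allow", "warn" and "block" outcomes are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from enum import Enum


class LimitKind(Enum):
    HARD = "hard"  # reject requests once the budget would be exceeded
    SOFT = "soft"  # let the request through but flag the overspend


@dataclass
class Budget:
    """One budget window (for example a month) for a key or team. Hypothetical model."""
    limit_usd: float
    kind: LimitKind
    spent_usd: float = 0.0

    def check(self, estimated_cost_usd: float) -> str:
        """Return 'allow', 'warn' or 'block' for a request of the given cost."""
        if self.spent_usd + estimated_cost_usd <= self.limit_usd:
            return "allow"
        return "block" if self.kind is LimitKind.HARD else "warn"

    def record(self, actual_cost_usd: float) -> None:
        """Add the real cost of a completed request to the running total."""
        self.spent_usd += actual_cost_usd


monthly = Budget(limit_usd=100.0, kind=LimitKind.HARD, spent_usd=99.5)
print(monthly.check(0.25))  # allow
print(monthly.check(1.00))  # block
```

A soft limit with the same numbers would return "warn" for the second request, so the call proceeds while the overspend is surfaced.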
OpenAI, Anthropic, Azure OpenAI, AWS Bedrock, Google Vertex, Gemini, Ollama, OpenRouter, DeepSeek, Groq, Perplexity, Mistral, Cohere, HuggingFace, Replicate and any OpenAI-compatible endpoint. Each provider holds a pool of API keys, so Alloy fails over when a key is rate-limited and rotates keys with zero downtime.
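One way to picture a provider key pool with failover is sketched below. It assumes a simple per-key cooldown after a rate-limit response. `KeyPool` and its methods are hypothetical names for illustration and do not describe Alloy's internals.

```python
import time


class KeyPool:
    """Hypothetical pool of provider API keys with per-key cooldowns."""

    def __init__(self, keys):
        # Map each key to the time at which it becomes usable again.
        self.cooldown_until = {key: 0.0 for key in keys}

    def acquire(self, now=None):
        """Return the first key that is not cooling down."""
        now = time.monotonic() if now is None else now
        for key, until in self.cooldown_until.items():
            if until <= now:
                return key
        raise RuntimeError("all keys are cooling down")

    def mark_rate_limited(self, key, retry_after_s, now=None):
        """Take a key out of rotation for retry_after_s seconds."""
        now = time.monotonic() if now is None else now
        self.cooldown_until[key] = now + retry_after_s

    def rotate(self, old_key, new_key):
        """Add the replacement before removing the old key, so the pool is never empty."""
        self.cooldown_until[new_key] = 0.0
        self.cooldown_until.pop(old_key, None)


pool = KeyPool(["key-a", "key-b"])
pool.mark_rate_limited("key-a", retry_after_s=30, now=0.0)
print(pool.acquire(now=1.0))   # key-b
print(pool.acquire(now=31.0))  # key-a
```

The `now` parameter exists only to make the example deterministic; in normal use the monotonic clock is read on each call.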
Define virtual model names that map to one or more provider deployments. Route by priority order, round-robin distribution or percentage weight. Circuit breakers exclude failing deployments, so you can add fallbacks or swap providers without changing client code.
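The three routing strategies and the circuit-breaker exclusion can be sketched as follows. This is a minimal model under stated assumptions, not Alloy's routing code: the `Deployment` fields, the function names and the deployment names `openai-primary` and `azure-fallback` are all hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass
class Deployment:
    name: str
    priority: int = 0     # lower value is tried first
    weight: int = 1       # relative share for weighted routing
    healthy: bool = True  # False while a circuit breaker is open


def available(deployments):
    """Drop deployments whose circuit breaker is currently open."""
    pool = [d for d in deployments if d.healthy]
    if not pool:
        raise RuntimeError("no healthy deployment for this virtual model")
    return pool


def pick_by_priority(deployments):
    """Priority order: always use the best-ranked healthy deployment."""
    return min(available(deployments), key=lambda d: d.priority)


def pick_weighted(deployments, rng=random):
    """Percentage weight: choose at random in proportion to each weight."""
    pool = available(deployments)
    return rng.choices(pool, weights=[d.weight for d in pool], k=1)[0]


class RoundRobin:
    """Round-robin: cycle through the healthy deployments in turn."""

    def __init__(self, deployments):
        self.deployments = deployments
        self.index = 0

    def pick(self):
        pool = available(self.deployments)
        choice = pool[self.index % len(pool)]
        self.index += 1
        return choice


# One virtual model backed by two provider deployments.
fast_chat = [
    Deployment("openai-primary", priority=0, weight=80),
    Deployment("azure-fallback", priority=1, weight=20),
]
print(pick_by_priority(fast_chat).name)  # openai-primary
fast_chat[0].healthy = False             # breaker opens on the primary
print(pick_by_priority(fast_chat).name)  # azure-fallback
```

Because clients only ever see the virtual model name, swapping or adding a deployment changes the list above and nothing on the client side.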
Every instance ships with Prism, a browser portal where team members create and rotate their own API keys, watch their spend and token usage, and see which models they can reach. Served straight from the binary, with no admin in the loop.
OIDC single sign-on for teams on professional and enterprise plans. Role-based access, group membership and fine-grained control over who can reach which models.
Regex content guardrails detect PII patterns such as card numbers, SSNs and email addresses before requests reach providers, with configurable block, redact or log actions. A full audit log records every administrative change, and circuit breakers protect against provider outages.
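To make the block, redact and log actions concrete, here is a small sketch of a regex guardrail. The patterns are deliberately simplified and the `apply_guardrail` function is a hypothetical name; Alloy's actual rules and configuration format are not shown in this document.

```python
import re

# Simplified illustrative patterns; production rules would be stricter.
PATTERNS = {
    "email": re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "card": re.compile(r"\b\d(?:[ -]?\d){12,15}\b"),  # 13 to 16 digits
}


def apply_guardrail(text, action="redact"):
    """Return (text, hits) after applying the action to any PII matches."""
    hits = [name for name, pattern in PATTERNS.items() if pattern.search(text)]
    if not hits:
        return text, hits
    if action == "block":
        raise ValueError(f"request blocked, matched: {', '.join(hits)}")
    if action == "redact":
        for name in hits:
            text = PATTERNS[name].sub(f"[{name.upper()}]", text)
    # With action == "log" the text passes through unchanged and the
    # caller records the hits.
    return text, hits


clean, found = apply_guardrail("Contact jane@example.com, SSN 123-45-6789")
print(clean)  # Contact [EMAIL], SSN [SSN]
print(found)  # ['email', 'ssn']
```

Running the check before the request leaves the gateway is what keeps the matched values from reaching the provider.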
Start free with Community. Upgrade when your team needs identity integration, hosted infrastructure or enterprise-grade SLAs.
Alloy manages access to external LLM providers. Other TensorFoundry products run or route to local inference.
Olla is open source, local-first and proxies across local backends. Alloy is the enterprise control plane: managed API keys, team budgets, OIDC identity and audit logging across cloud providers.
Alloy manages API usage and costs across external LLM providers. FoundryOS orchestrates a fleet of on-premise inference nodes. They serve different layers of the AI stack.
Alloy routes to external providers and controls access. Forge runs the model locally using CUDA on NVIDIA hardware. Use Alloy when you consume external APIs; use Forge when you host your own.
Alloy is an enterprise LLM gateway and control plane. It provides centralised API key lifecycle management, team budget enforcement, cost tracking and policy enforcement across 16 LLM providers including OpenAI, Anthropic, Azure OpenAI, AWS Bedrock, Google Vertex, Gemini and more. Alloy ships as a self-hosted single binary with an embedded admin console and the Prism self-service portal for team members.
Alloy is designed for engineering teams and organisations that use multiple LLM providers and need centralised control over API keys, budgets and access policies. It suits teams where shared credentials buried in CI pipelines are a risk, or where finance teams need accurate cost attribution and budget guardrails across departments.
Alloy is in Early Access as of Q2 2026, with RTM targeted for Q3 2026. Join the waitlist to get the current Early Access build and to give feedback that shapes the final release.
Alloy supports 16 provider adapters: OpenAI, Anthropic, Azure OpenAI, AWS Bedrock, Google Vertex, Gemini, Ollama, OpenRouter, DeepSeek, Groq, Perplexity, Mistral, Cohere, HuggingFace, Replicate and any OpenAI-compatible endpoint. Each provider holds a pool of API keys, so Alloy automatically fails over when a key is rate-limited and rotates keys with zero downtime.
Olla is open source and local-first: it proxies across local inference backends like Ollama, LM Studio and vLLM. Alloy is the enterprise control plane for external cloud LLM providers: managed API keys, team budgets, OIDC single sign-on and audit logging. They serve different layers of the AI stack and can be deployed together.
Join the Alloy Early Access waitlist to register for access and be notified of updates.
Alloy v5 is in early access. Join the waitlist and be first to take control of your team's LLM spend.