Alloy: Centralised LLM control plane for engineering teams using multiple providers. Manage API keys, enforce budgets, track costs and audit usage across OpenAI, Anthropic, Azure OpenAI, AWS Bedrock and 11 more. Self-hosted single binary with embedded admin console.
Alloy puts your organisation in control of every LLM interaction: cost, access, compliance and performance.
Centralised API Key Lifecycle
Replace per-developer API keys with sk-alloy-... prefixed keys. Revoke, rotate or audit any key from a single pane. No more shared credentials buried in CI pipelines.
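A minimal sketch of what client-side handling of an Alloy-issued key might look like. Only the sk-alloy- prefix is documented above; the length and character set in the pattern below are assumptions, not Alloy's actual key format.

```python
import re

# Hypothetical key-shape check; everything after the documented
# "sk-alloy-" prefix is an illustrative assumption.
KEY_PATTERN = re.compile(r"^sk-alloy-[A-Za-z0-9]{24,}$")

def is_alloy_key(key: str) -> bool:
    """Return True if `key` looks like an sk-alloy-... prefixed key."""
    return bool(KEY_PATTERN.match(key))

print(is_alloy_key("sk-alloy-" + "a" * 32))   # well-formed Alloy key
print(is_alloy_key("sk-openai-abc123"))       # raw provider key, rejected
```

A check like this lets CI pipelines fail fast when a raw provider credential leaks in where an Alloy key belongs.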
Set per-key and per-team budgets with hard or soft limits. Automatic rate limiting kicks in before overspend occurs. Daily, weekly and monthly budget windows supported.
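The hard/soft distinction above can be sketched as a simple decision function. This is an illustrative model of the behaviour described, not Alloy's configuration schema; the field and outcome names are assumptions.

```python
from dataclasses import dataclass

# Hypothetical budget model: hard limits block requests, soft limits only warn.
@dataclass
class Budget:
    limit_usd: float
    hard: bool

def check_spend(budget: Budget, spent_usd: float, request_cost_usd: float) -> str:
    """Decide whether a request may proceed under the budget window's spend so far."""
    if spent_usd + request_cost_usd <= budget.limit_usd:
        return "allow"
    return "block" if budget.hard else "warn"

print(check_spend(Budget(limit_usd=100.0, hard=True), 99.5, 1.0))   # over a hard limit
print(check_spend(Budget(limit_usd=100.0, hard=False), 99.5, 1.0))  # over a soft limit
```

Evaluating the limit before forwarding the request is what lets rate limiting kick in ahead of overspend rather than after it.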
OpenAI, Anthropic, Azure OpenAI, AWS Bedrock, Google Vertex, Gemini, Ollama, OpenRouter, DeepSeek, Groq, Perplexity, Mistral, Cohere, HuggingFace, Replicate and any OpenAI-compatible endpoint.
Define virtual model names that map to one or more provider deployments. Route by latency, cost or availability. A/B test providers transparently without changing client code.
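Virtual-model routing can be sketched as a lookup plus a selection strategy. The deployment records, metric names and strategy labels below are illustrative assumptions; Alloy's real routing configuration is not shown here.

```python
# Hypothetical routing table: one virtual name, multiple provider deployments.
deployments = {
    "gpt-4-class": [
        {"provider": "openai", "model": "gpt-4o", "latency_ms": 420, "cost_per_1k": 0.005},
        {"provider": "azure",  "model": "gpt-4o", "latency_ms": 380, "cost_per_1k": 0.006},
    ],
}

def route(virtual_model: str, strategy: str = "latency") -> dict:
    """Pick a concrete deployment for a virtual model name by the chosen metric."""
    key = "latency_ms" if strategy == "latency" else "cost_per_1k"
    return min(deployments[virtual_model], key=lambda d: d[key])

print(route("gpt-4-class", "latency")["provider"])  # fastest deployment
print(route("gpt-4-class", "cost")["provider"])     # cheapest deployment
```

Because clients only ever send the virtual name, swapping or A/B testing the underlying deployments needs no client-side change.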
OIDC single sign-on and SCIM user provisioning for teams on professional and enterprise plans. Role-based access, group membership and automated deprovisioning.
Regex content guardrails catch policy violations before they reach providers. Full audit log of every request with latency, token counts and cost attribution. Circuit breakers protect against provider outages.
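A sketch of how regex guardrails can screen a prompt before it leaves for a provider. The two patterns below are illustrative examples of a rule set an operator might define; they are not Alloy's built-in rules.

```python
import re

# Hypothetical guardrail rules; names and patterns are examples only.
GUARDRAILS = [
    ("aws-access-key", re.compile(r"AKIA[0-9A-Z]{16}")),    # AWS access key ID
    ("us-ssn", re.compile(r"\b\d{3}-\d{2}-\d{4}\b")),       # US social security number
]

def scan(prompt: str) -> list[str]:
    """Return the names of guardrails the prompt violates."""
    return [name for name, pattern in GUARDRAILS if pattern.search(prompt)]

print(scan("My key is AKIAABCDEFGHIJKLMNOP"))       # violation detected
print(scan("Summarise this meeting transcript."))   # clean prompt
```

Running the scan at the proxy means a violation is caught once, centrally, rather than relying on every client to implement its own filtering.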
Start free with Community. Upgrade when your team needs identity integration, hosted infrastructure or enterprise-grade SLAs.
Alloy manages access to external LLM providers. Other TensorFoundry products run local inference or route to it.
Olla is open source and local-first, proxying across local backends. Alloy is the enterprise control plane: managed API keys, team budgets, OIDC identity and audit logging across cloud providers.
Alloy manages API usage and costs across external LLM providers. FoundryOS orchestrates a fleet of on-premise inference nodes. They serve different layers of the AI stack.
Alloy routes to external providers and controls access. Forge runs the model locally using CUDA on NVIDIA hardware. Use Alloy when you consume external APIs; use Forge when you host your own.
Alloy v5 is in early access. Join the waitlist and be first to take control of your team's LLM spend.