TensorFoundry
Blog

Notes from the Foundry

Guides, comparisons and practical write-ups on running LLMs on your own hardware. Written by the team building Olla and the TensorFoundry stack.

Guide 4 June 2026

Self-Hosted LLM vs Cloud API - A Cost Framework

A transparent framework for comparing the cost of self-hosted LLM inference against cloud APIs - the variables that matter, the break-even maths, and where each wins.

Thushan Fernando
  • llm-cost
  • self-hosted
  • cloud-api
Comparison 4 June 2026

Olla vs LiteLLM - Choosing an LLM Proxy

An honest comparison of Olla and LiteLLM - where each fits, where each wins, and how to choose between a Go-based local-first proxy and a Python provider hub.

Thushan Fernando
  • olla
  • litellm
  • llm-proxy
Guide 4 June 2026

LLM Inference Servers Compared - vLLM, SGLang, llama.cpp and Ollama

A practical comparison of the main LLM inference backends - vLLM, SGLang, llama.cpp and Ollama - what each is built for, the hardware they suit, and how to choose.

Thushan Fernando
  • vllm
  • sglang
  • llama-cpp
Guide 4 June 2026

Deploying LLMs on Your Own Infrastructure - A Practical Guide

A complete guide to running large language models on your own infrastructure - why teams do it, the stack from backends to orchestration, hardware, cost and compliance.

Thushan Fernando
  • self-hosted-llm
  • on-prem-llm
  • on-premise-ai
Guide 7 May 2026

What is an LLM Proxy?

A practical look at what an LLM proxy does, why you end up needing one, and how it sits in front of inference backends like Ollama, vLLM and llama.cpp.

Thushan Fernando
  • llm
  • proxy
  • load-balancer

© 2026 TensorFoundry. Forged with precision in Melbourne & Sydney, Australia.

TENSORFOUNDRY PTY LTD  ·  ABN 71 696 763 381  ·  ACN 696 763 381  ·  Level 1, 470 St Kilda Rd, Melbourne VIC 3004