Local AI

Running AI models and agents on your own hardware. Privacy-first, offline-capable, fully under your control.

The case for local AI

Cloud AI is convenient, but it comes with trade-offs: latency, cost, privacy exposure, and vendor dependency. Local AI flips the equation — your models, your hardware, your data.

In 2026, local AI is no longer a hobbyist pursuit. Consumer hardware (Apple Silicon, modern GPUs) can run capable models, and the tooling has matured to make local deployment straightforward.
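A rough back-of-the-envelope sketch of why consumer hardware now suffices: a model's weight footprint is roughly parameter count times bytes per parameter, and quantization shrinks the latter. The bytes-per-parameter figures below are common rules of thumb (KV cache, activations, and runtime overhead are ignored), not measurements from any specific runner.

```python
# Rough memory-footprint estimate for a quantized LLM.
# Assumption: footprint ≈ parameter_count * bytes_per_parameter,
# ignoring KV cache, activations, and runtime overhead.

BYTES_PER_PARAM = {
    "fp16": 2.0,   # half-precision weights
    "q8":   1.0,   # 8-bit quantization
    "q4":   0.5,   # 4-bit quantization (a common GGUF choice)
}

def model_size_gb(params_billion: float, quant: str) -> float:
    """Approximate weight size in decimal GB at a given quantization."""
    total_bytes = params_billion * 1e9 * BYTES_PER_PARAM[quant]
    return total_bytes / 1e9

# An 8B-parameter model drops from ~16 GB at fp16 to ~4 GB at 4-bit,
# which fits in the unified memory of a typical 16 GB laptop.
for quant in ("fp16", "q8", "q4"):
    print(f"8B @ {quant}: ~{model_size_gb(8, quant):.0f} GB")
```

This is why 7B–8B models at 4-bit quantization have become the default starting point for laptop-class hardware.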

Key categories

Local model runners

  • Ollama — The simplest way to run LLMs locally
  • llama.cpp — High-performance inference for GGUF models
  • LM Studio — Desktop app for running local models with a GUI
  • Jan — Open source ChatGPT alternative that runs locally

Local agent frameworks

  • OpenClaw — Full agent operations platform, local-first
  • Open Interpreter — Natural language interface to your computer
  • PrivateGPT — Chat with your documents, fully offline

On-device inference

  • MLX — Apple’s machine learning framework for Apple Silicon
  • MLC LLM — Universal deployment for LLMs across devices
  • ExecuTorch — PyTorch on-device inference

What to watch

  • Apple Silicon enabling increasingly capable models on consumer laptops
  • Quantization improvements making larger models practical locally
  • Hybrid local/cloud architectures that route by task complexity
  • Local agents that match cloud agent capability for common tasks
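Routing by task complexity can be as simple as a gatekeeping function in front of two backends. The sketch below is a minimal illustrative heuristic, not the API of any particular framework: the length threshold, token estimate, and keyword markers are all assumptions chosen for the example.

```python
# Minimal sketch of a hybrid local/cloud router (illustrative heuristic,
# not any specific framework's API). Short, simple requests stay on the
# local model; long or complex ones are escalated to a cloud backend.

def route(prompt: str, local_context_limit: int = 2048) -> str:
    """Return "local" or "cloud" based on a simple complexity heuristic."""
    # Assumption: ~4 characters per token as a rough estimate.
    approx_tokens = len(prompt) / 4
    # Assumption: these keywords signal multi-step or heavyweight work.
    complex_markers = ("prove", "refactor", "multi-step", "analyze")
    is_complex = any(m in prompt.lower() for m in complex_markers)
    if approx_tokens > local_context_limit or is_complex:
        return "cloud"
    return "local"

print(route("Summarize this paragraph in one sentence."))   # local
print(route("Analyze the trade-offs in this design doc."))  # cloud
```

Production routers typically use a small classifier model rather than keywords, but the control flow is the same: cheap local inference by default, cloud only when the task demands it.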