Glossary Tool

Ollama

Ollama is a local runtime and model manager for running large language models on a workstation or server.

Ollama makes it straightforward to run LLMs locally: you pull a model, run it, and talk to it through its command-line interface or an HTTP API (served on localhost port 11434 by default).

Local models are useful for quick experiments, offline or mostly-offline workflows, and data you would rather not send to a hosted provider. The trade-off is hardware: bigger models need more memory and more patience, especially on laptops.

Ollama is also a useful building block. Tools can point at its API and treat a local model much like a hosted one, which makes it a handy way to test ideas before paying for tokens elsewhere.
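That pattern can be sketched with nothing but the standard library. A minimal sketch, assuming Ollama's default endpoint at http://localhost:11434 and a model named "llama3" that has already been pulled; both the endpoint and the model name are assumptions about your setup, not guarantees:

```python
import json
import urllib.request

# Assumption: Ollama's default local endpoint; adjust if yours differs.
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming generate request for Ollama's HTTP API."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return urllib.request.Request(
        OLLAMA_URL,
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

def ask(model: str, prompt: str) -> str:
    """Send a prompt to the local server and return the response text."""
    with urllib.request.urlopen(build_request(model, prompt)) as resp:
        return json.loads(resp.read())["response"]

# Example (requires a running Ollama server with the model pulled):
#   print(ask("llama3", "Why is the sky blue?"))
```

Because the request shape is plain JSON over HTTP, swapping the URL and model name is often all it takes to move the same tool between a local model and a hosted one.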