Quickstart: Linux / macOS
The cleanest local path: macOS or Linux, Ollama on the host, codehamr in your terminal.
On Windows? See the Windows quickstart or WSL2 sandbox.
Hardware floor
A ~30B model is the sweet spot for local code generation. You need:
| Setup | Floor |
|---|---|
| Apple Silicon | modern M chip, 32 GB+ unified memory |
| Discrete GPU | 24 to 32 GB+ VRAM |
| CPU only | 64 GB+ system RAM, expect 3 to 8 tok/s |
Below 32 GB you can pick a smaller model, but agentic coding only starts being fun at the ~30B class. Grab a HamrPass when local isn't enough.
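Not sure where you land? These commands report the relevant number for each setup (the nvidia-smi line assumes an NVIDIA card with drivers installed):
system_profiler SPHardwareDataType | grep Memory    # macOS: unified memory
nvidia-smi --query-gpu=memory.total --format=csv    # Linux with a GPU: total VRAM
free -h                                             # Linux, CPU only: system RAM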
1. Install Ollama
macOS: download from ollama.com/download, run it. In the app, open Settings and set Context length to 64k or more (depending on your machine). The 4k default silently breaks coding agents.
Linux:
curl -fsSL https://ollama.com/install.sh | sh
On Linux without the desktop app there is no slider. The context_size you set in step 4 wins.
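Either way, confirm the server answers before moving on:
curl -s http://127.0.0.1:11434/api/version    # a JSON version string means Ollama is up
systemctl status ollama                       # Linux: check the service if curl is refused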
2. Pull a model
ollama pull qwen3.6:27b
The download is about 17 GB; Ctrl+C and re-running the pull resumes where it left off.
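Once the pull finishes, confirm the tag is actually there:
ollama list              # the new tag should appear with its size on disk
ollama show qwen3.6:27b  # details such as parameter count, quantization, and context length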
3. Install codehamr
curl -fsSL https://codehamr.com/install.sh | bash
Codehamr runs shell commands written by an LLM. Sandbox it in a devcontainer when it touches code you care about.
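One sketch of that, assuming your project already has a .devcontainer/devcontainer.json, uses the reference devcontainer CLI:
npm install -g @devcontainers/cli              # the reference devcontainer CLI
devcontainer up --workspace-folder .           # build and start the container for this project
devcontainer exec --workspace-folder . bash    # shell in, then install and run codehamr inside
Inside the container, the url in step 4 has to point at the host's Ollama rather than localhost; on Docker Desktop that is typically http://host.docker.internal:11434, while on Linux you may need host networking.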
4. Point codehamr at Ollama
In your project:
.codehamr/config.yaml
active: local
models:
  local:
    llm: qwen3.6:27b
    url: http://localhost:11434
    key: ""
    context_size: 65536
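Before launching codehamr, you can hit Ollama's OpenAI-compatible endpoint directly to confirm the url and model tag line up (the prompt here is a throwaway):
curl -s http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "qwen3.6:27b", "messages": [{"role": "user", "content": "say hi"}]}'
A completion comes back → the config is good. An error naming the model → the llm: tag doesn't match what ollama list shows.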
5. Run
codehamr
The first prompt is slow while the model loads into memory; every prompt after that is fast.
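If that first-load wait bothers you, you can pre-warm the model and tell Ollama to keep it resident; both are standard Ollama knobs, and the keep-alive value here is just an example:
curl -s http://localhost:11434/api/generate -d '{"model": "qwen3.6:27b"}'    # an empty request loads the model without generating
OLLAMA_KEEP_ALIVE=2h ollama serve    # if you run the server yourself; for the systemd service or the macOS app, set the variable in its environment instead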
If something doesn't work
curl -s http://127.0.0.1:11434/v1/models
- JSON with your tag → Ollama is fine; recheck url: in your config.
- Connection refused → start the Ollama app, or systemctl start ollama.
- Empty model list → re-run ollama pull qwen3.6:27b.
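If none of that explains it, the server log usually does (paths assume a default install):
journalctl -u ollama -n 50 --no-pager    # Linux: last 50 lines from the systemd service
tail -n 50 ~/.ollama/logs/server.log     # macOS: log written by the desktop app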