Quickstart: Linux / macOS
The cleanest local path: macOS or Linux, Ollama on the host, codehamr in your terminal.
On Windows? See the Windows quickstart or WSL2 sandbox.
Hardware floor
A ~30B model is the sweet spot for local code generation. You need:
| Setup | Floor |
|---|---|
| Apple Silicon | modern M chip, 32 GB+ unified memory |
| Discrete GPU | 24 to 32 GB+ VRAM |
| CPU only | 64 GB+ system RAM, expect 3 to 8 tok/s |
Below 32 GB you can pick a smaller model, but agentic coding only starts being fun at the ~30B class. Grab a HamrPass when local isn't enough.
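Not sure where you land? These commands report the relevant number for each setup (the nvidia-smi line assumes an NVIDIA card with drivers installed):
system_profiler SPHardwareDataType | grep Memory    # macOS: unified memory
nvidia-smi --query-gpu=memory.total --format=csv    # Linux with a GPU: total VRAM
free -h                                             # Linux, CPU only: system RAM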
1. Install Ollama
macOS: download from ollama.com/download, run it. In the app, open Settings and set Context length to 64k or more (depending on your machine). The 4k default silently breaks coding agents.
Linux:
curl -fsSL https://ollama.com/install.sh | sh
On Linux without the desktop app there is no slider. The context_size you set in step 4 wins.
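Either way, confirm the server answers before moving on:
curl -s http://127.0.0.1:11434/api/version    # a JSON version string means Ollama is up
systemctl status ollama                       # Linux: check the service if curl is refused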
2. Pull a model
ollama pull qwen3.6:27b
The download is about 17 GB; Ctrl+C and re-running the pull resumes where it left off.
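Once the pull finishes, confirm the tag is actually there:
ollama list              # the new tag should appear with its size on disk
ollama show qwen3.6:27b  # details such as parameter count, quantization, and context length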
3. Install codehamr
curl -fsSL https://codehamr.com/install.sh | bash
Codehamr runs shell commands written by an LLM. Sandbox it in a devcontainer when it touches code you care about.
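One sketch of that, assuming your project already has a .devcontainer/devcontainer.json, uses the reference devcontainer CLI:
npm install -g @devcontainers/cli              # the reference devcontainer CLI
devcontainer up --workspace-folder .           # build and start the container for this project
devcontainer exec --workspace-folder . bash    # shell in, then install and run codehamr inside
Inside the container, the url in step 4 has to point at the host's Ollama rather than localhost; on Docker Desktop that is typically http://host.docker.internal:11434, while on Linux you may need host networking.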
4. Point codehamr at Ollama
In your project:
.codehamr/config.yaml
active: local
models:
  local:
    llm: qwen3.6:27b
    url: http://localhost:11434
    key: ""
    context_size: 65536
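Before launching codehamr, you can hit Ollama's OpenAI-compatible endpoint directly to confirm the url and model tag line up (the prompt here is a throwaway):
curl -s http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "qwen3.6:27b", "messages": [{"role": "user", "content": "say hi"}]}'
A completion comes back → the config is good. An error naming the model → the llm: tag doesn't match what ollama list shows.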
5. Run
codehamr
The first prompt is slow while the model loads into memory; every prompt after that is fast.
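If that first-load wait bothers you, you can pre-warm the model and tell Ollama to keep it resident; both are standard Ollama knobs, and the keep-alive value here is just an example:
curl -s http://localhost:11434/api/generate -d '{"model": "qwen3.6:27b"}'    # an empty request loads the model without generating
OLLAMA_KEEP_ALIVE=2h ollama serve    # if you run the server yourself; for the systemd service or the macOS app, set the variable in its environment instead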
If something doesn't work
curl -s http://127.0.0.1:11434/v1/models
- JSON with your tag → Ollama is fine; recheck url: in your config.
- Connection refused → start the Ollama app, or systemctl start ollama.
- Empty model list → re-run ollama pull qwen3.6:27b.
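If none of that explains it, the server log usually does (paths assume a default install):
journalctl -u ollama -n 50 --no-pager    # Linux: last 50 lines from the systemd service
tail -n 50 ~/.ollama/logs/server.log     # macOS: log written by the desktop app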