The Strange Comfort of Local-First Everything

April 21, 2026 · 793 words · 4 min read

Running things closer to home.

A while ago I wrote about my accidental homelab, which at the time was a cheap M1 Mac mini. While it was not elegant, it was one of the most useful machines I've ever owned.

That setup has now been replaced by another Mac mini, this time one with 64GB of RAM (overkill, sure, but I do not regret the upgrade at all). It's still small, even smaller actually! It's still quiet, still easy to forget about, except now it has enough memory that I can run most of my personal infrastructure and a few local models without feeling like I am asking too much of it. The old Mac mini is still around as a dedicated Stardew Valley machine for the missus.

The new machine still runs the usual boring services: the databases, search, background jobs, tiny apps, music tracking, bookmark stuff, the weird bits of automation that would be hard to justify as standalone products but are incredibly useful as tools. None of it needs to be in the cloud. It is my data, my weird little workflows, my half-finished experiments, and I like knowing they are sitting on a machine I can reach, understand, restart, break, and fix.

The newer change is models. Not frontier models, obviously; I am not pretending a Mac mini is a rack of H100s. But the local model scene has gotten good enough that running good models is no longer hard at all! Gemma 4 just landed, and it's tailored for someone like me: open models sized for actual hardware, Apache 2.0, long context, multimodal support, and enough focus on on-device use that it does not feel like a cloud model squeezed down into local shape (gpt-oss). The 26B MoE and 31B dense models are the exciting ones, while the smaller E2B and E4B variants are clearly aimed at edge devices. That spread is the right call, because local AI is not one single use case. Sometimes I want a tiny fast model for classification or cleanup, sometimes I want a bigger model to chew through a document or decipher my weird handwriting. Most of the time I just want to ask something without sending the whole context to a vendor.

Gemma alone is not enough, though. Qwen has become really hard to ignore, especially the Qwen3 family, because the small models punch above their weight and the larger MoE ones are serious. There's Llama too, somehow still the default reference point for a lot of people, even when the latest generation feels more complicated than the old stuff. Mistral keeps making models that feel practical and fast, which I appreciate but rarely end up reaching for. And Phi is interesting as well, going in the opposite direction and making something tiny enough that it makes you rethink how much model you actually need for a task.

That is the part I keep coming back to, because local-first does not mean pretending local is always better. For me it just means the default path starts at home. If the task needs a frontier model, fine, I will use one. I pay for enough of them already! But if a local model can summarise logs, classify notes, extract metadata from my own files, draft a search query, or help with some small private workflow, why would I send that anywhere else?
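To make that concrete, here is a minimal sketch of what "local by default" looks like for those small tasks. Everything specific here is an assumption on my part: the port and path are Ollama's defaults (llama.cpp and LM Studio expose a similar OpenAI-compatible route), the model name is just whatever you have pulled locally, and the prompts are toy placeholders.

```python
# Minimal sketch: send small private tasks to a local model endpoint.
# Assumes a local runtime with an OpenAI-compatible chat API, e.g.
# Ollama's default at http://localhost:11434/v1/chat/completions.
import json
import urllib.request

LOCAL_URL = "http://localhost:11434/v1/chat/completions"  # assumed default

PROMPTS = {
    "summarise": "Summarise these logs in three bullet points:",
    "classify": "Classify this note as work, personal, or junk:",
}

def build_request(task: str, text: str, model: str = "gemma") -> bytes:
    """Build the JSON body for a single-turn task against a local model."""
    body = {
        "model": model,
        "messages": [
            {"role": "user", "content": f"{PROMPTS[task]}\n\n{text}"},
        ],
    }
    return json.dumps(body).encode()

def run_local(task: str, text: str) -> str:
    """POST the task to the local endpoint and return the model's reply."""
    req = urllib.request.Request(
        LOCAL_URL,
        data=build_request(task, text),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["choices"][0]["message"]["content"]
```

The point of the sketch is the shape, not the code: the data never leaves the machine, and swapping in a cloud model later is just changing the URL and adding a key.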

And while I praised it before, Tailscale is the reason this whole setup feels sane. I know people say "it just works" too often, but Tailscale really does! The Mac mini sits at home, my laptop is wherever I am, my phone is on whatever cursed hotel Wi-Fi I happen to be using, and everything still feels like it is on the same network. That comfort is hard to explain without sounding more dramatic than the setup deserves. It is just a small computer in my house, but it changes how I build things. I can make small software without wondering which tool has some weekly usage quota left, I can store data without designing a privacy policy for myself, and most importantly, I can run experiments that are too personal, too niche, or too stupid to justify as hosted products.

The cloud is still useful; I am not moving everything into a bunker, I just like the balance better now. I still reach for Codex or Claude a lot, and there's a great deal of work that a local model can't (yet!) do, but most of the time, when the task is small, it's just easier to do it locally.

My mentor likes to say that the best infra is the one that has disappeared into the background. I respectfully disagree: the best infra is the one that feels mine.