Depends what you need the model to do. The recent granite4.1:3b just takes 2GB o...

aftbit · 2026-06-16T20:08:38 1781640518

Yeah it 100% depends on what you want the model to do. Some tasks, like extraction, summarization, or simple tool calling (e.g. "turn on my desk lamp"), are very doable with tiny models. Others, like coding or more advanced agentic workflows, can demand much more powerful models. I was thinking through the lens of coding or running _big_ data extraction pipelines (think ~8 billion pages).

EagnaIonat · 2026-06-17T05:10:49 1781673049

> Others, like coding or more advanced agentic workflows, can demand much more powerful models.

You can do coding and agentic work fine. For coding I use qwen3.6:35b-mlx, and for agentic work granite4.1:3b works fine.

These are the models I use.

- granite4.1:3b

- granite4.1:30b

- gpt-oss:20b

- gpt-oss:120b (less so now)

- mistral-small3.2

- qwen3.6:35b-mlx

There will always be use cases that don't sit on your laptop, but most of what can be done can be done locally. It just requires a good framework sitting on top of the model.

packetlost · 2026-06-17T15:08:52 1781708932

Why do you like gpt-oss-120b less now? What replaced it?

aftbit · 2026-06-17T22:59:45 1781737185

It's very likely to hallucinate. I'm mostly using Gemma 4 31B now when I need something offline. It is a very strong model for its size.