As others said this was possible for months already with llama.cpp's support for the Anthropic Messages API. You just need to set ANTHROPIC_BASE_URL. The specific llama-server settings/flags were a pain to figure out and required some hunting, so I collected them in this guide to using CC with local models:
One tricky thing that took me a whole day to figure out is that using Claude Code in this setup was causing total network failures due to telemetry pings, so I had to set this env var to 1: CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC
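For anyone setting this up, the two env vars together look something like this (the host/port is an assumption — point it at wherever your llama-server is actually listening; 8080 is just its default):

```shell
# Point Claude Code at a local llama-server speaking the Anthropic Messages API.
export ANTHROPIC_BASE_URL="http://127.0.0.1:8080"   # assumed host/port
# Stop telemetry pings from causing network failures on a local-only setup.
export CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1
# then launch as usual:
# claude
```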
Curious how it compares to last week’s release of Kyutai’s Pocket-TTS [1] which is just 100M params, and excellent in both speed and quality (English only). I use it in my voice plugin [2] for quick voice updates in Claude Code.
You are absolutely right — most internet users don't know the specific keyboard combination to make an em dash and substitute it with two hyphens. On some websites it is automatically converted into an em dash. If you would like to know more about this important punctuation symbol and its significance in identifying AI writing, please let me know.
Thanks for that. I had no idea either. I'm genuinely surprised Windows buries such a crucial thing like this. Or why they even bothered adding it in the first place when it's so complicated.
The Windows version is an escape hatch for keying in any arbitrary character code, hence why it's so convoluted. You need to know which code you're after.
To be fair, the alt-input is a generalized system for inputting Unicode characters outside the set keyboard layout. So it's not like they added this input specifically. Still, the em dash really should have an easier input method given how crucial a symbol it is.
It's a generalized system for entering code page glyphs that was extended to support Unicode. 0150 and 0151 only work if you are on CP1252 as those aren't the Unicode code points.
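You can see the CP1252-vs-Unicode distinction directly: decimal 150/151 (hex 0x96/0x97) are the en/em dash only in Windows-1252, while the actual Unicode code points are U+2013/U+2014 (decimal 8211/8212). A quick check with iconv:

```shell
# Bytes 0x96/0x97 (decimal 150/151): en and em dash in Windows-1252,
# but NOT the Unicode code points (those are U+2013/U+2014).
printf '\x96\x97' | iconv -f CP1252 -t UTF-8
# prints: –—
```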
And the em dash is trivially easy on iOS — you simply long-press the regular hyphen key. I've been using it for years and am not stopping just because people might suddenly accuse me of being an AI.
Context filling up is sort of the Achilles heel of CLI agents. The main remedy is to have it output some type of handoff document and then run /compact which leaves you with a summary of the latest task. It sort of works but by definition it loses information, and you often find yourself having to re-explain or re-generate details to continue the work.
I made a tool[1] that lets you just start a new session and injects the original session file path, so you can extract any arbitrary details of prior work from there using sub-agents.
Yes, you can literally just ask Claude Code to create a status line showing context usage. I had it make a colored progress bar of context usage, changing through green, yellow, orange, and red as context fills up. Instructions to install:
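The core of such a status line is just thresholding a usage percentage into a color; the wiring into Claude Code's statusLine setting (a command fed session JSON on stdin) is left out here, and the cutoffs below are made up:

```shell
# Map a context-usage percent to a bar color (thresholds are arbitrary;
# obtaining the real percentage from the statusline input is omitted).
bar_color() {
  if   [ "$1" -lt 50 ]; then echo green
  elif [ "$1" -lt 75 ]; then echo yellow
  elif [ "$1" -lt 90 ]; then echo orange
  else                       echo red
  fi
}
```

e.g. `bar_color 82` prints `orange`.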
Specifically for coding agents, one issue is how to continue work when you've almost filled the context window.
Compaction always loses information, so I use an alternative approach that works extremely well, based on this almost silly idea — your original session file itself is the golden source of truth with all details, so why not directly leverage it?
So I built the aichat feature in my claude-code-tools repo with exactly this in mind; the aichat rollover option puts you in a fresh session, with the original session path injected, and you use sub-agents to recover any arbitrary detail at any time. Now I keep auto-compact turned off and never compact.
It’s a relatively simple idea: no elaborate “memory” artifacts, no discipline or system to follow; just work until 95%+ context usage.
The tool (with the related plugins) makes it seamless: first type “>resume” in your session (this copies session id to clipboard), then quit and run
aichat resume <pasted session id>
And this launches a TUI offering a few ways to resume your work, one of which is “rollover”; this puts you in a new session with the original session jsonl path injected.
And in the new session say something like,
“There is a chat session log file path shown to you; Use subagents strategically to extract details of the task we were working on at the end of it”, or use the /recover-context slash command. If it doesn’t quite get all of it, prompt it again for specific details.
There’s also an aichat search command for fast Rust/tantivy-based full-text search across sessions, with a TUI for humans and a CLI/JSON mode for agents/sub-agents. The latter (and the corresponding skill and sub-agent) can be used to recover arbitrarily detailed context about past work.
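As a toy illustration of the recovery step (the real tool uses tantivy full-text search; this is just grep over the session .jsonl, and the transcript format is hypothetical):

```shell
# Given a session .jsonl and a keyword, print matching line numbers so a
# sub-agent can zoom in on those turns. Purely illustrative.
find_mentions() {
  grep -n "$2" "$1" | cut -d: -f1
}
```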
Besides being trash as others said, there’s a trade-off with real-time word-by-word transcription: there’s no opportunity for an AI to holistically correct/clean up the transcription.
You mean, after displaying each word as it is spoken, OSX then goes back and fixes what’s been displayed? I think I’ve seen it fix one or two recent words, but I guess you’re saying it could fix the entire sentence as well. I didn’t know that.
I’ve tried several, including this one, and I’ve settled on VoiceInk (local, one-time payment), and with Parakeet V3 it’s stunningly fast (near-instant) and accurate enough to talk to LLMs/code-agents, in the sense that the slight drop in accuracy relative to Whisper Turbo3 is immaterial since they can “read between the lines” anyway.
My regular cycle is to talk informally to the CLI agent and ask it to “say back to me what you understood”, and it almost always produces a nice clean and clear version. This simultaneously works as confirmation of its understanding and also as a sort of spec which likely helps keep the agent on track.
UPDATE - just tried handy with Parakeet v3, and it works really well too, so I'll use this instead of VoiceInk for a few days. I just also discovered that turning on the "debug" UI with Cmd-shift-D shows additional options like post processing and appending trailing space.
I'll bet you could take a relatively tiny model and get it to translate the transcribed "git force push" or "git push dash dash force" into "git push --force".
Likewise "cd home slash projects" into "cd ~/projects".
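Even before reaching for a model, a few fixed rewrite rules get you surprisingly far; a sketch (the rules and phrasings here are made up):

```shell
# Toy rule-based normalizer for spoken shell commands; a small model
# would generalize better, but fixed phrases are easy to rewrite.
normalize() {
  printf '%s' "$1" | sed \
    -e 's/ dash dash / --/g' \
    -e 's/force push/push --force/' \
    -e 's/ slash /\//g' \
    -e 's/cd home/cd ~/'
}
```

e.g. `normalize "git push dash dash force"` gives `git push --force`, and `normalize "cd home slash projects"` gives `cd ~/projects`.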
https://github.com/pchalasani/claude-code-tools/blob/main/do...