Working at Microsoft, I've just now hooked up to Claude Code (my department was not permitted to use it previously) through something called "Agent Maestro", a VS Code extension which I guess pipes Claude Code API requests to our internally hosted Claude models, including Opus 4.6.
I do wonder if there is going to be much of a difference between using Claude Code vs. Copilot CLI when using the same models.
I honestly don’t think the models are as important as people tend to believe. More important is how the models are given tools - find, grep, git, test runners, …
> I honestly don’t think the models are as important as people tend to believe.
I tend to disagree. While I don't see meaningful differences in _reasoning power_ between frontier models, I do see differences in the way they interact with my prompts.
I use exclusively Anthropic models because my interactions with GPT are annoying:
- Sonnet/Opus behave like a mix of a diligent intern and a peer. They do the work, don't talk too much, give answers, etc.
- GPT is overly chatty, borderline calls me "bro", tends to brush off issues I raise with "it should be good enough for general use", etc.
- I find that GPT hardly ever steps back when diagnosing issues. It picks a possible cause and goes down a rabbit hole of increasingly hacky / spurious solutions. Opus/Sonnet will often step back when the complexity increases too much and dig for an alternative.
- I find Opus/Sonnet to be "lazy" recently. Instead of systematically doing an accurate search before answering, it tries to "guess", and I have to spot it and directly tell it to "search for the precise specification and do not guess". Often it will tell me "you should do this and that", and I have to tell it "no, you do it". I wonder if this was done to reduce the number of web searches or the compute it uses unless the user explicitly asks.
Crowding around our first ever computer, a 120 MHz Pentium with 16 MB of RAM and a 1.6 GB hard disk, watching that Weezer video on the CRT monitor with my whole family is a cherished memory.
This does make quite a bit of sense. When I was a teenager in the 90s/early aughts, it was all IRC, script kiddie stuff. Reckless abandon. What worries me is that it seems like full-grown adults are happy to accelerate the dead internet and put security at risk. I assume it's not just teenagers running these stupid LLM bots.
I don't think "we" would have been impacted since this specifically targets the updates, but recently Microsoft pulled Notepad++ from the list of apps we can use on our production management laptops. Some people were annoyed and whining about this. That predated this announcement by a few weeks. Probably the right move by the security folks.