Hacker News | hmaxwell's comments

I will miss the great times like these https://www.youtube.com/watch?v=mnrMhbWG0Pc


You can still write code yourself. Just like you can still walk to work, you do not need to use a car.


I just tested both Codex 5.3 and Opus 4.6 and both returned pretty good output, but Opus 4.6's limits are way too strict. I am probably going to cancel my Claude subscription for that reason:

What do you want to do?

  1. Stop and wait for limit to reset
  2. Switch to extra usage
  3. Upgrade your plan

  Enter to confirm · Esc to cancel
How come they don't have "Cancel your subscription and uninstall Claude Code"? Codex lasts for way longer without shaking me down for more money off the base $xx/month subscription.


How else are they going to supplement their own development expenses? The more Claude Anthropic needs, the less Claude the customer will get. By their own admission, that is how the Anthropic model works. Their end value is in using vibe coders and engineers alike to create a persistent synthetic developer that replaces their own employees and most of their customers.

Scalable Intelligence is just a wrapper for centralized power. All AI companies are headed that way.


If it helps, try hedging between Copilot, Claude, OpenCode, and ChatGPT. That is how I have been managing of late: Claude for planning and some nasty things, ChatGPT for quick questions, OpenCode with Sonnet 4.5 on Bedrock, and Copilot with Sonnet 4.5/Opus 4.5 (LOL)


They introduced the low-limit warning for Opus on claude.ai


Exactly 100%

I read these comments and articles and feel like I am completely disconnected from most people here. Why not use GenAI the way it actually works best: like autocomplete on steroids. You stay the architect, and you have it write code function by function. Don't show up in Claude Code or Codex asking it to "please write me GTA 6 with no mistakes or you go to jail, please."

It feels like a lot of people are using GenAI wrong.


> It feels like a lot of people are using GenAI wrong.

That argument doesn’t fly when the sellers of the technology literally sing at you “there’s no wrong way to prompt”.

https://youtu.be/9bBfYX8X5aU?t=48


I feel like people who can't get AI to write production-ready code are really bad at describing what they want done. The problem is that people want an LLM to one-shot GTA 6. When the average software developer prompts an LLM, they expect 1) absolutely safe code, 2) optimized/performant code, and 3) production-ready code, without even spelling out the requirements for credential/session handling.

You need to prompt it like it's an idiot; you need to be the architect and the person who leads the LLM into writing performant and safe code. You can't expect it to turnkey one-shot everything. LLMs are not at that point yet.


That's just the thing though - it seems like, to get really good code out of an LLM, a lot of the time, you have to describe everything you want done and the full context in such excruciating detail and go through so many rounds of review and correction that it would be faster and easier to just write the code yourself.


Yes, but please remember you specify the common parts only once for the agent. From there, it bases its actions on all the instructions you keep in its configuration.
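For instance (a minimal sketch; Claude Code reads a CLAUDE.md at the repo root, and other tools have equivalents like AGENTS.md; the conventions below are made-up examples):

  # CLAUDE.md
  ## Conventions
  - Ruby 3.3; rubocop must pass before a task counts as done
  - Never modify db/migrate/ without asking first
  ## Commands
  - Run tests: bundle exec rspec

Every session then starts from those instructions instead of you re-typing them each time.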


Welcome to the waterfall development model. This is what companies did before enshittification.


I’ve found LLMs to be severely underwhelming. A week or two ago I tried having both Gemini 3 and GPT Codex refactor a simple Ruby class hierarchy, and neither could even identify the classes that inherited from the class I wanted removed. Describing what was wanted here boils down to minimal language, and they both failed.


I tried getting AI to update some JUnit 4 to JUnit 5 - it replaced the JUnit 4 assertions with Java's built-in assert keyword. Very underwhelming.
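For context, the correct migration is mostly mechanical, which makes that failure worse - Java's assert keyword is disabled at runtime unless you pass -ea, so those "tests" would silently pass. A rough sketch of what the change should look like (class and method names invented):

  // JUnit 4
  import static org.junit.Assert.assertEquals;
  import org.junit.Test;

  public class MathTest {
      @Test
      public void addition() {
          // JUnit 4 puts the optional message first
          assertEquals("sum should match", 4, 2 + 2);
      }
  }

  // JUnit 5 equivalent: new packages, and the message moves to the last argument
  import static org.junit.jupiter.api.Assertions.assertEquals;
  import org.junit.jupiter.api.Test;

  class MathTest {
      @Test
      void addition() {
          assertEquals(4, 2 + 2, "sum should match");
      }
  }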


Exactly this. Not sure what code other people who post here are writing, but it cannot always and only be bleeding-edge, fringe, incredible code. They don't seem to be able to get modern LLMs to produce decent code in Go or Rust, while I can fully prototype a new ESP32 board I've never seen before in Rust, and it manages to solve even edge cases I can't find answers for on dedicated forums.


I have a sneaking suspicion that AI use isn't as easy as it's made out to be. There certainly seem to be a lot of people who fail to use it effectively, while others have great success. That indicates either a luck or a skill factor. The latter seems more likely.

What are your secrets? Teach me the dark arts!


There are wide gaps in:

1) the models people are using (the default model in Copilot vs. Opus 4.5 or Codex xhigh)

2) the tools people are using (ChatGPT vs. Copilot vs. Codex vs. Claude Code)

3) when people tried these tools (e.g., December saw a substantial capability increase, but some people tried AI exactly once, back in March)

4) how much effort people put into writing prompts (e.g., one vague sentence vs. a couple paragraphs of specific constraints and instructions)

Especially with all the hype, it makes sense to me why people have such different estimates for how useful AI actually is.


This sounds like my first job with a big consulting firm many years ago (COBOL as it happens) where programming tasks that were close to pseudocode were handed to the programmers by the analysts. The programmer (in theory) would have very few questions about what he was supposed to write, and was essentially just translating from the firm's internal spec language into COBOL.


I find that working at the granularity current LLMs need to produce good enough output, while also verifying its correctness, is more effort than writing the code directly. The usefulness of LLMs to me is in pointing me in a direction that I can then manually verify and implement.


I did some testing this morning:

Prompt: "can you find any mistakes on my codebase? I put one in there on purpose" + 70,000 tokens of codebase where in one line I have an include for a non-existent file.

Claude 3.7 Sonnet (Try 1 - new chat): Claimed to have found a race condition in the API of the admin interface that would be triggered if two admins changed the room order at the same time. Claude suggested I group all SQL queries in a single transaction. I looked at the code and found that it already used a transaction for all queries. I said: the order_update API is already done with a transaction. Claude replied: "You're absolutely right, and I apologize for my mistake. I was incorrect to claim there was a race condition issue. The transaction ensures atomicity and consistency of the updates, and the SQL queries are properly structured for their intended purpose."

Claude 3.7 Sonnet (Try 2 - new chat): Found a comment in the admin UI JavaScript code that suggested a potential alternative to event handler cleanup, which was not implemented because I decided to go with a cleaner route. It then asked: "Is this the issue you intentionally included, or would you like me to look for other potential problems?" I said: "The comment merely suggests an alternative, right?" Claude said: "Yes, you're absolutely right. The comment is merely suggesting an alternative approach that isn't being used in the code, rather than indicating a mistake. So there's no actual bug or mistake in this part of the code - just documentation of different possible approaches. I apologize for misinterpreting this as an issue!"

Claude 3.7 Sonnet (Try 3 - new chat): When processing items out of the database to generate QR codes in the admin interface, Claude said that my code both attempts to generate QR codes with undefined data AS WELL AS saying that my error handling skips undefined data. Claude contradicts itself within two sentences. When asked for clarification, Claude replied: "Looking at the code more carefully, I see that the code actually has proper error handling. I incorrectly stated that it 'still attempts to call generateQRCode()' in the first part of my analysis, which was wrong. The code properly handles the case when there's no data-room attribute."

Gemini Advanced 2.5 Pro (Try 1 - new chat): Found the intentional error and said I should stop putting DB creds/API keys into the codebase.

Gemini Advanced 2.5 Pro (Try 2 - new chat): Found the intentional error and said I should stop putting DB creds/API keys into the codebase.

Gemini Advanced 2.5 Pro (Try 3 - new chat): Found the intentional error and said I should stop putting DB creds/API keys into the codebase.

o4-mini-high and o4-mini and o3 and 4.5 and 4o - "The message you submitted was too long, please reload the conversation and submit something shorter."


The thread is about 2.5 Flash though, not 2.5 Pro. Maybe you can try again with 2.5 Flash specifically? Even though it's a small model.


I don’t particularly care about the non-frontier models, though; I found the comment very useful.


Those responses are very Claude, too. 3.7 has powered our agentic workflows for weeks, but I've been using almost only Gemini for the last week and feel the output is generally better. It's gotten much better at agentic workflows (using 2.0 in an agent setup was not working well at all), and I prefer its tuning over Claude's: more to the point and less meandering.


> codebase where in one line I have an include for a non-existent file

Ok but you don't need AI for this; almost any IDE will issue a warning for that kind of error...


3 different answers in 3 tries for Claude? Makes me curious how many times you'd get the same answer if you asked 10/20/100 times


Have you tried Claude Code?


How did you put your whole codebase in a prompt for Gemini?


You can use netboot.xyz from a flash drive to boot various operating systems and utilities. Alternatively, PXE (Preboot Execution Environment) has been around for a while and works by allowing a network-capable device to boot from its network interface. A PXE-compatible network card requests a DHCP lease during the boot process, which provides the IP address of a TFTP (Trivial File Transfer Protocol) server and the file that needs to be loaded from the server.

Typically, the network card contains a basic PXE kernel. To enhance this environment, you can chainload iPXE, which offers a broader range of features. iPXE allows for more advanced booting options, such as loading scripts or initiating an unattended installation directly from the network.
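If you want to try the chainload, here is a minimal dnsmasq sketch (assuming dnsmasq is already the network's DHCP server; paths are examples). The tag trick works because iPXE sets DHCP option 175 when it requests a lease, so the HTTP boot target is only handed out on the second stage:

  # /etc/dnsmasq.conf
  enable-tftp
  tftp-root=/srv/tftp              # place undionly.kpxe (an iPXE build) here
  # First pass: the NIC's basic PXE ROM fetches iPXE over TFTP
  dhcp-boot=undionly.kpxe
  # Second pass: iPXE identifies itself via DHCP option 175...
  dhcp-match=set:ipxe,175
  # ...and is chained straight to netboot.xyz over HTTP instead
  dhcp-boot=tag:ipxe,http://boot.netboot.xyz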


This is a nothingburger compared to Amazon and Google giving $4B and $2B respectively to Anthropic.


Find a new job, or if you like the job you have, just ignore it. Or, if you want to get fired, start explaining concepts to your boss.


I know that ImprovMX was reading my emails. Reason? I got an email from them saying: "We've detected activity on your domain that violates our Terms of Service, particularly the 'Prohibited Activities and Responsible Usage' section."

Yea, no thank you. What's been working really well is Cloudflare's email forwarding service; plus, it's free, unlike ImprovMX.


Given that they specifically say they don't do that, what concrete evidence do you have? I don't see anything in their prohibited-activities list that would require them to do so (e.g., recipient complaints would suffice).

https://improvmx.com/terms-of-service/

https://improvmx.com/our-pledge-to-you/

