GCC's build process does this. GCC is built 3 separate times, starting with the host compiler, then each time with the compiler from the previous stage. If the outputs of stage 2 and stage 3 do not match, the build fails.
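A sketch of what that stage comparison amounts to, written in Python rather than the Makefile machinery GCC actually uses (the directory layout and the `*.o` glob are illustrative assumptions, not GCC's real file selection):

```python
import hashlib
from pathlib import Path

def tree_digest(root: str) -> dict:
    """Map each object file's relative path to a SHA-256 of its bytes."""
    root_path = Path(root)
    return {
        str(p.relative_to(root_path)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root_path.rglob("*.o"))
    }

def stages_match(stage2_dir: str, stage3_dir: str) -> bool:
    """The stage-2 compiler and the stage-3 compiler it built should
    produce byte-identical object files; any difference fails the build."""
    return tree_digest(stage2_dir) == tree_digest(stage3_dir)
```

The point of the check is that a compiler compiled by itself should be a fixed point: if stage 2 and stage 3 differ, something is nondeterministic or miscompiled.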
It's nice that AI can fix bugs fast, but it's better to not even have bugs in the first place. By using someone else's battle tested code (like a framework) you can at least avoid the bugs they've already encountered and fixed.
I spent Dry January working on a new coding project, and since all my nerd friends have been telling me to try coding with LLMs, I gave it a shot and signed up for Google Gemini...
All I can say is "holy shit, I'm a believer." I've probably got close to a year's worth of coding done in a month and a half.
Busy work that would have taken me a day to look up, figure out, and write -- boring shit like matplotlib illustrations -- is trivial now.
Then there are the ideas I'm not sure how to implement ("what are some different ways to do this weird thing?") that I would have spent a week on trying to figure out a reasonable approach. It basically has two or three decent ideas right away, even if they're not perfect. There was one vectorization approach I would never have thought of that I'm now using.
Is the LLM wrong? Yes, all the damn time! Do I need to, you know, actually do a code review when I'm implementing its ideas? Very much yes! Do I get into a back-and-forth battle with the LLM when it starts spitting out nonsense, shut the chat down, and start over with a freshly primed window? Yes, about once every couple of days.
It's still absolutely incredible. I've been a skeptic for a very long time. I studied philosophy, and the conceptions people have of language and Truth get completely garbled by an LLM that isn't really a mind that can think in the way we do. That said, holy shit it can do an absolute ton of busy work.
What kind of project / prompts - what's working for you?

I spent a good 20 years in the software world but have been away doing other things professionally for a couple of years. Recently I was in the same place as you, with a new project and wanting to try it out. So I started with a generic Django project in VSCode, used the agent mode, and... what a waste of time. The auto-complete suggestions it makes are frequently wrong, and the actions it takes in response to my prompts tend to make a mess on the order of a junior developer.

I keep trying to figure out what I'm doing wrong, as I'm prompting pretty simple concepts at it. If you know Django, imagine concepts like "add the foo module to settings.py" or "Run the check command and diagnose why the foo app isn't registered correctly". Before you know it, it's spiraling out of control with changes it thinks it is making, all of which are hallucinations.
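For readers who don't know Django, the first of those prompts amounts to a one-line edit to a list in the project's settings.py ("foo" is the commenter's placeholder app name, and the surrounding entries are the Django defaults, shown here for context):

```python
# settings.py: what "add the foo module to settings.py" boils down to
INSTALLED_APPS = [
    "django.contrib.admin",
    "django.contrib.auth",
    "django.contrib.contenttypes",
    "foo",  # register the foo app so Django discovers its models and templates
]
```

That the agent can spiral on something this mechanical is the frustration being described.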
I'm just using Gemini in the browser. I'm not ready to let it touch my code. Here are my last two prompts, for context the project is about golf course architecture:
Me, including the architecture_diff.py file: I would like to add another map to architecture_diff. I want the map to show the level of divergence of the angle of the two shots to the two different holes from each point. That is, when you are right in between the two holes, it should be a 180 degree difference, and should be very dark, but when you're on the tee, and the shot is almost identical, it should be very light. Does this make sense? I realize this might require more calculations, but I think it's important.
Gemini output was some garbage about a simple naive angle to two hole locations, rather than using the sophisticated expected value formula I'm using to calculate strokes-to-hole... thus worthless.
Follow up from me, including the course.py and the player.py files: I don't just want the angle, I want the angle between the optimal shot, given the dispersion pattern. We may need to update get_smart_aim in the player to return the vector it uses, and we may need to cache that info. We may need to update generate_strokes_gained_map in course to also return the vectors used. I'm really not sure. Take as much time as you need. I'd like a good idea to consider before actually implementing this.
Gemini output now has a helpful response about saving the vector field as we generate the different maps I'm trying to create as they are created. This is exactly the type of code I was looking for.
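A rough sketch of where that idea could land, assuming the optimal aim vectors from the expected-value model are already cached as two (H, W, 2) arrays, one per hole (the function name and array layout are my invention for illustration, not the project's actual API):

```python
import numpy as np

def divergence_map(aim_a: np.ndarray, aim_b: np.ndarray) -> np.ndarray:
    """Angle in degrees between two cached aim-vector fields of shape (H, W, 2).

    Returns 0 where the optimal shots to the two holes coincide (e.g. on a
    shared tee) and 180 where they point in opposite directions (standing
    right between the holes), matching the light-to-dark map described above.
    """
    dot = np.sum(aim_a * aim_b, axis=-1)
    norms = np.linalg.norm(aim_a, axis=-1) * np.linalg.norm(aim_b, axis=-1)
    cos = np.clip(dot / norms, -1.0, 1.0)  # clip guards against float error
    return np.degrees(np.arccos(cos))
```

The whole calculation is then just array arithmetic over the cached vector fields, which is why saving the vectors as the other maps are generated is the key move.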
I recently started building a POC for an app idea. As the framework I chose Django, and I did not once write code myself. The whole thing was done in a GitHub Codespace with Copilot in agentic mode, using mostly Sonnet and Opus models.
For prompting, I did not give it specific instructions like "add x to settings". I told it "We are now working on feature X. X should be able to do a, b and c. B has the following constraints. C should work like this." I also have some instructions in the agents.md file which tell the model, before starting to code, to ask me all unclear questions and then make a comprehensive plan of what to implement. I would then go over this plan, clarify or change it if needed, and then let it run for 5-15 minutes. And every time it just did it. The whole thing, with debugging, with tests. Sure, sometimes there were minor bugs when I tested, but then I prompted the problem directly, and sure enough it got fixed in seconds...
Not sure why we had such different experiences. Maybe you are using other models? Maybe you're missing something in your prompts? Letting it start with a plan which I can then check definitely helped a lot. A summary of the app's workings and technical decisions (also produced by the model) maybe helped in the long run too.
I don't use VSCode, but I've heard that the default model isn't that great. I'd make sure you're using something like Opus 4.5/4.6. I'm not familiar enough with VSCode to know if it's somehow worse than Claude Code even with the same models, but you can test Claude Code to rule that out. It could also be that you've stumbled upon a problem the AI isn't that good at. For example, I was diagnosing a C++ build issue, and I could tell the AI was off track.
Most of the people that get wowed use an AI on a somewhat difficult task that they're unfamiliar with. For me, that was basically a duplicate of Apple's Live Captions that could also translate. Other examples I've seen are repairing a video file, or building a viewer for a proprietary medical imaging format. For my captions example, I don't think I would have put in the time to work on it without AI, and I was able to get a working prototype within minutes and then it took maybe a couple more hours to get it running smoother.
Also >20 years in software. The VSCode autocomplete, regardless of the model, never worked well for me. But Claude Code is something else: it doesn't do autocomplete per se; it will make modifications, test, debug if something fails, and iterate until it gets it right.
I'm (mostly) a believer too, and I think AI makes using and improving these existing frameworks and libraries even easier.
You mentioned matplotlib. Why does it make sense to pay for a bunch of AI agents to re-invent what matplotlib does and fix bugs that matplotlib has already fixed, instead of just having AI agents write code that uses it?
I mean, the thesis of the post is odd. I'll grant you that.
I work mostly with python (the vast majority is pure python), flask, and htmx, with a bit of vanilla js thrown in.
In a sense, I can understand the thesis. On the one hand, Flask is a fantastic tool, with a reasonable abstraction given the high complexity. I wouldn't want to replace Flask. On the other hand, HTMX is a great tool, but often imperfect for exactly what I'm trying to do. Most people would say "well, just use React!" except that I honestly loathe working with js, and unless someone is paying me, I'll do it in python. I could see working with an LLM to build a custom tool: a version of HTMX that interacts with Flask the way I want it to.
In fact, in the project I'm working on now I'm building complex heatmap illustrations that require a ton of data processing, so I've been building a model to reduce the NP-hard aspects of that process. However, the illustrations are the point, and I've already had a back and forth with the LLM about porting the project to HTML, or at least some web-based form of illustration, simply because I'd have much more control over the illustrations. Right now matplotlib still suits me just fine, but if I had to port it, I could see just building my own tool instead of finding an existing framework and learning it.
Frameworks are mostly useful because of group knowledge. I learn Flask because I don't want to build all these tools from scratch, and because it makes me literate in a very common language. The author is suggesting that these barriers, at least for your own code, functionally don't exist anymore: learning a new framework is about as labor-intensive as learning one you're creating as you go. I think that's short-sighted, yes, but depending on the project, when it's trivial to build the tool you want, it's tempting to do that instead of learning to use a similar tool that needs two adapters attached to it to work well on the job you're trying to do.
At the same time, this is about scope. Anyone throwing out React because they want to just "invent their own entire web framework" is just being an idiot.
Well maintained, popular frameworks have github issues that frequently get resolved with newly patched versions of the framework. Sometimes bugs get fixed that you didn't even run into yet so everybody benefits.
Will your bespoke LLM code have that? Every issue will actually be an issue in production experienced by your customers, that will have to be identified (better have good logging and instrumentation), and fixed in your codebase.
Frameworks that are (relatively) buggy and slow to address bugs lose popularity, to the point that people will spontaneously create alternatives. This has happened many times.
Have you? Then you know that the number of defects scales roughly linearly with the amount of code. As things stand, models write a lot more code than a skilled human would for a given requirement.
In practice using someone else’s framework means you’re accepting the risk of the thousands of bugs in the framework that have no relevance to your business use case and will never be fixed.
Yet people still use frameworks, before and after the age of LLMs. Frameworks must be doing something right, I guess. Otherwise everyone would vibe-code their own little React in the codebase.
Shooting the driver of a car that's driving at you is not self defense. Cars don't instantly stop if the driver is incapacitated. You'll likely make the situation even worse because the incapacitated driver's foot will press the accelerator down (exactly what happened here). If your actual intent is to defend yourself the only move that makes any sense is to get out of the way.
From the outside, as a user, this last year has been incredible for KDE Plasma, in terms of new features and stability. So whatever they're doing internally is absolutely working from my perspective.
I don't think the Interstellar Blu-ray has Dolby Vision (or Dolby Atmos), just regular HDR10. If the TV/AVR says it's Dolby Vision something in your setup might be doing some kind of upconversion.
You're right! It looks like the Sony UBP-X700 doesn't automatically detect the HDR type and was set to Dolby Vision. I turned it off and the TV now displays the same HDR logo it shows when connecting to the PC. The AVR says...
...colors are now more aligned with the PC. The Blu-ray video seems to be showing more detail in the explosion. I thought this extra detail was because more color was being shown, but I now think it might have something to do with YouTube's HDR video being more compressed.
Desktop/Laptop Linux is improving pretty fast, but by using an LTS distro like Debian you miss out on a lot of that.
I had to run Ubuntu 22.04 on a laptop for a while and encountered similar monitor switching and bluetooth issues. Eventually I figured out I could get the latest version of most desktop packages from the KDE Neon repos since they were also based on 22.04 at the time.
Running the latest KDE Plasma desktop with the latest mesa and pipewire made a huge difference. Monitor switching now works every time, all the bluetooth features worked, battery life improved, and Firefox stopped crashing when using webgl.
I'm not saying it'll fix all your problems, but most of these problems are being actively worked on, and I think it's worth trying a distro that actually keeps up with the pace of that work.
"Filmmaker mode" is the industry's attempt at this. On supported TVs it's just another picture mode (like vivid or standard), but it disables all the junk the other modes have enabled by default, without wading through all the individual settings. I don't know how widely adopted it is, but my LG OLED from 2020 has it.
The problem with filmmaker mode is that I don't trust it more than the other modes. It would take no effort at all for a TV maker to start fiddling with "filmmaker mode" to boost colors or something to "get an edge"; then everyone does it, and we're back where we started. I just turn the features off individually and leave it that way. Companies have already proven time and again that they'll make changes we don't like just because they can, so it's important to take every opportunity to deny them even the chance.
"Filmmaker mode" is a trademark of the UHD Alliance, so if TV makers want to deviate from the spec they can't call it "Filmmaker mode" anymore. There are a few different TV makers in the UHD Alliance, so there's an incentive for the spec not to have wiggle room that one member could exploit to the detriment of the others.
It's true that Filmmaker Mode might at some point in the future be corrupted, but in the actual world of today, if you go to a TV and set it to Filmmaker Mode, it's going to move most things to correct settings, and all things to correct settings on at least some TVs.
(The trickiest thing is actually brightness. LG originally used to set brightness to 100 nits in Filmmaker Mode for SDR, which is correct dark room behavior -- but a lot of people aren't in dark rooms and want brighter screens, so they changed it to be significantly brighter. Defensible, but it now means that if you are in a dark room, you have to look up which brightness level is close to 100 nits.)
Game mode being latency-optimized really is the saving grace in a market segment where the big brands try to keep hardware cost as cheap as possible. Sure, you _could_ have a game mode that does all of the fancy processing closer to real-time, but now you can't use a bargain-basement CPU.
On my Samsung OLED, game mode has an annoying effect that turns (nearly) completely black screens into gray smudge garbage, which you can only turn down but not completely off, making that mode entirely useless.
Yup, it's great, at least for live action content. I've found that for Anime, a small amount of motion interpolation is absolutely needed on my OLED, otherwise the content has horrible judder.
I always found that weird: anime relies on motion blur for smoothness when panning/scrolling, and motion interpolation works as an upgraded version of that... until it starts to interpolate the actual animation.
On my LG OLED I think it looks bad. Whites are off and I feel like the colours are squashed. Might be more accurate, but it's bad for me. I prefer to use standard, disable everything and put the white balance on neutral, neither cold nor warm.
I had just recently factory reset my samsung S90C QDOLED - and had to work through the annoying process of dialing the settings back to something sane and tasteful. Filmmaker mode only got it part of the way there. The white balance was still set to warm, and inexplicably HDR was static (ignoring the content 'hints'), and even then the contrast seemed off, and I had to set the dynamic contrast to 'low' (whatever that means) to keep everything from looking overly dark.
It makes me wish there were something like an industry-standard 'calibrated' mode that everyone could target, and let all the other garbage features be a divergence from that. Hell, there probably is, but they'd never suggest a consumer use that and not all of their value-add tacky DSP.
"Warm" or "Warm 2" or "Warm 50" is the correct white point on most TVs. Yes, it would make sense if some "Neutral" setting was where they put the standards-compliant setting, but in practice nobody ever wants it to be warmer than D6500, and lots of people want it some degree of cooler, so they anchor the proper setting to the warm side of their adjustment.
When you say that "HDR is static" you probably mean that "Dynamic tone-mapping" was turned off. This is also correct behavior. Dynamic tone-mapping isn't about using content settings to do per-scene tone-mapping (that's HDR10+ or Dolby Vision, though Samsung doesn't support the latter), it's about just yoloing the image to be brighter and more vivid than it should be rather than sticking to the accurate rendering.
What you're discovering here is that the reason TV makers put these "garbage features" in is that a lot of people like a TV picture that's too vivid, too blue, too bright. If you set it to the true standard settings, people's first impression is that it looks bad, as yours was. (But if you live with it for a while, it'll quickly start to look good, and then when you look at a blown-out picture, it'll look gross.)
“Filmmaker Mode” on LG OLED was horrible. Yes, all of the “extra” features were off, but it was overly warm and unbalanced as hell. I either don’t understand “Filmmakers” or that mode is intended to be so bad that you will need to fix it yourself.
Filmmaker is warm because it follows the standardized D6500 white point. That's the monitor white point it is mastered against, and how it's intended to be seen.
TV producers always set their sets to a much higher color temperature by default because bluer whites show off colors better.
As a result of both that familiarity and the punchier saturation, most people don't like filmmaker mode when they first try it. After a few weeks, though, you'll be wondering why you ever liked the oversaturated neons and severely off brightness curve of the other modes.
The whites in Filmmaker Mode are not off. They'll look warm to you if you're used to the too-blue settings, but they're completely and measurably correct.
I'd suggest living with it for a while; if you do, you'll quickly get used to it, and then going to the "standard" (sic) setting will look too blue.
The problem is that compared to all the monitors I have, specifically the one in my Lenovo Yoga OLED, which is supposed to be very accurate, whites are very warm in filmmaker mode. What's that about?
Your monitor is probably set to the wrong settings for film content. Almost all monitors are set to a cool white point out of the box. If you're not producing film or color calibrated photography on your monitor, there is no standard white temperature for PC displays.
It means that the colors should be correct. The sky on tv should look like the sky. The grass on tv should look like grass. If I look at the screen and then I look outside, it should look the same. HDR screens and sensors are getting pretty close, but almost everyone is using color grading so the advantage is gone. And after colors, don't get me started about motion and the 24fps abomination.
> It means that the colors should be correct. The sky on tv should look like the sky. The grass on tv should look like grass.
It is not as clear cut as you think and is very much a gradient. I could send 10 different color gradings of the sky and grass to 10 different people and they could all say it looks “natural” to them, or a few would say it looks “off,” because our expectations of “natural” looks are not informed by any sort of objective rubric. Naturally if everyone says it’s off the common denominator is likely the colorist, but aside from that, the above generally holds. It’s why color grading with proper scopes and such is so important. You’re doing your best to meet the expectation for as many people as possible knowing that they will be looking on different devices, have different ideas of what a proper color is, are in different environments, etc. and ultimately you will still disappoint some folks. There are so many hardware factors at play stacked on top of an individual’s own expectations.
Even the color of the room you’re in or the color/intensity of the light in your peripheral vision will heavily influence how you perceive a color that is directly in front of you. Even if you walk around with a proper color reference chart checking everything it’s just always going to have a subjective element because you have your own opinion of what constitutes green grass.
In a way, this actually touches on a real issue. Instead of trying to please random people and making heuristics that work in arbitrary conditions, maybe start from objective reality? For a start, take a picture, then immediately compare it with the subject. If it looks identical, that's a good start. I haven't seen any device capable of doing this. Of course you would need the entire sensor-processing-screen chain to be calibrated for this.
Everything I talked about above applies even more so now that you’re trying to say “we’ll make a camera capture objective colors/reality.” That’s been a debate about cameras ever since the first images were taken. “The truth of the image.”
There is no such thing as the “correct” or “most natural” image. There is essentially no “true” image.
I completely agree. Theoretically you could capture and reproduce the entire spectrum for each pixel, but even that is not "true" because it is not the entire light field. But I still think that we can look at the picture on phone in the hand and at the subject just in front, and try to make them as similar as possible to our senses? This looks to me like a big improvement to the current state of affairs. Then you can always say to a critic: I checked just as i took the picture/movie, and this is exactly how the sky/grass/subject looked.
Well, I know what you mean, color is complicated. BUT I can look at a hundred skies and they look like sky. I look at the sky on the TV, and it looks like sky on a TV, not like the real sky. And sky is probably easy to replicate; if you take grass, or leaves, or human skin, the TV becomes funny most of the time.
> I will look at the sky on the tv, and it looks like sky on the tv, not like the real sky.
Well for starters you’re viewing the real sky in 3D and your TV is a 2D medium. Truly that immediately changes your perception and drastically. TV looks like TV no matter what.