Usually you can check the commit history for the unexpected line, though, to figure out whether the bug is in the code or in the comment.
Comments that have the same content as the code but written in English will have code drift problems. But most comments aren't like this; ideally, they provide context or explain what's happening at a higher level.
A good rule of thumb I heard a few years ago: assume you're explaining this chunk of the system to a team member standing next to you, and write that down. Don't worry about the tone feeling casual.
And once or twice, at the top of a weird class/task file/section/..., don't be afraid to be a bit verbose: explain it until it's obvious, and then go one more level. Stuff tends to be obvious while you have all the context uploaded into your mental caches - but a year down the line, it'll be rather confusing. Still, too many long comments in straight-line code tend to make it harder to read.
Curiously, this seems like exactly the sort of thing that a code ML model should be able to identify and flag: 'Hey, the comment says this, but the code is doing something totally different.'
However, it might be trained on code with erroneous comments, either because a comment was mangled by a merge or because it's outdated. The more often this happens, the more confused the model will be.
Which makes me think that these models should be trained on the evolution of code across commit chains, not just on isolated snippets. That way, the AI could analyze your own commits and detect when a comment becomes outdated.
But ideally the comments should be executable, as unit tests, making you read them if and only if you break them.
For this to be a tolerable development experience, test as much as you can while keeping your tests away from slow dependencies (networking, DB, disk I/O, ...), and try to keep the tests relevant to what you're modifying runnable locally in a few seconds.
Maybe even refactor your app to have dependencies at the top, so that most code doesn't have access to them.
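As a rough sketch of that shape (all names here are hypothetical, not from any particular codebase): keep the slow dependencies at the entry point and keep the bulk of the logic pure, so the tests you run constantly never touch the network or the database.

# Pure core: no I/O anywhere, so tests run in microseconds.
def compute_refund(ticket_value: float, amount_paid: float) -> float:
    return min(ticket_value, amount_paid)

# Impure shell at the top: the only code that sees the slow dependencies.
def handle_refund_request(ticket_id, db, mailer):
    ticket = db.load_ticket(ticket_id)            # DB access lives here...
    amount = compute_refund(ticket.value, ticket.paid)
    mailer.send_refund_notice(ticket_id, amount)  # ...and the network here

# The interesting test needs neither:
assert compute_refund(500.0, 300.0) == 300.0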
> But ideally the comments should be executable, as unit tests
For the kinds of comments that answer the "why", if we could do that, we wouldn't need the actual code in the first place.
> making you read them if and only if you break them.
A good "why" comment is supposed to inform you beforehand, so you can make changes effectively and without introducing extra bugs in the process. Unit tests are more of a safety net.
I'm imagining the poster is thinking of things like Rust doctests, where code in the comments is executed as tests when you run cargo test on the project. It's a nice way to ensure that (at least part of) the documentation corresponds to the behavior of the code.
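For readers more familiar with Python, the standard library's doctest module works the same way: examples embedded in docstrings get executed as tests, so that part of the documentation fails loudly when it drifts from the code. A minimal sketch:

def add_one(x):
    """Add 1 to x.

    >>> add_one(2)
    3
    """
    return x + 1

if __name__ == "__main__":
    import doctest
    doctest.testmod()  # fails if the docstring examples drift from the code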
Unit tests suck at being documentation, and are not a good substitute for the "why" comment. They can catch some mistakes you make with the code under test, but they can't tell you why it is the way it is. At best, they can help you guess, at the cost of having to study the test code itself (which is usually much bigger than the code it tests, and often more complicated). But the thing is, the knowledge of "why" is most valuable to have before you start making changes and break some tests.
This is true. Test code, especially for code that has to interact with other systems in particular ways, often has ten lines of setup that only matter inside the test for every line of actual verification.
I once wrote something like:

// in case this is malformed, fix formatting so it will still parse
input = fixInputFormatting(input);

and had a code reviewer ask, "why are you calling fixInputFormatting"?
Nothing to raise the blood pressure like a code review question that is literally answered by a comment on the line immediately preceding where they left the question.
I'm with the reviewer on this one. Why is it malformed? Why are you fixing it here and not earlier? Why are you fixing it and not rejecting it? The comment tells me nothing.
I don't remember the exact comment/code pair anymore, but the answer they wanted was exactly what was in the comment on the preceding line. Coming up with a simple example to demonstrate that is surprisingly hard.
Agreed. Comments that explain in English exactly what the next line does drive me crazy, even if the line is complicated. These days I pretty much only comment when I've changed from one approach to another, e.g.
// This code is weird. I tried doing it the obvious way, but that doesn't work because .. reasons ..
Sometimes, if the code is short, I'll even leave the old/obvious code there for future reference when I look at the weird code and say to myself:
"This is weird! Obviously it should work in this much simpler way.."
I mostly use comments to explain business rules, and why they are implemented where they are. For example:
# High value orders need to be approved before refund.
# Similar logic is also applied elsewhere; this is here
# as a failsafe.
if ticketValue > 500:
    emailCustomerSupport(ticket)
else:
    refund(ticket)
I know that by ".. reasons .." you mean "<and I add the actual reasons here>", but I've seen too many comments worded exactly like that.
Some programmers use comments (correctly) to explain reasoning and context, some use them to redundantly say what the code already says and some, apparently, use them to apologise.
As a code reviewer, I love seeing comments like that, because it immediately flags "someone made a mistake here".
Sure, sometimes the code is right and the comment is wrong -- but sometimes the comment is right and the code is wrong, in which case the comment just saved me a lot of time.
I think it was Dijkstra who said that software actually lives in the mind of the programmer, and the code is just a distorted, lossy representation of it. Anything that sheds light on the programmer's thoughts is likely to improve our understanding of the software. Bugs happen when there's a mismatch between what's going on in the mind of the programmer and what the code actually does, so when we read code, a primary task is to understand what the programmer thought it should do.
Comments inform us about the things the programmer cared enough to write down. When we see a seemingly trivial comment, we may ask: why did they take the time to write that? Did they think there was some subtlety we aren't aware of? Or were they perhaps inexperienced with the language, to the point of having a hard time reading the code they themselves wrote? (If I put this comment into Google, will I find they copy-pasted it from Stack Overflow? In that case, the comment may be very helpful, if only to track that down [0].)
[0] but even better would be an IDE that highlighted code copy-pasted from Stack Overflow, GitHub repositories, etc.
It's also possible that both the comment and the code are right, and there's some non-obvious reason why +=2 has the effect of adding 1 here, and is the only way to do it. (Not literally, as in this example, but something analogous.)
A bunch, too. So I don't think comments are solely at fault. Self-documenting code is only as good as the person who wrote it and the people who approved it. Sometimes a comment is warranted, sometimes it's not.
Yesterday I learned that in emacs lisp, "defvar" only assigns its value if the variable isn't already set, so re-evaluating it changes nothing (i.e. it will not VARy), while "defconst" re-assigns its value on every evaluation (i.e. it is not CONSTant). Naming things is hard.
I've completely given up on comments that aren't "here is the problem this code is solving". The comments end up being actually useful that way, and longer lived in their validity. Anything more granular than that, the code itself should make obvious.
A tool like this, which tells me how surprising my code is and takes the comments around it into account, might change my mind on this front. If it always works as well as in the OP, it would be super useful to know how surprising the code I'm writing is (I can then judge whether that's OK with me), and it would also make it hard for the code and the comments to diverge greatly.
I mean, I still won't want "add 1 to x" comments of course.
Really a good code AI would note that the comments are misleading (which is sort of what this is doing).
It's actually completely achievable with today's models: look at the comment and the code immediately after it, see how surprising the code is, and flag that the comment could be incorrect.
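As a minimal sketch of that idea, assuming the OpenAI Python client with an API key in the environment (the model name and prompt are illustrative, not a recommendation):

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def comment_matches_code(comment: str, code: str) -> str:
    """Ask a model whether a comment agrees with the code below it."""
    prompt = (
        "Does this comment accurately describe the code that follows?\n"
        f"Comment: {comment}\nCode: {code}\n"
        "Answer YES or NO, then explain briefly."
    )
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # any chat-capable model would do here
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

print(comment_matches_code("add 1 to x", "x += 2"))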
Some people like granular comments, and some people only want high-level comments. I’ve long suspected that both camps are correct, because they’re using different languages.
A single line of Python data analysis code is often worth 20 lines of C++. If you would be willing to add one comment per 20 lines in C++, then nearly every line of your pandas gobbledygook is worth commenting.
Terse code is good, but that doesn’t necessarily mean the comments should be terse (or absent).
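For instance (a made-up example; the DataFrame and column names are hypothetical), one dense pandas expression can carry as much logic as a screenful of C++, at which point a comment per statement starts to look reasonable:

import pandas as pd

# Hypothetical order data: customer, order sequence number, revenue.
orders = pd.DataFrame({
    "customer_id": [1, 1, 1, 2, 2],
    "order_seq":   [1, 2, 3, 1, 2],
    "revenue":     [10.0, 20.0, 30.0, 5.0, 15.0],
})

# One dense expression doing what ~20 lines of C++ might: per customer,
# the share of revenue that comes from repeat purchases (order_seq > 1).
repeat_share = orders.groupby("customer_id").apply(
    lambda g: g.loc[g["order_seq"] > 1, "revenue"].sum() / g["revenue"].sum()
)
print(repeat_share)  # customer 1 -> 50/60, customer 2 -> 15/20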
But if you're using comments to explain the code, 9 times out of 10 you just wrote it in too unreadable a way.
Sure, some algorithms are complex enough that comments are needed to explain the how (that's the 1 in 10), but in most cases comments should explain why, not how. So instead it should be:
// Add the calibrated skew to compensate for latency
x+=2;
That risk also exists for identifiers, which can mislead to exactly the same degree as they can inform. To avoid this hazard, run the code through an obfuscator that substitutes meaningless identifiers, before looking at it.
That's why you keep the old comments and write the changes within.
"If loop does something" #rev1
"If loop did do something, it now does something twice" #rev2
"If loop doesn't do something, it does something three times" #rev3
"we loop three times because we processing supervariables" #rev4
And you have a false illusion of documentation. You don't need ASCII diagrams. Why not a scribble in a sketchbook? Whiteboards and photography exist. And the method above doesn't require you to redraw anything: you've already got the first and last revisions. Besides, the documentation cycle of your project's life-cycle is when you update all documentation anyway.
> you'll either get less refactors, or outdated comments
If so, you're not disciplined enough. If your project is to be handed over down the line, more documentation is better than less, and any documentation is better than none.
It's like the premature optimisation quote. The advice is technically sound, but it's still not really good advice, because 90% of the time people quote it as an excuse not to care about performance at all.
Or, in your case, as an excuse not to write comments at all. That's obviously a terrible idea.
Code that I wrote, about which I subsequently had to ask: Why aren't you processing the first/last element of this sequence? Why are you casting this to a list when it's already a tuple? Why are you filtering out Decimal("NaN") from tests but not float NaNs?
I once wrote some code so convoluted I put a mickey mouse ascii art in comments and a note that said "This is some Mickey Mouse BS. I hope you never have to maintain it."
That's probably the only useful comment I've ever written.
I agree with that, but it can get to the point where people police all the comments in a codebase, deeming them useless. Especially in a huge codebase, I think explaining why a certain block of code exists is very helpful for transferring knowledge.
I'll go further. Comments are a code smell. They mean you've probably not modularized your code enough, or named your functions and variables descriptively enough. As I approach 20 years of coding, I can count on one hand the times I've truly needed to add a comment.
(I don't mean the DocBlock type comments for describing functions and class interfaces, that get compiled into docs.)
To all you downvoters: please do respond with examples of comments that are counterexamples to what I said!
In non-performance-critical code, comments usually aren't necessary, but they are godsends when you need to do something tricky for performance. Standard examples are things like bit hacks.
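For instance, a classic bit hack is near-unreadable without its comment (an illustrative sketch in Python):

def count_set_bits(n: int) -> int:
    # Brian Kernighan's trick: n & (n - 1) clears the lowest set bit,
    # so the loop runs once per 1-bit instead of once per bit position.
    count = 0
    while n:
        n &= n - 1
        count += 1
    return count

assert count_set_bits(0b1011) == 3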
Observation: many comments explaining what the code is doing don't match what the code is doing after a few check-ins. E.g., I've seen variations on
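// add 1 to x
x+=2;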
too many times to count.