
Lesson: adding comments that explain what you think the code is doing today isn't helpful.

Observation: many comments explaining what the code is doing no longer match the code after a few check-ins. E.g., I've seen variations on

  //Add 1 to x
  x+=2;
too many times to count.


Usually, though, you can check the commit history for the unexpected line to figure out whether the bug is in the code or the comment.

Comments that have the same content as the code but written in English will have code drift problems. But most comments aren't like this; they can provide context or explain what's happening at a higher level, ideally.


A good rule of thumb I heard a few years ago: Assume you're explaining this chunk of a system to a team member standing next to you. Write that down. And don't worry about the tone feeling casual.

And once or twice at the top of a weird class/task file/section/..., don't be afraid of being a bit verbose and explaining it until it's obvious, and then one more level. Stuff tends to be obvious while you have all the context loaded in your mental caches, but a year down the line it'll be rather confusing. Still, too many long comments like that in straight-line code tend to make it harder to read.


Speaking of commit history, bad merges can leave comments and code together that shouldn’t be. And it will pass tests.


Curiously, this seems like exactly the sort of thing that code ML should be able to identify and flag. 'Hey, the comment says this, but the code is doing something totally different.'


Then the comment can be removed, as it duplicates the code and has no use.


However, it might be trained on code with erroneous comments, either because the code was mangled by a merge or because the comment is outdated. The more often this happens, the more confused the AI model will be.

Which makes me think that AI models should be trained on the evolution of code across commit chains and not just on isolated snippets. That way, the AI could analyze your own commits to detect when a comment becomes outdated.
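
Even without a trained model, a crude version of "watch the commits" can be sketched. The snippet below is only an illustration under stated assumptions: the function name stale_comment_candidates is made up, the "old" revision (x+=1;) is invented for the demo, and the heuristic (a comment that stayed the same while the line right after it was rewritten) will produce both false positives and misses.

```python
import difflib


def stale_comment_candidates(old_src, new_src, marker="#"):
    """Flag comments that were left untouched while the next line was rewritten.

    Hypothetical helper: returns (line_number, comment, new_code) tuples,
    where line_number is the 1-based position of the comment in new_src.
    """
    old_lines = old_src.splitlines()
    new_lines = new_src.splitlines()
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    suspects = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # Only spans that were rewritten in place are interesting.
        if tag != "replace" or j1 == 0:
            continue
        # A non-equal span is always preceded by an equal one, so the line
        # just above the rewritten span is identical in both revisions.
        above = new_lines[j1 - 1].strip()
        if above.startswith(marker):
            suspects.append((j1, above, new_lines[j1].strip()))
    return suspects


# Assumed history for the example from the top of the thread.
old_revision = "//Add 1 to x\nx+=1;"
new_revision = "//Add 1 to x\nx+=2;"
print(stale_comment_candidates(old_revision, new_revision, marker="//"))
```

In practice the two revisions would come from version control rather than string literals, and a flagged comment is only a prompt for a human to look, not proof that it is wrong.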


I'd settle for an AST-aware merge. That would also fix comments coming unglued from the code they were attached to.


Bad merges can also leave code and code together that shouldn’t be, even if it passes tests. Always check your merges.


Indeed, comments should answer the "why".

But ideally the comments should be executable, as unit tests, making you read them if and only if you break them.

For this to be a tolerable development experience, test as much as you can while keeping your tests away from slow dependencies like networking, DB, disk I/O..., and try to keep tests relevant to what you're modifying executable locally in a few seconds.

Maybe even refactor your app to have dependencies at the top, so that most code doesn't have access to them.
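
A minimal sketch of what "dependencies at the top" could look like, assuming a made-up summarize function and a fetch_orders callable that stands in for whatever slow dependency (DB, network) the real app has. None of these names come from the thread.

```python
from typing import Callable, Dict, List


def summarize(fetch_orders: Callable[[], List[float]]) -> Dict[str, float]:
    """Core logic. It receives its data source and never opens a connection itself."""
    orders = fetch_orders()
    return {"count": len(orders), "total": sum(orders)}


def fake_orders() -> List[float]:
    # Test double: returns canned data instantly, no network or disk involved.
    return [10.0, 5.5]


def main() -> None:
    # Top of the app: the only place where a real, slow dependency would be
    # constructed and passed down. The fake is used here to keep the sketch runnable.
    print(summarize(fake_orders))


if __name__ == "__main__":
    main()
```

Because summarize only sees a callable, its unit test runs in milliseconds and needs no database.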


> But ideally the comments should be executable, as unit tests

For the kinds of comments that answer the "why", if we could do that, we wouldn't need the actual code in the first place.

> making you read them if and only if you break them.

A good "why" comment is supposed to inform you beforehand, so you can make changes effectively and without introducing extra bugs in the process. Unit tests are more of a safety net.


I’m imagining the poster is thinking about things like Rust doctests, where code in the comments is executed as tests when you run cargo test on the project. It’s a nice way of ensuring that (at least part of) the documentation corresponds to the behavior of the code.
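
For readers who haven't seen the idea, here is roughly the same thing using Python's standard doctest module rather than Rust (a swapped-in analogue, not what the poster described). The clamp function is a made-up example.

```python
import doctest


def clamp(value, low, high):
    """Limit value to the closed range [low, high].

    The examples below are documentation and tests at the same time:

    >>> clamp(5, 0, 10)
    5
    >>> clamp(-3, 0, 10)
    0
    >>> clamp(42, 0, 10)
    10
    """
    return max(low, min(value, high))


if __name__ == "__main__":
    # Runs every ">>>" example found in this module's docstrings and
    # reports any whose actual output differs from the documented one.
    doctest.testmod()
```

If someone later changes clamp so that the documented examples no longer hold, the doctest run fails, which is the "read them only when you break them" property. It still says nothing about why the function exists.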


I am thinking of any fast unit test system. Going in seconds from red to green is a thrill.


Unit tests suck at being documentation, and are not a good substitute for the "why" comment. They can catch some mistakes you make with the code under test, but they can't tell you why it is the way it is. At best, they can help you guess, at the cost of having to study the test code itself (which is usually much bigger than the code it tests, and often more complicated). But the thing is, the knowledge of "why" is most valuable to have before you start making changes and break some tests.


This is true. Tests, especially for code that has to interact with other systems in particular ways, will often have ten lines of setup that only matter in the test for every line of actual verification.


On the flip side, I’ve had things like:

    // in case this is malformed, fix formatting so it will still parse
    input = fixInputFormatting(input);
and had a code reviewer ask, “why are you calling fixInputFormatting”?

Nothing to raise the blood pressure like a code review question that is literally answered by a comment on the line immediately preceding where they left the question.


But in this case surely you're better off with

    input = fixInputFormattingIfMalformed(input);
or even

    if (isMalformed(input)) input = fixInputFormatting(input);


I'm with the reviewer on this one. Why is it malformed? Why are you fixing it here and not earlier? Why are you fixing it and not rejecting it? The comment tells me nothing.


I don’t remember the exact comment/code pair anymore, but the answer they wanted was exactly what was on the preceding line. Coming up with a simple example to demonstrate that is surprisingly hard.


In general, I feel comments should explain why the code is the way it is, not what it does.


Exactly, understanding the mindset behind a piece of code is important to keep the cogs turning.


Agreed. Comments that explain in English exactly what the next line does drive me crazy. Even if the line is complicated. I pretty much only ever comment things these days when I change from one approach to another, i.e.

// This code is weird. I tried doing it the obvious way, but that doesn't work because .. reasons ..

Sometimes, if the code is short, I'll even leave the old/obvious code there for future reference when I look at the weird code and say to myself:

"This is weird! Obviously it should work in this much simpler way.."


I mostly use comments to explain business rules, and why they are implemented there.

    # High value orders need to be approved before refund
    # Similar logic is also applied elsewhere, this is here
    # as a failsafe.
    if ticketValue > 500:
        emailCustomerSupport(ticket)
    else:
        refund(ticket)


I know that by ".. reasons .." you mean "<and I add the reasons here>", but I've seen too many comments worded exactly like that.

Some programmers use comments (correctly) to explain reasoning and context, some use them to redundantly say what the code already says and some, apparently, use them to apologise.


As a code reviewer, I love seeing comments like that, because it immediately flags "someone made a mistake here".

Sure, sometimes the code is right and the comment is wrong -- but sometimes the comment is right and the code is wrong, in which case the comment just saved me a lot of time.


I think it was Dijkstra who said that software actually lives in the mind of the programmer, and the code is just a distorted, lossy representation of that. Anything that sheds light on their thoughts is likely to improve our understanding of the software. Bugs happen when there's a mismatch between what's going on in the mind of the programmer and what the code actually does, so when we read code a primary task is to understand what the programmer thought it should do.

Comments tell us what the programmer cared enough about to write down. When we see a seemingly trivial comment, we may ask: why did they take the time to write that down? Did they think there was some subtlety we aren't aware of? Or perhaps they were inexperienced with the language, to the point of having a hard time reading the code they themselves wrote? (If I put this comment into Google, will I find they copy-pasted from Stack Overflow? In that case, the comment may be very helpful, if only to track that down [0])

[0] but even better would be an IDE that highlighted code copy-pasted from Stack Overflow, GitHub repositories, etc.


It's also possible that both the comment and the code are right, and there's some non-obvious reason why +=2 has the effect of adding 1 here, and is the only way to do it. (Not literally, as in this example, but something analogous.)


At the very least you know what the next likely alternative intended semantics is.


Yeah but I've also seen things like:

  AddsFiveToNum(num) {
    return num - 5
  }

A bunch too. So I don't think comments are solely at fault. Self documenting code is only as good as the person who wrote it, and the people who approved it. Sometimes a comment is warranted, sometimes it's not.


That one drives me nuts. A while back I changed some code that had a bool called “uninit” that when true, meant the value was initialized.


Yesterday I learned that in Emacs Lisp, evaluating "defvar" again does not change a variable that already has a value (i.e. it does not VARy), while evaluating "defconst" again always resets the value (i.e. it is not CONSTant). Naming things is hard.


I've completely given up on comments that aren't "here is the problem this code is solving". The comments end up being actually useful that way, and longer lived in their validity. Anything more granular than that, the code itself should make obvious.


A tool like this that tells me how surprising my code is, and takes into account comments around it, might change my mind on this front. If it always works as well as in the OP it would be super useful to be able to know how surprising the code I'm writing is (and I can then judge whether that's OK with me if it's surprising code), but this would also make it hard for the code and the comments to diverge greatly.

I mean, I still won't want "add 1 to x" comments of course.


Really a good code AI would note that the comments are misleading (which is sort of what this is doing).

It's actually completely achievable with today's models to look at the comment and the code immediately after it and see how surprising it is then note the comment could be incorrect.


Some people like granular comments, and some people only want high-level comments. I’ve long suspected that both camps are correct, because they’re using different languages.

A single line of Python data analysis code is often worth 20 lines of C++. If you would be willing to add one comment per 20 lines in C++, then nearly every line of your pandas gobbledygook is worth commenting.
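
To make the density point concrete without pulling in pandas, here is a small standard-library sketch. The data and names (orders, totals, ranking) are invented for illustration. The final expression packs grouping-aware sorting and tie-breaking into one line, which is why it gets a comment per piece.

```python
from collections import defaultdict

# Made-up sample data: (customer, order amount).
orders = [("alice", 120.0), ("bob", 30.0), ("alice", 80.0), ("carol", 30.0)]

# Sum the amounts per customer.
totals = defaultdict(float)
for customer, amount in orders:
    totals[customer] += amount

ranking = sorted(
    totals,                         # iterating a dict yields the customer names
    key=lambda c: (-totals[c], c),  # biggest spender first; ties broken alphabetically
)

print(ranking)
```

The equivalent in a lower-level language would be a loop, a comparator, and a sort call spread over many lines, each of which would look self-explanatory on its own.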

Terse code is good, but that doesn’t necessarily mean the comments should be terse (or absent).


Using AI to reality-check comments would be cool.

But if you're using comments to explain the code, 9 times out of 10 you just wrote it in too unreadable a way.

Sure, some algorithms are complex enough that some comments are needed to explain the how (that's the 1 in 10), but in most cases the comments should explain why, not how. So instead it should be

  // Add the calibrated skew to compensate for latency
  x+=2;
or whatever the reason for the code's existence is.


Lesson? That’s at least a 40 year old piece of advice. People used to try to force me to write comments, and I would explain this to them.

“Don’t get suckered in by the comments -- they can be terribly misleading. Debug only the code.” Dave Storer, Cedar Rapids, Iowa

https://moss.cs.iit.edu/cs100/Bentley_BumperSticker.pdf


That risk also exists for identifiers, which can mislead to exactly the same degree as they can inform. To avoid this hazard, run the code through an obfuscator that substitutes meaningless identifiers, before looking at it.


Why not both? Comments may be terribly misleading, but they explain the programmer's logic behind a certain function. Always handy to know.

Each segment of code should have a diagram, a flow model, pseudocode, and comments. The more the better.


No, not the more the better. Comments should be easily updatable to keep up with code changes.

If every refactor involves redrawing a bunch of fancy ASCII art, you'll either get fewer refactors or outdated comments.


That's why you keep the old comments and write the changes within.

    "If loop does something" #rev1

    "If loop did do something, it now does something twice" #rev2

    "If loop doesn't do something, it does something three times" #rev3

    "we loop three times because we are processing supervariables" #rev4
And you have a false idea of what documentation is. You don't need ASCII diagrams. Why not a scribble in a sketchbook? Whiteboards and photography exist. And the method above doesn't require you to redraw anything: you've already got the first and last revisions. Besides, the documentation cycle of your project's life cycle is when you update all documentation.

> you'll either get less refactors, or outdated comments

If so, you're not disciplined enough. If your project is to be handed over down the line, more documentation is better than some, and some documentation is better than none.


It's like the premature optimisation quote. The advice is technically sound, but it's still not really good advice because 90% of the times people quote it they're just using it as an excuse not to care about performance at all.

Or in your case not to write comments at all. That's obviously a terrible idea.


Code that I wrote, that I subsequently had to ask: Why aren't you processing the first/last element of this sequence? Why are you casting this to a list when it's already a tuple? Why are you filtering out Decimal("NaN") from tests but not float ones?


I once wrote some code so convoluted I put a mickey mouse ascii art in comments and a note that said "This is some Mickey Mouse BS. I hope you never have to maintain it."

That's probably the only useful comment I've ever written.


At least with version control it's feasible to track the original comment/code and understand the intent. Although I doubt an AI is able to do this yet.


That is the work of a bad programmer. You should give as much attention to your comments as you do your code.


> what you think the code

I agree with that, but it gets to the point where people police all the comments in a codebase, deeming them not useful. I think, especially in a huge codebase, explaining why a certain block of code is there is very helpful for transferring knowledge.


Agreed, I've moved to only commenting WHY the function exists. The what can be stepped through. The why gives insight into intent, purpose and origin.


I see this a lot when departments require code be commented -- instead of why you get what, which should be obvious just by reading the code.


Reminds me of that saying to always take one compass or three.

(Although two would be useful as a signal that something has gone wrong!)


I'll go further. Comments are a code smell. They mean you've probably not modularized your code enough, or named your functions and variables descriptively enough. As I approach 20 years of coding, I can count on one hand the times when I truly needed to add a comment.

(I don't mean the DocBlock type comments for describing functions and class interfaces, that get compiled into docs.)

To all you downvoters: please do respond with examples of comments that are counterexamples to what I said!


In non-performance-critical code, comments usually aren't necessary, but they are godsends when you need to do something tricky for performance. Standard examples are things like bithacks.
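
A small illustration of the bithack case, using the well-known power-of-two check. The function name is just for this sketch. Without the comment, the expression is correct but opaque.

```python
def is_power_of_two(n: int) -> bool:
    # Why this works: a power of two has exactly one bit set. Subtracting 1
    # clears that bit and sets every bit below it, so n & (n - 1) is zero
    # only when there was a single set bit to begin with.
    # The n > 0 guard is needed because 0 & (0 - 1) is also zero.
    return n > 0 and (n & (n - 1)) == 0


print([n for n in range(1, 20) if is_power_of_two(n)])
```

The loop-based alternative (keep dividing by two) needs no comment, which is the trade-off being described: the fast version earns its explanation.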



