It’s been the exact opposite for me. The spaghetti code has always come from poorly conceived abstractions and the massive problem of API inversion: reimplementing functionality on top of an API that should have been extensible from within (but fails to be, because of poor choices in abstraction or abstracting prematurely).
Later on, that spaghetti code gets labeled as lacking abstraction, similar to what you are saying, despite the actual problem being too much abstraction, and poorly designed abstraction that became load-bearing in a way where everyone decides that living with API inversion is the lesser evil, figuring they’ll probably quit the company and move on to greener pastures before it becomes their headache to deal with.
Absolutely this. I’d rather look at 200 lines of linear, inline-documented code than a spaghetti mess of “helper” functions that do nothing but obfuscate everything going on.
I’ve enforced a strict “1, 2, N” rule with my team: I don’t want to see an abstraction until we’ve solved a similar problem at least twice, and even then an abstraction may still be a poor idea.
Abstraction is an especially poor idea early in a project because often you only half know what you’re making (I’m in games). Requirements change, or a special case needs to be added, and all of a sudden you are trying to jam new behavior into “generic” helpers without breaking the house of cards built around them.
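A minimal sketch of how that plays out. All names here (`spawn_entity`, the flags) are hypothetical, invented for illustration: a helper that started generic grows one special-case flag per requirement change, until callers have to know its internals anyway.

```python
def spawn_entity(kind, pos, *, is_boss=False, skip_intro=False,
                 override_health=None):
    """Started life as a one-line "generic" spawner; each new requirement
    bolted on another flag, so the abstraction no longer hides anything."""
    entity = {"kind": kind, "pos": pos, "health": 100}
    if is_boss:                      # special case #1: bosses are tougher
        entity["health"] = 500
    if override_health is not None:  # special case #2: designer override
        entity["health"] = override_health
    if not skip_intro:               # special case #3: some spawns are silent
        entity["intro"] = f"{kind} appears!"
    return entity
```

Every caller now has to reason about flag interactions (does `is_boss` win over `override_health`?), which is exactly the house of cards described above.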
I agree that over-engineered helper function hell can be a real problem.
I disagree strongly with strictly enforcing the 3x rule. The right abstraction can be helpful even if it is used only once. The right abstraction will communicate its purpose clearly and make it easier to reason about the program, not harder. Obfuscating implementation details is a feature, not a bug, as long as the boundaries of the abstraction are obvious. Another benefit is that it makes it easier to test the logical units of your codebase.
"It’s nice to pretend that a four word function name can entirely communicate the state transformation it does, but in reality you need to know what it does." Are you suggesting you are cognizant of every line of code of every library you use in your work?
Actually yes, you should know to at least depth=1 what your magic incantations are doing when you call them.
And that’s part of my point, if you go that one level of depth and find an excessive amount of DRY, you’ll find it that much harder to know what the hell is going on.
Yes, you should understand what a function does when you call it. Not everyone who looks at a codebase is modifying the codebase or adding new function calls. The person referencing the code may already be 1-level deep in parsing the implementation.
Not all abstractions will seem like magic incantations when you use them. Something like "convertToCamelCase" conveys its purpose clearly enough that the reader can assume what the low-level operations are. They don't need to look at these operations every time they reference the code.
200 lines of code means that you have to comprehend all 200 lines simultaneously, since any line could potentially interact with any other line in that code block. Using functions where the state is passed as parameters limits the potential for code interactions through functional boundaries. The point of abstraction is to limit complexity by limiting potential interactions. Helper methods do a fine job of this.
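A minimal sketch of the boundary argument above, with invented function names: when state flows only through parameters and return values, each step can be read and tested in isolation, because no step can reach back and mutate another step's data.

```python
def parse_record(line: str) -> dict:
    """Turn a "name,score" line into a record; no shared state touched."""
    name, raw_score = line.split(",")
    return {"name": name.strip(), "score": int(raw_score)}

def apply_bonus(record: dict, bonus: int) -> dict:
    # Returns a new dict instead of mutating the argument, so callers
    # can't be surprised by hidden writes through the boundary.
    return {**record, "score": record["score"] + bonus}

def process(line: str, bonus: int) -> dict:
    return apply_bonus(parse_record(line), bonus)
```

Inside a single 200-line block, `raw_score` and friends would be in scope for everything below them; here each name dies at its function's closing line.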
That’s a gross over-generalization to assume that 200 lines is always a self-referential mess. Functions fundamentally transform data, and often that transformation is a linear process. If it’s not, sure, break it up in a more sensible manner.
Regardless, helper methods have a significant cognitive cost as well. It’s nice to pretend that a four word function name can entirely communicate the state transformation it does, but in reality you need to know what it does and mentally substitute that when reading the function using it. No free lunch.
I worked on a webapp that our team inherited which had 400-800 line controllers (and one that was a little over 1200 lines). When I first started looking at the code I was horrified but then I realized that everything was self contained and due to the linear flow, pretty easy to understand. You just had to get used to scrolling a lot!
The issue that we started having is that pull requests, code reviews, and anything that involved looking at diffs was a lot of work. There were two main issues:
1) Inadvertently creating a giant diff with a minor change that affected indenting, such as adding or removing an `if` statement.
2) Creating diffs with insufficient context to understand them: if your function is large enough, changes can be separated by enough other lines of code that the diff is no longer standalone. You end up having to read a lot of unchanged code to understand the significance of the change (it would be an ideal way for a malicious developer to sneak in security problems).
>That’s a gross over-generalization to assume that 200 lines is always a self-referential mess.
The point is that you don't know this until you look. You have to look at all 200 lines to understand the function of even one line. When you leverage functional boundaries you generally can ignore the complexity behind the abstraction.
I'm not sure what point you're making. If you are just assuming that functional boundaries tend to not be maintained in practice then you're not contradicting anything I have said. Whether or not functional boundaries are easy/hard to observe depends on the language and coding conventions.
> Later on that spaghetti code gets labeled as lacking abstraction, similar to what you are saying, despite the actual problem being too much abstraction and poorly designed abstraction that became load bearing in a way where everyone decides that living with API inversion as a reality is the lesser evil and figures they’ll probably quit the company and move on to greener pastures before it becomes their headache to deal with.
https://en.m.wikipedia.org/wiki/Abstraction_inversion