Hacker News: msk20's comments

Just FYI, on my Firefox it's saying "Connection Secure (upgraded to https)"; it's actually using ECDHE CHACHA20 SHA256.

Note: I have "Enable HTTPS-Only Mode in all windows" on by default.


Looking at the file with the changes https://github.com/postgres/postgres/blob/master/src/backend... , I have to say this source code repository is so well documented/commented and structured. It really gives you huge trust in Postgres being used in your stack.


Large C codebases _have_ to be exceptionally nice, or they immediately collapse under their own weight. As a dev team, the language teaches you this the hard way. I've never seen a terrible huge C codebase (but have seen many in other languages).


> I've never seen a terrible huge C codebase

I have 100% confidence they exist. They just don't get uploaded to GitHub out of shame or embarrassment.


Can confirm they do. I worked at a company that was producing control systems for electric engines. Great environment and fun job, but the code was beyond redemption: 15k-line files with 2k+ line #ifdef blocks that ran different code for different customers, some variable names that were just curses against pushy clients, not a single abstraction in sight.

Not only do they exist, they power massive machines that could crush a person in the blink of an eye.


I think OP just found a really well done C file.

You can definitely see many famous projects on GitHub where, even though the code is logical, you can't help feeling it's a bit all over the place.

cpython isn't all sunshine and roses.

A broad generalization, but with C and C++ I'd say the standards for "good code" have dramatically risen in the last 10 years.

Variable names seem to have doubled in size.


> I've never seen a terrible huge C codebase

libssl/openssl (before the rewrite/fixes especially)? Maybe doesn't qualify as huge?

See eg: https://www.youtube.com/watch?v=GnBbhXBDmwU "LibreSSL with Bob Beck" (the first 30 days)


Proprietary device drivers come to mind.


They don't tend to be huge && in C


AMD's Linux graphics driver has entered the chat


Thankfully they don't follow the whole mindset of "self-documenting code"


PostgreSQL code quality is exceedingly fine indeed.


File could definitely be broken up a bit. Over 5000 lines! Just a nitpick though.


Quite the opposite: I hate it when projects have dozens upon dozens of modules with one function each. Multiple large files are the sweet spot. (Only crazy and exceptional things like putting everything into a single file damage readability, imho.)


Files with one function vs files with 5000 lines are not the only two options.


I've got it! We could use files with one function that spans 5000 lines.


How about a function that spans 5000 files


Why even use 5000 lines when you can often combine them all into one line.


That's the job of the compiler, then we can use a disassembler to read the actual optimized source code.


300 to 1000 lines per file is best IMO.


IMO, paintings with the color blue are the best.

What does number of lines have to do with anything?


At each extreme:

A source code file with very few lines will mean more cognitive load is required to remember which file (or package) some functionality is in, and likely more effort to maintain the code.

Whereas source code files which are very large mean more cognitive effort to remember where in the file some function is. In many languages, variables can be local to a file, which makes use of such variables riskier, etc.

It may be desirable to combine smaller files into a more coherent whole, or to split up an overly complicated large file into several smaller files.

Sure, without seeing code, I don't think you can come up with a concrete rule which exists in all cases. Rules of thumb can still be useful as an indication of maintenance effort.


I get the impression that long files are culturally acceptable in systems-level C code. E.g., just a cherry-picked file from Linux: kernel/sched/core.c is over 11k lines.

https://github.com/torvalds/linux/blob/master/kernel/sched/c...


I feel the long-file issue is mostly a non-issue: the problem isn't file length but spaghetti code. If it makes sense for code to be in one file, it should be in one file. Breaking it up simply to reduce file length is counterproductive.


There are 18609 .c files in my checked-out copy of the FreeBSD src tree. The median length is 258 lines; 90% are 1373 lines or shorter; 99% are 5241 lines or shorter.

The statistics for the 6071 .c files in the FreeBSD kernel are somewhat higher -- median is 460 lines; 90th percentile is 2070 lines; 99th percentile is 7678 lines -- but your example of an 11133-line file is definitely at the extreme high end.


I didn't run any stats when I found that file. Just clicked around in github a handful of times looking for something that seemed like it'd be complex.


Yes, are people using editors that don't let them have multiple views of the same file, or something?


A filepath is an index and a hierarchy that adds information and structure. It can't be completely replaced by editor affordances.


Depends on the language. C# somewhat replaces paths with namespaces, then you navigate classes and methods with editor tooling. I remember back when I was on Visual Studio writing C++ that they did something similar.


Actually I don’t mind big files. It is simpler scanning through it or doing a quick search than if you had a bunch of smaller files. And 5000 lines is not awkward for most editors, especially as many editors have the ability to collapse functions.


This feels like a good area for tooling (editors, source hosts, SCM extensions) to improve experience. I don't always mind large source files (and sometimes may prefer them over large file system hierarchies), but they can be a pain to navigate in some circumstances.

As an example, making several related changes in very different parts of a file, where you need to cross-reference between them. The changes themselves might be small, but it’s a huge cognitive burden to alternate/iterate through them. I’d love to have a view which temporarily projects those targets as if they’re isolated files without changing the actual structure on disk. I’d love it so much I actually do this manually for a lot of tasks, creating temp files to prepare edits for related areas of code. But then I lose a lot of the benefits of tools which understand what’s being referenced. It would be great to just type a quick command (or click or whatever) to say “don’t refactor this function to another file, but let’s pretend you did, for a while”.


This is how older editors like emacs work. You interact with views/windows/tabs called buffers and those buffers can have files loaded into them. Multiple buffers can reference the same code file but view different sections simultaneously. So you can investigate or edit different parts of one huge file the same way you would smaller ones.


You just blew my mind.

I use JOE and JOE also supports multiple views into the same file. In fact, to open another file is two commands: the open view/window (^KO) command followed by edit file command (^KE). I've always used this facility for as long as I can remember and it never really occurred to me until now that people using more modern editors and especially GUI editors may not enjoy this same convenience--either not possible or no simple chain of command inputs to get there. And it's not like I don't use GUI editors, just not in situations where I would realize this feature was missing.


You can do this in vscode as well, even have it side by side. Am I missing something?


I just opened IntelliJ and it has this feature too, where you can "Split right" or "Split down" and have multiple views of the same file. Thanks for letting me know this is something editors might support.

For some reason I have not noticed that feature before. It's not like it is hidden either, as it is in the "right click" context menu that I use daily. I guess I need to learn the tool, so that I don't miss useful features like this.


For the "temporarily projects those targets as if they’re isolated files" part, you want something like narrow-indirect or edit-indirect.

https://www.emacswiki.org/emacs/NarrowIndirect

https://github.com/Fanael/edit-indirect


> The changes themselves might be small, but it’s a huge cognitive burden to alternate/iterate through them. I’d love to have a view which temporarily projects those targets as if they’re isolated files without changing the actual structure on disk.

This is exactly how vim buffers (for instance in a split) work.


I’m glad to see I’ve brought the emacs/vim people together for a common purpose!


This is what :[v]split is for...


While I'm not sure that was a consideration here, sometimes C compilers produce better machine code when they have access to more function definitions. E.g., SQLite recommends that embedders use the single ~10MB sqlite3.c amalgamation file[1] for both ease of use and performance reasons.

[1]: https://www.sqlite.org/amalgamation.html


Seems like something the compiler should take care of.


It can when you use LTO, but that tends to be very slow for large programs.


It is slow. Our (C++) project's MSVC release build ends with a glorious 2-minute run of link.exe with LTO (/LTCG /O2) and aggressive inlining (/Ob3) enabled.

Limiting translation-unit size in C++ also helps with faster edit/compile/run cycles, which doesn't seem to be a concern for C codebases in the 21st century.


You hit most of the same issues (and some additional ones) when compiling the same code within one translation unit...


Of the 5000 lines, it looks like half are comments or whitespace, and none of the functions look longer than a couple hundred lines or more than a few levels of control-flow depth. Pretty harsh nitpick.


The point of calling it a nitpick was to indicate precisely that my comment wasn't meant to be taken harshly.


But calling it a nitpick still means it is a nit, and I don't really think it is one at all. And you're saying it in response to someone who said the code is very clean, which is quite petty.


> nitpick: engage in fussy or pedantic fault-finding.

Yes, I'm fully aware that the comment is petty. The point is to communicate that I don't disagree that the code is clean and well written.


There is Zuul[0], a gating CI system, which is actually a perfect solution for this. It works with GitHub or any Git-based repository system.

It automatically tests the changes with a simulated merge on master, all together. So it orders PR1 -> PR2 -> PR3 -> ... -> PR100 by order of approval. If that becomes PR1 -> PR2 (fails) -> PR3 -> ... -> PR100,

it removes PR2 and restarts testing from PR3 -> ... -> PR100 onward. This behavior is even customizable.

Video of it in action: https://zuul-ci.org/media/simulation.webm

Links: [0]: https://zuul-ci.org/


I don't really know much about optimizing storage costs, but you could learn from the storage giants.

An example is the Backblaze Storage Pod 6.0: according to them it holds 0.5PB at a cost of $10k, so you would need about 20 * $10k = $200k + maintenance (they also publish drive failure rates). The schematics and everything are on their website, and according to them they already have a supplier who builds such devices, which you could probably buy from directly. Note: this was published in 2016; they probably have a Pod 7.0 by now, so the cost may be better.

Reference: https://www.backblaze.com/blog/open-source-data-storage-serv...


FYI, that $10k includes no drives.


Reading https://www.backblaze.com/blog/open-source-data-storage-serv... it seems the drives are included.

That $10.3k includes drives, but you have to assemble the pod yourself.

For $12.8k you get drives and an assembled pod from a third-party manufacturer.

Backblaze pays about $8.7k at scale for the whole enchilada.

Those numbers do not make sense if we exclude drives. The server itself is not that expensive ($2-3k tops) without the drives.


Is there any privacy benefit to using containers now, with the new isolations built into Firefox? I'm using Cookie AutoDelete + Containers, so now it's either isolate and keep them or isolate and delete them. I quite like this.


Containers aren't going to give you additional protection against third-party cookies with this feature. But you still have other useful benefits, like having different sessions open on the same websites using containers, or just grouping websites by forcing them into specific containers (Work/Personal/Random, etc.).


I'm not the one who asked the question but am in the same position. All third-party content is off; I'm using a long-term container for stuff where I need to be logged in, and temporary containers with no first-party cookies for everything else. I do have some bugs with the interaction of both, so I'm happy if I can have the same thing with stock Firefox.


I use containers to maintain multiple logins for the same site. It's handy to log in to the analytics + hosting account etc all in one container.

