
Calling Linux pipes "slow" is like calling a Toyota Corolla "slow". It's fast enough for all but the most extreme use cases. Are you racing cars? In a sport where speed is more important than technique? Then get a faster car. Otherwise stick to the Corolla.


This isn’t code in some project that will run only a few billion times in its lifetime; it is used frequently on millions, if not billions, of computers.

Because of that, it is economical to spend lots of time optimizing it, even if it only makes the code marginally more efficient.


Citation needed.

Pipes aren't used everywhere in production in hot paths. That just doesn't happen.


What production? You need to check your assumptions about what people do with general purpose computers and why. Just because it doesn't happen in your specific field of computing doesn't mean it never happens anywhere or that it just isn't important.


A lot of bioinformatics code relies very heavily on pipes.


How can I get hired to do bioinformatics with pipes all day? Sounds like a dream. I have VR and control systems experience.


A dream that pays $35k/yr


I'm OK with that if it means working with pipes all day. Just needs affordable housing nearby... or my car...


That's not how economics works.

If 100 million people each save 1 cent because of your work, you've saved $1 million in total, but in practice nobody is observably better off.


You’re describing the outcome for one individual person. Money is just a tool for allocating resources. Saving $1 million worth of resources is a good thing.


It's a meaningless thing if it's 1 million resources divided among 1 million actors who have no ability to leverage a short-term gain of 1 resource. It's short-term because the number of computers that are 100% busy 100% of the time is zero. A pipe throughput improvement means nothing if the computer isn't waiting on pipes a lot.


Eventually everyone ends up at a power plant, and there's an insane number of people living on the European grid. If an optimization ends up saving a couple of tonnes of CO2 per year, it's hard not to call it a good thing.

https://en.m.wikipedia.org/wiki/Synchronous_grid_of_Continen...


A couple of tons spread across 400 million people with per capita emissions of 5 tons per year is in the noise. If we're at the point of hyper-optimizing, there are far more meaningful targets than pipe throughput.


You are arguing against the concept of "division of labor".

You are a few logical layers removed, but fundamentally that is at the heart of this. It isn't just about what you think can or can't be leveraged. Reducing waste in a centralized fashion is excellent because it enables other waste to be reduced in a self-reinforcing cycle, as long as experts in each domain keep getting the benefits of other experts. The chip experts make better instructions, so the library experts make better software libraries; they add their 2% and now it's more than 4%. The application experts get more than 4% more throughput and can buy 4% fewer servers, or spend far more than 4% less on optimizing, or whatever, and add their own 2%, and now we're at more than 6%. The end users can do their business slightly better, and so on, in a chain that runs through all of society. Sometimes those gains are muted. Sometimes that speed turns into error checking, power saving, or more throughput, with everyone trying to do their best to do more with less.


Absolutely, if your focus is saving emissions, don't optimize pipes. But if you optimize an interface people use, it's a good thing either way, right?


It's because of you that my phone is so slow.


There are people whose lives are improved by having an extra cent to spend. Seriously. It is measurable, observable, and real. It might not have a serious impact on the vast majority of people, but there are people who have very, very little money, or who find themselves at a tipping point that small; pinching pennies alters their utility outcomes.


https://xkcd.com/951/

Also, if you micro-optimize and that becomes your whole focus and capacity for focus, your business becomes unable to innovate, i.e. to traverse the economic landscape and find new, rich gradients and sources of "economic food". That makes you a dinosaur in a pit, doomed to eternally cannibalize whatever other creatures descend into the pit, and highly dependent on the pit not closing up for good.


No, they really aren't. Absolutely nobody's life is measurably improved because of 1 cent one time.

I admit my opinion is not based on first-hand knowledge, but I have worked for years on projects trying to address poverty in different parts of this planet, and I can't think of a single one where this would be even remotely true.


> Absolutely nobody's life is measurably improved because of 1 cent one time...I admit my opinion is not based on first hand knowledge...

My opinion, however, is based on first-hand knowledge. I've been the kid saving those pennies, and I've worked with those kids. I understand that in the vast majority of cases an extra penny changes nothing. But that isn't what your original comment above claimed, nor is it what you've claimed here. My counterexample is enough to demonstrate the falsehood. Arguing that there are better ways to distribute these pennies is another matter, and I take that seriously as well.


>No, they really aren't. Absolutely nobody's life is measurably improved because of 1 cent one time.

Assuming a wage of $35/hour, each second is worth 1 cent. To save 1 cent you only need to reduce the time spent waiting for computers by a second across the entire lifetime of that person.

Now here is the beauty of this. There isn't just a single guy out there doing this. There are hundreds of thousands of people, possibly millions, doing it.


The beauty of math is that you can throw numbers around and multiply and divide them and do silly things with them.

The average human life expectancy is 77.5 years, or about 2.4457e+9 seconds. Divide that by, say, 1 billion daily active users of Google and you get about 2.4 seconds per user. So if you work at Google, optimize a slow process, and save every user 2.4 seconds, once, you've saved an entire lifetime. If you're Microsoft and make boot-up take a couple of seconds less across their billion or so devices, same thing.


An investment or economic action is economically viable when the societal benefits exceed the initial capital outlay. However, under capitalism people care more about the personal return on that initial capital outlay than about the actual societal benefits of their investment.

If society were a giant hivemind, economic viability would take precedence over personal profit. If society is instead a bunch of isolated individuals, economic viability takes a backseat. So this tells us more about the limits of human psychology than it does about economics.


Not if it costs 200 million in man-hours to optimize.


So? It doesn't need to be visible to be worth optimizing?


If you are making an economic (financial) argument for change, like the original comment did, then yes, it should have a visible positive effect.

Obviously not if you are doing it for your own fun or just improving the state of the art.


Eventually everything ends up pumping the same resources from the same earth. The billion devices' saved cents all come out of the same pool of fossil fuels and the same power plants.

You don't need the effect to be observable on an individual level.

It's something that is worth an engineer's time.


Indeed. In the author's case, the slow pipe is moving data at 17 GB/s, which is over 130 Gbps.

I've used pipes for a lot of stuff over 10+ years and never noticed being limited by the speed of the pipe itself; I'm almost certain to be limited by tar, gzip, find, grep, nc, ... (even though these also tend to be pretty fast for what they do).
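
If anyone wants to sanity-check where the ceiling sits on their own machine, a rough micro-benchmark is easy to put together. The sketch below is mine, not from the article: a child process writes 64 KiB buffers into a plain pipe while the parent reads and discards them, then prints the rate. The buffer size and the 8 GiB total are arbitrary choices.

    /* Rough pipe-throughput micro-benchmark (sketch; sizes are arbitrary). */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <time.h>
    #include <unistd.h>
    #include <sys/wait.h>

    int main(void) {
        enum { BUF = 64 * 1024 };
        const long long total = 8LL * 1024 * 1024 * 1024;   /* move 8 GiB in total */
        int fds[2];
        if (pipe(fds) != 0) { perror("pipe"); return 1; }

        if (fork() == 0) {                      /* child: writer */
            close(fds[0]);
            char *wbuf = malloc(BUF);
            memset(wbuf, 'x', BUF);
            for (long long sent = 0; sent < total; ) {
                ssize_t n = write(fds[1], wbuf, BUF);
                if (n <= 0) break;
                sent += n;
            }
            close(fds[1]);
            _exit(0);
        }

        /* parent: reader */
        close(fds[1]);
        char *rbuf = malloc(BUF);
        struct timespec t0, t1;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        long long got = 0;
        ssize_t n;
        while ((n = read(fds[0], rbuf, BUF)) > 0)
            got += n;
        clock_gettime(CLOCK_MONOTONIC, &t1);
        wait(NULL);

        double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
        printf("%.2f GB/s\n", got / secs / 1e9);
        return 0;
    }

Comparing that number against the same volume of data pushed through gzip or grep makes the "limited by the tools, not the pipe" point pretty quickly.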


I had two cases in my practice where pipes were slow. Both related to developing a filesystem.

1. Logging. At first, our tools for reading the logs from a filesystem management program used pipes, but the pipes would quickly be overwhelmed (even before the pagers and everything further down the line were). We had to write our own pager and give up on using pipes.

2. Storage again, but a different problem: we had a setup where SPDK handled the iSCSI frontend duties and our own component managed the actual storage. It was very important that the communication between these two components be as fast and as memory-efficient as possible. Part of the slowness of pipes comes from the fact that they have to copy memory, so we had to extend SPDK to communicate with our component through shared memory instead.

So, yeah, pipes are unlikely to be the bottleneck of many applications, but definitely not all.
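
For anyone curious what the shared-memory route in point 2 looks like at its core, here's a minimal sketch assuming plain POSIX shared memory; the name /my_ring and the 64 MiB size are made up for illustration, and a real setup needs a ring buffer plus semaphores or futexes on top for synchronization.

    /* Minimal POSIX shared-memory region shared by two processes (sketch only;
       real code needs a ring buffer plus semaphores/futexes for synchronization). */
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <unistd.h>

    #define SHM_NAME "/my_ring"              /* illustrative name */
    #define SHM_SIZE (64u * 1024 * 1024)     /* illustrative size: 64 MiB */

    int main(void) {
        int fd = shm_open(SHM_NAME, O_CREAT | O_RDWR, 0600);
        if (fd < 0) { perror("shm_open"); return 1; }
        if (ftruncate(fd, SHM_SIZE) != 0) { perror("ftruncate"); return 1; }

        void *base = mmap(NULL, SHM_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        if (base == MAP_FAILED) { perror("mmap"); return 1; }

        /* Both processes open the same name and mmap the same region; data
           written here is visible to the peer without a copy through the kernel. */
        strcpy(base, "hello from process A");

        munmap(base, SHM_SIZE);
        close(fd);
        /* shm_unlink(SHM_NAME) when the region is no longer needed. */
        return 0;
    }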


I have a project that uses a proprietary SDK for decoding raw video. I output the decoded data as pure RGBA in a way FFmpeg can read through a pipe, so it can re-encode the video to a standard codec. FFmpeg can't include the non-free SDK in their source, and it would be wildly impractical to store the pure RGBA in a file. So pipes are the only way to do it; there are valid reasons to want high-throughput pipes.
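
For readers wondering what that pipeline looks like in practice, here's a rough sketch of the pattern (not my actual code; the resolution, frame rate, and encoder settings are placeholders): spawn ffmpeg reading raw RGBA from stdin and write each decoded frame into the pipe.

    /* Sketch: feed raw RGBA frames to ffmpeg over a pipe (placeholder settings;
       the -s and -r values must match the frames you actually write). */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    int main(void) {
        const int w = 1920, h = 1080, frames = 300;
        /* ffmpeg reads raw video from stdin ("-i -") and re-encodes it. */
        FILE *ff = popen(
            "ffmpeg -f rawvideo -pix_fmt rgba -s 1920x1080 -r 30 -i - "
            "-c:v libx264 -pix_fmt yuv420p out.mp4", "w");
        if (!ff) { perror("popen"); return 1; }

        size_t frame_bytes = (size_t)w * h * 4;      /* RGBA = 4 bytes per pixel */
        unsigned char *frame = malloc(frame_bytes);

        for (int i = 0; i < frames; i++) {
            /* In the real program this buffer would come from the proprietary
               decoder SDK; here it's just filled with a dummy value. */
            memset(frame, i & 0xff, frame_bytes);
            if (fwrite(frame, 1, frame_bytes, ff) != frame_bytes) break;
        }

        free(frame);
        pclose(ff);
        return 0;
    }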


What percentage of CPU time is used by the pipe in this scenario? If pipes were 10x faster, would you really notice any difference in wall-clock time or overall CPU usage while the decoding SDK is generating the raw data and FFmpeg is processing it? Are these video processing steps anywhere near memory-copy speeds?


> So pipes are the only way to do it

Let's not get carried away. You can use ffmpeg as a library and encode buffers in a few dozen lines of C++.
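
Roughly like this, using libavcodec's send/receive API; a compressed sketch with error handling and the container muxing left out, and 1080p/libx264 chosen arbitrarily.

    /* Sketch: encode in-memory frames with libavcodec (error handling omitted). */
    #include <libavcodec/avcodec.h>
    #include <libavutil/frame.h>

    void encode_dummy(void) {
        const AVCodec *codec = avcodec_find_encoder_by_name("libx264");
        AVCodecContext *ctx = avcodec_alloc_context3(codec);
        ctx->width = 1920;  ctx->height = 1080;
        ctx->time_base = (AVRational){1, 30};
        ctx->pix_fmt = AV_PIX_FMT_YUV420P;
        avcodec_open2(ctx, codec, NULL);

        AVFrame *frame = av_frame_alloc();
        frame->format = ctx->pix_fmt;
        frame->width = ctx->width;  frame->height = ctx->height;
        av_frame_get_buffer(frame, 0);

        AVPacket *pkt = av_packet_alloc();
        for (int i = 0; i < 300; i++) {
            /* call av_frame_make_writable(frame) and fill frame->data[0..2]
               from your decoder's output here */
            frame->pts = i;
            avcodec_send_frame(ctx, frame);
            while (avcodec_receive_packet(ctx, pkt) == 0) {
                /* hand pkt->data / pkt->size to a muxer or write it out */
                av_packet_unref(pkt);
            }
        }
        avcodec_send_frame(ctx, NULL);                 /* flush the encoder */
        while (avcodec_receive_packet(ctx, pkt) == 0)
            av_packet_unref(pkt);

        av_packet_free(&pkt);
        av_frame_free(&frame);
        avcodec_free_context(&ctx);
    }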


The parent comment mentioned license incompatibility, which I guess would still apply if they used ffmpeg as a library.


If the license is incompatible, it would still be incompatible regardless of whether you use library API calls or pipes.


And you go from having a well-defined modular interface that’s flexible at runtime to a binary dependency.


You have the dependency either way, but if you use the library you can have one big executable with no external dependencies and it can actually be fast.

If there wasn't a problem to solve they wouldn't have said anything. If you want something different you have to do something different.


The context of this discussion is that it would be better if pipes were faster. Then you would have more options.


I replied to them saying "So pipes are the only way to do it".


ffmpeg's library is notorious for being a complete and utter mess


It worked extremely well when I did something almost exactly like this. I gave it buffers of pixels in memory and it spit out compressed video.


What about domain sockets?

It's clumsier, to be sure, but if performance is your goal, the socket should be faster.


It looks like FFmpeg does support reading from sockets natively [1]; I didn't know that. That might be a better solution in this case. I'll have to look into some C code for writing my output to a socket and try that some time.

[1] https://ffmpeg.org/ffmpeg-protocols.html#unix
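
A bare-bones version of that writer could look like the sketch below: it listens on a Unix socket and streams raw frames to whoever connects, while ffmpeg is started separately with something like -f rawvideo -pix_fmt rgba -s 1920x1080 -i unix:/tmp/raw.sock. The path and frame size are placeholders, and error handling is minimal.

    /* Sketch: serve raw RGBA frames over a Unix domain socket for ffmpeg to read. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <sys/un.h>
    #include <unistd.h>

    int main(void) {
        const char *path = "/tmp/raw.sock";            /* placeholder path */
        size_t frame_bytes = 1920u * 1080u * 4;        /* one RGBA frame */

        int srv = socket(AF_UNIX, SOCK_STREAM, 0);
        if (srv < 0) { perror("socket"); return 1; }
        struct sockaddr_un addr = { .sun_family = AF_UNIX };
        strncpy(addr.sun_path, path, sizeof(addr.sun_path) - 1);
        unlink(path);                                  /* remove a stale socket */
        if (bind(srv, (struct sockaddr *)&addr, sizeof(addr)) != 0 ||
            listen(srv, 1) != 0) { perror("bind/listen"); return 1; }

        int cli = accept(srv, NULL, NULL);             /* wait for ffmpeg to connect */
        if (cli < 0) { perror("accept"); return 1; }

        unsigned char *frame = malloc(frame_bytes);
        for (int i = 0; i < 300; i++) {
            memset(frame, i & 0xff, frame_bytes);      /* decoder output goes here */
            ssize_t off = 0, n;
            while ((size_t)off < frame_bytes &&
                   (n = write(cli, frame + off, frame_bytes - off)) > 0)
                off += n;
        }

        free(frame);
        close(cli);
        close(srv);
        unlink(path);
        return 0;
    }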


Why should sockets be faster?


Sockets remap pages without moving any data while pipes have to copy the data between fds.


Why not just store the output of the proprietary codec in an AVFrame that you'd pass to libavcodec in your own code?


At some point, I had a similar issue (though not related to licensing), and it turned out it was faster to do a high-bitrate H.264-encode of the stream before sending it over the FFmpeg socket than sending the raw RGBA data, even over localhost… (There was some minimal quality loss, of course, but it was completely irrelevant in the big picture.)


> There was some minimal quality loss, of course, but it was completely irrelevant in the big picture

But then the solutions are not comparable anymore, are they? Would a lossless codec instead have improved speed?


No, because I had hardware H.264 encoder support. :-) (The decoding in FFmpeg on the other side was still software. But it was seemingly much cheaper to do a H.264 software decode.)


H.264 has a lossless mode.


I'm not sure that logic makes sense. Making a thing that's used ubiquitously a few percent faster is absolutely a worthwhile investment of effort. Individual operations might not get very much faster, but in aggregate it's a ton of electricity and time globally.


That's what's called premature optimization. Everywhere in our lives we do inefficient things. Despite the inefficiency, we gain something else: ease of use or access, simplicity, lower cost, more time, etc. The world and life as we know it is just a series of tradeoffs. Often, optimization before it's necessary actually creates more drawbacks than benefits. When it's easy and has a huge benefit, or is necessary, then definitely optimize. It may be hard to accept this as a general principle, but in practice (mostly in hindsight) it becomes very apparent.

Donald Knuth thinks the same: https://en.wikipedia.org/wiki/Program_optimization#When_to_o...


It's definitionally not premature optimization. Pipes exist (and have existed for decades). This is just "optimization". "Premature" means it's too soon to optimize. When is it no longer too soon? In another few decades? When Linux takes another half of Windows usage? It would be premature if they were working on optimizations before there were any users or a working implementation. But that's not the case: pipes are a fundamental primitive of the OS, used by tens of millions of applications.

The tradeoffs you're discussing are considerations. Is it worth making a ubiquitous thing faster at the expense of some complexity? At some point that answer is "yes", but that point is absolutely not only "when it's easy and has a huge benefit". The most important optimizations you personally benefit from were neither easy nor individually huge wins. They were hard-won and generally small, but they compound on other optimizations.

I'll also note that the Knuth quote you reference says exactly this:

> Yet we should not pass up our opportunities in that critical 3%


Most people will never need to optimize pipes. If they don't need to do it, it's premature. If they do need to do it, it's not premature.


If you're saying that people that know they don't need to optimize pipes shouldn't optimize pipes, then yeah I mean that's kind of just common sense.


Sometimes the best answer really is a faster Corolla!

https://www.toyota.com/grcorolla/

(These machines have amazing engineering and performance, and their entire existence is a hack to work around rules that made it unviable to bring the intended GR Yaris to the US market. Maybe just enough eng/perf/hack/market relevance to HN folk to warrant my lighthearted reply. Also, the company president is still on the tools.)


There's no replacement for displacement.


Apparently there is, because that car only has a 1.6L 3-cylinder engine and yet produces a whopping 300 horsepower.


When? In the RPM sweet spot after waiting an eternity for the turbos to spool? There's always a catch.


I didn't expect to be writing this comment on this article, hah, but apparently there is such a thing as a surge tank for storing boost pressure to mostly eliminate turbo lag:

https://www.highpowermedia.com/Archive/the-surge-tank

https://forums.tdiclub.com/index.php?threads/air-tank-or-com...

It's such an obvious idea that I'm kind of shocked it took them until 2003 to do it. Surely someone thought of this in like the 60s.

I would probably do it differently with a separate supercharger to intermittently maintain another 1-2+ bar of boost to make the tank less than half as large, but that would add complexity, and what do I know.


Afaik the general solutions to turbo lag are: 1) a smaller turbo; 2) two turbos in stages, one that spools earlier and one that spools later; 3) a transmission/differential/tune tailored to keep the engine in boost near-constantly during acceleration (something like a 10-speed CVT designed to keep you in high revs when accelerating in sport mode; not only does that keep you in boost, it's also the ideal power band for these non-diesel boosted sports cars).


> 10-speed cvt

CVTs shouldn't even have a concept of "speeds". I absolutely hate how manufacturers will build cars with CVTs and then make them only go into discrete gear ratios. It completely destroys the entire reason for having a CVT.

I understand that they do it because people don't like how CVTs sound/feel, but maybe they should all have 3 modes:

1. Eco - optimizes gear ratio for maximum efficiency

2. Performance - optimizes for maximum power

3. Sport - pretends to be a normal transmission for a better "feel".


A better analogy: it's like a society that uses steam trains trying to compete industrially with a society that uses bullet trains (the factor of improvement is literally similar). The UK built its last steam train for national use in 1960; four years later the Shinkansen was in service in Japan. Which of those two nations has a strong international industrial base in 2024?


Well, the Mallard's top speed was very close to that of the first-generation Shinkansen 0 Series trains.


The Mallard is such a beautiful train.


Wait, it depends on what you're doing. Pipes also create a subshell, so they are a big no-no when used inside a loop.

Suppose you're looping over the lines of stdout and need to use sed, cut, and so on: using pipes will slow things down considerably (and the startup time of sed and cut will make it worse).

Using bash/zsh string interpolation would be much faster.


I mean why waste CPU time moving data between buffers when you could get the same semantics and programming model without wasting that CPU time?


To be frank, this is more of a pretext to understand what pipes and vmsplice do exactly.
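
In that spirit, the core of the vmsplice trick is pretty small. Here's a minimal sketch that hands a user buffer to a pipe's write end instead of copy-writing it; the receiving side would normally splice() it onward, and the buffer must not be reused while the pipe still references it.

    /* Sketch: move a user buffer into a pipe with vmsplice() instead of write().
       The pages are referenced by the pipe rather than copied; don't touch buf
       again until the reader has drained them. */
    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/uio.h>
    #include <unistd.h>

    int main(void) {
        int fds[2];
        if (pipe(fds) != 0) { perror("pipe"); return 1; }

        size_t len = 64 * 1024;                    /* fits the default pipe buffer */
        char *buf;
        /* Page-aligned buffers are what the trick is designed around. */
        if (posix_memalign((void **)&buf, 4096, len) != 0) return 1;
        memset(buf, 'x', len);

        struct iovec iov = { .iov_base = buf, .iov_len = len };
        ssize_t moved = vmsplice(fds[1], &iov, 1, SPLICE_F_GIFT);
        if (moved < 0) { perror("vmsplice"); return 1; }
        printf("gifted %zd bytes to the pipe without a copy-write\n", moved);

        /* A reader on fds[0] would normally splice() this onward to another fd. */
        char sink[64 * 1024];
        read(fds[0], sink, sizeof(sink));
        return 0;
    }

The SPLICE_F_GIFT flag tells the kernel it may take ownership of the pages rather than copying them, which is why the buffer can't be modified afterwards.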


Replace “Linux pipes” with “Electron apps”, and people would not agree.

Also, why leave performance on the table by default? Just because “it should be enough for most people I can think of”?

Add Tesla motors to a Toyota Corolla and now you’ve got a sportier car by default.


Electron apps are an optimization all by themselves.

They're not optimizing the footprint or speed of the application; they're optimizing the resources and speed of development and deployment.



