
Nice read. I love the pprof tool: being able to pinpoint which line of code (or line of assembly code even) takes the most time is awesome.


While very neat, in my experience this has two issues (I'm using Python's line_profiler, so pprof may not have the same problems):

1. It adds significant overhead to execution speed, which means savings measured under line_profiler and savings without it may be only distantly related. I'm not sure how much overhead pprof adds.

2. It requires knowing which functions should be line-profiled: when you have thousands or millions of LOCs, you've got no idea what to line-profile.

I've never found "usual" whole-program profilers to be great at the latter. It may just be that I'm bad at reading them, but they never really click, and when you've got a few "leaf functions" called from basically everywhere, they end up having a low signal-to-noise ratio (SNR). Recently, however, I've started using sampling profilers and flamegraph representations[0] (or sunbursts, but there's no standard tool for that one) and found them to be a significantly superior way of identifying bottlenecks, with very high SNR.

[0] http://www.brendangregg.com/FlameGraphs/cpuflamegraphs.html


"While very neat in my experience this has two issues (using Python's line_profiler so it may not have the same problems)... Recently however I've started using sampling profilers"

pprof is a sampling profiler. You can get it from Google for C, too. I'm not sure if Go has a "port" or just wrote something with the same ideas, but, well, I guess you could say I don't know precisely because it hardly matters.


> pprof is a sampling profiler.

Good to know.


> Recently however I've started using sampling profilers and flamegraph representations[0]

Maybe you should take a closer look at what pprof does and not draw conclusions only from Python's line_profiler. Go's profiler doesn't have flamegraphs, but it has directed graphs. Here's an example:

https://rawgit.com/benbjohnson/5b4fb2ffb6874484e586/raw/60af...

As you can see, it is very easy to see the bottlenecks. In fact the first thing I always do when profiling is outputting this svg to have a good first view of what's happening.

pprof is a "global" whole-program profiler, so you don't need to care about which lines to profile. My only limitation for the moment is that you can't dive into cgo code.
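For anyone who hasn't tried it, here is a minimal sketch of what "whole-program" means in practice, assuming the standard library's runtime/pprof package. The `busyWork` and `profileCPU` names are made up for illustration, and writing to an in-memory buffer is just to keep the example self-contained:

```go
package main

import (
	"bytes"
	"fmt"
	"runtime/pprof"
)

// sink keeps the result of busyWork alive so the loop is not optimized away.
var sink int

// busyWork is a hypothetical stand-in for the real program's workload.
func busyWork(n int) {
	sum := 0
	for i := 0; i < n; i++ {
		sum += i % 7
	}
	sink = sum
}

// profileCPU runs work with the CPU profiler switched on for the whole
// process and returns the raw profile bytes. No function is singled out:
// whatever runs between Start and Stop is eligible to be sampled.
func profileCPU(work func()) ([]byte, error) {
	var buf bytes.Buffer
	if err := pprof.StartCPUProfile(&buf); err != nil {
		return nil, err
	}
	work()
	pprof.StopCPUProfile() // stops sampling and flushes the profile to buf
	return buf.Bytes(), nil
}

func main() {
	data, err := profileCPU(func() { busyWork(50_000_000) })
	if err != nil {
		fmt.Println("could not start CPU profile:", err)
		return
	}
	fmt.Printf("captured %d bytes of profile data\n", len(data))
}
```

In a real program you would probably write the profile to a file instead of a buffer and then open that file with `go tool pprof` to get the graph output.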


> Maybe you should take a closer look at what pprof does and not draw conclusions only from Python's line_profiler. Go's profiler doesn't have flamegraphs, but it has directed graphs.

Python profilers have directed graph representations too; they are little better than the normal textual output.

> As you can see, it is very easy to see the bottlenecks.

With trivial programs, even the most terrible representations are no issue. Cycles in call stacks (partial mutual recursion) or "hot nodes" (dispatcher code) completely break directed graph output, but not flamegraphs.
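To make the "hot node" point concrete, here is a rough sketch of the folded-stack idea that flamegraphs are built on: every sample is kept as a full root-to-leaf path, so a dispatcher reached from two callers shows up as two separate paths instead of one merged node. The function names and the sample data are invented for illustration; this is not how any particular profiler stores its samples.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// foldStacks counts identical call stacks. Each sample is ordered root
// first, leaf last, and becomes one "root;...;leaf" key.
func foldStacks(samples [][]string) map[string]int {
	counts := make(map[string]int)
	for _, stack := range samples {
		counts[strings.Join(stack, ";")]++
	}
	return counts
}

// foldedLines renders the counts as sorted "stack count" lines, one line
// per distinct call path.
func foldedLines(counts map[string]int) []string {
	lines := make([]string, 0, len(counts))
	for stack, n := range counts {
		lines = append(lines, fmt.Sprintf("%s %d", stack, n))
	}
	sort.Strings(lines)
	return lines
}

func main() {
	// Hypothetical samples: "dispatch" is a hot node reached from two callers.
	samples := [][]string{
		{"main", "handleA", "dispatch", "parse"},
		{"main", "handleA", "dispatch", "parse"},
		{"main", "handleB", "dispatch", "render"},
	}
	for _, line := range foldedLines(foldStacks(samples)) {
		fmt.Println(line)
	}
}
```

A directed graph would draw a single `dispatch` node with edges to both `parse` and `render`, which loses the information that `parse` time belongs to `handleA`. The folded paths keep that attribution.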


pprof the tool is really just a visualization tool more than a profiler itself[0]. But yes, enabling it will enable it for the entire program, so you will see a massive performance hit when it's enabled.

[0] http://stackoverflow.com/questions/8083112/format-of-google-...


It's a sampling profiler; you shouldn't see a big performance hit.
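Rather than argue about it, the overhead is easy to measure. Below is a rough sketch, assuming runtime/pprof and a made-up CPU-bound `busyWork` function; the helper names are hypothetical. It times the same workload with and without the CPU profiler running. The numbers will vary by machine and workload, so treat it as a way to check your own program rather than as a benchmark.

```go
package main

import (
	"fmt"
	"io"
	"runtime/pprof"
	"time"
)

// sink keeps the result of busyWork alive so the loop is not optimized away.
var sink int

// busyWork is a hypothetical CPU-bound workload.
func busyWork(n int) {
	sum := 0
	for i := 0; i < n; i++ {
		sum += i % 7
	}
	sink = sum
}

// timeIt returns how long work takes to run.
func timeIt(work func()) time.Duration {
	start := time.Now()
	work()
	return time.Since(start)
}

// timeProfiled runs work with the CPU profiler enabled and returns the
// elapsed time. The profile itself is discarded; only the cost of
// collecting it matters here.
func timeProfiled(work func()) (time.Duration, error) {
	if err := pprof.StartCPUProfile(io.Discard); err != nil {
		return 0, err
	}
	defer pprof.StopCPUProfile()
	return timeIt(work), nil
}

func main() {
	work := func() { busyWork(200_000_000) }

	plain := timeIt(work)
	profiled, err := timeProfiled(work)
	if err != nil {
		fmt.Println("could not start CPU profile:", err)
		return
	}
	fmt.Println("without profiler:", plain)
	fmt.Println("with profiler:   ", profiled)
}
```

Because a sampling profiler interrupts the program at a fixed rate instead of hooking every call or line, its cost should not grow with the number of function calls the way a tracing or line-level profiler's cost does.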



