More

jhokanson · on Aug 10, 2022

"Firstly, there are three main Python concurrency APIs, they are:" asyncio, threading, multiprocessing, ... oh, and concurrent.futures

All kidding aside, I used the multiprocessing module lately and it was a mess. Do I want 'map', 'starmap', 'imap', etc.? All I wanted was to run a function multiple times with different inputs (and multiple inputs per function call) and to fail when any launched process failed rather than waiting for every input variation to execute and then telling me about the error(which honestly I didn't think was asking for too much).

jayemar · on Aug 11, 2022

There's a lot of functionality in the multiprocessing library and it has its own problems, but I wouldn't call it a "mess" from your description. map, starmap, and imap are all useful for different applications depending on your usage and priorities. I'll agree that it can sometimes be difficult to understand the differences and which is best for your use case, but having used in different use cases I certainly appreciate the different functions available.

jhokanson · on April 9, 2022

> Spreadsheets.com, for example, lets users dump almost anything into a cell. Drop a photo or a PDF into a cell and the product will immediately create a thumbnail, which you can then expand, as if the spreadsheet were some sort of blog content-management system.

Yes please! Can I get this for Google Sheets? :/

jhokanson · on March 10, 2022

Speaking of math pages on Wikipedia ... and math text more generally

Is it just me or are we horrible at teaching advanced math? Where are the examples (with actual numbers)? Where is the motivation? Where are the pictures?

hinkley · on March 10, 2022

Randall Monroe has a comic about how most people need enough math to be able to handle a birthday dinner where the guests split the bill for the birthday boy/girl evenly and pay for their meals and tip separately.

That’s a pretty good bar and I wonder if we could just cut to that chase earlier. But I also believe that people need enough math to see when they’re being cheated, and I feel like you could just tell middle schoolers that and they would pay attention. Maybe even primary school.

You told Billy he could have three apples, and now there are two left. Did Billy take more apples than he should have?

It’s always how do you share your cookies fairly with your friends and if they’re my cookies why do I have to share them at all? Screw “fairly” I’m keeping the extras at least. That sort of sharing is a socially advanced concept they don’t entirely get just yet.

groby_b · on March 11, 2022

Except that humanity desperately needs a better understanding of probabilities and non-linear relationships. We don't use more than division because we haven't succeeded teaching more, not because nobody needs it.

eternityforest · on March 11, 2022

Do we actually need to know how it works, or do we just need to really deeply understand that common sense is not scientific evidence and everything we actually care about can't be predicted just by "Being smart and thinking about it"?

Actually being able to do stuff with Bayes law by hand is going to be not only hard to teach, but probably impossible to remember for those of us who don't actually do math in real life. People forget stuff after a few months or years.

I highly doubt the average person is interested in checking the math on a science paper, so if you want the general public to understand statistics you... have to show us all a reason to, and also teach us all of the related skills needed to make it useful. Or else.... we will all just forget, even with the best teacher in the world.

Most of us aren't doing random game engines as a hobby project or testing things on bacteria cultures.

Maybe they should teach it in context of how to understand a scientific paper, since that's one of the more relevant things for non-pros. If you just teach statistics alone people will say

"Ok, now I know that it's easy to lie to yourself if you don't use any numbers but I don't have collections of large numbers of data points in my life to actually analyze"

hinkley · on March 11, 2022

"Thinking Fast and Slow" makes a quite extended argument that our brains have a faulty intuition about probabilities, and I nodded along thinking of all the bad decisions I've watched my teams make over the years (either noticed retrospectively, or presaged by myself or some other old hat).

If you have a way to fix this, you would be set for life, going around playing a sort of corduroy jacketed Robin Hood, keeping the rich from stealing from the poor.

groby_b · on March 14, 2022

I have a corduroy jacket... so, halfway there? ;)

Realistically, it is (somewhat) fixable in small contexts. I've worked (and continue to work) on teams that are somewhat decent at risks and probabilities, but it's definitely an exceptional experience.

I don't know how to widely teach that. But I'm not yet ready to give up and say "it can't be taught", because those folks on my team are the counterexamples.

GoatOfAplomb · on March 10, 2022

> Where are the examples (with actual numbers)?

In upper-level undergraduate math, I made a game of seeing how many pages I would go before seeing 7 printed anywhere. It was usually 10 pages, if I included the page numbers.

ravi-delia · on March 10, 2022

What are we calling advanced math? There comes a point where I personally find it much easier to avoid examples until I'm problem-solving, since otherwise I'll get stuck in a loop of wondering if the thing I noticed generalizes. Could just be that my working memory is poor, but when I see a real honest number I know I'm in for a grueling day.

joppy · on March 10, 2022

Wikipedia is a terrible place to learn advanced mathematics, for the reasons you raise (and more). There are lots of terrific short books, and many terrific lectures online.

chobytes · on March 10, 2022

This is definitely a problem! Having a large set of interests and problems to draw examples and intuition from are how I deal with it. I suspect this is why so many mathematicians are also into physics.

exdsq · on March 10, 2022

100%! For those of us who need to learn from practical examples through to generalized intuition maths can be really really hard to learn depending on the source. Wish I was one of those people who finds it easier to learn from abstract first through to implementations second.

xyzzyz · on March 11, 2022

It's not just you, we are horrible at teaching advanced math. However, the reason for it is that advanced math is, as far as we can tell, just really, really, really hard. It's not that mathematicians don't care about teaching others (they very much do, and they try their best to get their understanding across to others), or that Wikipedia authors are particularly bad at clear exposition (they are, if anything, above average). Quite simply, we know of no royal road to understanding mathematics, you have to put in many hours to bite it in very small pieces.

Here's an example:

https://en.wikipedia.org/wiki/Homology_(mathematics)

It has motivation, examples, and even actual numbers (though they're really just 0 and 1. most of the time). In my opinion, it's very good and clear exposition, for an encyclopedic article. However, I strongly suspect that people without enough mathematical knowledge (and "enough" in this case is something in the neighborhood of "enough to obtain an undergraduate degree in Mathematics") will simply not get anything about it beyond "it's about number of holes" (and that's not even remotely close to the whole picture: homology theories are important and useful in context of things with no "holes" to speak of). If you think otherwise, but not know what a quotient group is, you're just fooling yourself.

This is something I observe on HN a lot: people don't understand advanced mathematics, and are dumbfounded by the fact, trying to blame weird notation mathematicians insist on, or lack of motivation/examples/pictures etc. I never see people here do the same with advanced physics ("if the Standard Model is so standard, why can't they briefly and clearly describe what it is" is not something I ever see), molecular biology, or material science. People seem to know their limits and understand that really grokking these fields requires many years of deep study.

I think it's because many people on HN have good experience learning mathematics at school: it was something they always grasped really easily, and were easily able to figure out how to calculate derivatives, integrals, get matrices into normal forms etc. I don't want to rain on anyone's parade, because these things are still relatively difficult, and it does require more intellectual ability and effort that probably 3/4ths of the population aren't capable of. However, relative to advanced mathematics, undergraduate calculus is really rather trivial stuff.

Point is, if you don't understand modern advanced mathematics, you shouldn't get any more disappointed than you are about not being able to play violin. These things just don't come easy.

jhokanson · on Feb 2, 2022

Does AMD have folks that you can reach out to regarding this? I know Intel has MKL and all the work around its own compiler for maximum speed. This seems like it should be trivial for someone at AMD to put together as an example of how to do things like this correctly ...

jhokanson · on Feb 2, 2022

It is not exactly clear to me what is going on with threads (I guess you are using all of them?). I haven't done too much in this space but anecdotally I've had better luck if my summation is explicitly split into sub-summation tasks. It is not clear if that is being done here. It looks like a single summation loop that the author is expecting the computer to magically split across multiple threads. I'd be interested in seeing what this looks like if instead the task were to add chunks of the original dataset into results per thread (e.g, first 8000 samples on first thread, next 8000 on 2nd thread, etc.), with a final accumulation loop across all threads. Again, the author may be trying this and this is not my area of expertise but I've had decent luck saturating the memory bus with a similar approach.

ashvardanian · on Feb 2, 2022

Here is the source and the threads: https://github.com/unum-cloud/ParallelReductions/blob/fd16d9...

OFC we don't expect the compiler to instantiate them for us, it's not OpenMP :) That one we covered in previous articles. OpenMP gave us about 50 GB/s with all cores enabled and 80 GB/s with part of them disabled.

a_t48 · on Feb 2, 2022

Is there an advantage to using taskflow for parallel for, if you already have another threadpool implementation? I recently removed taskflow in a project that was only being used for a parallel for loop (as part of a larger refactor, the code had a number of issues...), and I'm wondering if that was a mistake now that I see that pattern somewhere else. :)

ashvardanian · on Feb 2, 2022

Nope, dont worry :) I did it our of laziness. I didn’t want to implement a task queue for std::thread-s, so I took TaskFlow, as one of the most famous solutions. You can definitely get better async task management with enough C++ experience and time.

jhokanson · on Feb 2, 2022

My c++ is not great (so it is hard for me to tell what is going on) and I'm used to OpenMP where my understanding has always been that you tend to get a single thread per processor (or per hyper-thread) -- not sure if that is guaranteed with the way your code is laid out? Perhaps it really is a NUMA issue as others suggest. I will note that one other variation I had (as it looks like you are already splitting across threads) is that the chunk sizes were actually smaller than the # of threads which meant a faster thread would take more chunks rather than waiting on the slowest thread. Good luck!

hinkley · on Feb 2, 2022

Doug Lea of Java Memory Model and concurrency note went pretty far down this rabbit hole. Not only do you use separate counters/queues per thread/core, but you also put empty space around them so that you don't accidentally share cache lines. I don't know what they do now, but at the time some of the data structures in that library used arrays where only every 8th or 16th entry is used to avoid two cores trying to read from the same cache line.

Typically allocating a separate data structure per actor also accomplishes this as a happy accident. If the thread does the allocation, then it has a better chance of being in the right bank as well.

ashvardanian · on Feb 2, 2022

Yes, thats needed when you have counters in global memory. In that case, instead of just having vector<double> you would put each double into a stricture aligned to 64 byte addresses. Here all the counters are on local stack, so that trick unfortunately wont help

hinkley · on Feb 2, 2022

For the single threaded version, I believe they have a similar problem with

    auto sums = _mm256_set1_ps(0);
    for (; it + 8 < end; it += 8)
        sums = _mm256_add_ps(_mm256_loadu_ps(it), sums);

Where each SMD op is trying to overwrite to a compact data structure.

But in the threaded version https://github.com/unum-cloud/ParallelReductions/blob/fd16d9... they have separate slots for an accumulator but it's still in a shared vector, which most likely has the issue I described.

jhokanson · on Dec 9, 2021

Yes please! I like to search for examples of how to use libraries and often times the results are all the same exact call in forks or copies of the same code in multiple places. Perhaps deduplication could be optional when searching?

dstaheli · on Dec 14, 2021

We hear you, jhokanson. Thanks for this feedback.

jhokanson · on Oct 9, 2021

I thought this would be referencing work by Nathan Myhrvold using a new type of reactor that supposedly runs on "spent" nuclear waste and in the event of power failure just stops running safely. Not sure of the other logistic issues involved. The one thing I remember about transitioning to this approach is that Nathan said the US isn't very good about building new things, so they were going to build in China. But then it got shut down right as the anti-China trade policies started a few years ago. Not sure if there are big problems with this approach, but it sounded promising ...

jhokanson · on July 13, 2021

As someone that rarely works with Python lists, as opposed to numpy arrays, I was pleasantly surprised to see numpy does what I would expect in providing a mergesort option. I'm surprised Python doesn't, other than via heapq and only implemented in Python (according to my reading of the post and a very quick Google search).

Oops, just for fun the numpy documentation currently states: "The datatype determines which of ‘mergesort’ or ‘timsort’ is actually used, even if ‘mergesort’ is specified. User selection at a finer scale is not currently available." Awesome ...

Also, apparently mergesort may also be done using radix sort for integers "‘mergesort’ and ‘stable’ are mapped to radix sort for integer data types."

masklinn · on July 13, 2021

Why would Python provide a mergesort option when timsort was the replacement for a standard mergesort?

`heapq.merge` is not a mergesort, it's a merge of sorted sequences (aka just the merge part of a mergesort).

TheRealPomax · on July 13, 2021

Turns out that at least in terms of performance, everyone using numpy is fine with this. It just needs to run fast enough, not as fast as possible ;)

tehjoker · on July 13, 2021

This is not true. I have written many libraries to improve on the performance of numpy for image processing applications. Sort backs many operations, such as unique, that result in painfully slow code.

CogitoCogito · on July 13, 2021

I too have written a lot of extension code to speed up numpy code. It's often not even especially difficult since any code at that level tends to be very algorithmic and looks essentially the same if written in numpy or C++. Having control of the memory layout and algorithms can be a huge win.

Of course I don't _usually_ do this, but that's just be most of the code I write doesn't need it. But it's not at all some crazy sort of thing to do.

toxik · on July 13, 2021

Tangential but np.bincount is typically the fast version of np.unique. Not entirely the same thing, but it’s worth knowing about it.

kzrdude · on July 13, 2021

mergesort mandates that it's a stable sort, so I guess they get away with replacing "like for like" that way. For varying definitions of like.

jhokanson · on March 22, 2021

Interesting take. I did not interpret their statement as vindication of being correct but rather that this version of events is a possibility that can't currently be dismissed.

jhokanson · on March 22, 2021

Agreed. However my reaction when first hearing about the lab leak (middle of last year?) was that the leak stories were meant to be malicious/propaganda against China. I didn't take any of this seriously until an article in Politico a week or two ago.

But here's the kicker. Let's say this was a lab leak and as a reporter (which I'm not) I thought the evidence was good enough to warrant reporting. I'm not sure I would share it. The previous occupant of the white house did a great disservice in giving this whole thing a racially charged tone. I'm genuinely scared by the increased acts of violence against southeast Asians in the US and worry that stories like this will make it worse. I'm hoping that the new US government is secretly taking steps to help prevent what may have happened in that lab -- in addition to the large effort needed elsewhere to improve our handling after things had begun to spread.

Anyway, main point is that this was the first time in a long time (ever?) where I really wondered whether, given the circumstances, if it was good to share "the whole truth" (as best we know it) given that we don't know what happened and the potential real-life implications to many people in the US.