I'm mostly thinking of Tornado/Python, where the async stuff happened via generators (i.e., the "yield" keyword). But that meant large chunks of the Python standard library were basically off limits: those functions block instead of yielding back to the event loop, so if you called one from a generator-based handler, the main event loop was stuck waiting until it returned.
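To make that concrete, here is a minimal sketch using the standard library's asyncio as a stand-in for Tornado's loop (an assumption on my part; the handler and heartbeat names are made up for illustration). A heartbeat task ticks every 10 ms, and we count how many ticks it gets while a handler "sleeps" for 200 ms, once with the blocking stdlib call and once with the loop-aware one.

```python
import asyncio
import time


async def heartbeat(ticks: list, interval: float = 0.01) -> None:
    """Record a timestamp every `interval` seconds until cancelled."""
    while True:
        ticks.append(time.monotonic())
        await asyncio.sleep(interval)


async def blocking_handler() -> None:
    # time.sleep() never yields to the loop, so nothing else can run.
    time.sleep(0.2)


async def cooperative_handler() -> None:
    # asyncio.sleep() suspends this coroutine and lets the loop run others.
    await asyncio.sleep(0.2)


async def count_ticks(handler) -> int:
    """Run `handler` next to a heartbeat; count ticks recorded while it ran."""
    ticks: list = []
    beat = asyncio.create_task(heartbeat(ticks))
    await asyncio.sleep(0)  # let the heartbeat record its first tick
    start = len(ticks)
    await handler()
    beat.cancel()
    return len(ticks) - start


def compare() -> tuple:
    """Return (ticks during blocking handler, ticks during cooperative one)."""
    blocked = asyncio.run(count_ticks(blocking_handler))
    cooperative = asyncio.run(count_ticks(cooperative_handler))
    return blocked, cooperative
```

Calling `compare()` should show the heartbeat getting no ticks at all while the blocking handler runs, and a steady stream of them while the cooperative one does.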
For Node, we happen to have a server that calls into a geometric modeler (for collaborative 3D modeling). Since it's doing a lot of math, you can easily end up with one expensive modeling operation chewing through CPU cycles while all the other sessions on the system just wait. That's kind of a specific use case, admittedly, but with threads it wouldn't even be an issue, whereas with async it's a problem. I get that there are ways around it (offloading the work to a worker process asynchronously, for instance, which is what we're doing), but it's annoying that it's a thing I have to think about when the functionality is built into the OS.
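The worker-process workaround looks roughly like this. It's a sketch in Python rather than Node, so it uses the same language as the Tornado case, and it is not our actual server code: `expensive_model_op`, `other_sessions` and `handle_request` are hypothetical stand-ins. The heavy call goes to an executor, and the event loop keeps serving everyone else while it runs.

```python
import asyncio
import concurrent.futures


def expensive_model_op(n: int) -> int:
    """Stand-in for a CPU-heavy modeling call (hypothetical): sum of squares."""
    return sum(i * i for i in range(n))


async def other_sessions(state: dict) -> None:
    """Stand-in for every other session sharing the event loop."""
    while True:
        state["ticks"] += 1
        await asyncio.sleep(0.005)


async def handle_request(executor: concurrent.futures.Executor, n: int) -> tuple:
    """Run the heavy call in `executor`; return (result, ticks seen meanwhile)."""
    state = {"ticks": 0}
    background = asyncio.create_task(other_sessions(state))
    loop = asyncio.get_running_loop()
    # Awaiting the future suspends only this coroutine, not the whole loop.
    result = await loop.run_in_executor(executor, expensive_model_op, n)
    background.cancel()
    return result, state["ticks"]


async def main() -> None:
    # Worker processes sidestep the GIL, at the cost of pickling arguments.
    with concurrent.futures.ProcessPoolExecutor(max_workers=1) as pool:
        result, ticks = await handle_request(pool, 2_000_000)
        print(f"result={result}, other sessions ticked {ticks} times")
```

To try it, call `asyncio.run(main())` from under an `if __name__ == "__main__":` guard, since process pools may re-import the main module on some platforms. It works, but note how much ceremony it takes to get what a plain OS thread gives you for free.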
Some of these cooperatively multithreaded implementations have "green" varieties of all your standard functions: greenlet-aware versions (that is, aware of the cooperative threading and the I/O loop underneath) of things like sleep. So you might have a my_green_library.sleep and a calls_the_os.sleep. The latter yields the hardware thread directly to the OS and blocks that thread completely until it's done. The former performs a sort of userland context switch: it registers a wake-up with the I/O loop and suspends only the calling greenlet, so the loop can run other work in the meantime.
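Here's a toy version of what a green sleep does under the hood, built on plain generators. This is an illustrative sketch, not how any particular greenlet library is implemented; `green_sleep`, `worker` and `run` are all made-up names. The point is that the green sleep just hands a wake-up time to the loop, and the only real OS-level sleep lives in the loop itself.

```python
import heapq
import time


def green_sleep(seconds):
    """Hypothetical green sleep: hand a wake-up time to the loop and suspend."""
    yield time.monotonic() + seconds


def worker(name, delay, log):
    log.append(f"{name} start")
    yield from green_sleep(delay)  # userland switch: other workers run now
    log.append(f"{name} done")


def run(tasks):
    """Toy loop: resume each generator once its requested wake-up time arrives."""
    # Entries are (wake_at, index, task); the unique index breaks ties so the
    # heap never has to compare two generator objects.
    ready = [(0.0, i, task) for i, task in enumerate(tasks)]
    heapq.heapify(ready)
    while ready:
        wake_at, i, task = heapq.heappop(ready)
        # The only real blocking call lives here, in the loop itself.
        time.sleep(max(0.0, wake_at - time.monotonic()))
        try:
            heapq.heappush(ready, (next(task), i, task))
        except StopIteration:
            pass  # this worker has finished


log = []
run([worker("a", 0.1, log), worker("b", 0.01, log)])
print(log)
```

Worker "b" finishes before "a" even though "a" started first, because "a" only suspended itself. Swap the `yield from green_sleep(delay)` for a `time.sleep(delay)` and "b" can't even start until "a" is completely done.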
Worse, this problem makes composition hard: you need to know the entire implementation of any function you call, in order to be aware of whether or not it will cause the calling thread to block.