The bottleneck/stress-point is almost certainly the unique IO patterns and latency expectations. I doubt CPU is the issue or that C is necessary; instead they need to figure out the right fanning/batching algorithms, and buy gobs of RAM and use it efficiently.
With their recent fundraising, they might even be able to run a couple of parallel fixup projects in competition, and let the best one win.