Here's a thought experiment for you. Say you can run 1,000 agents at 10,000 tokens per second each. Do you think you'll be more productive than someone running the same model at 6 tokens per second?
In case it's not clear, you would be generating 10,000,000 tokens a second. Good luck verifying that. Token generation is not the bottleneck for creative work. If the work is predictable, you have a good workflow, and there's a massive dataset to process, then token speed matters. If you are doing creative work like coding, it doesn't.
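
To put rough numbers on it, here is a back-of-the-envelope sketch in Python. The agent counts and token rates come from the thought experiment; the review rate of 5 tokens per second is purely my assumption for illustration, as is treating the 6 tokens/sec setup as a single agent. Swap in whatever you think is realistic.

```python
def generation_rate(agents: int, tokens_per_sec_per_agent: float) -> float:
    """Total tokens produced per second across all agents."""
    return agents * tokens_per_sec_per_agent


def review_seconds(generated_tokens: float, review_tokens_per_sec: float) -> float:
    """Seconds one reviewer needs to read through the generated tokens."""
    return generated_tokens / review_tokens_per_sec


# ASSUMPTION: one person can verify about 5 tokens per second.
# This figure is hypothetical and not from the original comment.
REVIEW_RATE = 5.0

fast = generation_rate(1000, 10_000)  # 1,000 agents at 10,000 tokens/sec each
slow = generation_rate(1, 6)          # assumed: a single agent at 6 tokens/sec

# Time needed to review ONE second of output from each setup.
fast_review = review_seconds(fast, REVIEW_RATE)
slow_review = review_seconds(slow, REVIEW_RATE)

print(f"fast setup: {fast:,.0f} tokens/sec, {fast_review / 86_400:.1f} days to review 1 sec of output")
print(f"slow setup: {slow:,.0f} tokens/sec, {slow_review:.1f} sec to review 1 sec of output")
```

Under that assumed review rate, one second of output from the fast setup takes about 23 days to read, while the slow setup stays roughly in step with the reader. The exact number doesn't matter; the gap does.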