Hacker News

This can never match the scale of organic training data




Or quality

Actually, synthetic training data is better; that's why the new models are all better at design.

If synthetic data is so much better, then what are AI crawlers still DDoSing everyone for? Are they stupid?

Mostly. I had the "AI bot tsunami" problem on my own personal site and blocked a bunch of bot user agents in robots.txt. Most of them were from companies I had never heard of before. The only big AI name I recognized was GPTBot from OpenAI.
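For anyone wanting to try the same thing, here is a minimal sketch of what such a robots.txt might look like, checked with Python's standard `urllib.robotparser`. Only the GPTBot name comes from the comment above; the catch-all rule, the `is_allowed` helper, and the example.com URLs are assumptions made for illustration. Keep in mind that robots.txt is advisory, so this only affects crawlers that choose to honor it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents. GPTBot is the user agent named in the
# comment above; the catch-all "*" rule is an assumption for illustration.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # example.com is a placeholder domain, not the commenter's site.
    print(is_allowed("GPTBot", "https://example.com/post"))
    print(is_allowed("Mozilla/5.0", "https://example.com/post"))
```

Additional bots can be blocked by repeating the `User-agent` / `Disallow: /` pair for each name seen in the server logs. Bots that ignore robots.txt would need to be blocked at the server or firewall level instead.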


