> But OpenAI appears to have some sort of data moat.
I'm willing to bet dollars to doughnuts that Google and Facebook have at least one, possibly two or more orders of magnitude more latent training data to work with — and that's not including Google's search index.
My uninformed opinion is that Google's and Meta's ML efforts are fragmented, with lots of serious effort going into growing existing revenue streams while LLMs and the like are treated as hobbies or R&D projects. OpenAI is putting all its effort into a handful of projects that feed into a product they sell. The dynamics and headcounts will change if the LLM market grows into the billions.
> My uninformed opinion is that Google and Meta's ML efforts are fragmented -
It seems more likely that Google, at least, fell into the classic innovator's dilemma: they got stuck applying innovation to their current business model in pursuit of incremental gains instead of seeking an entirely different customer and market.
I got the impression that Google was running Bard on a smaller model with presumably cheaper inference costs. I imagine the unit economics of both Bard and ChatGPT are negative right now, and Google is trying to stay in the game without lighting too much money on fire.
Google and Facebook are not interested in pooling all their resources to build the next big thing. They are just interested in doing "enough" that people keep using their platforms. The race is about how much time each day every person on the planet spends on either Google or Facebook/Instagram — about who is "the homepage of the internet". They just need to be good enough that traffic doesn't move off to ChatGPT.
I'm sure some people at Google and Meta were screaming at the top of their lungs to jump on the AI bandwagon before ChatGPT — but you know how things work in large companies.
They're not as good at innovating; that's why they acquire startups all the time. It's a blood transfusion.
Facebook.com already has decades' worth of natural-language text and audio/video from uploads and "live" sessions. That pool is deep, and wide too: Facebook probably has content in every currently spoken natural language, excepting those used exclusively by uncontacted peoples. That is a data moat.