What's the point of the technology if it will provide an answer regardless of the accuracy? And what prevents this from being dangerous when the factual and fictitious answers are indistinguishable?
We have the same problem with people. Somehow, we've managed to build a civilization that can, occasionally, fly people to the Moon and get them back.
Even if LLMs never get any more reliable than your average human, they're still valuable because they know much more than any single human ever could, run faster, only eat electricity, and can be scaled up without all kinds of nasty social and political problems. That's huge on its own.
Or, put another way, LLMs are kind of a concentrated digital extract of human cognitive capacity, without consciousness or personhood.
"without all kinds of nasty social and political problems"
I assure you, those still exist in AI. AI follows whatever political dogma it is trained on, regardless of whether you point out how logically flawed it is.
If it is trained to say 1+1=3, then no matter what proofs you provide, it will not budge.
Yes, it could be dangerous if you blindly rely on its reliability for something safety-related. But many creative processes are unreliable. For example, coming up with bad ideas while brainstorming is pretty harmless if nobody misunderstands it.
Generally, you want some external way of verifying that you have something useful. Sometimes that happens naturally. Ask a chatbot to recommend a paper to read and then search for it, and you’ll find out pretty quick if it doesn’t exist.
What happens when the tech isn't only being used to answer a human's questions during a short-lived conversation, though?
The common case we see publicized today is people poking around with prompts, but isn't it more likely, or at least a risk, that mass adoption will look more like AI running as long-lived processes tasked with managing some system on their own?
> The common case we see publicized today is people poking around with prompts, but isn't it more likely, or at least a risk, that mass adoption will look more like AI running as long-lived processes tasked with managing some system on their own?
If by “AI” you mean “bare GPT-style LLMs”, no, they can’t do that.
If you mean “systems consisting of LLMs being called in a loop by software which uses a prompt structure carefully designed and tested for the operating domain, and which has other safeguards on behavior”, sure, that’s more probable.
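To make that concrete, here's a minimal sketch of what "an LLM called in a loop with safeguards" might look like. Everything here (`call_llm`, `ALLOWED_ACTIONS`, the `ACTION:` reply format) is a hypothetical stand-in, not any real API:

```python
def call_llm(prompt):
    # Placeholder for a real model call; returns the model's raw reply.
    return "ACTION: noop"

# Allow-list safeguard: the loop can only ever execute vetted actions.
ALLOWED_ACTIONS = {"noop", "read_status", "restart_service"}

def parse_action(reply):
    # Only accept replies matching the expected "ACTION: <name>" format;
    # anything else (rambling, hallucinated commands) is rejected.
    if reply.startswith("ACTION: "):
        return reply[len("ACTION: "):].strip()
    return None

def run_step(system_state, max_retries=3):
    prompt = f"System state: {system_state}\nReply with ACTION: <name>."
    for _ in range(max_retries):
        action = parse_action(call_llm(prompt))
        if action in ALLOWED_ACTIONS:
            return action
    # Fail safe: do nothing rather than execute something unvetted.
    return "noop"
```

The point of the sketch is that the reliability lives in the scaffolding (format checking, allow-listing, fail-safe defaults), not in the model itself.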
One way to think about it, though, is that many important processes have a non-zero error rate. Particularly those involving people. If you can put bounds on the error rate and recover from most errors, maybe you can live with it?
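As a back-of-the-envelope illustration of "bounding the error rate": if each attempt fails independently with probability p, and failures are detectable so you can retry, the residual failure rate drops geometrically with the number of attempts:

```python
def residual_failure_rate(p_error, attempts):
    """Probability that every one of `attempts` independent tries fails,
    assuming failures are detectable (so a success ends the retrying)."""
    return p_error ** attempts

# e.g. a 20% per-attempt error rate with 3 verified attempts
rate = residual_failure_rate(0.2, 3)  # 0.8% residual failure rate
```

The independence assumption is doing a lot of work here, of course; correlated failures (the model is systematically wrong about something) don't shrink with retries.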
An assumption that error rates will remain stable is often pretty dubious, though.
Not if they're bad at it. ChatGPT and friends are tools that are useful for some things, and that's where they'll see adoption. Misuses of the technology will likely be exposed as such pretty quickly.
These are the million-dollar questions when it comes to LLMs. How useful is it to talk to a human who likes to talk, and prefers to say something over admitting they don't know? And if you have a person with Münchhausen syndrome in your circles, how dangerous is it to listen to them and accidentally pick up a lie? LLMs with temp > 0.5 are effectively like these people.
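For anyone unfamiliar with what "temp" means here: temperature scales the model's logits before the softmax, so higher values flatten the output distribution and make unlikely tokens more probable. A minimal sketch with toy logits (not a real model):

```python
import math
import random

def temperature_probs(logits, temperature):
    """Softmax over logits divided by temperature. Higher temperature
    flattens the distribution; lower temperature sharpens it."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample(logits, temperature, rng=random):
    """Draw one token index from the temperature-scaled distribution."""
    probs = temperature_probs(logits, temperature)
    r = rng.random()
    cumulative = 0.0
    for i, p in enumerate(probs):
        cumulative += p
        if r < cumulative:
            return i
    return len(probs) - 1

# Toy logits where token 0 is strongly preferred.
logits = [5.0, 1.0, 0.5]
cold = temperature_probs(logits, 0.5)  # sharply peaked on token 0
hot = temperature_probs(logits, 2.0)   # noticeably flatter
```

At higher temperature, the tail tokens (the "confident-sounding nonsense") get sampled more often, which is why the analogy above is about temp > 0.5 specifically.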
I have the same concerns, but am feeling more comfortable about Munchausen-by-LLM not undermining Truth as long as answers are non-deterministic.
Think about it: 100 people ask Jeeves who won the space race. They would all get the same results.
100 people ask Google who won the space race. They'll all get the same results, but in different orders.
100 people ask ChatGPT who won the space race. All 100 get a different result.
The LLM itself just emulates the collective opinions of everyone in a bar, so it's not a credible source (and cannot be cited anyway). Any two of these people arguing their respective GPT-sourced opinions at trivia night are going to be forced to go to a more authoritative source to settle the dispute. This is no different than the status quo...
> What's the point of the technology if it will provide an answer regardless of the accuracy?
The purpose is to serve as a component of a system which also includes features, such as the prompt structure upthread, that mitigate the undesired behavior while keeping the useful behaviors.