That's how they are trained initially, but the resulting model isn't all that useful (was SOTA two years ago but this field moves fast).
A lot of the utility comes from the later finetuning. You can see this using the examples from the article, every mistake they identify with GPT-3 (which is the unfinetuned version) is answered correctly by chatGPT, which has gone through an extensive finetuning process called RLHF.
A lot of the utility comes from the later finetuning. You can see this using the examples from the article, every mistake they identify with GPT-3 (which is the unfinetuned version) is answered correctly by chatGPT, which has gone through an extensive finetuning process called RLHF.