Alpaca is not as cheap as you think

"Alpaca is just $100 and competitive with InstructGPT" -- takes like this are going around Twitter, adding to the (generally justified!) hype around AI models.

It is indeed a very encouraging result, specifically that so little compute was enough to train a relatively small 7B-parameter model to competitive results on benchmarks. I haven't yet seen a demo of Alpaca doing well in an application (like Jasper), but that might not be far off.

My main gripe with this argument is that while the compute was indeed very cheap, the data wasn't. They used OpenAI's text-davinci-003 (an InstructGPT-class model) to generate their data, which is clearly against OpenAI's terms of use (so they risk getting sued, because OpenAI might need to make an example out of someone).

If they had created the labels on their own, the 52k-example dataset would have cost much more than $100. Assuming a human labeller can write one example in 3 minutes, at $10/hr the cost of the dataset would be $26k. That is still cheap relative to e.g. training GPT-3 from scratch (~$500k), but far from the $100 that most hackers would be willing to pay.
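
For the curious, here is that back-of-envelope estimate as a tiny Python sketch. The 3-minutes-per-example and $10/hr figures are the assumptions above, not measured numbers:

```python
# Back-of-envelope cost of hand-labelling the 52k-example dataset.
# Assumptions (from the paragraph above): 3 minutes per example, $10/hr labeller rate.
EXAMPLES = 52_000
MINUTES_PER_EXAMPLE = 3
HOURLY_RATE_USD = 10

hours = EXAMPLES * MINUTES_PER_EXAMPLE / 60   # 2,600 labeller-hours
cost = hours * HOURLY_RATE_USD                # $26,000

print(f"{hours:,.0f} hours -> ${cost:,.0f}")  # 2,600 hours -> $26,000
```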

Grumbling aside, I am glad to see LLaMA getting hacked and ported and optimized. It's encouraging to see that the open-source community can make wild and creative progress. Perhaps it's like the Fosbury flop or the 4-minute mile: once OpenAI demonstrates that something is possible, that proof of existence motivates the OSS community to achieve great things.