New Release: LLaMA
Besides scaling up to 65B parameters, LLaMA stands out for being trained on publicly available data, and the models are being made available to researchers, whereas its target competitor, OpenAI's GPT-3, is not open.
Here are a few Meta employees tweeting about it.
Guillaume Lample, @GuillaumeLample
Today we release LLaMA, 4 foundation models ranging from 7B to 65B parameters.
LLaMA-13B outperforms OPT and GPT-3 175B on most benchmarks. LLaMA-65B is competitive with Chinchilla 70B and PaLM 540B.
The weights for all models are open and available at https://research.facebook.com/publications/llama-open-and-efficient-foundation-language-models/
Joelle Pineau, @jpineau1
Introducing LLaMA: A foundational, 65-billion-parameter large language model
And an analysis by a non-Meta employee.
Guido Appenzeller, @appenz
1/5 Meta launched their GPT-3 competitor LLaMA today. Here is a quick analysis of how it stacks up, how open it is and how it changes the industry landscape.
Community: Stability AI
Stability AI, @StabilityAI, started an effort (a newsletter on Substack) to share what the community is creating. I think this is a great idea.
Their first issue is an EXCLUSIVE interview w/ THE amazing @remi_molettee
Tutorial:
Misha Laskin, @MishaLaskin
Starting a blog about the engineering + scientific ideas behind training large models (e.g. transformers).
First post covers data parallelism, a simple and common technique for parallelizing computation across multiple devices.
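To make the idea concrete, here is a minimal sketch of data parallelism (my own illustration, not code from the post): every replica holds a full copy of the model, processes its own shard of the batch, and gradients are averaged across replicas before each one takes the same optimizer step. This uses PyTorch and simulates the replicas in a single process for clarity.

```python
# Hypothetical sketch of data parallelism: replicate the model, shard the batch,
# all-reduce (average) gradients, then apply an identical optimizer step everywhere.
import copy
import torch
import torch.nn as nn

n_replicas = 4
model = nn.Linear(16, 1)                       # stand-in for a large model
replicas = [copy.deepcopy(model) for _ in range(n_replicas)]
opts = [torch.optim.SGD(r.parameters(), lr=0.1) for r in replicas]

x = torch.randn(32, 16)                        # global batch
y = torch.randn(32, 1)
x_shards, y_shards = x.chunk(n_replicas), y.chunk(n_replicas)

# Forward/backward on each replica's shard (these would run on separate devices in practice).
for r, xs, ys in zip(replicas, x_shards, y_shards):
    loss = nn.functional.mse_loss(r(xs), ys)
    loss.backward()

# "All-reduce": average gradients so every replica sees the same update direction.
for params in zip(*(r.parameters() for r in replicas)):
    avg_grad = torch.stack([p.grad for p in params]).mean(dim=0)
    for p in params:
        p.grad = avg_grad.clone()

# Identical optimizer step keeps all replicas in sync.
for opt in opts:
    opt.step()
    opt.zero_grad()
```

In a real setup the loop over replicas is replaced by multiple processes or devices, and the gradient averaging is done by a collective all-reduce (e.g. via `torch.nn.parallel.DistributedDataParallel`) rather than by hand.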