Meta announces new large language model

Meta Platforms is releasing its new LLaMA (Large Language Model Meta AI) to researchers.

Meta AI researchers developed a collection of LLaMA models that they say can outperform OpenAI's GPT-3 despite being smaller. A spokesperson said these new language models aren't being used in Meta's products, such as Instagram or Facebook.

  • According to Meta, the models range from 7 billion to 65 billion parameters, smaller than GPT-3's 175 billion parameters.
  • One of those models, LLaMA-13B, is 10x smaller than GPT-3 and outperforms it on "most benchmarks," according to project member Guillaume Lample.
  • All the LLaMA models were trained using publicly available datasets such as Wikipedia and Common Crawl.
  • Meta says it will make the models available to academic researchers on a "case-by-case basis."
  • By comparison, the models underlying OpenAI's ChatGPT and Google's LaMDA are not publicly available.
  • The researchers will be "affiliated with organizations in government, civil society, and academia; and industry research laboratories."
  • Researchers can apply for access via a request form. Meta is also sharing the full code and weights.
  • In a Facebook post today, Meta CEO Mark Zuckerberg said large language models have demonstrated "a lot of promise in generating text, having conversations, summarizing written material and more complicated tasks like solving math theorems or predicting protein structures."
