Meta has entered the artificial intelligence (AI) game with the launch of LLaMA (Large Language Model Meta AI), a foundational large language model with up to 65 billion parameters, the tech company announced. LLaMA joins the ranks of other well-known AI models such as ChatGPT, Google’s Bard, and Microsoft’s AI-powered Bing as the world moves towards a smarter future.
In this article, we explore what LLaMA is, its potential uses, and how it could significantly improve AI chatbots.
What is a Large Language Model (LLM)?
Before diving into what Meta’s LLaMA platform is, it’s essential to understand what a Large Language Model (LLM) actually is. LLMs are artificial intelligence (AI) systems that ingest massive amounts of digital text from internet sources, including blog posts, news articles, and social media updates.
These texts are used to train software such as ChatGPT to predict and produce content based only on a prompt from the user. Many of the AI-powered chatbots we see today are built on LLMs.
What is LLaMA?
Meta says that LLaMA is a state-of-the-art foundational large language model that will help AI researchers make progress in their work. Unlike a chatbot, LLaMA is a research tool that will help solve issues related to AI language models.
“Smaller, more performant models such as LLaMA enable others in the research community who don’t have access to large amounts of infrastructure to study these models, further democratizing access in this important, fast-changing field,” wrote Meta in a release.
Meta says that even though large language models have improved rapidly in recent years, researchers still lack full access to them because training and running them takes significant time and money. That limited access has made it hard to understand how and why these models work, which has slowed efforts to make them more reliable and to fix problems such as bias, toxicity, and the potential to spread false information.
To address these challenges, Meta is training its models, which range from 7B to 65B parameters, on trillions of tokens drawn only from publicly available datasets. This theoretically removes the reliance on proprietary and inaccessible datasets.
Like other large language models, LLaMA takes a sequence of words as input and predicts the next word, recursively generating text. To train its model, Meta selected material from the 20 languages with the greatest number of speakers, concentrating on those with Latin and Cyrillic alphabets.
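To see what “predicting the next word to recursively generate text” means in practice, here is a minimal, purely illustrative Python sketch. It uses a toy bigram table instead of a trained neural network, and the corpus and function names are invented for this example rather than taken from Meta’s release.

```python
from collections import Counter, defaultdict

# Toy corpus standing in for the web-scale text a real LLM is trained on.
corpus = (
    "large language models predict the next word . "
    "language models generate text one word at a time . "
    "models predict the next word from the words before it ."
).split()

# Count which word follows which (a bigram table -- a drastically
# simplified stand-in for a transformer's learned probabilities).
next_word_counts = defaultdict(Counter)
for current, following in zip(corpus, corpus[1:]):
    next_word_counts[current][following] += 1

def generate(prompt: str, max_words: int = 8) -> str:
    """Recursively extend the prompt by predicting the most likely next word."""
    words = prompt.split()
    for _ in range(max_words):
        candidates = next_word_counts.get(words[-1])
        if not candidates:
            break  # no continuation seen in the training text
        words.append(candidates.most_common(1)[0][0])
    return " ".join(words)

print(generate("language models"))
```

A real model like LLaMA replaces the bigram table with a neural network trained on trillions of tokens, but the generation loop follows the same principle: predict one word, append it, and repeat.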
Why is LLaMA so crucial in an AI-powered space?
Meta says that smaller foundational models like LLaMA can be very helpful in the large language model space because testing new approaches, validating the work of others, and exploring new use cases require far less computing power and resources.
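As a rough back-of-the-envelope illustration of why size matters (the figures below are approximate assumptions, not numbers from Meta): at 16-bit precision each parameter occupies two bytes, so simply holding the weights in memory scales directly with parameter count.

```python
# Rough, illustrative estimate of memory needed just to store model weights
# at 16-bit precision (2 bytes per parameter). Real requirements are higher
# once activations, optimizer state, and batching are included.
BYTES_PER_PARAM_FP16 = 2

for name, params in [("LLaMA-7B", 7e9), ("LLaMA-65B", 65e9)]:
    gigabytes = params * BYTES_PER_PARAM_FP16 / 1e9
    print(f"{name}: ~{gigabytes:.0f} GB of weights")
# LLaMA-7B: ~14 GB of weights
# LLaMA-65B: ~130 GB of weights
```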
Foundational language models are trained on large amounts of unlabelled data, which makes them well suited to being customised (fine-tuned) for a wide range of tasks.
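To make the idea of customising a foundation model concrete, here is a minimal sketch of task-specific fine-tuning using the open-source Hugging Face libraries. The model name, dataset, and hyperparameters are illustrative stand-ins only; this is not Meta’s training setup, and LLaMA itself is distributed under a research licence rather than through this workflow.

```python
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# Small, publicly available model used here as a stand-in foundation model.
model_name = "distilbert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# A small labelled dataset for the downstream task (sentiment classification).
dataset = load_dataset("imdb", split="train[:1000]")
dataset = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True,
                            padding="max_length", max_length=128),
    batched=True,
)

# Fine-tune: the general-purpose model is adapted to the specific task.
trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="finetuned", num_train_epochs=1,
                           per_device_train_batch_size=8),
    train_dataset=dataset,
)
trainer.train()
```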
Does it have any limitations?
While Meta acknowledges that more research is needed to address the risks of bias, toxic output, and hallucination that affect most large language models, LLaMA included, the model appears to have been built precisely so that researchers can test new approaches to limiting or eliminating these problems.
“As a foundation model, LLaMA is made to be flexible and can be used in many different situations,” Meta said. “This is different from a fine-tuned model that is made for a specific task.”
When will it be available?
Meta is releasing LLaMA under a non-commercial licence focused on research use cases, a step intended to prevent misuse and maintain the system’s integrity. Each request for access to the model will be evaluated individually.