Meta unveils Llama-3 – testing the best new open source AI model

admin, April 20, 2024

Meta has released Llama 3, the most advanced open source large language model available today. The model builds on the foundations of its predecessor, Llama 2, and the launch came as a surprise, since it was rumored to be scheduled for next month.

As an open-source foundation, Llama-2 played a key role in the development of other powerful models such as Mixtral, Alpaca, Vicuna, and WizardLM. Now, Llama-3 promises to take these capabilities even further, offering functionality comparable to GPT-4, OpenAI’s current flagship AI model.

Meta hailed Thursday’s launch as “the next generation of cutting-edge, open-source, large-scale language models.” In a sign of the tech giant’s confidence in the model, Llama 3 now powers Meta AI, which has been added to almost all of the company’s popular apps, including Instagram, Facebook, and WhatsApp. Although Meta AI is only available in some countries, users in other regions can access it via VPN.

Meta AI’s chatbot interface is similar to ChatGPT Plus, and it is free.

“We are upgrading Meta AI with our new, cutting-edge Llama 3 AI model, now open source,” Mark Zuckerberg said in a Facebook post. “With this new model, we believe Meta AI is now the most intelligent AI assistant freely available.”

Decrypt was able to test the new AI and found that it performs just as well as ChatGPT Plus, without a paid subscription. It can create images and animations, generate code, and provide consistent, context-aware responses. The new chatbot can also access the internet, but it is still no match for specialized solutions like Perplexity.

Perhaps the only downside is that Llama-3’s current context window is limited to 8K tokens (about 6,000 words).
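The figures above imply a rough ratio of 0.75 words per token. As a minimal sketch, assuming that ratio holds for ordinary English text and reading "8K" as 8,192 tokens, the following estimates whether a prompt would fit in the window. The function names are hypothetical, and a real tokenizer will give different counts depending on the language and content.

```python
# Rule of thumb taken from the article: 8K tokens is about 6,000 words,
# i.e. roughly 0.75 words per token. This is an approximation, not a tokenizer.
WORDS_PER_TOKEN = 6000 / 8000   # assumption derived from the figures above
CONTEXT_WINDOW_TOKENS = 8192    # assumption: "8K" read as 8,192 tokens


def estimate_tokens(text: str) -> int:
    """Estimate the token count of a text from its whitespace word count."""
    words = len(text.split())
    return round(words / WORDS_PER_TOKEN)


def fits_in_context(text: str, reserved_for_reply: int = 0) -> bool:
    """Check whether the prompt, plus tokens reserved for the reply, fits."""
    return estimate_tokens(text) + reserved_for_reply <= CONTEXT_WINDOW_TOKENS


if __name__ == "__main__":
    prompt = "word " * 6000
    print(estimate_tokens(prompt), fits_in_context(prompt))
```

Reserving part of the window for the reply matters in practice, since the prompt and the generated answer share the same token budget.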

Meta has released a Llama-3 model with 70 billion parameters, but using it requires enormous computing power, i.e. an entire GPU rack. According to synthetic benchmarks, this model outperforms Gemini 1.5 Pro and Claude 3 Sonnet.

Meta also released an 8 billion parameter model that can be run locally on consumer-grade GPUs. It outperforms Google’s Gemma and Mistral 7B on a variety of synthetic benchmarks. The model is not yet listed on LLM Arena, so there is no subjective Elo score to report yet.
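To see why the 70 billion parameter model needs server hardware while the 8 billion parameter model fits on a consumer GPU, a back-of-the-envelope estimate helps. The sketch below multiplies the parameter count by the bytes stored per parameter. It covers the weights only and ignores activations and the attention cache, so real memory use will be higher. The precision table and the function name are illustrative assumptions, not figures published by Meta.

```python
# Bytes needed to store one parameter at common precisions.
# These are generic storage sizes, not Meta-specific settings.
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params: float, precision: str = "fp16") -> float:
    """Approximate memory for the model weights alone, in gigabytes (10^9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


if __name__ == "__main__":
    for name, params in [("Llama 3 8B", 8e9), ("Llama 3 70B", 70e9)]:
        for precision in BYTES_PER_PARAM:
            gb = weight_memory_gb(params, precision)
            print(f"{name} @ {precision}: ~{gb:.0f} GB for weights")
```

At 16-bit precision this gives roughly 16 GB for the smaller model and 140 GB for the larger one, which is more than any single consumer graphics card offers.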

Comparison of Llama 3 with other AI LLMs — Image: Meta

Both models can also be run on cloud instances at a lower cost.

“We are committed to developing Llama 3 in a responsible way and providing a variety of resources to help others use Llama 3 responsibly,” Meta said. This includes the introduction of new trust and safety tools such as Llama Guard 2, Code Shield, and CyberSec Eval 2.

In the coming months, Meta says it plans to introduce new features, longer context windows, additional model sizes, and improved performance. The company says it will also share the Llama 3 research paper.

“Built with Llama 3 technology, Meta AI is now one of the world’s best AI assistants that can increase your intelligence and reduce your load, helping you make the most of every moment by learning, performing tasks, creating content, and connecting,” Meta said.

Meta added that it is also training a large-scale 400 billion parameter model, which is expected to be released later this year. Expected to be comparable to Claude Opus or the latest version of GPT-4.5, this model could be the most powerful open source model to date. If history repeats itself, it will become the basis for the next generation of fine-tuned models that surpass Llama-3 in overall quality and increase competition with the leading closed-source models.

Riding a Llama

Decrypt tested Llama-3 inside Meta AI to see if it was as good as Zuck said. In short, Llama-3 introduces several notable features and should serve as a good baseline model for the open source community to iterate on.

Content Moderation

Llama-3 demonstrates a strong commitment to content moderation. Even in the face of common jailbreak techniques, the model consistently refused to create harmful racial content.

For example, when the model was asked for guidance on how to seduce a woman, it gave a generic but useful response. But when asked how to seduce a best friend’s wife, the model flatly refused to answer.

Images and Animations

Similar to ChatGPT-Plus, Meta AI with Llama-3 can generate images. However, it takes this feature a step further by providing animation options, a feature not available in ChatGPT or Gemini.

Images generated by Meta AI using Llama-3 are more realistic than those generated by DALL-E 3, but do not reach the quality of images generated by Google’s upcoming ImageFX.

Coding Capabilities

Llama-3 has proven to be very good at coding. When presented with a unique and poorly described game idea, the model was able to generate the necessary Python code in just two attempts, resulting in a functional game. The first attempt gave only a rough outline of how to build the game, but once we made it clear that we needed it in Python, the model produced working code.

The game worked, but was missing a few minor details, such as restarting after the player won. But the same was true for other chatbots.

We found Claude 3 Sonnet to be the best tool for this task, followed by Llama 3. GPT-4 fell to third place. However, different users may see different results.

For those interested in testing, there is a pastebin with the source code generated by Llama 3, Claude, and ChatGPT.

Political Neutrality

The model aims for political neutrality, as evidenced by its answers to questions about capitalism and communism. The responses were structurally similar, each providing an introduction followed by the pros and cons of the system.

This neutral pattern was also observed in responses to questions such as “What is a man?” and “What is a woman?”

Nonetheless, the responses lean somewhat pro-capitalist and left-wing. This is not surprising, as it is the most common political tendency among large language models.

Logical Reasoning

Llama-3 demonstrated strong logical reasoning abilities. When tested with complex LSAT questions that often confuse users, the model not only provided the correct answer, but also gave a clear and well-reasoned explanation.

Long Prompt Limits

Despite its many advantages, Llama-3 struggles with long prompts. When presented with a long prompt, about a page and a half in length, which models such as GPT-4, Claude, or Mistral can handle, the model returned an error message.

Language Understanding

The model demonstrates a strong understanding of a variety of languages. When we asked it to translate a Spanish slogan, it not only provided an accurate translation, but also gave context to help us better understand the slogan.

Conclusion

As a chatbot interface, Meta AI (based on Llama 3) can compete with ChatGPT Plus and is a great choice overall.

At a more technical level, Llama 3 as an LLM is good enough to compete with GPT-4 in many scenarios, losing out only on context window size and retrieval-augmented generation (basically pulling information from a specific dataset provided by the user). This may matter to tech-savvy users, but may not be a big deal for the everyday person.

If you primarily use ChatGPT to create images with DALL-E, you may want to consider canceling your subscription and switching to Llama-3, since its image and animation creation capabilities are similar. However, if you also need support for long prompts, Llama-3 may not be the best choice and we recommend continuing to use ChatGPT Plus.

Casual users may find that Llama-3 meets their needs without a paid membership.

For tasks that require a lot of Internet research, ChatGPT Plus or Perplexity may be better suited.

Lastly, if you’re focused on coding, there are other specialized tools out there, but Llama-3 might be a good alternative. The fact that Llama-3 is free is a significant advantage.

Edited by Ryan Ozawa.