Once an obscure term buried deep in the realm of tech enthusiasts and researchers, Large Language Models (LLMs) have now been catapulted into the limelight, a fundamental component of AI and its revolutionary rise to prominence in 2023.
The surge began with the likes of GPT-3.5 and ChatGPT and extended rapidly to encompass a diverse array of models excelling in everything from professional coding to quirky conversational antics. Across this burgeoning landscape, some LLMs are seemingly versatile generalists, others shrinking violets, and even others who say the customizable nature and handheld convenience will only ensure a broader adoption of the LLM approach to AI data training.
This year, LLMs are not just a technological tool; they are becoming the digital confidants, creators, and sometimes even the slaves in our everyday lives. They’ve evolved from underlying technologies to front-and-center proof of their strength and currency—and variants are now everywhere, spreading across various platforms, and reshaping everything they touch.
Here are the most powerful LLMs you can try right now—based in no small part on our collective, subjective opinion. We’ll go beyond synthetic benchmarks’ cold hard data to share each model’s practical prowess and creative flair. Let’s review the roster of these two broad teams: open research and unabashed corporate accelerationists.
Top closed-source LLMs
GPT (OpenAI and Microsoft)
GPT (an acronym for Generative Pre-trained Transformer), the power behind OpenAI’s ChatGPT and Microsoft’s Copilot lineup, is a tour de force in the world of LLMs. The global AI hype wave started with the buzzworthy GPT-3.5 and has evolved into the more robust GPT-4.5 Turbo. This model, though not freely available like its predecessor, has set new standards for language understanding and generation.
GPT’s integration into widely-used platforms like Copilot has made it a household name in tech circles, significantly impacting how we interact with AI in our daily digital tasks.
The takeaway:
GPT stands out for its unparalleled performance on various synthetic benchmarks, making it the most powerful model currently on the market. However, its heavy censorship, implemented to ensure safe and responsible AI usage, can sometimes limit its creative potential. Microsoft’s version, available for free in Copilot, provides a glimpse into the future of AI-assisted coding, exemplifying the model’s versatility and power.
Claude (Anthropic)
Developed by a team of ex-OpenAI staff, Claude marks a paradigm shift in AI development. Eschewing OpenAI’s Reinforcement Learning from Human Feedback (RLHF) strategy, Claude adopts a “Constitutional AI” framework. This approach allows minimal human intervention while strictly adhering to a predefined set of rules, supposedly ensuring ethical AI behavior.
As the first free model capable of processing over 100K tokens of context, Claude sets new boundaries in AI’s understanding of lengthy and complex conversations. Its latest update makes it able to process over 200K tokens of context (almost twice the capabilities of GPT-4.5 Turbo), making it the most powerful LLM in terms of context capabilities.
The takeaway:
Its unique approach to AI governance and extended context understanding places Claude in a league of its own. While it trails GPT-4 in terms of accuracy and overall quality, its more creative and pleasant writing style offers a fresh perspective in AI interactions. However, its propensity for hallucinations is a trade-off for its more artistic and free-flowing expression.
Gemini (Google)
Gemini, Google’s latest foray into the LLM arena, stands out for its multimodal capabilities. Unlike ChatGPT Plus, which coordinates multiple AI models (like GPT, GPT-Vision, and Dall-E 3) to provide diverse outputs, Gemini was natively trained to understand and produce text and visual inputs and outputs.
Although not as publicly accessible as its competitors, Gemini promises to redefine Google’s range of services, leveraging its advanced multimodal understanding to offer richer, more integrated user experiences. Not every Google user has access to it, but those who do love it.
The takeaway:
Gemini’s visual and textual integration sets a new benchmark for LLMs, offering a glimpse into a more holistic AI future. Its rumored superiority over GPT-4 in multimodal tasks positions it as a top model in the AI landscape. Gemini’s potential to enhance Google’s ecosystem across various applications, meanwhile, makes it a LLM to watch closely in the coming years.
Top generalist open-source LLMs
LLaMA-2 (Meta)
LLaMA-2 is an open-source LLM developed by Meta. It is an evolution of the previous (and legendary) LLaMA model, which became widely popular among early AI enthusiasts. It’s available in versions from 7Bn to a hefty 70Bn parameters, which makes it a great choice for anyone in need of a model capable of providing good interactions ranging from the lightweight and occasional user to the heavy-duty professional.
The takeaway:
Its ability to be fine-tuned across a vast array of applications makes LLaMA-2 a versatile and powerful model. It seems to be more censored than its previous version, but it still provides an improved, more reliable output, making it a popular choice for developers looking to tailor it to specific tasks.
Mixtral 8X7B (Mistral AI)
Developed by the French startup Mistral AI, this LLM is an iteration of the much-acclaimed Mistral 7b model, enhanced by a ‘Mixture of Experts’ training approach. A mixture of experts involves partitioning the model into numerous neural networks, each specializing in different tasks or data types. This results in more efficient and effective learning without requiring powerful hardware.
The takeaway:
Its ingenious approach strikes a balance between quality and efficiency. This architecture leads to better throughput and also makes Mixtral a base for numerous derivative models. Considering it’s more powerful than Mistral 7b, this model is very promising and is already gaining steam in the open-source LLM community.
Falcon 180B (Technology Innovation Institute)
The arrival of Falcon 180B marked a monumental stride for open-source LLMs, boasting 180 billion parameters and training on an unprecedented 3.5 trillion tokens. As reported by Decrypt, this model is the culmination of an effort that involved a staggering 7 million GPU hours across 4,096 GPUs, all orchestrated to create one of the most potent models available to date.
It has some lighter versions, but those are not up to the standard the 180B model sets.
The takeaway:
Falcon 180B is not a consumer-grade model, but its prowess in handling instructions, engaging in rich dialogues, and coding tasks makes it a formidable tool for those able to access the necessary hardware. It provides quality outputs and, in general, can be a powerful ally for anyone willing to invest in it.
Top LLMs for work
Bloom (BigScience):
Imagine BLOOM to be a digital colossus, stretching its 176 billion parameters across the linguistic horizon. Bloom is adept in not just one or two languages, but 46 natural languages and an impressive array of 13 programming languages. This leviathan of language is the fruit of a year-long labor of love and intellect by a legion of over 1,000 researchers spanning the globe, working over 117 days on the Jean Zay supercomputer in France.
The takeaway:
BLOOM stands out as a beacon of linguistic diversity and a champion of the open-source movement. Its polyglot prowess sets it apart; its seamless integration with the Hugging Face ecosystem makes it available for anyone. It provides great quality results and is accurate enough for coding tasks and professional correspondence.
Mistral 7B (Mistral AI)
Mistral AI makes it to our list again with its original Mistral 7B model, trained with 7.3 billion parameters. This model became the hot topic of AI enthusiasts when news spread that it outperformed larger models across various benchmarks, especially in code generation and English language tasks.
To train it, Mistral AI used techniques like ”grouped-query attention” for faster inference and “Sliding Window Attention” for handling longer sequences more efficiently. Released under the Apache 2.0 license, Mistral 7B is very accessible for anyone willing to adapt the model according to their needs, be it a business chatbot, a document analyzer, a conversational AI, or just a funny bot with a personality.
The takeaway:
The model’s performance speaks for itself—it outperforms the already powerful Llama-2 13B and approaches the performance of specialized coding models. Its versatility earned it a place in the hearts of many AI aficionados worldwide, with many models trained with this tiny but powerful model as their base.
Top open-source LLMs for fun
Nous Hermes 2 – Yi-34B (Nous Research)
There are many “Hermes” LLMs floating around, but Nous Hermes 2 – Yi-34B is our favorite. Trained on 1,000,000 entries, predominantly generated by GPT-4, it’s base model Yi LLM, made some waves in the community for its high context capabilities and bilingual abilities. Honoring its name, Hermes provides uncensored knowledge, boasting a deep understanding of science and robust coding capabilities. Its unparalleled performance in all benchmarks for a Nous Research LLM has set it apart compared to models of a similar tier.
The takeaway:
In the realm of open-source LLMs for work, Nous Hermes 2 – Yi-34B stands out for its comprehensive approach and exceptional conversational and roleplay abilities, thanks to its use of ChatML. It is not as straightforward to set up for those unfamiliar with the ChatML style, but once you nail it, the results are very, very good. It is especially great for learning new things that can provide great conversations starters when properly set up.
Dolphin (Cognitive Computations)
Enter Dolphin, a daring entrant in the world of Large Language Models, fine-tuned with the robust Mixtral at its core. This model is not your average digital conversationalist; it seems to be designed for the thrill-seekers of digital dialogue, with great capabilities for those willing to do anything from funny and weird chats to enthusiasts willing to engage in more risqué roleplay.
But Dolphin’s realm extends beyond just NSFW entertainment. Its coding adeptness and sophisticated conversational capabilities make it a multifaceted tool for various applications. This unique cocktail of charm and technical finesse has quickly garnered Dolphin a reputation for daring innovation and versatility.
The takeaway:
In the ever-evolving landscape of open-source LLMs, Dolphin represents the cutting edge of rapid development and community-driven enhancements—for now. Its foundation on the Mixtral architecture speaks to a commitment to adaptability and community ambition, pushing the envelope of LLM capabilities. While its unfiltered nature caters to a specific audience, Dolphin is a testament to the desire for unrestrained digital expression and exploration.
WizardLM (OperatorX)
If you liked Dolphin, you’ll love entering into the enchanting world of WizardLM, Aitrepreneur’s chosen LLM for a realm of NSFW roleplay where only merit can crown you king. Despite grappling with the limitations of short memory, WizardLM weaves its magic across a wide array of topics, delivering responses with a consistency that’s nothing short of spellbinding. It’s not just a one-trick sorcerer either; other Wizard fine-tune code snippets specialize in areas like math and coding to make WizardLM a versatile companion for those who demand depth and delight.
Known particularly for its general 13B model, WizardLM excels in stirring up engaging, playful, and occasionally risqué dialogues. It’s like having an imaginative partner at your beck and call, ready to dive into a fantastical conversation immediately.
The takeaway:
WizardLM is the ally for those who value dependability and whimsy in their digital interactions. Whether you’re navigating the practicalities of work or the wilds of imaginative play, WizardLM stands out for its ability to keep the conversation flowing, relevant, and engaging. It’s the preferred choice for an open-source LLM that promises more than a conversation. WizardLM is offering an experience—where reliability meets a delightful dash of mischief for an altogether enlightening and entertaining digital journey.
Editor’s note: We took our time to configure chatbots based on the personalities of different historical figures and this model performed extremely well. Take your time, and you’ll be similarly rewarded. Have a great time with this model!