New AI model Reka challenges ChatGPT, Claude, and Llama-3, and it’s free.

April 30, 2024

A startup focused on building custom AI models for large enterprises has announced the public launch of Reka Core, a multimodal language model that can process text, image, video, and audio input.

Reka AI, an enterprise software company, was founded in 2022 by researchers from Google’s DeepMind, Chinese tech giant Baidu, and Meta. It has already raised $60 million in funding from investors including New York Life Ventures, Radical Ventures, Snowflake Ventures, DST Global, and investor and entrepreneur Nat Friedman.

Reka Core is the company’s largest and most capable model to date. Citing its own tests, Reka AI says the model holds up well against many much larger and better-funded models. In a research paper that compiles results from several synthetic benchmarks, Reka claims that Core can compete with AI tools from OpenAI, Anthropic, and Google.

One of the key metrics is MMMU, the Massive Multi-discipline Multimodal Understanding and Reasoning benchmark, a dataset designed to test the ability of large language models (LLMs) to perform multimodal comprehension and reasoning at a level comparable to that of human experts.

“Core is comparable to GPT-4V in MMMU, outperforms Claude-3 Opus in multi-modal human evaluations conducted by an independent third party, and outperforms Gemini Ultra in video tasks,” Reka AI said in a research paper. “On language tasks, Core is competitive with other frontier models on well-established benchmarks.”

In understanding video input, as measured by the Perception-Test benchmark, Core outperforms Gemini Ultra, the only other video-capable model.

Benchmark comparison between Reka Core, ChatGPT with GPT-4, Claude and Gemini. Image: Reka AI

Overall, Reka Core offers multimodal (image and video) understanding, solid reasoning, code generation, and multilingual fluency. The chatbot interface is free to use, and Reka Core is also available via API. For API access, developers can expect to pay $10 for every 1 million input tokens and $25 for the same number of output tokens.
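To put those API rates in perspective, here is a rough cost sketch. It assumes the $10 rate applies to input tokens and the $25 rate to output tokens, as quoted above; the function name and the sample workload are hypothetical and are not part of Reka's API.

```python
# Prices quoted in the article, in US dollars per 1 million tokens.
# Assumption: $10 covers input tokens and $25 covers output tokens.
INPUT_PRICE_PER_MILLION = 10.0
OUTPUT_PRICE_PER_MILLION = 25.0


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in US dollars for one workload."""
    input_cost = input_tokens / 1_000_000 * INPUT_PRICE_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_PRICE_PER_MILLION
    return input_cost + output_cost


# Hypothetical workload: 2 million input tokens and 500,000 output tokens.
print(f"${estimate_cost(2_000_000, 500_000):.2f}")  # $32.50
```

Because output tokens cost 2.5 times as much as input tokens, a workload that generates long answers will cost noticeably more than one that mostly reads long prompts.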

However, the model struggles with long prompts. The free version only handles 4,000 context tokens for efficiency reasons, with extended contexts available for up to 128,000 tokens, according to Reka. Competing models from OpenAI, Anthropic, and Google have a standard context window of 128,000 tokens, and experimental versions handle up to 1 million tokens.

Reka Core was trained from scratch on thousands of GPUs over several months. The company said the model is proficient in English as well as several Asian and European languages, having been trained on text data from 32 languages. The developers say it also picked up some multilingual ability from the Wikipedia dataset, which covers 110 languages, giving it limited baseline knowledge of many uncommon languages and dialects.

Reka Core is free to test, but is not open source. Reka AI said it is working with many major global technology platforms and government agencies, including Snowflake, Oracle, and AI Singapore.

Testing the model

Decrypt tested Reka Core through its public chatbot interface. It looks very similar to ChatGPT, with a dark mode display and blue and purple highlights.

Visual abilities

Reka Core responds to a request to draw a picture of a cat

While Reka Core’s visual capabilities are impressive, it’s important to note that it can’t generate images the way ChatGPT Plus, Meta AI, or Google Gemini can.

However, Reka’s vision capabilities are fast and accurate, making it a great tool for tasks that require visual analysis.

Reka Core’s answer to a request to describe a photo

In tests, Reka was faster than GPT-4 and delivered accurate results both when asked to explain an image and when it had to use visual information in context to respond to a task. For example, we showed Reka a photo of the Eiffel Tower and asked what we could do to enjoy a weekend in that city. Reka put the photo into context and provided an itinerary that included places to visit in Paris without including the Eiffel Tower.

Reka Core infers from the photo that the user is in Paris.

Reka AI seems to know how well its model’s vision capabilities compare with those of its competitors. It created a special showcase with examples of the different outputs offered by Reka, GPT-4, and Claude 3 Opus.

Reka writes code

Reka Core is a capable coding assistant, but it has some limitations. In our tests, Reka took everything literally, so careful wording was needed before it could provide accurate results. This can be difficult for new users who don’t know how to phrase a request in a way Reka understands.

If the prompts are written correctly, Reka can produce good code and satisfactory results.

We asked the model to generate code for a game that does not exist. The first result didn’t work, even though it was technically exactly what we asked for. After we rephrased the prompt to be clearer, we got functional but incomplete code on the first attempt, which was a better result than what Claude 3 Opus provided.

Code samples are available here, along with versions produced by other LLMs.

Reka has a strong safety setup

Reka Core has built-in safety controls and refuses to produce results that are considered harmful or unethical, even if they are legal. For example, it refused to give advice on how to seduce a friend’s partner.

In our tests, Reka resisted basic jailbreak techniques and was more neutral than other models such as GPT-4, Llama-3, and Claude. When asked about controversial topics such as gender identity and political ideology, Reka gave balanced and unbiased answers.

Reka Core compares socialism and capitalism

In another example, it presented arguments for and against both capitalism and socialism, even though it was asked to decide which system was best. Additionally, when asked to define women, Reka provided a detailed and nuanced answer that recognized both biological and sociological factors, defining women as “adult female human beings generally characterized by biological, psychological, and social characteristics associated with the female gender.”

Reka also took care to acknowledge and respect the complexities of gender identity and provide an inclusive response.

Reka tries creative writing

Reka Core’s creative writing skills are solid, but not exceptional.

We asked the model to tell us a story about a person who travels from the year 2160 to the year 1000 to solve a problem and inadvertently creates a time paradox.

Reka’s narrative style is clear and engaging, with plenty of nice descriptions here and there. But the prose doesn’t quite reach the level of imagination of other AIs like Claude. The plot feels a bit undercooked, and the overall feel is recognizably AI-generated.

As previously mentioned, one weakness of Reka is its limited context window. This can make it difficult to create long stories or maintain a consistent narrative across multiple chapters.

The clear winner in this category is Claude. In terms of sheer narrative skill—the ability to create immersive, emotionally resonant stories with beautiful prose and a confident narrative voice—Claude stands above the rest. In general, Claude’s prose has great literary qualities.

Sample stories created by Reka, Claude, ChatGPT, Mistral, and Llama-3 can be found here.

Knowledge and reasoning

Reka Core’s knowledge and reasoning abilities are really good. In our tests, Reka handled complex questions that required analysis and performed several mathematical operations. It was also able to explain its logical reasoning clearly and concisely.

It also handles follow-up questions that iterate on the same problem without losing context, as long as the follow-ups don’t push the model beyond its technical limits. When that happens, it becomes impossible to continue the interaction.

Reka also published a video explaining how users can deploy AI agents using the API. This lets developers extend the model’s functionality and makes it even more powerful in this respect.

Language understanding

Reka Core helps with grammar and proofreading.

Reka Core’s language understanding is excellent. In tests, Reka was able to understand text even when it contained many errors. It was also a skilled proofreader, able to adopt a variety of styles and tones in its writing.

The model also understands the nuances of different languages. It was able to translate and pick up on contextual framing to fully convey the message being translated. It understood common sayings in Spanish, provided appropriately adapted cultural equivalents, and explained their meanings.

Conclusion

Decrypt was very impressed with Reka Core.

Reka is better than Google Gemini in terms of output quality and overall performance, but Gemini offers 2TB of storage space and integration with the Google suite, which is a huge advantage for some users.

If visual prowess is your priority, Reka is definitely worth considering. It’s free and fast, which could appeal to many AI enthusiasts looking to explore the next big thing before the crowd.

If you need to focus on creative writing, Claude is still the clear winner. If that isn’t your top priority, there isn’t much difference between Claude and Reka. Claude is better at handling long contexts, and Reka has the stronger vision features.

In general, for people who need an advanced chatbot with extensive functionality, Reka is a great money-saving alternative to a monthly subscription to a paid service.

Edited by Stacey Elliott.