Ladies and gentlemen, there’s a new AI image generator in town and it’s surprisingly good.
This is amazing because it comes from Google and isn’t the basic, somewhat ugly, lazy generator you usually see on Bard. It’s also hidden from the general public, but that doesn’t mean it’s unusable.
The name is ImageFX and it is Google’s latest venture into the realm of AI image creation. It is available through Google’s AI Test Kitchen, an experimentation platform that allows users to interact with Google projects still in development.
Despite being in early beta, ImageFX delivers impressive results in terms of accuracy and realism. However, availability is limited to certain regions, including the United States, Kenya, New Zealand, and Australia, and the tool supports only English. This reflects Google’s cautious approach and its desire for a controlled environment for user feedback and system improvement.
People living outside the supported regions may use methods such as VPNs or proxies to circumvent the geo-restrictions, but they do so at their own risk.
Powering ImageFX is Imagen 2, a sophisticated AI model developed by DeepMind, Google’s renowned AI lab. Imagen 2 is designed to interpret and visualize text prompts and can create images in a variety of styles. Google claims that Imagen 2 sets a new standard for image quality among generative AI models.
The introduction of ImageFX is part of Google’s broader strategy to explore different aspects of generative artificial intelligence. It joins a suite of professional tools that include MusicFX for music production and TextFX for stylized text creation.
Google vs. Dall-E 3 vs. MidJourney
Google’s ImageFX is entering the AI-based image generator space, competing directly with established players like Dall-E 3 and MidJourney. A distinct advantage of ImageFX in its early beta phase is its free access. Dall-E’s ChatGPT integration costs $20 per month, while MidJourney’s annual subscription is closer to $100.
Although cost is a big factor, what really sets these tools apart is their relative capabilities and output quality. ImageFX excels at creating realistic images, surpassing Dall-E 3’s somewhat cartoonish presentation and MidJourney’s focus on aesthetically appealing visuals.
But just because ImageFX is free doesn’t mean it’s bad. ImageFX offers unique features such as seed control, allowing users to fine-tune the creative process by adjusting the initial noise configuration. This level of control is unmatched by Dall-E 3 or MidJourney, allowing users to make subtle adjustments while retaining key elements of the image.
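To illustrate what seed control means in general terms, here is a minimal Python sketch. It is not ImageFX’s or Imagen 2’s actual code: the function name `initial_noise` and the tiny grid size are hypothetical, chosen only to show that a fixed seed reproduces the same starting noise, while a different seed gives a different starting point.

```python
import random


def initial_noise(seed, width=4, height=4):
    """Return a height x width grid of Gaussian noise derived from a seed.

    Hypothetical helper for illustration only; real image generators use
    far larger noise arrays and their own random number generators.
    """
    rng = random.Random(seed)  # private generator, global state untouched
    return [[rng.gauss(0.0, 1.0) for _ in range(width)] for _ in range(height)]


same_a = initial_noise(seed=1234)
same_b = initial_noise(seed=1234)
different = initial_noise(seed=1235)

# Reusing a seed reproduces the starting noise; changing it does not.
print("same seed, same noise:", same_a == same_b)
print("new seed, same noise:", same_a == different)
```

In tools that expose a seed, keeping it fixed while editing the prompt is typically what lets a user change one detail and keep the rest of the composition roughly stable.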
ImageFX can also highlight key prompt words and suggest creative alternatives, a feature that competitors do not offer.
However, ImageFX has limitations. The tool only creates square images, while Dall-E 3 and MidJourney provide flexibility in aspect ratio. Additionally, unlike MidJourney, ImageFX does not support image editing features such as inpainting and outpainting, which limits its versatility. Finally, Dall-E 3’s conversational functionality, which allows beginners to instruct the model in natural language, contrasts with the keyword-based prompting required by ImageFX and MidJourney.
The approach to prompting also differs greatly between these models. ImageFX does not support negative prompts, which let the user specify what to exclude from the image. MidJourney offers this feature to add precision to the creative process. Although Dall-E 3 lacks direct negative prompts, its conversational interface allows users to guide the model indirectly, providing an alternative way to refine the image output.
An image is worth a thousand words
Decrypt gained access to ImageFX and was able to compare its generations with those of MidJourney and Dall-E 3. We used the same prompts for all models, and the results below are always presented in the same order from left to right: ImageFX first, MidJourney second, and Dall-E 3 third.
Realism:
Prompt: A photo of a cryptocurrency trader looking worried.
Both ImageFX and MidJourney produced very realistic results. However, in terms of style, ImageFX looks realistic, while MidJourney looks a bit more hyperrealistic. That is, the first is more faithful to real life, while the second is more artistic, using saturated colors, exaggerated bokeh, and so on.
Dall-E 3 failed to produce a photo. Instead, it created a 3D rendering that focuses more on the content. It was easy to tell that the subject was a cryptocurrency trader due to the chart in the background, but the image was clearly not a photo.
Illustration:
Prompt: Illustration of a mysterious bear riding a cybernetic wave.
This prompt was a bit more abstract, to test how the models interpret non-standard ideas. ImageFX and MidJourney produced the most aesthetically pleasing images, but MidJourney’s result was closer to a rendering than an illustration. ImageFX tried to capture the essence of the cybernetic wave, whereas MidJourney associated the term “cybernetic” with the bear. Dall-E 3 captured the essence even more closely: its image is clearly an illustration and has a similar cybernetic aesthetic, but the bear’s shape is wrong and the image quality is lacking compared to its competitors.
Long natural language:
Prompt: Highly detailed photo sci-fi of a mysterious computer expert working on a laptop. Behind him, an FBI agent awaits to capture him in a photorealistic, intricate wide shot.
To make this comparison, the prompt for MidJourney was changed to “Highly detailed photo sci-fi, wide shot, realistic, complex, close-up of a mysterious computer expert working on a laptop with an FBI agent waiting to capture him.”
MidJourney refused to generate an image from the first prompt.
ImageFX created a stunning, detailed photo that took every element of the prompt into account. MidJourney didn’t create a “mysterious” computer expert. It also stuck to its signature style, with excessive bokeh and eye-catching light trails or raindrops across its different generations. Its other generations seemed to depict astronauts, cyberpunk marines, or something similar, so this was the best example. Dall-E created an image that includes all the elements of the prompt (the FBI logo, the mysterious computer expert, etc.), but it’s not a photo and it gets the hacker’s anatomy wrong, featuring the typical spaghetti fingers.
Text in image:
Prompt: A futuristic city with a neon sign that says “EMERGE by Decrypt”
In general, the best text generator is Dall-E 3, but in this specific case, and under the conditions set by the comparison methodology, it rendered the text poorly. ImageFX could not generate the full text. There is a text generation feature, but it’s probably the least impressive.
That said, Dall-E and ImageFX were the best at capturing the essence of a futuristic city, while MidJourney created cities that were aesthetically pleasing, but not futuristic at all.
Conclusion
AI enthusiasts now have access to a wealth of AI models that meet a variety of needs. There is no need to pick a winner as most of them are available for free. Each product has a specific use case that makes it stand out.
If you don’t want to spend money, ImageFX is the best of the three. It’s also the best in terms of photorealism.
MidJourney is perfect for people who aren’t good at following prompts but are looking for aesthetically pleasing images.
Dall-E 3 is best suited for beginners who want to create renderings and don’t want to think about prompt engineering, keywords, and parameters, but would rather talk to the AI as if it were a friend.
But when it comes down to it, we really liked ImageFX.
Edited by Ryan Ozawa.