Why is Elon Musk’s Grok 2 using FLUX.1 AI?

FLUX.1 AI is one of the very few text-to-image models that render human hands correctly, and it can also be hosted locally.

Published on August 21, 2024

by Sagar Sharma

FLUX.1 AI, developed by Black Forest Labs, a German artificial intelligence startup, is quickly proving itself superior to the competition in both image quality and creative flexibility. The startup has also partnered with xAI, the company behind Grok, which is experimenting with the FLUX.1 model to expand Grok’s capabilities on X.

Unlike other popular AI image generators like DALL-E and Midjourney, FLUX.1 appears to have minimal content filters or restrictions. This aligns with Elon Musk’s stated goal of pushing back against what he sees as over-censorship of AI platforms.

As a result, Grok 2 now allows users to create uncensored deepfakes, such as one in which the 2024 US presidential candidates Kamala Harris and Donald Trump pose as a couple.

Apart from having fewer restrictions and filters, FLUX.1 AI competes with popular models such as Midjourney, DALL-E and Stable Diffusion, and beats all of them in terms of Elo scores.

Elo comparison (FLUX.1 AI achieves a higher Elo score than popular models. Source: Medium)
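For readers unfamiliar with Elo scores, the sketch below shows how a rating of this kind is typically updated after one head-to-head comparison between two models. It assumes the standard Elo formula with a K-factor of 32. The article does not say how the cited leaderboard computes its scores, and the function names and ratings here are hypothetical.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A is preferred over B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_ratings(rating_a: float, rating_b: float, score_a: float,
                   k: float = 32.0) -> tuple[float, float]:
    """Return new ratings after one comparison.

    score_a is 1.0 if A's image was preferred, 0.0 if B's was, 0.5 for a tie.
    k (the K-factor) is an assumed value controlling how fast ratings move.
    """
    exp_a = expected_score(rating_a, rating_b)
    new_a = rating_a + k * (score_a - exp_a)
    # B's actual and expected scores are the complements of A's.
    new_b = rating_b + k * ((1.0 - score_a) - (1.0 - exp_a))
    return new_a, new_b


if __name__ == "__main__":
    # Hypothetical starting ratings for two image models.
    model_a, model_b = 1000.0, 1000.0
    model_a, model_b = update_ratings(model_a, model_b, score_a=1.0)
    print(model_a, model_b)
```

A model whose images keep winning such comparisons accumulates a higher rating, which is what a higher Elo score on a leaderboard reflects.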

A key differentiator of FLUX.1 is its ability to accurately render human hands and legs, an area where previous AI image models struggled due to inadequacies in training datasets.

FLUX.1 also supports a diverse range of aspect ratios and resolutions up to 2.0 megapixels.
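To make the 2.0-megapixel ceiling concrete, here is a minimal sketch that computes the largest width and height for a given aspect ratio within that budget. It assumes one megapixel equals 1,000,000 pixels and rounds each side down to a multiple of 16, a constraint common to latent diffusion models but not stated in this article. The helper name is hypothetical.

```python
import math


def max_dimensions(aspect_w: int, aspect_h: int,
                   max_megapixels: float = 2.0,
                   multiple: int = 16) -> tuple[int, int]:
    """Largest (width, height) close to aspect_w:aspect_h within the pixel budget.

    Assumptions: 1 megapixel = 1,000,000 pixels, and each side is rounded
    down to a multiple of `multiple` (16 is an assumed value).
    """
    max_pixels = max_megapixels * 1_000_000
    # Solve width * height = max_pixels with width / height = aspect_w / aspect_h.
    height = math.sqrt(max_pixels * aspect_h / aspect_w)
    width = height * aspect_w / aspect_h
    # Rounding down keeps the result inside the budget.
    width_px = int(width // multiple) * multiple
    height_px = int(height // multiple) * multiple
    return width_px, height_px


if __name__ == "__main__":
    for ratio in [(1, 1), (16, 9), (9, 16), (3, 2)]:
        w, h = max_dimensions(*ratio)
        print(f"{ratio[0]}:{ratio[1]} -> {w}x{h} ({w * h / 1e6:.2f} MP)")
```

Under these assumptions a square image tops out at 1408x1408. Because of the rounding, the resulting aspect ratio is only approximately the one requested.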

The Fusion of Multiple Architectures

What sets FLUX.1 AI apart from other models is its unique mixture of multiple architectures, allowing it to achieve superior results. At the core, FLUX.1 AI combines the strengths of both transformers and diffusion models. This powerful blend allows FLUX.1 AI to generate images with unprecedented speed and quality.

Transformers, a type of neural network, excel at understanding and processing sequential data such as text. They help FLUX.1 AI interpret text prompts and translate them accurately into visual representations. Diffusion models, on the other hand, are skilled at generating high-quality images by iteratively refining noise into coherent structures. FLUX.1 AI leverages diffusion techniques to create images with intricate details and realistic textures.

Another strength of FLUX.1 AI is its prompt adherence. Whether you use simple or complex prompts, the model delivers high-quality images that closely match the input description. A Medium user named LM Po used the prompt “a cat looking into a camera, point of view fisheye lens” and got results comparable to those of Midjourney V6.

(FLUX.1’s output for the prompt “a cat looking into a camera, point of view fisheye lens”)

FLUX.1 AI Challenges Other Open-Source Models

(FLUX.1 AI is better than Stable Diffusion. Generated by FLUX.1 AI)

When you compare the best open source text-to-image models, you are left with limited choices including FLUX.1 AI and Stable Diffusion. However, compared to Stable Diffusion, FLUX.1 AI is quite easy to prompt.

In head-to-head comparisons, FLUX.1 AI consistently outperforms Stable Diffusion in generating photorealistic images with lifelike details and textures.

Because of its open-source nature, multiple forks of FLUX.1 AI are available. One good example is the Marketing Assistant app, based on FLUX.1 [Schnell], which lets users create social media content, marketing advertisements, and more for free.

Pierrick Chevallier, an AI designer on X, said that his experience with FLUX.1 AI was amazing. “Plus, its text handling is way better than SD3 and Midjourney,” he added, further praising the model.

Users across multiple social media platforms are now combining FLUX.1 AI with other tools like Midjourney, Udio and Luma AI to create videos with transitions that would otherwise have required hours of work.

Surprisingly, Black Forest Labs, the company behind the new open-source AI tool, consists of former Stability AI employees. These are the same minds behind tools like Stable Diffusion, Latent Diffusion, and Stable Diffusion XL.

In short, move over, Stable Diffusion, Midjourney, Imagen and DALL-E: the new AI image generation champion is here to stay.

Sagar Sharma

A software engineer who loves to experiment with new-gen AI. He also happens to love testing hardware, which sometimes crashes. While he revives a crashed system, you can find him reading literature or manga, or watering plants.