Flux.1 – Does this Image Generator Surpass Stable Diffusion?
On 1st August 2024, Black Forest Labs, a leading name in AI, introduced Flux.1, a powerful text-to-image synthesis model that promises to redefine the standards of image generation. With roots in the development team behind Stable Diffusion, Flux.1 is poised to set a new benchmark for creativity, efficiency, and accessibility in AI-driven media.
What is Black Forest Labs?
Black Forest Labs is an innovative company founded by former members of Stability AI, the creators of the popular Stable Diffusion AI Generator. The company is dedicated to advancing generative AI technologies, focusing on creating state-of-the-art models for media such as images and videos. Their mission is to push the boundaries of creativity, efficiency, and diversity in AI, and to make these technologies accessible to a wide audience.
Flux.1 – The New Era of Image Generation
Flux.1 is a next-generation text-to-image synthesis model developed by Black Forest Labs. It is designed to generate high-quality images from textual descriptions, surpassing many existing models in terms of visual quality, prompt adherence, and output diversity. This new Artificial Intelligence (AI) model is recognised for its advanced capabilities, including exceptional rendering of human figures and compatibility with local devices, making it a strong competitor in the AI image generation landscape.
Versions of Flux.1 - The Three Models:
1- Flux.1 [schnell]:
Flux.1 [schnell] is optimised for fast, local development and personal use. This version is designed to run efficiently on high-performance laptops, providing rapid image generation without the need for cloud resources.
2- Flux.1 [dev]:
Flux.1 [dev] is an open-weight model intended for non-commercial applications. It allows developers and hobbyists to explore the full potential of AI image generation while being adaptable for various projects.
3- Flux.1 [pro]:
Flux.1 [pro] is the top-tier version, designed for professional, commercial use. It offers the highest performance and is suitable for businesses looking to incorporate generative AI image services into their offerings. Companies are already adopting the Pro version for its robust capabilities and high-quality outputs.
Key Features of Flux.1:
1- Speed and Efficiency:
Flux.1 is built for speed and efficiency. Its architecture is designed to deliver rapid image synthesis without compromising on quality, allowing users to generate high-resolution images swiftly. This speed is particularly beneficial for industries that require quick turnaround times, such as advertising, design, and media production.
- Flux.1 [schnell]:
This version is tailored for users who prioritize speed in their development workflow. Optimized for local execution, Flux.1 [schnell] runs efficiently on high-performance laptops, enabling fast image generation without the need for cloud computing. This makes it ideal for personal projects, rapid prototyping, and scenarios where latency is a concern.
- Flux.1 [dev]:
The Dev version is designed for developers and researchers who need access to a powerful AI model for non-commercial work. As an open-weight model, it balances speed and flexibility, providing robust performance for non-commercial applications. It allows developers to experiment and innovate while benefiting from the model’s efficient architecture.
- Flux.1 [pro]:
Flux.1 [pro] is the professional-grade version, offering the highest level of performance for commercial use. It is designed to handle large-scale projects with ease, delivering fast and reliable results for businesses that demand top-tier image generation capabilities. Whether it’s for creating marketing materials or generating complex visual content, it ensures that speed and efficiency are never compromised.
2- Prompt Adherence and Quality:
Prompt adherence and output quality are two of the most important qualities in generative AI development, and Flux.1 is designed to deliver both. The model is engineered to interpret textual descriptions with high accuracy, producing images that closely match the provided prompts. This level of precision is crucial for applications where specific ideas or concepts must be represented visually. Because Flux.1 generates images quickly while keeping quality high, it is an efficient tool for both individuals and professionals.
Performance and Capabilities:
The model is equipped with a 12-billion-parameter architecture, which allows it to handle complex image-generation tasks with remarkable efficiency. This large-scale architecture enhances the model’s ability to generate high-quality images that are visually impressive and adhere closely to the user’s prompts.
It stands out in its ability to render intricate details, making it suitable for applications that require high levels of precision, such as custom product development, marketing, and digital art. The model’s performance is further enhanced by its adaptability, allowing users to run it on different hardware setups, from local machines to cloud-based servers, without sacrificing speed or quality.
Whether you’re using [schnell] for rapid prototyping, [dev] for exploratory projects, or [pro] for commercial-grade outputs, the model consistently delivers exceptional results. Its capabilities extend beyond just generating images; it is also adept at handling diverse prompts, making it a powerful tool for creative professionals who need flexibility and reliability in their workflows.
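To put the 12-billion-parameter figure in perspective, the short sketch below estimates how much memory the model weights alone would occupy at a few common numeric precisions. The function name `weight_memory_gib` and the choice of precisions are illustrative assumptions, and the estimate ignores the text encoders, the image decoder, and activations, so real usage will be higher.

```python
def weight_memory_gib(num_params: float, bytes_per_param: int) -> float:
    """Approximate memory needed to hold the weights, in GiB."""
    return num_params * bytes_per_param / 1024**3


NUM_PARAMS = 12e9  # the 12-billion-parameter figure quoted above

# Bytes per parameter for a few common precisions (illustrative choices).
PRECISIONS = {"float32": 4, "bfloat16": 2, "int8": 1}

for name, nbytes in PRECISIONS.items():
    print(f"{name}: about {weight_memory_gib(NUM_PARAMS, nbytes):.1f} GiB")
```

At 2 bytes per parameter the weights alone come to a little over 22 GiB, which helps explain why running the model locally calls for a high-memory GPU.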
Comparison between Flux.1 and Stable Diffusion:
Flux AI Image Generator:
You can try Flux.1 on the Flux AI Image Generator, which offers free as well as paid plans. The pricing plans for individuals and professionals break down as follows:
1- Free Plan:
The Free Plan is a great starting point, offering 150 credits per month, with 10 credits per day. This plan allows you to generate images using FLUX.1 [schnell] at a rate of 1 credit per image. You’ll have access to a shared generation queue and can view your generation history for up to 7 days. Additionally, you can run 1 job at a time.
2- Basic Plan:
The Basic Plan, priced at $9.90 per month, provides 1,000 credits per month, with 100 credits per day. This plan unlocks access to FLUX.1 [schnell], [dev], and [pro], and offers a priority generation queue, 30-day history, and fast generation speeds. You’ll also be able to generate images in batches, keep your generations private, and use the images commercially.
3- Pro Plan:
The Pro Plan, discounted to $29.90 per month, offers the most comprehensive features. You’ll get 5,000 credits per month, with 500 credits per day, and full access to all FLUX.1 models. This plan prioritizes your place in the generation queue, offers unlimited history, and includes all the features from the Basic Plan, including batch generation, private generations, and a commercial license.
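To compare the plans, it can help to work out the price per credit and how quickly the monthly allowance runs out at the daily cap. The sketch below does that arithmetic using only the figures quoted above; the `Plan` class and the helper names are illustrative assumptions, and the credit cost per image for [dev] and [pro] is not stated here, so the code does not convert credits to images.

```python
from dataclasses import dataclass
from math import ceil


@dataclass(frozen=True)
class Plan:
    name: str
    price_usd: float
    credits_per_month: int
    credits_per_day: int


# Figures taken from the plan descriptions above.
PLANS = [
    Plan("Free", 0.00, 150, 10),
    Plan("Basic", 9.90, 1000, 100),
    Plan("Pro", 29.90, 5000, 500),
]


def cost_per_credit(plan: Plan) -> float:
    """Monthly price divided by the monthly credit allowance."""
    return plan.price_usd / plan.credits_per_month


def days_to_use_allowance(plan: Plan) -> int:
    """Days needed to spend the monthly credits at the daily cap."""
    return ceil(plan.credits_per_month / plan.credits_per_day)


for plan in PLANS:
    print(
        f"{plan.name}: ${cost_per_credit(plan):.4f} per credit, "
        f"allowance used up in {days_to_use_allowance(plan)} days at the daily cap"
    )
```

On these numbers the Pro Plan costs less per credit than the Basic Plan, while both paid plans reach their monthly allowance after 10 days of use at the daily cap.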
Using Flux.1 on Online Platforms:
You can access Flux.1 on various online platforms. Here are a few websites that allow users to use this model:
1- Hugging Face:
Visit Hugging Face’s model hub and search for Flux.1 to explore and use the model through their interface. Sign up or log in to run the model with your own parameters.
2- Fal.ai:
Go to fal.ai, navigate to the model section, and find Flux.1 for direct access and usage. You may need to create an account to interact with the model.
3- Seaart.ai:
On seaart.ai, locate the model repository or search for Flux.1 to begin using the model. Registration might be required for full access.
4- Replicate:
Access Flux.1 on Replicate by visiting their platform and searching for the model in their collection. You can run the model directly or integrate it into your own applications.
5- Poe.com:
Find Flux.1 on poe.com by searching their platform or browsing their available models. Some features may be restricted to registered users.
6- ToastAI:
Go to ToastAI and look for Flux.1 in their model library to start using it. Create an account to access all functionalities and customizations.
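For platforms that expose an API, integrating the model into an application usually amounts to sending a prompt and receiving image output. Here is a minimal sketch using Replicate’s Python client. The model identifier `black-forest-labs/flux-schnell`, the helper names `build_input` and `generate`, and the use of a `REPLICATE_API_TOKEN` environment variable are assumptions rather than details from this article, so check the platform’s model page before relying on them.

```python
MODEL_REF = "black-forest-labs/flux-schnell"  # assumed identifier; verify on Replicate


def build_input(prompt: str) -> dict:
    """Build the request payload; only the prompt field is assumed here."""
    prompt = prompt.strip()
    if not prompt:
        raise ValueError("prompt must not be empty")
    return {"prompt": prompt}


def generate(prompt: str):
    """Send the prompt to Replicate and return whatever the client returns."""
    # Imported lazily so the helper above works without the client installed.
    import replicate  # pip install replicate; reads REPLICATE_API_TOKEN

    return replicate.run(MODEL_REF, input=build_input(prompt))
```

Calling `generate("A modern, minimalist house with large windows and a flat roof.")` would then return the platform’s output for that prompt; the exact return type depends on the client version.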
Running Flux on Google Colab:
The FLUX.1 [schnell] and [dev] weights are openly available, so users can experiment with the model and build on it. Let’s demonstrate how to access the model and run it on Google Colab:
- Go to Google Colab.
- Click on “File” -> “New notebook”.
- FLUX.1 requires 32GB of GPU RAM to run, so ensure you select the A100 GPU runtime.
- Copy the provided code and paste it into a cell in the notebook.
- Click the play button or press Shift + Enter to run the cell.
# install the packages
!pip install git+https://github.com/huggingface/diffusers.git
!pip install transformers sentencepiece accelerate protobuf
import torch
from diffusers import FluxPipeline
import diffusers
from PIL import Image
import matplotlib.pyplot as plt

# Workaround from early FLUX support in diffusers: run the rope (rotary
# position embedding) computation on the CPU. Newer diffusers builds may no
# longer expose this function, so the patch is applied only if it exists.
_transformer_flux = diffusers.models.transformers.transformer_flux
_flux_rope = getattr(_transformer_flux, "rope", None)

def new_flux_rope(pos: torch.Tensor, dim: int, theta: int) -> torch.Tensor:
    assert dim % 2 == 0, "The dimension must be even."
    if pos.device.type == "cuda":
        # Move tensor to CPU for ROPE computation, then move it back to CUDA
        return _flux_rope(pos.to("cpu"), dim, theta).to(device=pos.device)
    else:
        # Perform ROPE computation directly if tensor is not on CUDA
        return _flux_rope(pos, dim, theta)

if _flux_rope is not None:
    _transformer_flux.rope = new_flux_rope

# Load the Flux Schnell model
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-schnell",
    revision='refs/pr/1',
    torch_dtype=torch.bfloat16
).to("cuda")

# Define the prompt
# This is the textual description that the model will use to generate the image
prompt = "A modern, minimalist house with large windows and a flat roof."

# Generate the image
out = pipe(
    prompt=prompt,
    guidance_scale=0.,
    height=1024,
    width=1024,
    num_inference_steps=4,
    max_sequence_length=256,
).images[0]

# Save the generated image
out.save("gen_image.png")

# Display the generated image
image = Image.open("gen_image.png")
plt.imshow(image)
plt.axis('off')  # Hide axes
plt.show()
Using Flux.1 with Diffusers Python Library:
To use FLUX.1 [dev] with the diffusers Python library, first install or upgrade diffusers:
pip install -U diffusers
Then use FluxPipeline to run the model:
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained("black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16)
# Save some VRAM by offloading the model to CPU. Remove this if you have enough GPU power
pipe.enable_model_cpu_offload()

prompt = "A cat holding a sign that says hello world"
image = pipe(
    prompt,
    height=1024,
    width=1024,
    guidance_scale=3.5,
    num_inference_steps=50,
    max_sequence_length=512,
    generator=torch.Generator("cpu").manual_seed(0)
).images[0]
image.save("flux-dev.png")
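The two examples in this article differ mainly in three sampling arguments: [schnell] is called with 4 inference steps, a guidance scale of 0, and a maximum sequence length of 256, while [dev] uses 50 steps, a guidance scale of 3.5, and a maximum sequence length of 512. A small helper can keep those settings in one place. This is only a sketch: the names `SAMPLING`, `REPO_IDS`, and `generation_kwargs` are hypothetical, and the values are simply copied from the examples above rather than tuned.

```python
# Repository IDs and sampling settings copied from the two examples above.
REPO_IDS = {
    "schnell": "black-forest-labs/FLUX.1-schnell",
    "dev": "black-forest-labs/FLUX.1-dev",
}

SAMPLING = {
    "schnell": {"guidance_scale": 0.0, "num_inference_steps": 4, "max_sequence_length": 256},
    "dev": {"guidance_scale": 3.5, "num_inference_steps": 50, "max_sequence_length": 512},
}


def generation_kwargs(variant: str, prompt: str, size: int = 1024) -> dict:
    """Return keyword arguments for a pipeline call for the given variant."""
    if variant not in SAMPLING:
        raise ValueError(f"unknown variant {variant!r}; expected one of {sorted(SAMPLING)}")
    return {"prompt": prompt, "height": size, "width": size, **SAMPLING[variant]}


print(generation_kwargs("schnell", "A cat holding a sign that says hello world"))
```

With a pipeline loaded from `REPO_IDS["dev"]`, the call then becomes `pipe(**generation_kwargs("dev", prompt)).images[0]`.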
Conclusion:
Flux.1 represents a significant step forward in text-to-image synthesis, combining speed, efficiency, and quality. Developed by Black Forest Labs, it stands as a strong competitor to Stable Diffusion, bringing advanced capabilities and versatility to the table. Whether you’re seeking rapid image generation or high-quality outputs, Flux.1 and Stable Diffusion both provide robust solutions, each catering to different needs and preferences in the landscape of generative AI.