Bokeh Diffusion: Defocus Blur Control in Text-to-Image Diffusion Models

Bokeh Diffusion is an image generator that allows precise control over the blur in the background of an image. This effect, known as bokeh, comes from a shallow depth of field and is widely used in professional photography: the subject stays sharp and stands out, while the out-of-focus background gives the image a sense of depth.

For a long time, I have been waiting for an AI tool capable of controlling this bokeh effect, and now it has finally arrived.

What is Bokeh Diffusion?

Bokeh Diffusion is a text-to-image AI model that allows precise control over background blur (bokeh) in generated images. Most models only respond to prompt wording such as "blurry background", which gives coarse and unpredictable results. Bokeh Diffusion instead sets the blur level explicitly through a defocus parameter while keeping the rest of the scene consistent.


It achieves this through a hybrid training method that combines real-world and synthetic blur data, enabling flexible depth-of-field control and real-image editing.
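To make the idea of synthetic blur data concrete, here is a minimal sketch of how a sharp image could be turned into a pair of training samples that show the same scene at two blur levels. It is an illustration only, not the authors' pipeline: the function names, the box blur, the hand-made subject mask, and the use of the blur radius as the label are all assumptions made for this example. A real pipeline would more likely use a depth-aware lens blur.

```python
from typing import Dict, List, Tuple

Image = List[List[float]]  # tiny grayscale image, values in [0, 1]
Mask = List[List[bool]]    # True where the subject is (assumed to be given)


def box_blur(img: Image, radius: int) -> Image:
    """Average each pixel with its neighbours in a (2r+1) x (2r+1) window."""
    if radius <= 0:
        return [row[:] for row in img]
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total, count = 0.0, 0
            for yy in range(max(0, y - radius), min(h, y + radius + 1)):
                for xx in range(max(0, x - radius), min(w, x + radius + 1)):
                    total += img[yy][xx]
                    count += 1
            out[y][x] = total / count
    return out


def blur_background(img: Image, subject_mask: Mask, radius: int) -> Image:
    """Blur the whole image, then paste the sharp subject pixels back."""
    blurred = box_blur(img, radius)
    return [
        [img[y][x] if subject_mask[y][x] else blurred[y][x]
         for x in range(len(img[0]))]
        for y in range(len(img))
    ]


def make_contrastive_pair(img: Image, subject_mask: Mask, radius: int) -> Tuple[Dict, Dict]:
    """Two samples of the same scene that differ only in background blur."""
    sharp = {"image": [row[:] for row in img], "blur": 0}
    soft = {"image": blur_background(img, subject_mask, radius), "blur": radius}
    return sharp, soft


# Hypothetical 4x4 checkerboard scene with a 2x2 subject in the centre.
scene = [[float((x + y) % 2) for x in range(4)] for y in range(4)]
mask = [[1 <= x <= 2 and 1 <= y <= 2 for x in range(4)] for y in range(4)]
sharp, soft = make_contrastive_pair(scene, mask, radius=1)
```

Because both samples share the same content and differ only in blur, a model trained on such pairs gets a direct signal about what the blur value changes and what it should leave alone.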

Bokeh Diffusion Overview:

Detail	Description
Name	Bokeh Diffusion
Purpose	Defocus Blur Control in Text-to-Image Diffusion Models
Paper	arxiv.org/abs/2503.08434
GitHub Repository	github.com/atfortes/BokehDiffusion
Official Website	atfortes.github.io/projects/bokeh-diffusion/

Understanding the Bokeh Effect in AI Image Generation

In professional photography, adjusting the bokeh effect enhances the visual appeal of images. With Bokeh Diffusion, AI now enables users to control how clear or blurry the background appears.

To illustrate how this works, let's take a look at some examples:

If you keep the same text prompt and all other settings identical but adjust the bokeh value, the background clarity changes.
A photo of a cat remains the same in every aspect, but by adjusting the bokeh from 0 to 30, the background progressively becomes blurrier.
A drone with a cityscape in the background exhibits similar behavior when the bokeh value is modified.

How Bokeh Diffusion Works

The bokeh value in this AI model ranges from 0 to 30, where:

0 results in a background that is completely clear and detailed.
30 produces an extremely blurred background.
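The comparison described above, where only the bokeh value changes, can be planned with a few lines of Python. This is a sketch under stated assumptions: `GenerationRequest` and `bokeh_sweep` are hypothetical names invented for this example, not part of the Bokeh Diffusion code, and the fixed seed stands in for "all other settings identical". Only the 0 to 30 range comes from the article.

```python
from dataclasses import dataclass
from typing import List

BOKEH_MIN, BOKEH_MAX = 0.0, 30.0  # range stated in the article


@dataclass(frozen=True)
class GenerationRequest:
    """Hypothetical settings bundle; not the project's actual API."""
    prompt: str
    seed: int
    bokeh: float


def bokeh_sweep(prompt: str, seed: int, steps: int) -> List[GenerationRequest]:
    """Evenly spaced bokeh values from 0 to 30 with prompt and seed held fixed."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    stride = (BOKEH_MAX - BOKEH_MIN) / (steps - 1)
    return [
        GenerationRequest(prompt, seed, BOKEH_MIN + i * stride)
        for i in range(steps)
    ]


sweep = bokeh_sweep("a photo of a cat", seed=42, steps=4)
print([r.bokeh for r in sweep])  # [0.0, 10.0, 20.0, 30.0]
```

Each request in the list would then be passed to whatever generation call the released code exposes, so that the only thing that differs between the resulting images is the blur level.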

Here are some examples demonstrating how different bokeh values influence the background blur:

Subject	Bokeh Value	Background Description
Smoothie with market	0	Market details remain sharp and clear.
Car in the city	1	Background is still very detailed.
Red wine on a table	12	Background slightly blurred but recognizable.
Cow on a farm	14	Moderate background blur.
Woman in a park	18	Noticeably blurred background.
Man in a busy street	29	Extremely blurred background.
Seashell on a beach	29	Background nearly indistinguishable.

Comparison with Flux Image Generator

Many AI image generators, such as Flux, allow users to specify background blurriness in prompts. However, in practice, Flux does not offer the same level of control.

It often applies a generic blur effect, making most backgrounds uniformly blurry.


In contrast, Bokeh Diffusion provides full control over the depth of field, allowing for:

Completely clear backgrounds.
Slightly blurred backgrounds.
Highly blurred backgrounds.

This level of control ensures that the generated images achieve the desired effect with precision.

The Mechanism Behind Bokeh Diffusion

Bokeh Diffusion operates using the following components:

Text Prompt – The main input provided by the user.
Bokeh Parameter – A specialized setting that determines the background blur level.
Grounded Self-Attention Component – Ensures that the subject remains sharp and consistent while only altering the background blur.

GitHub Repository and Future Updates

The developers have released a GitHub repository for Bokeh Diffusion and have indicated that more updates will be coming soon. Those interested in experimenting with this tool should stay tuned for further developments.

Bokeh Diffusion is a step forward in AI image generation because it turns depth of field into an explicit, adjustable input. This opens up new possibilities for photographers, designers, and digital artists who need precise control over background blur in their visuals.

Bokeh Diffusion Method:

Bokeh Diffusion employs a unique combination of three core components to achieve lens-like bokeh effects without altering the scene's structure:

Hybrid Dataset Pipeline: This method integrates real-world images, which provide authentic bokeh effects and diversity, with synthetic blur augmentations to create contrastive pairs. This dual approach ensures that the defocus realism is anchored while providing robust examples for training.


Defocus Blur Conditioning: A physically interpretable blur parameter, ranging from 0 to 30, is injected through decoupled cross-attention in the deeper layers of the U-Net architecture. This technique maintains semantic features while precisely controlling the defocus level.
Grounded Self-Attention: A "pivot" image is used to anchor the scene layout, ensuring consistent object placement across varying blur levels. This mechanism prevents unintended shifts in content when adjusting the defocus, maintaining the integrity of the scene.
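The two attention mechanisms above can be sketched in plain Python. This is a simplified illustration, not the authors' implementation. It assumes that "decoupled cross-attention" means a separate attention branch for the blur condition whose output is added to the text branch, and that "grounded self-attention" means the image being generated also attends to keys and values taken from the pivot image. The function names and the `scale` argument are hypothetical, and the learned projections are omitted, so queries, keys, and values are assumed to be already projected.

```python
import math
from typing import List

Matrix = List[List[float]]  # one row per token


def attention(q: Matrix, k: Matrix, v: Matrix) -> Matrix:
    """Scaled dot-product attention: softmax(q k^T / sqrt(d)) v."""
    d = len(q[0])
    out = []
    for qi in q:
        scores = [sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d) for kj in k]
        top = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - top) for s in scores]
        norm = sum(weights)
        out.append([
            sum(w * vj[c] for w, vj in zip(weights, v)) / norm
            for c in range(len(v[0]))
        ])
    return out


def decoupled_cross_attention(q: Matrix, text_k: Matrix, text_v: Matrix,
                              blur_k: Matrix, blur_v: Matrix,
                              scale: float = 1.0) -> Matrix:
    """Text branch plus a separate blur-condition branch, combined by addition."""
    text_out = attention(q, text_k, text_v)
    blur_out = attention(q, blur_k, blur_v)
    return [
        [t + scale * b for t, b in zip(t_row, b_row)]
        for t_row, b_row in zip(text_out, blur_out)
    ]


def grounded_self_attention(q: Matrix, own_k: Matrix, own_v: Matrix,
                            pivot_k: Matrix, pivot_v: Matrix) -> Matrix:
    """Self-attention whose keys and values also include the pivot image's tokens."""
    return attention(q, pivot_k + own_k, pivot_v + own_v)
```

In this sketch, setting `scale` to 0 switches the blur branch off and leaves ordinary text cross-attention, which shows why a separate branch can add blur control without overwriting the text features. The grounded variant lets every token of the new image look at the pivot image's tokens, which is one plausible way to keep objects in the same place across blur levels.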

Together, these components give users explicit control over depth of field: the background blur changes as requested while the original scene's structure stays the same.
