Stable Diffusion 3 Unveiled With Legible Text And Multiple Subjects For Amazing Results

by Zak Killian — Thursday, February 22, 2024, 02:45 PM EDT

Cloud-hosted AI image generators like OpenAI's DALL-E 3 and Meta's Imagine are really fun to play with, but they can come with onerous content restrictions, blocking even prompts that don't seem objectionable at all. They are also known to massage users' prompts: Gemini was recently noted to insert "diverse" into any prompt containing a human, while OpenAI's DALL-E series prefers the term "ethnically ambiguous".

[Image: "unsafe image content detected" notice] *Anyone who's used Microsoft's Copilot will be familiar with this tedious message.*

You can completely avoid these types of restrictions and prompt injections by using a locally-hosted AI. Arguably the most popular of these, at least for image generation, is Stability AI's Stable Diffusion, as well as models derived from it. Stable Diffusion is a fantastic tool, but historically it has struggled compared to the cloud-hosted tools developed by megacorporations with megacorporation money. For example, current versions of Stable Diffusion can't really produce legible text in images, and they usually fail on prompts that feature multiple characters.

*Images with multiple distinct subjects are a challenge even for DALL-E 3.*

Well, apparently the next release from Stability AI is going to solve these problems. It will, predictably, be known as Stable Diffusion 3, and it'll comprise a suite of models ranging from 800 million to 8 billion parameters. Stability AI says that it achieved these results by combining a diffusion transformer design with a newer technique known as flow matching, introduced in the paper "Flow Matching for Generative Modeling". Company CEO Emad Mostaque describes the new model's architecture as being similar to that of OpenAI's recently-revealed Sora video model.

*Images with legible text like this were impossible on previous releases.*

The model isn't available to the public yet in any form, so all of the images you're seeing here are examples provided by Stability AI. Instead, the company is taking applications to join a waitlist for the preview phase leading up to final release. Believe us, if it were available, we'd already be on top of it. Your author in particular is a huge fan of Stable Diffusion and has spent many dozens of hours dialing in prompts.

Prompt: studio photograph closeup of a chameleon over a black background

Stability AI's announcement includes a note that the developers have taken "reasonable steps to prevent the misuse of Stable Diffusion 3 by bad actors." We're not sure exactly what that means, but hopefully it doesn't ultimately result in the same sort of ideology-driven censorship that we see with cloud-hosted AI generators. In any case, it's likely that the community will take Stable Diffusion 3 and run with it as it has done with Stable Diffusion 2 and Stable Diffusion XL. Here's looking forward to the eventual launch of the new model.

Tags: AI, stable diffusion, generative ai, stability ai
