Stability.ai launches DreamStudio Beta, a text-to-image generator web app open to the public

A full look at how to create your own images by engineering simple prompts that add style and finishing touches, using AI that competes with DALL-E 2

Aug 20, 2022

“Kanye West as Tyrion Lannister in a Game of Thrones dream world, in the style of Joseph Cornell, hyperrealistic photograph with Unreal Engine lighting 8k”

The competition between DALL-E 2 and Stable Diffusion is heating up.

Stability.ai today launched DreamStudio Beta, a text-to-image web app for creating art through prompts and filters. New users get 200 free image credits, and every additional ~1k standard generations costs only £10.
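To get a rough feel for what that pricing means, here is a minimal back-of-the-envelope sketch. It assumes that one credit covers one standard generation and that the £10 per ~1k rate applies pro rata; neither detail is confirmed by the announcement, and the function name is purely illustrative.

```python
# Rough cost estimate based on the figures quoted above.
# Assumptions (not confirmed by DreamStudio): one credit equals one
# standard generation, and paid generations are billed pro rata.

FREE_CREDITS = 200            # free image credits for new users
PRICE_GBP_PER_BLOCK = 10.0    # price quoted for each additional block
GENERATIONS_PER_BLOCK = 1000  # "~1k" standard generations (approximate)


def estimated_cost_gbp(total_generations: int) -> float:
    """Estimate the cost in GBP of a number of standard generations."""
    if total_generations < 0:
        raise ValueError("total_generations must be non-negative")
    paid = max(0, total_generations - FREE_CREDITS)
    return paid * PRICE_GBP_PER_BLOCK / GENERATIONS_PER_BLOCK


if __name__ == "__main__":
    for n in (200, 700, 1200):
        print(n, "generations ->", estimated_cost_gbp(n), "GBP")
```

Under these assumptions, the first 200 images cost nothing and each paid standard generation works out to about one penny.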

“DreamStudio is a new suite of generative media tools engineered to grant everyone the power of limitless imagination and the effortless ease of visual expression through a combination of natural language processing and revolutionary input controls for accelerated creativity.” — DreamStudio

The web app builds on an AI model called Stable Diffusion, which was recently opened to researchers for beta access. The beta version is called DreamStudio Lite, and the company plans to release DreamStudio Pro (video/audio) and Enterprise (studios) soon. Unlike DALL-E 2, Stable Diffusion does not filter out public figures.

Benj Edwards AI/ML @ai_benj

Stable Diffusion opened DreamStudio web interface recently and shut down its Discord image generation bot. They’re moving to a commercial model, charging for generations in DreamStudio, while planning a public release (maybe Monday) of the weights for local gen on your own GPU

After playing around with the web app this afternoon, I’m amazed by its power, accuracy, and speed. This is a new era for AI-generated art, and people will only be limited by the power of their imagination. As I wrote about last week, it’s not without controversy. The possibilities feel endless.

Here’s how the DreamStudio Beta web app works:

First, users see a simple web app with a prompt on the bottom and filters on the right.

Users then choose the width and height of their image, the “cfg scale” that adjusts how closely the image follows the prompt, the number of steps the AI will spend generating (diffusing) the image, the number of images to generate, which sampling method to use (k_lms by default), and a seed number.
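One way to picture those controls is as a small settings object. The sketch below is not DreamStudio's API; the class name, field names, and every default except the k_lms sampler are placeholders chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GenerationSettings:
    """Hypothetical container for the controls described above.

    Only the "k_lms" sampler default comes from the article; the other
    default values are illustrative placeholders.
    """

    width: int = 512              # image width in pixels (placeholder)
    height: int = 512             # image height in pixels (placeholder)
    cfg_scale: float = 7.0        # how closely the image follows the prompt
    steps: int = 50               # number of diffusion steps (placeholder)
    num_images: int = 1           # how many images to generate
    sampler: str = "k_lms"        # default sampler named in the article
    seed: Optional[int] = None    # fixed seed, or None for a random one

    def __post_init__(self) -> None:
        # Basic sanity checks; the web app's real limits are not known here.
        if self.width <= 0 or self.height <= 0:
            raise ValueError("width and height must be positive")
        if self.cfg_scale < 0:
            raise ValueError("cfg_scale must be non-negative")
        if self.steps < 1:
            raise ValueError("steps must be at least 1")
        if self.num_images < 1:
            raise ValueError("num_images must be at least 1")


if __name__ == "__main__":
    print(GenerationSettings(width=768, seed=42))
```

Reusing the same seed with the same settings and prompt is generally how diffusion tools reproduce an image, which is why the seed is kept alongside the other controls here.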

A guide on the site for “engineering prompts” recommends that users start with a raw prompt, such as “R2D2,” “Emma Watson,” or “soldier.”

Next, users can add a well-chosen style after the raw prompt, as in “raw prompt, hyperrealistic,” or with styles such as “oil painting” or “concept art.”

Then users can set the artistic style of the image by calling out a specific artist, such as “made by Pablo Picasso.” The guide offers a list of artists for inspiration.

Finally, users can put the finishing touches on their art with things like “highly detailed,” “dramatic lighting,” or “post-processing.”
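The four steps above (raw prompt, style, artist, finishing touches) amount to joining a few text fragments. Here is a minimal sketch of that recipe; the `build_prompt` helper and the comma-separated layout are assumptions for illustration, not something DreamStudio provides.

```python
from typing import Iterable, Optional


def build_prompt(
    raw_prompt: str,
    style: Optional[str] = None,
    artist: Optional[str] = None,
    finishing_touches: Iterable[str] = (),
) -> str:
    """Assemble a prompt from the four parts described in the guide.

    Hypothetical helper: parts are joined with commas, in the order
    raw prompt, style, artist, finishing touches.
    """
    raw = raw_prompt.strip()
    if not raw:
        raise ValueError("raw_prompt must not be empty")

    parts = [raw]
    if style:
        parts.append(style.strip())
    if artist:
        parts.append(f"made by {artist.strip()}")
    # Skip blank entries so stray empty strings do not add extra commas.
    parts.extend(t.strip() for t in finishing_touches if t.strip())
    return ", ".join(parts)


if __name__ == "__main__":
    print(
        build_prompt(
            "Emma Watson",
            style="hyperrealistic",
            artist="Pablo Picasso",
            finishing_touches=["highly detailed", "dramatic lighting"],
        )
    )
```

Keeping each part as a separate argument makes it easy to swap the artist or style while holding the raw prompt fixed, which is how the two example images in this post differ.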

For example, building on the Kanye West image I made above, here is an image I made today with the prompt, “Emma Watson as Tyrion Lannister in a Game of Thrones dream world, in the style of Chuck Close, hyperrealistic photograph with Unreal Engine lighting 8k.”


The images are released under the CC0 1.0 public domain dedication.

Here’s a sample tutorial for anyone who wants to try it out:

Artificial Conversation
