
How AI image generators like DALL-E, Lensa and Stable Diffusion work


A strange and powerful collaborator awaits you. Offer it a few words, and it will create an original scene based on your description.

These are images generated by artificial intelligence, an emerging technology now in the hands of anyone with a smartphone.

The results can be astonishing: sharp, beautiful, fantastical and sometimes strangely realistic. But they can also be muddy and grotesque: distorted faces, gibberish road signs and warped architecture.

How does it work? Keep scrolling to learn, step by step, how the process unfolds.

[Interactive style options: a photo, Van Gogh, stained glass, a magazine cover]

Like many cutting-edge technologies, AI-generated artworks raise a host of thorny legal, ethical and moral issues. The raw data used to train the models is pulled directly from the internet, leading image generators to replicate many of the biases found online. This means they can reinforce faulty assumptions about race, class, age and gender.

The datasets used for training also often include copyrighted images. This infuriates some artists and photographers whose work was ingested into the models without their permission or compensation.

[AI selfies — and their critics — are taking the internet by storm]

Meanwhile, the potential for creating and amplifying misinformation is enormous. That’s why it’s important to understand how the technology actually works, whether it’s creating a Van Gogh the artist never painted, or a scene from the January 6 attack on the US Capitol that never appeared in a photographer’s viewfinder.

Artificial intelligence technologies are racing ahead faster than society can anticipate and solve the problems they create.

About this story

The Washington Post generated the AI images featured in this article using Stable Diffusion 2.0. Each image was generated with the same parameters and seed, which means that the “noise” used as a starting point was the same for each image.
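
For readers curious how a fixed seed produces identical starting noise across prompts, here is a minimal sketch using the open-source Hugging Face diffusers library. The model name, prompts and seed value below are illustrative assumptions, not The Post’s actual configuration.

```python
# A minimal fixed-seed setup for Stable Diffusion 2.0 via the
# Hugging Face diffusers library (illustrative, not The Post's code).
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2",
    torch_dtype=torch.float16,
).to("cuda")

# Hypothetical prompts mirroring the article's style variations.
prompts = [
    "an olive grove, as a photo",
    "an olive grove, in the style of Van Gogh",
]

for prompt in prompts:
    # Re-seeding before every call makes the starting latent noise
    # identical for each prompt, so only the text guidance differs.
    generator = torch.Generator("cuda").manual_seed(1234)
    image = pipe(prompt, generator=generator, num_inference_steps=50).images[0]
    image.save(prompt.replace(" ", "_")[:40] + ".png")
```

Because the diffusion process is deterministic given the same noise, scheduler and settings, any differences between the output images come entirely from the prompt.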

The animations on this page show the actual denoising process. We modified the Stable Diffusion code to save intermediate frames as the denoising happened.
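
The Post modified the model code directly; a similar effect can be approximated without doing so by using the step-end callback available in recent versions of the diffusers library. This is a sketch under that assumption; the hook name and signature depend on the library version, and the prompt and seed are illustrative.

```python
# A sketch of capturing intermediate denoising frames with a recent
# diffusers release (approximates, but is not, The Post's method).
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2",
    torch_dtype=torch.float16,
).to("cuda")

def save_frame(pipeline, step, timestep, callback_kwargs):
    # Decode the partially denoised latents back to pixel space and
    # write one frame per sampler step.
    latents = callback_kwargs["latents"]
    with torch.no_grad():
        image = pipeline.vae.decode(
            latents / pipeline.vae.config.scaling_factor
        ).sample
    frame = pipeline.image_processor.postprocess(image)[0]
    frame.save(f"frame_{step:03d}.png")
    return callback_kwargs

generator = torch.Generator("cuda").manual_seed(1234)
pipe(
    "an olive grove, in the style of Van Gogh",  # hypothetical prompt
    generator=generator,
    num_inference_steps=50,
    callback_on_step_end=save_frame,
    callback_on_step_end_tensor_inputs=["latents"],
)
```

Stitching the saved frames together in order yields an animation of the image emerging from noise, much like the ones shown on this page.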

In addition to interviewing researchers and examining the diffusion model in detail, The Washington Post analyzed the images used to train Stable Diffusion for the training-data section of this story. The images selected for this explainer were either in the Stable Diffusion training data and in the public domain, licensed by The Post, or close approximations of images in the data. The image database used to train Stable Diffusion includes copyrighted images that we do not have the right to publish.

Edited by Karly Domb Sadof, Reuben Fischer-Baum and Anne Gerhart. Copy editing by Paola Ruano.
