How AI Image Generators Work (Stable Diffusion / Dall-E) - Computerphile

Published 2022-10-04

AI image generators are massive, but how are they creating such interesting images? Dr Mike Pound explains what's going on.

Thumbnail image partly created by DALL-E with the prompt: "Computerphile YouTube Video presenter Mike Pound Explains Diffusion AI methods thumbnail with green computer style title text on a black background with grey binary"

www.facebook.com/computerphile
twitter.com/computer_phile

This video was filmed and edited by Sean Riley.

Computer Science at the University of Nottingham: bit.ly/nottscomputer

Computerphile is a sister project to Brady Haran's Numberphile. More at www.bradyharan.com/

All Comments (21)

@InfinityDz 1 year ago

Glad to have finally found someone I can actually listen to about AI, someone that doesn't hype things up and isn't trying to sell me something.
@Qman621 1 year ago

Stable Diffusion doesn't actually apply noise to images; it uses a compressed, low-dimensional latent representation of the image and applies noise to that. The model runs in this abstract latent space, and the autoencoder then reconstructs the image afterwards.
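The pipeline shape described in this comment (encode, add noise in the latent, decode) can be sketched roughly as follows. This is only a toy illustration: `encode` and `decode` here are hypothetical stand-ins (2x2 average pooling and nearest-neighbour upsampling), not the learned autoencoder a real model uses, and the noise level `sigma` is an arbitrary assumption.

```python
import random

def encode(image):
    """Toy 'encoder' (assumption): 2x2 average pooling.
    A real latent diffusion model uses a learned autoencoder instead."""
    h, w = len(image), len(image[0])
    return [
        [(image[r][c] + image[r][c + 1] + image[r + 1][c] + image[r + 1][c + 1]) / 4.0
         for c in range(0, w, 2)]
        for r in range(0, h, 2)
    ]

def decode(latent):
    """Toy 'decoder' (assumption): nearest-neighbour upsampling to image size."""
    out = []
    for row in latent:
        wide = [v for v in row for _ in range(2)]  # repeat each value horizontally
        out.append(wide)
        out.append(list(wide))                     # repeat each row vertically
    return out

def add_latent_noise(latent, sigma, rng):
    """Add Gaussian noise to the latent values, not to the pixels."""
    return [[v + sigma * rng.gauss(0.0, 1.0) for v in row] for row in latent]

rng = random.Random(0)
image = [[float(r * 4 + c) for c in range(4)] for r in range(4)]  # 4x4 toy image
latent = encode(image)                           # 2x2: 4 values instead of 16
noisy_latent = add_latent_noise(latent, 0.5, rng)
reconstruction = decode(noisy_latent)            # back to 4x4 pixels
```

The point of the sketch is only that the noising step touches the small latent grid, while the full-resolution image appears again at the decode step.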
@wlockuz4467 1 year ago

Finally! Ever since Stable Diffusion was released I was looking for an explainer on how it worked that wasn't "Oh it generates images from noise" or something that went too deep into technical details that I didn't understand. Very beautifully explained, Dr. Mike Pound! Hope you do another video where you dive into the code, where we can see the parts which were visualized here. One thing that's still unclear to me is how the network was trained to relate text with images, and how it uses this information when actually producing images.
@ayushdhar 1 year ago

A deep dive on the google colab code would be amazing!
@kgsz 1 year ago

Came here by accident and man, aren't you the gifted one? I was engrossed in the video knowing barely anything about the technologies and techniques used, and I don't feel dumber -- that's an achievement :) Thanks again, will pop here often.
@beachdancer 1 year ago

the explanation sounds like magic. It is like a sculptor saying he just chips away pieces of the stone until he finds the horse hidden inside.
@Ultimatro 1 year ago

So Stable Diffusion is just the AI version of that sculpting joke: start with a big block and take away the parts that don't fit.
@aijeveryday_guy 1 year ago

I couldn't agree more! Since the release of Stable Diffusion, I've been searching for an explanation that strikes the right balance between simplicity and technicality. Your video did an excellent job of providing a clear understanding without overwhelming us with excessive technical details. Dr. Mike Pound, you have a remarkable talent for explaining complex topics in a beautifully straightforward manner!
@juliankandlhofer7553 1 year ago

Oh I DEFINITELY want to see Mike's deep dive into the code!
@myce-liam 1 year ago

Pounding that like button! You guys have inspired me to start an undergraduate degree in Cyber Security - thank you for all of your videos!
@carlborgen 1 year ago

Would have been nice to hear a bit more about the "GPT-style transformer embedding". Wouldn't those classifications have to be included in the training data already?
@serhat757 1 year ago

Can't believe Mike can effortlessly make that shape with his hand (little finger) at 5:37
@Mutual_Information 1 year ago

Add noise to images and train a model to undo that addition.. then you have something that maps from noise to images. One thing I find so impressive about these researchers.. is that they would try this. It’s so bizarre.. just because, from a distance, it’s not at all clear that such a task is doable.
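The "add noise, then learn to undo it" idea in this comment can be sketched as follows. This is a minimal illustration under stated assumptions: the linear noise schedule and its endpoint values follow a common choice in the diffusion literature rather than anything stated on this page, all function names are hypothetical, and the true noise is passed in where a trained network's prediction would normally go.

```python
import math
import random

def alpha_bar_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for a linear beta schedule (assumed)."""
    alpha_bars = []
    running = 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars

def add_noise(x0, alpha_bar, rng):
    """Return (noisy sample, noise used). The noise is what a network learns to predict."""
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    s, n = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    xt = [s * x + n * e for x, e in zip(x0, eps)]
    return xt, eps

def remove_noise(xt, eps_pred, alpha_bar):
    """Invert add_noise given a noise estimate; a trained network would supply eps_pred."""
    s, n = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [(x - n * e) / s for x, e in zip(xt, eps_pred)]

rng = random.Random(0)
alpha_bars = alpha_bar_schedule(1000)
image = [0.1, 0.5, 0.9, 0.3]          # toy "image": four pixel values
t = 500                               # an intermediate noise level
noisy, eps = add_noise(image, alpha_bars[t], rng)
recovered = remove_noise(noisy, eps, alpha_bars[t])  # uses the true noise as a perfect estimate
```

With a perfect noise estimate the original values come back exactly (up to floating-point error); in practice the estimate is imperfect, which is one reason generation is usually done as many small denoising steps rather than a single jump.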
@lpbaybee4942 1 year ago

The best compsci content on the internet, period.
@Jinjukei 1 year ago

Thank you so much for talking about this topic! Great and very enjoyable!
@danieletorrigiani 1 year ago

Wow! Had not seen listing paper since my dad was trying to teach me BASIC on a Commodore 64. Had no idea it was still a thing. Big jump from having to read code on paper to make sense of it to this.
@kevinb3300 1 year ago

Awesome video. That’s the clearest explanation I’ve seen. The hand drawn explanation explained it so perfectly. Would love to see a follow up video that goes through the code. Also would be awesome to include examples when talking about the muppets in the kitchen, etc.
@smivan. 1 year ago

Thank you for covering this topic!
@michipeka9973 1 year ago

Fantastic explanation!
@dileepvr 1 year ago

12:58 I'd like to hear more about that GPT-style transformer embedding of text. Was text part of the training set?