Project #5: Part A
CS180: Intro to Computer Vision and Computational Photography
Introduction
This project focuses on exploring the capabilities of diffusion models for various tasks, including image sampling, inpainting, and creating optical illusions.
It serves as the first part of a larger project, giving me hands-on experience with pretrained diffusion models, from implementing sampling loops to performing creative experiments with guided generation.
Part 0: Setup
I began by gaining access to the DeepFloyd IF diffusion model and setting up the environment. This involved creating a Hugging Face account, accepting usage conditions, and downloading precomputed text embeddings for ease of use on free-tier Colab GPUs.
Part 1: Sampling Loops
1.1 Implementing the Forward Process
I implemented a function that progressively adds noise to an image. This helped me understand the forward process that diffusion models are trained to reverse. Below are the results for the Campanile test image at different noise levels:
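The forward process is simply a weighted mix of the clean image and Gaussian noise. Below is a minimal sketch of this function, assuming alphas_cumprod is the scheduler's precomputed cumulative product of (1 - beta_t):

```python
import torch

def forward(im, t, alphas_cumprod):
    """Noise a clean image to timestep t:
    x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps,  eps ~ N(0, I).
    """
    alpha_bar = alphas_cumprod[t]
    eps = torch.randn_like(im)               # fresh Gaussian noise
    return alpha_bar.sqrt() * im + (1 - alpha_bar).sqrt() * eps
```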
1.2 Classical Denoising
Using Gaussian blur, I attempted to denoise the images from the previous step. While this method reduces noise, it struggles to recover the original details effectively. The results are displayed below, side by side for comparison:
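For reference, the classical baseline is just a Gaussian blur of each noisy image. A small sketch, where noisy_250, noisy_500, and noisy_750 are assumed to be the images produced in 1.1, and the kernel size and sigma are illustrative choices:

```python
import torchvision.transforms.functional as TF

# Blur each noisy Campanile image; kernel_size and sigma are illustrative.
blur_250 = TF.gaussian_blur(noisy_250, kernel_size=7, sigma=2.0)
blur_500 = TF.gaussian_blur(noisy_500, kernel_size=7, sigma=2.0)
blur_750 = TF.gaussian_blur(noisy_750, kernel_size=7, sigma=2.0)
```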
1.3 One-Step Denoising
Here, I used a pretrained diffusion model to estimate and remove noise from the images. This method produced much better results compared to Gaussian blur. Below are the noisy images, Gaussian blur results, and the diffusion model denoising results for comparison:
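Given the model's noise estimate, the clean image can be recovered in a single step by inverting the forward equation. A sketch, where unet(x_t, t) stands in for the pretrained stage-1 UNet's text-conditional noise prediction (an assumption about how the model call is wrapped):

```python
def one_step_denoise(x_t, t, unet, alphas_cumprod):
    """Solve the forward equation for x_0 using the predicted noise."""
    alpha_bar = alphas_cumprod[t]
    eps_hat = unet(x_t, t)                                       # predicted noise
    return (x_t - (1 - alpha_bar).sqrt() * eps_hat) / alpha_bar.sqrt()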
1.4 Iterative Denoising
I implemented an iterative denoising process to gradually refine noisy images into clean ones, which recovers far more detail than the single-step approach. For this part I completed the iterative_denoise function (a sketch follows this list) and produced the following deliverables:
- A strided_timesteps list of monotonically decreasing timesteps, starting at 990 with a stride of 30 and ending at 0, registered with the scheduler via stage_1.scheduler.set_timesteps(timesteps=strided_timesteps).
- The noisy image at every 5th loop of denoising (it gradually becomes less noisy).
- The final predicted clean image from iterative denoising.
- The predicted clean image from a single denoising step, as in the previous part (noticeably worse).
- The predicted clean image from Gaussian blurring, as in part 1.2.
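A minimal sketch of iterative_denoise, assuming unet(x, t) returns the predicted noise and alphas_cumprod comes from the scheduler; the added-variance term used by the full pipeline is omitted for brevity:

```python
strided_timesteps = list(range(990, -1, -30))   # 990, 960, ..., 30, 0

def iterative_denoise(x, strided_timesteps, unet, alphas_cumprod, i_start=0):
    """Step from strided_timesteps[i_start] down to 0, refining x at each step."""
    for i in range(i_start, len(strided_timesteps) - 1):
        t, t_prev = strided_timesteps[i], strided_timesteps[i + 1]
        a_bar, a_bar_prev = alphas_cumprod[t], alphas_cumprod[t_prev]
        alpha = a_bar / a_bar_prev
        beta = 1 - alpha

        eps_hat = unet(x, t)                                        # predicted noise
        x0_hat = (x - (1 - a_bar).sqrt() * eps_hat) / a_bar.sqrt()  # current clean estimate

        # DDPM-style interpolation between the clean estimate and the noisy image.
        x = (a_bar_prev.sqrt() * beta / (1 - a_bar)) * x0_hat \
            + (alpha.sqrt() * (1 - a_bar_prev) / (1 - a_bar)) * x
    return x
```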
Part 1.5: Diffusion Model Sampling
In this section, I generated images from pure noise using the iterative_denoise function. This demonstrates the diffusion model's ability to transform noise into realistic images.
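Sampling is the same loop started from pure noise at the largest timestep. A small usage sketch, reusing the iterative_denoise and strided_timesteps definitions above (the 64x64 resolution matches the stage-1 model):

```python
import torch

x = torch.randn(1, 3, 64, 64, device=device)   # pure noise at t = 990
sample = iterative_denoise(x, strided_timesteps, unet, alphas_cumprod, i_start=0)
```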
Part 1.6: Classifier-Free Guidance (CFG)
By implementing the iterative_denoise_cfg function, I was able to enhance the quality of generated images using CFG. This method significantly improved the realism of the results by combining conditional and unconditional noise estimates.
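The only change from the plain loop is how the noise estimate is formed. A sketch of the CFG combination, where the two calls stand in for the same UNet run with the text-prompt embedding and the empty-prompt embedding, and gamma = 7 is an illustrative guidance scale (gamma > 1 strengthens the conditioning):

```python
def cfg_noise_estimate(x, t, unet_cond, unet_uncond, gamma=7.0):
    """Classifier-free guidance: push the estimate past the conditional one."""
    eps_cond = unet_cond(x, t)       # noise estimate with the text prompt
    eps_uncond = unet_uncond(x, t)   # noise estimate with the empty prompt
    return eps_uncond + gamma * (eps_cond - eps_uncond)
```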
Part 1.7: Image-to-Image Translation
In this section, I explored the SDEdit algorithm to transform images by progressively denoising them with varying starting noise levels. The results highlight how noise influences the degree of transformation.
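SDEdit reuses the machinery above: noise the input image to an intermediate timestep, then denoise from there. A sketch building on the forward and iterative_denoise sketches (the function name and the CFG-free form are assumptions for brevity):

```python
def sdedit(im, i_start, strided_timesteps, unet, alphas_cumprod):
    """Edit a real image: noise it to strided_timesteps[i_start], then denoise.

    Smaller i_start means more noise and a larger departure from the input;
    larger i_start stays closer to the original.
    """
    t_start = strided_timesteps[i_start]
    x_t = forward(im, t_start, alphas_cumprod)
    return iterative_denoise(x_t, strided_timesteps, unet, alphas_cumprod, i_start=i_start)
```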
1.7.1 Campanile Edits
The Campanile test image was processed using noise levels starting at various indices. Higher starting noise levels led to more dramatic transformations, while lower noise levels resulted in subtle edits closer to the original image.
1.7.2 Hand-Drawn Sketches
Hand-drawn sketches were processed through the same method, transforming them into more realistic images while retaining some features of the original drawings. Each noise level progressively reshaped the image to align with the model's learned image manifold.
1.7.3 Web Images
Web images were similarly processed, showcasing how the model can adapt diverse sources into refined outputs. By varying the starting noise levels, the images transitioned from abstract or distorted versions back to realistic representations.
Part 1.8: Visual Anagrams
By implementing the visual_anagrams function, I created optical illusions where an image changes appearance when flipped upside down.
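The trick is to average two noise estimates: one for the first prompt on the image, and one for the second prompt on the vertically flipped image (flipped back before averaging). A sketch of that combination, with unet(x, t, emb) standing in for the text-conditional noise prediction (CFG omitted for brevity):

```python
import torch

def visual_anagram_noise(x, t, unet, emb_right_side_up, emb_upside_down):
    """Average the upright and flipped noise estimates (NCHW layout assumed)."""
    eps1 = unet(x, t, emb_right_side_up)
    x_flipped = torch.flip(x, dims=[2])                       # flip along height
    eps2 = torch.flip(unet(x_flipped, t, emb_upside_down), dims=[2])
    return (eps1 + eps2) / 2
```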
Part 1.9: Hybrid Images
In this section, I created hybrid images that appear differently depending on the viewer's distance, using low-pass and high-pass filters to combine features from two images.
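Here the two noise estimates are combined in frequency space instead: the low frequencies follow the prompt meant to be seen from far away, and the high frequencies follow the close-up prompt. A sketch; the Gaussian kernel size and sigma are illustrative assumptions:

```python
import torchvision.transforms.functional as TF

def hybrid_noise(x, t, unet, emb_far, emb_near, kernel_size=33, sigma=2.0):
    """Low-pass one noise estimate, high-pass the other, and add them."""
    eps_far = unet(x, t, emb_far)
    eps_near = unet(x, t, emb_near)
    low = TF.gaussian_blur(eps_far, kernel_size=kernel_size, sigma=sigma)
    high = eps_near - TF.gaussian_blur(eps_near, kernel_size=kernel_size, sigma=sigma)
    return low + high
```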
Part 2: Diffusion Models from Scratch
In Part 2, I trained a diffusion model from scratch on the MNIST dataset. This involved implementing a UNet architecture, training it as a denoiser, and using it iteratively for diffusion-based image generation. The tasks progressed from single-step denoising to time-conditioned and class-conditioned UNet models.
2.1 Single-Step Denoising
I implemented a UNet architecture for single-step denoising. The model was trained on noisy MNIST images to reconstruct clean images by optimizing the L2 loss. Below are visualizations of the noising process and the results after training for 1 and 5 epochs.
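A minimal sketch of the training step, assuming unet maps a noisy 28x28 digit back to its clean version and sigma = 0.5 is the training noise level:

```python
import torch
import torch.nn.functional as F

def denoiser_train_step(unet, x, optimizer, sigma=0.5):
    """One optimization step: noise the clean batch, regress the clean images."""
    z = x + sigma * torch.randn_like(x)   # noisy input
    loss = F.mse_loss(unet(z), x)         # L2 reconstruction loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```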
Visualization of Noising Process
Training Loss Curve
Results After Training
After Epoch 1
2.2 Time-Conditioned UNet
I extended the UNet architecture to include time conditioning, enabling it to handle varying noise levels. The model was trained to predict noise given a noisy image and its timestep. Below are the training loss curve and sampling results at different epochs.
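The objective changes from reconstructing the image to predicting the injected noise at a random timestep. A sketch, assuming the UNet takes the normalized timestep t / T as its conditioning input, alphas_cumprod is the precomputed DDPM schedule, and T = 300 total timesteps (an assumption):

```python
import torch
import torch.nn.functional as F

def ddpm_train_step(unet, x, optimizer, alphas_cumprod, T=300):
    """One step of noise-prediction training for the time-conditioned UNet."""
    t = torch.randint(0, T, (x.shape[0],), device=x.device)   # random timestep per image
    a_bar = alphas_cumprod[t].view(-1, 1, 1, 1)
    eps = torch.randn_like(x)
    x_t = a_bar.sqrt() * x + (1 - a_bar).sqrt() * eps          # forward process

    loss = F.mse_loss(unet(x_t, t.float() / T), eps)           # predict the noise
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```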
Training Loss Curve
Sampling Results
After Epoch 5
After Epoch 20
2.3 Class-Conditioned UNet
To improve generation control, I added class conditioning to the UNet architecture. This involved modifying the architecture to take a one-hot encoded class vector along with the timestep. Below are the training loss curve and sampling results with classifier-free guidance.
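The class-conditioned variant adds a one-hot label input and randomly drops it during training so the model also learns the unconditional estimate needed for classifier-free guidance at sampling time. A sketch, with the 10% drop probability and the unet(x_t, t, c) signature as assumptions:

```python
import torch
import torch.nn.functional as F

def class_cond_train_step(unet, x, y, optimizer, alphas_cumprod, T=300, p_uncond=0.1):
    """One training step with one-hot class conditioning and condition dropout."""
    c = F.one_hot(y, num_classes=10).float()
    drop = (torch.rand(x.shape[0], device=x.device) < p_uncond).float().unsqueeze(1)
    c = c * (1 - drop)                                         # zero the label sometimes

    t = torch.randint(0, T, (x.shape[0],), device=x.device)
    a_bar = alphas_cumprod[t].view(-1, 1, 1, 1)
    eps = torch.randn_like(x)
    x_t = a_bar.sqrt() * x + (1 - a_bar).sqrt() * eps

    loss = F.mse_loss(unet(x_t, t.float() / T, c), eps)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```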
Training Loss Curve
Sampling Results
After Epoch 5
After Epoch 20