Week 1: VAEs, GANs & Latent Space Representations
Master generative modeling: Variational Autoencoders, Generative Adversarial Networks, Diffusion Models, and Flow-based models — theory and PyTorch implementation.
- Derive the VAE evidence lower bound (ELBO)
- Train a GAN and understand mode collapse
- Implement a DDPM diffusion model
- Visualize and interpolate in latent spaces
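As a starting point for the last objective above, here is a minimal sketch of interpolating between two latent codes. The helper names (`lerp`, `slerp`, `interpolation_path`) are illustrative and not part of any course codebase; the sketch assumes 1-D latent vectors and that you would pass each interpolated code through your own decoder or generator.

```python
import torch


def lerp(z0: torch.Tensor, z1: torch.Tensor, t) -> torch.Tensor:
    """Straight-line interpolation between two latent vectors."""
    return (1.0 - t) * z0 + t * z1


def slerp(z0: torch.Tensor, z1: torch.Tensor, t, eps: float = 1e-6) -> torch.Tensor:
    """Spherical interpolation between two 1-D latent vectors."""
    # Angle between the two directions (clamped so acos stays finite).
    cos_omega = torch.dot(z0 / z0.norm(), z1 / z1.norm())
    omega = torch.acos(torch.clamp(cos_omega, -1.0 + eps, 1.0 - eps))
    sin_omega = torch.sin(omega)
    w0 = torch.sin((1.0 - t) * omega) / sin_omega
    w1 = torch.sin(t * omega) / sin_omega
    return w0 * z0 + w1 * z1


def interpolation_path(z0, z1, steps: int = 8, mode: str = "slerp") -> torch.Tensor:
    """Return a (steps, latent_dim) tensor of codes from z0 to z1."""
    fn = slerp if mode == "slerp" else lerp
    ts = torch.linspace(0.0, 1.0, steps)
    return torch.stack([fn(z0, z1, t) for t in ts])
```

Spherical interpolation is often preferred when the prior is a standard Gaussian, because straight-line midpoints have a smaller norm than typical samples; whether that matters for a given model is something to check empirically.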
This first lecture establishes the foundational framework for generative models. By the end of the session, you will have the conceptual grounding and the practical starting point needed for the rest of the course.
Key Concepts
The lecture introduces the four main pillars of this course: (1) Autoencoders & Latent Representations; (2) Variational Autoencoders (VAE) & ELBO; (3) Generative Adversarial Networks (GAN); and (4) Diffusion Models (DDPM & Score Matching). Each is explored in depth over the 14-week curriculum, with hands-on projects reinforcing the theory at every stage.
This Week's Focus
Focus on mastering two topics: Autoencoders & Latent Representations, and Variational Autoencoders (VAE) & ELBO. These are the prerequisites for everything in Week 2. The concepts build on each other, so do not skip the practice exercises.
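To make the VAE objective concrete, the following is a minimal sketch of the reparameterization trick and the negative ELBO as a training loss. It assumes a diagonal-Gaussian encoder that outputs `mu` and `logvar`, a standard normal prior, and a Bernoulli decoder that outputs logits for inputs scaled to [0, 1]. The function names are illustrative, and other decoder likelihoods would change the reconstruction term.

```python
import torch
import torch.nn.functional as F


def reparameterize(mu: torch.Tensor, logvar: torch.Tensor) -> torch.Tensor:
    """Sample z = mu + sigma * eps with eps ~ N(0, I), keeping gradients w.r.t. mu and logvar."""
    std = torch.exp(0.5 * logvar)
    eps = torch.randn_like(std)
    return mu + std * eps


def negative_elbo(x_logits, x, mu, logvar):
    """Return (loss, reconstruction term, KL term), each averaged over the batch."""
    # Reconstruction term: -log p(x|z) under a Bernoulli decoder, summed over pixels.
    recon = F.binary_cross_entropy_with_logits(x_logits, x, reduction="none")
    recon = recon.flatten(1).sum(dim=1)
    # Closed-form KL(q(z|x) || N(0, I)) for a diagonal Gaussian posterior.
    kl = -0.5 * torch.sum(1.0 + logvar - mu.pow(2) - logvar.exp(), dim=1)
    return (recon + kl).mean(), recon.mean(), kl.mean()
```

Returning the two terms separately makes it easy to log them during training and to spot cases where the KL term collapses toward zero.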
AI303 Project 1: Conditional Image Generator
Build a conditional GAN (cGAN) or conditional VAE (cVAE) that generates images conditioned on class labels. Train on CIFAR-10 or CelebA. Evaluate with FID and Inception Score.
- cGAN or cVAE PyTorch implementation
- Latent space visualization (t-SNE / UMAP)
- FID score and Inception Score evaluation
- Interpolation and conditional generation samples
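For the FID deliverable, it may help to see the Fréchet distance itself in isolation. FID fits a Gaussian to Inception-v3 features of real and generated images and compares the two Gaussians. The sketch below covers only that final comparison and assumes you already have two `(N, D)` feature matrices. The function name `frechet_distance` is illustrative, and for reported numbers an established FID implementation is the safer choice.

```python
import torch


def frechet_distance(feats_real: torch.Tensor, feats_fake: torch.Tensor) -> float:
    """Frechet distance between Gaussians fitted to two (N, D) feature sets."""
    feats_real = feats_real.double()
    feats_fake = feats_fake.double()
    mu_r, mu_f = feats_real.mean(dim=0), feats_fake.mean(dim=0)
    # torch.cov expects variables in rows, observations in columns.
    cov_r = torch.cov(feats_real.T)
    cov_f = torch.cov(feats_fake.T)
    # Tr((cov_r cov_f)^(1/2)) equals the sum of square roots of the
    # eigenvalues of cov_r @ cov_f, which are real and non-negative in theory;
    # the clamp guards against small negative values from round-off.
    eigvals = torch.linalg.eigvals(cov_r @ cov_f).real.clamp(min=0.0)
    trace_sqrt = eigvals.sqrt().sum()
    dist = (mu_r - mu_f).pow(2).sum() + torch.trace(cov_r) + torch.trace(cov_f) - 2.0 * trace_sqrt
    return dist.item()
```

Two identical feature sets give a distance of approximately zero, and shifting every feature by a constant changes only the mean term.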
The questions below represent the style and difficulty of what you'll see on the midterm and final. Start thinking about them now.
Derive the ELBO for a VAE. What does each term (reconstruction + KL) represent?
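As a reference point for the ELBO question, one standard route goes through Jensen's inequality. The notation here is an assumption, since the page does not fix one: $q_\phi(z \mid x)$ is the encoder, $p_\theta(x \mid z)$ the decoder, and $p(z)$ the prior.

```latex
\begin{align}
\log p_\theta(x)
  &= \log \int p_\theta(x \mid z)\, p(z)\, dz
   = \log \mathbb{E}_{q_\phi(z \mid x)}\!\left[ \frac{p_\theta(x \mid z)\, p(z)}{q_\phi(z \mid x)} \right] \\
  &\ge \mathbb{E}_{q_\phi(z \mid x)}\!\left[ \log \frac{p_\theta(x \mid z)\, p(z)}{q_\phi(z \mid x)} \right]
   \qquad \text{(Jensen's inequality)} \\
  &= \underbrace{\mathbb{E}_{q_\phi(z \mid x)}\!\left[ \log p_\theta(x \mid z) \right]}_{\text{reconstruction}}
   \;-\; \underbrace{D_{\mathrm{KL}}\!\left( q_\phi(z \mid x) \,\|\, p(z) \right)}_{\text{regularization}}
\end{align}
```

The gap between $\log p_\theta(x)$ and the bound is $D_{\mathrm{KL}}\!\left( q_\phi(z \mid x) \,\|\, p_\theta(z \mid x) \right)$, so the bound is tight exactly when the encoder matches the true posterior.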
What is mode collapse in GANs? Describe two training techniques that mitigate it.
Explain the score function in score-based diffusion models. How does it relate to DDPM?
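For the score-function question, the link to DDPM can be seen by differentiating the log-density of the forward noising kernel. The sketch below assumes the usual DDPM notation, which this page does not define: $\bar\alpha_t$ is the cumulative product of the noise schedule and $\epsilon_\theta$ is the noise-prediction network.

```latex
\begin{align}
q(x_t \mid x_0) &= \mathcal{N}\!\left( x_t;\ \sqrt{\bar\alpha_t}\, x_0,\ (1 - \bar\alpha_t)\, I \right),
  \qquad x_t = \sqrt{\bar\alpha_t}\, x_0 + \sqrt{1 - \bar\alpha_t}\, \epsilon,\quad \epsilon \sim \mathcal{N}(0, I) \\
\nabla_{x_t} \log q(x_t \mid x_0)
  &= -\frac{x_t - \sqrt{\bar\alpha_t}\, x_0}{1 - \bar\alpha_t}
   = -\frac{\epsilon}{\sqrt{1 - \bar\alpha_t}} \\
s_\theta(x_t, t) &\approx -\frac{\epsilon_\theta(x_t, t)}{\sqrt{1 - \bar\alpha_t}}
\end{align}
```

Under these assumptions, predicting the added noise and estimating the score differ only by a time-dependent scale factor.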