Detailed Syllabus and Lectures


Lecture 14: Pretraining for Vision and Language (slides) (video)

feature representations for vision and language, model architectures, pre-training tasks, downstream tasks, what's next

Please study the following material in preparation for the class:

Required Reading:

Suggested Video Material:


Additional Resources:


Lecture 13: Pretraining Language Models (slides) (video 1) (video 2)

RNN-based language models, contextualized word embeddings, scaling up generative pretraining (GPT-1, GPT-2, GPT-3), masked language modeling and BERT-based models
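
The masked language modeling objective covered here can be illustrated with a tiny corruption routine. This is a sketch of the idea only, not the lecture's code: the function name, the single `[MASK]` replacement strategy, and the fixed masking probability are my own simplifications (BERT's actual recipe also sometimes keeps or randomly replaces selected tokens).

```python
import random

MASK = "[MASK]"

def mask_tokens(tokens, mask_prob, rng):
    """BERT-style masking sketch: replace a random subset of tokens with
    [MASK] and record the original tokens at those positions as targets."""
    corrupted, targets = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            corrupted.append(MASK)
            targets[i] = tok  # the model is trained to predict this token
        else:
            corrupted.append(tok)
    return corrupted, targets
```

The model then receives the corrupted sequence and is trained with cross-entropy only at the masked positions.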

Please study the following material in preparation for the class:

Required Reading (more stars denote higher priority):

Suggested Video Material:


Additional Resources:


Lecture 12: Self-Supervised Learning (slides) (video 1) (video 2)

denoising autoencoder, in-painting, colorization, split-brain autoencoder; proxy tasks in computer vision: relative patch prediction, jigsaw puzzles, rotations; contrastive learning: word2vec, contrastive predictive coding, instance discrimination, current instance discrimination models
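
The contrastive objectives in this lecture (CPC, instance discrimination) share one core loss, InfoNCE: classify the positive pair against negatives using similarity scores. A minimal pure-Python sketch, with my own function name and a dot-product similarity for simplicity (practical systems use cosine similarity of normalized embeddings):

```python
import math

def info_nce(query, positive, negatives, temperature=0.1):
    """InfoNCE: negative log-probability of picking the positive among
    {positive} + negatives under a softmax over similarity scores."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))
    scores = [dot(query, positive) / temperature] + \
             [dot(query, n) / temperature for n in negatives]
    m = max(scores)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return -(scores[0] - log_z)
```

When the positive is much more similar to the query than every negative, the loss approaches zero; when it is indistinguishable from a negative, the loss approaches log of the number of candidates.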

Please study the following material in preparation for the class:

Required Reading:

Suggested Video Material:


Additional Resources:


Lecture 11: Strengths and Weaknesses of Current Models (slides) (video 1) (video 2)

a critique of autoregressive models, flow-based models, latent variable models, and implicit models

Please study the following material in preparation for the class:

Suggested Video Material:


Additional Resources:


Lecture 10: Discrete Latent Variable Models (slides) (video 1) (video 2)

REINFORCE, Gumbel-Softmax, straight-through estimator, neural variational inference and learning, vector-quantized VAE (VQ-VAE), VQ-VAE-2, VQ-GAN, discrete flows, integer discrete flows; GANs for text: SeqGAN, MaskGAN, ScratchGAN
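
The Gumbel-Softmax trick from this lecture fits in a few lines: perturb the logits with Gumbel noise and take a temperature-controlled softmax, giving a differentiable relaxation of a categorical sample. A pure-Python sketch under my own naming (deep learning libraries ship equivalents, e.g. a `gumbel_softmax` function in PyTorch):

```python
import math, random

def gumbel_softmax(logits, tau, rng):
    """Differentiable relaxation of categorical sampling:
    softmax((logits + Gumbel noise) / tau)."""
    # Gumbel(0, 1) samples via -log(-log(U)), U ~ Uniform(0, 1)
    g = [-math.log(-math.log(rng.random())) for _ in logits]
    ys = [(l + gi) / tau for l, gi in zip(logits, g)]
    m = max(ys)  # stabilize the softmax
    exps = [math.exp(y - m) for y in ys]
    s = sum(exps)
    return [e / s for e in exps]
```

As the temperature tau goes to zero the output approaches a one-hot sample; larger tau gives smoother, lower-variance but more biased relaxations.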

Please study the following material in preparation for the class (more stars denote higher priority):

Required Reading:

Suggested Video Material:


Additional Resources:


Lecture 8-9: Generative Adversarial Networks (slides) (video 1, 2, 3, 4)

implicit models, generative adversarial networks (GANs), evaluation metrics, theory behind GANs, GAN architectures, conditional GANs, cycle-consistent adversarial networks, representation learning in GANs, applications
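
The GAN training objective in this lecture reduces to two binary cross-entropy terms once the discriminator outputs probabilities. A minimal sketch with my own function names, using the non-saturating generator loss (the variant commonly trained in practice rather than the original minimax form):

```python
import math

def bce(p, label):
    # Binary cross-entropy for one predicted probability p against a 0/1 label.
    return -math.log(p) if label == 1 else -math.log(1.0 - p)

def discriminator_loss(d_real, d_fake):
    """D is pushed to output 1 on real samples and 0 on generated ones."""
    return bce(d_real, 1) + bce(d_fake, 0)

def generator_loss(d_fake):
    """Non-saturating generator loss: push D's output on fakes toward 1."""
    return bce(d_fake, 1)
```

At the theoretical equilibrium the discriminator outputs 0.5 everywhere, giving a discriminator loss of 2 log 2.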

Please study the following material in preparation for the class:

Required Reading (more stars denote higher priority):

Suggested Video Material:


Additional Resources:


Lecture 7: Variational Autoencoders (slides) (video 1) (video 2)

latent variable models, variational autoencoders, importance weighted autoencoders, variational lower bound/evidence lower bound, likelihood ratio gradients vs. reparameterization trick gradients, Beta-VAE, variational dequantization
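
Two pieces of this lecture have compact closed forms: the reparameterization trick and the KL term of the ELBO for a diagonal Gaussian posterior against a standard normal prior. A scalar pure-Python sketch, with function names of my own choosing:

```python
import math, random

def reparameterize(mu, log_var, rng):
    """Reparameterization trick: z = mu + sigma * eps with eps ~ N(0, 1),
    so gradients flow through mu and log_var instead of through sampling."""
    eps = rng.gauss(0.0, 1.0)
    return mu + math.exp(0.5 * log_var) * eps

def kl_to_standard_normal(mu, log_var):
    """Closed-form KL(N(mu, sigma^2) || N(0, 1)), the regularizer in the ELBO:
    0.5 * (sigma^2 + mu^2 - 1 - log sigma^2)."""
    return 0.5 * (math.exp(log_var) + mu * mu - 1.0 - log_var)
```

The full (negative) ELBO is this KL term plus a reconstruction loss evaluated at the sampled z; the trick is what makes that sample differentiable with respect to the encoder outputs.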

Please study the following material in preparation for the class:

Required Reading (more stars denote higher priority):

Suggested Video Material:


Additional Resources:


Lecture 6: Normalizing Flow Models (slides) (video 1) (video 2)

1-D flows, change of variables, autoregressive flows, inverse autoregressive flows, affine flows, RealNVP, Glow, Flow++, FFJORD, multi-scale flows, dequantization
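
The change-of-variables formula underlying all of these models is easiest to see in 1-D with an affine flow. A sketch under my own naming: if x = mu + s·z with z drawn from a standard normal, then log p(x) is the base density at the inverted point minus the log of the Jacobian magnitude |s|:

```python
import math

def affine_flow_logpdf(x, mu, s):
    """log p(x) for x = mu + s * z, z ~ N(0, 1), via change of variables:
    log p(x) = log N((x - mu) / s; 0, 1) - log|s|."""
    z = (x - mu) / s          # invert the flow
    log_base = -0.5 * (z * z + math.log(2 * math.pi))
    return log_base - math.log(abs(s))
```

Models such as RealNVP and Glow compose many such invertible maps (with richer parameterizations) and sum the per-layer log-Jacobian terms.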

Please study the following material in preparation for the class:

Required Reading:

Suggested Video Material:


Additional Resources:


Lecture 5: Autoregressive Models (slides) (video 1) (video 2)

histograms as simple generative models, parameterized distributions and maximum likelihood, RNN-based autoregressive models, masking-based autoregressive models
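
The lecture's starting point, a histogram as a generative model, is small enough to write out: the maximum-likelihood categorical distribution is just the empirical frequencies. A sketch with my own function names:

```python
import math
from collections import Counter

def fit_histogram(data):
    """MLE of a categorical distribution over observed symbols:
    p(x) = count(x) / N."""
    counts = Counter(data)
    n = len(data)
    return {x: c / n for x, c in counts.items()}

def log_likelihood(model, data):
    """Total log-likelihood of data under the fitted histogram."""
    return sum(math.log(model[x]) for x in data)
```

Autoregressive models generalize this by factorizing p(x_1, ..., x_T) into per-step conditionals, each of which is such a (context-dependent) categorical distribution.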

Please study the following material in preparation for the class:

Required Reading:

Suggested Video Material:


Additional Resources:


Lecture 4: Neural Building Blocks III: Attention and Transformers (slides) (video)

content-based attention, location-based attention, soft vs. hard attention, self-attention, attention for image captioning, transformer networks
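
The self-attention mechanism at the heart of this lecture is a weighted average of value vectors, with weights given by a softmax over scaled query-key dot products. An unbatched pure-Python sketch (names are mine; real implementations are batched matrix products):

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention: each query attends over all keys and
    returns the attention-weighted average of the values."""
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out
```

Self-attention is the special case where queries, keys, and values are all (linear projections of) the same sequence.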

Please study the following material in preparation for the class:

Required Reading:

Suggested Video Material:


Additional Resources:


Lecture 3: Neural Building Blocks II: Sequential Processing with Recurrent Neural Networks (slides) (video)

sequence modeling, recurrent neural networks (RNNs), RNN applications, vanilla RNN, training RNNs, long short-term memory (LSTM), LSTM variants, gated recurrent unit (GRU)

Please study the following material in preparation for the class:

Required Reading:

Suggested Video Material:


Additional Resources:


Lecture 2: Neural Building Blocks I: Spatial Processing with CNNs (slides) (video)

deep learning, computation in a neural net, optimization, backpropagation, convolutional neural networks, residual connections, training tricks
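
The convolution operation this lecture builds on can be shown with a naive sliding-window implementation. A sketch under my own naming, computing "valid" cross-correlation (what deep learning frameworks call convolution), without padding, stride, or channels:

```python
def conv2d(image, kernel):
    """'Valid' 2-D cross-correlation of a single-channel image with a kernel:
    slide the kernel over every position and take the elementwise dot product."""
    H, W = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(H - kh + 1):
        row = []
        for j in range(W - kw + 1):
            row.append(sum(image[i + di][j + dj] * kernel[di][dj]
                           for di in range(kh) for dj in range(kw)))
        out.append(row)
    return out
```

A CNN learns the kernel entries by backpropagation and stacks many such layers, with the residual connections and training tricks listed above making deep stacks optimizable.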

Please study the following material in preparation for the class:

Required Reading:

Suggested Video Material:


Additional Resources:



Lecture 1: Introduction to the Course (slides) (video)

course information, unsupervised learning

Please study the following material in preparation for the class:

Required Reading: