Detailed Syllabus and Lectures
Lecture 13: Self-supervised Learning (slides)
what is self-supervised learning, self-supervised learning in NLP, self-supervised learning in vision, multimodal self-supervised learning
Please study the following material in preparation for the class:
Required Reading:
Suggested Video Material:
Additional Resources:
- Distributed Representations of Words and Phrases and their Compositionality, Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg Corrado, Jeffrey Dean, NIPS 2013.
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova, NAACL 2019.
- Context Encoders: Feature Learning by Inpainting, Deepak Pathak, Philipp Krähenbühl, Jeff Donahue, Trevor Darrell, Alexei A. Efros, CVPR 2016.
- Unsupervised Visual Representation Learning by Context Prediction, Carl Doersch, Abhinav Gupta, Alexei A. Efros, ICCV 2015.
- Unsupervised Representation Learning by Predicting Image Rotations, Spyros Gidaris, Praveer Singh, Nikos Komodakis, ICLR 2018.
- Representation Learning with Contrastive Predictive Coding, Aaron van den Oord, Yazhe Li, Oriol Vinyals, arXiv preprint arXiv:1807.03748, 2018.
- A Simple Framework for Contrastive Learning of Visual Representations, Ting Chen, Simon Kornblith, Mohammad Norouzi, Geoffrey Hinton, arXiv preprint arXiv:2002.05709, 2020.
- Revisiting Self-Supervised Visual Representation Learning, Alexander Kolesnikov, Xiaohua Zhai, Lucas Beyer, CVPR 2019.
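To make the contrastive objective from the SimCLR reading concrete, here is a minimal NumPy sketch of the NT-Xent loss over two augmented views of a batch (an illustrative sketch, not course code; all names and shapes are made up):

```python
import numpy as np

def nt_xent(z1, z2, tau=0.5):
    """Illustrative NT-Xent loss (SimCLR): z1, z2 are (N, d) embeddings
    of two augmented views of the same N images."""
    z = np.concatenate([z1, z2], axis=0)              # (2N, d)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # cosine similarity
    sim = z @ z.T / tau                               # (2N, 2N) scores
    np.fill_diagonal(sim, -np.inf)                    # mask self-similarity
    n = len(z1)
    # the positive for example i is its other view, i +/- N
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(0, n)])
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(2 * n), pos].mean()

z1, z2 = np.random.randn(8, 32), np.random.randn(8, 32)
print(nt_xent(z1, z2))
```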
Lecture 12: Variational Autoencoders, Denoising Diffusion Models (slides)
variational autoencoders (VAEs), vector quantized variational autoencoders (VQ-VAEs), denoising diffusion models
Please study the following material in preparation for the class:
Required Reading:
Suggested Video Material:
Additional Resources:
- [Blog post] Intuitively Understanding Variational Autoencoders, Irhum Shafkat.
- [Blog post] A Beginner's Guide to Variational Methods: Mean-Field Approximation, Eric Jang.
- [Blog post] Tutorial - What is a variational autoencoder?, Jaan Altosaar.
- beta-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework, Irina Higgins, Loic Matthey, Arka Pal, Christopher Burgess, Xavier Glorot, Matthew Botvinick, Shakir Mohamed, Alexander Lerchner, ICLR 2017.
- Challenging Common Assumptions in the Unsupervised Learning of Disentangled Representations, Francesco Locatello, Stefan Bauer, Mario Lucic, Gunnar Rätsch, Sylvain Gelly, Bernhard Schölkopf, Olivier Bachem, ICML 2019.
- Generating Diverse High-Fidelity Images with VQ-VAE-2, Ali Razavi, Aaron van den Oord, Oriol Vinyals, NeurIPS 2019.
- [Blog post] What are Diffusion Models?, Lilian Weng.
- High-Resolution Image Synthesis with Latent Diffusion Models, Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, Björn Ommer, CVPR 2022.
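For intuition, a minimal NumPy sketch of two VAE building blocks covered in class, the reparameterization trick and the Gaussian KL term of the ELBO (illustrative only, assuming a Gaussian encoder; all names are made up):

```python
import numpy as np

rng = np.random.default_rng(0)

def reparameterize(mu, log_var):
    """Sample z ~ N(mu, sigma^2) differentiably: z = mu + sigma * eps."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, sigma^2) || N(0, I) ), the regularizer in the ELBO."""
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var, axis=1)

mu, log_var = rng.standard_normal((4, 2)), rng.standard_normal((4, 2))
z = reparameterize(mu, log_var)
print(z.shape, kl_to_standard_normal(mu, log_var))
```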
Lecture 11: Autoregressive and Flow Models (slides)
autoregressive models, normalizing flows
Please study the following material in preparation for the class:
Required Reading:
Suggested Video Material:
Additional Resources:
- Pixel Recurrent Neural Networks, Aaron van den Oord, Nal Kalchbrenner, Koray Kavukcuoglu, ICML 2016.
- Conditional Image Generation with PixelCNN Decoders, Aaron van den Oord, Nal Kalchbrenner, Oriol Vinyals, Lasse Espeholt, Alex Graves, Koray Kavukcuoglu, NIPS 2016.
- Unsupervised Feature Learning and Deep Learning, Andrew Ng.
- [Blog post] Unsupervised Sentiment Neuron, Alec Radford, Ilya Sutskever, Rafal Jozefowicz, Jack Clark and Greg Brockman.
- Normalizing Flows: An Introduction and Review of Current Methods, Ivan Kobyzev, Simon J.D. Prince, and Marcus A. Brubaker, arXiv preprint arXiv:1908.09257, 2020.
- Normalizing Flows for Probabilistic Modeling and Inference, George Papamakarios, Eric Nalisnick, Danilo Jimenez Rezende, Shakir Mohamed, Balaji Lakshminarayanan, arXiv preprint arXiv:1912.02762, 2019.
- [Blog post] Glow: Better Reversible Generative Models, OpenAI.
- Density estimation using Real NVP, Laurent Dinh, Jascha Sohl-Dickstein, Samy Bengio, ICLR 2017.
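As a concrete companion to the Real NVP reading, a minimal NumPy sketch of one affine coupling layer and its log-determinant (illustrative only; the toy linear "networks" are stand-ins for learned MLPs):

```python
import numpy as np

def affine_coupling(x, scale_net, shift_net):
    """One Real NVP-style coupling layer: split x, transform half of it
    conditioned on the other half. Returns y and log|det Jacobian|."""
    x1, x2 = np.split(x, 2, axis=1)
    s, t = scale_net(x1), shift_net(x1)  # arbitrary functions of x1
    y2 = x2 * np.exp(s) + t
    y = np.concatenate([x1, y2], axis=1)
    log_det = s.sum(axis=1)              # Jacobian is triangular
    return y, log_det

# toy "networks": fixed linear maps, purely illustrative
W_s, W_t = np.random.randn(3, 3) * 0.1, np.random.randn(3, 3) * 0.1
x = np.random.randn(5, 6)
y, log_det = affine_coupling(x, lambda h: h @ W_s, lambda h: h @ W_t)
print(y.shape, log_det.shape)
```

The inverse is just as cheap (x2 = (y2 - t) * exp(-s)), which is what makes coupling layers attractive for exact-likelihood flows.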
Lecture 10: Generative Adversarial Networks (slides)
unsupervised representation learning, generative adversarial networks (GANs), conditional GANs, applications of GANs
Please study the following material in preparation for the class:
Required Reading:
- Chapter #13 of the Deep Learning textbook.
- NIPS 2016 Tutorial: Generative Adversarial Networks, Ian Goodfellow.
- Generative Adversarial Networks: An Overview, Antonia Creswell, Tom White, Vincent Dumoulin, Kai Arulkumaran, Biswa Sengupta, Anil A Bharath, IEEE Signal Processing Magazine, 2018.
- How to Train a GAN? Tips and tricks to make GANs work, Soumith Chintala, Emily Denton, Martin Arjovsky, Michael Mathieu.
Suggested Video Material:
Additional Resources:
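For intuition, a minimal NumPy sketch of the standard discriminator loss and the non-saturating generator loss from Goodfellow's tutorial (illustrative only; the random scores stand in for a real discriminator's outputs):

```python
import numpy as np

def d_loss(d_real, d_fake):
    """Discriminator objective: push D(x) -> 1 on real, D(G(z)) -> 0 on fake."""
    return -(np.log(d_real) + np.log(1.0 - d_fake)).mean()

def g_loss_nonsaturating(d_fake):
    """Non-saturating generator loss -log D(G(z)), preferred in practice
    over log(1 - D(G(z))) for stronger gradients early in training."""
    return -np.log(d_fake).mean()

d_real = np.random.uniform(0.6, 0.99, size=16)  # D's scores on real data
d_fake = np.random.uniform(0.01, 0.4, size=16)  # D's scores on samples
print(d_loss(d_real, d_fake), g_loss_nonsaturating(d_fake))
```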
Lecture 9: Graph Neural Networks (slides)
graph structured data, graph neural nets (GNNs), GNNs for “classical” network problems
Please study the following material in preparation for the class:
Required Reading:
Suggested Video Material:
Additional Resources:
- A Practical Tutorial on Graph Neural Networks, Isaac Ronald Ward, Jack Joyner, Casey Lickfold, Yulan Guo, Mohammed Bennamoun, ACM Computing Surveys, Vol. 54, No. 10, September 2022.
- A Gentle Introduction to Graph Neural Networks, Benjamin Sanchez-Lengeling, Emily Reif, Adam Pearce, Alexander B. Wiltschko, Distill, 2021.
- [Blog post] Graph Convolutional Networks, Thomas Kipf.
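To make message passing concrete, a minimal NumPy sketch of one graph convolution layer in the style of Kipf's post (illustrative only; the toy graph and weights are made up):

```python
import numpy as np

def gcn_layer(A, H, W):
    """One graph convolution: H' = ReLU(D^-1/2 Â D^-1/2 H W),
    where Â = A + I adds self-loops and D is Â's degree matrix."""
    A_hat = A + np.eye(len(A))
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(d ** -0.5)
    return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W, 0.0)

A = np.array([[0, 1, 0],   # 3-node path graph: 0-1, 1-2
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
H = np.random.randn(3, 4)  # node features
W = np.random.randn(4, 2)  # learnable weights
print(gcn_layer(A, H, W).shape)  # (3, 2): new features per node
```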
Lecture 8: Attention and Transformers (slides)
content-based attention, location-based attention, soft vs. hard attention, self-attention, attention for image captioning, transformer networks, vision transformers
Please study the following material in preparation for the class:
Required Reading:
Suggested Video Material:
Additional Resources:
- Neural Machine Translation by Jointly Learning to Align and Translate, D. Bahdanau, K. Cho, Y. Bengio, ICLR 2015.
- Sequence Modeling with CTC, Awni Hannun, Distill, 2017.
- Recurrent Models of Visual Attention, V. Mnih, N. Heess, A. Graves, K. Kavukcuoglu, NIPS 2014.
- DRAW: a Recurrent Neural Network for Image Generation, K. Gregor, I. Danihelka, A. Graves, D.J. Rezende, D. Wierstra, ICML 2015.
- Attention Is All You Need, Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin, NIPS 2017.
- [Blog post] What is DRAW (Deep Recurrent Attentive Writer)?, Kevin Frans.
- [Blog post] The Transformer Family, Lilian Weng.
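As a warm-up for the Vaswani et al. reading, a minimal NumPy sketch of scaled dot-product attention, the core operation of the transformer (illustrative only; shapes are arbitrary):

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V

Q = np.random.randn(4, 8)   # 4 queries of dimension 8
K = np.random.randn(6, 8)   # 6 keys of matching dimension
V = np.random.randn(6, 16)  # one value per key
print(attention(Q, K, V).shape)  # (4, 16): one mixed value per query
```

Self-attention is the special case where Q, K, and V are all projections of the same sequence.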
Lecture 7: Recurrent Neural Networks (slides)
sequence modeling, recurrent neural networks (RNNs), RNN applications, vanilla RNN, training RNNs, long short-term memory (LSTM), LSTM variants, gated recurrent unit (GRU)
Please study the following material in preparation for the class:
Required Reading:
Suggested Video Material:
Additional Resources:
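For intuition, a minimal NumPy sketch of a single LSTM step, unrolled over a toy sequence (illustrative only; the stacked-gate weight layout is one common convention, not the only one):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h, c, W, b):
    """One LSTM step; W maps [x; h] to the four stacked gate pre-activations."""
    z = np.concatenate([x, h]) @ W + b
    i, f, o, g = np.split(z, 4)
    i, f, o, g = sigmoid(i), sigmoid(f), sigmoid(o), np.tanh(g)
    c = f * c + i * g   # forget old memory, write new candidate
    h = o * np.tanh(c)  # expose gated memory as the hidden state
    return h, c

d_in, d_h = 3, 5
W = np.random.randn(d_in + d_h, 4 * d_h) * 0.1
b = np.zeros(4 * d_h)
h, c = np.zeros(d_h), np.zeros(d_h)
for x in np.random.randn(7, d_in):  # unroll over a length-7 sequence
    h, c = lstm_step(x, h, c, W, b)
print(h.shape, c.shape)
```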
Lecture 6: Understanding and Visualizing Convolutional Neural Networks (slides)
transfer learning, interpretability, visualizing neuron activations, visualizing class activations, pre-images, adversarial examples, adversarial training
Please study the following material in preparation for the class:
Required Reading:
Suggested Video Material:
Additional Resources:
- [Blog post] Understanding Neural Networks Through Deep Visualization, Jason Yosinski, Jeff Clune, Anh Nguyen, Thomas Fuchs, and Hod Lipson.
- [Blog post] The Building Blocks of Interpretability, Chris Olah, Arvind Satyanarayan, Ian Johnson, Shan Carter, Ludwig Schubert, Katherine Ye and Alexander Mordvintsev.
- [Blog post] Feature Visualization, Chris Olah, Alexander Mordvintsev and Ludwig Schubert.
- [Blog post] An Overview of Early Vision in InceptionV1, Chris Olah, Nick Cammarata, Ludwig Schubert, Gabriel Goh, Michael Petrov, Shan Carter.
- [Blog post] OpenAI Microscope.
- [Blog post] Breaking Linear Classifiers on ImageNet, Andrej Karpathy.
- [Blog post] Attacking machine learning with adversarial examples, OpenAI.
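To make adversarial examples concrete, a minimal NumPy sketch of the Fast Gradient Sign Method (illustrative only; the random gradient is a stand-in for a real dLoss/dx obtained by backpropagation):

```python
import numpy as np

def fgsm(x, grad_wrt_x, eps=0.03):
    """Fast Gradient Sign Method: perturb the input in the direction that
    increases the loss, then clip back to a valid image range."""
    x_adv = x + eps * np.sign(grad_wrt_x)
    return np.clip(x_adv, 0.0, 1.0)

x = np.random.rand(28, 28)      # a toy "image" in [0, 1]
grad = np.random.randn(28, 28)  # stand-in for dLoss/dx
print(np.abs(fgsm(x, grad) - x).max())  # perturbation is at most eps
```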
Lecture 5: Convolutional Neural Networks (slides)
convolution layer, pooling layer, cnn architectures, design guidelines, semantic segmentation networks, addressing other tasks
Please study the following material in preparation for the class:
Required Reading:
Suggested Video Material:
Additional Resources:
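For intuition, a naive NumPy sketch of a single-channel convolution, including the standard output-size arithmetic (N - K + 2P)/S + 1 (illustrative only; real implementations are vectorized and multi-channel):

```python
import numpy as np

def conv2d(x, w, stride=1, pad=0):
    """Naive 2D convolution (cross-correlation, as CNN layers compute it)."""
    x = np.pad(x, pad)
    kh, kw = w.shape
    oh = (x.shape[0] - kh) // stride + 1  # (N - K + 2P)/S + 1
    ow = (x.shape[1] - kw) // stride + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            patch = x[i*stride:i*stride+kh, j*stride:j*stride+kw]
            out[i, j] = (patch * w).sum()
    return out

x = np.random.randn(7, 7)
w = np.random.randn(3, 3)
print(conv2d(x, w, stride=1, pad=1).shape)  # (7, 7): "same" padding
```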
Lecture 4: Training Deep Neural Networks (slides)
data preprocessing, weight initialization, normalization, regularization, model ensembles, dropout, optimization methods
Please study the following material in preparation for the class:
Required Reading:
Suggested Video Material:
Additional Resources:
- Stochastic Gradient Descent Tricks, Leon Bottou.
- Section 3 of Practical Recommendations for Gradient-Based Training of Deep Architectures, Yoshua Bengio.
- Troubleshooting Deep Neural Networks: A Field Guide to Fixing Your Model, Josh Tobin.
- [Blog post] Initializing neural networks, Katanforoosh & Kunin, deeplearning.ai.
- [Blog post] Parameter optimization in neural networks, Katanforoosh et al., deeplearning.ai.
- [Blog post] The Black Magic of Deep Learning - Tips and Tricks for the practitioner, Nikolas Markou.
- [Blog post] An overview of gradient descent optimization algorithms, Sebastian Ruder.
- [Blog post] Why Momentum Really Works, Gabriel Goh.
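To tie two of the lecture topics together, a minimal NumPy sketch of He initialization and an SGD-with-momentum update (illustrative only; the hyperparameters are arbitrary):

```python
import numpy as np

def he_init(fan_in, fan_out, rng=np.random.default_rng(0)):
    """He initialization for ReLU nets: variance 2/fan_in keeps the
    activation scale roughly constant across layers."""
    return rng.standard_normal((fan_in, fan_out)) * np.sqrt(2.0 / fan_in)

def sgd_momentum(w, grad, v, lr=0.01, beta=0.9):
    """SGD with momentum: accumulate a velocity, then step along it."""
    v = beta * v - lr * grad
    return w + v, v

w = he_init(256, 128)
v = np.zeros_like(w)
grad = np.random.randn(*w.shape)
w, v = sgd_momentum(w, grad, v)
print(w.std())  # still close to sqrt(2/256) after one small step
```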
Lecture 3: Multi-layer Perceptrons (slides)
feed-forward neural networks, activation functions, chain rule, backpropagation, computational graph, automatic differentiation, distributed word representations
Please study the following material in preparation for the class:
Required Reading:
Suggested Video Material:
Additional Resources:
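For intuition, a minimal NumPy sketch of the forward and backward passes through a two-layer MLP, applying the chain rule by hand (illustrative only; this is exactly the bookkeeping that automatic differentiation does for you):

```python
import numpy as np

# Forward and backward pass of a tiny 2-layer MLP with MSE loss.
rng = np.random.default_rng(0)
x, y = rng.standard_normal((8, 4)), rng.standard_normal((8, 1))
W1 = rng.standard_normal((4, 16)) * 0.1
W2 = rng.standard_normal((16, 1)) * 0.1

h_pre = x @ W1              # forward pass
h = np.maximum(h_pre, 0.0)  # ReLU
y_hat = h @ W2
loss = ((y_hat - y) ** 2).mean()

d_yhat = 2 * (y_hat - y) / len(x)  # backward pass: dL/dy_hat
dW2 = h.T @ d_yhat                 # chain rule through y_hat = h @ W2
dh = d_yhat @ W2.T
dh_pre = dh * (h_pre > 0)          # ReLU gates the gradient
dW1 = x.T @ dh_pre
print(loss, dW1.shape, dW2.shape)
```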
Lecture 2: Machine Learning Overview (slides)
types of machine learning problems, linear models, loss functions, linear regression, gradient descent, overfitting and generalization, regularization, cross-validation, bias-variance tradeoff, maximum likelihood estimation
Please study the following material in preparation for the class:
Required Reading:
Suggested Video Material:
Additional Resources:
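To make gradient descent on a linear model concrete, a minimal NumPy sketch of linear regression fit by batch gradient descent on the mean squared error (illustrative only; the data is synthetic):

```python
import numpy as np

# Fit y = Xw by batch gradient descent on MSE.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 3))
w_true = np.array([2.0, -1.0, 0.5])
y = X @ w_true + 0.1 * rng.standard_normal(100)  # noisy targets

w, lr = np.zeros(3), 0.1
for _ in range(200):
    grad = 2 * X.T @ (X @ w - y) / len(y)  # d/dw of mean squared error
    w -= lr * grad
print(w)  # approaches w_true
```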
Lecture 1: Introduction to Deep Learning (slides)
course information, what is deep learning, a brief history of deep learning, compositionality, end-to-end learning, distributed representations
Please study the following material in preparation for the class:
Required Reading:
Additional Resources:
- The unreasonable effectiveness of deep learning in artificial intelligence, Terrence J. Sejnowski, PNAS, 2020.
- Deep Learning, Yann LeCun, Yoshua Bengio, Geoffrey Hinton. Nature, Vol. 521, 2015.
- Deep Learning in Neural Networks: An Overview, Juergen Schmidhuber. Neural Networks, Vol. 61, pp. 85–117, 2015.
- On the Origin of Deep Learning, Haohan Wang and Bhiksha Raj, arXiv preprint arXiv:1702.07800v4, 2017.