🎓 University of America — Course Portal
🤖 Artificial Intelligence · Week 1 of 14 · BSc · Y3 · ⏱ ~50 min

Week 1: CNNs, RNNs, Transformers & Modern Training

Advanced deep learning: convolutional networks for vision, recurrent architectures for sequences, transformer models, and modern training techniques in PyTorch.

AI301 — Lecture 1 · BSc Y3
🎬 CC Licensed Lecture
📺 MIT OpenCourseWare (CC BY-NC-SA)
🎯 Learning Objectives
  • Design and train ResNet-style architectures
  • Implement multi-head self-attention from scratch
  • Apply modern training techniques (mixup, cutmix, label smoothing)
  • Build a transformer for sequence-to-sequence tasks
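As a first taste of the training techniques named in the objectives, here is a minimal sketch of label smoothing and mixup applied to target distributions. It uses plain Python lists so it runs without any dependencies; the function names `smooth_labels` and `mixup_pair` and the toy inputs are illustrative assumptions, not course-provided code. In practice the mixup weight is usually drawn from a Beta distribution, and PyTorch exposes label smoothing through the `label_smoothing` argument of `torch.nn.CrossEntropyLoss`.

```python
def smooth_labels(class_index, num_classes, eps=0.1):
    """Return a label-smoothed target distribution for one example."""
    # Every class receives eps / K; the true class keeps the remaining 1 - eps on top.
    target = [eps / num_classes] * num_classes
    target[class_index] += 1.0 - eps
    return target


def mixup_pair(x_a, x_b, y_a, y_b, lam):
    """Blend two inputs and their target distributions with weight lam in [0, 1]."""
    x_mix = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y_mix = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x_mix, y_mix


# Toy example: two 2-feature inputs from classes 0 and 2 of a 4-class problem.
y_first = smooth_labels(0, num_classes=4, eps=0.1)
y_second = smooth_labels(2, num_classes=4, eps=0.1)
x_mix, y_mix = mixup_pair([1.0, 0.0], [0.0, 1.0], y_first, y_second, lam=0.7)
print("smoothed target:", y_first)
print("mixed input:", x_mix)
print("mixed target:", y_mix)
```

Both operations keep the target a valid probability distribution, so the usual cross-entropy loss can be applied to the result unchanged.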
Topics Covered This Lecture
CNN Architectures: ResNet, EfficientNet, ViT
Sequence Models: LSTM, GRU, Transformer
Self-Attention & Multi-Head Attention
Training Tricks: Augmentation, Label Smoothing, LR Schedules
📖 Lecture Overview

This first lecture lays the groundwork for AI301: convolutional networks, sequence models, self-attention, and modern training techniques. By the end of this session, you should have the conceptual grounding and practical starting point needed for the rest of the course.

Why this matters: the architectures and techniques introduced here (convolutional networks for vision, recurrent architectures for sequences, transformer models, and modern training techniques in PyTorch) are used throughout the course. This lecture sets up everything that follows, so make sure you understand the core concepts before proceeding to Week 2.

Key Concepts

The lecture introduces the four main pillars of this course:

  • CNN Architectures: ResNet, EfficientNet, ViT
  • Sequence Models: LSTM, GRU, Transformer
  • Self-Attention & Multi-Head Attention
  • Training Tricks: Augmentation, Label Smoothing, LR Schedules

Each will be explored in depth over the 14-week curriculum, with hands-on projects reinforcing theory at every stage.

# Quick Start: verify your environment is ready for AI301
import sys

print(f"Python {sys.version}")

# Check that the key libraries are installed
try:
    import numpy, pandas, matplotlib
    print("✅ Core libraries ready")
except ImportError as e:
    print(f"❌ Missing: {e}. Run: pip install numpy pandas matplotlib")

This Week's Focus

Focus on mastering CNN architectures (ResNet, EfficientNet, ViT) and sequence models (LSTM, GRU, Transformer). These are the prerequisites for everything in Week 2. The concepts build on each other, so do not skip the practice exercises.

📋 Project 1 of 3 · Projects total 50% of Final Grade

AI301 Project 1: Transformer from Scratch

Implement a complete encoder-decoder transformer from scratch in PyTorch for machine translation. Train on a subset of WMT14 En-De and report BLEU score.

  • Full transformer implementation (encoder, decoder, attention)
  • Positional encoding and masking implementations
  • Training on WMT14 subset with learning rate warmup
  • BLEU score and attention visualization
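Two of the deliverables above, positional encoding and learning rate warmup, can be sketched in a few lines. The version below assumes the sinusoidal encoding and the inverse-square-root warmup schedule from the original transformer paper; the project brief does not say which variants are required, so treat both choices, the function names, and the default values (`d_model=512`, `warmup_steps=4000`) as assumptions. It uses only the standard library; a real submission would build the same table as a PyTorch tensor.

```python
import math


def sinusoidal_positional_encoding(max_len, d_model):
    """Return a max_len x d_model table of sinusoidal position encodings."""
    table = []
    for pos in range(max_len):
        row = []
        for i in range(d_model):
            # Dimensions 2k and 2k+1 share one frequency: sin on even, cos on odd.
            angle = pos / (10000 ** ((2 * (i // 2)) / d_model))
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        table.append(row)
    return table


def warmup_lr(step, d_model=512, warmup_steps=4000):
    """Linear warmup followed by inverse-square-root decay (steps count from 1)."""
    step = max(step, 1)  # guard against step 0, where step ** -0.5 is undefined
    return d_model ** -0.5 * min(step ** -0.5, step * warmup_steps ** -1.5)


pe = sinusoidal_positional_encoding(max_len=4, d_model=8)
print("position 0:", pe[0])
for step in (1, 1000, 4000, 16000):
    print(step, warmup_lr(step))
```

Under this schedule the learning rate rises linearly until `warmup_steps` and decays afterwards, so its peak falls exactly at the end of warmup.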
Grading: 3 Projects 50% · Midterm Exam 20% · Final Exam 30%
📝 Sample Exam Questions

These represent the style and difficulty of questions you'll see on the midterm and final. Start thinking about them now.

Conceptual · Short Answer

Explain scaled dot-product attention. Why do we scale by 1/√d_k?
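As a starting point for this question, the sketch below computes softmax(QKᵀ/√d_k)V for a toy input, with and without the scaling factor. It is a dependency-free illustration, not a model answer: the function names and the toy vectors are assumptions chosen to make the effect visible. The underlying idea is that dot products of d_k-dimensional vectors grow in magnitude with d_k, which pushes the softmax toward a near one-hot output where gradients become very small.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values, scale=True):
    """Return (outputs, weights) for lists of query, key and value vectors."""
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        if scale:
            scores = [s / math.sqrt(d_k) for s in scores]
        weights = softmax(scores)
        # Each output is the attention-weighted average of the value vectors.
        out = [sum(w * v[j] for w, v in zip(weights, values))
               for j in range(len(values[0]))]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


# Toy input with d_k = 64: raw scores are 64 and 32, scaled scores are 8 and 4.
queries = [[1.0] * 64]
keys = [[1.0] * 64, [0.5] * 64]
values = [[1.0, 0.0], [0.0, 1.0]]
_, scaled_weights = scaled_dot_product_attention(queries, keys, values, scale=True)
_, raw_weights = scaled_dot_product_attention(queries, keys, values, scale=False)
print("scaled:  ", scaled_weights[0])
print("unscaled:", raw_weights[0])
```

With scaling, the second key still receives roughly 2% of the attention weight; without it, the weight on the second key is vanishingly small, which is the saturation the 1/√d_k factor is meant to prevent.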

Analysis · Short Answer

What is the purpose of the causal mask in autoregressive transformer decoders?
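To make the question concrete, here is a minimal sketch of a causal mask applied before the softmax. Position i may attend only to positions j ≤ i, so a decoder trained on whole sequences in parallel cannot look at the tokens it is supposed to predict. The helper names and the all-zero toy scores are assumptions for illustration; a PyTorch implementation would typically build the same lower-triangular pattern as a boolean tensor.

```python
import math


def causal_mask(n):
    """mask[i][j] is True when position i is allowed to attend to position j."""
    return [[j <= i for j in range(n)] for i in range(n)]


def masked_softmax(scores, allowed):
    """Softmax over one row of scores, ignoring positions where allowed is False."""
    # Disallowed positions are set to -inf, so exp() maps them to exactly 0.
    masked = [s if ok else float("-inf") for s, ok in zip(scores, allowed)]
    peak = max(masked)  # finite, because each position may attend to itself
    exps = [math.exp(s - peak) for s in masked]
    total = sum(exps)
    return [e / total for e in exps]


n = 4
scores = [[0.0] * n for _ in range(n)]  # toy scores: every pair equally similar
mask = causal_mask(n)
weights = [masked_softmax(row, allowed) for row, allowed in zip(scores, mask)]
for row in weights:
    print(row)
```

Each row of the result spreads its weight over the current and earlier positions only, and every entry above the diagonal is exactly zero.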

Applied · Code / Proof

Compare the computational complexity of self-attention vs RNNs for sequences of length n.
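One way to approach this question is to count operations per layer. The sketch below uses the standard asymptotic estimates, O(n²·d) for self-attention and O(n·d²) for a recurrent layer with representation size d, and also tracks how many steps must run one after another. Constant factors are ignored, and the function names and the width d = 512 are assumptions for illustration.

```python
def self_attention_cost(n, d):
    """Rough per-layer cost of self-attention: all n x n position pairs interact."""
    return {"operations": n * n * d, "sequential_steps": 1}


def recurrent_cost(n, d):
    """Rough per-layer cost of a recurrent layer: one d x d update per time step."""
    return {"operations": n * d * d, "sequential_steps": n}


d = 512  # hypothetical model width
for n in (64, 512, 4096):
    attn = self_attention_cost(n, d)
    rnn = recurrent_cost(n, d)
    if attn["operations"] < rnn["operations"]:
        cheaper = "self-attention"
    elif attn["operations"] > rnn["operations"]:
        cheaper = "recurrent"
    else:
        cheaper = "tie"
    print(f"n={n}: attention={attn}, recurrent={rnn}, fewer operations: {cheaper}")
```

Under these estimates self-attention needs fewer operations when n < d and more when n > d, while its sequential depth stays constant; a recurrent layer always needs n dependent steps, which limits parallelism on long sequences.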