Week 1: Supervised Learning, Overfitting & The ML Pipeline
Build your ML intuition: supervised learning algorithms, the bias-variance tradeoff, regularization strategies, and the complete model evaluation pipeline.
- Implement linear regression with gradient descent from scratch
- Understand and diagnose bias-variance tradeoff
- Apply regularization (L1, L2) to prevent overfitting
- Build and evaluate a complete ML pipeline
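The first objective above can be sketched in a few lines. This is a minimal illustration, not the course starter code: the function name `fit_linear_gd` and its defaults are placeholders, and it assumes NumPy is available.

```python
import numpy as np

def fit_linear_gd(X, y, lr=0.1, epochs=1000):
    """Fit y ~ X @ w + b by batch gradient descent on mean squared error.

    Illustrative sketch for the Week 1 exercise; names and hyperparameter
    defaults are placeholders, not part of any official assignment code.
    """
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(epochs):
        resid = X @ w + b - y                 # prediction error, shape (n,)
        w -= lr * (2.0 / n) * (X.T @ resid)   # gradient of MSE w.r.t. w
        b -= lr * (2.0 / n) * resid.sum()     # gradient of MSE w.r.t. b
    return w, b
```

On well-scaled data (features roughly in [-1, 1]) this converges in a few hundred iterations; with poorly scaled features you will see firsthand why the learning rate matters, which is part of the point of the exercise.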
This first lecture establishes the foundational framework for Introduction to Machine Learning. By the end of this session, you will have the conceptual grounding and practical starting point needed for the rest of the course.
Key Concepts
The lecture introduces the four main pillars of this course: the supervised learning framework; linear and polynomial regression; regularization (L1 and L2); and model evaluation (cross-validation and metrics). Each will be explored in depth over the 14-week curriculum, with hands-on projects reinforcing theory at every stage.
This Week's Focus
Focus on mastering: Supervised Learning Framework and Linear & Polynomial Regression. These are the prerequisites for everything in Week 2. The concepts build on each other — do not skip the practice exercises.
AI102 Project 1: From Scratch ML Pipeline
Implement linear regression, logistic regression, and k-NN entirely from scratch in NumPy. Compare against scikit-learn implementations. Evaluate on 3 UCI datasets.
- NumPy implementations of 3 algorithms
- Correctness verification against scikit-learn
- Performance comparison on 3 datasets
- Written analysis of algorithm trade-offs
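Of the three algorithms, k-NN is the most compact to write in pure NumPy, so a hedged sketch of that one is shown below. The function name `knn_predict` and the brute-force distance computation are illustrative choices, not a prescribed design; for the correctness-verification deliverable you would run the same data through `sklearn.neighbors.KNeighborsClassifier` and compare predictions.

```python
import numpy as np

def knn_predict(X_train, y_train, X_test, k=3):
    """Predict integer class labels by majority vote over the k nearest
    training points (Euclidean distance, brute force).

    Illustrative sketch for the from-scratch deliverable, assuming
    non-negative integer labels (required by np.bincount).
    """
    # Pairwise Euclidean distances via broadcasting: shape (n_test, n_train)
    dists = np.linalg.norm(X_test[:, None, :] - X_train[None, :, :], axis=2)
    # Indices of the k closest training points for each test point
    nearest = np.argsort(dists, axis=1)[:, :k]
    # Majority vote among the neighbours' labels
    votes = y_train[nearest]
    return np.array([np.bincount(row).argmax() for row in votes])
```

The broadcasting trick builds the full distance matrix in memory, which is fine for UCI-sized datasets but scales poorly; noting that trade-off is exactly the kind of observation the written analysis asks for.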
The questions below represent the style and difficulty of what you'll see on the midterm and final. Start thinking about them now.
Derive the normal equation for linear regression. When is it preferable to gradient descent?
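For checking your own derivation, here is a sketch of the standard argument (using the conventions from lecture: design matrix X, targets y, weights w):

```latex
% Minimize the sum of squared errors in matrix form:
J(w) = \|Xw - y\|^2 = (Xw - y)^\top (Xw - y)
% Set the gradient to zero:
\nabla_w J = 2 X^\top (Xw - y) = 0
% Solve for w, assuming X^\top X is invertible:
X^\top X \, w = X^\top y
\quad\Rightarrow\quad
w = (X^\top X)^{-1} X^\top y
```

As a rule of thumb, the closed form is preferable when the number of features d is modest (the solve costs on the order of d^3), while gradient descent scales better when d is large or the data does not fit in memory.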
Explain L2 regularization. Show mathematically why it shrinks weights toward zero.
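A sketch of the relevant algebra, as a starting point for this question (symbols follow the normal-equation setup: X, y, w, regularization strength λ, learning rate η):

```latex
% Ridge (L2-regularized) objective:
J(w) = \|Xw - y\|^2 + \lambda \|w\|^2
% Gradient:
\nabla_w J = 2 X^\top (Xw - y) + 2\lambda w
% Closed form: the extra \lambda I term shrinks the solution:
w = (X^\top X + \lambda I)^{-1} X^\top y
% Gradient-descent step: each update multiplies w by (1 - 2\eta\lambda) < 1,
% i.e. "weight decay" toward zero:
w \leftarrow (1 - 2\eta\lambda)\, w - 2\eta\, X^\top (Xw - y)
```

The multiplicative factor (1 - 2ηλ) on w in the update is the mathematical content of "shrinks weights toward zero."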
What is the difference between a parametric and non-parametric model? Give examples of each.