Week 1: Supervised Learning, Bias-Variance & Model Evaluation
Understand the core ML toolkit: decision trees, SVMs, regularization, cross-validation, and the full model evaluation pipeline.
- Implement linear and logistic regression from scratch
- Understand bias-variance tradeoff and how to diagnose it
- Apply k-fold cross-validation correctly
- Evaluate models with precision, recall, F1, and ROC-AUC
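To ground the first objective, here is a minimal from-scratch logistic regression trained with batch gradient descent on synthetic data. The function names (`sigmoid`, `fit_logistic`) and the learning-rate/iteration settings are illustrative assumptions, not values prescribed by the course.

```python
import numpy as np

def sigmoid(z):
    """Logistic function mapping scores to probabilities."""
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, lr=0.1, n_iters=2000):
    """Fit weights w and bias b by gradient descent on mean log loss."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(n_iters):
        p = sigmoid(X @ w + b)        # predicted probabilities
        grad_w = X.T @ (p - y) / n    # gradient of mean log loss w.r.t. w
        grad_b = np.mean(p - y)       # gradient w.r.t. b
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Tiny synthetic, nearly separable problem to sanity-check the fit
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

w, b = fit_logistic(X, y)
acc = np.mean((sigmoid(X @ w + b) > 0.5) == y)
print(f"training accuracy: {acc:.2f}")
```

Writing the gradient update yourself once makes it much easier to reason about what `LogisticRegression` in scikit-learn is doing under the hood.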
This first lecture establishes the foundational framework for Machine Learning Fundamentals. By the end of this session, you will have the conceptual grounding and practical starting point needed for the rest of the course.
Key Concepts
The lecture introduces the four main pillars of this course: Linear & Logistic Regression; Decision Trees & Random Forests; SVM & Kernel Methods; and the Model Evaluation Framework. Each will be explored in depth over the 14-week curriculum, with hands-on projects reinforcing theory at every stage.
This Week's Focus
Focus on mastering: Linear & Logistic Regression and Decision Trees & Random Forests. These are the prerequisites for everything in Week 2. The concepts build on each other — do not skip the practice exercises.
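One practice exercise worth doing early is diagnosing bias and variance on a decision tree by comparing train and validation scores across depths. This sketch uses synthetic data from `make_classification`; the specific depths and dataset shape are assumptions for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_validate

X, y = make_classification(n_samples=600, n_features=15,
                           n_informative=5, random_state=0)

# High bias: low train AND validation scores (underfitting).
# High variance: large train-validation gap (overfitting).
for depth in (1, 3, 10, None):
    res = cross_validate(DecisionTreeClassifier(max_depth=depth, random_state=0),
                         X, y, cv=5, return_train_score=True)
    gap = res["train_score"].mean() - res["test_score"].mean()
    print(f"depth={depth}: train={res['train_score'].mean():.2f} "
          f"val={res['test_score'].mean():.2f} gap={gap:.2f}")
```

Shallow trees tend to show low scores on both splits (high bias), while an unbounded tree typically hits near-perfect training accuracy with a noticeably lower validation score (high variance).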
DS201 Project 1: Binary Classification Pipeline
Build a complete ML pipeline for a binary classification problem (fraud detection or churn prediction). Include feature engineering, model comparison, hyperparameter tuning, and final evaluation.
- Feature engineering notebook
- 3+ model comparison with cross-validation
- Hyperparameter tuning (GridSearchCV or RandomizedSearchCV)
- Final model card with metrics and limitations
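The tuning and evaluation deliverables could be sketched as below. Since the fraud/churn dataset is not specified here, an imbalanced synthetic dataset stands in for it; the parameter grid and scoring choice are assumptions you should adapt to your project.

```python
from sklearn.datasets import make_classification
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.metrics import roc_auc_score

# Imbalanced synthetic stand-in for a fraud/churn dataset (assumption)
X, y = make_classification(n_samples=1000, n_features=20,
                           weights=[0.9, 0.1], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# Pipeline keeps scaling inside each CV fold, avoiding leakage
pipe = Pipeline([("scale", StandardScaler()),
                 ("clf", LogisticRegression(max_iter=1000))])

# Tune regularization strength C with 5-fold CV, scored by ROC-AUC
grid = GridSearchCV(pipe, {"clf__C": [0.01, 0.1, 1, 10]},
                    cv=5, scoring="roc_auc")
grid.fit(X_tr, y_tr)

proba = grid.predict_proba(X_te)[:, 1]
auc = roc_auc_score(y_te, proba)
print("best C:", grid.best_params_["clf__C"])
print(f"held-out ROC-AUC: {auc:.3f}")
```

Putting the scaler inside the `Pipeline` matters: fitting it on the full dataset before cross-validation would leak test-fold statistics into training, a point worth noting in your model card's limitations section.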
Practice Questions
These questions represent the style and difficulty of what you'll see on the midterm and final. Start thinking about them now.
- Explain the bias-variance tradeoff and how regularization addresses it.
- When would you prefer precision over recall? Give a medical diagnosis example.
- Write scikit-learn code to implement a 5-fold cross-validated logistic regression pipeline.
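One possible sketch for the last question, extended to report the week's other evaluation metrics as well. The synthetic dataset and pipeline steps are assumptions; the point is the 5-fold cross-validation structure.

```python
from sklearn.datasets import make_classification
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_validate

# Synthetic binary classification data (assumption, for a runnable example)
X, y = make_classification(n_samples=400, n_features=8, random_state=1)

# Scaling + logistic regression, refit fresh inside every fold
pipe = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# 5-fold CV scored with the metrics covered this week
res = cross_validate(pipe, X, y, cv=5,
                     scoring=["accuracy", "precision", "recall", "f1", "roc_auc"])
for metric in ("accuracy", "precision", "recall", "f1", "roc_auc"):
    scores = res[f"test_{metric}"]
    print(f"{metric}: {scores.mean():.3f} (+/- {scores.std():.3f})")
```

Reporting mean and spread across folds, rather than a single split's score, is exactly the habit the cross-validation objective above is asking you to build.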