🎓 University of America — Course Portal
Data Science › DS201 › Week 1
📊 Week 1 of 14 · BSc · Y2 S1 · ⏱ ~50 min

Week 1: Supervised Learning, Bias-Variance & Model Evaluation

Understand the core ML toolkit: decision trees, SVMs, regularization, cross-validation, and the full model evaluation pipeline.

🎬 CC Licensed Lecture: DS201 — Lecture 1 · BSc Y2 S1
📺 Video source: MIT OpenCourseWare (CC BY-NC-SA)
🎯 Learning Objectives
  • Implement linear and logistic regression from scratch
  • Understand bias-variance tradeoff and how to diagnose it
  • Apply k-fold cross-validation correctly
  • Evaluate models with precision, recall, F1, and ROC-AUC
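
The evaluation metrics in the last objective can be computed from scratch in a few lines. A toy sketch (the labels below are illustrative, not course data):

```python
# From-scratch precision, recall, and F1 on a toy binary problem.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)  # true positives
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)  # false positives
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)  # false negatives

precision = tp / (tp + fp)  # of the predicted positives, how many were right
recall = tp / (tp + fn)     # of the actual positives, how many were found
f1 = 2 * precision * recall / (precision + recall)

print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Working through small examples like this by hand is good preparation for the exam questions below.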
Topics Covered This Lecture
Linear & Logistic Regression
Decision Trees & Random Forests
SVM & Kernel Methods
Model Evaluation Framework
📖 Lecture Overview

This first lecture establishes the foundational framework for Machine Learning Fundamentals. By the end of this session, you will have the conceptual grounding and practical starting point needed for the rest of the course.

Why this matters: The toolkit introduced here (decision trees, SVMs, regularization, cross-validation, model evaluation) underpins everything that follows, so make sure you understand the core concepts before proceeding to Week 2.

Key Concepts

The lecture introduces the four main pillars of this course: Linear & Logistic Regression, Decision Trees & Random Forests, SVM & Kernel Methods, Model Evaluation Framework. Each will be explored in depth over the 14-week curriculum, with hands-on projects reinforcing theory at every stage.
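
The first pillar can be made concrete right away. Below is a minimal sketch of linear regression trained from scratch with gradient descent; the synthetic data, learning rate, and iteration count are illustrative assumptions, not course material:

```python
import numpy as np

# Minimal from-scratch linear regression via batch gradient descent.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 1))
y = 2.0 * X[:, 0] + 1.0 + rng.normal(0, 0.05, size=200)  # true line: y = 2x + 1

w, b = 0.0, 0.0
lr = 0.5
for _ in range(500):
    y_hat = w * X[:, 0] + b
    err = y_hat - y
    w -= lr * (err * X[:, 0]).mean()  # gradient of the squared-error loss w.r.t. w
    b -= lr * err.mean()              # gradient w.r.t. b

print(f"w = {w:.2f}, b = {b:.2f}")  # recovered parameters, close to 2 and 1
```

The same loop, with a sigmoid applied to the predictions and a log-loss gradient, becomes logistic regression, which is exactly the "from scratch" objective above.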

# Quick Start: verify your environment is ready for DS201
import sys
print(f"Python {sys.version}")

# Check key libraries are installed
try:
    import numpy, pandas, matplotlib
    print("✅ Core libraries ready")
except ImportError as e:
    print(f"❌ Missing: {e} — run: pip install numpy pandas matplotlib")

This Week's Focus

Focus on mastering: Linear & Logistic Regression and Decision Trees & Random Forests. These are the prerequisites for everything in Week 2. The concepts build on each other — do not skip the practice exercises.
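
For the decision-tree half of this week's focus, recall that splits are scored by an impurity measure. A small sketch computing the Gini impurity of a candidate split (the labels are made up for illustration):

```python
# Gini impurity of a label set, and the weighted impurity of a candidate split.
def gini(labels):
    n = len(labels)
    if n == 0:
        return 0.0
    p1 = sum(labels) / n               # fraction of positive labels
    return 1.0 - p1**2 - (1 - p1)**2   # 0 = pure node, 0.5 = worst case (binary)

left = [0, 0, 0, 1]    # labels routed left by a candidate split
right = [1, 1, 1, 0]   # labels routed right
n = len(left) + len(right)
split_gini = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)

print(f"gini(left)={gini(left):.3f}, gini(right)={gini(right):.3f}, split={split_gini:.3f}")
```

A tree-growing algorithm evaluates many candidate splits this way and keeps the one with the lowest weighted impurity.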

📋 Project 1 of 3 · Projects are 50% of Final Grade

DS201 Project 1: Binary Classification Pipeline

Build a complete ML pipeline for a binary classification problem (fraud detection or churn prediction). Include feature engineering, model comparison, hyperparameter tuning, and final evaluation.

  • Feature engineering notebook
  • 3+ model comparison with cross-validation
  • Hyperparameter tuning (GridSearchCV or RandomSearch)
  • Final model card with metrics and limitations
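
One possible shape for the tuning deliverable, sketched with scikit-learn's GridSearchCV; the synthetic dataset and parameter grid here are placeholders, not the project data:

```python
# Sketch: hyperparameter tuning with GridSearchCV on synthetic data.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=300, n_features=10, random_state=0)

pipe = Pipeline([
    ("scale", StandardScaler()),             # scaling inside the pipeline avoids CV leakage
    ("clf", LogisticRegression(max_iter=1000)),
])

grid = GridSearchCV(
    pipe,
    param_grid={"clf__C": [0.01, 0.1, 1.0, 10.0]},  # inverse regularization strength
    cv=5,
    scoring="roc_auc",
)
grid.fit(X, y)
print("best C:", grid.best_params_["clf__C"], "best ROC-AUC:", round(grid.best_score_, 3))
```

Putting the scaler inside the pipeline matters: fitting it on the full dataset before cross-validation would leak validation statistics into training.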
Grade Breakdown
  • 3 Projects: 50%
  • Midterm Exam: 20%
  • Final Exam: 30%
📝 Sample Exam Questions

These represent the style and difficulty of questions you'll see on the midterm and final. Start thinking about them now.

Conceptual Short Answer

Explain the bias-variance tradeoff and how regularization addresses it.
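
To start building intuition for this one: regularization shrinks model weights, trading variance for bias. The sketch below uses closed-form ridge regression on made-up data to show the weight norm falling as λ grows:

```python
import numpy as np

# Closed-form ridge regression: w = (XᵀX + λI)⁻¹ Xᵀy.
# Larger λ shrinks the weights: lower variance, higher bias.
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 5))
y = X @ np.array([3.0, -2.0, 0.5, 0.0, 1.0]) + rng.normal(0, 0.1, size=50)

def ridge_weights(X, y, lam):
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

norms = []
for lam in [0.0, 1.0, 100.0]:
    w = ridge_weights(X, y, lam)
    norms.append(np.linalg.norm(w))
    print(f"lambda={lam:>6}: ||w|| = {norms[-1]:.3f}")
```

λ = 0 is ordinary least squares; as λ increases, the coefficients are pulled toward zero, which is the mechanism behind the tradeoff the question asks about.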

Analysis Short Answer

When would you prefer precision over recall? Give a medical diagnosis example.

Applied Code / Proof

Write scikit-learn code to implement a 5-fold cross-validated logistic regression pipeline.
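
One way such an answer might look, sketched on a synthetic dataset (make_classification is a stand-in for real data):

```python
# Sketch answer: 5-fold cross-validated logistic regression pipeline.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=500, n_features=20, random_state=42)

# Scaling lives inside the pipeline so each fold is scaled only on its training split.
pipe = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
scores = cross_val_score(pipe, X, y, cv=5, scoring="accuracy")

print("fold accuracies:", scores.round(3))
print("mean:", round(scores.mean(), 3), "std:", round(scores.std(), 3))
```

On the exam, be prepared to justify each component: why a pipeline, why scale inside it, and why report both the mean and the spread across folds.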