⚙️ AI Engineering Week 1 of 14 BSc · Y2 ⏱ ~50 min

Week 1: Feature Engineering, Training Pipelines & Reproducibility

The engineering perspective on ML: robust feature pipelines, automated model selection, reproducible training infrastructure, and production-readiness.

University of Aliens

AIE201 — Lecture 1 · BSc Y2

🎬 CC Licensed Lecture

📺 MIT OpenCourseWare (CC BY-NC-SA)

🎯 Learning Objectives

Design reproducible ML training pipelines with experiment tracking
Engineer features from raw data at production scale
Implement automated model selection and hyperparameter optimization
Package and version ML models for deployment

Topics Covered This Lecture

Experiment Tracking: MLflow & W&B

Feature Engineering at Scale

AutoML & Hyperparameter Optimization

Model Packaging & Versioning

📖 Lecture Overview

This first lecture establishes the foundational framework for Machine Learning Engineering. By the end of this session, you will have the conceptual grounding and practical starting point needed for the rest of the course.

Why this matters
This course takes the engineering perspective on ML: robust feature pipelines, automated model selection, reproducible training infrastructure, and production-readiness. This lecture sets up everything that follows, so make sure you understand the core concepts before proceeding to Week 2.

Key Concepts

The lecture introduces the four main pillars of this course: experiment tracking (MLflow and W&B), feature engineering at scale, AutoML and hyperparameter optimization, and model packaging and versioning. Each will be explored in depth over the 14-week curriculum, with hands-on projects reinforcing theory at every stage.

# Quick Start: verify your environment is ready for AIE201
import sys
print(f"Python {sys.version}")

# Check key libraries are installed
try:
    import numpy, pandas, matplotlib
    print("✅ Core libraries ready")
except ImportError as e:
    print(f"❌ Missing: {e} — run: pip install numpy pandas matplotlib")
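The course project also uses MLflow and Optuna, and this lecture covers W&B. A similar check for those tools might look like the sketch below. It assumes the import names `mlflow`, `optuna`, and `wandb`; the helper `find_missing` is a hypothetical name introduced here for illustration, not part of the course materials.

```python
import importlib.util


def find_missing(modules):
    """Return the names in `modules` that cannot be found in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


# Assumed import names for the course tooling (illustrative list)
course_tools = ["mlflow", "optuna", "wandb"]
missing = find_missing(course_tools)

if missing:
    print(f"Missing: {', '.join(missing)}. Run: pip install {' '.join(missing)}")
else:
    print("Course tooling ready")
```

Unlike a single `import` statement, this reports every missing package at once and does not pay the cost of importing the libraries.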

This Week's Focus

This week, focus on two topics: experiment tracking (MLflow and W&B) and feature engineering at scale. Both are prerequisites for everything in Week 2. The concepts build on each other, so do not skip the practice exercises.
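To make experiment tracking concrete, here is a minimal sketch that uses only the standard library. It is not the MLflow or W&B API; it only imitates what such tools record for each run (parameters, metrics, seed, timestamp). The names `log_run`, `load_runs`, and `fake_train` are hypothetical, and the "accuracy" is a made-up stand-in for a real training loop.

```python
import json
import random
import tempfile
import time
from pathlib import Path


def log_run(log_path, params, metrics, seed):
    """Append one run record (params, metrics, seed, timestamp) as a JSON line."""
    record = {
        "timestamp": time.time(),
        "seed": seed,
        "params": params,
        "metrics": metrics,
    }
    with open(log_path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record, sort_keys=True) + "\n")
    return record


def load_runs(log_path):
    """Read every run record back from the log file."""
    with open(log_path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def fake_train(learning_rate, seed):
    """Stand-in for a real training loop: returns a made-up 'accuracy'."""
    rng = random.Random(seed)
    return round(1.0 - abs(learning_rate - 0.1) - rng.uniform(0.0, 0.05), 4)


# Log three runs that differ only in learning rate
log_file = Path(tempfile.mkdtemp()) / "runs.jsonl"
for lr in (0.01, 0.1, 0.5):
    acc = fake_train(lr, seed=42)
    log_run(log_file, params={"learning_rate": lr}, metrics={"accuracy": acc}, seed=42)

# Compare runs after the fact, which is the point of tracking them
runs = load_runs(log_file)
best = max(runs, key=lambda r: r["metrics"]["accuracy"])
print(f"{len(runs)} runs logged; best learning_rate = {best['params']['learning_rate']}")
```

Because every run stores its seed and parameters next to its metrics, any result in the log can be traced back to the exact configuration that produced it.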

📋 Project 1 of 3 · Projects: 50% of Final Grade

AIE201 Project 1: Reproducible ML Training System

Build a fully reproducible ML training system for a classification task: feature store, experiment tracking (MLflow), hyperparameter optimization (Optuna), and model registry.

Feature engineering pipeline with data validation
MLflow experiment tracking integration
Optuna hyperparameter optimization study
Model registered in MLflow with performance report
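The first deliverable, a feature engineering pipeline with data validation, can be sketched roughly as follows. This is a toy illustration in plain Python, not the required project design: the columns `age` and `income`, the validation ranges, and the function names are all assumptions, and the three records are made up.

```python
import math


def validate_row(row):
    """Return a list of problems found in one raw record (empty if valid)."""
    problems = []
    age = row.get("age")
    income = row.get("income")
    if not isinstance(age, (int, float)) or not 0 <= age <= 120:
        problems.append("age must be a number in [0, 120]")
    if not isinstance(income, (int, float)) or income < 0:
        problems.append("income must be a non-negative number")
    return problems


def engineer_features(row):
    """Derive model-ready features from one validated record."""
    return {
        "age_decades": row["age"] / 10.0,
        "log_income": math.log1p(row["income"]),  # compress the long right tail
        "is_senior": int(row["age"] >= 65),
    }


def run_pipeline(rows):
    """Validate every row, engineer features for valid ones, collect rejects."""
    features, rejected = [], []
    for index, row in enumerate(rows):
        problems = validate_row(row)
        if problems:
            rejected.append((index, problems))
        else:
            features.append(engineer_features(row))
    return features, rejected


raw = [
    {"age": 34, "income": 52000},
    {"age": -5, "income": 41000},  # fails validation: negative age
    {"age": 70, "income": 0},
]
features, rejected = run_pipeline(raw)
print(f"{len(features)} rows kept, {len(rejected)} rejected")
```

Validating before transforming means bad records are reported with a reason instead of silently producing misleading features.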

50% · 3 Projects
20% · Midterm Exam
30% · Final Exam

📝 Sample Exam Questions

These represent the style and difficulty of questions you'll see on the midterm and final. Start thinking about them now.

Conceptual Short Answer

What is experiment reproducibility in ML? List 5 sources of non-reproducibility and how to fix each.
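As a starting point for this question, the sketch below illustrates one source of non-reproducibility, an unseeded random number generator, and its fix. It uses only Python's `random` module; `noisy_metric` is a hypothetical stand-in for a training run, and the other sources are left for you to work out.

```python
import random


def noisy_metric(seed=None):
    """Stand-in for a training run whose result depends on random draws."""
    rng = random.Random(seed)  # seed=None seeds from OS entropy or the clock
    weights = [rng.gauss(0.0, 1.0) for _ in range(5)]
    return sum(w * w for w in weights)


# Same seed: identical result on every call
assert noisy_metric(seed=0) == noisy_metric(seed=0)

# No seed: the two results will almost surely differ
print(noisy_metric(), noisy_metric())
```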

Analysis Short Answer

Explain the difference between feature selection, feature extraction, and feature engineering.

Applied Code / Proof

How does Optuna's TPE sampler differ from random search for hyperparameter optimization?
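For the random-search side of this comparison, a minimal baseline might look like the sketch below. It does not use Optuna and does not implement TPE. The objective is a made-up toy loss, and the search ranges and function names are assumptions. The property to notice is that each trial is sampled without looking at earlier results.

```python
import math
import random


def objective(learning_rate, depth):
    """Toy validation loss (made up): lowest near learning_rate=0.1, depth=6."""
    return (math.log10(learning_rate) + 1.0) ** 2 + 0.05 * (depth - 6) ** 2


def random_search(n_trials, seed):
    """Sample each trial independently of all earlier results."""
    rng = random.Random(seed)
    best_params, best_loss = None, float("inf")
    for _ in range(n_trials):
        params = {
            "learning_rate": 10 ** rng.uniform(-4, 0),  # log-uniform in [1e-4, 1]
            "depth": rng.randint(2, 12),
        }
        loss = objective(**params)
        if loss < best_loss:
            best_params, best_loss = params, loss
    return best_params, best_loss


best_params, best_loss = random_search(n_trials=50, seed=0)
print(best_params, round(best_loss, 4))
```

When answering, consider what a sampler could gain by using the history of completed trials that this loop ignores.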