Week 1: Derivatives, the Chain Rule & Gradient Descent
Master derivatives, integrals, and multivariate calculus: the mathematical engine behind most optimization algorithms in machine learning.
- Compute derivatives using limit definition and rules
- Apply the chain rule to composite functions
- Understand gradient descent as iterative derivative application
- Calculate partial derivatives for multivariate functions
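A quick way to build intuition for the chain rule and partial derivatives is to compare a hand-derived answer with a numerical estimate. The sketch below is illustrative only: the functions `g` and `f`, the evaluation points, and the helper `central_difference` are assumptions chosen for this example, not part of the course materials.

```python
import math


def central_difference(func, x, h=1e-5):
    """Estimate func'(x) from the symmetric difference quotient."""
    return (func(x + h) - func(x - h)) / (2 * h)


# Chain rule: g(x) = exp(3x) has outer function exp(u) and inner function u = 3x,
# so g'(x) = exp(3x) * 3.
def g(x):
    return math.exp(3 * x)


def g_prime(x):
    return 3 * math.exp(3 * x)


# Partial derivatives: for f(x, y) = x^2 * y + y^3,
# df/dx = 2xy (y held fixed) and df/dy = x^2 + 3y^2 (x held fixed).
def f(x, y):
    return x ** 2 * y + y ** 3


def grad_f(x, y):
    return (2 * x * y, x ** 2 + 3 * y ** 2)


x0, y0 = 1.5, -0.5
num_dx = central_difference(lambda x: f(x, y0), x0)  # vary x only
num_dy = central_difference(lambda y: f(x0, y), y0)  # vary y only

print("chain rule:", g_prime(0.5), "numerical:", central_difference(g, 0.5))
print("gradient:  ", grad_f(x0, y0), "numerical:", (num_dx, num_dy))
```

If the analytic and numerical values disagree by more than a small rounding error, the hand derivation is the first place to look.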
This first lecture establishes the foundational framework for Calculus for Data Scientists. By the end of this session, you will have the conceptual grounding and practical starting point needed for the rest of the course.
Key Concepts
The lecture introduces the four main pillars of this course: Limits & Continuity, Differentiation Rules, the Chain Rule, and Partial Derivatives & Gradients. Each will be explored in depth over the 14-week curriculum, with hands-on projects reinforcing theory at every stage.
This Week's Focus
Focus on mastering Limits & Continuity and Differentiation Rules, which are the prerequisites for everything in Week 2. The concepts build on each other, so do not skip the practice exercises.
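To see how limits and differentiation rules connect, it can help to watch the difference quotient approach the value the power rule predicts. This is a minimal sketch under assumed choices: the example function x³, the point x = 2, and the step sizes are all picked for illustration.

```python
def difference_quotient(func, x, h):
    """Slope of the secant line through (x, func(x)) and (x + h, func(x + h))."""
    return (func(x + h) - func(x)) / h


def cube(x):
    return x ** 3  # example function; the power rule gives 3x^2


x0 = 2.0
exact = 3 * x0 ** 2  # 12.0 by the power rule

# Shrinking h moves the secant slope toward the derivative (the limit as h -> 0).
for h in (1.0, 0.1, 0.01, 0.001):
    approx = difference_quotient(cube, x0, h)
    print(f"h={h:<6} quotient={approx:.6f} error={abs(approx - exact):.6f}")
```

The error shrinks roughly in proportion to h, which is the numerical face of the limit definition.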
MATH101 Project 1: Gradient Descent from Scratch
Implement gradient descent in pure Python/NumPy to minimize a quadratic cost function. Visualize the loss surface, convergence path, and compare different learning rates.
- Gradient descent implementation in pure Python/NumPy
- Loss surface visualization (3D + contour plot)
- Convergence analysis for 3 learning rates
- Written derivation of the gradient update rule
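As a starting point for the project, here is a deliberately simplified sketch: a one-dimensional quadratic rather than the two-parameter surface needed for the 3D and contour plots. The cost J(w) = (w - 3)², the starting point, the step count, and the three learning rates are assumptions made for illustration, as are the names `cost`, `grad`, and `gradient_descent`.

```python
def cost(w):
    """Assumed quadratic cost with its minimum at w = 3."""
    return (w - 3.0) ** 2


def grad(w):
    """Derivative of the cost: dJ/dw = 2(w - 3)."""
    return 2.0 * (w - 3.0)


def gradient_descent(w0, learning_rate, steps):
    """Apply w <- w - learning_rate * grad(w) and record every iterate."""
    path = [w0]
    w = w0
    for _ in range(steps):
        w = w - learning_rate * grad(w)
        path.append(w)
    return path


# Three illustrative learning rates: slow, reasonable, and too large.
for lr in (0.01, 0.1, 1.1):
    path = gradient_descent(w0=0.0, learning_rate=lr, steps=25)
    print(f"lr={lr:<5} final w={path[-1]:.4f} final cost={cost(path[-1]):.4e}")
```

For this particular cost, each step multiplies the distance to the minimum by (1 - 2·lr), so the iterates shrink toward w = 3 only when that factor has magnitude below 1. With lr = 1.1 the factor is -1.2, and the iterates overshoot further on every step.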
The questions below represent the style and difficulty of what you'll see on the midterm and final. Start thinking about them now.
Using the chain rule, find d/dx[sin(x²)].
Explain why a learning rate that is too large causes gradient descent to diverge.
Write the gradient update rule for linear regression loss L = (1/(2n))Σ(ŷ - y)².
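One way to check a derivation for the last question is to code it up and see whether it recovers a known line. The sketch below assumes a single-feature model ŷ = wx + b, which the question does not specify. Under that assumption the chain rule gives ∂L/∂w = (1/n)Σ(ŷ - y)x and ∂L/∂b = (1/n)Σ(ŷ - y), and each update subtracts the learning rate times these gradients. The data, learning rate, and function names are illustrative.

```python
def predict(w, b, xs):
    """Assumed model: y_hat = w * x + b for each input x."""
    return [w * x + b for x in xs]


def loss(w, b, xs, ys):
    """L = (1/(2n)) * sum((y_hat - y)^2)."""
    n = len(xs)
    return sum((y_hat - y) ** 2 for y_hat, y in zip(predict(w, b, xs), ys)) / (2 * n)


def gradient_step(w, b, xs, ys, learning_rate):
    """One update: w <- w - lr * dL/dw, b <- b - lr * dL/db."""
    n = len(xs)
    residuals = [y_hat - y for y_hat, y in zip(predict(w, b, xs), ys)]
    grad_w = sum(r * x for r, x in zip(residuals, xs)) / n  # (1/n) * sum((y_hat - y) * x)
    grad_b = sum(residuals) / n                             # (1/n) * sum(y_hat - y)
    return w - learning_rate * grad_w, b - learning_rate * grad_b


# Toy data generated from y = 2x + 1, so the fit should approach w = 2, b = 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

w, b = 0.0, 0.0
for _ in range(2000):
    w, b = gradient_step(w, b, xs, ys, learning_rate=0.1)

print(f"w={w:.4f} b={b:.4f} loss={loss(w, b, xs, ys):.2e}")
```

The factor 1/2 in the loss exists to cancel the 2 produced by differentiating the square, which is why the gradients carry 1/n rather than 2/n.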