Week 1: Cloud Architecture, Managed Services & Cost Optimization
Deploy data science workloads in the cloud: managed compute, cloud storage, serverless functions, auto-scaling, and cost-efficient architectures.
- Deploy a data science workflow on AWS, GCP, or Azure
- Use managed ML services (SageMaker, Vertex AI, AzureML)
- Containerize applications with Docker
- Build auto-scaling data pipelines with serverless compute
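To make the serverless objective concrete, here is a minimal sketch of an inference handler in the AWS Lambda Python style (`handler(event, context)`). The model, its loading logic, and the request shape are hypothetical placeholders; a real deployment would load a trained model from object storage rather than use the stand-in below.

```python
import json

def load_model():
    """Placeholder: a real deployment would load a trained model from
    object storage (e.g. S3) once per container cold start."""
    return lambda features: sum(features)  # stand-in "model"

MODEL = load_model()  # cached across warm invocations

def handler(event, context):
    """Lambda-style entry point: the managed runtime calls this per
    request and scales concurrent instances automatically."""
    # event["body"] is assumed to be a JSON string like
    # '{"features": [1.0, 2.0]}'
    payload = json.loads(event["body"])
    prediction = MODEL(payload["features"])
    return {
        "statusCode": 200,
        "body": json.dumps({"prediction": prediction}),
    }
```

The key design point is that the function holds no server state between requests (aside from the cached model), which is what lets the platform scale it horizontally.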
This first lecture establishes the foundational framework for Cloud Computing for Data Science. By the end of this session, you will have the conceptual grounding and practical starting point needed for the rest of the course.
Key Concepts
The lecture introduces the four main pillars of this course:
- Cloud Fundamentals: IaaS, PaaS, SaaS
- Docker & Container Orchestration
- Managed ML Services
- Cost Optimization & FinOps
Each will be explored in depth over the 14-week curriculum, with hands-on projects reinforcing theory at every stage.
This Week's Focus
Focus on mastering the first two pillars: Cloud Fundamentals (IaaS, PaaS, SaaS) and Docker & Container Orchestration. These are the prerequisites for everything in Week 2. The concepts build on each other, so do not skip the practice exercises.
DS305 Project 1: Cloud-Deployed ML Pipeline
Deploy a complete ML training and inference pipeline on a cloud platform. Use managed storage, compute, and containerized inference. Measure and optimize cloud costs.
- Docker container for model training and serving
- Cloud deployment manifest (AWS/GCP/Azure)
- Auto-scaling inference endpoint with load test results
- Cost breakdown and optimization analysis
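For the cost-breakdown deliverable, a back-of-the-envelope model like the sketch below is a good starting point. All hourly rates, the spot discount, and the storage price here are illustrative placeholders, not real quotes; look up your provider's current pricing, which varies by region and instance type.

```python
# Sketch of a monthly cost breakdown for a training + inference pipeline.
# Every price below is an illustrative placeholder (assumed, not quoted).

HOURS_PER_MONTH = 730  # average hours in a month

def monthly_cost(hourly_rate, hours=HOURS_PER_MONTH):
    """Cost of a resource billed per hour."""
    return hourly_rate * hours

# Illustrative line items (assumed rates)
on_demand_gpu = monthly_cost(3.06, hours=40)     # 40 h of GPU training/month
spot_gpu = monthly_cost(3.06 * 0.30, hours=40)   # assumed ~70% spot discount
inference_cpu = monthly_cost(0.10)               # always-on CPU endpoint
storage = 100 * 0.023                            # 100 GB at an assumed $/GB-month

breakdown = {
    "training (on-demand)": round(on_demand_gpu, 2),
    "training (spot)": round(spot_gpu, 2),
    "inference endpoint": round(inference_cpu, 2),
    "object storage": round(storage, 2),
}
savings = on_demand_gpu - spot_gpu  # what moving training to spot saves
```

Even this crude model surfaces the typical pattern: an always-on inference endpoint often dominates the bill, and interruptible (spot/preemptible) capacity is the cheapest lever for batch training.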
Practice Questions
These questions represent the style and difficulty of what you'll see on the midterm and final. Start thinking about them now.
- Explain the difference between IaaS, PaaS, and SaaS, with a data science example of each.
- What are the advantages of using Docker containers for ML deployment?
- Describe a cloud architecture for running a daily batch ML training job on 100 GB of data.