🎓 University of America — Course Portal
📊 Data Science · Week 1 of 14 · BSc Y2 S1 · ⏱ ~50 min

Week 1: EDA Workflow, Distribution Analysis & Pattern Discovery

Learn the systematic EDA process: data profiling, distribution analysis, correlation studies, outlier detection, and business insight generation.

DS203 — Lecture 1 · BSc Y2 S1
🎬 CC Licensed Lecture
📺 MIT OpenCourseWare (CC BY-NC-SA)
🎯 Learning Objectives
  • Profile any dataset systematically using a structured checklist
  • Identify distribution shapes and their implications
  • Detect and handle outliers appropriately
  • Generate and communicate business-relevant insights
Topics Covered This Lecture
  • Data Profiling Checklist
  • Distribution Analysis
  • Correlation & Multivariate EDA
  • Outlier Detection Methods
📖 Lecture Overview

This first lecture establishes the foundational framework for Exploratory Data Analysis. By the end of this session, you will have the conceptual grounding and practical starting point needed for the rest of the course.

Why this matters: This lecture sets up everything that follows, so make sure you understand the core concepts before proceeding to Week 2.

Key Concepts

The lecture introduces the four main pillars of this course: the Data Profiling Checklist, Distribution Analysis, Correlation & Multivariate EDA, and Outlier Detection Methods. Each will be explored in depth over the 14-week curriculum, with hands-on projects reinforcing theory at every stage.

# Quick Start: verify your environment is ready for DS203
import sys

print(f"Python {sys.version}")

# Check key libraries are installed
try:
    import numpy, pandas, matplotlib
    print("✅ Core libraries ready")
except ImportError as e:
    print(f"❌ Missing: {e} — run: pip install numpy pandas matplotlib")

This Week's Focus

Focus on mastering the Data Profiling Checklist and Distribution Analysis. These are the prerequisites for everything in Week 2. The concepts build on each other, so do not skip the practice exercises.
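As a starting point for the profiling checklist, here is a minimal sketch using pandas. The helper name `profile_dataframe`, the column names, and the toy `sales` data are all hypothetical and chosen only for illustration; a real checklist would likely cover more items (value ranges, units, date coverage, and so on).

```python
import pandas as pd


def profile_dataframe(df: pd.DataFrame) -> pd.DataFrame:
    """Return one row per column with basic profiling facts."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),                   # storage type of each column
        "n_missing": df.isna().sum(),                     # count of missing values
        "pct_missing": (df.isna().mean() * 100).round(1), # share of missing values
        "n_unique": df.nunique(),                         # distinct non-missing values
    })


# Hypothetical toy data standing in for a real business dataset
sales = pd.DataFrame({
    "region": ["North", "South", "South", None],
    "units": [10, 12, 12, 40],
    "revenue": [100.0, 120.0, None, 400.0],
})

print(f"Rows: {len(sales)}, columns: {sales.shape[1]}")
print(f"Duplicate rows: {sales.duplicated().sum()}")
print(profile_dataframe(sales))
```

Returning the summary as a DataFrame, rather than only printing it, makes it easy to sort or filter the result, for example to list the columns with the most missing values first.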

📋 Project 1 of 3 · Projects total 50% of Final Grade

DS203 Project 1: Deep EDA on a Business Dataset

Perform a comprehensive EDA on a business dataset (sales, customer, or operations data). Deliver a structured EDA report with visualizations and actionable recommendations.

  • EDA notebook with systematic profiling
  • Univariate, bivariate, and multivariate analyses
  • Automated profiling with ydata-profiling (formerly pandas-profiling)
  • Executive summary: top 5 findings with chart evidence
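
To give a feel for the univariate and bivariate deliverables, the following is a small sketch using pandas. The `sales` data and its column names are hypothetical; the project dataset will differ, and Pearson correlation is only one of several reasonable choices (it captures linear relationships only).

```python
import pandas as pd

# Hypothetical toy data standing in for a real business dataset
sales = pd.DataFrame({
    "ad_spend": [10, 20, 30, 40, 50],
    "units_sold": [12, 25, 31, 38, 52],
    "discount_pct": [30, 5, 20, 10, 15],
})

# Univariate: summary statistics per column
print(sales.describe())

# Bivariate / multivariate: pairwise Pearson correlation matrix
corr = sales.corr(method="pearson")
print(corr.round(2))

# Column with the strongest absolute correlation to a chosen target
target = "units_sold"
strongest = corr[target].drop(target).abs().idxmax()
print(f"Most correlated with {target}: {strongest}")
```

A strong correlation is a prompt for further investigation, not evidence of a causal effect, so findings of this kind belong in the report alongside a supporting chart.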
Grading
  • 3 Projects: 50%
  • Midterm Exam: 20%
  • Final Exam: 30%
📝 Sample Exam Questions

These represent the style and difficulty of questions you'll see on the midterm and final. Start thinking about them now.

Conceptual Short Answer

A feature has skewness of 3.2. What does this indicate and how would you transform it?
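
One way to explore this question is sketched below. A large positive skewness indicates a long right tail, and a log transform is a common first attempt at reducing it for positive-valued data. The sample data is synthetic (log-normal) and the helper `skewness` is a hypothetical name; it computes the plain moment-based coefficient, whereas pandas' `Series.skew()` applies a sample-size correction and so returns slightly different values.

```python
import numpy as np


def skewness(values) -> float:
    """Moment-based (Fisher-Pearson) skewness: m3 / m2**1.5."""
    x = np.asarray(values, dtype=float)
    centered = x - x.mean()
    m2 = np.mean(centered ** 2)  # second central moment (variance)
    m3 = np.mean(centered ** 3)  # third central moment
    return float(m3 / m2 ** 1.5)


# Hypothetical right-skewed sample (log-normal), seeded for reproducibility
rng = np.random.default_rng(seed=0)
revenue = rng.lognormal(mean=3.0, sigma=1.0, size=5_000)

before = skewness(revenue)
after = skewness(np.log1p(revenue))  # log1p(x) = log(1 + x), defined at x = 0

print(f"Skewness before: {before:.2f}")
print(f"Skewness after log1p: {after:.2f}")
```

Whether a log transform is appropriate depends on the feature: it assumes non-negative values, and alternatives such as a square-root or Box-Cox transform may suit other cases better.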

Analysis Short Answer

Describe the four steps in a systematic EDA workflow.

Applied Code / Proof

What is the IQR method for outlier detection? Write Python code to implement it.
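
As a study aid, here is one possible sketch of the IQR rule using NumPy: values below Q1 − k·IQR or above Q3 + k·IQR are flagged, where IQR = Q3 − Q1. The function name `iqr_outlier_mask` and the `orders` data are hypothetical, and k = 1.5 is the conventional default rather than a fixed requirement.

```python
import numpy as np


def iqr_outlier_mask(values, k: float = 1.5) -> np.ndarray:
    """Boolean mask: True where a value lies outside [Q1 - k*IQR, Q3 + k*IQR]."""
    x = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(x, [25, 75])  # first and third quartiles
    iqr = q3 - q1                        # interquartile range
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return (x < lower) | (x > upper)


# Hypothetical order values with one obviously extreme entry
orders = np.array([12, 14, 15, 15, 16, 18, 19, 95])
mask = iqr_outlier_mask(orders)
print("Outliers:", orders[mask])
```

Returning a mask instead of dropping rows keeps the decision about how to handle flagged values (remove, cap, or investigate) separate from detecting them.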