Data Preparation for Analytics

Programme outline

Learning objectives
  1. Understand the role of data preparation in big data.
  2. Structure data into arrays and DataFrames, performing operations like reshaping, slicing, appending, dropping, transposing, and melting with Python.
  3. Apply data cleaning techniques such as imputing missing values, renaming columns, and handling unbalanced datasets.
  4. Enrich data by merging tables.
  5. Aggregate data using pivot tables and groupby operations with Python.
Day 1
  • Role of data preparation in big data
  • Data types, data sizes, data encoding
  • Data structuring with arrays
  • Data structuring with DataFrames – reshaping, slicing, appending and dropping, transposing and shifting, melting
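
For orientation, the Day 1 structuring operations might look roughly like the sketch below. It is illustrative only and not taken from the course materials: the student table and all variable names are hypothetical, and it assumes NumPy and pandas are installed. Appending is shown with pd.concat, because DataFrame.append is no longer available in pandas 2.x.

```python
import numpy as np
import pandas as pd

# Reshape a 1-D array of six values into 2 rows x 3 columns.
arr = np.arange(6).reshape(2, 3)

# Hypothetical wide table: one row per student, one column per subject.
wide = pd.DataFrame({
    "student": ["Ann", "Ben"],
    "math": [90, 75],
    "science": [85, 80],
})

# Slice: keep all rows but only two of the columns.
math_only = wide.loc[:, ["student", "math"]]

# Append: stack an extra row underneath the existing ones.
extra = pd.DataFrame({"student": ["Cal"], "math": [60], "science": [70]})
appended = pd.concat([wide, extra], ignore_index=True)

# Drop: remove a column that is not needed.
dropped = appended.drop(columns=["science"])

# Transpose: subjects become rows, students become columns.
transposed = wide.set_index("student").T

# Shift: move values down by one position (the first entry becomes NaN).
shifted = appended["math"].shift(1)

# Melt: convert the wide table to long format, one row per student and subject.
long = wide.melt(id_vars="student", var_name="subject", value_name="score")
```

Melting yields one row per student and subject, which is often the shape expected by later groupby and pivot steps.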
Day 2
  • Data cleaning – Imputing missing values, types of missing data (MCAR, MAR, MNAR), renaming columns, dropping duplicates, dropping rows and columns, string manipulation, handling unbalanced datasets, data transformation
  • Data enrichment – Accessing and manipulating dates and times, joining tables with merge()
  • Data aggregation – Pivot table, groupby, aggregate, describe
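
As a rough illustration of how the Day 2 topics fit together, the sketch below cleans, enriches, and aggregates a small table. The sales and store data, the column names, and the choice of median imputation are all assumptions made for this example, not part of the course materials; median imputation is only one simple option, and whether it is appropriate depends on why the values are missing.

```python
import numpy as np
import pandas as pd

# Hypothetical sales records with one missing amount and one duplicated row.
sales = pd.DataFrame({
    "Store ID": [1, 1, 2, 2, 2],
    "date": ["2024-01-05", "2024-02-10", "2024-01-07", "2024-01-07", "2024-02-12"],
    "amount": [100.0, np.nan, 50.0, 50.0, 70.0],
})

# Cleaning: rename a column, drop exact duplicate rows, impute the missing value.
sales = sales.rename(columns={"Store ID": "store_id"})
sales = sales.drop_duplicates().reset_index(drop=True)
sales["amount"] = sales["amount"].fillna(sales["amount"].median())

# Enrichment: parse dates, derive the month, and join a lookup table with merge().
sales["date"] = pd.to_datetime(sales["date"])
sales["month"] = sales["date"].dt.month
stores = pd.DataFrame({"store_id": [1, 2], "region": ["North", "South"]})
enriched = sales.merge(stores, on="store_id", how="left")

# Aggregation: total per region, a region-by-month pivot table, summary statistics.
totals = enriched.groupby("region")["amount"].sum()
pivot = enriched.pivot_table(
    index="region", columns="month", values="amount", aggfunc="sum"
)
summary = enriched["amount"].describe()
```

A left join keeps every sales row even if a store were missing from the lookup table, which makes unmatched records visible as missing regions instead of silently dropping them.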
Day 3
  • Project consultation
  • Project presentation
Mode of assessment
  • Assignment
  • Project

 

