Data Wrangling and Preparation with Programming

Part of the ModularMaster in Data Science (Programming Track) programme

You may have heard the phrase "garbage in, garbage out" in computer science. It holds for data analysis too: having good-quality data in the right format is critical to any data science project. Data wrangling is the process of transforming raw data into a form that is usable in the later stages of the data science process.

This course introduces the general concepts of Data Science, with a focus on Data Wrangling, for participants who prefer a programming approach. By the end of this course, participants will be able to wrangle data and prepare a dataset for use in a machine learning model.
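
To give a flavour of what wrangling data involves, here is a minimal sketch in plain Python of three steps named in the programme outline: data cleaning, data aggregation and data standardisation. It is an illustration only, not the course material. The records, the field names (`region`, `sales`) and the helper functions are hypothetical, and the course itself may well use dedicated data-frame libraries instead.

```python
from collections import defaultdict
from statistics import mean, pstdev

# Hypothetical raw records: inconsistent casing, stray whitespace,
# a missing value and a duplicate row.
raw_rows = [
    {"region": " North ", "sales": "120"},
    {"region": "north", "sales": "80"},
    {"region": "South", "sales": ""},
    {"region": "South", "sales": "200"},
    {"region": "South", "sales": "200"},
]


def clean(rows):
    """Normalise text fields, drop rows with missing sales, remove duplicates."""
    seen = set()
    cleaned = []
    for row in rows:
        region = row["region"].strip().lower()
        sales_text = row["sales"].strip()
        if not sales_text:
            continue  # skip rows with a missing sales value
        record = (region, float(sales_text))
        if record in seen:
            # Assumption: identical rows are accidental duplicates.
            continue
        seen.add(record)
        cleaned.append({"region": region, "sales": record[1]})
    return cleaned


def aggregate(rows):
    """Total sales per region."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["region"]] += row["sales"]
    return dict(totals)


def standardise(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


cleaned_rows = clean(raw_rows)
region_totals = aggregate(cleaned_rows)
scaled_sales = standardise([row["sales"] for row in cleaned_rows])
```

Whether to drop duplicates or rows with missing values depends on the dataset. In practice those decisions are guided by the data dictionary and the business context rather than applied blindly.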


Plan your learning path

This course can be taken as a standalone module or as part of the Graduate Certificate in Data Analytics (Programming) stack. Participants will earn 12 subject credits, which can also be used towards completing the ModularMaster in Data Science.


Course Details

Course Dates:
Class: August 2024 (19, 26), September 2024 (2, 9)
Consultation: September 2024 (16) 
Presentation: September 2024 (23)

Closing date: 26 July 2024

Duration
5 days, 9.00AM – 5.00PM

 

Who Should Attend

 
  • Participants in roles which require hands-on preparation of data with programming

  • Aspiring data/business analysts or data scientists looking to acquire data preparation and wrangling skills

  • Participants who are in, or are preparing for, roles such as entry-level Data Engineer or Data Analyst

Prerequisites

  • Participants should preferably have passed mathematics at ‘O’ Level or equivalent, at minimum.

  • Participants should be conversant with basic IT skills such as software installation, file management and web navigation.

  • Participants are encouraged to complete the Foundation of Data Science before enrolling in this course.

  • Participants are required to pass a pre-course assessment to ensure they have the requisite knowledge of Python programming. This assessment can be waived for participants who have completed both Fundamentals in Python (Basic) and Fundamentals in Python (Intermediate).

  • Participants are required to bring their own laptops.

Programme Outline

Learning Objectives and Structure
  1. Perform the basic ML modelling tasks expected of a junior data scientist

  2. Understand the types of data and databases in the business context

  3. Appreciate the use of a data dictionary and harness the potential of metadata for data science

  4. Acquire organisational datasets from data lakes and other democratised data sources for data enrichment

  5. Structure data into an appropriate form for data analysis

  6. Manipulate data structures to support the data-wrangling phase

  7. Perform data wrangling on the acquired dataset

  8. Address data quality issues with appropriate data cleansing techniques

  9. Iterate through the data mining process progressively, using data wrangling and exploratory analysis tools

Programme Structure: Participants will go through 4 days of training. Class will reconvene on the 5th day for a presentation as part of the course assessment.

Day 1
  • Overview of Data Science Pipeline
  • What is Data Wrangling and Data Preparation?
  • Data Acquisition
  • Understand how a data scientist prepares a dataset for data modelling
  • What is data discovery?
  • Types of data
  • Types of databases
  • Data Dictionary and Metadata
  • Data Models
Day 2
  • Data Mining and CRISP-DM
  • Common Computing Infrastructure
  • Interactive Data Exploratory Analysis (IDEA)
  • Basics of Descriptive Statistics
Day 3
  • Breakdown of Data Preparation Phases
  • Dataset Structuring: Data Frame Handling
  • Data Cleaning
Day 4
  • Data Enrichment and alternative sources
  • Data Enrichment: Data Aggregation
  • Data Enrichment: Data Standardisation
Day 5
  • Project Presentation
Assessment

Participants will be assessed via a group-based project presentation in the 5th session of the course. There will also be formative assessments and case studies to gauge each participant's understanding and competency.

Subject Credits

Upon completing this course and satisfying its passing requirements, learners will be awarded 12 subject credits.

Course Fees and Funding

Full course fee inclusive of prevailing GST

You pay
S$4,905.00

SkillsFuture Course Fee subsidy (70%)

  • For Singapore Citizens < 40 years old 
  • For Permanent Residents

You pay
S$1,471.50

Mid-Career Enhanced Subsidy (90%)

  • For Singapore Citizens ≥ 40 years old

You pay
S$571.50

Enhanced Training Support for SMEs (90%)

  • For SME-sponsored employees

You pay
S$571.50

The module fees payable above are inclusive of prevailing GST.

Instructor

Thia Wei Soon
Instructor, SUTD Academy

Wei Soon has more than ten years of experience working in the manufacturing and IT sectors. He worked as a data scientist, using data analytics and machine learning to deliver actionable insights and drive strategic marketing initiatives. In recent years, as a technology consultant, he has helped clients streamline enterprise operations and achieve cost savings through the adoption of robotic process automation.

Wei Soon has a Master of IT in Business (Artificial Intelligence) from Singapore Management University and a B.Eng in Mechanical Engineering from Nanyang Technological University. He is proficient with tools such as Tableau, Jupyter, RStudio, MS Visual Studio, Automation Anywhere and UiPath, and with programming languages such as Python, R, C#, HTML5 and JavaScript.


Policies and Financing Options

SSG Funding Terms and Conditions

Use of Personal Details

In consideration of the subsidy provided by SkillsFuture Singapore Agency (“SSG”) through the SUTD Academy for the Course,

I consent to:

The collection, use and disclosure to relevant third parties of my personal data by the SUTD Academy including but not limited to personal particulars, attendance records, assessment/performance records, for the following purposes:

  1. Reporting of national statistics and conducting of holistic continuing education training research and analysis;

  2. Facilitating the conduct of relevant surveys and audits in relation to the Course;

  3. General administration of the Course including but not limited to processing of the subsidy provided by SSG;

  4. Publicity and marketing of the Course or other Courses to be provided by SSG or SUTD Academy; and

  5. Enabling SSG, its Appointed Auditors or its Nominated Representatives to contact the Course Participant directly to obtain information deemed necessary for conducting effectiveness surveys or audits in relation to the Course.

SUTD will have to claim the full course fee from any participant who is unable to fulfil the SSG funding requirements stated below.

I agree to:

  1. Attend and complete all lectures, class exercises, workshops and assessments;

  2. Complete the Course feedback at the end of the Course;

  3. Complete the post-Course survey sent about 3 to 6 months after class attendance; and

  4. Sign up for a personal email account.

SUTD Privacy Statement

For more information on SUTD's privacy statement, please visit https://sutd.edu.sg/Privacy-Statement.

SUTD Terms and Conditions

Methods of Payment

Learn more about the available payment modes.

Cancellation & Refund Policy

  1. If written notification is sent to sutd_academy@sutd.edu.sg within 24 hours after the course registration deadline, there will be no cancellation charges and a full refund will be made.

  2. No refund will be provided if written notification is sent more than 24 hours after the course registration deadline. SUTD Academy reserves the right to collect the full fee amount from the participant.

Replacement Policy

Companies may replace participants who have signed up for the course by giving 3 working days' notice before the course commencement date to sutd_academy@sutd.edu.sg. Terms and conditions apply.

Registration Policy

  1. The course may be cancelled due to insufficient participants. SUTD Academy will not be responsible or liable in any way for any claims, damages, losses, expenses, costs or liabilities whatsoever (including, without limitation, any direct or indirect damages for loss of profits, business interruption or loss of information) resulting or arising directly or indirectly from any course cancellation.

  2. Course enrolment is on a first-come, first-served basis.

  3. SUTD Academy reserves the right to change or cancel any course or instructor due to unforeseen circumstances. 

Types of Funding

Funding under Mid-Career Enhanced Subsidy ("MCES")

  1. MCES is an enhanced subsidy that encourages mid-career individuals to upskill and reskill, helping them remain competitive and resilient in the job market. All Singaporeans aged 40 and above receive a higher course fee subsidy of up to 90% for SSG-funded certifiable courses.

  2. Individuals/employers are not required to submit an application for MCES. Those pursuing SSG-funded programmes will be charged the appropriate subsidised fees by SUTD Academy if they are eligible for MCES. Individuals/employers will only need to pay the nett fee (full course fee after SSG's grant).

    For more info, please visit the SkillsFuture website at https://www.skillsfuture.gov.sg/enhancedsubsidy

Funding under Enhanced Training Support for SMEs ("ETSS")

  1. ETSS is an enhanced funding scheme that enables SMEs to send their employees for training.

  2. SMEs will enjoy subsidies of up to 90% of the course fees when they sponsor their employees for SSG-funded certifiable courses.

  3. In addition to higher course fee funding, SMEs can also claim absentee payroll funding of 80% of basic hourly salary at a higher cap of $7.50 per hour. SMEs may apply for the absentee payroll via the SkillsConnect system.

  4. To qualify, SMEs must meet all of the following criteria:
    - Organisation must be registered or incorporated in Singapore
    - Employment size of not more than 200, or annual sales turnover of not more than $100 million
    - Trainees must be hired in accordance with the Employment Act and fully sponsored by their employers for the course
    - Trainees must be Singapore Citizens or Singapore Permanent Residents

    For more info, please visit the SSG website at https://www.ssg.gov.sg/programmes-and-initiatives/funding/enhanced-training-support-for-smes1.html


Funding under Union Training Assistance Programme ("UTAP")

UTAP is a training benefit for NTUC members to defray their cost of training. This benefit is to encourage more union members to go for skills upgrading.

NTUC members enjoy support for 50% of unfunded course fees, up to $250 each year, when they sign up for courses supported under UTAP.

For more info, please visit https://e2i.com.sg/individuals/ntuc-education-and-training-fund/.
 


Funding under Post-Secondary Education Account ("PSEA")

The Post-Secondary Education Account (PSEA) is part of the Post-Secondary Education Scheme to help pay for the post-secondary education of Singaporeans.

This is part of the Government’s efforts to encourage every Singaporean to complete their post-secondary education. It also underscores the Government’s commitment to support families in investing in the future education of their children and to prepare them for the economy of the future. PSEA is not a bank account.

It is administered by the Ministry of Education (MOE) and is opened automatically for all eligible Singaporeans.

Account holders can use their PSEA funds to pay approved fees and charges, for themselves or their siblings, for approved programmes conducted by approved institutions.

However, you will have to check your eligibility and balance by contacting MOE first.

Contact MOE at (65) 6260 0777

Email MOE at contact@moe.edu.sg
