DS250

Week 10-11: Project 5 - Star Wars

A significant portion of a data scientist’s job is data cleaning. During these two weeks we will not hide the data munging from you. We will practice data cleaning using a Star Wars survey from FiveThirtyEight. Survey data is notoriously difficult to handle. Even when the data is recorded cleanly, the options for ‘write-in questions’, ‘choose from multiple answers’, ‘pick all that are right’, and ‘multiple choice questions’ make storing the data in a tidy format difficult.
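To make the tidy-format problem concrete, here is a minimal sketch of one common cleanup step: turning a ‘pick all that are right’ answer into one 0/1 column per option. Everything in it is an assumption for illustration: the rows are made up, the column name `seen`, the `;` separator, and the helper names `split_choices` and `to_indicator_rows` are hypothetical, and the real survey file may store these answers differently.

```python
# Hypothetical raw survey rows (made-up data, not the real survey file).
# "seen" is a pick-all-that-apply question stored as one delimited string.
raw_rows = [
    {"respondent_id": 1, "seen": "Episode IV; Episode V"},
    {"respondent_id": 2, "seen": ""},
    {"respondent_id": 3, "seen": "Episode V"},
]


def split_choices(cell, sep=";"):
    """Split a delimited multi-select cell into a list of clean option names."""
    return [part.strip() for part in cell.split(sep) if part.strip()]


def to_indicator_rows(rows, column, sep=";"):
    """Replace one multi-select column with a 0/1 column per observed option."""
    # Collect every option that appears anywhere; sort for a stable column order.
    options = sorted(
        {opt for row in rows for opt in split_choices(row[column], sep)}
    )
    tidy = []
    for row in rows:
        chosen = set(split_choices(row[column], sep))
        # Keep all other fields, drop the original multi-select column.
        new_row = {key: value for key, value in row.items() if key != column}
        for opt in options:
            new_row[f"{column}_{opt}"] = int(opt in chosen)
        tidy.append(new_row)
    return tidy


tidy_rows = to_indicator_rows(raw_rows, "seen")
for row in tidy_rows:
    print(row)
```

Each respondent stays on one row, and every option becomes its own numeric column, which is a shape most modeling tools can consume. An empty answer becomes a row of zeros here; whether that should instead be treated as missing is a judgment call that depends on the survey.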

Updated on 15 Sep 2020


J. Hathaway and BYU-I ©