Welcome to class!
Gratitude Journal
Announcements
Project 2: Late Flights and Missing Data
JSON files (JavaScript Object Notation)
Today, JSON is the de-facto standard for exchanging data between web and mobile clients and back-end services. source
Introduce the data
Load the JSON file and spend a few minutes studying it. Can you learn enough about it to describe the columns and rows?
Hints:
- You can use
.describe()
to learn about the distribution of a numeric variable. - You can use
.value_counts()
to learn about the distribution of a categorical variable. .crosstab()
creates a “cross tabulation” of two or more categorical variables.
Can you trust the data?
Do you notice anything interesting about the flights data?
Question Brainstorming
In your group, try to answer the following questions about your assigned “Grand Question”:
- What is our goal?
- How can we get there?
- What will the answer look like when we’re done?
Project 2 FAQs
Not all missing data is represented as np.nan
. For an example, look at the column that counts delays due to late aircraft.
We will learn how to identify and deal with missing data next week. For now, we can drop rows we don’t want using square brackets []
or .query()
.
num_of_delays_weather
num_of_delays_late_aircraft
num_of_delays_nas