Background
We will complete six projects during the semester, each taking about four days of class. On average, a student will spend 2 hours outside of class for every hour in class to complete the assigned readings, submit any Canvas items, and complete the project (for a total of 8 hours per project). The instructions for each project will be organized into the sections shown on this page.
This first Background section provides context for the project. Read the background carefully so that you understand the big-picture needs and purpose of the project.
Python and VS Code are tools commonly used in the field of data science. During our first two days of class we will get VS Code prepped for data science programming. Completing Project 0 will set you up for success the rest of the semester.
Data
Every data science project should start with data, and our class projects are no different. Each project will have ‘Download’ and ‘Information’ links like the ones below.
Download: mpg data
Information: Data description
Readings
The Readings section will contain links to reading assignments that are required for each project, as well as optional references. Remember that you are reading this material to build skills. Take the time to comprehend the readings and the skills contained within.
We recommend reading through the assigned material once for a general understanding before the first day of each project. You will reread and reference the material multiple times as you complete the project.
The readings listed below are required for the first two days of class.
- Python for Data Science (P4DS): Introduction
- P4DS: Data Visualization, Sections 3.1 and 3.2 only
- Saving Altair charts
- Quarto for DS
Optional References
Questions and Tasks:
This section lists the questions and tasks that need to be completed for the project. Your work on the project must be compiled into a report and submitted in Canvas by the weekend following the last day of material for the project.
- Finish the readings and come prepared with any questions so that you can get your environment working smoothly (ask in class if you are on campus, or in Slack if you are online).
- In VS Code, write a Python script to create the example Altair chart from section 3.2.2 of the textbook (part of the assigned readings). Note that after you create the Altair chart, you have to type chart to display it.
- Your final report should also include the markdown table created from the following (assuming you have mpg from question 2).
print(mpg
    .head(5)
    .filter(["manufacturer", "model", "year", "hwy"])
    .to_markdown(index=False))
Deliverables:
Deliverables are “the quantifiable goods or services that must be provided upon the completion of a project”. In this class the deliverable for each project is an HTML report created using Quarto. This final section will be the same for each project.
Use this template to submit your Client Report. The template has three sections (for additional details please see the instructional template):
- A short summary that highlights the key results, describing insights from the project's metrics and the tools you used (think “elevator pitch”).
- Answers to the grand questions. Each answer should include a written description of your results, code snippets, charts, and tables.