DS 250 Syllabus
Most people would sooner die than think, and most of them do.
-Bertrand Russell-
Overview
This course provides a better understanding of data science programming. If you have signed up for this class, you are most likely driven by curiosity and interested in how data decisions are made (sometimes called data intuition). Possibly, you have a more empathetic approach to how the world works and how problems can be solved. Or you have an eye for how society reports and uses data to make impactful decisions.[1]
Upon completing this course, you will be able to use data-informed programming in Python to handle, format, and visualize data. We will introduce you to data wrangling techniques, analytical methods, and the grammar of graphics. Specifically, as a successful learner, you will be able to:
- Use functions, data structures, and other programming constructs efficiently to process and find meaning in data.
- Programmatically load data from various types of data sources, including files, databases, and remote services.
- Use data manipulation libraries to perform straightforward analysis, produce charts, and prepare data for machine learning algorithms.
- Use machine learning libraries to discover insights, make predictions, and interpret the success of these algorithms.
- Use industry-leading tools to collaborate and share your work.
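As a rough illustration of the "load data from various types of data sources" outcome above, here is a minimal sketch that reads the same records from a CSV source and from a database. It uses only the Python standard library so that it runs anywhere; the course itself works with data manipulation libraries such as pandas. The names, the values, and the `people` table are hypothetical and invented for this sketch.

```python
import csv
import io
import sqlite3

# Hypothetical data, invented for this sketch only.
csv_text = "name,height_cm\nAda,170\nGrace,165\n"

# Source 1: a CSV "file" (io.StringIO stands in for an open file on disk).
rows = list(csv.DictReader(io.StringIO(csv_text)))

# Source 2: a database (an in-memory SQLite database, so nothing is written to disk).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, height_cm REAL)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [(row["name"], float(row["height_cm"])) for row in rows],
)

# Ask the database a simple question about the loaded data.
avg_height = conn.execute("SELECT AVG(height_cm) FROM people").fetchone()[0]
conn.close()

print(f"Loaded {len(rows)} rows; average height is {avg_height} cm")
```

Note that `csv.DictReader` returns every value as a string, which is why the heights are converted with `float` before they are stored.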
Principles of DS teaching
The course follows these principles for teaching data science [2]:
- Organize the course around a set of diverse projects
- Integrate computing into every aspect of the course
- Teach abstraction, but minimize reliance on mathematical notation
- Structure course activities to realistically mimic a data scientist’s experience
- Demonstrate the importance of critical thinking/skepticism through examples
Competency assumptions
This course focuses on programming with data to find insights. The prerequisite for this course is an introductory programming course in Python (CSE 110).[3] We recommend taking CSE 111 before this course or during the same semester, especially if programming is difficult for you. We assume that you know what the terminal is and how to execute scripts from it.
An understanding of standard deviation and variance will be valuable.
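As a quick refresher, the sketch below computes both quantities with Python's built-in `statistics` module. The `scores` list is hypothetical data made up for illustration. Variance is the average squared distance from the mean, and standard deviation is its square root, which puts it back in the same units as the data.

```python
import statistics

# Hypothetical scores, invented for illustration only.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(scores)

# Population versions: divide by n (treat the list as the whole population).
pop_variance = statistics.pvariance(scores)
pop_std_dev = statistics.pstdev(scores)

# Sample versions: divide by n - 1 (treat the list as a sample of a larger group).
sample_variance = statistics.variance(scores)
sample_std_dev = statistics.stdev(scores)

print("mean:", mean)
print("population variance:", pop_variance, "std dev:", pop_std_dev)
print("sample variance:", sample_variance, "std dev:", sample_std_dev)
```

For this list the mean is 5, the population variance is 4, and the population standard deviation is 2. The sample versions are slightly larger because they divide by n - 1 instead of n.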
Course materials and structure
This course focuses on building core data science skills. You will learn to program with Python, but you will also learn how to communicate and collaborate with your peers and mentors.
Course communication
How do I talk with my teacher, TA, and other students in this class?
- We use Slack for most class and one-on-one communication. Don’t email or direct message using I-Learn.
A. Should I paste code snippets in our class Slack channel to get help? Yes.
B. Should I ask questions about the projects and the readings in our class Slack channel? Yes.
C. Should I post random quotes or videos in our class Slack channel? No. Use the #random channel.
- All assignments are submitted in I-Learn.
A. Each project submission requires you to submit a short message to the teacher about your work (use the submission comments).
B. We will respond to your message with edits you can make to earn full credit when you resubmit.
C. Class announcements about the grading of projects are posted in Slack.
Online reading materials
- Python for Data Science (python4DS), a port of R for Data Science (2e) into Python
- Pandas User Guide
- Lets-Plot User Guide
- Python Data Science Handbook
- SQL
Preparation
In my experience, instructor-led training outside of college is even more expensive than it is in college. A week’s worth of training can cost more than a semester of school here at BYU-I. Because of this expense, learning how to digest online material and gain understanding before going to the expert with questions is a valuable skill to develop. I expect you to complete the assigned reading material before class begins.
Specifications Grading
Grading is a nasty side effect of mass learning and academia. We are in a class at a university and will have to manage this side effect. However, we don’t have to let it control our learning, thinking, or this class. Learning and thinking should motivate each activity.
As we team up, teacher and student, we have the challenge to become more! We have worked hard to identify the specifications needed for a Python user of the pandas and Altair packages. Our goal is to align your grade with the skill specifications you have mastered. In other words, the grade you want will determine how much work you will do. We will not score individual tasks in the class on a percentage scale. If your work meets the specified criteria, you will get full credit.
In a specifications-grading system, all tasks are evaluated on a high-standards pass/fail basis using detailed checklists of task requirements and expectations.[4] You earn your letter grade by earning passing marks on a set of tasks. This system provides various choices and is closer to how learning and work occur in the real world. It will be easy for us to tell if work is complete, done in good faith, and consistent with the requirements.
Footnotes
[1] https://medium.com/@nikhilbd/what-makes-a-good-data-scientist-engineer-a8b4d7948a86#.jr80wl98y. I suppose some of you are just taking this class because your degree says you can, and it fits in your schedule. If so, we should chat to make sure this is the right class for you.
[2] https://arxiv.org/ftp/arxiv/papers/1612/1612.07140.pdf. You will see this pattern in DS 350, DS 460, and Math 488. It will progressively get more realistic.
[3] We do expect that this is not your first experience with Python and VS Code. If you have done other programming courses, you should be able to succeed in this course. If you have any questions, please ask.
[4] Making the right checklists can be difficult. Bad checklists tend to fall into the following categories: vague and imprecise, too long, hard to use, impractical, or too pedantic. Useful checklists are precise, efficient, and easy to use and understand. This is the first time this course has been offered, so we will have to work together to ensure the requirements are reasonable.