Syllabus

Overview

By the end of the semester, each student will be able to:

  1. Integrate and extend previously learned data science tools to analyze remote and distributed data in business contexts.
  2. Explore, interpret, conceptualize, and validate assumptions of data at scale.
  3. Understand the differences and benefits of current industry technologies for big data storage and analysis.
  4. Leverage parallel processing for analysis.

The course follows these principles of teaching data science [1]:

  1. Organize the course around a set of diverse case studies
  2. Integrate computing into every aspect of the course
  3. Teach abstraction, but minimize reliance on mathematical notation
  4. Structure course activities to realistically mimic a data scientist’s experience
  5. Demonstrate the importance of critical thinking/skepticism through examples

You will find value in reading my learning manifesto.

Competency Assumptions

We assume that you have experience with data science programming in Python as practiced in CSE 250. You will also need a background in data science programming in R as practiced in CSE 350 / Math 335, or experience with machine learning as practiced in CSE 450. You can see all the prerequisites in the BYU-I Catalog.

Course Format

This course assumes that you are capable of guided learning and working in teams.

Preparation

In my experience, lecture-based training outside of college is even more expensive than it is in college. A week’s worth of training can cost more than a semester of school here at BYU-I. I expect that you have completed the assigned reading material before class begins.

A few coding challenges will pop up at the beginning of class during the semester to make sure you are keeping up with the material.

Class Time

The goal is to avoid traditional lectures in class. We will use class time for the following team activities.

Presentations

These presentations are not expected to be high-impact proposals with highly polished slides. However, they should be organized and clear, as your slides will persuade the class to move with your group’s decision.

Learning and Training

Each partner group will provide one 40- to 60-minute training session on the class-selected learning topics. These sessions should include a hands-on coding activity and be self-contained in a GitHub repo within our CSE451 GitHub organization.

Grading

The grading system’s influence on our thinking is a side effect of mass learning and academia. We are in a class at an accredited university and will have to manage this side effect. However, we don’t have to let it control our learning, thinking, or work. Discovering and practicing industry-pertinent skills should motivate each activity.

Class performance is tracked in four areas: impact, involvement, hours, and understanding. These areas generally map to how you will be valued by your future employer. Each area contributes to your perceived performance, but not every area needs to be exceptional to earn the highest marks in this course or to survive in industry.

Impact

If your team doesn’t understand why they need your services, they will eventually not need you.

Involvement

If your team and manager don’t see and hear your ideas and work, they will question your leadership and interest.

Hours

Putting in the time is the best predictor of success.

Understanding

You should know how to do things. But not everything.

Competency Scale

The table below summarizes the specifications-based grading for the course. You should read the details below for additional understanding.

Grade  Hours  Understanding  Involvement                               Impact
A      107    3 & 4          < 3 warnings & < 3.1 hours class missing  Active all & key > 2
B      98     3 & 3          < 9.1 hours class missing or write-up     Active most & key > 1
C      75     3 anytime      < 4 warnings                              Active often & key > 0
D      50
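
As a rough illustration only, the Hours column of the scale above can be read as a set of thresholds. The sketch below uses the hour values from the scale; the function name `hours_level` and the choice to treat each value as an inclusive minimum are assumptions made for the example, not course policy. It covers the Hours area alone and says nothing about Understanding, Involvement, Impact, or any negotiation.

```python
# Hour values are copied from the competency scale.
# Treating them as inclusive minimums is an assumption made for this sketch.
HOURS_THRESHOLDS = [("A", 107), ("B", 98), ("C", 75), ("D", 50)]


def hours_level(hours):
    """Return the highest letter whose hours threshold is met, or None below the D line."""
    for letter, minimum in HOURS_THRESHOLDS:
        if hours >= minimum:
            return letter
    return None


if __name__ == "__main__":
    # Print the hours-only level for a few hypothetical logged totals.
    for logged in (119, 107, 98.5, 50, 49):
        print(logged, hours_level(logged))
```

The list is ordered from the highest threshold down, so the first match is the best level the logged hours alone would support.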

A Details:

B Details:

C Details:

D Details:

Coding Challenges

The coding challenges will be graded on a four-point scale: 1) Submitted work, 2) Some code aligns with the challenge, 3) Strong performance with satisfactory code, 4) Near-flawless performance with clean and concise code.

Negotiating Competency Grade

If you feel you have greatly exceeded one of the competency areas, you can use that excess to negotiate a shortcoming in a different competency. Here are a few examples you could argue (these are example arguments and are not intended to signify a path to the grade requested).

I only got a satisfactory score on my final coding challenge, but I completed 119 hours and was a key contributor on 5 projects. As such, I request an A.

I was only recognized as a key contributor on one project. However, I worked 107 hours and stayed involved in all work during class. As such, I request a B.

I only worked 50 hours in this class. However, I got all 3s on my coding challenges and a 4 on the final coding challenge. In addition, I was a key contributor on 5 projects and never missed class. I request an A-.


  1. https://arxiv.org/ftp/arxiv/papers/1612/1612.07140.pdf