class: inverse, center, middle

# Introduction to CSE 451

## Big Data Programming & Analytics

#### Because we need at least one class with the word _big_ in the title...

![](../images/CSE451_Hex.png)

.large[J. Hatthaway | BYU-I | Winter 2021]

---
layout: true
background-image: url(../images/CSE451_Hex.png)
background-position: top right
background-size: 7.5%

---
class: left, middle, inverse
name: first_slide

# Welcome to CSE 451: Big Data Programming and Analytics

.left-column[
![](../images/CSE451_Hex.png)
]

.right-column[
### By the end of the semester, each student will be able to:

#### 1. Integrate and extend previously learned data science tools to analyze remote and distributed data in business contexts.
#### 2. Explore, interpret, conceptualize, and validate assumptions of data at scale.
#### 3. Understand the differences and benefits of current industry technologies for big data storage and analysis.
#### 4. Leverage parallel processing for analysis.
]

???

Presenter Notes go here

---
class: top, left, inverse
background-image: url(https://images.pexels.com/photos/924824/pexels-photo-924824.jpeg?auto=compress&cs=tinysrgb&dpr=3&h=750&w=1260)
background-size: cover

# About *me*

**From 2015 - 2021:** I have spent the last 5 years here at BYU-I and the last 4 years working on the data science degree. Over the last year we have built out the degree with four new classes. CSE 451 is the final course in the development.

- [Math 119: Applied Calculus for Data Analytics](https://byuistats.github.io/M119/)
- [CSE 150: Data Intuition and Insight](https://byuistats.github.io/CSE150/)
- [CSE 250: Data Science Programming](https://byuistats.github.io/CSE250-Course/)
- [CSE 451: Big Data Programming and Analytics](https://byuistats.github.io/CSE451-Course/)

We have also built out an analytics arm at the [Research and Business Development Center](http://www.rbdcenter.org/data-analytics/).

--

**From 2005 - 2015:** I worked at PNNL on big data applications using [Hadoop](https://hadoop.apache.org/) in the [climate science space](https://www.pnnl.gov/science/highlights/highlight.asp?id=1609).

<img src="https://upload.wikimedia.org/wikipedia/en/thumb/1/17/Pacific_Northwest_National_Laboratory_logo.svg/1920px-Pacific_Northwest_National_Laboratory_logo.svg.png" width="300" height="150" align="right" />

---
class: center, middle

# Spark for Data Science

**We will focus on [Spark](https://spark.apache.org/), [SparkSQL](https://spark.apache.org/sql/), and [MLlib](https://spark.apache.org/mllib/) in this class within [Docker](https://www.docker.com/) and the [Databricks PaaS](https://databricks.com/product/unified-data-analytics-platform).**

.pull-left[
![](https://spark.apache.org/images/spark-logo-trademark.png)
]

.pull-right[
![](https://upload.wikimedia.org/wikipedia/commons/6/63/Databricks_Logo.png)
]

---
class: left, middle

# [Syllabus and Course Goals](../syllabus.html) in Data Science

## Design Objectives:

_Your goal is to communicate your team's understanding of the course to me in 3 minutes. You have 10 minutes to build the slides._

### 1. Build a three-minute slide presentation on the CSE 451 course goals.
### 2. Make sure to document how the course will be graded.
### 3. Create a diagram that maps the flow of the class over the semester.
### 4. Propose the next thing we should learn.
### 5. Your final slide should list questions.

---
class: top, left, inverse
background-image: url(https://images.pexels.com/photos/207662/pexels-photo-207662.jpeg?cs=srgb&dl=pexels-pixabay-207662.jpg&fm=jpg)
background-size: cover

# Our next topic discussion?

---
class: top, left, inverse
background-image: url(https://images.pexels.com/photos/207662/pexels-photo-207662.jpeg?cs=srgb&dl=pexels-pixabay-207662.jpg&fm=jpg)
background-size: cover

# Our next topic discussion?

.pull-right[
##### __Which team wants to build the introduction and activity?__
]

---
class: left, middle

# [Design Thinking](../thinking.html) in Data Science

## Design Objectives:

_Explain what design thinking is and how it can be used with data science._

#### 1. Build a five-minute slide presentation on design thinking and how we could use it in our course.
#### 2. Make sure to show the process and pick which part of the process your team feels is the most crucial.
#### 3. Propose how the design thinking process can fit with our class and data science work.
#### 4. Your final slide should list questions.

<!-- https://arm.rbind.io/slides/xaringan.html -->
<!-- https://arm.rbind.io/slides/xaringan.html#90 -->
<!-- A useful workflow for presenting; if you have two screens, turn off display mirroring then press c to clone: -->
<!-- Press p for presenter mode on laptop -->
<!-- Wish you had used ??? to add presenter notes -->