1  Technology Set-up

The readings introduce the work that a data analyst or data scientist does and prepare you for this task. There are many readings for this task, but most of them are short.

  1. Being a better online reader
  2. Introduction: R for Data Science (2nd ed)
  3. Chapter 2 Workflow: Basics: R for Data Science (2nd ed)
  4. Lesson 3 R and RStudio basics, skip section 3.3
  5. Modern Dive: Chapter 1 Getting Started with Data in R
  6. Chapter 6.1 Workflow: Scripts: R for Data Science (2nd ed)
  7. Optional: If you are new to R and want more practice, I recommend the ‘swirl’ package. It is an interactive tutorial. Follow the instructions below:
    • Run the following code, then follow the prompts. Choose option 1: R Programming. Of the 15 modules presented, I recommend completing 1, 3, 4, 5, 7, 8, and 12. (Modules 7 and 8 are optional; we will revisit them later in the semester.) Say “no” when asked whether you want credit for completing a module.

# Install swirl (only needed once), load it, then start the tutorial.
install.packages("swirl")

library(swirl)

swirl()

  1. Read the syllabus and ensure you understand the course procedures and grading.

  2. Download and install the latest versions of R and RStudio.

  3. If necessary, adjust your settings in RStudio to use the code diagnostics. Feel free to check every diagnostic for now; as the semester goes on, uncheck any that you no longer find relevant.

  4. IMPORTANT: Create a GitHub account.

  5. IMPORTANT: Post your GitHub username on the GitHub Usernames Google Sheet. This automatically ensures that a GitHub repository is created for you and that you are added to our class organization, so that you can see your classmates’ repositories.

    • Normally you would create your own repository, but because we want your class repository saved in a special organization, we are creating it for you. You will use this private repository to turn in assignments.

After roughly 30 minutes (during normal working hours), you should receive two emails (check your spam/clutter folder). Accept the invitation in each one. If you do not see the two emails, you can reach the invitations this way:

  • To join our class organization, open the following URL and accept the invitation: https://github.com/orgs/BYUI335

  • To reach the invitation to your personal repo, build the URL yourself. Use the address below as a template, replacing the REPO_NAME portion with the actual name of your repo:

    https://github.com/BYUI335/REPO_NAME/invitations

    • For example, my repo for the Winter 2023 semester (my name is David Palmer) would be:

      https://github.com/BYUI335/DS350_WI23_Palmer_Davi/invitations

    • After opening the URL with your repo name, you will see an option to join the repo. Click ‘Accept Invitation’.

If you are already a member of the class organization and have already accepted access to your repo, these URLs will simply take you to the main page of the organization and of the repo, respectively.
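As a small illustration, the repo invitation address described above can also be assembled in R. This is only a sketch: `org_url`, `repo_name`, and `invitation_url` are illustrative variable names, and the repo name shown is the example from this page, so substitute your own.

```r
# Organization address, taken from the example on this page.
org_url <- "https://github.com/BYUI335"

# Replace this with the actual name of your repo (example value shown).
repo_name <- "DS350_WI23_Palmer_Davi"

# Join the pieces with "/" to form the invitation address.
invitation_url <- paste(org_url, repo_name, "invitations", sep = "/")
print(invitation_url)

# Optional: open the address in your default browser.
# browseURL(invitation_url)
```

Uncommenting the last line should open the invitation page directly, provided you are signed in to GitHub.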

  1. Join the Data Science Society workspace in Slack. This is where we will have class conversations and ask questions. Here is the link to join: https://join.slack.com/t/byuidss/signup. The sub-bullets below give more detailed instructions for setting up and using Slack.

    • Create a professional username, something you could share with an employer
    • In addition to the channels you are automatically added to, be sure to join the following two channels:
      • Our class channel. It will look something like #ds350_wi23_palmer or #ds350_sp23_ball, where the last part of the channel name is the last name of the instructor. Find the appropriate channel and add yourself to it. Once you are in the class channel, introduce yourself: open the introduction thread, click reply, and type a brief introduction.
      • #tutoring_lab. This is where you can ask questions about code related to specific tasks or case studies.
      • Browse through the other channels and join as many as interest you. #internships-and-jobs is a particularly active channel. This community will be a great resource, even if you are not a Data Science major.
    • We highly recommend installing Slack as an app on your PC and/or phone. The more comfortable you become with Slack, the easier it will be to collaborate with others throughout your professional career.
  2. Use a professional picture as your profile photo on GitHub, LinkedIn, and Slack. Go to the BYU-Idaho LinkedIn Photobooth if you don’t have a professional picture.

  3. Plan to attend Data Science Society on the second Wednesday of the semester. Details will be posted on Slack, usually in the #general channel. (Attendance is not taken.)

  4. Download the R script first-R-script.R and open it in RStudio. Run each line of code, one at a time (figure out the keyboard shortcut to do this). Note any questions you want to ask your classmates or instructor.

    • You do not need to understand what each line of code does at this point in the semester. Still, briefly guess what each line does before running it, and don’t worry if you don’t understand all (or any) of the code.
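If you would like to practice the guess-then-run habit first, here is a minimal sketch. These lines are hypothetical practice lines and are not taken from first-R-script.R; the variable names `x` and `y` are made up for illustration. Run one line at a time and compare each result with your guess.

```r
# Hypothetical practice lines; NOT the contents of first-R-script.R.

# Guess: what does x hold after this line runs?
x <- c(2, 4, 6)

# Guess: what value will be printed?
mean(x)

# Guess: will y be a single number or three numbers?
y <- x * 10
y
```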

Submit

Completion of this assignment is determined by your GitHub username appearing on the Google Sheet and by your presence on Slack.