1. Introduction

Version control systems (VCS) allow developers to maintain a record of how their code has changed over time. When used properly, a VCS can help a developer track down the exact point in time when a bug was introduced or fixed, easily undo changes, and collaborate with other developers.

There are many version control systems. Some of the more popular ones include CVS, Subversion, Mercurial, and Git. In recent years, Git has quickly become the most popular of the group [1]. You should watch this 20-minute video for a quick introduction to Git. (Warning: the video includes a picture of someone giving the middle finger as a rude gesture.)

You can host a public (open to the world) or private (open to just you or a few individuals) repository on GitHub. GitHub has many useful features beyond the standard Git functions.

2. Git Installation

Git Installation: Windows

To install Git on Windows, download and run the Git for Windows installer. Git for Windows is also known as msysgit or “Git Bash”, and it provides Git along with some other useful tools, such as the Bash shell. Yes, all those names are totally confusing, but you might encounter them elsewhere and I want you to be well-informed.

This method of installing Git for Windows leaves the Git executable in a conventional location, which will help you and other programs, e.g. RStudio, find it and use it. This also supports a transition to more expert use, because the “Git Bash” shell will be useful as you venture outside of R/RStudio.

  • When asked about “Adjusting your PATH environment”, make sure to select “Git from the command line and also from 3rd-party software”. Otherwise, we believe it is good to accept the defaults.
  • Note that RStudio for Windows prefers for Git to be installed below C:/Program Files and this appears to be the default. This implies, for example, that the Git executable on my Windows system is found at C:/Program Files/Git/bin/git.exe. Unless you have specific reasons not to, follow this convention.

If you follow these instructions, you can use the shell (also called the terminal) directly from within RStudio. We will walk you through how and when to use it in videos referenced later on this page. If you are extra ambitious, see this article for more information on how to use it and how to set up RStudio to use Git Bash rather than the Command Prompt.

Git Installation: Mac OS X

Mac OS X already includes the shell, so all you need to do is install Git.
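One common route, offered here as an assumption because this page does not name an installer, is Apple's Xcode Command Line Tools, which bundle Git. A minimal sketch to run in the Terminal app (the variable name mac_step is only for illustration):

```shell
# Install Apple's command line tools (which include Git) if they are missing
if [ "$(uname -s)" = "Darwin" ]; then
  if xcode-select -p >/dev/null 2>&1; then
    mac_step="command line tools already installed"
    git --version
  else
    mac_step="requesting command line tools install"
    xcode-select --install   # opens Apple's installer dialog
  fi
else
  mac_step="not macOS, nothing to do"
fi
echo "$mac_step"
```

On a Mac without the tools, a dialog should appear asking you to confirm the installation; once it finishes, git --version should print a version number.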

Git Installation: Linux

If Git is not already available on your machine you can try to install it via your distro’s package manager.

Debian/Ubuntu

sudo apt-get install git

Fedora/Red Hat Linux

sudo yum install git

3. Personalize Git

You only have to do this once per machine.

In order to track changes and attribute them to the correct user, we need to tell Git your name and email address.

Option 1: use usethis

The usethis package includes helpful functions for common setup and development operations in R. Install it by running the command

install.packages("usethis")

from the console in RStudio. Then run the following commands:

library(usethis)
use_git_config(user.name = "hathawayj", user.email = "hathawayj@byui.edu")

Replace hathawayj and hathawayj@byui.edu with your name and email address. Your name could be your GitHub username, or your actual first and last name. Your email address must be the email address associated with your GitHub account.

Option 2: use the shell

Open the shell on your computer. From there, type the following commands (replace the relevant parts with your own information):

  • git config --global user.name 'hathawayj'
    • This can be your full name, your username on GitHub, whatever you want. Each of your commits will be logged with this name, so make sure it is informative for others.
  • git config --global user.email 'hathawayj@byui.edu'
    • This must be the email address you used to register on GitHub.

You will not see any output from these commands. To ensure the changes were made, run git config --global --list.
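If you would rather check the two settings one at a time, the sketch below reads them back without changing anything. The loop and its variable names are illustrative and not part of Git itself:

```shell
# Read back the two settings Git uses to label your commits (read-only)
checked=0
for key in user.name user.email; do
  value=$(git config --global "$key" 2>/dev/null || true)
  if [ -n "$value" ]; then
    echo "$key is set to: $value"
  else
    echo "$key is NOT set yet"
  fi
  checked=$((checked + 1))
done
```

If either line reports that a setting is missing, repeat Option 1 or Option 2 above.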

Troubleshooting

RStudio can only act as an interface for Git if Git has been successfully installed AND RStudio can find it.

A basic test for successful installation of Git is to simply enter git in the shell. It will print a usage summary and a list of common commands to the screen, which is fine. However, if you get a complaint about git not being found, it means that installation was unsuccessful or that Git is not being found, i.e. it is not on your PATH.

If you are not sure where the git executable lives, try this in a shell:

  • which git (Mac, Linux)
  • where git (most versions of Windows)
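In a Bash-style shell (Git Bash, Mac, or Linux), the same check can be written so that it also reports the installed version. This is a small sketch, and the names git_path and git_status are made up for the example:

```shell
# Look for git on the PATH and report where it lives
if git_path=$(command -v git); then
  git_status="found"
  echo "Git found at: $git_path"
  git --version
else
  git_status="missing"
  echo "Git is not on your PATH; revisit the installation steps"
fi
```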

If Git appears to be installed and findable, launch RStudio and try again. If there is any doubt in your mind about whether you opened RStudio before or after installing Git, quit and re-launch RStudio.

From RStudio, go to Tools > Global Options > Git/SVN and make sure that the box Git executable points to the Git executable. It should read something like:

  • /usr/bin/git (Mac, Linux)
  • C:/Program Files/Git/bin/git.exe (Windows; some older installs use C:/Program Files (x86)/Git/bin/git.exe)

If you make any changes, restart RStudio and try the steps at the top of the page again.

Still not working? Try googling your problem, or speak with me or the TA.

4. Syncing GitHub and RStudio

Step 1: Connect RStudio to GitHub

Now that RStudio can find Git on your computer, we need to connect RStudio to GitHub online. You should have already signed up for a GitHub account.

Note that there have been recent changes in how RStudio authenticates for using GitHub, so some of the helpful blogs/resources online are now outdated. The instructions here are lifted from the rfortherestofus.com website.

I recommend (and explain below) connecting RStudio and GitHub by using your username and a Personal Access Token (PAT) for HTTPS operations. Alternatively, you could set up SSH keys.

1a. Get a PAT

To generate a personal access token, run the following code in your R console. It will take you to the appropriate page on the GitHub website, where you’ll give your token a name and copy it (don’t lose it because it will never appear again!). On that same page, I recommend setting the expiration option to “No expiration”, or choosing “custom” and setting it to something longer than the semester, so that you don’t have to go through this process again. Watch the 1-minute video demonstration.

install.packages("usethis") #ignore this line if you installed the package already
library(usethis) #ignore this line if you loaded the package already
create_github_token()

1b. Store your PAT in RStudio

Now that you’ve created a Personal Access Token, we need to store it so that RStudio can access it and know to connect to your GitHub account. Run the code below, and when prompted, enter your GitHub username and the Personal Access Token as your password (NOT your GitHub password). Once you’ve done all of this, you have connected RStudio to GitHub!

install.packages("gitcreds")
library(gitcreds)
gitcreds_set()

Step 2: Go to your GitHub repo

  • Go to GitHub.com and log in.
  • We have already created a repo for you to use in this class. If you do not see the repo when you log in, there may be a few reasons. The most common one is that you did not accept the invitations sent in 2 separate emails.
    • One email invited you to join the group, which allows you to see all the repos of students in the class (including yours).
    • A separate email invited you to accept write access for your repo.

These emails are often filtered to your junk or clutter email folders automatically, so be sure to check there for them. Also, be sure you click the “accept” link in the invite, not just a link to view the repos.

Step 3: Clone GitHub repo with RStudio

In RStudio, start a new Project: File > New Project > Version Control > Git.

  • In the “repository URL” field, paste the URL of your new GitHub repository. You can find this URL by clicking the big green button at the top of your repository. The URL will be something like https://github.com/hathawayj/myrepo.git. See the picture below. Click the clipboard icon to copy the URL.

  • If you do NOT see an option to get the Project from Version Control, make sure RStudio can find Git (see above).
  • Decide where to store the local directory for the Project. Don’t scatter everything around your computer - have a central location, or some meaningful structure.
  • I suggest you check “Open in new session”, as that’s what you’ll usually do in real life.
  • Click “Create Project” to finish the process of downloading all the files and folders from the repository to your local machine. You have now created all of these things:
    • a directory on your computer
    • a Git repository, linked to a remote GitHub repository
    • an RStudio Project

Whenever possible, this will be the preferred route for setting up your R projects.
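For reference, the terminal counterpart of this RStudio dialog is git clone. The sketch below runs offline: a local bare repository (myrepo.git, a made-up stand-in) plays the role of GitHub, and on a real project you would pass your repository's https URL instead. The names sandbox, seed, and set_demo_identity are hypothetical helpers for the demo. Note that git clone creates the directory and the Git repository, but not the RStudio Project file.

```shell
# --- Sandbox setup: build a tiny stand-in for the GitHub repository ---
sandbox=$(mktemp -d) && cd "$sandbox"
set_demo_identity() {   # sandbox-only identity so the demo commit works
  git config user.name 'Demo User'
  git config user.email 'demo@example.com'
}
git init --quiet seed
( cd seed && set_demo_identity && echo "# myrepo" > README.md \
    && git add . && git commit --quiet -m "Initial commit" )
git clone --quiet --bare seed myrepo.git

# --- The actual step: clone, then confirm the link to the remote ---
git clone --quiet myrepo.git myrepo
cd myrepo
git remote -v
origin_url=$(git config --get remote.origin.url)
echo "this clone is linked to: $origin_url"
```

The two lines printed by git remote -v (fetch and push) show where git pull and git push will talk to.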

Note: This is probably the simplest way to connect RStudio and GitHub. However, if you would like to connect GitHub to a previously created RStudio project, you can follow this guide.

Step 4: Pull, add, commit and push to GitHub

Do this every time you finish a valuable chunk of work, at least once a day.

You can watch this video for a step-by-step demonstration of the steps described below.

To test it out, look in RStudio’s file browser pane for the README.md file in the top-level directory of your project. Double-click it to open it. Modify the README.md file by adding the line “This is a line from RStudio”. Save your changes. Now sync your local project with the online GitHub repo by following these 4 steps in the Terminal. (The Terminal tab is usually located next to your Console tab in RStudio.)

  • At the prompt in the terminal, type git pull and hit enter. This will bring any changes that others may have pushed to your GitHub repo down to your local machine. This is particularly helpful if you are working as a team on a larger project, or if you are accessing the GitHub repo from multiple computers (e.g. your work computer and your home computer). You may be asked to resolve conflicts if your local version conflicts with what is found in the repository.
  • Next type git add . and hit enter. The period stages every new or changed file in the current directory and its subfolders, i.e. the files listed in the Git pane. In the uncommon occurrence that you only want to upload certain files, you can specify them by name instead.
  • Next type git commit -m "put a customized message here" and hit enter. This batches the staged changes into a single commit that Git tracks. The -m stands for message. The customized message is not optional; it should describe the nature of the changes you have made.
  • Next type git push and hit enter. This officially pushes your commits from your local machine to the GitHub repository.
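The four steps above can be tried safely in a throwaway sandbox before you touch your class repo. In this sketch a local bare repository (remote.git) stands in for GitHub, and the names sandbox, seed, myrepo, and set_demo_identity are assumptions made for the demo; on your real project only the four commands in the middle apply.

```shell
# --- Sandbox setup (stand-in for GitHub; skip on a real project) ---
sandbox=$(mktemp -d) && cd "$sandbox"
set_demo_identity() {   # sandbox-only identity so commits work without global settings
  git config user.name 'Demo User'
  git config user.email 'demo@example.com'
}
git init --quiet seed
( cd seed && set_demo_identity && echo "# myrepo" > README.md \
    && git add . && git commit --quiet -m "Initial commit" )
git clone --quiet --bare seed remote.git
git clone --quiet remote.git myrepo
cd myrepo && set_demo_identity

# --- Edit a file, then run the four steps ---
echo "This is a line from RStudio" >> README.md
git pull --quiet                                   # 1. bring down others' changes
git add .                                          # 2. stage everything that changed
git commit --quiet -m "Add a line to the README"   # 3. record the change
git push --quiet                                   # 4. send it to the remote

# Check that the stand-in remote received the commit
remote_subject=$(git --git-dir="$sandbox/remote.git" log -1 --format=%s)
echo "latest commit on the remote: $remote_subject"
```

The last line should report the commit message you just wrote, which is the same thing you would see in the commit history on GitHub.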

Note: you cannot copy and paste into the terminal, so you will have to type the commands out. There is also a point-and-click method for this in RStudio, but it is slow and clunky. If you want to use it instead of the terminal commands, you can try watching this video.

Caution: Before you push your changes to GitHub, you should first pull from GitHub. Why? If you have made changes to the repo in the browser or from another machine, or (one day) a collaborator has pushed, you will be happier if you pull those changes in before you attempt to push.
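To see why, here is a sandbox sketch of what happens when you forget. Two clones of the same stand-in remote (named laptop and desktop here, both hypothetical) each make a commit; the second push is refused until the missing commit has been pulled. The --no-rebase and --no-edit flags are an assumption for the demo: they ask Git to merge with the default message, which some newer Git versions want spelled out when both sides have new commits.

```shell
# --- Sandbox setup: remote.git stands in for GitHub, with two clones of it ---
sandbox=$(mktemp -d) && cd "$sandbox"
set_demo_identity() {   # sandbox-only identity so commits work without global settings
  git config user.name 'Demo User'
  git config user.email 'demo@example.com'
}
git init --quiet seed
( cd seed && set_demo_identity && echo "# myrepo" > README.md \
    && git add . && git commit --quiet -m "Initial commit" )
git clone --quiet --bare seed remote.git
git clone --quiet remote.git laptop
git clone --quiet remote.git desktop

# The "desktop" clone pushes a change first
( cd desktop && set_demo_identity && echo "notes" > notes.txt \
    && git add . && git commit --quiet -m "Add notes" && git push --quiet )

# The "laptop" clone commits without pulling, so its push is refused
cd laptop && set_demo_identity
echo "This is a line from RStudio" >> README.md
git add . && git commit --quiet -m "Update README"
if git push --quiet 2>/dev/null; then first_push=accepted; else first_push=rejected; fi
echo "push before pulling: $first_push"

# Pull the missing commit (merged with the default message), then push again
git pull --quiet --no-rebase --no-edit
if git push --quiet 2>/dev/null; then second_push=accepted; else second_push=rejected; fi
echo "push after pulling: $second_push"
```

Pulling first, as in the four steps above, avoids the refused push altogether.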

Here is an image that illustrates the workflow commands that were just described.

Please take care to save the homework in the right folder and push it to GitHub in order for it to be graded.