usethis::use_git_config(
user.name = "Your name",
user.email = "Email associated with your GitHub account"
)
Lab 0
Hello, World and STA 199!
This lab will walk you through setting up the course technology, which consists of the following:
R: The programming language you’ll learn in this course. It is very popular in statistics and data science;
RStudio: The piece of software (a.k.a. the integrated development environment, IDE) you’ll use to write R code in. So, R is the name of the programming language itself, and RStudio is a convenient interface;
Git: Version control system (like “Track Changes” features from Microsoft Word but more powerful);
GitHub: A web hosting service for the Git version control system that also allows for transparent collaboration between team members. GitHub is the home for your Git-based projects on the internet (like DropBox but much better).
Version control software like Git and GitHub is indispensable in modern data science practice, and learning how to use it is one of the things that distinguishes STA 199 from a similar course like STA 101. At a minimum, version control helps prevent your files from looking like this:
We’ve all been there, but it’s time to kick the habit.
1. Access R and RStudio
Instead of asking you to download any new programs onto your computer, we host everything pre-packaged for you in the cloud. You just have to login through your web browser, and you’re ready to go:
- Go to https://cmgr.oit.duke.edu/containers and login with your Duke NetID and Password.
- Click
STA198-199
to log into the Docker container. You should now see the RStudio environment.
Go to https://cmgr.oit.duke.edu/containers and under Reservations available click on reserve STA 198-199 to reserve a container for yourself.
A container is a self-contained instance of RStudio for you, and you alone. You will do all of your computing in your container.
Once you’ve reserved the container you’ll see that it will show up under My reservations.
To launch your container click on it under My reservations, then click Login, and then Start.1
Please double and triple check that you have reserved the STA 198-199
container and not some other. The various containers differ in the software they include, so if you reserve something else, you may be missing something we need.
2. Create a GitHub account
Go to https://github.com/ and walk through the steps for creating an account. You do not have to use your Duke email address, but I recommend doing so.2
You’ll need to choose a user name. I recommend reviewing the user name advice at https://happygitwithr.com/github-acct#username-advice before choosing a username.
If you already have a GitHub account, you do not need to create a new one for this course. Just log in to that account to make sure you still remember your username and password. If you are unsure of your login credentials, carefully follow GitHub’s instructions for recovering this information. If you accumulate too many failed login attempts, you will be locked out of your account for the day, which will make it difficult for you to complete the rest of this lab.
3. Set up your SSH key
You will authenticate GitHub using SSH (Secure Shell Protocol – it doesn’t really matter what this means for the purpose of this course). Below is an outline of the authentication steps; you are encouraged to follow along as your TA demonstrates the steps.
You only need to do this authentication process one time on a single system.
- Go back to your RStudio container and type
credentials::ssh_setup_github()
into your console. - R will ask “No SSH key found. Generate one now?” You should click 1 for yes.
- You will generate a key. R will then ask “Would you like to open a browser now?” You should click 1 for yes.
- You may be asked to provide your GitHub username and password to log into GitHub. After entering this information, you should paste the key in and give it a name. You might name it in a way that indicates where the key will be used, e.g.,
sta199
).
You can find more detailed instructions here if you’re interested.
4. Configure Git to introduce yourself
Next, you need to configure your git so that RStudio can communicate with GitHub. This requires two pieces of information: your name and email address.
To do so, you will use the use_git_config()
function from the usethis
package.
You’ll hear about 📦 packages a lot in the context of R – basically they’re how developers write functions and bundle them to distribute to the community (and more on this later too!).
Type the following lines of code in the console in RStudio filling in your name and the address associated with your GitHub account.
For example, mine would be
usethis::use_git_config(
user.name = "John Zito",
user.email = "johnczito@gmail.com"
)
I used my gmail because that is the one I used to create my GitHub account. You should also be using the email address you used to create your GitHub account, it’s ok if it isn’t your Duke email.
When you input the usethis::use_git_config(...)
command into the console and hit enter/return, it will appear as if nothing happened. To verify that everything worked, you can briefly switch over to the Terminal tab (should be to the right of Console) and type the commands git config user.name
and git config user.email
. If all is well, these will return whatever text you originally provided. If you notice a mistake or typo, you can just go back to the Console and rerun usethis::use_git_config(...)
with modified inputs.
The following video walks you through the steps outlined in the SSH key generation and Git configuration sections above.
5. Baby’s first repo
A GitHub repository (repo) is a collection of files hosted on GitHub. Repos are the main way we will distribute files to you during the semester. When you copy the files in a repo to your local computing environment (your container, in this case), that’s called “cloning”. So, let’s clone our first repo:
Go to the course organization at github.com/sta199-s25 organization on GitHub. Click on the repo hello-world.
Click on the green CODE button, select Use SSH (this might already be selected by default, and if it is, you’ll see the text Clone with SSH). Click on the clipboard icon to copy the repo URL.
In RStudio, go to File ➛ New Project ➛Version Control ➛ Git.
Copy and paste the URL of your assignment repo into the dialog box Repository URL. Again, please make sure to have SSH highlighted under Clone when you copy the address.
Click Create Project, and the files from your GitHub repo will be displayed in the Files pane in RStudio.
You will need to get used to these steps, because you’ll probably clone at least one new repo every week.
6. Hello STA 199!
Fill out the course “Getting to know you” survey on Canvas: https://canvas.duke.edu/courses/50057/quizzes/30406.
We will use the information collected in this survey for a variety of goals, from inviting you to the course GitHub organization (more on that later) to getting to know you as a person and your course goals and concerns.
Footnotes
Yes, it’s too many steps. I don’t know why! But it works, and you’ll get used to it. Trust me, it beats downloading and installing everything you need on your computers!↩︎
GitHub has some perks for students you can take advantage of later in the course or in your future work, and it helps to have a .edu address to get verified as a student.↩︎