Lecture 1
Duke University
STA 199 Spring 2025
2025-01-14
If you have not yet completed the Getting to Know You survey, please do so ASAP!
If you have not yet accepted the invite to join the course GitHub Organization, please do so pronto!
Make your appointments in the Testing Center now!
Any questions about the syllabus?
Course operation
Doing data science
By the end of the course, you will be able to…
What does it mean for a data analysis to be “reproducible”?
Short-term goals:
Long-term goals:
Packages: Fundamental units of reproducible R code, including reusable R functions, the documentation that describes how to use them, and sample data
As of 27 August 2024, there are 21,168 R packages available on CRAN (the Comprehensive R Archive Network)
We’re going to work with a small (but important) subset of these!
Option 1:
Sit back and enjoy the show!
Option 2:
Go to your container and launch RStudio.
install.packages(), once per system:
Note
We already pre-installed many of the packages you’ll need for this course, so you might go the whole semester without needing to run install.packages()!
library(), once per session:
If data analysis was cooking…
RStudio is your kitchen. It comes with a fridge, a stove, a sink, etc., pre-installed;
Installing a package would be like buying more appliances at the store: mixer, blender, toaster, Instant Pot, air fryer;
Loading a package would be like taking these things out of the cupboard;
Your containers are like kitchens where we have already bought all of the extra appliances for you. In other words, “batteries included.”
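In code, the "buy once, take out every time" distinction might look like the sketch below. The package name tidyverse is an assumption chosen for illustration, as is the CRAN mirror URL; swap in whichever package you actually need. In the course containers the install step should be skipped, because the package is expected to be there already.

```r
# Assumed example package; replace with the package you need.
pkg <- "tidyverse"

# "Buying the appliance": install once per system.
# The check skips installation when the package is already present,
# which should be the case in the pre-installed course containers.
if (!requireNamespace(pkg, quietly = TRUE)) {
  # The mirror URL is an assumption; any CRAN mirror works.
  install.packages(pkg, repos = "https://cloud.r-project.org")
}

# "Taking it out of the cupboard": load once per R session.
# character.only = TRUE is needed because the name is stored in a variable.
library(pkg, character.only = TRUE)
```

When typing at the console you would normally write the shorter forms install.packages("tidyverse") and library(tidyverse); the variable is used here only so the package name appears in one place.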
aka the package you’ll hear about the most…
Object documentation can be accessed with ?
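A minimal sketch of looking up documentation, using the base R function mean and the built-in dataset mtcars as stand-in examples (neither is named in these slides):

```r
# Open the help page for a function by prefixing its name with ?
?mean

# help() is the function-call form of the same lookup.
help("mean")

# Datasets that ship with R are documented the same way.
?mtcars
```

In RStudio, the page should open in the Help pane.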
GitHub is the home for your Git-based projects on the internet – like Dropbox but much, much better
We will use GitHub as a platform for web hosting and collaboration (and as our course management system!)
with human readable messages
Option 1:
Sit back and enjoy the show!
Note
You’ll need to stick to this option if you haven’t yet accepted your GitHub invite and don’t have a repo created for you.
Option 2:
Go to the course GitHub organization and clone the ae-your_github_name repo to your container.
Find your application repo, which will always be named using the convention assignment_title-your_github_name
Click on the green “Code” button, make sure SSH is selected, copy the repo URL
yes in the pop-up dialogue
Never received GitHub invite \(\rightarrow\) Fill out the “Getting to Know You” survey
Never accepted GitHub invite \(\rightarrow\) Look for it in your email and accept it
Cloning repo fails \(\rightarrow\) Review/redo Lab 0 steps for setting up SSH key
Still no luck? Visit OH or post on Ed.
Option 1:
Sit back and enjoy the show!
Note
If you chose (or had to choose) this option for the previous tour, or if you couldn’t clone your repo for any reason, you’ll need to stick to this option.
Option 2:
Go to RStudio and open the document ae-01-meet-the-penguins.qmd.
Once we made changes to our Quarto document, we
went to the Git pane in RStudio
staged our changes by clicking the checkboxes next to the relevant files
committed our changes with an informative commit message
pushed our changes to our application exercise repos
confirmed on GitHub that we could see our changes pushed from RStudio