DA 101, Dr. Ladd
Week 2
It’s the code you write.
somedata <- read.csv("yourfile.csv")
When you “Open R”, you’re really opening RStudio.
You set the working directory to tell R which folder (directory) on your computer to read files from and save files to. You can set it in code:
setwd('path/to/working_directory')
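A quick way to confirm the working directory is what you expect is to ask R for it. This is a minimal sketch using base R only; "yourfile.csv" is just the placeholder file name used earlier, not a file that is guaranteed to exist.

```r
# Show the folder R is currently using as its working directory.
current_dir <- getwd()
print(current_dir)

# List the files R can see in that folder, e.g. to confirm a CSV is there.
print(list.files(current_dir))

# file.exists() answers TRUE or FALSE for a given file name.
# "yourfile.csv" is a placeholder name, so this may well be FALSE.
file.exists("yourfile.csv")
```

If the file you want to read is not listed, the working directory is probably pointing at the wrong folder.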
myvar <- 5
myvar
Create a variable called “newVar” that is equal to the value of five plus seven. Then display your variable to see what its value is.
i_use_snake_case
otherPeopleUseCamelCase
some.people.use.periods
And_aFew.People_RENOUNCEconvention
Comments in R begin with a # symbol.
# This variable contains a continuous value
some_variable <- 2.5
"five"
5
5.0
stringvar <- "five"
typeof(stringvar)
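As a small sketch of what typeof() reports for values like the ones above: the results in the comments are what base R returns. The 5L and TRUE lines are extra illustrations that are not on the slide.

```r
# Text in quotes is stored as a character value.
typeof("five")      # "character"

# Plain numbers are stored as doubles, with or without a decimal point.
typeof(5)           # "double"
typeof(5.0)         # "double"

# An L suffix asks R for an integer instead (extra illustration).
typeof(5L)          # "integer"

# TRUE and FALSE are logical values (extra illustration).
typeof(TRUE)        # "logical"

# typeof() works the same way on a variable.
stringvar <- "five"
typeof(stringvar)   # "character"
```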
Scripts are “.r” files, and you can create them by clicking the + icon at the top left of the RStudio window and selecting “R Script”.
Always comment your code so you can remember things when you come back later!
GarlicMustardData <- read.csv("GarlicMustardData.csv")
# Browse a data frame with the View function (note the capital V).
View(GarlicMustardData)
# Get summary statistics for every column in a data frame.
summary(GarlicMustardData)
# Access a specific column of a data frame.
GarlicMustardData$AvgAdultHeight
Import the “GarlicMustardData.csv” file. Get a summary for all of the data, and then get a summary for just the AvgNLeaves column. Bonus: find the type of that column.
n.b. Remember to make sure your working directory is set to the right place!
R has many built-in functions.
# Some functions give a number result
mean(GarlicMustardData$AvgNFruits, na.rm=TRUE)
# Other functions graph things
plot(GarlicMustardData$AvgAdultHeight, GarlicMustardData$bio12)
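The na.rm=TRUE argument in the mean() call above tells R to drop missing values (NA) before calculating. A minimal sketch of the difference, using a made-up vector called heights rather than the real Garlic Mustard data:

```r
# A made-up vector with one missing value (not real data).
heights <- c(10, NA, 30)

# By default, any NA in the input makes the result NA.
mean(heights)                 # NA

# With na.rm = TRUE, the NA is removed first: (10 + 30) / 2
mean(heights, na.rm = TRUE)   # 20
```

Other summary functions such as median(), sum(), min(), and max() accept the same na.rm argument.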
Functions can do just about anything: calculate values, create graphs, transform data, etc.
myfunction <- function(arg1, arg2, ... ) {
  statements
  return(object)
}
A real example:
get_last_value <- function(some_list) {
  # Use the length of the input as the index of its last element.
  return(some_list[length(some_list)])
}
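To see the function in action, here is a short usage sketch. The function is repeated so the example runs on its own, and the input vectors are made up for illustration.

```r
# Same function as above, repeated so this example is self-contained.
get_last_value <- function(some_list) {
  return(some_list[length(some_list)])
}

# Made-up numeric vector: the last element is 9.25.
heights <- c(12.5, 7, 9.25)
get_last_value(heights)      # 9.25

# It works on character vectors too.
letters_abc <- c("a", "b", "c")
get_last_value(letters_abc)  # "c"
```

One caveat: because the function uses single brackets, passing a true R list (made with list()) gives back a one-element list rather than the bare value. Double brackets, [[ ]], would extract the value itself.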
# Install a package only once
install.packages("packagename")
# Load the package every time you run your code
library(packagename)
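One possible way to make "install only once" automatic is to check whether a package is already present before installing it. This is only a sketch: ensure_installed is a hypothetical helper name, not a built-in function, and "stats" is used as the example because it ships with R.

```r
# Hypothetical helper: install a package only if R cannot already find it.
ensure_installed <- function(pkg) {
  # requireNamespace() returns TRUE when the package is available.
  already_there <- requireNamespace(pkg, quietly = TRUE)
  if (!already_there) {
    install.packages(pkg)
  }
  # Quietly report whether the package was already installed.
  invisible(already_there)
}

# "stats" comes with R, so nothing is downloaded here.
ensure_installed("stats")

# Loading still happens every time, with library().
library(stats)
```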
Install and load the tidyverse package. Then install and load the swirl package. Can you tell what these two packages do?
Load the mpg dataset, view it, and get a summary.
Pick a column of mpg and get its mean and median.
Plot two mpg variables (columns).