Subset the chicago_air dataset to only include records where the temperature is below 30 degrees and the pressure is above 1000 hPa. Use the filter() function from the dplyr package.
Solution: Load the dplyr package using library() and use logical expressions to get records where temp is less than 30 and pressure is greater than 1000.
library(dplyr)
cold_high_pressure <- filter(chicago_air, temp < 30, pressure > 1000)
cold_high_pressure

Create a new column in the chicago_air data frame called ozone_category that categorizes ozone levels into "Low", "Moderate", and "High". Use the mutate() function from the dplyr package.
Solution: Use mutate() to create a new column ozone_category based on conditional statements for ozone values.
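As a small illustration of how conditional categories are evaluated, here is a minimal sketch on made-up values (the vector and the "Unknown" label are assumptions for the example, not part of the lesson data). case_when() checks its conditions from top to bottom and uses the first one that matches, so later conditions do not need to repeat the earlier lower bounds.

```r
library(dplyr)

# Made-up readings on the same scale as the thresholds, plus a missing value
ozone <- c(0.010, 0.045, 0.072, NA)

# Conditions are checked in order; the first match wins
category <- case_when(
  is.na(ozone)  ~ "Unknown",   # hypothetical label for missing readings
  ozone < 0.030 ~ "Low",
  ozone < 0.060 ~ "Moderate",
  TRUE          ~ "High"
)
category
```

Without the is.na() line, a missing reading would simply get NA as its category.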
library(dplyr)
chicago_air <- mutate(chicago_air,
  ozone_category = case_when(
    ozone < 0.030 ~ "Low",
    ozone >= 0.030 & ozone < 0.060 ~ "Moderate",
    ozone >= 0.060 ~ "High"
  )
)
head(chicago_air)

Sort the chicago_air data frame by temperature in ascending order and then by pressure in descending order. Use the arrange() function from the dplyr package.
Solution: Use arrange() with temp and desc(pressure) to sort the data frame.
library(dplyr)
sorted_air <- arrange(chicago_air, temp, desc(pressure))
head(sorted_air)

Use the apply() function to create a vector of the standard deviation values from the numeric columns in the chicago_air data frame.
Solution: Subset the chicago_air data frame to the numeric columns and use the apply() function with sd.
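The reason for subsetting first is that apply() converts a data frame to a matrix, and a single character column would turn every value into text. A minimal sketch on a made-up data frame (the names toy, a, and b are illustrative assumptions) shows the column-wise call and an equivalent sapply() call:

```r
# Hypothetical toy data; column names are illustrative only
toy <- data.frame(a = c(1, 2, 3, NA), b = c(2, 4, 6, 8))

# MARGIN = 2 applies the function to each column; extra arguments
# such as na.rm are passed through to sd()
col_sd <- apply(toy, MARGIN = 2, FUN = sd, na.rm = TRUE)

# sapply() iterates over the columns of a data frame directly
col_sd_alt <- sapply(toy, sd, na.rm = TRUE)

all.equal(col_sd, col_sd_alt)
```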
chicago_numeric <- chicago_air[, c("ozone", "temp", "pressure")]
sd_values <- apply(chicago_numeric, MARGIN = 2, FUN = sd, na.rm = TRUE)
sd_values

Write a function called temp_range that takes a data frame and returns the difference between the maximum and minimum temperature values.
Solution: Define the function temp_range and use max() and min() to calculate the range of temperatures.
temp_range <- function(data) {
  max(data$temp, na.rm = TRUE) - min(data$temp, na.rm = TRUE)
}
temp_range(chicago_air)

Create a function named subset_by_weekday that takes a data frame and a weekday (as an integer) as arguments and returns a subset of the data frame for that weekday.
Solution: Define the function subset_by_weekday to filter the data frame based on the weekday column.
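One thing to watch for in this exercise is that the argument and the column share the name weekday. Inside filter(), a bare name is looked up in the data frame first, so the column wins. The sketch below uses a made-up data frame (toy and its values are assumptions for illustration) to show the difference, using dplyr's .data and .env pronouns to say which one is meant:

```r
library(dplyr)

# Hypothetical toy data with a weekday column
toy <- data.frame(weekday = c(1, 2, 3, 3), temp = c(20, 25, 30, 35))

weekday <- 3

# Both sides refer to the column here, so every row passes the test
all_rows <- filter(toy, weekday == weekday)

# .data$ is the column; .env$ is the variable defined outside the data frame
only_three <- filter(toy, .data$weekday == .env$weekday)

nrow(all_rows)    # 4
nrow(only_three)  # 2
```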
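subset_by_weekday <- function(data, weekday) {
  # The argument shares its name with the column, so disambiguate:
  # .data$weekday is the column, .env$weekday is the function argument.
  filter(data, .data$weekday == .env$weekday)
}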
subset_by_weekday(chicago_air, 3)

Use the dplyr package to create a new data frame that groups the chicago_air data by month and calculates the average temperature for each month.
Solution: Use group_by() and summarize() to group by month and calculate the average temp.
library(dplyr)
monthly_avg_temp <- chicago_air %>%
  group_by(month) %>%
  summarize(avg_temp = mean(temp, na.rm = TRUE))
monthly_avg_temp

Write a function called convert_date that takes a data frame and converts the date column from a character to a Date object.
Solution: Define the function convert_date and use as.Date() to convert the date column.
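As a quick illustration of what as.Date() expects, here is a minimal sketch on made-up strings (the toy data frame and the date values are assumptions, not lesson data). Strings in year-month-day order convert without further arguments, while other layouts need an explicit format:

```r
# Hypothetical toy data; date strings in year-month-day order
toy <- data.frame(date = c("2013-01-01", "2013-01-02"),
                  stringsAsFactors = FALSE)
toy$date <- as.Date(toy$date)
class(toy$date)  # "Date"

# A month/day/year string needs a format specification
us_style <- as.Date("01/02/2013", format = "%m/%d/%Y")
us_style
```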
convert_date <- function(data) {
  data$date <- as.Date(data$date)
  return(data)
}
chicago_air <- convert_date(chicago_air)
str(chicago_air)

Use a for() loop to create a vector of average ozone levels for each month in the chicago_air data frame.
Solution: Loop through each month, filter the data, and calculate the mean ozone level.
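The loop pattern described above can be sketched on a made-up data frame (toy and its values are assumptions for illustration). This version uses base R subsetting and a preallocated result vector, then compares the result with tapply(), which computes the same per-group means without an explicit loop:

```r
# Hypothetical toy data: two months with made-up ozone values
toy <- data.frame(month = c(1, 1, 2, 2),
                  ozone = c(0.02, 0.04, 0.05, NA))

months <- sort(unique(toy$month))
avg <- numeric(length(months))   # preallocate the result vector

for (i in seq_along(months)) {
  in_month <- toy$month == months[i]                  # rows for this month
  avg[i] <- mean(toy$ozone[in_month], na.rm = TRUE)   # mean, ignoring NA
}
names(avg) <- months
avg

# Same per-month means without a loop
avg_tapply <- tapply(toy$ozone, toy$month, mean, na.rm = TRUE)
```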
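average_ozone <- numeric(12)
# Name the loop variable m, not month: inside filter(), month == month
# would compare the month column with itself and keep every row.
for (m in 1:12) {
  monthly_data <- filter(chicago_air, month == m)
  average_ozone[m] <- mean(monthly_data$ozone, na.rm = TRUE)
}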
average_ozone

Combine the warm and cool data frames from the lesson using the rbind() function instead of bind_rows().
Solution: Use rbind() to combine warm and cool data frames.
recombined_rbind <- rbind(warm, cool)
nrow(recombined_rbind) == nrow(chicago_air)
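For context on how the two functions differ, here is a minimal sketch on made-up data frames (a, b, and c_extra are hypothetical names, not objects from the lesson). When the column names match, both give the same rows; when a column is missing, bind_rows() fills it with NA while rbind() stops with an error:

```r
library(dplyr)

# Hypothetical toy data frames; names and values are illustrative only
a <- data.frame(temp = c(70, 75), ozone = c(0.03, 0.04))
b <- data.frame(ozone = 0.05, temp = 50)   # same columns, different order
c_extra <- data.frame(temp = 40)           # no ozone column

# With matching column names, both functions line columns up by name
same <- rbind(a, b)
identical(same$temp, bind_rows(a, b)$temp)

# bind_rows() fills the missing column with NA
filled <- bind_rows(a, c_extra)
filled$ozone

# rbind() raises an error when the columns differ; catch it for the demo
rbind_result <- tryCatch(rbind(a, c_extra), error = function(e) "error")
rbind_result
```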