School Immunizations

The goal of this project is to format county-level data on school vaccinations obtained from the states. For a worked example, see the Arizona (AZ) project.

Getting started

Install the dcf package:

install.packages("remotes")
remotes::install_github("dissc-yale/dcf")

Working on the project

Go to ./data/ and find the state you are working on. Open the folder and click the .rproj file to open the project.
Save any raw data in the /raw subfolder.
Open the ingest.R script. This is where you put the code that cleans the data. Import all raw files and process them so that the output data has the following columns:

geography: FIPS code (5 digits for a county, 2 digits for the state)
geography_name: name of the county, or 'Total' for the statewide total
time: date in the form YYYY-09-01, where YYYY is the 4-digit year in which the school year starts
pct_XX: the percent of children immunized (0-100), where XX is replaced by the state abbreviation. Numeric variable
N_XX: the number of children in the population (denominator), where XX is replaced by the state abbreviation. Numeric variable
vax: the name of the vaccine or exemption category:
- "dtap"
- "polio"
- "mmr"
- "hep_b"
- "varicella"
- "personal_exempt"
- "medical_exempt"
- "full_exempt"
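
The column rules above can be checked before saving. The following is a minimal sketch in base R, not part of the dcf package: `check_standard` is a hypothetical helper written for illustration, and the example rows are made-up values rather than real Arizona data.

```r
# Hypothetical helper (not part of dcf): returns a character vector of
# problems found in a standardized data frame; empty if none are found.
check_standard <- function(data, state) {
  allowed_vax <- c(
    "dtap", "polio", "mmr", "hep_b", "varicella",
    "personal_exempt", "medical_exempt", "full_exempt"
  )
  pct_col <- paste0("pct_", state)
  n_col <- paste0("N_", state)
  required <- c("geography", "geography_name", "time", pct_col, n_col, "vax")

  missing_cols <- setdiff(required, names(data))
  if (length(missing_cols) > 0) {
    return(paste("missing column:", missing_cols))
  }

  problems <- character(0)
  # FIPS codes: 5 digits for counties, 2 digits for the state
  if (!all(grepl("^([0-9]{2}|[0-9]{5})$", data$geography))) {
    problems <- c(problems, "geography must be a 2- or 5-digit FIPS code")
  }
  # time: September 1 of the year in which the school year starts
  if (!all(grepl("^[0-9]{4}-09-01$", as.character(data$time)))) {
    problems <- c(problems, "time must look like YYYY-09-01")
  }
  pct <- data[[pct_col]]
  if (!is.numeric(pct) || any(pct < 0 | pct > 100, na.rm = TRUE)) {
    problems <- c(problems, paste(pct_col, "must be numeric in 0-100"))
  }
  if (!is.numeric(data[[n_col]])) {
    problems <- c(problems, paste(n_col, "must be numeric"))
  }
  if (!all(data$vax %in% allowed_vax)) {
    problems <- c(problems, "vax contains an unrecognized category")
  }
  problems
}

# Illustrative rows only; the values are made up
example <- data.frame(
  geography = c("04013", "04"),
  geography_name = c("Maricopa", "Total"),
  time = c("2022-09-01", "2022-09-01"),
  pct_AZ = c(91.5, 92.0),
  N_AZ = c(1000, 5000),
  vax = c("mmr", "mmr"),
  stringsAsFactors = FALSE
)
problems <- check_standard(example, "AZ")
```

A call such as `stopifnot(length(check_standard(data, "AZ")) == 0)` placed just before the save step would stop the ingest script when a rule is violated. Note that a geography column stored as a number fails the FIPS check, because the leading zero is lost.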

The ingest script should have this structure:

library(dcf)
library(tidyverse)

# check raw state
raw_state <- as.list(tools::md5sum(list.files(
  "raw", "csv", recursive = TRUE, full.names = TRUE
)))
process <- dcf::dcf_process_record()

# process raw if state has changed
if (!identical(process$raw_state, raw_state)) {

 
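  ## CODE TO CLEAN RAW DATA
  ## xxxx

  # Save the standard file as a comma-delimited, compressed csv
  # (vroom_write defaults to tab-delimited output, so set delim explicitly)
  vroom::vroom_write(data, "standard/data.csv.gz", delim = ",")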
  
  # record processed raw state
  process$raw_state <- raw_state
  dcf::dcf_process_record(updated = process)
}
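
What goes in the cleaning placeholder depends on how each state publishes its data. As a rough sketch only, the base R code below reshapes a hypothetical wide table (one column per vaccine) into the long standard format. The raw column names (`county_fips`, `county`, `school_year`, `enrolled`) and all values are assumptions made for illustration, not the layout of any real state file.

```r
# Hypothetical raw layout (column names and values are assumptions):
# one row per county and school year, one column per vaccine percentage.
raw <- data.frame(
  county_fips = c(4013, 4019),
  county = c("Maricopa", "Pima"),
  school_year = c("2022-2023", "2022-2023"),
  enrolled = c(1000, 400),
  mmr = c(91.5, 93.0),
  dtap = c(92.1, 94.2)
)

vax_cols <- c("mmr", "dtap")

# Stack the vaccine columns so that each row is one county-year-vaccine
data <- do.call(rbind, lapply(vax_cols, function(v) {
  data.frame(
    # restore leading zeros lost when FIPS codes are read as numbers
    geography = sprintf("%05d", as.integer(raw$county_fips)),
    geography_name = raw$county,
    # the school year "2022-2023" starts in 2022, so time is 2022-09-01
    time = paste0(substr(raw$school_year, 1, 4), "-09-01"),
    pct_AZ = raw[[v]],
    N_AZ = raw$enrolled,
    vax = v,
    stringsAsFactors = FALSE
  )
}))
```

The same reshaping could be written with tidyverse functions, since the script already loads that package; base R is used here only to keep the sketch free of dependencies.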

Save the formatted dataset as a compressed csv file in the /standard subfolder.
Update the measure_info.json file for the project to include descriptions of all variables. I recommend editing the JSON files in Microsoft Visual Studio to ensure proper formatting.
Run dcf::dcf_process("XX") to process an individual dataset, substituting the state abbreviation for XX. This should be done within the state-specific project.
Run dcf::dcf_build() from the root directory (set the working directory to school_immunizations). This builds the whole project.
You can see the datasets and their relationships here: https://github.com/PopHIVE/school_immunizations/blob/main/status.md
See all processed files at: https://dissc-yale.github.io/dcf/report/?repo=PopHIVE/school_immunizations
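
One thing to watch when reading a saved file back for a spot check: FIPS codes lose their leading zeros if the geography column is guessed to be numeric. The sketch below uses base R and a temporary file standing in for standard/data.csv.gz; the rows are made-up values.

```r
# Small stand-in for the standardized data (made-up values)
data <- data.frame(
  geography = c("04013", "04"),
  geography_name = c("Maricopa", "Total"),
  time = c("2022-09-01", "2022-09-01"),
  pct_AZ = c(91.5, 92.0),
  N_AZ = c(1000, 5000),
  vax = c("mmr", "mmr")
)

# A temporary file stands in for standard/data.csv.gz
path <- tempfile(fileext = ".csv.gz")
con <- gzfile(path, "w")
write.csv(data, con, row.names = FALSE)
close(con)

# Default type guessing converts "04013" to the number 4013
guessed <- read.csv(path)

# Reading geography as character keeps the leading zeros
kept <- read.csv(path, colClasses = c(geography = "character"))
```

Here `guessed$geography` is numeric while `kept$geography` still holds the original 2- and 5-digit strings, so declaring the column type when reading is the safer habit.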

Other notes:

This is set up as a Data Collection Framework project, initialized with dcf::dcf_init.

You can use the dcf package to check the source projects:

dcf::dcf_check_source()

And process them:

dcf::dcf_process()
