@HigherEdData

The Higher Education DataHub

A project of the Higher Education Race and the Economy (HERE) Lab at UC Merced.

Hi there 👋

The Higher Ed DataHub publishes code for linking and analyzing organizational and socio-economic data on US higher education institutions, students, businesses, business leaders, and consumers. We strive to publish code that is not just open, but also legible and accessible to a wide range of researchers, practitioners, and students from diverse backgrounds and knowledge bases. Legible code paired with succinct, clear explanation gives these communities tools to build on, adapt, modify, and even correct our data and analyses more rapidly, without having to reinvent the wheel. We hope this approach can help shift open science from a "gotcha" culture to a diverse, collaborative culture. This sort of culture shift promises to accelerate social scientific discovery.

We are building this plane as we fly it. As part of this approach, we have published much of our code and data while we are still working to make them more navigable and legible for visitors like you. If you have ideas or questions, please let us know by submitting an issue in any of our repositories.

In the meantime, here are some tips for using our repositories:

  • Each repository has topic tags for the datasets used in the repository. Click on the Repositories tab above to browse all of our repositories and their topic tags.
  • Click on a blue topic tag for a given dataset to get a list of all our repositories that use that dataset.
  • Each repository contains code and data for a particular paper or book project. When the project uses proprietary or restricted-use data, the repository includes only the code for analyzing those data. You can use that code once you receive access from the publisher of the source data.
  • Each repository has a data folder that should include .csv and .dta (Stata) data files for the public use datasets used in the project.
  • We currently publish only .do and .ipynb files with Stata code for using our data. The .ipynb Jupyter Notebook files contain Stata code and can be run with a Stata kernel for Jupyter. For details, see https://kylebarron.dev/stata_kernel/. In the future, we hope to publish R or Python code as an open-source option for using our data.
  • The code and outputs for any .ipynb file can be viewed in your web browser by clicking on the .ipynb file link within GitHub.
  • At the bottom of the main page of each DataHub Repository, a README.md file should display with details on which code and data files use which datasets.
  • Each repository will eventually include a web-browsable .ipynb notebook with the prefix d_vardef for each dataset in the project, which will display a list of variables and variable definitions for that dataset.
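
For readers without Stata, the public-use .csv files can be inspected with any language. The following is a minimal Python sketch, not code published by the DataHub. It assumes a locally cloned repository whose data folder is named `data`, and the function names `list_data_files` and `read_csv_rows` are hypothetical helpers introduced only for illustration.

```python
import csv
from pathlib import Path


def list_data_files(repo_dir):
    """Return the .csv and .dta files in <repo_dir>/data, grouped by extension."""
    # Assumption: the repository's data folder is literally named "data".
    data_dir = Path(repo_dir) / "data"
    found = {".csv": [], ".dta": []}
    if not data_dir.is_dir():
        return found  # no data folder: return empty lists rather than fail
    for path in sorted(data_dir.iterdir()):
        suffix = path.suffix.lower()
        if suffix in found:
            found[suffix].append(path)
    return found


def read_csv_rows(csv_path, limit=5):
    """Read up to `limit` rows of a CSV file as dicts keyed by column name."""
    rows = []
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for i, row in enumerate(csv.DictReader(handle)):
            if i >= limit:
                break
            rows.append(row)
    return rows
```

A call such as `list_data_files("path/to/cloned/repo")` returns the paths grouped by extension, and `read_csv_rows` previews the first few rows of any listed .csv file. The sketch only lists .dta files; reading them requires a Stata-aware reader, for example `pandas.read_stata` if pandas is installed.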

Pinned

  1. Asymmetry-by-Design-in-For-Profit-Higher-Education

    By Adam Goldstein and Charlie Eaton ** DATA: For-profit News Article DataHub Data * Private Equity and College Ownership DataHub Data * Borrower Defense Complaint DataHub Data * IPEDS Completions *…

    Jupyter Notebook

  2. BankersInTheIvoryTower

    By Charlie Eaton ** DATA: OMB Federal Higher Education Spending Outlays Over Time * Grapevine State Higher Education Spending Over Time * College Board Student Debt Aggregate Data Over Time * Colle…

    Jupyter Notebook 1

  3. The-For-Profit-Side-of-Public-U

    By Heather Daniels, Christian Smith, Laura T. Hamilton, and Charlie Eaton ** DATA: Online Program Manager Contracts DataHub Data * Online Program Manager Ownership DataHub Data

    Jupyter Notebook
