API data extraction and normalization pipeline for Nobel Prize data. Transforms nested JSON to pandas DataFrames using JSONPath-style navigation and relationship modeling.

mbarbag/nobel-prize-api-data-engineering

Nobel Prize API data extraction & transformation

This project extracts Nobel Prize data from the official API, normalizes the nested JSON responses into pandas DataFrames, and explores the many-to-many relationships between prizes and laureates.

Key Technical Skills

  • API Data Extraction - Data ingestion with error handling
  • Data Normalization - JSONPath-style navigation for nested structures
  • Data Modeling - Many-to-many relationship handling between prizes and laureates
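The normalization and modeling steps listed above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the helper names `get_path` and `build_tables` are hypothetical, the record shape (a `category` object keyed by `en`, and a `laureates` list with `id`, `knownName.en`, and `portion`) is an assumption about the API response, and the laureate ids in the sample are placeholders.

```python
from functools import reduce


def get_path(obj, path, default=None):
    """Follow a dotted path such as 'knownName.en' through nested dicts/lists."""
    def step(current, key):
        if isinstance(current, dict):
            return current.get(key, default)
        if isinstance(current, list) and key.isdigit() and int(key) < len(current):
            return current[int(key)]
        return default  # path does not exist at this level
    return reduce(step, path.split("."), obj)


def build_tables(prizes):
    """Split nested prize records into prizes, laureates, and a link table.

    The link (bridge) table carries one row per prize-laureate pair, which is
    what lets one prize have several laureates and one laureate several prizes.
    """
    prize_rows, laureates, links = [], {}, []
    for prize in prizes:
        category = get_path(prize, "category.en")
        prize_id = f"{prize['awardYear']}-{category}"  # hypothetical surrogate key
        prize_rows.append(
            {"prize_id": prize_id, "year": prize["awardYear"], "category": category}
        )
        for laureate in prize.get("laureates", []):
            # Keyed by id so a repeat laureate is stored only once.
            laureates[laureate["id"]] = {
                "laureate_id": laureate["id"],
                "name": get_path(laureate, "knownName.en"),
            }
            links.append(
                {
                    "prize_id": prize_id,
                    "laureate_id": laureate["id"],
                    "portion": laureate.get("portion"),
                }
            )
    return prize_rows, list(laureates.values()), links


# Sample records in the assumed shape; ids are placeholders, not real API ids.
sample = [
    {
        "awardYear": "1903",
        "category": {"en": "Physics"},
        "laureates": [
            {"id": "p1", "knownName": {"en": "Henri Becquerel"}, "portion": "1/2"},
            {"id": "p2", "knownName": {"en": "Pierre Curie"}, "portion": "1/4"},
            {"id": "p3", "knownName": {"en": "Marie Curie"}, "portion": "1/4"},
        ],
    },
    {
        "awardYear": "1911",
        "category": {"en": "Chemistry"},
        "laureates": [
            {"id": "p3", "knownName": {"en": "Marie Curie"}, "portion": "1"},
        ],
    },
]

prize_rows, laureate_rows, link_rows = build_tables(sample)
```

Each of the three row lists can be passed to `pandas.DataFrame(...)` to obtain one DataFrame per table; joining them through the link table recovers which laureates share which prizes.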

Data Source

This project uses the official Nobel Prize API v2.1 provided by NobelPrize.org.
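A minimal sketch of paginated ingestion with basic error handling, using only the standard library. Several details here are assumptions rather than facts taken from the notebook: the base URL, the `limit` and `offset` query parameters, the response key that holds the records, and the function names `fetch_json` and `fetch_all`, which are hypothetical.

```python
import json
import time
import urllib.error
import urllib.parse
import urllib.request

BASE_URL = "https://api.nobelprize.org/2.1"  # assumed base URL for API v2.1


def fetch_json(url, retries=3, backoff=1.0, timeout=10):
    """GET a URL and decode the JSON body, retrying on network errors."""
    for attempt in range(1, retries + 1):
        try:
            with urllib.request.urlopen(url, timeout=timeout) as response:
                return json.load(response)
        except (urllib.error.URLError, TimeoutError) as exc:
            if attempt == retries:
                raise RuntimeError(f"giving up on {url}") from exc
            time.sleep(backoff * attempt)  # wait a little longer each time


def fetch_all(endpoint, key, page_size=25, fetch=fetch_json):
    """Collect every record from a paginated endpoint.

    `key` is the top-level field that holds the list of records (assumed).
    `fetch` is injectable so the paging logic can be exercised offline.
    """
    records, offset = [], 0
    while True:
        query = urllib.parse.urlencode({"limit": page_size, "offset": offset})
        page = fetch(f"{BASE_URL}/{endpoint}?{query}").get(key, [])
        records.extend(page)
        if len(page) < page_size:  # a short page means there is nothing left
            return records
        offset += page_size
```

Under these assumptions, a call such as `fetch_all("nobelPrizes", "nobelPrizes")` would return the full list of prize records, ready for normalization.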

Usage

  1. Clone the repository:

    git clone https://github.com/mbarbag/nobel-prize-api-data-engineering.git
    cd nobel-prize-api-data-engineering
  2. Open the Jupyter notebook:

    jupyter notebook nobel-prize-api-data-engineering.ipynb
  3. Run all cells to execute the complete data processing workflow.

Future Enhancements

  • Save processed data to a cloud data lake
  • Add data type conversion and validation
  • Implement incremental data updates
