A project that extracts Nobel Prize data from the NobelPrize.org API, normalizes the nested JSON into dataframes, and explores the many-to-many relationship between prizes and laureates.
- API Data Extraction - Data ingestion with error handling
- Data Normalization - JSONPath-style navigation for nested structures
- Data Modeling - Many-to-many relationship handling between prizes and laureates
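As a rough illustration of the normalization and modeling steps listed above, the sketch below follows dotted paths through nested records and splits them into three flat tables: prizes, laureates, and a link table that carries the many-to-many relationship. It is not taken from the notebook. The field names (`awardYear`, `category.en`, `laureates`, `knownName.en`, `orgName.en`, `portion`) are assumptions about the API response shape, the helpers `get_path` and `normalize` are hypothetical, and the sample records are made up for the example.

```python
def get_path(record, path, default=None):
    """Follow a dotted path such as 'category.en' through nested dicts."""
    current = record
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


def normalize(prizes):
    """Split nested prize records into prize, laureate, and link rows."""
    prize_rows, laureates, link_rows = [], {}, []
    for prize in prizes:
        category = get_path(prize, "category.en")
        # Hypothetical surrogate key: one prize per year and category
        prize_id = f"{prize['awardYear']}-{category}"
        prize_rows.append({"prize_id": prize_id,
                           "award_year": int(prize["awardYear"]),
                           "category": category})
        for laureate in prize.get("laureates", []):
            # Assumption: organizations carry orgName instead of knownName
            name = (get_path(laureate, "knownName.en")
                    or get_path(laureate, "orgName.en"))
            # Keyed by id, so a repeat laureate is stored only once
            laureates[laureate["id"]] = {"laureate_id": laureate["id"],
                                         "name": name}
            # One link row per (prize, laureate) pair
            link_rows.append({"prize_id": prize_id,
                              "laureate_id": laureate["id"],
                              "portion": laureate.get("portion")})
    return prize_rows, list(laureates.values()), link_rows


# Made-up sample records that mimic the assumed response shape
sample = [
    {"awardYear": "2001", "category": {"en": "Physics"},
     "laureates": [
         {"id": "A", "knownName": {"en": "Laureate A"}, "portion": "1/2"},
         {"id": "B", "knownName": {"en": "Laureate B"}, "portion": "1/2"}]},
    {"awardYear": "2009", "category": {"en": "Chemistry"},
     "laureates": [
         {"id": "A", "knownName": {"en": "Laureate A"}, "portion": "1"}]},
    {"awardYear": "2010", "category": {"en": "Peace"}},  # no laureates key
]

prizes, laureates, links = normalize(sample)

# Laureate "A" appears in two prizes, and the first prize has two laureates
prizes_of_a = [row["prize_id"] for row in links if row["laureate_id"] == "A"]
print(prizes_of_a)  # ['2001-Physics', '2009-Chemistry']
```

Each of the three row lists can be passed to `pandas.DataFrame(...)` to get one dataframe per table, and joining through the link table answers questions in either direction (laureates per prize, prizes per laureate).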
This project uses the official Nobel Prize API v2.1 provided by NobelPrize.org.
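A minimal sketch of what extraction with error handling might look like, using only the standard library. This is not the notebook's code: the helper `fetch_json`, its retry policy, and the `opener` parameter (there so the function can be exercised without network access) are all hypothetical, and the base URL and the `nobelPrizes` endpoint with a `limit` parameter are assumptions about API v2.1.

```python
import json
import time
import urllib.error
import urllib.parse
import urllib.request

BASE_URL = "https://api.nobelprize.org/2.1"  # assumed base URL for API v2.1


def fetch_json(endpoint, params=None, retries=3, backoff=1.0,
               opener=urllib.request.urlopen):
    """Fetch one endpoint and return parsed JSON, retrying transient errors."""
    url = f"{BASE_URL}/{endpoint}"
    if params:
        url += "?" + urllib.parse.urlencode(params)
    last_error = None
    for attempt in range(1, retries + 1):
        try:
            with opener(url, timeout=30) as response:
                return json.loads(response.read().decode("utf-8"))
        except urllib.error.HTTPError as err:
            # Client errors other than 429 will not succeed on a retry
            if err.code < 500 and err.code != 429:
                raise
            last_error = err
        except (urllib.error.URLError, TimeoutError,
                json.JSONDecodeError) as err:
            # Network failures and malformed bodies are treated as transient
            last_error = err
        if attempt < retries:
            time.sleep(backoff * attempt)  # simple linear backoff
    raise RuntimeError(
        f"Giving up on {url} after {retries} attempts") from last_error
```

A call such as `fetch_json("nobelPrizes", {"limit": 25})` would then return the parsed response, or raise once the retries are exhausted.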
- Clone the repository:
    git clone https://github.com/yourusername/nobel-prize-api-data-engineering.git
    cd nobel-prize-api-data-engineering
- Open the Jupyter notebook:
    jupyter notebook nobel-prize-api-data-engineering.ipynb
- Run all cells to execute the complete data processing workflow.
Possible future improvements:
- Save processed data to a cloud data lake
- Add data type conversion and validation
- Implement incremental data updates
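To make the type conversion and validation item more concrete, here is one possible shape for it, sketched with the standard library only. Nothing here exists in the project yet: the helper `convert_prize` is hypothetical, and the field names `awardYear`, `dateAwarded`, and `prizeAmount` are assumptions about the raw prize records.

```python
from datetime import date

FIRST_AWARD_YEAR = 1901  # the first Nobel Prizes were awarded in 1901


def convert_prize(raw):
    """Return (typed_row, errors) for one raw prize record."""
    errors = []

    # awardYear is assumed to arrive as a string such as "2001"
    try:
        year = int(raw.get("awardYear"))
    except (TypeError, ValueError):
        year = None
        errors.append("awardYear is missing or not an integer")
    else:
        if year < FIRST_AWARD_YEAR:
            errors.append(f"awardYear {year} is before {FIRST_AWARD_YEAR}")

    # dateAwarded is optional and assumed to be an ISO date (YYYY-MM-DD)
    awarded = None
    date_text = raw.get("dateAwarded")
    if date_text is not None:
        try:
            awarded = date.fromisoformat(date_text)
        except (TypeError, ValueError):
            errors.append(f"dateAwarded is not an ISO date: {date_text!r}")

    # prizeAmount is optional and must be a non-negative integer if present
    amount = raw.get("prizeAmount")
    if amount is not None:
        try:
            amount = int(amount)
        except (TypeError, ValueError):
            amount = None
            errors.append("prizeAmount is not an integer")
        else:
            if amount < 0:
                errors.append("prizeAmount is negative")

    row = {"award_year": year, "date_awarded": awarded, "prize_amount": amount}
    return row, errors
```

Returning the errors alongside the row, rather than raising, lets a caller convert a whole batch and decide afterwards whether to drop, fix, or report the bad records.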