SatvikBhatnagar/Python-Project-for-Data-Engineering

Data Engineering Portfolio

This repository contains three main projects showcasing my data engineering skills: ETL, Web Scraping and Data Extraction, and SQLite Database Operations.

Projects

1. ETL (Extract, Transform, Load)

This project demonstrates the ETL process: extracting data from several file formats, transforming it, and loading the result into a target file.

Features:

  • Extract data from CSV, JSON, and XML files.
  • Transform data by converting units.
  • Load data into a CSV file.
  • Log each phase of the ETL process.

Usage:

  • Run data_extraction.py to execute the ETL process.
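
The four features above can be sketched end to end with the standard library alone. This is a minimal illustration, not the contents of data_extraction.py: the column names (`name`, `height`, `weight`), the inch-to-metre and pound-to-kilogram conversions, the JSON Lines layout, the XML layout, and all function names are assumptions made for the example.

```python
import csv
import json
import logging
import xml.etree.ElementTree as ET
from pathlib import Path

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(message)s")
log = logging.getLogger("etl")

FIELDS = ["name", "height", "weight"]  # hypothetical columns


def extract(folder):
    """Collect rows (as dicts) from every CSV, JSON and XML file in folder."""
    rows = []
    for path in sorted(Path(folder).iterdir()):
        if path.suffix == ".csv":
            with path.open(newline="") as f:
                rows.extend(csv.DictReader(f))
        elif path.suffix == ".json":
            # Assumes JSON Lines: one object per line.
            with path.open() as f:
                rows.extend(json.loads(line) for line in f if line.strip())
        elif path.suffix == ".xml":
            # Assumes <root><record><name>..</name>..</record></root>.
            for node in ET.parse(path).getroot():
                rows.append({child.tag: child.text for child in node})
    return rows


def transform(rows):
    """Convert height from inches to metres and weight from pounds to kilograms."""
    return [
        {
            "name": row["name"],
            "height": round(float(row["height"]) * 0.0254, 2),
            "weight": round(float(row["weight"]) * 0.45359237, 2),
        }
        for row in rows
    ]


def load(rows, target):
    """Write the transformed rows to a CSV file."""
    with open(target, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)


def run_etl(folder, target):
    """Run the three phases in order, logging the start and end of each."""
    log.info("Extract phase started")
    raw = extract(folder)
    log.info("Extract phase ended: %d rows", len(raw))
    log.info("Transform phase started")
    clean = transform(raw)
    log.info("Transform phase ended")
    log.info("Load phase started")
    load(clean, target)
    log.info("Load phase ended: wrote %s", target)
    return clean
```

Calling `run_etl("source_data", "transformed_data.csv")` would process every supported file in a hypothetical `source_data` folder. Keeping the target file outside the source folder prevents a second run from re-reading its own output.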

2. Web Scraping and Data Extraction

This project involves scraping data from a web page, parsing the HTML to extract specific data points, and storing the extracted data in both CSV and SQLite database formats.

Features:

  • Scrape data from a specified URL.
  • Parse HTML content to extract table data.
  • Save extracted data to a CSV file and an SQLite database.

Usage:

  • Run webscraping.py to perform web scraping and data extraction.
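
As a rough sketch of the scrape, parse, and save steps, the example below uses only the standard library (`urllib`, `html.parser`, `csv`, `sqlite3`). It is not the code in webscraping.py, which may rely on other libraries; the class and function names, the default table name `scraped`, and the assumption that the first table row holds the column headers are all hypothetical.

```python
import csv
import sqlite3
from contextlib import closing
from html.parser import HTMLParser
from urllib.request import urlopen


class TableParser(HTMLParser):
    """Collects the text of every <th>/<td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None outside <tr>
        self._cell = None  # text chunks of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None
            self._cell = None


def fetch_html(url):
    """Download a page and return its body as text."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def parse_table(html):
    """Return (header, rows); assumes the first row holds the column names."""
    parser = TableParser()
    parser.feed(html)
    if not parser.rows:
        raise ValueError("no table rows found")
    return parser.rows[0], parser.rows[1:]


def save_csv(header, rows, path):
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)


def save_sqlite(header, rows, db_path, table="scraped"):
    """Store rows as TEXT columns; assumes every row matches the header length."""
    cols = ", ".join(f'"{c}" TEXT' for c in header)
    marks = ", ".join("?" for _ in header)
    with closing(sqlite3.connect(db_path)) as conn:
        conn.execute(f'DROP TABLE IF EXISTS "{table}"')
        conn.execute(f'CREATE TABLE "{table}" ({cols})')
        conn.executemany(f'INSERT INTO "{table}" VALUES ({marks})', rows)
        conn.commit()
```

A typical run would chain the functions: `header, rows = parse_table(fetch_html(url))`, followed by `save_csv` and `save_sqlite`. The parser collects rows from every table on the page, so a page with several tables would need extra filtering.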

3. SQLite Database Operations

This project demonstrates various database operations using SQLite, including creating tables, inserting data, and querying the database.

Features:

  • Read data from CSV files.
  • Create and populate SQLite tables.
  • Perform SQL queries and operations on the database.
  • Append new data to tables.

Usage:

  • Run sqlite_operations.py to perform database operations.
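
The operations listed above (read a CSV file, create and populate a table, query it, append rows) might look like the following sketch built on the standard `csv` and `sqlite3` modules. The function names and the `if_exists` parameter are assumptions made for illustration and are not taken from sqlite_operations.py.

```python
import csv
import sqlite3


def read_csv(path):
    """Return (header, rows) from a CSV file whose first line is the header."""
    with open(path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader)
        return header, list(reader)


def write_table(conn, table, header, rows, if_exists="replace"):
    """Create the table if needed, then insert rows.

    if_exists="replace" drops any existing table first;
    if_exists="append" keeps existing rows and adds the new ones.
    Table and column names are assumed to come from a trusted source.
    """
    if if_exists not in ("replace", "append"):
        raise ValueError("if_exists must be 'replace' or 'append'")
    if if_exists == "replace":
        conn.execute(f'DROP TABLE IF EXISTS "{table}"')
    cols = ", ".join(f'"{c}"' for c in header)
    marks = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({cols})')
    conn.executemany(f'INSERT INTO "{table}" ({cols}) VALUES ({marks})', rows)
    conn.commit()


def query(conn, sql, params=()):
    """Run a SQL statement with bound parameters and return all result rows."""
    return conn.execute(sql, params).fetchall()
```

For example, after `conn = sqlite3.connect("example.db")`, a call such as `write_table(conn, "staff", *read_csv("staff.csv"))` would populate a hypothetical `staff` table, and `query(conn, 'SELECT COUNT(*) FROM "staff"')` would count its rows. Values read from CSV arrive as strings, so numeric columns would need an explicit conversion or a `CAST` in the query.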

Requirements

To install the required Python libraries, navigate to the directory containing requirements.txt and run:

pip install -r requirements.txt

Documentation


For design details, see the high-level design (HLD) and low-level design (LLD) documents in the doc directory.
