This repository contains five main folders, each grouping scripts by function:
bug_fixing_on_production/
* Contains scripts created to address and resolve production-related issues.
database_and_schema_manipulation_script/
* Includes scripts for modifying, updating, and maintaining database schemas and related structures.
insert_user_registration/
* Updates user registration data received via a Google Form.
monitoring_data_pipeline/
* Implements the data pipeline that scrapes Graphy LMS data, transforms it into structured formats, and updates the monitoring tables in the raw, intermediate, and final schemas.
old_data_insertion_scripts/
* Stores legacy scripts used for initial or past data insertion tasks.
Each folder contains its own README file that provides more detailed information about the scripts located inside it, including their purpose and execution steps.
Scripts in every folder read a shared configuration file, config.env, located at the root of the repository. This file defines the database connection parameters:
DB_HOST='my_db_host'
DB_NAME='my_db_name'
DB_USER='my_db_user'
DB_PASSWORD='my_db_password'
DB_PORT='my_db_port'
Important: Replace these placeholder values with the actual production credentials before running any scripts.
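As a minimal sketch of how a script might read these values, assuming config.env holds plain KEY='value' lines as shown above. The helper names load_env and connection_params are hypothetical and are not taken from connection.py, which may use a library such as python-dotenv instead. The keyword names in the returned dictionary are also an assumption and depend on the database driver in use.

```python
from pathlib import Path

REQUIRED_KEYS = ("DB_HOST", "DB_NAME", "DB_USER", "DB_PASSWORD", "DB_PORT")


def load_env(path):
    """Parse KEY='value' lines from an env file into a dict (hypothetical helper)."""
    values = {}
    for raw_line in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything that is not KEY=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        # Remove one layer of matching single or double quotes.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
            value = value[1:-1]
        values[key.strip()] = value
    return values


def connection_params(env):
    """Map the DB_* settings to connection keyword arguments (names assumed)."""
    missing = [key for key in REQUIRED_KEYS if key not in env]
    if missing:
        raise KeyError(f"config.env is missing: {', '.join(missing)}")
    return {
        "host": env["DB_HOST"],
        "dbname": env["DB_NAME"],
        "user": env["DB_USER"],
        "password": env["DB_PASSWORD"],
        "port": int(env["DB_PORT"]),  # the file stores the port as text
    }
```

With the placeholder values still in place, connection_params fails at the int() conversion of 'my_db_port', which gives an early signal that the file has not been filled in.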
├── bug_fixing_on_production/
│   ├── README.md
│   ├── __init__.py
│   ├── Fix_district_state_empty_value.py
│   ├── INC7_upcoming_incubators_monitoring_data_insertion.py
│   ├── assign_missing_ids.py
│   ├── clean_emails.py
│   ├── delete_emails_from_sheet.py
│   ├── delete_student_id.py
│   ├── remove_dupe_student_details.py
│   └── sql_update_script.py
├── database_and_schema_manipulation_script/
│   ├── README.md
│   ├── __init__.py
│   ├── Add_data_to_new_column.py
│   ├── add_column.py
│   ├── alter_table_and_create_enum.py
│   ├── create_db_and_db_schema_script.py
│   ├── create_enum.py
│   ├── create_final_tables_with_schema.py
│   └── create_raw_intermediate_indexes.py
├── insert_user_registration/
│   ├── README.md
│   ├── __init__.py
│   └── insert_new_data.py
├── monitoring_data_pipeline/
│   ├── README.md
│   ├── __init__.py
│   ├── post_cohort_repeatative_script/
│   │   ├── __init__.py
│   │   ├── monitoring_data_raw_schema_tables_update_script.py
│   │   ├── raw_schema_to_intermediate_upsert_script.py
│   │   └── upsertion_intermediate_to_final.py
│   └── pre_cohort_non_repeatative_script/
│       ├── __init__.py
│       ├── Add_new_cohorts_names_for_upcoming_cohort.py
│       └── Update_incubator_name_based_on_email.py
├── old_data_insertion_scripts/
│   ├── README.md
│   ├── __init__.py
│   ├── data_insertion_script.py
│   ├── table_creation.py
│   ├── load_csvs_to_db.py
│   ├── update_clean_data.py
│   ├── update_course_name_INC_7_script.py
│   └── update_script_location_id.py
├── __init__.py
├── config.env
├── connection.py
├── README.md ← (this file)
└── requirements.txt
Run all scripts from the project root (deployment_scripts) using module syntax:
python -m deployment_scripts.<subfolder>.<script_name_without_py>
For example, for scripts inside the bug_fixing_on_production subfolder: python -m deployment_scripts.bug_fixing_on_production.<script_name_without_py>
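The mapping from a script's location to its module name can be sketched as follows. This is an illustrative helper only: module_name and run_command are hypothetical names, not part of the repository, and the sketch assumes paths are given relative to the deployment_scripts package.

```python
from pathlib import PurePosixPath

PACKAGE = "deployment_scripts"  # top-level package name used in the command above


def module_name(script_path):
    """Turn a script path relative to the package root into a dotted module name."""
    path = PurePosixPath(script_path)
    if path.suffix != ".py":
        raise ValueError(f"expected a .py file, got {script_path!r}")
    # Drop the .py suffix and join the folder parts with dots.
    parts = path.with_suffix("").parts
    return ".".join((PACKAGE, *parts))


def run_command(script_path):
    """Build the argument list for running the script with module syntax."""
    return ["python", "-m", module_name(script_path)]


print(" ".join(run_command("bug_fixing_on_production/clean_emails.py")))
# prints: python -m deployment_scripts.bug_fixing_on_production.clean_emails
```

Scripts nested one level deeper follow the same rule: each additional folder adds one more dotted component to the module name.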