A comprehensive SQL query collection focused on data quality validation and bug detection in databases. This project demonstrates real-world QA skills: finding data anomalies, validating integrity, and detecting broken relationships.
This repository showcases SQL skills from a QA Engineer perspective:
- ✅ Detecting invalid data (negative values, nulls, zeros)
- ✅ Finding duplicates and inconsistencies
- ✅ Validating data integrity and relationships
- ✅ Identifying orphaned records and broken references
This is NOT about writing complex queries to show off.
This IS about writing clear, focused queries that find real bugs.
Simple e-commerce database with intentional bugs for QA practice:
users- Customer accountsproducts- Product catalogorders- Purchase orders
orders.user_id → users.id
orders.product_id → products.id
The seed data contains real-world bugs that QA engineers encounter:
- Negative totals and prices
- Zero prices and quantities
- Invalid/future dates
- Empty/null required fields
- Duplicate emails in users
- Potential duplicate orders
- Orphaned orders (user doesn't exist)
- Broken references (product doesn't exist)
- Totals that don't match calculations
Queries are organized by validation type:
queries/
├── 01_basic_validation/ # List, filter, order data
├── 02_data_quality/ # Find invalid data
├── 03_duplicates/ # Detect duplicates
└── 04_data_integrity/ # Validate relationships
- MySQL 8.0+
- MySQL Workbench (or any SQL client)
- Clone the repository:
git clone https://github.com/your-username/sql-for-qa-testing.git
cd sql-for-qa-testing- Create database:
CREATE DATABASE sql_qa_testing;
USE sql_qa_testing;- Run schema:
mysql -u root -p sql_qa_testing < database/schema.sql- Load seed data:
mysql -u root -p sql_qa_testing < database/seed_data.sql- Verify setup:
SELECT COUNT(*) FROM users; -- Should return 10
SELECT COUNT(*) FROM products; -- Should return 10
SELECT COUNT(*) FROM orders; -- Should return 15Each query file is standalone and can be executed independently:
-- Example: Find negative prices
source queries/02_data_quality/find_negative_totals.sqlOr open in MySQL Workbench and execute.
- How to validate data quality systematically
- Writing queries that detect real bugs
- Understanding data relationships and integrity
- Identifying edge cases in datasets
- SQL SELECT with filters and joins
- Aggregate functions for validation
- Subqueries for data verification
- NULL handling and data anomalies
Good QA SQL queries are:
- ✅ Clear - Easy to understand what's being checked
- ✅ Focused - One validation per query
- ✅ Documented - Comments explain the bug being detected
- ✅ Actionable - Results can be directly reported
Avoid:
- ❌ Overly complex queries that are hard to maintain
- ❌ Multiple validations in one query
- ❌ Queries without clear purpose
| Category | Status | Queries |
|---|---|---|
| Basic Validation | ✅ Complete | 3/3 |
| Data Quality | ✅ Complete | 4/4 |
| Duplicates | ✅ Complete | 3/3 |
| Data Integrity | ✅ Complete | 3/3 |
Total: 13 professional QA queries
This project successfully detected:
- 3 orders with invalid totals (negative/zero)
- 2 records with future dates
- 3 products with zero/negative prices
- 1 user with empty name
- 2 duplicate emails (4 users affected)
- 1 orphaned order (user doesn't exist)
- 1 broken reference (product doesn't exist)
Data Integrity: 86.67% (13 out of 15 orders are valid)
This is a personal learning project, but suggestions are welcome!
- GitHub: https://github.com/ChandeDeVargas
- LinkedIn: https://www.linkedin.com/in/chande-de-vargas-b8a51838a/
This project is open source and available under the MIT License.
⭐ If this project helps you learn SQL for QA, give it a star!