This project focuses on analyzing a Spotify music dataset using SQL to understand track performance, artist trends, and engagement metrics. The goal of this project was to strengthen my SQL skills by working with a real-world dataset and applying concepts such as data exploration, aggregation, window functions, and query optimization.
The project covers the complete workflow — from understanding the dataset structure to writing optimized SQL queries and extracting meaningful insights.
-
Source: Kaggle – Spotify Dataset
-
Data Type: Structured CSV data
-
Key Attributes:
-
Artist, Track, Album, Album Type
-
Audio features (danceability, energy, tempo, loudness, etc.)
-
Engagement metrics (views, likes, comments, streams)
-
Flags such as licensed and official_video
-
Database: PostgreSQL
-
Query Language: SQL
-
SQL Concepts Applied:
-
Filtering & Aggregations
-
GROUP BY & HAVING
-
Subqueries & CTEs
-
Window Functions
-
Indexing & Query Optimization
-
Tool: pgAdmin 4
-
Explored and understood the dataset structure and columns
-
Created database tables and imported the dataset
-
Wrote SQL queries categorized into:
-
Easy (basic filtering and counts)
-
Medium (aggregations and grouping)
-
Advanced (CTEs, window functions, optimization)
-
Analyzed query performance using EXPLAIN ANALYZE
-
Improved performance by applying indexing techniques
-
Which tracks have more than 1 billion streams?
-
What are the top 5 tracks with the highest energy levels?
-
How does engagement differ for official vs non-official videos?
-
Which artists and albums generate the highest views?
-
How do audio features like energy and liveness vary across albums?
-
To understand performance tuning, I analyzed query execution plans using EXPLAIN ANALYZE.
-
Measured execution and planning time for queries
-
Created indexes on frequently queried columns (e.g., artist)
-
Compared query performance before and after indexing
-
Observed significant reduction in execution time after optimization
-
This helped me understand how indexing impacts query performance on large datasets.
-
Practical application of SQL for data analysis
-
Writing efficient and optimized queries
-
Using window functions for advanced analytics
-
Understanding query execution plans
-
Working with real-world datasets and business-oriented questions
- Tracks with official videos generally have higher engagement
- High-energy tracks tend to receive more views
- Indexing significantly reduced query execution time
This project enhanced my hands-on experience in SQL and data analysis. It demonstrates my ability to work with structured datasets, write optimized queries, and extract meaningful insights. The project reflects my readiness for entry-level roles in Data Analysis and Business Intelligence.
