A comprehensive data visualization project demonstrating various charting techniques using Matplotlib, Seaborn, and Plotly. This project analyzes educational platform user behavior data through multiple visualization approaches, from basic line charts to interactive animations.
This project processes student behavior data from an educational platform database to create meaningful visualizations. Each exercise builds on the previous one, exploring different aspects of the data through various chart types and styling approaches.
ex00: Basic line chart showing daily page views using Pandas plotting functionality
ex01: Dual-line chart comparing page views and commits with custom styling
ex02: Bar chart analyzing commit patterns across different times of day
ex03: Comparative bar charts showing weekday vs weekend commit patterns
ex04: Overlapping histograms comparing commit distributions using Matplotlib
ex05: Box plot analysis of control vs test group behavior with custom styling
ex06: Scatter matrix visualization showing correlations between multiple variables
ex07: Heatmap visualizations for temporal commit patterns across users
ex08: Advanced styling with Seaborn for project commit dynamics
ex09: Interactive animated line race chart using Plotly
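As a rough illustration of the overlapping-histogram exercise, the sketch below draws two semi-transparent histograms on shared bins. The data is synthetic, and the variable names, figure size, font size, and alpha value are assumptions chosen for the example, not the settings used in the notebooks.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Synthetic commit hours, made up for illustration only.
rng = np.random.default_rng(0)
weekday_hours = rng.integers(8, 22, size=200)   # hours 8..21
weekend_hours = rng.integers(10, 24, size=80)   # hours 10..23

# Global figure size and font size (assumed values) set through rcParams.
plt.rcParams.update({"figure.figsize": (15, 8), "font.size": 8})

fig, ax = plt.subplots()
bins = np.arange(0, 25)  # 25 edges, so one bin per hour of the day

# alpha < 1 keeps both distributions visible where they overlap.
weekday_counts, _, _ = ax.hist(weekday_hours, bins=bins, alpha=0.7, label="working days")
weekend_counts, _, _ = ax.hist(weekend_hours, bins=bins, alpha=0.7, label="weekends")

ax.set_xlabel("hour of day")
ax.set_ylabel("commits")
ax.legend()
fig.canvas.draw()  # render once; in a notebook the figure is displayed instead
```

Sharing one `bins` array between both calls matters here: without it each histogram picks its own bin edges and the bars no longer line up.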
• SQL queries with SQLite3 for database operations and filtering non-admin users
• Pandas DataFrame operations including grouping, aggregation, and time-based filtering
• Time-based data categorization using datetime operations to classify commits by time periods
• Multi-index operations for complex data grouping and analysis
• Custom color palettes and transparency effects using Matplotlib's alpha parameter
• Figure sizing and font customization with Matplotlib's rcParams configuration
• Subplot positioning and layout management using Matplotlib's figure and axes system
• KDE (Kernel Density Estimation) plots for distribution visualization
• Scatter matrix generation using pandas.plotting.scatter_matrix
• Heatmap creation with custom colormaps and axis labeling
• Frame-based animation with Plotly's graph_objects for interactive charts
• Pandas - Advanced data manipulation and analysis
• SQLite3 - Database connectivity and querying
• NumPy - Numerical computing for data processing
• Matplotlib - Foundational plotting library with extensive customization options
• Seaborn - Statistical data visualization with built-in themes and color palettes
• Plotly - Interactive plotting library for web-based visualizations
• mpl_toolkits.axes_grid1 - Advanced subplot positioning and colorbar management
• Custom transparency handling for overlapping visualizations
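The time-based categorization and multi-index grouping listed above could look roughly like the following sketch. The column names `uid` and `timestamp`, the sample rows, and the hour boundaries for each period are all assumptions made for the example; the notebooks may define them differently.

```python
import pandas as pd

# Made-up commit log; column names and rows are illustrative assumptions.
commits = pd.DataFrame({
    "uid": ["user_1", "user_1", "user_2", "user_2", "user_2"],
    "timestamp": pd.to_datetime([
        "2020-04-17 05:19:02",
        "2020-04-17 11:33:17",
        "2020-04-18 13:42:35",
        "2020-04-18 18:51:37",
        "2020-04-19 23:03:06",
    ]),
})

# Classify each commit by hour. Assumed boundaries:
# [0, 4) night, [4, 10) morning, [10, 17) afternoon, [17, 24) evening.
commits["period"] = pd.cut(
    commits["timestamp"].dt.hour,
    bins=[0, 4, 10, 17, 24],
    right=False,  # left-closed intervals, so hour 4 counts as morning
    labels=["night", "morning", "afternoon", "evening"],
).astype(str)

# Grouping by two keys yields a Series indexed by a (uid, period) MultiIndex.
counts = commits.groupby(["uid", "period"]).size()

# unstack moves the inner index level into columns; missing pairs become 0.
table = counts.unstack(fill_value=0)
print(table)
```

The wide `table` has one row per user and one column per period, which is a convenient shape to pass to a bar chart or heatmap.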
├── src/
│   ├── ex00/
│   ├── ex01/
│   ├── ex02/
│   ├── ex03/
│   ├── ex04/
│   ├── ex05/
│   ├── ex06/
│   ├── ex07/
│   ├── ex08/
│   └── ex09/
├── data/
└── misc/
    └── images/
src/: Contains all Jupyter notebooks, organized by exercise number. Each notebook demonstrates a different visualization technique and data analysis approach.
data/: Stores the SQLite database and CSV files used for analysis. The main database contains the pageviews and checker tables with student activity data.
misc/images/: Reference images showing the expected output for each visualization exercise.
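A minimal sketch of querying the checker table while excluding admin accounts is shown below. It uses an in-memory database so it runs on its own. Only the table name comes from this README; the columns, the sample rows, and the rule that regular accounts are named `user_*` are assumptions.

```python
import sqlite3

# In-memory stand-in for the project database (hypothetical schema and rows).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE checker (uid TEXT, timestamp TEXT)")
conn.executemany(
    "INSERT INTO checker VALUES (?, ?)",
    [
        ("user_1", "2020-04-17 05:19:02"),
        ("user_1", "2020-04-17 11:33:17"),
        ("user_2", "2020-04-18 13:42:35"),
        ("admin_1", "2020-04-18 14:00:00"),
    ],
)

# Assumption: non-admin accounts start with 'user_'.
# In LIKE patterns '_' matches any single character and '%' any sequence.
query = """
    SELECT uid, date(timestamp) AS day, COUNT(*) AS commits
    FROM checker
    WHERE uid LIKE 'user_%'
    GROUP BY uid, day
    ORDER BY uid, day
"""
rows = conn.execute(query).fetchall()
conn.close()

for uid, day, commits in rows:
    print(uid, day, commits)
```

With the real database file, the same query can be passed to `pandas.read_sql` together with an open connection to get a DataFrame directly.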
The project demonstrates progression from basic static charts to advanced interactive visualizations. Each exercise explores different aspects of the same dataset, showing how visualization choice affects data interpretation. The final Plotly exercise creates an animated line race chart that dynamically shows commit progress over time.
The visualizations reveal patterns like peak usage hours, weekend vs weekday behavior differences, and user engagement metrics that would be difficult to spot in raw data tables.