Counter-Strike: Global Offensive Champions Match Tracker

A MongoDB-based data science project that tracks and analyses CS:GO championship matches using web scraping and aggregation pipelines.

Project Overview

During the COVID-19 lockdown, Counter-Strike: Global Offensive (CS:GO) was more than a game: it was a way to stay connected with friends. Years later, at university, those shared experiences and a memorable esports tournament (PGL Major Antwerp 2022) inspired this project, which explores CS:GO match data by sourcing, storing, and analysing statistics in a structured MongoDB environment.

Dataset Description

Source: Scraped from bo3.gg
Matches Analysed: PGL Major Antwerp 2022 matches
Data Type: Match metadata, teams, players, maps, stats, and simulated commentary

Schema Design

The project uses six MongoDB collections:

`matches`

match_id, title, tournament, date
teams: list of team IDs
score, map_ids

`teams`

team_id, team_name
players: list of player IDs
tournament

`players`

player_id, name, tournament, team_id

`maps`

map_id, name, times_played

`player_stats`

match_id, player_id, kills, deaths, assists

`commentary`

commentary_id, match_id, timestamp, text, tags

All ID fields (match_id, team_id, player_id, map_id, commentary_id) are generated as UUIDs, which keeps references between collections consistent and easy to follow.
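As an illustration of the UUID-keyed schema, the following is a minimal sketch of how a `teams` and a `matches` document might be assembled. The field names come from the schema above; the helper `new_id`, the sample values, and the string formats chosen for `date` and `score` are assumptions, not taken from the notebook.

```python
import uuid


def new_id() -> str:
    """Return a random (version 4) UUID as a string, usable as a document key."""
    return str(uuid.uuid4())


# Hypothetical sample values; only the field names follow the schema.
team_a = {
    "team_id": new_id(),
    "team_name": "Team A",
    "players": [],
    "tournament": "PGL Major Antwerp 2022",
}
team_b = {
    "team_id": new_id(),
    "team_name": "Team B",
    "players": [],
    "tournament": "PGL Major Antwerp 2022",
}

match = {
    "match_id": new_id(),
    "title": "Team A vs Team B",
    "tournament": "PGL Major Antwerp 2022",
    "date": "2022-05-22",   # assumed ISO date string
    "teams": [team_a["team_id"], team_b["team_id"]],  # references by team_id
    "score": "2-0",         # assumed string format
    "map_ids": [],
}
```

Storing only the team IDs inside `matches` keeps each team's details in one place, at the cost of needing a lookup when a full match summary is required.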

Data Ingestion

- Web scraper built using BeautifulSoup4, requests, uuid, and pymongo
- Error handling and duplicate prevention mechanisms
- HTML parsing logic refined over 30+ iterations
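Because IDs are freshly generated UUIDs, re-running the scraper could create a second document for the same team unless duplicates are detected some other way. One possible approach, sketched below, is an upsert keyed on a natural identifier. The helper `team_upsert_args` and the choice of `team_name` plus `tournament` as that identifier are assumptions; the notebook may prevent duplicates differently.

```python
import uuid


def team_upsert_args(team_name: str, tournament: str):
    """Build (filter, update) arguments for an idempotent team insert.

    Assumption: a team is uniquely identified by its name within a tournament.
    """
    natural_key = {"team_name": team_name, "tournament": tournament}
    update = {
        # $setOnInsert is applied only when the upsert creates a new document,
        # so a team that is scraped again keeps the team_id it was first given.
        "$setOnInsert": {"team_id": str(uuid.uuid4()), "players": []},
    }
    return natural_key, update


# With a live database (not run here) the arguments would be passed to pymongo:
#   flt, upd = team_upsert_args("Team A", "PGL Major Antwerp 2022")
#   db.teams.update_one(flt, upd, upsert=True)
```

The same pattern could be applied to players, maps, and matches, each with its own natural key.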

Live Commentary Simulation

Since timestamped highlights were unavailable, a 2-minute mock simulation of live commentary was scripted. See the .ipynb notebook for execution instructions.
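A scripted simulation of this kind can be sketched as a list of timed lines that are turned into `commentary` documents. The script contents, the helper `build_commentary`, and the start time below are hypothetical; only the document fields follow the schema.

```python
import uuid
from datetime import datetime, timedelta, timezone

# Hypothetical script: (seconds from start, text, tags), spanning two minutes.
SCRIPT = [
    (0, "Match is live.", ["start"]),
    (45, "First kill of the round.", ["kill"]),
    (120, "Round over.", ["round_end"]),
]


def build_commentary(match_id, start, script=SCRIPT):
    """Turn scripted lines into commentary documents with absolute timestamps."""
    return [
        {
            "commentary_id": str(uuid.uuid4()),
            "match_id": match_id,
            "timestamp": start + timedelta(seconds=offset),
            "text": text,
            "tags": list(tags),
        }
        for offset, text, tags in script
    ]


start = datetime(2022, 5, 22, 18, 0, tzinfo=timezone.utc)
docs = build_commentary("example-match-id", start)
```

To replay the script in real time, each document could be written with `insert_one` after sleeping for the gap between consecutive offsets.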

Data Verification Functions

Implemented in Python with pymongo:

- `verify_single_team_per_player()`
- `verify_teams_in_matches()`
- `verify_player_stats_references()`
- `verify_maps_in_matches()`
- `verify_commentary_matches()`

These checks verify referential integrity across the collections.
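The notebook's implementations are not reproduced here. As a rough sketch of what two of these checks might look like, the functions below accept any iterable of documents (for example the cursor returned by `db.teams.find()`) and return the offending entries, so an empty result means the check passed. The bodies and return shapes are assumptions.

```python
from collections import defaultdict


def verify_single_team_per_player(teams):
    """Return {player_id: [team_ids]} for players listed on more than one team."""
    seen = defaultdict(set)
    for team in teams:
        for player_id in team.get("players", []):
            seen[player_id].add(team["team_id"])
    return {pid: sorted(tids) for pid, tids in seen.items() if len(tids) > 1}


def verify_player_stats_references(player_stats, matches, players):
    """Return stats documents whose match_id or player_id has no target document."""
    match_ids = {m["match_id"] for m in matches}
    player_ids = {p["player_id"] for p in players}
    return [
        s for s in player_stats
        if s["match_id"] not in match_ids or s["player_id"] not in player_ids
    ]
```

Returning the violations rather than a boolean makes it easier to see which documents need fixing after a scrape.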

Aggregation Queries

- Full Match Summary: nested lookups and array queries that compile match data, participating teams, players, and commentary.
- Top KDA Players: identifies top players by computing KDA = (kills + assists) / deaths.
- Top Performer per Team/Match: lists the player with the highest KDA from each team in every match.
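The KDA ranking can be sketched as an aggregation pipeline over `player_stats`. Two details below are assumptions rather than something the README specifies: totals are summed per player across all matches, and a death count of 0 is treated as 1 to avoid division by zero. The function names `kda` and `top_kda_pipeline` are hypothetical.

```python
def kda(kills, deaths, assists):
    """KDA = (kills + assists) / deaths, with 0 deaths treated as 1 (assumption)."""
    return (kills + assists) / max(deaths, 1)


def top_kda_pipeline(limit=10):
    """Build a pipeline that totals stats per player and ranks them by KDA."""
    return [
        # Sum kills, deaths, and assists for each player across matches.
        {"$group": {
            "_id": "$player_id",
            "kills": {"$sum": "$kills"},
            "deaths": {"$sum": "$deaths"},
            "assists": {"$sum": "$assists"},
        }},
        # Compute KDA, guarding against a zero denominator.
        {"$addFields": {
            "kda": {"$divide": [
                {"$add": ["$kills", "$assists"]},
                {"$max": [1, "$deaths"]},
            ]},
        }},
        {"$sort": {"kda": -1}},
        {"$limit": limit},
    ]
```

With a live database the pipeline would be run as `db.player_stats.aggregate(top_kda_pipeline(5))`; the plain `kda` function is handy for spot-checking individual results.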

Additional explanation is available in the Jupyter Notebook.

Project Files

match_tracker.ipynb: Code for scraping, ingestion, and aggregation
README.md: This file
data/: (optional) Any manually backed up datasets
