Visualizing the T: MBTA Rider Satisfaction Score

About

An interactive analytics platform that computes a comprehensive Rider Satisfaction Score (RSS) for the Massachusetts Bay Transportation Authority (MBTA) by combining operational performance metrics with passenger survey data.

Team Members: Dhanrithii Deepa, Eric Fu, Yuchen Cai
Live Demo: MBTA RSS Streamlit App

Project Goal

Develop a data-driven Rider Satisfaction Score that:

Reflects both perceived experience (survey data) and operational performance (travel times, restrictions, crowding)
Enables equity analysis across demographic groups and service areas
Provides actionable insights at station/line/time-period granularity for MBTA service planning

How to Run

Prerequisites

Python 3.9+
pip or conda

Installation

# Clone repository
git clone https://github.com/urverse/Visualizing-the-T.git
cd Visualizing-the-T

# Install dependencies
pip install -r requirements.txt

Run Analysis Notebooks

# Start Jupyter
jupyter notebook

# Run notebooks in order:
# 1. notebooks/01_data_exploration.ipynb
# 2. notebooks/02_indicator_calculation.ipynb
# 3. notebooks/03_rss_computation.ipynb
# 4. notebooks/04_advanced_analytics.ipynb
# 5. notebooks/05_statistical_validation.ipynb

Run Streamlit Dashboard

streamlit run app/streamlit_app.py

Methodology

Data Sources

MBTA 2024 Passenger Survey - Rider satisfaction ratings and demographics
Rapid Transit Travel Times - Operational performance metrics
Fall 2024 SDP Ridership - Exposure weights by station/line/time
Speed Restrictions by Day - Service disruptions (slow zones)

Approach

1. Indicator Calculation

Travel time reliability (P90/median ratio)
Travel time volatility (coefficient of variation)
Speed restriction coverage (% of line restricted)
Crowding levels (ridership/capacity ratio)
Actual survey satisfaction scores

2. RSS Computation

Literature-based weights applied to normalized indicators
Ridge regression validates weights against survey satisfaction
Exposure-weighted aggregation using ridership data

3. Statistical Validation

ANOVA hypothesis testing for line comparisons
Bootstrap confidence intervals (95% CI)
Equity analysis across income/race demographics

Key Findings

Orange Line achieved highest RSS (73.2) despite serving predominantly low-income riders
Methodology is equity-neutral - no systematic bias against underserved communities
Reliability is strongest satisfaction driver (β = 0.42, p < 0.001)
Statistical validation confirms robust model performance (R² = 0.68)

Results & Outputs

Generated Artifacts

data/final/rss_scores.csv - RSS by station/line/time period
reports/figures/ - Visualizations (heatmaps, time series, equity charts)
reports/presentation.pdf - Final academic presentation

Dashboard Features

Interactive line/station filtering
Time-period comparisons
Equity analysis breakouts
Indicator contribution charts

Repository Structure

Visualizing-the-T/
├── data/
│   ├── raw/              # Original MBTA datasets
│   ├── processed/        # Cleaned data
│   └── final/            # RSS scores
├── notebooks/            # Analysis notebooks (01-05)
├── src/                  # Python modules
├── app/                  # Streamlit dashboard
├── reports/              # Figures and presentation
├── config/               # Weights configuration
└── requirements.txt

Dependencies

Data Processing: pandas, pyspark, numpy
Modeling: scikit-learn, scipy, statsmodels
Visualization: matplotlib, seaborn, plotly
Dashboard: streamlit
Configuration: pyyaml

See requirements.txt for full list with versions.

Team Contributions

Eric Fu: Data engineering, PySpark ETL pipelines, infrastructure setup
Yuchen Cai: Feature engineering, Streamlit app development, dashboard design
Dhanrithii Deepa: Statistical validation, equity analysis, hypothesis testing

License

MIT License - see LICENSE file for details

Acknowledgments

Data provided by MBTA Open Data Portal

Name		Name	Last commit message	Last commit date
Latest commit History 54 Commits
.streamlit		.streamlit
data		data
etl		etl
pages		pages
planning		planning
src		src
.gitignore		.gitignore
Home_Page.py		Home_Page.py
README.md		README.md
app.py		app.py
data_inventory.md		data_inventory.md
reproduce_match.py		reproduce_match.py
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Visualizing the T: MBTA Rider Satisfaction Score

About

Project Goal

How to Run

Prerequisites

Installation

Run Analysis Notebooks

Run Streamlit Dashboard

Methodology

Data Sources

Approach

Key Findings

Results & Outputs

Generated Artifacts

Dashboard Features

Repository Structure

Dependencies

Team Contributions

License

Acknowledgments

About

Uh oh!

Releases

Packages

Contributors 3

Uh oh!

Languages

urverse/Visualizing-the-T

Folders and files

Latest commit

History

Repository files navigation

Visualizing the T: MBTA Rider Satisfaction Score

About

Project Goal

How to Run

Prerequisites

Installation

Run Analysis Notebooks

Run Streamlit Dashboard

Methodology

Data Sources

Approach

Key Findings

Results & Outputs

Generated Artifacts

Dashboard Features

Repository Structure

Dependencies

Team Contributions

License

Acknowledgments

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Contributors 3

Uh oh!

Languages

Packages