MATE: Motion-Conditioned Attention and Temporal-Conditioned Convolutions for Efficient Video Anomaly Detection
This repository implements a compact motion-conditioned architecture for weakly-supervised video anomaly detection. The model conditions attention and pooling on an explicit MotionMap derived from feature differences and operates on precomputed X3D features extracted from UCF-Crime videos.
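As a rough illustration of the idea, the following NumPy sketch pools per-clip features with attention weights derived from temporal feature differences. The MotionMap definition, the softmax attention form, and the feature dimension (192 here) are assumptions for illustration, not the model's actual formulation:

```python
import numpy as np

def motion_weighted_pool(features, temperature=1.0):
    """Pool per-clip features (T, D) with motion-derived attention weights.

    The "MotionMap" here is approximated as the L2 norm of first-order
    temporal feature differences; the real model's formulation may differ.
    """
    # Motion magnitude per time step from feature differences -> shape (T-1,)
    diffs = np.linalg.norm(np.diff(features, axis=0), axis=1)
    # Repeat the first value so every time step gets a weight -> shape (T,)
    motion = np.concatenate([diffs[:1], diffs])
    # Softmax over time (max-subtracted for stability) favours high-motion clips
    w = np.exp((motion - motion.max()) / temperature)
    w /= w.sum()
    return (w[:, None] * features).sum(axis=0)

rng = np.random.default_rng(0)
feats = rng.normal(size=(32, 192)).astype(np.float32)  # 32 clips, 192-dim features
pooled = motion_weighted_pool(feats)
print(pooled.shape)
```

High-motion clips receive larger weights, so a subsequent scoring head sees a representation dominated by the most dynamic segments.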
Quick links
- UCF-Crime dataset information: https://www.crcv.ucf.edu/projects/real-world/
- Extracted X3D features (shared): https://drive.google.com/drive/u/0/folders/18QmmfQZqhRqzFsHM7LiyPksteUzsuBC0
Repository overview
- requirements.txt — Python dependencies
- feature_extractor.py — extract and save X3D features from raw videos (saved as .npy)
- makelist.py — generate train/test manifest lists (video_path + label)
- StratifiedKFold.py — create stratified K folds for cross-validation
- StratifiedKFold_verify.py — verify class balance in the folds and print a summary
- main.py / train.py — training entry point
- test.py — evaluation / scoring script (produces per-video scores and metrics)
- test1.py — motion / UMAP / frame-level motion visualizations and stats
- test2.py — ROC AUC plots with / without motion
- motion_map_analyze.py — motion statistics for normal vs. anomaly (entropy, area ratio)
- snr_experiment.py — synthetic signal-to-noise ratio (SNR) experiment
- scripts/ — helper scripts (option.py, dataset.py)
- configs/ — example config files (fast / base variants)
- runs/ — (runtime) training logs, checkpoints, and predictions per experiment
Environment setup
- Create and activate a Python virtual environment (recommended):
```
python -m venv .venv
source .venv/bin/activate   # Linux / macOS
# .venv\Scripts\activate    # Windows
```
- Install dependencies:
```
pip install -r requirements.txt
```
Data preparation (using precomputed features)
- Download the provided X3D feature files from the Drive folder above (or extract your own using feature_extractor.py) and place them under data/ucf_x3d/.
Keep a consistent filesystem layout and naming convention for features. Each feature file should be a .npy file representing per-clip or per-video X3D features.
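A quick sanity check of the downloaded features can save debugging time later. The sketch below assumes each file is a 2-D array of shape (num_clips, feature_dim); adjust the check if your extraction settings save a different layout:

```python
import numpy as np
from pathlib import Path

def check_features(features_dir):
    """Load every .npy under features_dir and report its shape and dtype.

    Assumes each file is (num_clips, feature_dim); raises if a file has a
    different number of dimensions.
    """
    shapes = []
    for path in sorted(Path(features_dir).glob("*.npy")):
        arr = np.load(path)
        assert arr.ndim == 2, f"unexpected shape {arr.shape} in {path.name}"
        print(f"{path.name}: shape={arr.shape}, dtype={arr.dtype}")
        shapes.append(arr.shape)
    return shapes

shapes = check_features("data/ucf_x3d")
```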
- Feature extraction (if you start from raw videos): run the feature extractor with the dataset root and output folder; it converts raw videos into .npy feature files, saving one .npy per sample (or per clip) depending on configuration. Example:
```
python feature_extractor.py --video_root path/to/UCF-Crime --out_dir data/ucf_x3d/
```
(See feature_extractor.py --help for flags such as frames-per-clip, sampling rate, backbone checkpoint, etc.)
Create train / test manifests
- Use makelist.py to build plain-text manifests listing the feature path and video-level label (0 = normal, 1 = anomalous), one entry per line. Example:
```
python makelist.py --features_dir data/ucf_x3d/ --out_train ucf_x3d_train.txt --out_test ucf_x3d_test.txt
```
K-fold cross-validation (stratified)
- Create K folds (default K = 5) stratified by video-level labels: run the stratified fold generator on your training manifest to produce K train/val manifest pairs. Example:
```
python StratifiedKFold.py --rgb_list ucf_x3d_train.txt
```
- Verify fold balances:
```
python StratifiedKFold_verify.py --rgb_list ucf_x3d_train.txt
```
Example verification output (per fold):
```
Fold 1:
  Train -> Normal: 640, Abnormal: 648
  Val   -> Normal: 160, Abnormal: 162
...
```
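Conceptually, the fold generation is a standard stratified split. A minimal sketch with scikit-learn follows; the manifest entries and the (path, label) layout are hypothetical stand-ins for whatever makelist.py actually produces:

```python
from sklearn.model_selection import StratifiedKFold

# Hypothetical manifest entries: (feature_path, video-level label)
entries = [(f"data/ucf_x3d/normal_{i:03d}.npy", 0) for i in range(10)] + \
          [(f"data/ucf_x3d/anomaly_{i:03d}.npy", 1) for i in range(10)]
labels = [lbl for _, lbl in entries]

# Stratification keeps the normal/abnormal ratio the same in every fold
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
for fold, (train_idx, val_idx) in enumerate(skf.split(entries, labels)):
    n_abnormal = sum(labels[i] for i in val_idx)
    print(f"Fold {fold}: train={len(train_idx)}, val={len(val_idx)}, "
          f"val abnormal={n_abnormal}")
```

Each train/val index pair would then be written out as a fold-specific manifest file.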
Training

Train the model by supplying the training and validation manifest files and choosing a model variant (e.g., a compact/fast variant or a larger/base variant). Set the batch size, learning rate, number of epochs, and experiment name. Example:
```
python main.py \
  --train_list data/manifests/fold0_train.txt \
  --val_list data/manifests/fold0_val.txt \
  --batch_size 16 \
  --lr 0.002 \
  --max_epoch 150 \
  --exp_name "motion_cond_fast" \
  --model_variant fast
```
Notes:
- Keep batch size identical across comparative experiments.
- Save the configuration (hyperparameters) per run — the script writes config.json in the run folder.
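If you script your own runs, writing the hyperparameters out yourself keeps experiments reproducible. A minimal sketch (the config.json schema here is assumed, not the training script's exact one):

```python
import json
from pathlib import Path

def save_run_config(run_dir, **hparams):
    """Write the run's hyperparameters to <run_dir>/config.json.

    Mirrors what the training script is described as doing; the exact
    schema it uses may differ.
    """
    run_dir = Path(run_dir)
    run_dir.mkdir(parents=True, exist_ok=True)
    (run_dir / "config.json").write_text(
        json.dumps(hparams, indent=2, sort_keys=True))

save_run_config("runs/motion_cond_fast",
                batch_size=16, lr=0.002, max_epoch=150, model_variant="fast")
```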
Testing / evaluation

Evaluate a trained checkpoint on a test manifest to produce per-video anomaly scores and compute video-level metrics (ROC-AUC, PR-AUC). Example:
```
python test.py \
  --test_list data/manifests/fold0_test.txt \
  --checkpoint runs/motion_cond_fast/checkpoint_best.pth
```
- The script outputs per-video prediction scores, ROC-AUC, and PR-AUC, and can save per-video CSVs for later aggregation.
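The reported video-level metrics can be reproduced from saved scores with scikit-learn. A minimal sketch (the function name and the toy labels/scores are illustrative):

```python
import numpy as np
from sklearn.metrics import roc_auc_score, average_precision_score

def video_level_metrics(labels, scores):
    """Compute the video-level metrics reported by the evaluation script:
    ROC-AUC and PR-AUC (average precision)."""
    labels = np.asarray(labels)
    scores = np.asarray(scores)
    return {
        "roc_auc": roc_auc_score(labels, scores),
        "pr_auc": average_precision_score(labels, scores),
    }

# Toy example: mostly higher scores for the anomalous videos (label 1)
m = video_level_metrics([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(m)
```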
K-fold evaluation & aggregation
- Run training + testing for each fold (0..K-1): train on foldX_train.txt, validate on foldX_val.txt, and test on foldX_test.txt.
- Save per-fold test predictions and metrics.
- Aggregate: compute mean ± std for validation and test ROC-AUC and PR-AUC across folds. Use the provided bootstrap script for CIs and paired tests.
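The aggregation step can be sketched as follows. This stands in for the provided bootstrap script and uses a simple percentile bootstrap (assumptions: 95% CI, and the sample fold AUCs are placeholders):

```python
import numpy as np

def aggregate_fold_metrics(fold_aucs, n_boot=10000, seed=0):
    """Mean +/- std across folds plus a percentile-bootstrap 95% CI.

    A sketch only; the repository's bootstrap script may use a different
    CI construction or add paired tests.
    """
    aucs = np.asarray(fold_aucs, dtype=float)
    rng = np.random.default_rng(seed)
    # Resample folds with replacement and recompute the mean each time
    boots = rng.choice(aucs, size=(n_boot, len(aucs)), replace=True).mean(axis=1)
    lo, hi = np.percentile(boots, [2.5, 97.5])
    return {"mean": aucs.mean(), "std": aucs.std(ddof=1), "ci95": (lo, hi)}

stats = aggregate_fold_metrics([0.86, 0.88, 0.85, 0.87, 0.86])
print(f"ROC-AUC: {stats['mean']:.3f} +/- {stats['std']:.3f}, "
      f"95% CI ({stats['ci95'][0]:.3f}, {stats['ci95'][1]:.3f})")
```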
Example fold run loop (conceptual):
```
for fold in 0 1 2 3 4; do
  python main.py --train_list data/manifests/fold${fold}_train.txt --val_list data/manifests/fold${fold}_val.txt --exp_name exp_fold${fold} ...
  python test.py --test_list data/manifests/fold${fold}_test.txt --checkpoint runs/exp_fold${fold}/checkpoint_best.pth
done
```
Ablation study

Remove or disable components and retrain / re-evaluate to measure their effect on final performance. Example components to ablate: MotionMap, Attention Block, Temporal-Conditioned Convolution (TCC) variants, Motion-Aware Pooling, Triplet Loss. Below is a sample ablation summary (example numbers — replace with your results):
| Component Removed | # Params (approx) | ROC-AUC |
|---|---|---|
| Motion Map | 0.5M | 0.85 |
| Attention Block | 1.2M | 0.87 |
| TCC (w/ Multiscale) | 0.8M | 0.86 |
| TCC (w/ Motion Gating) | 0.8M | 0.86 |
| TCC (no extras) | 0.8M | 0.88 |
| TCC (Multiscale + Motion Gating) | 0.8M | 0.84 |
| Triplet Loss | 0.1M | 0.85 |
| Motion-Aware Temporal Pooling Block | 0.05M | 0.82 |
How to run an ablation:
- Modify the model config or CLI flags to disable a component (e.g., --use_motion_gating False) and re-run the training and testing steps for the same folds. Save results per fold and aggregate.
Motion diagnostics & visualization
- Compare AUC with and without motion-based conditioning: test2.py produces paired ROC curves and an AUC comparison (motion ON vs. OFF). Example:
```
python test2.py
```
- UMAP and frame-level motion plots: test1.py visualizes embedding spaces (UMAP) and plots normal vs. anomalous motion trajectories and per-frame motion statistics. Example:
```
python test1.py
```
- Per-class motion statistics: motion_map_analyze.py prints motion statistics such as Motion Entropy and Motion Area Ratio for the normal vs. anomaly classes and plots motion over frames. Example output:
```
[NORMAL]  Motion Entropy: 7.2806  Motion Area Ratio: 0.0100
[ANOMALY] Motion Entropy: 7.2930  Motion Area Ratio: 0.0169
```
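One plausible way to compute these two statistics is sketched below. The exact definitions used by motion_map_analyze.py are assumptions here: Shannon entropy of the motion-magnitude histogram (in bits) and the fraction of pixels above a fixed threshold:

```python
import numpy as np

def motion_stats(motion_map, area_thresh=0.5, bins=256):
    """Motion Entropy and Motion Area Ratio for a motion map in [0, 1].

    Entropy: Shannon entropy (bits) of the motion-magnitude histogram.
    Area ratio: fraction of pixels whose motion exceeds area_thresh.
    """
    m = np.asarray(motion_map, dtype=float).ravel()
    hist, _ = np.histogram(m, bins=bins, range=(0.0, 1.0))
    p = hist / hist.sum()
    p = p[p > 0]  # drop empty bins; 0 * log(0) is taken as 0
    entropy = float(-(p * np.log2(p)).sum())
    area_ratio = float((m > area_thresh).mean())
    return entropy, area_ratio

rng = np.random.default_rng(0)
toy_map = rng.random((64, 64)) ** 4  # mostly low-motion toy map
ent, area = motion_stats(toy_map)
print(f"Motion Entropy: {ent:.4f}  Motion Area Ratio: {area:.4f}")
```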
Synthetic SNR experiment
- Use snr_experiment.py to run controlled experiments showing how motion-weighted pooling increases SNR and improves detection on a synthetic dataset where anomalies correlate with motion. Example:
```
python snr_experiment.py
```
Recommended command-line help
- For each script, run:
```
python script.py --help
```
to see the available flags. The examples above use common flags; consult each script's --help for exact flag names.
Logging & outputs
- Each run writes a folder under runs/{exp_name}/ containing:
  - config.json (full hyperparameters)
  - checkpoint_best.pth
  - train.log / metrics.csv
  - preds_fold{X}.npy (per-fold predictions)
  - plots/ (training curves, ROC/PR curves)
- Keep all run folders and compress or upload the relevant ones when sharing results.
Licensing & citation
Acknowledgements & dataset notice
- The UCF-Crime dataset and its authors retain rights to the data; follow dataset terms and cite the original sources.
- The precomputed X3D feature folder linked above is provided for convenience in reproducing experiments.
Appendix: useful commands summary (examples)
- Extract features:
```
python feature_extractor.py --video_root /path/to/UCF-Crime --out_dir data/ucf_x3d/
```
- Make train/test lists:
```
python makelist.py --features_dir data/ucf_x3d/ --out_train ucf_x3d_train.txt --out_test ucf_x3d_test.txt
```
- Create stratified 5-folds and verify:
```
python StratifiedKFold.py --rgb_list ucf_x3d_train.txt --k 5 --out_dir data/manifests/
python StratifiedKFold_verify.py --rgb_list ucf_x3d_train.txt --folds_dir data/manifests/
```
- Train (example):
```
python main.py --train_list data/manifests/fold0_train.txt --val_list data/manifests/fold0_val.txt --batch_size 16 --lr 0.002 --max_epoch 150 --exp_name motion_cond_fast --model_variant fast
```
- Test (example):
```
python test.py --test_list data/manifests/fold0_test.txt --checkpoint runs/motion_cond_fast/checkpoint_best.pth
```
- Run motion diagnostics:
```
python motion_map_analyze.py
```
- Synthetic SNR experiment:
```
python snr_experiment.py
```