Speedup For EDistance-like distances #880

selmanozleyen · 2025-11-12T00:40:41Z

New numba kernels that speed up distances that are based on the mean.

Here are my local benchmark results. The pairwise_mean kernel initially did not gain much speed from numba, but it is more memory-efficient, so I will keep it that way. We can perhaps discuss in a separate topic why sklearn's mean pairwise is faster. Here is the benchmark script: https://gist.github.com/selmanozleyen/f78e2f87661348615bad03485935fcf0

Update: with fastmath it's faster.


--- n_samples=500, n_features=50 ---
Edistance:
  pertpy:      0.98 ms ± 0.05
  sklearn:     4.00 ms ± 0.47
  Speedup: 4.1x
  Results match: True
MeanPairwiseDistance:
  pertpy:      0.42 ms ± 0.02
  sklearn:     0.61 ms ± 0.02
  Speedup: 1.5x
  Results match: True

--- n_samples=1000, n_features=50 ---
Edistance:
  pertpy:      3.44 ms ± 0.14
  sklearn:    13.74 ms ± 1.64
  Speedup: 4.0x
  Results match: True
MeanPairwiseDistance:
  pertpy:      1.90 ms ± 0.22
  sklearn:     2.88 ms ± 0.53
  Speedup: 1.5x
  Results match: True

--- n_samples=2000, n_features=50 ---
Edistance:
  pertpy:     13.52 ms ± 0.72
  sklearn:    50.16 ms ± 0.53
  Speedup: 3.7x
  Results match: True
MeanPairwiseDistance:
  pertpy:      6.25 ms ± 0.46
  sklearn:     9.44 ms ± 0.39
  Speedup: 1.5x
  Results match: True

--- n_samples=5000, n_features=50 ---
Edistance:
  pertpy:     82.86 ms ± 1.96
  sklearn:   368.52 ms ± 9.17
  Speedup: 4.4x
  Results match: True
MeanPairwiseDistance:
  pertpy:     38.79 ms ± 1.85
  sklearn:    66.18 ms ± 0.62
  Speedup: 1.7x
  Results match: True
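
For readers who have not seen this kind of kernel, here is a minimal sketch of the idea: accumulate the mean of the pairwise distances in a loop instead of materializing the full distance matrix. This is not the code from the PR. The function names are hypothetical, it uses plain Euclidean distance, and it averages over all pairs including the zero diagonal; the metric and normalization used by the actual kernels are not shown in this thread. It falls back to plain Python if numba is not installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Fallback so the sketch still runs (slowly) without numba.
    def njit(*args, **kwargs):
        if len(args) == 1 and callable(args[0]) and not kwargs:
            return args[0]  # used as bare @njit

        def wrap(fn):
            return fn  # used as @njit(...)

        return wrap


@njit(fastmath=True)
def mean_pairwise_euclidean(X, Y):
    """Mean Euclidean distance over all (x, y) pairs.

    Only a running sum is kept, so memory use does not grow with
    the number of pairs. Hypothetical helper, not pertpy's API.
    """
    total = 0.0
    for i in range(X.shape[0]):
        for j in range(Y.shape[0]):
            sq = 0.0
            for k in range(X.shape[1]):
                diff = X[i, k] - Y[j, k]
                sq += diff * diff
            total += np.sqrt(sq)
    return total / (X.shape[0] * Y.shape[0])


def energy_distance(X, Y):
    """Energy-style distance built from three mean pairwise terms:
    2 * mean d(X, Y) - mean d(X, X) - mean d(Y, Y)."""
    return (
        2.0 * mean_pairwise_euclidean(X, Y)
        - mean_pairwise_euclidean(X, X)
        - mean_pairwise_euclidean(Y, Y)
    )
```

The inputs are assumed to be 2-D float arrays with the same number of columns. Note that fastmath relaxes strict IEEE floating-point semantics, so results should be compared against a reference with a tolerance rather than exact equality.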

codecov-commenter · 2025-11-12T01:00:39Z

Codecov Report

❌ Patch coverage is 69.62963% with 41 lines in your changes missing coverage. Please review.
✅ Project coverage is 71.88%. Comparing base (12897e1) to head (cbfac60).
⚠️ Report is 15 commits behind head on main.

Files with missing lines	Patch %	Lines
pertpy/tools/_distances/_distances.py	68.93%	41 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #880      +/-   ##
==========================================
- Coverage   73.54%   71.88%   -1.67%     
==========================================
  Files          48       48              
  Lines        5613     5733     +120     
==========================================
- Hits         4128     4121       -7     
- Misses       1485     1612     +127

Files with missing lines	Coverage Δ
pertpy/tools/_mixscape.py	`85.29% <100.00%> (ø)`
pertpy/tools/_distances/_distances.py	`86.29% <68.93%> (-4.29%)`	⬇️

... and 7 files with indirect coverage changes


selmanozleyen · 2025-11-30T08:43:50Z

OK, I found out why the tests fail. When I run this cell directly on the main branch, I get `Distance.precompute_distances() got an unexpected keyword argument 'verbose'`:

import matplotlib.pyplot as plt
import numpy as np
import pertpy as pt
import scanpy as sc
from seaborn import clustermap
adata = pt.dt.distance_example()
obs_key = "perturbation"  # defines groups to test

It turns out the notebook was passing this invalid kwarg, but it didn't fail because the older implementation would already have precomputed the distances, so this branch was never hit:

if f"{self.obsm_key}_{self.cell_wise_metric}_predistances" not in adata.obsp:
    self.precompute_distances(adata, n_jobs=n_jobs, **kwargs)

In the newer implementation the branch is hit, because this cell no longer precomputes the whole distance matrices:

distance = pt.tl.Distance(metric="euclidean", obsm_key="X_pca")
df = distance.pairwise(adata, groupby=obs_key)
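
To make the mechanism concrete, here is a small self-contained sketch of how a conditionally forwarded `**kwargs` can hide an invalid argument. `ToyDistance` and its methods are hypothetical stand-ins, not pertpy's `Distance` class; only the shape of the `if ... not in cache` check mirrors the snippet above.

```python
class ToyDistance:
    """Hypothetical stand-in used only to illustrate the masking effect."""

    def __init__(self):
        self.cache = {}

    def precompute(self, data, n_jobs=1):
        # No **kwargs here, so an unknown keyword raises TypeError.
        self.cache["predistances"] = [abs(a - b) for a in data for b in data]

    def pairwise(self, data, n_jobs=1, **kwargs):
        # kwargs are only forwarded when nothing has been cached yet.
        if "predistances" not in self.cache:
            self.precompute(data, n_jobs=n_jobs, **kwargs)
        return self.cache["predistances"]


# Cache already filled: the bad kwarg is silently ignored.
warm = ToyDistance()
warm.precompute([1, 2, 3])
warm.pairwise([1, 2, 3], verbose=True)

# Cache empty: the same call now reaches precompute() and fails.
cold = ToyDistance()
try:
    cold.pairwise([1, 2, 3], verbose=True)
except TypeError as err:
    print(f"TypeError: {err}")
```

The same call behaves differently depending on hidden state, which is why the notebook only started failing once the precompute step was no longer run beforehand.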

So https://github.com/scverse/pertpy-tutorials/blob/main/distances.ipynb would need updating. That's why I am against kwargs, and even if they are used, each item in them should be checked to see whether it is actually being passed on or not.
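
One possible way to do that check, sketched with the standard library only: validate the keys against the target function's signature before any branching, so an invalid kwarg fails every time and not only on a cache miss. The helper `reject_unknown_kwargs` and the functions `precompute` and `pairwise` are hypothetical names for illustration, not part of pertpy.

```python
import inspect


def reject_unknown_kwargs(func, kwargs):
    """Raise TypeError if kwargs contains names that func does not accept."""
    params = inspect.signature(func).parameters
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return  # func takes **kwargs itself; nothing can be verified here
    allowed = {
        name
        for name, p in params.items()
        if p.kind
        in (inspect.Parameter.POSITIONAL_OR_KEYWORD, inspect.Parameter.KEYWORD_ONLY)
    }
    unknown = sorted(set(kwargs) - allowed)
    if unknown:
        raise TypeError(
            f"{func.__name__}() got unexpected keyword argument(s): "
            + ", ".join(unknown)
        )


def precompute(data, n_jobs=1):
    # Hypothetical precompute step with an explicit signature.
    return [abs(a - b) for a in data for b in data]


def pairwise(data, cache, n_jobs=1, **kwargs):
    # Validate up front, before checking the cache.
    reject_unknown_kwargs(precompute, kwargs)
    if "predistances" not in cache:
        cache["predistances"] = precompute(data, n_jobs=n_jobs, **kwargs)
    return cache["predistances"]
```

With this, `pairwise([1, 2], {"predistances": [0]}, verbose=True)` raises a `TypeError` even though the cache is already filled. Replacing `**kwargs` with explicit parameters would avoid the need for the helper altogether.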

Zethson · 2025-11-30T19:44:40Z

> That's why I am against kwargs, and even if they are used, each item in them should be checked to see whether it is actually being passed on or not.

Yes, I agree with you. We can happily change that.

> So https://github.com/scverse/pertpy-tutorials/blob/main/distances.ipynb would need updating.

Are you willing to make this change or do you want me to do it? Ideally, we'd include the update commit in this PR.

Thank you!

Zethson

Thank you very very much for your work!

I really appreciate that you took the time to make the code much more explicit and clearer.

pertpy/tools/_distances/_distances.py


init

9a50824

selmanozleyen changed the title ~~init~~ Speedup For EDistance like Distances Nov 12, 2025

use a kernel

777d3d6

selmanozleyen changed the title ~~Speedup For EDistance like Distances~~ Speedup For EDistance-like distances Nov 12, 2025

Zethson marked this pull request as draft November 12, 2025 10:46

selmanozleyen and others added 3 commits November 29, 2025 23:46

Merge branch 'main' into speedup/edistance

5828639

kwargs for pairwisedistance

81667dd

fix n_pairs blunder

825b4d6

selmanozleyen marked this pull request as ready for review November 30, 2025 08:43

selmanozleyen mentioned this pull request Dec 1, 2025

remove verbose option in pairwise function scverse/pertpy-tutorials#56

Merged

update subproject commit

98f3c64

Zethson approved these changes Dec 5, 2025


selmanozleyen and others added 10 commits December 5, 2025 11:03

Merge branch 'main' into speedup/edistance

326c64e

Update pertpy/tools/_distances/_distances.py

88a1239

Co-authored-by: Lukas Heumos <lukas.heumos@posteo.net>

Merge branch 'main' into speedup/edistance

fd1965a

resolve comments

837f190

Tutorials

b954aaa

Signed-off-by: Lukas Heumos <lukas.heumos@posteo.net>

Merge branch 'main' into speedup/edistance

380c0de

Update pertpy/tools/_distances/_distances.py

83b5718

Co-authored-by: Lukas Heumos <lukas.heumos@posteo.net>

Fix mixscape

4f3680b

Signed-off-by: Lukas Heumos <lukas.heumos@posteo.net>

Merge branch 'speedup/edistance' of https://github.com/selmanozleyen/…

247c14c

…pertpy into speedup/edistance

fastmath=True

cbfac60

Zethson merged commit c36eb42 into scverse:main Dec 11, 2025
17 checks passed

selmanozleyen deleted the speedup/edistance branch December 11, 2025 12:59
