
Conversation


@CeliaBenquet CeliaBenquet commented May 1, 2025

This PR adds a PyTorch implementation of a unified CEBRA encoder, which is composed of:

  • A new sampling scheme that samples across all sessions so that they can be aligned on the neuron axis to train a single encoder.
  • A unified Dataset and Loader, adapted to the new sampling scheme.
  • A unified Solver that handles alignment across multiple sessions at inference.
  • A new masked modeling training option, with different types of masking.
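To illustrate the masked modeling training option, here is a rough sketch of one masking type (random neuron masking). Names and signatures here are hypothetical and do not reflect the actual CEBRA API; the real implementation supports several masking schemes.

```python
import torch

def apply_neuron_mask(batch: torch.Tensor, mask_ratio: float = 0.2,
                      generator=None) -> torch.Tensor:
    """Zero out a random subset of neuron entries in a batch.

    batch: (batch_size, num_neurons) tensor of neural activity.
    Returns a masked copy; the original batch is left untouched.
    """
    # For each entry, decide whether to keep it (True) or mask it (False).
    keep = torch.rand(batch.shape, generator=generator) >= mask_ratio
    # Multiplying by the boolean mask zeros out the masked entries.
    return batch * keep

# Example: mask roughly half the entries of a batch of ones.
batch = torch.ones(4, 10)
masked = apply_neuron_mask(batch, mask_ratio=0.5)
```

The encoder is then trained on the masked input, which encourages representations that are robust to missing neurons.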

🚧 A preprint is pending "Unified CEBRA Encoders for Integrating Neural Recordings via Behavioral Alignment" by Célia Benquet, Hossein Mirzaei, Steffen Schneider, Mackenzie W. Mathis.

💻 A DEMO Notebook is available at: https://cebra.ai/docs/demos.html



@MMathisLab MMathisLab left a comment


Thanks @CeliaBenquet! I went through and left comments for discussion.


@stes stes left a comment


Looks good overall; left some comments!

  • Implementation of the Mixin class for the masking: if I understood correctly, the only change is that this apply_mask function is applied after loading a batch. This could be applied minimally invasively in the data loader rather than in the dataset. Is there a good case why the datasets themselves need to be modified?
  • Discussion on where to place the decoders: they currently live in cebra.models.decoders. Are the decoders useful as "standalone" models? Where are they currently used? Based on that, we could determine whether to move them, e.g., as standalone modules to integrations.
  • See other comments; mostly on class design, removing duplicated code, etc.
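A minimally invasive version of the loader-side masking suggested above could look like the following sketch. The names here (MaskedLoader, apply_mask) are hypothetical, not the CEBRA API; the point is that the wrapper masks each batch after loading, so the dataset classes stay untouched.

```python
import torch

class MaskedLoader:
    """Hypothetical sketch: wrap an existing loader and apply masking to
    each batch as it is yielded, leaving the underlying dataset unchanged."""

    def __init__(self, loader, apply_mask):
        self.loader = loader          # any iterable yielding batch tensors
        self.apply_mask = apply_mask  # callable: batch -> masked batch

    def __iter__(self):
        for batch in self.loader:
            yield self.apply_mask(batch)

# Usage with a trivial base loader and a zero-out-half mask:
base = [torch.ones(2, 6) for _ in range(3)]
mask = lambda b: b * (torch.rand_like(b) >= 0.5)
masked_batches = list(MaskedLoader(base, mask))
```

With this design, swapping masking strategies only means passing a different callable; no dataset subclass needs to know about masking at all.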


@MMathisLab MMathisLab left a comment


LGTM! Just the one comment on kwargs seems critical to decide.

@MMathisLab MMathisLab requested a review from stes May 28, 2025 23:02
@MMathisLab MMathisLab changed the title Add unified encoder pytorch implementation Add unified CEBRA encoder: pytorch implementation Jun 5, 2025

@stes stes left a comment


Remaining issues are:

  • more tests (e.g. for changes to model, adaptations of sklearn API, ...)
  • improved class design for masking
  • improved class design for solver hierarchy

However, these can be addressed in follow-up contributions, so this can be merged now. Attempt to clean up the commit message for squashing:

* start tests

* remove print statements

* first passing test

* move functionality to base file in solver and separate in functions

* add test_select_model for multisession

* remove float16

* Improve modularity remove duplicate code and todos

* Add tests to solver

* Fix save/load

* Fix extra docs errors

* Add review updates

* apply ruff auto-fixes

* fix linting errors

* Run isort, ruff, yapf

* Fix gaussian mixture dataset import

* Fix all tests but xcebra tests

* Fix pytorch API usage example

* Make xCEBRA compatible with the batched inference & padding in solver

* Add some tests on transform() with xCEBRA

* Add some docstrings and typings and clean unnecessary changes

* Implement review comments

* Fix sklearn test

* Initial pass at integrating unifiedCEBRA

* Add name in NOTE

* Implement reviews on tests and typing

* Fix import errors

* Add select_model to aux solvers

* Fix tests

* Add mask tests

* Fix docs error

* Remove masking init()

* Remove shuffled neurons in unified dataset

* Remove extra datasets

* Add tests on the private functions in base solver

* Update tests and duplicate code based on review

* Fix quantized_embedding_norm undefined when `normalize=False` (#249)

* Fix tests

* Adapt unified code to get_model method

* Update mask.py

add headers to new files

* Update masking.py

- header

* Update test_data_masking.py

- header

* Implement review comments and fix typos

* Fix docs errors

* Remove np.int typing error

* Fix docstring warning

* Fix indentation docstrings

* Implement review comments

* Fix circular import and abstract method

* Add maskedmixin to __all__

* Implement extra review comments

* Change masking kwargs as tuple and not dict in sklearn impl

* Add integrations/decoders.py

* Fix typo

* minor simplification in solver

---------

Note, some comments in this PR overlap with
https://github.com/AdaptiveMotorControlLab/CEBRA/pull/168
and
https://github.com/AdaptiveMotorControlLab/CEBRA/pull/225
which were developed in parallel.

@stes stes merged commit 5614d80 into AdaptiveMotorControlLab:main Jun 5, 2025
11 checks passed
@stes stes deleted the unified-cebra branch June 5, 2025 20:08
@stes stes mentioned this pull request Jun 5, 2025