
Conversation


@CeliaBenquet CeliaBenquet commented May 1, 2025

This PR adds a PyTorch implementation of a unified CEBRA encoder, which is composed of:

  • A new sampling scheme that samples across all sessions so that they can be aligned on the neuron axis to train a single encoder.
  • A unified Dataset and Loader, adapted to the new sampling scheme.
  • A unified Solver that handles alignment across multiple sessions at inference.
  • A new masked modeling training option, with different types of masking.
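To illustrate the masked modeling training option, here is a rough sketch of one masking type (random neuron masking). Names and signatures here are hypothetical and do not reflect the actual CEBRA API; the real implementation supports several masking schemes.

```python
import torch

def apply_neuron_mask(batch: torch.Tensor, mask_ratio: float = 0.2,
                      generator=None) -> torch.Tensor:
    """Zero out a random subset of neuron entries in a batch.

    batch: (batch_size, num_neurons) tensor of neural activity.
    Returns a masked copy; the original batch is left untouched.
    """
    # For each entry, decide whether to keep it (True) or mask it (False).
    keep = torch.rand(batch.shape, generator=generator) >= mask_ratio
    # Multiplying by the boolean mask zeros out the masked entries.
    return batch * keep

# Example: mask roughly half the entries of a batch of ones.
batch = torch.ones(4, 10)
masked = apply_neuron_mask(batch, mask_ratio=0.5)
```

The encoder is then trained on the masked input, which encourages representations that are robust to missing neurons.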

🚧 A preprint is pending "Unified CEBRA Encoders for Integrating Neural Recordings via Behavioral Alignment" by Célia Benquet, Hossein Mirzaei, Steffen Schneider, Mackenzie W. Mathis.

💻 A DEMO Notebook is available at: https://cebra.ai/docs/demos.html



@MMathisLab MMathisLab left a comment


Thanks @CeliaBenquet! I went through and left comments for discussion.


@stes stes left a comment


Looks good overall; left some comments!

  • Implementation of the Mixin class for the masking: if I understood correctly, the only change is that this apply_mask function is applied after loading a batch. This could be applied minimally invasively in the data loader rather than in the dataset. Is there a good case why the datasets themselves need to be modified?
  • Discussion on where to place the decoders: they currently live in cebra.models.decoders. Are the decoders useful as "standalone" models? Where are they currently used? Based on that, we could determine whether to move them, e.g., as standalone modules to integrations.
  • See other comments; mostly on class design, removing duplicated code, etc.
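A minimally invasive version of the loader-side masking suggested above could look like the following sketch. The names here (MaskedLoader, apply_mask) are hypothetical, not the CEBRA API; the point is that the wrapper masks each batch after loading, so the dataset classes stay untouched.

```python
import torch

class MaskedLoader:
    """Hypothetical sketch: wrap an existing loader and apply masking to
    each batch as it is yielded, leaving the underlying dataset unchanged."""

    def __init__(self, loader, apply_mask):
        self.loader = loader          # any iterable yielding batch tensors
        self.apply_mask = apply_mask  # callable: batch -> masked batch

    def __iter__(self):
        for batch in self.loader:
            yield self.apply_mask(batch)

# Usage with a trivial base loader and a zero-out-half mask:
base = [torch.ones(2, 6) for _ in range(3)]
mask = lambda b: b * (torch.rand_like(b) >= 0.5)
masked_batches = list(MaskedLoader(base, mask))
```

With this design, swapping masking strategies only means passing a different callable; no dataset subclass needs to know about masking at all.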


@MMathisLab MMathisLab left a comment


LGTM! Just the one comment on kwargs seems critical to decide.

@MMathisLab MMathisLab requested a review from stes May 28, 2025 23:02
@MMathisLab MMathisLab changed the title Add unified encoder pytorch implementation Add unified CEBRA encoder: pytorch implementation Jun 5, 2025

@stes stes left a comment


Remaining issues are:

  • more tests (e.g. for changes to model, adaptations of sklearn API, ...)
  • improved class design for masking
  • improved class design for solver hierarchy

However, these can be addressed in follow-up contributions, so this can be merged now. Attempt to clean up the commit message for squashing:

* start tests

* remove print statements

* first passing test

* move functionality to base file in solver and separate in functions

* add test_select_model for multisession

* remove float16

* Improve modularity remove duplicate code and todos

* Add tests to solver

* Fix save/load

* Fix extra docs errors

* Add review updates

* apply ruff auto-fixes

* fix linting errors

* Run isort, ruff, yapf

* Fix gaussian mixture dataset import

* Fix all tests but xcebra tests

* Fix pytorch API usage example

* Make xCEBRA compatible with the batched inference & padding in solver

* Add some tests on transform() with xCEBRA

* Add some docstrings and typings and clean unnecessary changes

* Implement review comments

* Fix sklearn test

* Initial pass at integrating unifiedCEBRA

* Add name in NOTE

* Implement reviews on tests and typing

* Fix import errors

* Add select_model to aux solvers

* Fix tests

* Add mask tests

* Fix docs error

* Remove masking init()

* Remove shuffled neurons in unified dataset

* Remove extra datasets

* Add tests on the private functions in base solver

* Update tests and duplicate code based on review

* Fix quantized_embedding_norm undefined when `normalize=False` (#249)

* Fix tests

* Adapt unified code to get_model method

* Update mask.py

add headers to new files

* Update masking.py

- header

* Update test_data_masking.py

- header

* Implement review comments and fix typos

* Fix docs errors

* Remove np.int typing error

* Fix docstring warning

* Fix indentation docstrings

* Implement review comments

* Fix circular import and abstract method

* Add maskedmixin to __all__

* Implement extra review comments

* Change masking kwargs as tuple and not dict in sklearn impl

* Add integrations/decoders.py

* Fix typo

* minor simplification in solver

---------

Note, some comments in this PR overlap with
https://github.com/AdaptiveMotorControlLab/CEBRA/pull/168
and
https://github.com/AdaptiveMotorControlLab/CEBRA/pull/225
which were developed in parallel.

@stes stes merged commit 5614d80 into AdaptiveMotorControlLab:main Jun 5, 2025
11 checks passed
@stes stes deleted the unified-cebra branch June 5, 2025 20:08
@stes stes mentioned this pull request Jun 5, 2025