@NiklasAbraham
Contributor

Added small fixes to embedding, mutation detection, and standard numbering.

@NiklasAbraham NiklasAbraham requested a review from Copilot July 5, 2025 09:41
Contributor

Copilot AI left a comment

Pull Request Overview

This PR introduces a normalize flag throughout embedding calculation routines, standardizes Neo4j queries to use elementId instead of id, and updates several dependencies and logging patterns.

  • Adds normalize: bool = True parameter to all embedding methods and propagates it through the processor and model implementations
  • Replaces id(r) with elementId(r) in standard numbering and mutation detection queries
  • Improves logging in sequence alignment and ontology loading, and updates project dependencies
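
As a rough illustration of the first bullet, the sketch below shows how a normalize flag with a default of True could gate a final scaling step. It assumes that normalize means L2 (unit-length) normalization, which the summary does not spell out; the helper names l2_normalize and finalize_embedding are hypothetical and not taken from pyeed, and plain Python lists stand in for the tensors or arrays the real models return.

```python
import math
from typing import List, Sequence


def l2_normalize(vector: Sequence[float], eps: float = 1e-12) -> List[float]:
    """Scale a vector to unit Euclidean length; eps guards against division by zero."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / max(norm, eps) for x in vector]


def finalize_embedding(raw: Sequence[float], normalize: bool = True) -> List[float]:
    """Hypothetical last step of an embedding method (assumption: normalize = L2)."""
    return l2_normalize(raw) if normalize else list(raw)


raw_embedding = [3.0, 4.0]  # made-up values, not real model output
unit = finalize_embedding(raw_embedding)                         # length 1
unchanged = finalize_embedding(raw_embedding, normalize=False)   # raw values
```

Keeping the default at True means callers that do not pass the flag receive normalized vectors, while normalize=False remains available when the raw magnitudes matter.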

Reviewed Changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| src/pyeed/embeddings/processor.py | Added normalize parameter to public/legacy embedding APIs |
| src/pyeed/embeddings/models/prott5.py | Propagated normalize and pooling changes for ProtT5 |
| src/pyeed/embeddings/models/esmc.py | Added normalize flag to ESMC batch/single embedding |
| src/pyeed/embeddings/models/esm3.py | Added normalize flag to ESM3 batch/single embedding |
| src/pyeed/embeddings/models/esm2.py | Added normalize flag to ESM2 batch/single embedding |
| src/pyeed/embeddings/base.py | Updated abstract methods to include normalize parameter |
| src/pyeed/analysis/standard_numbering.py | Switched to elementId and refined logging |
| src/pyeed/analysis/sequence_alignment.py | Replaced print with logger, fixed query patterns |
| src/pyeed/analysis/ontology_loading.py | Improved OWL restriction handling and relationship logic |
| src/pyeed/analysis/mutation_detection.py | Updated region_ids_neo4j type and elementId queries |
| pyproject.toml | Removed old numpy constraint; replaced umap with umap-learn |
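
The switch from id(r) to elementId(r) listed above can be sketched as a simple query rewrite. In Neo4j 5, id() is deprecated in favour of elementId(), which returns a string rather than an integer, so any parameter holding those identifiers has to carry strings. The query text, the labels and relationship type, and the helper use_element_id below are hypothetical and only illustrate the pattern; they are not the queries from this PR.

```python
import re

# Hypothetical Cypher query; labels, relationship type, and parameter name
# are made up for illustration.
LEGACY_QUERY = (
    "MATCH (:Protein)-[r:HAS_REGION]->(:Region) "
    "WHERE id(r) IN $region_ids RETURN id(r) AS rid"
)


def use_element_id(query: str) -> str:
    """Rewrite calls to id(...) as elementId(...); existing elementId calls are left alone."""
    return re.sub(r"\bid\(", "elementId(", query)


UPDATED_QUERY = use_element_id(LEGACY_QUERY)

# elementId() yields strings, so the bound parameter is a list of str
# (placeholder values, not real element IDs).
params = {"region_ids": ["<element-id-1>", "<element-id-2>"]}
```
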
Comments suppressed due to low confidence (1)

src/pyeed/embeddings/models/prott5.py:145

  • The variable attention_mask is not defined in this scope. You need to obtain it from the model outputs (e.g., outputs.attention_mask) or pass it into the method.
        seq_len = attention_mask.cpu().numpy().sum()
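
One way to address this comment is to pass the mask into the pooling step explicitly. The sketch below is a minimal mask-aware mean pooling under that assumption: mean_pool is a hypothetical helper, not the prott5.py implementation, and nested Python lists stand in for the tensors. With Hugging Face Transformers, the attention mask is normally returned by the tokenizer alongside the input IDs.

```python
from typing import List, Sequence


def mean_pool(
    token_embeddings: Sequence[Sequence[float]],
    attention_mask: Sequence[int],
) -> List[float]:
    """Average token vectors over positions where the mask is 1 (padding is 0)."""
    seq_len = sum(attention_mask)  # number of real (non-padding) tokens
    if seq_len == 0:
        raise ValueError("attention_mask selects no tokens")
    dim = len(token_embeddings[0])
    pooled = [0.0] * dim
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            for i, value in enumerate(vector):
                pooled[i] += value
    return [value / seq_len for value in pooled]


# Made-up data: two real tokens followed by one padding position.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)
```

Because the mask is an argument, the padding position is ignored and seq_len is always defined inside the function.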

NiklasAbraham and others added 5 commits July 5, 2025 11:46
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@haeussma haeussma merged commit c5441e6 into main Sep 3, 2025
6 of 7 checks passed
3 participants