Is your feature request related to a problem?
- Yes, it is related to a problem
Describe the feature you'd like
🌟 Feature Description
Extend the existing pgvector-based RAG infrastructure from meeting transcriptions to tasks and tickets. Replace full-dataset context injection with semantic vector similarity search using cosine distance, retrieving only the top-k relevant entities per query.
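Concretely, the retrieval this proposes is pgvector's cosine-distance operator plus a `LIMIT`. A minimal sketch, assuming the `description_embedding` column proposed below, with `$1` standing in for the query embedding:

```sql
-- Top-5 tasks nearest to the query embedding; <=> is pgvector's
-- cosine-distance operator (smaller distance = more similar).
SELECT id, title, description
FROM tasks
ORDER BY description_embedding <=> $1
LIMIT 5;
```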
🔍 Problem Statement
The current implementation serializes entire task and ticket collections into LLM prompts regardless of query relevance: the AI service stringifies all of a user's tasks and tickets into context before every inference call, resulting in:
- Average prompt size: 2500 tokens (80% from task/ticket serialization)
- Gemini API latency: 3-4 seconds due to large context processing
- Context window overflow at approximately 500 entities
- No semantic filtering: irrelevant entities pollute the prompt context
Meeting entities already use vector embeddings stored in a `summary_embedding vector(768)` column, with vector indexing and RPC-based similarity search.
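For reference, the meeting-side pattern this would mirror presumably looks something like the following. This is a reconstruction from the description above, not the actual code; the function name, signature, and index choice are assumptions:

```sql
-- Assumed shape of the existing meeting similarity search (names hypothetical).
CREATE INDEX ON meetings USING ivfflat (summary_embedding vector_cosine_ops);

CREATE OR REPLACE FUNCTION match_meetings(
  query_embedding vector(768),
  match_count int DEFAULT 5
)
RETURNS SETOF meetings
LANGUAGE sql STABLE AS $$
  SELECT *
  FROM meetings
  ORDER BY summary_embedding <=> query_embedding
  LIMIT match_count;
$$;
```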
🎯 Expected Outcome
- Augment `tasks` and `tickets` schemas with `description_embedding vector(768)` columns (migration sketched after this list)
- Implement an automatic embedding generation pipeline via Supabase Edge Functions using an embedding model (Edge Function sketch below)
- Refactor the AI service to perform query embedding generation and vector-space retrieval before LLM inference (retrieval sketch below)
- Token budget reduction: 2500 → 300 per request (88% decrease)
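The first bullet amounts to a small migration mirroring the meetings setup (a sketch; `tickets` would get the same treatment):

```sql
ALTER TABLE tasks ADD COLUMN description_embedding vector(768);
CREATE INDEX ON tasks USING ivfflat (description_embedding vector_cosine_ops);
```

For the second bullet, an Edge Function sketch. The issue does not name the embedding model, so this assumes Gemini's `text-embedding-004`, which returns 768-dimensional vectors matching `vector(768)`; the function name, webhook payload shape, and env var names are all illustrative:

```ts
// supabase/functions/embed-task/index.ts (illustrative)
import { createClient } from "npm:@supabase/supabase-js@2";
import { GoogleGenerativeAI } from "npm:@google/generative-ai";

const supabase = createClient(
  Deno.env.get("SUPABASE_URL")!,
  Deno.env.get("SUPABASE_SERVICE_ROLE_KEY")!,
);
const genAI = new GoogleGenerativeAI(Deno.env.get("GEMINI_API_KEY")!);
const embedder = genAI.getGenerativeModel({ model: "text-embedding-004" });

Deno.serve(async (req) => {
  // Database webhook payload on task insert/update: { record: { id, description, ... } }
  const { record } = await req.json();

  // Generate a 768-dim embedding for the task description.
  const { embedding } = await embedder.embedContent(record.description);

  // Persist it on the row so the similarity-search RPC can find it.
  const { error } = await supabase
    .from("tasks")
    .update({ description_embedding: embedding.values })
    .eq("id", record.id);

  return new Response(JSON.stringify({ ok: !error }), {
    headers: { "Content-Type": "application/json" },
  });
});
```

And for the third bullet, the AI-service retrieval step might look roughly like this, where `match_tasks` is a hypothetical RPC mirroring the meeting one sketched earlier and `embedder`/`supabase` are the clients from the previous snippet:

```ts
// Retrieve only the top-k relevant tasks instead of serializing everything.
async function buildTaskContext(query: string, k = 5): Promise<string> {
  // Embed the user query into the same 768-dim vector space as the rows.
  const { embedding } = await embedder.embedContent(query);

  // Cosine-distance top-k via the (hypothetical) match_tasks RPC.
  const { data: tasks, error } = await supabase.rpc("match_tasks", {
    query_embedding: embedding.values,
    match_count: k,
  });
  if (error) throw error;

  // A few hundred tokens of relevant context instead of the full dataset.
  return tasks
    .map((t: { title: string; description: string }) => `- ${t.title}: ${t.description}`)
    .join("\n");
}
```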
📷 Screenshots and Design Ideas
📋 Additional Context
Record
- I agree to follow this project's Code of Conduct
- I want to work on implementing this feature