This is the starter kit for the problem statement for interface.ai's AI/LLM role.
Your team is building a hybrid retrieval system that combines a traditional full-text search engine (e.g., BM25 or Postgres FTS) with a vector search engine (e.g., cosine similarity on embeddings).
Each engine returns hits independently, and you must merge them into a single ranked list of entity IDs.
You have two independent retrieval outputs:
- Full-text hits: entity_id: str, bm25_score: float (higher = better)
- Vector hits: entity_id: str, cosine_similarity: float ∈ [0, 1]
You are also given a fusion weight: alpha: float ∈ [0, 1] (higher = more weight on full-text).
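For concreteness, the two hit lists and the weight can be pictured as plain Python values; the alias names below are illustrative, not part of any provided API:

```python
from typing import List, Tuple

# Illustrative aliases only; the starter kit does not prescribe these names.
FullTextHit = Tuple[str, float]  # (entity_id, bm25_score), higher = better
VectorHit = Tuple[str, float]    # (entity_id, cosine_similarity), in [0, 1]

full_text_hits: List[FullTextHit] = [("apple_inc", 10.0), ("fruit_apple", 4.0)]
vector_hits: List[VectorHit] = [("apple_inc", 0.3), ("fruit_apple", 0.9)]
alpha: float = 0.7  # in [0, 1]; higher = more weight on full-text
```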
Formula:
final_score = alpha * normalized_bm25 + (1 - alpha) * normalized_cosine
Rules:
- If an entity appears in only one list, treat its missing score as 0 during fusion.
- If it appears in both lists, combine the two normalized scores using the formula above.
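A minimal sketch of the fusion step follows. The problem statement does not specify how to normalize the raw scores, so min-max normalization per score list is an assumption here; the fuse name and signature are likewise illustrative, not prescribed by the kit.

```python
from typing import Dict, List, Tuple


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    # Assumed normalization: scale each score list to [0, 1].
    # If every score is identical, map them all to 1.0 to avoid dividing by zero.
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {eid: 1.0 for eid in scores}
    return {eid: (s - lo) / (hi - lo) for eid, s in scores.items()}


def fuse(
    alpha: float,
    full_text_hits: List[Tuple[str, float]],
    vector_hits: List[Tuple[str, float]],
) -> List[Tuple[str, float]]:
    bm25 = min_max_normalize(dict(full_text_hits))
    cosine = min_max_normalize(dict(vector_hits))
    # Union of entity IDs; a missing score counts as 0 per the rules above.
    fused = {
        eid: alpha * bm25.get(eid, 0.0) + (1 - alpha) * cosine.get(eid, 0.0)
        for eid in set(bm25) | set(cosine)
    }
    # Single ranked list, best score first.
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)
```

Min-max puts both signals on the same [0, 1] scale before weighting; other normalization schemes would slot into the same alpha-weighted formula.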
Example invocation:

```bash
./run.sh "{\"alpha\": 0.7, \"full_text_hits\": [[\"apple_inc\", 10.0], [\"fruit_apple\", 4.0], [\"orange_fruit\", 1.0]], \"vector_hits\": [[\"apple_inc\", 0.3], [\"fruit_apple\", 0.9], [\"pear_fruit\", 0.8]]}"
```
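Assuming run.sh forwards its single JSON argument to a Python entrypoint (that wiring is an assumption about the starter kit, not something it guarantees), parsing the payload could look like this:

```python
import json
import sys

# Hypothetical entrypoint: run.sh is assumed to pass its one argument straight through.
payload = json.loads(sys.argv[1])

alpha = payload["alpha"]                                            # e.g. 0.7
full_text_hits = [tuple(hit) for hit in payload["full_text_hits"]]  # [("apple_inc", 10.0), ...]
vector_hits = [tuple(hit) for hit in payload["vector_hits"]]        # [("apple_inc", 0.3), ...]

# These values feed the fuse() sketch above to produce the final ranked list of entity IDs.
```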