Add dataset serialization to support distributed eval runs #1472
+213
−2
Important

Adds `items_to_dict()` in `DatasetClient` for dataset item serialization, with tests for distributed processing support:

- `items_to_dict()` method in `DatasetClient` to serialize dataset items into dictionaries for distributed processing.
- `test_items_to_dict_basic()` in `test_datasets.py` to verify the basic serialization structure (a sketch of this test follows the list).
- `test_items_to_dict_multiple_items()` to check handling of multiple items.
- `test_items_to_dict_reconstruct_item()` to ensure items can be reconstructed from dicts.
- `test_items_to_dict_with_run()` to verify reconstructed items work with the `run()` context manager.
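A rough sketch of what the basic test might look like; the dataset name and the asserted keys are assumptions for illustration, not the PR's actual test code:

```python
from langfuse import Langfuse


def test_items_to_dict_basic():
    langfuse = Langfuse()  # assumes credentials are set via environment variables
    dataset = langfuse.get_dataset("my-test-dataset")  # hypothetical dataset name

    item_dicts = dataset.items_to_dict()

    assert isinstance(item_dicts, list)
    assert len(item_dicts) == len(dataset.items)
    for item_dict in item_dicts:
        # Each entry should be a plain dict carrying the item's fields.
        assert isinstance(item_dict, dict)
        assert "id" in item_dict
        assert "input" in item_dict
```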
Disclaimer: Experimental PR review
Greptile Overview
Greptile Summary
This PR adds dataset serialization capabilities to the `langfuse-python` client to support distributed evaluation workflows. The main changes include a new `items_to_dict()` method in `DatasetClient` that converts dataset items into serializable dictionaries, and modifications to `DatasetItemClient` to store a reference to the original `DatasetItem` Pydantic model. This enables dataset items to be passed between distributed processing systems such as AWS Step Functions, Azure Functions, or Prefect tasks, where objects must be JSON-serialized and reconstructed in separate processes.
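As a rough illustration of the coordinator side of this workflow (the dataset name and the JSON handling are assumptions for illustration, not part of the PR):

```python
import json

from langfuse import Langfuse

langfuse = Langfuse()  # assumes credentials are set via environment variables
dataset = langfuse.get_dataset("my-eval-dataset")  # hypothetical dataset name

# Serialize every item into a plain dict so the list survives a JSON
# round-trip, e.g. as the input payload of a Step Functions state or a
# Prefect task. default=str covers datetime fields like created_at.
payload = json.dumps(dataset.items_to_dict(), default=str)
```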
The implementation stores the original `DatasetItem` model in a new `dataset_item` attribute on `DatasetItemClient` and provides a straightforward serialization method that calls `.dict()` on each item. The `DatasetClient` constructor also now accepts an optional `langfuse` parameter for more flexible client management in distributed scenarios. The change builds on the existing Pydantic models and maintains backward compatibility.
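Based on that description, the method is likely little more than a comprehension over the stored models; a minimal sketch of that shape (attribute names beyond `items` and `dataset_item` are assumptions, not the PR's verbatim code):

```python
from typing import Any, Dict, List


class DatasetClient:
    """Reduced sketch; the real class carries more state and methods."""

    def __init__(self, dataset, items, langfuse=None):
        self.items = items  # DatasetItemClient objects, each holding .dataset_item
        self._langfuse = langfuse  # optional client, per the new constructor parameter

    def items_to_dict(self) -> List[Dict[str, Any]]:
        # .dict() on the stored Pydantic model produces a plain dict of the
        # item's fields (datetimes still need handling when dumping to JSON).
        return [item.dataset_item.dict() for item in self.items]
```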
Important Files Changed

- Adds the `items_to_dict()` method and stores a `dataset_item` reference for serialization support.

Confidence score: 4/5
Sequence Diagram
```mermaid
sequenceDiagram
    participant User
    participant DatasetClient
    participant DatasetItem
    participant DistributedWorker
    participant DatasetItemClient
    participant LangfuseSpan
    participant LangfuseAPI
    User->>DatasetClient: "items_to_dict()"
    DatasetClient->>DatasetItem: "dict()"
    DatasetItem-->>DatasetClient: "serialized_dict"
    DatasetClient-->>User: "List[Dict[str, Any]]"
    User->>DistributedWorker: "send serialized items"
    DistributedWorker->>DatasetItem: "DatasetItem(**item_dict)"
    DatasetItem-->>DistributedWorker: "reconstructed_item"
    DistributedWorker->>DatasetItemClient: "DatasetItemClient(item, langfuse)"
    DatasetItemClient-->>DistributedWorker: "client_instance"
    DistributedWorker->>DatasetItemClient: "run(run_name=name)"
    DatasetItemClient->>LangfuseSpan: "start_as_current_span()"
    LangfuseSpan-->>DatasetItemClient: "span_context"
    DatasetItemClient->>LangfuseAPI: "create dataset_run_item"
    LangfuseAPI-->>DatasetItemClient: "run_item_created"
    DatasetItemClient-->>DistributedWorker: "yield span"
    DistributedWorker->>LangfuseSpan: "update_trace()"
    DistributedWorker->>LangfuseSpan: "score_trace()"
    DistributedWorker->>DatasetItemClient: "flush()"
    DatasetItemClient->>LangfuseAPI: "flush traces and scores"
```
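Read end-to-end, the diagram corresponds to roughly the following worker-side code. Import paths, the evaluation function, and the score are assumptions based on the diagram, not code taken from the PR:

```python
from langfuse import Langfuse
from langfuse._client.datasets import DatasetItemClient  # import path assumed
from langfuse.api.resources.commons.types.dataset_item import DatasetItem  # import path assumed


def my_eval_fn(question):
    """Placeholder for the application under evaluation."""
    return f"answer to {question}"


def process_item(item_dict: dict, run_name: str) -> None:
    langfuse = Langfuse()  # assumes credentials are set via environment variables

    # Rebuild the Pydantic model from the serialized dict, then wrap it in a
    # client so run() can link the resulting trace to the dataset run.
    item = DatasetItem(**item_dict)
    client = DatasetItemClient(item, langfuse)

    with client.run(run_name=run_name) as span:
        output = my_eval_fn(item.input)
        span.update_trace(input=item.input, output=output)
        span.score_trace(name="accuracy", value=1.0)  # example score

    client.flush()  # flushes buffered traces and scores, as in the diagram
```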
Context used:

- `dashboard` - Move imports to the top of the module instead of placing them within functions or methods. (source)