Skip to content

Conversation

@Deng-m1
Copy link

@Deng-m1 Deng-m1 commented Dec 11, 2025

## Summary

Fix timestamp inconsistency that causes duplicate trace records in ClickHouse when the SDK runs in non-UTC timezones.

Closes https://github.com/langfuse/langfuse/issues/11026

## Problem

When a datetime object without timezone info (naive datetime) is passed to `serialize_datetime()`, 
it was previously interpreted using the **local timezone**. In UTC+8 environments, this caused:

1. **Duplicate trace records** - Same trace ID gets different `toDate(timestamp)` values
2. **ReplacingMergeTree merge failures** - ClickHouse can't merge records with different dates
3. **Inconsistent trace rendering** - Traces appear fragmented in the UI

### Example

id: 2e0e8e5f811849f5106dc38a584cce27
Row 1: timestamp=2025-12-10 (UTC), name=NULL
Row 2: timestamp=2025-12-11 (local), name=LangGraph ← Different date!


## Root Cause

In `langfuse/api/core/datetime_utils.py`:

```python
# Before (problematic)
local_tz = dt.datetime.now().astimezone().tzinfo
localized_dt = v.replace(tzinfo=local_tz)

Solution

Assume naive datetime is UTC, consistent with:

  • Langfuse infrastructure requirements (all timestamps should be UTC)
  • Internal _get_timestamp() which already uses datetime.now(timezone.utc)
# After (fixed)
utc_dt = v.replace(tzinfo=dt.timezone.utc)

Testing

Added 11 test cases in tests/test_datetime_utils.py:

  • ✅ UTC datetime serialization (ends with 'Z')
  • ✅ Non-UTC timezone handling (uses offset format)
  • ✅ Naive datetime assumed as UTC
  • ✅ Consistency with _get_timestamp()
  • ✅ Edge cases (midnight, end of day)
  • ✅ ISO 8601 compliance

All tests pass.

Checklist

  • Code follows project conventions
  • Tests added for the fix
  • All tests pass locally
  • Linked to issue #11026

---

<!-- ELLIPSIS_HIDDEN -->

----

> [!IMPORTANT]
> Fixes `serialize_datetime()` to assume naive datetimes as UTC, preventing duplicate trace records in ClickHouse.
> 
>   - **Behavior**:
>     - `serialize_datetime()` in `datetime_utils.py` now assumes naive datetimes are UTC, preventing duplicate trace records in ClickHouse.
>     - Aligns with Langfuse infrastructure requirements for UTC timestamps.
>   - **Testing**:
>     - Adds 11 test cases in `test_datetime_utils.py` to verify correct serialization of UTC, naive, and non-UTC datetimes.
>     - Tests include edge cases like midnight, end of day, and ISO 8601 compliance.
> 
> <sup>This description was created by </sup>[<img alt="Ellipsis" src="https://img.shields.io/badge/Ellipsis-blue?color=175173">](https://www.ellipsis.dev?ref=langfuse%2Flangfuse-python&utm_source=github&utm_medium=referral)<sup> for 02587ef3a9ed726d4960df1698c387240761bc56. You can [customize](https://app.ellipsis.dev/langfuse/settings/summaries) this summary. It will automatically update as commits are pushed.</sup>

<!-- ELLIPSIS_HIDDEN -->

<!-- greptile_comment -->

**Disclaimer**: Experimental PR review
---

<h2>Greptile Overview</h2>

<h3>Greptile Summary</h3>


Fixed timestamp inconsistency causing duplicate trace records in ClickHouse when the SDK runs in non-UTC timezones by changing `serialize_datetime()` to assume naive datetimes are UTC instead of local time.

- Changed naive datetime handling in `langfuse/api/core/datetime_utils.py` from local timezone to UTC assumption
- This aligns with Langfuse infrastructure requirements where all timestamps should be UTC
- Consistent with internal `_get_timestamp()` which uses `datetime.now(timezone.utc)`
- Added 11 comprehensive test cases covering UTC, non-UTC timezones, naive datetime handling, and edge cases
- **Note:** This is a behavioral change - naive datetimes will now serialize differently in non-UTC environments

<h3>Confidence Score: 4/5</h3>


- Safe to merge - targeted bug fix with comprehensive test coverage.
- The change is well-documented, addresses a real issue (duplicate trace records), and includes thorough test coverage. Minor deduction for the in-method import style issue in tests.
- No files require special attention.

<h3>Important Files Changed</h3>



File Analysis



| Filename | Score | Overview |
|----------|-------|----------|
| langfuse/api/core/datetime_utils.py | 5/5 | Changed naive datetime handling from local timezone to UTC assumption, fixing duplicate trace records in ClickHouse for non-UTC environments. |
| tests/test_datetime_utils.py | 4/5 | Added 11 comprehensive test cases for datetime serialization; contains one in-method import that should be moved to module top. |

</details>



<h3>Sequence Diagram</h3>

```mermaid
sequenceDiagram
    participant SDK as Langfuse SDK
    participant DT as serialize_datetime()
    participant CH as ClickHouse

    Note over SDK,CH: Before Fix (Non-UTC timezone)
    SDK->>DT: naive datetime (e.g., 2025-12-10 21:00)
    DT->>DT: Apply local timezone (+08:00)
    DT->>CH: 2025-12-11T05:00:00+08:00
    Note over CH: Stored as 2025-12-11

    Note over SDK,CH: After Fix (Any timezone)
    SDK->>DT: naive datetime (e.g., 2025-12-10 21:00)
    DT->>DT: Assume UTC
    DT->>CH: 2025-12-10T21:00:00Z
    Note over CH: Stored as 2025-12-10 (consistent)

When a datetime object without timezone info (naive datetime) is passed
to serialize_datetime(), it was previously interpreted using the local
timezone. This caused timestamp inconsistencies when the SDK runs in
non-UTC timezones, leading to:

1. Duplicate trace records in ClickHouse with different toDate(timestamp)
2. ReplacingMergeTree unable to merge records for the same trace ID
3. Incorrect trace rendering in the Langfuse UI

This fix changes the behavior to assume naive datetimes are UTC, which
is consistent with Langfuse's infrastructure requirements and the
internal _get_timestamp() function that already uses UTC.

The fix ensures all timestamp serialization uses UTC, preventing the
date boundary issues that occur when local time differs from UTC by
enough hours to cross midnight.

Includes comprehensive test coverage for:
- UTC datetime serialization
- Non-UTC timezone handling
- Naive datetime (now assumed UTC)
- Edge cases (midnight, end of day)
- ISO 8601 compliance
@CLAassistant
Copy link

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.


north-al seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 files reviewed, no comments

Edit Code Review Agent Settings | Greptile

@hassiebp hassiebp self-requested a review December 11, 2025 16:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants