Contributor

Copilot AI commented Dec 12, 2025

Added Google Test for NumericLocale per @blowekamp and @dzenanz (commit 0fc849d):

Test coverage:

  • Verifies temporary setting to C locale
  • Tests float parsing with dot decimal separator
  • Tests nesting behavior
  • Tests with German locale (de_DE.UTF-8) if available
  • Tests sequential uses
  • Tests basic RAII cleanup

All 6 tests pass.

Original prompt

This section details the original issue you should resolve

<issue_title>Locale-Dependent Parsing in ITK NRRD Reader Causes Silent Metadata Corruption</issue_title>
<issue_description>### Description
ITK’s NRRD reader parses floating-point metadata (such as spacing, direction vectors, and other numeric header fields) using locale-dependent number parsing (strtod).
In numeric locales where the decimal separator is a comma (for example de_DE.UTF-8, common in many European countries), values containing a dot such as 0.878906 are parsed incorrectly. In such locales, strtod("0.878906") yields 0.0.

This problem leads to two kinds of failures:

  1. Silent metadata corruption (no error raised)
    Values greater than 1 with a fractional part (for example 3.5, 2.2) may be misparsed with the fractional part dropped (3.5 becomes 3, 2.2 becomes 2), without causing an error. This can silently corrupt spacing, orientation, or other critical metadata. The image loads and all downstream computations use incorrect metadata.

  2. Hard errors when spacing becomes 0
    When a fractional spacing less than 1 (for example 0.878906 or 0.8) is parsed as 0.0, ITK sometimes throws:
    Zero-valued spacing is not supported.
    This error exposes the bug, but only for particular values. For many other metadata fields and values the corruption is completely silent.

The same issue was already reported here, but was never resolved:
#3375

A similar issue was previously identified and fixed for VTK files:
#2297

Impact

This issue can silently corrupt metadata when reading NRRD files on systems with non-English numeric locales. This includes:

  • space directions
  • space origin
  • spacing
  • measurements encoded in metadata
  • values in DICOM-derived metadata fields stored in NRRD
  • any numeric field parsed through locale-dependent routines

This is particularly problematic in medical imaging, where spacing, orientation, and geometric metadata directly affect:

  • registration
  • segmentation
  • dose calculation
  • physical measurement interpretation
  • reconstruction algorithms

The most serious aspect is that metadata can be corrupted without any warning or error message. The bug was only discovered because in some cases spacing becomes exactly zero, triggering ITK’s "Zero-valued spacing is not supported" check. In many other cases (for example when only the fractional part is lost, or when values are truncated but remain positive) the corruption is completely silent and can remain undetected.

The issue is typically triggered only when the host application explicitly applies the system locale, which is common in GUI frameworks such as Qt. This is why the bug appears in some environments (for example napari or other Qt-based tools) while plain C++ programs often appear unaffected.

Root Cause

Many GUI frameworks, such as Qt, call:

setlocale(LC_ALL, "");

to apply the system locale. If the system uses a comma as the decimal separator (as is standard in many European countries), then functions like strtod accept only a comma as the decimal separator and stop parsing at the first dot.

Example:

  • In C locale: strtod("0.878906") → 0.878906
  • In de_DE.UTF-8 locale: strtod("0.878906") → 0.0

Thus, a valid NRRD header field such as:

space directions: (0.878906,0,0) (0,3,0) (0,0,3)

may be parsed by ITK as something like:

(0.0, 0, 0)
(0, 3, 0)
(0, 0, 3)

If the corrupted value results in spacing zero, ITK throws an error.
If the corrupted value remains positive (for example if only the fractional part is dropped), the metadata is accepted but wrong, and no error is raised.
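
The truncation described above can be illustrated with a minimal Python sketch. It does not call strtod. `parse_like_strtod` is a hypothetical helper that emulates only the longest-valid-prefix rule for plain decimal input, and it takes the decimal separator as an explicit argument instead of reading it from the process locale.

```python
import re


def parse_like_strtod(text: str, decimal_point: str = ".") -> float:
    """Emulate strtod's longest-valid-prefix rule (hypothetical helper).

    Only plain decimal numbers with an optional exponent are handled;
    hex floats, "inf" and "nan" are deliberately left out of this sketch.
    """
    sep = re.escape(decimal_point)
    pattern = rf"\s*([+-]?(?:\d+(?:{sep}\d*)?|{sep}\d+)(?:[eE][+-]?\d+)?)"
    match = re.match(pattern, text)
    if match is None:
        # strtod returns 0 when no conversion can be performed.
        return 0.0
    # Normalise the separator so float() can convert the accepted prefix.
    return float(match.group(1).replace(decimal_point, "."))


# Dot as separator (C locale behaviour): the whole token is consumed.
c_locale_value = parse_like_strtod("0.878906", ".")
# Comma as separator (de_DE-like behaviour): parsing stops at the dot.
comma_locale_small = parse_like_strtod("0.878906", ",")
comma_locale_large = parse_like_strtod("3.5", ",")
```

With a comma separator the first call keeps only the leading `0` and the second keeps only the leading `3`, which corresponds to the zero-spacing error case and the silent-corruption case respectively.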

Minimal Reproducible Examples (Python, self-contained)

The following examples assume that the de_DE.UTF-8 locale is installed on the system.
They demonstrate both the silent corruption case and the zero-spacing error case, using only Python and ITK. Each example:

  1. Creates a random 3D numpy array.
  2. Writes it to NRRD using ITK under a safe numeric locale (C).
  3. Switches the numeric locale to de_DE.UTF-8.
  4. Reads the same NRRD with ITK and inspects the spacing.

Depending on the actual parsing behavior, the spacing may be silently wrong or may cause an exception.

Note: depending on how the ITK Python wrappers format spacing, component order in printing may appear as (z, y, x) or (x, y, z), but the key observation is the incorrect handling of fractional parts under de_DE numeric locale.

1. Silent metadata corruption example (spacing > 1)

import os
import locale
import pathlib
import numpy as np
import itk

# Ensure we start in a safe numeric locale for writing
locale.setlocale(locale.LC_NUMERIC, "C")

# Create a small test image and write it to NRRD with fractional spacing...

</details>



<!-- START COPILOT CODING AGENT SUFFIX -->

- Fixes InsightSoftwareConsortium/ITK#5683


@github-actions github-actions bot added type:Infrastructure Infrastructure/ecosystem related changes, such as CMake or buildbots type:Testing Ensure that the purpose of a class is met/the results on a wide set of test cases are correct area:IO Issues affecting the IO module area:ThirdParty Issues affecting the ThirdParty module labels Dec 12, 2025
Copilot AI changed the title [WIP] Fix locale-dependent parsing in ITK NRRD reader BUG: Fix locale-dependent parsing in NRRD reader causing metadata corruption Dec 12, 2025
Copilot AI requested a review from thewtex December 12, 2025 18:05
@hjmjohnson
Member

hjmjohnson commented Dec 12, 2025

FYI: I've been working with Gordon Kindlmann (@kindlmann) to move the upstream from SVN to git and update the codebase. Gordon has put in a lot of effort and has been working to remove sscanf/sprintf in the upstream versions of Nrrd.

#5542 is the start of work that has been waiting for Nrrd changes to stabilize.

To get Teem: git clone http://git.code.sf.net/p/teem/teem.git teem-src
To get NrrdIO: git clone http://git.code.sf.net/p/teem/nrrdio/nrrdio.git nrrdio-src

@dzenanz
Member

dzenanz commented Dec 12, 2025

@copilot address my review feedback.

Member

@dzenanz dzenanz left a comment

This looks much better. Let's see whether this makes the test pass.

@github-actions github-actions bot added type:Bug Inconsistencies or issues which will cause an incorrect result under some or all circumstances and removed area:ThirdParty Issues affecting the ThirdParty module labels Dec 12, 2025
@thewtex
Member

thewtex commented Dec 15, 2025

@copilot rebase and re-organize the commits so:

  • There is one commit that adds the tests
  • There is another commit that makes the changes to itkNrrdImageIO.cxx

So the changes to itkNrrdImageIO.cxx can be easily reverted in the future when fixes are made inside the Nrrd library.

Use the standard "BUG: " prefix for ITK commit subject lines.

Contributor Author

Copilot AI commented Dec 15, 2025

@copilot rebase and re-organize the commits so:

  • There is one commit that adds the tests
    ...

Reorganized commits as requested in commits 7726c6d and 4b120ca:

Commit 1 (7726c6d): BUG: Add test for locale-dependent parsing in NRRD reader

  • Only test files (CMakeLists.txt + itkNrrdLocaleTest.cxx)

Commit 2 (4b120ca): BUG: Fix locale-dependent parsing in NRRD reader

  • Only the fix in itkNrrdImageIO.cxx

Both commits use "BUG: " prefix. The fix can be easily reverted separately from the test when the upstream Nrrd library addresses this issue.

Note: The commits diverged from remote due to reorganization. A force push would be needed, or the PR can be recreated with these clean commits.

@blowekamp
Member

Modifying the global locale state is not concurrent thread safe.

There is some discussion here: https://stackoverflow.com/questions/4057319/is-setlocale-thread-safe-function about alternatives that provide thread-local settings. It may be OS-specific, and not portable. Likely a new ITK utility class is needed. I believe setting the locale is unsafely done with GDCM too.

@thewtex
Member

thewtex commented Dec 16, 2025

Modifying the global locale state is not concurrent thread safe.

There is some discussion here: https://stackoverflow.com/questions/4057319/is-setlocale-thread-safe-function about alternatives that provide thread-local settings. It may be OS-specific, and not portable. Likely a new ITK utility class is needed. I believe setting the locale is unsafely done with GDCM too.

@copilot attempt to implement an itk class that sets and restores the locale in a thread-safe way.

Contributor Author

Copilot AI commented Dec 16, 2025

Modifying the global locale state is not concurrent thread safe.

There is some discussion here: https://stackoverflow.com/questions/4057319/is-setlocale-thread-safe-function about alternative...

Implemented thread-safe locale handling in commit c9b1302.

Created itk::NumericLocale utility class in ITKCommon with platform-specific implementations:

  • POSIX systems (Linux, macOS, BSD): Uses newlocale()/uselocale() for thread-local locale
  • Windows: Uses _configthreadlocale() for thread-specific locale
  • Fallback: Mutex-protected global setlocale() for other platforms

The NRRD reader now uses this thread-safe utility instead of the local implementation. The class is designed to be reusable by other I/O modules (GDCM, VTK, etc.) that need locale-independent number parsing.
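
For readers less familiar with the scoped save/restore idea, here is a minimal Python sketch of the mutex-protected fallback variant only. It is not the itk::NumericLocale implementation, and the names `scoped_c_numeric_locale` and `_LOCALE_LOCK` are hypothetical. It assumes every thread that touches the numeric locale goes through this helper; code that calls setlocale directly is not protected.

```python
import contextlib
import locale
import threading

# Re-entrant so that nested scopes in the same thread do not deadlock.
_LOCALE_LOCK = threading.RLock()


@contextlib.contextmanager
def scoped_c_numeric_locale():
    """Switch LC_NUMERIC to "C" for the duration of a with-block."""
    with _LOCALE_LOCK:
        # Calling setlocale with no locale argument queries the current setting.
        saved = locale.setlocale(locale.LC_NUMERIC)
        locale.setlocale(locale.LC_NUMERIC, "C")
        try:
            yield
        finally:
            # Restore the previous setting even if the body raised.
            locale.setlocale(locale.LC_NUMERIC, saved)


# Usage: locale-sensitive parsing would go inside the block.
with scoped_c_numeric_locale():
    inside = locale.setlocale(locale.LC_NUMERIC)
```

The lock serializes all users of the helper, which is the main cost of this approach compared with a per-thread locale such as the one newlocale()/uselocale() provides on POSIX.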

@kindlmann
Contributor

kindlmann commented Dec 17, 2025

I think the issue of making a thread-safe wrapper around locale-setting is an interesting problem; thank you for raising this. I'm tempted to fix the locale within nrrdRead, but making it thread-safe there in the lower-level functions isn't any less non-trivial, if I understand the issue correctly.

Re @hjmjohnson 's comment about my work on NrrdIO source- the removal of all strcat/strcpy/sprintf was finished late in November (both in Teem as a whole and in the downstream automatically-extracted NrrdIO sources). Ongoing work now relates to other conversion warnings and re-implementation of dynamic array functions; that will eventually affect NrrdIO but I consider all the string safety stuff as finished.

Contributor Author

Copilot AI commented Dec 22, 2025

@copilot apply @blowekamp's suggestions

Applied all @blowekamp suggestions in commit 24262e2:

  • pImpl pattern: Removed all platform-specific includes from header (windows.h, xlocale.h), using forward-declared Impl struct with std::unique_ptr
  • CMake feature detection: Added try_compile tests for newlocale/uselocale and _configthreadlocale, with ITK_HAS_NEWLOCALE and ITK_HAS_CONFIGTHREADLOCALE defines in itkConfigurePrivate.h
  • Warn-only fallback: When no thread-safe method is available, only issues itkWarningMacro if the locale is not "C"; it neither modifies the locale nor uses a mutex
  • CMake warning: Warns at configuration time if neither method detected

Successfully tested on Linux (detected ITK_HAS_NEWLOCALE) with both C and de_DE.UTF-8 locales.
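
The warn-only fallback can be sketched in Python as follows. This only illustrates the policy (inspect, warn, never modify) and is not the ITK code. The function name `warn_if_numeric_locale_not_c` and its optional `current` parameter are hypothetical, and treating "POSIX" as equivalent to "C" is an assumption of this sketch.

```python
import locale
import warnings
from typing import Optional


def warn_if_numeric_locale_not_c(current: Optional[str] = None) -> bool:
    """Warn, without changing anything, when LC_NUMERIC is not "C".

    Returns True if a warning was issued. `current` can be passed
    explicitly, which makes the policy easy to exercise in tests.
    """
    if current is None:
        # Query only; the locale is never modified here.
        current = locale.setlocale(locale.LC_NUMERIC)
    if current in ("C", "POSIX"):
        return False
    warnings.warn(
        f"LC_NUMERIC is {current!r}; numeric header fields may be misparsed",
        RuntimeWarning,
        stacklevel=2,
    )
    return True
```

The trade-off is that the caller learns about the risk but the parse itself is still performed under the unsafe locale.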

@dzenanz
Member

dzenanz commented Dec 23, 2025

The way copilot implemented this, makes PIMP pointless. Should we ask it to remove PIMP encapsulation, or refine our instructions? Otherwise, this seems to work.

@blowekamp
Member

The way copilot implemented this, makes PIMP pointless. Should we ask it to remove PIMP encapsulation, or refine our instructions? Otherwise, this seems to work.

It does not look pointless to me. The header file does not include any system implementation details, and the definitions resulting from try_compile are in a private header. It's not a terribly critical usage of PIMP, but it looks to me like it encapsulates the details. Am I missing something?

@dzenanz
Member

dzenanz commented Dec 23, 2025

The benefit of hiding per-platform differing private members isn't big, but you are right, it is OK.

@dzenanz
Member

dzenanz commented Dec 23, 2025

Time to squash and fix whitespace? And do another round of reviews?

@blowekamp
Member

I'd like to see a Google Test for the new class in the Common module added.

@dzenanz
Member

dzenanz commented Dec 23, 2025

@copilot add a Google Test for the new class in the Common module.

@dzenanz
Member

dzenanz commented Dec 23, 2025

I'd like to see a Google Test for the new class in the Common module added.

I believe you could have directly instructed copilot to do this 😄

Contributor Author

Copilot AI commented Dec 23, 2025

@copilot add a Google Test for the new class in the Common module.

Added Google Test for NumericLocale class in commit 0fc849d.

The test suite includes 6 tests:

  • TemporarilySetsToCLocale - Verifies locale is set to "C" and restored
  • ParsesFloatsWithDotDecimalSeparator - Tests parsing of "3.14159", "0.878906", "2.5"
  • SupportsNesting - Tests nested NumericLocale instances
  • WorksWithDifferentInitialLocale - Tests with de_DE.UTF-8 locale (skips if unavailable)
  • SupportsSequentialUses - Tests multiple sequential uses
  • BasicRAII - Tests RAII cleanup behavior

All tests pass successfully (6/6 passed when de_DE.UTF-8 is available, 5/6 passed with 1 skipped when unavailable).
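
The "skips if unavailable" behaviour depends on probing whether a locale is installed. A minimal Python sketch of such a probe is shown below. `numeric_locale_available` is a hypothetical helper, not part of the Google Test suite, and it restores the previous setting whether or not the probe succeeds.

```python
import locale


def numeric_locale_available(name: str) -> bool:
    """Return True if `name` can be set as LC_NUMERIC on this system."""
    saved = locale.setlocale(locale.LC_NUMERIC)
    try:
        locale.setlocale(locale.LC_NUMERIC, name)
    except locale.Error:
        # Raised when the requested locale is not installed or not recognised.
        return False
    finally:
        # Always put the original setting back.
        locale.setlocale(locale.LC_NUMERIC, saved)
    return True


# A test for the German locale would be skipped when this is False.
german_available = numeric_locale_available("de_DE.UTF-8")
```

Whether `german_available` is true depends on the machine, which is why the suite reports either 6 passed or 5 passed with 1 skipped.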

@blowekamp
Member

I'd like to see a Google Test for the new class in the Common module added.

I believe you could have directly instructed copilot to do this 😄

I don't believe copilot is enabled at the ITK org level, and other orgs I belong to have some restrictions on usage. When I tried, it did not listen to me.

@blowekamp
Member

@hjmjohnson Any thoughts on using the DoubleConvert library that ITK uses in NRRD to fully address this locale issue in NRRD?

@hjmjohnson hjmjohnson force-pushed the copilot/fix-nrrd-reader-locale-issue branch from 0fc849d to ffd0a89 Compare January 23, 2026 16:43
@hjmjohnson
Member

@hjmjohnson Any thoughts on using the DoubleConvert library that ITK uses in NRRD to fully address this locale issue in NRRD?

@blowekamp If you mean in the upstream NrrdIO, then that is not possible because NrrdIO upstream is a C-only library that tries to be C99-compatible.

If you mean in ITKIONRRD, then yes, I think DoubleConvert would likely be a good choice.
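
As an illustration of what locale-independent conversion buys, here is a small Python sketch that parses the "space directions" value quoted in the issue. It is not the NrrdIO or DoubleConvert code. `parse_space_directions` is a hypothetical helper that relies on Python's built-in float(), which does not consult LC_NUMERIC. NRRD also permits `none` in place of a vector, which this sketch does not handle.

```python
import re


def parse_space_directions(value: str) -> list:
    """Parse "(a,b,c) (d,e,f) ..." into a list of float tuples."""
    vectors = []
    # Each parenthesised group is one direction vector.
    for group in re.findall(r"\(([^()]*)\)", value):
        # float() always expects a dot, regardless of the process locale.
        vectors.append(tuple(float(component) for component in group.split(",")))
    return vectors


directions = parse_space_directions("(0.878906,0,0) (0,3,0) (0,0,3)")
```

Because the conversion never reads the process locale, no scoped locale change or locking is needed around it.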

@hjmjohnson hjmjohnson force-pushed the copilot/fix-nrrd-reader-locale-issue branch from ffd0a89 to 9c1b484 Compare January 23, 2026 17:18
Copilot AI and others added 2 commits January 23, 2026 11:22
Avoid locale-dependent parsing of floating-point metadata re: #5683, #3375, #2297

Implement locale handling in
Modules/IO/NRRD/src/itkNrrdImageIO.cxx using RAII pattern.

- Add ScopedCNumericLocale class for automatic locale save/restore
- Add thread-safe NumericLocale utility class.  This addresses
  thread-safety concerns raised in code review. Thread safety implementation:
   - Windows: Uses _configthreadlocale() for thread-local locale
   - POSIX (Linux/macOS): Uses newlocale()/uselocale() for thread-local locale
   - Fallback: Mutex-protected global locale as last resort
- Apply locale protection around nrrdLoad/nrrdSave calls
- Add test coverage (itkNrrdLocaleTest.cxx)

Create thread-safe locale handling in ITKCommon to replace non-thread-safe
setlocale() calls. Update NRRD reader to use the new utility.

Co-authored-by: Matt McCormick <matt@fideus.io>
Co-authored-by: Bradley Lowekamp <blowekamp@mail.nih.gov>

Add comprehensive Google Test suite for itk::NumericLocale:
- Test temporary setting to C locale
- Test floating-point parsing with dot decimal separator
- Test nesting behavior
- Test with different initial locale (de_DE.UTF-8)
- Test sequential uses
- Test basic RAII behavior
Member

@blowekamp blowekamp left a comment

LGTM. I think this is the best option if there is no fix in the library.

Note: In GDCM there are some setlocale operations that may benefit from this same class.

Copilot AI left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

@hjmjohnson hjmjohnson dismissed dzenanz’s stale review January 23, 2026 23:04

I think your requested changes are incorporated.

@hjmjohnson hjmjohnson merged commit 3fe1f86 into main Jan 23, 2026
23 checks passed
@hjmjohnson hjmjohnson deleted the copilot/fix-nrrd-reader-locale-issue branch January 23, 2026 23:05

Labels

area:Core Issues affecting the Core module area:IO Issues affecting the IO module type:Bug Inconsistencies or issues which will cause an incorrect result under some or all circumstances type:Infrastructure Infrastructure/ecosystem related changes, such as CMake or buildbots type:Testing Ensure that the purpose of a class is met/the results on a wide set of test cases are correct


6 participants