Fix | SqlVector: Explicitly perform little-endian multibyte writes #3861


Open

edwardneal wants to merge 5 commits into dotnet:main from edwardneal:fix/issue-3790

+62 −19

Contributor

edwardneal commented Dec 24, 2025

Description

This deals with a minor point which came out of the original implementation PR: when we build (or read) the byte array representing a vector({size}, float32), we now explicitly do so using BinaryPrimitives' little-endian methods.

There are two slightly unusual points:

netfx doesn't have the BinaryPrimitives.WriteSingleLittleEndian method. Instead, we fall back to the existing BitConverterCompatible.SingleToInt32Bits on this target.
When building the byte array, we perform two casts: (float)(object)valueSpan[i]. This is another variation of the same pattern used elsewhere of (T)(object)item, and the same pattern holds: the JIT sees that valueSpan is actually a ReadOnlySpan<float> and eliminates the redundant cast. An example of this on Sharplab is here.
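The double-cast pattern described above can be sketched as follows. This is an illustrative helper, not the code from the PR: the class name `VectorWriteSketch` and the method `WriteFloatsLittleEndian` are hypothetical, and the sketch assumes a modern .NET target where `BinaryPrimitives.WriteSingleLittleEndian` is available.

```csharp
using System;
using System.Buffers.Binary;

public static class VectorWriteSketch
{
    // Hypothetical helper (not from the PR): writes each element of a generic span
    // as 4 little-endian bytes, starting at `offset` in `destination`.
    public static void WriteFloatsLittleEndian<T>(ReadOnlySpan<T> values, Span<byte> destination, int offset)
    {
        if (typeof(T) != typeof(float))
        {
            throw new NotSupportedException("This sketch only handles float32 elements.");
        }

        for (int i = 0; i < values.Length; i++)
        {
            // (float)(object) looks like a box followed by an unbox. When T is float,
            // the JIT is expected to recognise the pattern and remove the round trip.
            float value = (float)(object)values[i];
            BinaryPrimitives.WriteSingleLittleEndian(destination.Slice(offset + i * sizeof(float)), value);
        }
    }
}
```

Because the write goes through `BinaryPrimitives`, the resulting byte order does not depend on the endianness of the machine running the code.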

Issues

Fixes #3790.

Testing

Automated tests continue to pass.


          Explicitly perform little-endian multibyte writes

d2dd1b0

edwardneal requested a review from a team as a code owner

December 24, 2025 00:24

apoorvdeshmukh requested a review from Copilot

December 24, 2025 04:37

Contributor

apoorvdeshmukh commented Dec 24, 2025

/azp run

Copilot started reviewing on behalf of apoorvdeshmukh

December 24, 2025 04:37

azure-pipelines bot commented Dec 24, 2025

Azure Pipelines successfully started running 2 pipeline(s).

Copilot AI reviewed


Contributor

Copilot AI left a comment

Pull request overview

This PR refactors the SqlVector<T> implementation to use explicit little-endian byte operations when serializing and deserializing vector data. The change replaces platform-specific memory marshaling/copying approaches with consistent loop-based operations using BinaryPrimitives methods to ensure correct endianness.

Key Changes:

Replaced manual bit manipulation and platform-specific Buffer.BlockCopy/MemoryMarshal operations with explicit little-endian read/write methods
Introduced platform-specific handling: BinaryPrimitives.WriteSingleLittleEndian for .NET and BitConverterCompatible.SingleToInt32Bits + BinaryPrimitives.WriteInt32LittleEndian for .NET Framework
Ensured symmetry between serialization (MakeTdsBytes) and deserialization (MakeArray) operations
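As a rough illustration of the .NET Framework path and the read/write symmetry listed above, the sketch below converts a float to its raw 32-bit pattern and writes that integer in little-endian order, then reverses the process. The class `FloatBitsFallbackSketch` and its members are hypothetical stand-ins: `BitConverterCompatible` is internal to the driver, so the bit conversion here is emulated with `BitConverter.GetBytes`, which is an assumption and not how the driver necessarily does it.

```csharp
using System;
using System.Buffers.Binary;

public static class FloatBitsFallbackSketch
{
    // Hypothetical stand-in for BitConverterCompatible.SingleToInt32Bits.
    // GetBytes and ToInt32 both use native byte order, so the bit pattern is preserved.
    public static int SingleToInt32Bits(float value) =>
        BitConverter.ToInt32(BitConverter.GetBytes(value), 0);

    // Hypothetical inverse of the helper above.
    public static float Int32BitsToSingle(int bits) =>
        BitConverter.ToSingle(BitConverter.GetBytes(bits), 0);

    // Serialization direction: float -> raw bits -> little-endian bytes.
    public static void WriteSingle(Span<byte> destination, float value) =>
        BinaryPrimitives.WriteInt32LittleEndian(destination, SingleToInt32Bits(value));

    // Deserialization direction: little-endian bytes -> raw bits -> float.
    public static float ReadSingle(ReadOnlySpan<byte> source) =>
        Int32BitsToSingle(BinaryPrimitives.ReadInt32LittleEndian(source));
}
```

Keeping the reader and writer as exact mirrors of each other is what makes a round trip return the original value on any platform.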


paulmedynski self-assigned this

paulmedynski reviewed


src/Microsoft.Data.SqlClient/src/Microsoft/Data/SqlTypes/SqlVector.cs

    
    private SqlVector(int length)
    {
        if (length < 0)

Contributor

paulmedynski Jan 5, 2026

Should we also be throwing if length > ushort.MaxValue?

Same for the ReadOnlyMemory constructor.

(With updated public docs to match.)

Contributor Author

edwardneal Jan 5, 2026

I think so, yes - although I think it's probably going to be TdsEnums.VECTOR_HEADER_SIZE + (_elementSize * Length) > 8000 to align with SQL Server.

This also means that length will always be < 8000, thus always in the acceptable range for a ushort (so MakeTdsBytes can just have a simple debug assertion rather than an exception.)
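The size check proposed in this reply could look roughly like the sketch below. The names `VectorLengthSketch`, `HeaderSize` and `MaxPayloadBytes` are hypothetical, and the header size of 8 bytes is an assumption (the thread only names `TdsEnums.VECTOR_HEADER_SIZE` without giving its value).

```csharp
using System;

public static class VectorLengthSketch
{
    // Assumed value; the thread refers to TdsEnums.VECTOR_HEADER_SIZE without stating it.
    public const int HeaderSize = 8;

    // Byte limit quoted in the thread.
    public const int MaxPayloadBytes = 8000;

    // Hypothetical validation helper: rejects negative lengths and any length whose
    // serialized form (header plus elements) would exceed the byte limit.
    public static void ValidateLength(int length, int elementSize)
    {
        if (length < 0)
        {
            throw new ArgumentOutOfRangeException(nameof(length), "Length must not be negative.");
        }

        // Use long arithmetic so that a very large length cannot overflow int.
        if (HeaderSize + (long)elementSize * length > MaxPayloadBytes)
        {
            throw new ArgumentOutOfRangeException(nameof(length), "Vector exceeds the maximum serialized size.");
        }
    }
}
```

Under these assumed constants, any length that passes the check is far below `ushort.MaxValue`, which is why a debug assertion at the serialization site would be sufficient.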

src/Microsoft.Data.SqlClient/src/Microsoft/Data/SqlTypes/SqlVector.cs

    
      result[1] = VecVersionNo;
    - result[2] = (byte)(Length & 0xFF);
    - result[3] = (byte)((Length >> 8) & 0xFF);
    + BinaryPrimitives.WriteUInt16LittleEndian(result.AsSpan(2), (ushort)Length);

Contributor

paulmedynski Jan 5, 2026

I think we need a new class invariant to ensure Length's value is compatible with ushort. See my comment on the constructor above.
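To illustrate why the invariant matters, the sketch below compares the manual shift-and-mask write with the `BinaryPrimitives` call from the snippet above. The class `HeaderLengthSketch` and its methods are hypothetical, and the 4-byte buffer is only a stand-in for the start of the real header.

```csharp
using System;
using System.Buffers.Binary;

public static class HeaderLengthSketch
{
    // Manual form: low byte at index 2, high byte at index 3.
    public static byte[] ManualWrite(int length)
    {
        var result = new byte[4];
        result[2] = (byte)(length & 0xFF);
        result[3] = (byte)((length >> 8) & 0xFF);
        return result;
    }

    // BinaryPrimitives form: the same two bytes, written with an explicit byte order.
    public static byte[] PrimitivesWrite(int length)
    {
        var result = new byte[4];
        // In the default unchecked context, the cast silently drops bits above the low 16.
        BinaryPrimitives.WriteUInt16LittleEndian(result.AsSpan(2), (ushort)length);
        return result;
    }
}
```

Both forms produce identical bytes, and both silently truncate a length above `ushort.MaxValue`, so neither protects against an oversized vector on its own. The guarantee has to come from validation elsewhere, such as in the constructors.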


paulmedynski reviewed


src/Microsoft.Data.SqlClient/src/Microsoft/Data/SqlTypes/SqlVector.cs Outdated

    
    for (int i = 0, currPosition = TdsEnums.VECTOR_HEADER_SIZE; i < values.Length; i++, currPosition += _elementSize)
    {
    #if NET
        BinaryPrimitives.WriteSingleLittleEndian(result.AsSpan(currPosition), (float)(object)valueSpan[i]);

Contributor

paulmedynski Jan 5, 2026

Is there a performance impact here (good or bad)?

Contributor Author

edwardneal Jan 5, 2026

I expected there to be one, but didn't expect it to be so large.

Previously, the method would have taken ~70ns to copy a max-length vector when the ReadOnlyMemory was backed by an array and ~240ns when it was backed by unmanaged memory (as a result of the extra copy.) As a result of the first set of changes, it would have taken ~1150ns.

The endianness is only really a concern on big-endian systems, so I've changed the method slightly. On a big-endian machine, it'll continue to use the replacement method (so will take about 1150ns.) On a little-endian machine, it'll continue to use the pre-PR method (albeit with a Span-based copy instead of Buffer.BlockCopy) which now takes ~60ns.
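The endianness-dependent split described here might be sketched as follows. This is not the PR's actual code: `EndianAwareCopySketch` and `CopyFloats` are hypothetical names, and the sketch assumes a modern .NET target so that `BinaryPrimitives.WriteSingleLittleEndian` exists.

```csharp
using System;
using System.Buffers.Binary;
using System.Runtime.InteropServices;

public static class EndianAwareCopySketch
{
    // Hypothetical helper: copies floats into `destination` as little-endian bytes.
    public static void CopyFloats(ReadOnlySpan<float> values, Span<byte> destination)
    {
        if (BitConverter.IsLittleEndian)
        {
            // Fast path: the in-memory layout already matches the wire format,
            // so reinterpret the floats as bytes and copy them in one operation.
            MemoryMarshal.AsBytes(values).CopyTo(destination);
        }
        else
        {
            // Slow path for big-endian machines: write each element explicitly.
            for (int i = 0; i < values.Length; i++)
            {
                BinaryPrimitives.WriteSingleLittleEndian(destination.Slice(i * sizeof(float)), values[i]);
            }
        }
    }
}
```

Both branches are meant to produce the same bytes. The per-element loop is only taken where the bulk copy would give the wrong byte order.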

edwardneal added 4 commits

January 5, 2026 14:37


          Tighten validation of array lengths

c66a1e5

Vectors must always be less than 8000 bytes.
This also means that Length will always be <= ushort.MaxValue. Add assertion to highlight this.


          Merge branch 'main' into fix/issue-3790


          Partially revert changes to avoid performance regression

d976373


          Correct invalid debug assertion

8a908ed

Length is the number of elements.

Member

cheenamalhotra commented Jan 5, 2026

/azp run

azure-pipelines bot commented Jan 5, 2026

Azure Pipelines successfully started running 2 pipeline(s).

