Skip to content

Conversation

@edwardneal
Copy link
Contributor

Description

This deals with a minor point which came out of the original implementation PR: when we build (or read) the byte array representing a vector({size}, float32), we now explicitly do so using BinaryPrimitives' little-endian methods.

There are two slightly unusual points:

  • netfx doesn't have the BinaryPrimitives.WriteSingleLittleEndian method. Instead, we fall back to the existing BitConverterCompatible.SingleToInt32Bits on this target.
  • When building the byte array, we perform two casts: (float)(object)valueSpan[i]. This is another variation of the same pattern used elsewhere of (T)(object)item, and the same pattern holds: the JIT sees that valueSpan is actually a ReadOnlySpan<float> and eliminates the redundant cast. An example of this on Sharplab is here.

Issues

Fixes #3790.

Testing

Automated tests continue to pass.

@edwardneal edwardneal requested a review from a team as a code owner December 24, 2025 00:24
@apoorvdeshmukh
Copy link
Contributor

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 2 pipeline(s).

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR refactors the SqlVector<T> implementation to use explicit little-endian byte operations when serializing and deserializing vector data. The change replaces platform-specific memory marshaling/copying approaches with consistent loop-based operations using BinaryPrimitives methods to ensure correct endianness.

Key Changes:

  • Replaced manual bit manipulation and platform-specific Buffer.BlockCopy/MemoryMarshal operations with explicit little-endian read/write methods
  • Introduced platform-specific handling: BinaryPrimitives.WriteSingleLittleEndian for .NET and BitConverterCompatible.SingleToInt32Bits + BinaryPrimitives.WriteInt32LittleEndian for .NET Framework
  • Ensured symmetry between serialization (MakeTdsBytes) and deserialization (MakeArray) operations

@paulmedynski paulmedynski self-assigned this Jan 5, 2026

private SqlVector(int length)
{
if (length < 0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we also be throwing if length > ushort.Max ?

Same for the ReadOnlyMemory constructor.

(With updated public docs to match.)

Copy link
Contributor Author

@edwardneal edwardneal Jan 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think so, yes - although I think it's probably going to be TdsEnums.VECTOR_HEADER_SIZE + (_elementSize * Length) > 8000 to align with SQL Server.

This also means that length will always be < 8000, thus always in the acceptable range for a ushort (so MakeTdsBytes can just have a simple debug assertion rather than an exception.)

result[1] = VecVersionNo;
result[2] = (byte)(Length & 0xFF);
result[3] = (byte)((Length >> 8) & 0xFF);
BinaryPrimitives.WriteUInt16LittleEndian(result.AsSpan(2), (ushort)Length);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need a new class invariant to ensure Length's value is compatible with ushort. See my comment on the constructor above.

for (int i = 0, currPosition = TdsEnums.VECTOR_HEADER_SIZE; i < values.Length; i++, currPosition += _elementSize)
{
#if NET
BinaryPrimitives.WriteSingleLittleEndian(result.AsSpan(currPosition), (float)(object)valueSpan[i]);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a performance impact here (good or bad)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I expected there to be one, but didn't expect it to be so large.

Previously, the method would have taken ~70ns to copy a max-length vector when the ReadOnlyMemory was backed by an array and ~240ns when it was backed by unmanaged memory (as a result of the extra copy.) As a result of the first set of changes, it would have taken ~1150ns.

The endianness is only really a concern on big-endian systems, so I've changed the method slightly. On a big-endian machine, it'll continue to use the replacement method (so will take about 1150ns.) On a little-endian machine, it'll continue to use the pre-PR method (albeit with a Span-based copy instead of Buffer.BlockCopy) which now takes ~60ns.

Vectors must always be less than 8000 bytes.
This also means that Length will always be <= ushort.MaxValue. Add assertion to highlight this.
Length is the number of elements.
@cheenamalhotra
Copy link
Member

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 2 pipeline(s).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Multibyte writes should explicitly happen as little-endian.

4 participants