Conversation

@edwardneal
Contributor

Description

This PR makes two related changes: it populates the DataTypes table returned by SqlConnection.GetSchema in code rather than by deserializing an embedded XML resource, and then adds the json data type to this table.

In the process of adding tests, I also noticed that there wasn't a coherent view of whether or not the json data type was present, so I added one. This also led me to notice that we're not running JSON-based tests on SQL 2025, so I changed their test criteria to make sure this happens. This is essentially the same situation as #3794, and I'm expecting some degree of conflict depending upon the order in which they're merged.

It should be possible to review this commit-by-commit, but 2f907ed is probably the bulkiest. That commit populates the fields in the DataTypes table based on the values of fields in the MetaType instances for each type.
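To illustrate the idea, here is a minimal, self-contained sketch of populating a DataTypes-style table in code rather than deserializing an embedded XML resource. All names except the "json" row values (long type, not best match, minimum version 17.00.000.0, taken from this PR's description) are illustrative, not the actual SqlMetaDataFactory.DataTypes.cs implementation.

```csharp
using System;
using System.Data;

class DataTypesSketch
{
    static void Main()
    {
        // A small subset of the DataTypes schema collection's columns.
        var table = new DataTable("DataTypes");
        table.Columns.Add("TypeName", typeof(string));
        table.Columns.Add("IsLong", typeof(bool));
        table.Columns.Add("IsBestMatch", typeof(bool));
        table.Columns.Add("MinimumVersion", typeof(string));

        // In the real factory, each row's values would be derived from the
        // corresponding MetaType instance instead of being hard-coded here.
        table.Rows.Add("json", true, false, "17.00.000.0");

        foreach (DataRow row in table.Rows)
        {
            Console.WriteLine(
                $"{row["TypeName"]}: IsLong={row["IsLong"]}, MinimumVersion={row["MinimumVersion"]}");
        }
    }
}
```

The benefit over the embedded XML is that the metadata lives next to the MetaType definitions, so adding a new type (like json) is a single code change rather than an XML edit plus resource rebuild.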

Issues

Contributes to #3833.
Contributes to #3372.

I'm conscious that #3833 refers to both the json and vector types. I've not included the vector type because I don't know whether we should treat vector({0}[, float32]) and vector({0}, float16) as two different data types. There's also a little extra plumbing needed within SqlVector<float> and in the tests (which #3794 introduces).

Testing

I've manually verified that all fields for all existing data types are identical between main and this branch. I've also added a new test to prove that no data types have been omitted.

SqlClient cannot connect to SQL Server 2000 instances, so the minimum server version is 9.0.
* Remove data type entries with a maximum version below this.
* Remove the minimum version constraint where this constraint will always be true.
This enables the JSON tests to run against SQL Server 2025.
Also correct a comment in LoadDataTypesDataTables.
Contributor

Copilot AI left a comment

Pull request overview

This PR refactors the DataTypes schema metadata population from an XML-based approach to a code-based approach, and adds support for the json data type. It introduces a new IsJsonSupported property to consolidate SQL Server version detection for JSON features and updates all JSON-related tests to use this unified check.

Key Changes

  • Replaced XML deserialization for the DataTypes metadata table with programmatic population using MetaType introspection
  • Added the json data type to the schema with appropriate metadata (minimum version 17.00.000.0, long type, not best match)
  • Introduced DataTestUtility.IsJsonSupported property to centralize JSON support detection across tests

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| JsonTest.cs | Updated conditional test attributes to use IsJsonSupported instead of Azure-specific checks |
| JsonStreamTest.cs | Updated conditional test attributes to use IsJsonSupported |
| JsonBulkCopyTest.cs | Updated conditional test attributes to use IsJsonSupported |
| ConnectionSchemaTest.cs | Added a new test to verify the DataTypes schema and validate json type presence based on server support |
| DataTestUtility.cs | Added IsJsonSupported property and IsTypePresent helper method for SQL Server feature detection |
| SqlMetaData.xml | Removed the DataTypesTable XML section (695 lines) |
| SqlMetaDataFactory.cs | Updated to call the new code-based DataTypes population instead of XML deserialization |
| SqlMetaDataFactory.DataTypes.cs | New file containing all DataTypes table population logic, with helper methods for different type categories |
| Microsoft.Data.SqlClient.csproj (netfx/netcore) | Added reference to the new SqlMetaDataFactory.DataTypes.cs file |

Comment on lines 245 to 253
void AddLongStringOrBinaryType(SqlDbType longDbType,
string? literalPrefix = null, string? literalSuffix = null,
string? minimumVersion = null) =>
AddStringOrBinaryType(longDbType,
// The column size is measured in elements, not bytes. For ntext, each element is a 2 byte Unicode character.
columnSize: longDbType is SqlDbType.NText ? int.MaxValue / ADP.CharSize : int.MaxValue,
isLong: true, isFixedLength: false,
isSearchable: false,
literalPrefix: literalPrefix, literalSuffix: literalSuffix);
Copilot AI Dec 22, 2025

The minimumVersion parameter is accepted by AddLongStringOrBinaryType but is never passed to the AddStringOrBinaryType method. This means the json data type won't have its MinimumVersion field set to "17.00.000.0" as intended. The AddStringOrBinaryType method needs to accept a minimumVersion parameter and set the MinimumVersionKey field when it's not null, similar to how AddFixedPrecisionDateTimeType and AddVariablePrecisionDateTimeType handle it.

Contributor Author

@edwardneal edwardneal Dec 22, 2025


That's correct - I'd tested with SQL 2025, so this was always true. I've plumbed this parameter value through and re-tested with SQL 2025, SQL 2022 and SQL Azure to cover the success and failure cases (both for the new test and for the other tests which use the new JSON detection logic).

This has highlighted one possible issue with SQL Azure, which reports a version of 12.x. As a result, it doesn't meet the version criteria and the json type doesn't appear in the list of types returned by GetSchema when run against a SQL Azure instance. I've adapted the test criteria to match this, since the same issue would also impact other data returned by GetSchema (such as the queries run to get a list of columns) and a fix here requires more thought.

@apoorvdeshmukh
Contributor

/azp run

@azure-pipelines

Azure Pipelines successfully started running 2 pipeline(s).

@paulmedynski paulmedynski self-assigned this Dec 23, 2025
Assert.All(expectedTypes, type => Assert.Contains(type, actualTypes));

// The "json" type should only be present when running against a SQL Server version which supports it.
// SQL Azure reports a version of 12.x but supports JSON, so SqlClient doesn't include it in the list of types.
Contributor

Azure SQL versions aren't comparable with SQL Server versions, AFAICT. Azure SQL uses Engine Edition 5, so we can check for that and assume all Azure SQL instances support JSON. There are specific version numbers that we may be able to go back and confirm when JSON support first appeared for Azure SQL, but I don't think it's worth it.

Same idea for vectors, except that underlying data types will be added as time goes on, and we will only know by performing the login handshake.
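The engine-edition check described above could look something like the following sketch. This is illustrative only (not part of this PR); the connection string is a placeholder. `SERVERPROPERTY('EngineEdition')` is a documented T-SQL property that returns 5 for Azure SQL Database, which sidesteps the misleading 12.x version number entirely.

```csharp
using System;
using Microsoft.Data.SqlClient;

class EngineEditionCheck
{
    static void Main()
    {
        using var connection = new SqlConnection("<connection string>");
        connection.Open();

        // SERVERPROPERTY('EngineEdition') returns a sql_variant wrapping an int;
        // 5 indicates Azure SQL Database.
        using var command = new SqlCommand(
            "SELECT SERVERPROPERTY('EngineEdition')", connection);
        int engineEdition = (int)command.ExecuteScalar();

        // Assume JSON support for Azure SQL regardless of the reported
        // version; otherwise fall back to a version comparison.
        bool assumeJsonSupport = engineEdition == 5;
        Console.WriteLine(
            $"EngineEdition={engineEdition}, JSON assumed: {assumeJsonSupport}");
    }
}
```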

Contributor Author

I agree, but it's a wider design issue with SqlMetaDataFactory. It decides whether a row should appear in the final DataTable based upon the reported server version (in SupportedByCurrentVersion.) In theory, if we decided to use a different query for one of the other metadata collections (e.g. Tables) or to add a new metadata collection which only appeared in SQL Server 2016 / v13.x or higher, the filtering logic would hide it for Azure SQL instances.

We need a way to convert feature IDs and versions to a list of server capabilities (i.e. JSON feature with v1, vector feature with float32, vector feature with float16, etc.) and to allow filtering on those. That capability list is best placed within SqlInternalConnectionTds though, and this PR's already pretty large, so I'd prefer to do that separately prior to preview4.
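The capability-list idea above could be sketched roughly as follows. Every name here is hypothetical (nothing like this exists in SqlClient today); it only shows the shape of the design: derive a capability set once at login, then let metadata filtering consult capabilities instead of raw version comparisons.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical capability identifiers, per the comment above.
enum ServerCapability
{
    Json,
    VectorFloat32,
    VectorFloat16,
}

static class CapabilityMap
{
    // Hypothetical: built once from the login handshake (feature-extension
    // acks, engine edition, server version), then used for all filtering.
    public static ISet<ServerCapability> FromLogin(Version serverVersion, bool isAzure)
    {
        var capabilities = new HashSet<ServerCapability>();

        // Azure SQL supports JSON despite reporting a 12.x version;
        // on-premises servers need v17 (SQL Server 2025) or later.
        if (isAzure || serverVersion.Major >= 17)
        {
            capabilities.Add(ServerCapability.Json);
        }
        return capabilities;
    }
}

class Demo
{
    static void Main()
    {
        var azure = CapabilityMap.FromLogin(new Version(12, 0), isAzure: true);
        Console.WriteLine(azure.Contains(ServerCapability.Json)); // prints True
    }
}
```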

Contributor

Sounds good - I just wanted to point out the different way the version numbers are treated for the different engine editions.

Contributor

Are there existing unit tests for this functionality that are now testing this new file? Or do we need new tests for this?

Contributor Author

There are existing unit tests which prove that the process of building the schema dataset completes successfully, and there's a new (somewhat basic) unit test which proves that we've not missed any types from the original XML file.

The data which we add in this new file shouldn't change (barring new data types), so I've not added an automatic test to verify it field-by-field; I just performed a manual one-off check of the resultant DataTypes table against the original XML file.
