-
Notifications
You must be signed in to change notification settings - Fork 323
It upgrades Elastic search client and urlib3 libraries.
#2024
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
ee26164 to
1e8aeeb
Compare
1e8aeeb to
1421ad0
Compare
553574d to
de1c485
Compare
…pgrade of elastisearch client libraries.
b9c230c to
46c6f19
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR upgrades the Elasticsearch client libraries and urllib3 to newer major versions, updating from Elasticsearch 8.x to 9.x and urllib3 1.x to 2.x. The changes include API parameter updates, refactoring of header handling logic, and deprecation of certain track dependencies.
Changes:
- Updated
elasticsearch[async]from 8.6.1 to 9.2.1,elastic-transportfrom 8.4.1 to 9.2.1, andurllib3from 1.26.19 to 2.6.3 - Renamed
maxsizeparameter toconnections_per_nodein client initialization to match the new API - Refactored header handling logic into reusable functions and improved MIME type compatibility handling
Reviewed changes
Copilot reviewed 11 out of 12 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| pyproject.toml | Updated dependency versions for elasticsearch, elastic-transport, urllib3, and pip |
| esrally/client/factory.py | Changed maxsize to connections_per_node parameter |
| tests/client/factory_test.py | Updated test to use connections_per_node parameter |
| esrally/client/common.py | Refactored header combining and MIME type compatibility logic into dedicated functions |
| esrally/client/synchronous.py | Updated to use new header handling functions and added endpoint_id/path_parts parameters |
| esrally/client/asynchronous.py | Updated RallyAsyncElasticsearch signature and header handling to use new functions |
| esrally/track/loader.py | Added deprecation handling for elasticsearch and elastic-transport dependencies in tracks |
| esrally/utils/versions.py | Added type hints to is_version_identifier function |
| it/init.py | Updated test distributions to include 9.2.4 and removed 6.8.0/ARM logic |
| tests/client/common_test.py | Removed test file (likely obsolete after refactoring) |
| tests/driver/runner_test.py | Added typing import and @typing.no_type_check decorator to test |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
urlib3 libraries.
…ocker-compose` (second part).
b1a7027 to
24f6479
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 20 out of 21 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
|
I am waiting for review for the following Elasticsearch PR fix that should block this PR build failures: elastic/elasticsearch#140947 It should correct the following error: |
|
PR elastic/elasticsearch#140947 has been merged. Now current Elasticsearch should compile again. |
gareth-ellis
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM - Have you run any comparisons with the old and new clients, and validated whether performance is equivalent?
gbanasiak
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few areas of concern:
- How does this change affect the range of supported Elasticsearch versions? This needs to be reflected in Rally documentation.
- If the intention is to drop support for ES 6.8 and 7.x as part of client upgrade, we should release one more Rally version before merging this update IMO. It's been a while since we released.
- What is the test plan here? Go to nightly staging, and then nightly production, and revert if things go south? I would prefer to test this more extensively before the merge (e.g. by running all nightlies in test mode).
- Why
_ProductCheckeris kept insynchronous.py? This client upgrade is an opportunity to simplify, because the following no longer holds:
Because Rally needs to support versions of ES >= 6.8.0, we resurrect the previous
logic for determining the authenticity of the server, which does not rely exclusively
on this header.
…al parent method implementations.
It should now match versions supported by Elasticsearch Python client 9.x
OK.
It would be nice to push this to a branch used by the nightly staging CI jobs, so we would eventually see it working or not. Is
I removed the old logic and matched the code in Rally as much as possible with the official client, with the only exception of the support for serverless and the automatic addition of |
…g tracks preparation.
… version to 9.2.4
gbanasiak
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've noticed ES Python client docs say 9.x is not compatible with ES 8.x, so I started wondering if we shouldn't conservatively upgrade the client to 8.19 instead? This would still allow us to get rid of old urllib3 version. What exactly fails when using 9.x client against 8.x cluster?
What is the purpose of _ProductChecker in synchronous.py now that we're dropping support for ES 6.8 and 7.x?
Have you checked if we actually have to keep overriding _perform_request() in RallySyncElasticsearch and RallyAsyncElasticsearch? See note below.
| if not self.is_serverless: | ||
| if versions.is_version_identifier(self.distribution_version) and ( | ||
| versions.Version.from_string(self.distribution_version) >= versions.Version.from_string("8.0.0") | ||
| ): | ||
| _mimetype_header_to_compat("Accept", request_headers) | ||
| _mimetype_header_to_compat("Content-Type", request_headers) | ||
| # Converts all parts of a Accept/Content-Type headers | ||
| # from application/X -> application/vnd.elasticsearch+X | ||
| # see https://github.com/elastic/elasticsearch/issues/51816 | ||
| mimetype_headers_to_compat(request_headers, self.distribution_version) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need if not self.is_serverless condition? I don't see anything like it in 9.2.1 version of ES Python client, see https://github.com/elastic/elasticsearch-py/blob/2022b70b5c2139c3c5d9c47d038f022a1cb1f7ca/elasticsearch/_sync/client/_base.py#L307-L308 which does support serverless I presume.
If this goes away, can we get rid of _perform_request() completely?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sending compatible-with on serverless is more correct, that's a missing feature in the Python client. It's harder to fix because we don't know if something is serverless when talking to it initially, so the only thing we can do is add an opt-in serverless mode.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the new client there are new features like supporting open telemetry. I decided to keep it and have the client code as much similar to the official version to make its maintenance simpler for the next upgrades. _perform_request is the actual new implementation in the base client, it is being wrapped to support open telemetry and I keep it the same. is_serverless condition makes things bit more complex but required to avoid sending an unsupported header to serverless. Anyway the header is only send if the server version is known that could not be the case (and it should work anyway I guess). The fact no CI job was broken by testing it with either 8 and 9 family give me confidence this approach could actually work and it is enough simple.
I never validated performances. |
| # from application/X -> application/vnd.elasticsearch+X | ||
| # see https://github.com/elastic/elasticsearch/issues/51816 | ||
| # Not applicable to serverless | ||
| if body is not None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this block is only required to avoid updating our tracks to include these headers. I wonder why the headers handling wasn't moved to an higher level to leave this client patching much simpler. @pquentin What do you think?
| if body is not None: | ||
| # It ensures content-type and accept headers are set. | ||
| mimetype = "application/json" | ||
| if path.endswith("/_bulk"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
API protocol 9 is more picky and requires application/x-ndjson for bulk requests.
|
|
||
| raise HTTP_EXCEPTIONS.get(meta.status, ApiError)(message=message, meta=meta, body=resp_body) | ||
|
|
||
| # 'X-Elastic-Product: Elasticsearch' should be on every 2XX response. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is intended to keep the code much similar to the official client.
| return | ||
|
|
||
| major_version = 8 | ||
| if versions.is_version_identifier(distribution_version): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It only uses compatibility version 9 when it does knows server version is 9, else it always uses version 8.
Concerns lifted after internal discussion.
This PR upgrades the Elasticsearch client libraries and urllib3 to newer major versions, updating from Elasticsearch 8.x to 9.x and urllib3 1.x to 2.x. The changes include API parameter updates, refactoring of header handling logic, and deprecation of certain track dependencies.
Changes:
elasticsearch[async]from 8.6.1 to 9.2.1elastic-transportfrom 8.4.1 to 9.2.1urllib3from 1.26.19 to 2.6.3dockerfrom 6.0.0 to 7.1.0maxsizeparameter toconnections_per_nodein client initialization to match the new APIdocker-composecommand withdocker composeto avoid conflicts with newurlib3version.it/docker_dev_image_test.pyand update elasticsearch server to version 9.2.4.This PR replaces the now obsolete #2003 and it requires PR elastic/rally-tracks#1000