-
Notifications
You must be signed in to change notification settings - Fork 1k
EC2 instance slow boot with multiple NICs where primary NIC is not the first in the list. #6618 #6644
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
Hi @MoeSalah1999, thanks for taking a look at this. I think that there is an issue with this approach. The problem in #6618 occurs before this code runs - it is the ephemeral interface code that fails, which occurs before the metadata is queried. A change like this might still be desirable as a part of the solution for #6618, but as it is I do not think that this is complete. Either way, we will need testing on the specific interface type to verify the solution to this problem. |
|
@holmanb Thanks for the clarification — that makes sense. I agree that this helper is, at best, a partial improvement that may still be useful later in the pipeline once metadata is available, but not a complete fix on its own. Unfortunately I wasn’t able to reproduce the issue locally or on a real EC2 instance with the specific multi-NIC + ephemeral interface configuration described in the report. As you noted, proper validation will require testing against that interface type. Please let me know if you’d prefer this PR to be parked as a partial improvement, extended to explore the earlier ephemeral interface code path, or closed until testing on the specific EC2 configuration can be performed. |
|
@holmanb I’m going to trace the ephemeral interface heuristic and look at deferring or softening classification until metadata is available. I’ll follow up with a revised approach. |
3598b59 to
4191bce
Compare
…longer depends on kernel enumeration order
4191bce to
96108dd
Compare
Proposed Commit Message
Additional Context
Behavior
Test Steps
Unit tests added to validate:
Run pytest tests/unittests/sources/helpers/test_ec2.py
This change is covered by new unit tests exercising EC2 metadata shapes where the primary ENI is not first in the metadata ordering, including deterministic selection behavior.
The fix has not yet been validated on a live EC2 instance with multiple attached ENIs where the primary interface is not enumerated first by IMDS. While the logic follows documented EC2 metadata semantics (network-card == 0, device-number == 0) and matches observed metadata layouts, confirmation on a real EC2 datasource would further validate the behavior under actual IMDS timing and attachment conditions.
Merge type