@jackchallen
Collaborator

No description provided.

A customer saw interrupt vector space exhaustion, which meant NVMe
devices became unusable. This was caused by x2apic not being available
(which can itself be caused by the IOMMU being disabled).

Therefore, check that x2apic (Intel) or extapic (AMD) is available,
which should result in plenty of IRQ space.

https://weka-support.slack.com/archives/C066DNGSAE5/p1764947984669029
#check that extended APIC (or x2apic) is available, because it's required for more
# space for IRQs

if (grep -m1 -q -E '^flags.*(\<extapic|\<x2apic)' /proc/cpuinfo) ; then
Contributor

This errors out for me, unless there's a no-op instruction:

if (grep -m1 -q -E '^flags.*(\<extapic|\<x2apic)' /proc/cpuinfo) ; then
    :
else

Also, do you intend for this to run in a subshell?
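For illustration only, here is a minimal sketch of how the check could avoid both the empty `then` branch and the subshell, by negating the test. The function name `has_extended_apic`, its optional path argument, and the warning text are assumptions made for this example, not what the PR contains. The `\<` word-start anchor is a GNU grep extension, as in the original line.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not from the PR): succeeds when a "flags" line in the
# given cpuinfo file lists extapic (AMD) or x2apic (Intel).
# The path argument is an assumption, added so the check can be tested
# against a sample file; it defaults to /proc/cpuinfo.
has_extended_apic() {
    local cpuinfo="${1:-/proc/cpuinfo}"
    grep -m1 -q -E '^flags.*(\<extapic|\<x2apic)' "$cpuinfo" 2>/dev/null
}

# Negating the test removes the need for an empty "then" branch, and calling
# the function directly avoids the ( ... ) subshell.
if ! has_extended_apic; then
    echo "WARN: neither extapic nor x2apic found in /proc/cpuinfo"
fi
```

An `if` with nothing between `then` and `else` is a syntax error in bash, which is why the `:` no-op was needed in the version quoted above.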

Collaborator Author

Sorry @vrragosta, no idea which version made it into the commit. Pushed a slightly less weird version.

Contributor

@vrragosta vrragosta left a comment

A-OK

echo "by a Weka process."
echo "This can be caused by the presence of an enabled APIC device. Review your hardware,"
echo "firmware, and linux kernel settings if this is causing a problem"
echo "This can sometimes prevent a WEKA Process from receiving interrupts from the NVME"

Weka doesn't care about the interrupts. The issue was that the kernel couldn't allocate interrupts, which caused the KERNEL to fail to use the device. That in turn made Weka unable to use it, since the device couldn't be scanned to confirm that it is a Weka-signed device.
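One possible rewording of the warning along these lines, as a rough sketch only. The function name `print_apic_warning` and the exact wording are hypothetical and are not taken from the PR.

```shell
#!/usr/bin/env bash

# Hypothetical wording (not from the PR): describes the failure as a kernel
# interrupt-allocation problem rather than a WEKA interrupt problem.
print_apic_warning() {
    echo "Neither extapic (AMD) nor x2apic (Intel) was found in /proc/cpuinfo."
    echo "Without it, the kernel can run out of interrupt vectors and fail to"
    echo "use NVMe devices. WEKA then cannot scan those devices to confirm that"
    echo "they are WEKA-signed, so it cannot use them."
    echo "Review your hardware, firmware, and Linux kernel settings."
}
```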

#check that extended APIC (or x2apic) is available, because it's required for more
# space for IRQs

grep -m1 -q -E '^flags.*(\<extapic|\<x2apic)' /proc/cpuinfo 2>/dev/null

Do we know for certain that all relevant platforms have x2apic, down to the oldest supported server platforms?

I never took notice of that to know myself.

Collaborator Author

Apparently it came out in 2008 with Nehalem... I think that's older than Weka.

4 participants