@danieldk danieldk commented Dec 17, 2025

Previously, we computed a kernel's capabilities by taking the loose intersection of the stated kernel capabilities (or the default) and the capabilities that CMake/Torch report as supported. However, this led to issues with e.g. capability 8.9, which is not in these lists (anymore?) but is fine to compile for.

To solve this issue, we ignore the capabilities reported by CMake/Torch and instead use our own list of capabilities for the loose intersection with the kernel capabilities. This list contains all capabilities supported by a given CUDA version, minus some really old capabilities that Torch does not support anyway. This behavior is enabled by the new `BUILD_ALL_SUPPORTED_ARCHS` CMake option (which is the default for the Nix and Windows builders).
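
To make the difference concrete, here is a minimal Python sketch of the idea, not the actual CMake implementation. It uses a plain set intersection as a simplified stand-in for the loose intersection, whose exact matching rule is not spelled out here. The names `parse` and `build_capabilities`, the capability lists, and the `7.0` cutoff are all hypothetical and chosen only for illustration.

```python
def parse(cap: str) -> tuple[int, int]:
    """Turn "8.9" into (8, 9) so capabilities compare numerically.

    Only plain "<major>.<minor>" strings are handled in this sketch.
    """
    major, minor = cap.split(".")
    return int(major), int(minor)


def build_capabilities(kernel_caps, candidate_caps, min_cap):
    """Intersect the kernel's capabilities with a candidate list.

    Assumption: a plain set intersection stands in for the loose
    intersection. Candidates older than min_cap are dropped first.
    """
    floor = parse(min_cap)
    candidates = {c for c in candidate_caps if parse(c) >= floor}
    return sorted(candidates & set(kernel_caps), key=parse)


# Illustrative values only, not the project's actual tables.
kernel_caps = ["7.5", "8.0", "8.9", "9.0"]
reported_caps = ["7.0", "7.5", "8.0", "8.6", "9.0"]  # 8.9 is missing
own_caps = ["5.0", "6.0", "7.0", "7.5", "8.0", "8.6", "8.9", "9.0"]

old = build_capabilities(kernel_caps, reported_caps, "7.0")
new = build_capabilities(kernel_caps, own_caps, "7.0")
print(old)  # ['7.5', '8.0', '9.0']
print(new)  # ['7.5', '8.0', '8.9', '9.0']
```

With the externally reported list, 8.9 silently drops out of the build; with a self-maintained list it is kept.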

When `BUILD_ALL_SUPPORTED_ARCHS` is not set, we try to detect the capability of the user's CUDA GPU. This speeds up development, since one then only has to compile for a single capability. If detection fails for some reason, we fall back to using all capabilities, as if `BUILD_ALL_SUPPORTED_ARCHS` were set.
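
The selection logic described here can be sketched roughly as follows. This is a Python illustration of the control flow only, not the CMake code from this PR. How detection is actually performed is not described, so `detect` is a hypothetical callable supplied by the caller, and `select_build_capabilities` is likewise an invented name.

```python
from typing import Callable, Optional, Sequence


def select_build_capabilities(
    build_all: bool,
    all_caps: Sequence[str],
    detect: Callable[[], Optional[str]],
) -> list[str]:
    """Pick the capabilities to compile for.

    build_all mirrors the BUILD_ALL_SUPPORTED_ARCHS option. `detect` is a
    hypothetical hook returning the local GPU's capability, or None when
    no GPU is found.
    """
    if build_all:
        return list(all_caps)
    try:
        detected = detect()
    except Exception:
        # Treat any detection error like "no GPU found".
        detected = None
    if detected is None:
        # Fall back to the full list, as if build_all were set.
        return list(all_caps)
    return [detected]


def failing_detect() -> Optional[str]:
    raise RuntimeError("no CUDA device")


all_caps = ["7.5", "8.0", "8.9", "9.0"]  # illustrative values
print(select_build_capabilities(False, all_caps, lambda: "8.9"))  # ['8.9']
print(select_build_capabilities(False, all_caps, failing_detect))
print(select_build_capabilities(True, all_caps, lambda: "8.9"))
```

The last two calls both return the full list: once because detection failed, once because the option was set explicitly.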


@MekkCyber MekkCyber left a comment

Thanks for fixing!

@MekkCyber MekkCyber left a comment

lgtm Thank you

@danieldk danieldk merged commit 9ea57a8 into main Dec 18, 2025
28 of 29 checks passed
@danieldk danieldk deleted the redo-capabilities branch December 18, 2025 11:09