Describe the bug
After a MetroCluster SVM failover, the Trident ontap-nas-economy driver cannot discover existing flexvols. After some investigation with debugTraceFlags.api=true, we noticed it searches for flexvols with a snapshot policy of none, while the existing flexvols (on the main site) have the none-DR snapshot policy.
The driver creates new flexvols instead of reusing existing ones with available capacity, causing SVM volume quota exhaustion.
Environment
Provide accurate information about the environment to help us reproduce the issue.
- Trident version: 25.06.2
- Kubernetes version: 1.33
- Kubernetes orchestrator: vanilla
- NetApp backend types: MetroCluster ONTAP
- Other: SVM volume quota limit 100, qtreesPerFlexvol: 200, limitVolumeSize: 5000Gi
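For reference, a minimal backend definition along the lines of what we use (the management LIF, SVM name, backend name, and credentials below are placeholders, not our real values):
{
  "version": 1,
  "storageDriverName": "ontap-nas-economy",
  "backendName": "metrocluster-nas-economy",
  "managementLIF": "10.0.0.1",
  "svm": "svm_name",
  "username": "admin",
  "password": "********",
  "qtreesPerFlexvol": "200",
  "limitVolumeSize": "5000Gi"
}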
To Reproduce
Steps to reproduce the behavior:
- Configure ontap-nas-economy backend with MetroCluster SVM
- Create flexvols and provision qtrees
- Trigger MetroCluster SVM failover (the SVM name changes from name to name-mc, and the flexvols' snapshot policy changes from none to none-DR)
- Provision new PVCs after failover
- Observe: Trident creates new flexvols instead of reusing the existing flexvols with the none-DR policy
- Verify in debug logs: the volume-get-iter ZAPI query includes <snapshot-policy>none</snapshot-policy>, filtering out the none-DR volumes (an illustrative query fragment follows this list)
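For illustration, the filter seen in the debug trace corresponds to a volume-get-iter query roughly like the following (element nesting reconstructed from the ZAPI schema, not copied verbatim from our logs; the flexvol name pattern is only an example):
<volume-get-iter>
  <query>
    <volume-attributes>
      <volume-id-attributes>
        <name>trident_qtree_pool_*</name>
      </volume-id-attributes>
      <volume-snapshot-attributes>
        <snapshot-policy>none</snapshot-policy>
      </volume-snapshot-attributes>
    </volume-attributes>
  </query>
</volume-get-iter>
Because the surviving flexvols now carry the none-DR policy, none of them match this query, so findFlexvolForQtree comes back empty and a new flexvol is created.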
Expected behavior
Trident should discover and reuse existing flexvols regardless of snapshot policy differences (e.g., none vs none-DR).
Additional context
I've implemented a hack/workaround that removes the snapshot-policy filter from the query (we are not interested in that value in our setup), so that the query returns both none-DR and none flexvols:
patch
diff --git a/storage_drivers/ontap/api/ontap_zapi.go b/storage_drivers/ontap/api/ontap_zapi.go
index ba1021f0..77780b7a 100644
--- a/storage_drivers/ontap/api/ontap_zapi.go
+++ b/storage_drivers/ontap/api/ontap_zapi.go
@@ -1637,7 +1637,12 @@ func (c Client) VolumeListByAttrs(
 	if snapReserve >= 0 {
 		queryVolSpaceAttrs.SetPercentageSnapshotReserve(snapReserve)
 	}
-	queryVolSnapshotAttrs := azgo.NewVolumeSnapshotAttributesType().SetSnapshotPolicy(snapshotPolicy)
+	queryVolSnapshotAttrs := azgo.NewVolumeSnapshotAttributesType()
+	// Only filter by snapshot policy if specified (non-empty)
+	// This allows finding flexvols with different snapshot policies (e.g., "none-DR" vs "none")
+	if snapshotPolicy != "" {
+		queryVolSnapshotAttrs.SetSnapshotPolicy(snapshotPolicy)
+	}
 	if snapshotDir != nil {
 		queryVolSnapshotAttrs.SetSnapdirAccessEnabled(*snapshotDir)
 	}
diff --git a/storage_drivers/ontap/ontap_nas_qtree.go b/storage_drivers/ontap/ontap_nas_qtree.go
index a04852d5..f8314a63 100644
--- a/storage_drivers/ontap/ontap_nas_qtree.go
+++ b/storage_drivers/ontap/ontap_nas_qtree.go
@@ -1443,12 +1443,14 @@ func (d *NASQtreeStorageDriver) findFlexvolForQtree(
 	}
 
 	// Get all volumes matching the specified attributes
+	// Note: Do not filter by SnapshotPolicy to allow discovery of flexvols with different
+	// snapshot policies (e.g., "none-DR" vs "none" in MetroCluster environments)
 	volAttrs := &api.Volume{
 		Aggregates: []string{aggregate},
 		Encrypt: enableEncryption,
 		Name: d.FlexvolNamePrefix() + "*",
 		SnapshotDir: convert.ToPtr(enableSnapshotDir),
-		SnapshotPolicy: snapshotPolicy,
+		SnapshotPolicy: "", // Empty = don't filter by snapshot policy
 		SpaceReserve: spaceReserve,
 		SnapshotReserve: snapshotReserveInt,
 		TieringPolicy: tieringPolicy,
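As a side note, the policy mismatch is easy to confirm on the storage side with an ONTAP CLI query along these lines (SVM name and flexvol prefix are placeholders):
volume show -vserver <svm-name> -volume <flexvol-prefix>* -fields snapshot-policy
On the main site after failover, this reports none-DR for the flexvols that Trident previously created with the none policy.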