Description
Ceph RGW QuotaExceeded Analysis for Bucket EllrottLab
Context
The application is receiving an S3-style error:
<Error>
<Code>QuotaExceeded</Code>
<BucketName>EllrottLab</BucketName>
...
</Error>
The object store backend is Ceph RGW (RADOS Gateway), not AWS S3. In Ceph RGW, QuotaExceeded indicates that a configured quota limit has been reached. It does not necessarily mean that the physical cluster is out of raw capacity, and it does not mean that the system is throttling request rates.
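When handling this error in client-side scripts, it can help to pull the Code and BucketName values out of the response body before deciding what to do. The following is a minimal sketch under stated assumptions: xml_field is a hypothetical helper (not part of any Ceph or AWS tool), it does a plain-text match rather than real XML parsing, and the sample body contains only the two fields shown in the error above.

```shell
# Print the text content of a <TAG>...</TAG> element read from stdin.
# Hypothetical helper: a plain-text match, not a real XML parser.
# If the tag occurs more than once, the last occurrence is printed.
xml_field() {
    tag=$1
    tr -d '\n' | sed -n "s:.*<${tag}>\([^<]*\)</${tag}>.*:\1:p"
}

# Sample body modeled on the error above; elided fields are left out.
body='<Error><Code>QuotaExceeded</Code><BucketName>EllrottLab</BucketName></Error>'

code=$(printf '%s' "$body" | xml_field Code)
bucket=$(printf '%s' "$body" | xml_field BucketName)
echo "code=$code bucket=$bucket"
```

For anything beyond a quick diagnostic script, a proper XML parser is the safer choice.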
How Ceph RGW Quotas Work
Ceph RGW supports quotas at two primary scopes:
-
User quota
- Limits total usage for a given RGW user (across all buckets).
- Controls:
  - max_size: maximum total bytes used.
  - max_objects: maximum number of objects.
-
Bucket quota
- Limits usage for a specific bucket.
- Controls:
  - max_size: maximum bytes in that bucket.
  - max_objects: maximum number of objects in that bucket.
When either a user quota or a bucket quota is enabled and exceeded, RGW returns:
<Code>QuotaExceeded</Code>
for subsequent write operations (PUT Object, multipart upload, CopyObject, etc.) that would increase usage.
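The check described above can be modeled for a single quota dimension (bytes or object count) as follows. This is a simplified sketch for reasoning about the behavior, not RGW's actual implementation; would_exceed is a hypothetical name, and the sketch assumes a negative limit means "no limit", matching the --max-objects=-1 convention used later in this issue.

```shell
# Simplified model of one quota dimension (bytes or object count).
# Args: current_usage  size_of_new_write  limit
# Assumption: a negative limit means "no limit".
# Returns 0 (true) when the write would push usage past the limit.
would_exceed() {
    current=$1; delta=$2; limit=$3
    [ "$limit" -ge 0 ] && [ $((current + delta)) -gt "$limit" ]
}

if would_exceed 1000 200 1100; then
    echo "write rejected: QuotaExceeded"
fi
if ! would_exceed 1000 200 -1; then
    echo "write accepted: no limit configured"
fi
```

In this model a write is rejected when either enabled scope (bucket or user) reports that either dimension would be exceeded.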
What QuotaExceeded Is and Is Not
It IS:
- A signal that a configured quota has been reached:
  - Bucket size (max_size for the bucket), or
  - Bucket object count (max_objects for the bucket), or
  - User total size (max_size across all buckets), or
  - User total object count (max_objects across all buckets).
It is NOT:
- AWS S3 throttling (SlowDown, ServiceUnavailable, etc.).
- Direct evidence of cluster-wide space exhaustion (though that can co-exist).
- A transient, retryable rate-limit error. It will persist until the quota is adjusted or usage is reduced.
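For client retry logic, the distinction above amounts to treating QuotaExceeded as non-retryable. A minimal sketch, assuming the client already has the error code as a string; is_retryable_error is a hypothetical helper, and the list of transient codes contains only the two throttling codes named above.

```shell
# Decide whether a client should retry after an S3-style error code.
# Assumption: only the throttling codes named above count as transient;
# extend the list to match your client's retry policy.
is_retryable_error() {
    case $1 in
        SlowDown|ServiceUnavailable) return 0 ;;
        *) return 1 ;;   # QuotaExceeded lands here: retrying will not help
    esac
}

for code in SlowDown QuotaExceeded; do
    if is_retryable_error "$code"; then
        echo "$code: retry with backoff"
    else
        echo "$code: do not retry; fix the underlying condition"
    fi
done
```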
Diagnostic Procedure
The goal is to determine:
- Whether the quota violation is at the bucket level or the user level.
- Which parameter (size vs object count) is responsible.
Step 1: Inspect bucket stats and bucket quota
-
Use radosgw-admin bucket stats to view current bucket usage and owner:
radosgw-admin bucket stats --bucket=EllrottLab | jq .
This shows:
- Current size and object count.
- The owner field, which is the RGW user that owns the bucket.
-
Check bucket-level quota configuration. The bucket's quota settings are reported in the bucket_quota field of the bucket stats output:
radosgw-admin bucket stats --bucket=EllrottLab | jq '.bucket_quota'
Here you will see:
- Whether the bucket quota is enabled.
- Any values set for max_size and max_objects.
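To see usage and limits side by side, the bucket stats JSON can be reduced to a few lines with jq. This is a sketch under assumptions: bucket_quota_summary is a hypothetical helper, the field names (usage."rgw.main".size, usage."rgw.main".num_objects, bucket_quota.*) follow the layout bucket stats typically produces and should be checked against your release, and every value in sample_stats is invented for illustration.

```shell
# Summarize usage and bucket quota from `radosgw-admin bucket stats` JSON
# read on stdin. Field names are assumptions; verify them on your cluster.
bucket_quota_summary() {
    jq -r '
        (.usage["rgw.main"] // {}) as $u
        | "owner=\(.owner)",
          "size_bytes=\($u.size // 0) max_size=\(.bucket_quota.max_size)",
          "objects=\($u.num_objects // 0) max_objects=\(.bucket_quota.max_objects)",
          "quota_enabled=\(.bucket_quota.enabled)"
    '
}

# Hypothetical sample; the owner name and all numbers are made up.
sample_stats='{
  "bucket": "EllrottLab",
  "owner": "example-user",
  "usage": {"rgw.main": {"size": 1099511627776, "num_objects": 52000}},
  "bucket_quota": {"enabled": true, "max_size": 1099511627776, "max_objects": -1}
}'
```

Against a live cluster the helper would be fed directly: radosgw-admin bucket stats --bucket=EllrottLab | bucket_quota_summary. With the sample, printf '%s' "$sample_stats" | bucket_quota_summary shows the size at its limit and no object-count limit.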
Step 2: Determine bucket owner and inspect user quota
-
Extract the bucket owner:
OWNER=$(radosgw-admin bucket stats --bucket=EllrottLab | jq -r '.owner')
echo "Bucket owner (RGW user): $OWNER"
-
Inspect the user’s quota and usage:
radosgw-admin user info --uid="$OWNER" | jq '.user_quota, .bucket_quota'
radosgw-admin user stats --uid="$OWNER" --sync-stats
This shows:
- Whether user quotas are enabled.
- max_size / max_objects for the user (user_quota), plus the per-bucket quota configured on the user (bucket_quota).
- Aggregate usage across all buckets for this user (from user stats).
Step 3: Decide which quota is limiting
Compare:
- Bucket usage vs bucket quota (max_size, max_objects).
- User usage vs user quota.
The first scope where usage meets or exceeds the configured limit is what is generating QuotaExceeded.
Examples:
- If the EllrottLab bucket is ~1 TB and bucket max_size is 1T, and bucket quota is enabled → bucket quota is limiting.
- If the user’s aggregate usage across multiple buckets hits max_size while the bucket quota is generous or disabled → user quota is limiting.
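The comparison in this step can be sketched as a small function applied once per scope (bucket, then user). limiting_dimension is a hypothetical helper; it assumes negative limits mean "no limit" and that the usage and limit values have already been read from the commands in Steps 1 and 2. The numbers below are illustrative and treat 1T as 1024^4 bytes.

```shell
# Report which dimension of one quota scope is at or over its limit.
# Args: enabled(true|false) used_bytes max_size used_objects max_objects
# Assumption: a negative limit means "no limit".
# Prints one of: none, size, objects, size+objects
limiting_dimension() {
    enabled=$1; used_bytes=$2; max_size=$3; used_objs=$4; max_objs=$5
    result=none
    if [ "$enabled" = true ]; then
        if [ "$max_size" -ge 0 ] && [ "$used_bytes" -ge "$max_size" ]; then
            result=size
        fi
        if [ "$max_objs" -ge 0 ] && [ "$used_objs" -ge "$max_objs" ]; then
            if [ "$result" = size ]; then
                result="size+objects"
            else
                result=objects
            fi
        fi
    fi
    echo "$result"
}

# Bucket scope: usage equals a 1T size limit, no object limit.
echo "bucket: $(limiting_dimension true 1099511627776 1099511627776 52000 -1)"
# User scope: quota disabled, so nothing is limiting at this scope.
echo "user:   $(limiting_dimension false 1099511627776 -1 52000 -1)"
```

A result other than none for a scope identifies both the scope and the parameter to adjust.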
Remediation Options
Depending on policy, you can:
-
Increase the bucket quota (if bucket quota is the limiting factor):
radosgw-admin quota set --quota-scope=bucket --bucket=EllrottLab --max-size=10T --max-objects=-1
radosgw-admin quota enable --quota-scope=bucket --bucket=EllrottLab
- --max-size=10T: set maximum bucket size to 10 TB (adjust to your environment).
- --max-objects=-1: disable object-count limit.
-
Increase the user quota (if user quota is the limiting factor):
radosgw-admin quota set --quota-scope=user --uid="$OWNER" --max-size=50T --max-objects=-1
radosgw-admin quota enable --quota-scope=user --uid="$OWNER"
- --max-size=50T: increase total allowed size across all buckets for this user.
-
Reduce usage (if policy restricts quotas):
- Delete unneeded objects from the bucket.
- Apply lifecycle policies (e.g., expiration on old data) and wait for them to take effect.
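When choosing a new limit, it can be useful to check how much headroom a candidate value leaves over current usage. The sketch below uses two hypothetical helpers, size_to_bytes and percent_used. It assumes size suffixes are binary multiples (K = 1024 bytes, and so on) and accepts whole numbers only; confirm how your radosgw-admin release interprets suffixes before relying on the result.

```shell
# Convert a size such as 10T to bytes.
# Assumption: suffixes are binary multiples (K=1024, M=1024^2, ...).
# Only whole numbers are handled; a bare number is taken as bytes.
size_to_bytes() {
    num=${1%[KMGT]}
    case $1 in
        *K) echo $((num * 1024)) ;;
        *M) echo $((num * 1024 * 1024)) ;;
        *G) echo $((num * 1024 * 1024 * 1024)) ;;
        *T) echo $((num * 1024 * 1024 * 1024 * 1024)) ;;
        *)  echo "$num" ;;
    esac
}

# Percentage of a limit already used (integer, rounded down).
# Args: used_bytes limit_bytes
percent_used() {
    echo $(( $1 * 100 / $2 ))
}

new_limit=$(size_to_bytes 10T)
echo "candidate limit: $new_limit bytes"
# Illustrative usage figure: 1024^4 bytes currently stored.
echo "usage after change: $(percent_used 1099511627776 "$new_limit")% of limit"
```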
Post-Change Verification
After adjusting quotas, verify that:
-
The bucket and user quotas are now high enough compared to current usage:
radosgw-admin bucket stats --bucket=EllrottLab | jq '.usage, .bucket_quota'
radosgw-admin user info --uid="$OWNER" | jq '.user_quota'
radosgw-admin user stats --uid="$OWNER" --sync-stats
-
New uploads no longer return QuotaExceeded.
If QuotaExceeded persists despite apparently generous quotas, investigate:
- Whether quotas are being applied at the correct scope (e.g., updating the wrong user).
- Any higher-level pool or namespace constraints (e.g., underlying Ceph pool near full).
Summary
For Ceph RGW:
- QuotaExceeded is a logical limit error, driven by the RGW quota subsystem, not a generic storage or transaction-rate failure.
- The typical resolution path is:
  - Identify whether bucket or user quotas are limiting.
  - Adjust max_size and/or max_objects in line with storage policy.
  - Re-verify usage and resume uploads.
This should be treated as an intentional guardrail, and quota changes should follow your organization’s storage governance and approval processes.