Feature Request: Add Base64 ↔ Base64URL Conversion/Normalization
Background
When working with Microsoft Graph API and other services, developers often encounter identifier encoding differences between Base64 and Base64URL formats. A common example is the conversationId field in Microsoft Graph API responses:
- Microsoft Graph API returns conversationId values in Base64URL-safe format (uses - and _)
- Outlook UI may display the same identifier in Base64 format (uses + and /)
These are functionally identical encodings of the same binary data, but the character differences (+ ↔ -, / ↔ _) can cause issues when:
- Comparing conversation IDs from different sources
- Filtering/querying by conversation ID
- Matching identifiers between systems
Current State
The ut base64 tool currently supports:
- ✅ Encoding to Base64 (standard)
- ✅ Encoding to Base64URL (with --urlsafe flag)
- ✅ Decoding from Base64 (standard)
- ✅ Decoding from Base64URL (with --urlsafe flag)
Problem
To convert between Base64 and Base64URL formats, users currently need to:
- Decode the Base64 string (requires valid encoding)
- Re-encode with the other format
Limitations:
- This approach requires the Base64 string to be valid and decodable
- It's a two-step process that's not intuitive
- For identifier normalization (like conversationId), you often just need character substitution without decoding
Proposed Solution
Add a new subcommand convert (or normalize) to the base64 tool that directly converts between Base64 and Base64URL formats without decoding/encoding.
Proposed API
# Convert Base64 to Base64URL
ut base64 convert "SGVsbG8rV29ybGQvVGVzdA==" --to urlsafe
# Convert Base64URL to Base64
ut base64 convert "SGVsbG8rV29ybGQvVGVzdA==" --to standard
# Or with shorthand flags
ut base64 convert "SGVsbG8rV29ybGQvVGVzdA==" --urlsafe
ut base64 convert "SGVsbG8rV29ybGQvVGVzdA==" --standard
Character Conversion Rules
- Base64 → Base64URL:
  - + → -
  - / → _
  - Padding = can be preserved or removed (configurable)
- Base64URL → Base64:
  - - → +
  - _ → /
  - Padding = should be preserved if present
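The rules above could be sketched roughly as follows. The `Padding` enum and the function names `to_urlsafe` and `to_standard` are assumptions made for illustration, not part of the existing ut codebase, and the padding policy is one possible reading of "configurable".

```rust
/// Hypothetical padding policy; the issue leaves this choice open.
#[derive(Clone, Copy)]
pub enum Padding {
    Preserve,
    Strip,
}

/// Base64 -> Base64URL by character substitution only (no decoding).
pub fn to_urlsafe(input: &str, padding: Padding) -> String {
    let swapped: String = input
        .chars()
        .map(|c| match c {
            '+' => '-',
            '/' => '_',
            other => other,
        })
        .collect();
    match padding {
        Padding::Preserve => swapped,
        // Trailing '=' characters carry no data, so they can be dropped.
        Padding::Strip => swapped.trim_end_matches('=').to_string(),
    }
}

/// Base64URL -> Base64 by character substitution only.
/// Padding that is already present is left untouched.
pub fn to_standard(input: &str) -> String {
    input
        .chars()
        .map(|c| match c {
            '-' => '+',
            '_' => '/',
            other => other,
        })
        .collect()
}
```

Because nothing is decoded, these functions also accept truncated or otherwise undecodable input and simply pass unrelated characters through unchanged.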
Use Cases
1. Microsoft Graph API Conversation IDs
# Normalize conversationId from Graph API to match Outlook UI format
ut base64 convert "AAABAI..." --to standard
# Normalize conversationId from Outlook UI to match Graph API format
ut base64 convert "AAABAI..." --to urlsafe
2. JWT Token Format Conversion
JWT segments are encoded as Base64URL without padding (RFC 7515), but sometimes need to be converted to standard Base64 for compatibility with certain systems.
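Since JWT segments omit padding, converting one to standard Base64 usually means restoring the `=` characters as well as swapping the alphabet. A minimal sketch, assuming ASCII input; the function name `segment_to_standard` is hypothetical:

```rust
/// Convert an unpadded Base64URL segment (for example a JWT header or payload)
/// to padded standard Base64 without decoding it.
/// Returns None when the length cannot belong to any Base64 string.
pub fn segment_to_standard(segment: &str) -> Option<String> {
    let mut out: String = segment
        .chars()
        .map(|c| match c {
            '-' => '+',
            '_' => '/',
            other => other,
        })
        .collect();
    // Base64 text is a multiple of 4 characters once padded.
    match out.len() % 4 {
        0 => {}
        2 => out.push_str("=="),
        3 => out.push('='),
        // A remainder of 1 can never be produced by a Base64 encoder.
        _ => return None,
    }
    Some(out)
}
```

For instance, `segment_to_standard("-w")` yields `Some("+w==")`, while a five-character input yields `None`.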
3. API Integration
When integrating with APIs that use different Base64 encoding variants, quick conversion is needed without understanding the underlying data.
4. Identifier Matching
Comparing identifiers from different sources that use different Base64 encoding variants.
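For matching, it may be enough to reduce both identifiers to one canonical form before comparing. The sketch below picks URL-safe without padding as that form, which is an arbitrary choice; `canonical` and `same_identifier` are hypothetical names, and the identifiers used in the usage note are made up.

```rust
/// Reduce an identifier to a canonical form: URL-safe alphabet, no padding.
fn canonical(id: &str) -> String {
    id.trim_end_matches('=')
        .chars()
        .map(|c| match c {
            '+' => '-',
            '/' => '_',
            other => other,
        })
        .collect()
}

/// True when two identifiers differ only in Base64 alphabet or padding.
/// The comparison stays case-sensitive, since Base64 is case-sensitive.
pub fn same_identifier(a: &str, b: &str) -> bool {
    canonical(a) == canonical(b)
}
```

With this, `same_identifier("AAQk+ab/cd==", "AAQk-ab_cd")` is true even though the raw strings differ.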
Implementation Considerations
- Character substitution only: The conversion should be a simple string transformation without decoding/encoding
- Padding handling: Consider whether to preserve, remove, or normalize padding (=)
- Validation: Should we validate that the input is valid Base64 before conversion?
- Edge cases: Handle strings that contain both Base64 and Base64URL characters
- Auto-detection: Optionally detect the input format and convert to the opposite format
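The validation, edge-case, and auto-detection points above could share one classification pass over the input. This is only a sketch of one possible design; the `Detected` enum and `detect` function are assumptions, and it checks characters and padding placement only, not whether the string decodes.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum Detected {
    Standard,  // contains '+' or '/'
    UrlSafe,   // contains '-' or '_'
    Ambiguous, // only characters shared by both alphabets
    Mixed,     // characters from both alphabets
    Invalid,   // any other character, or misplaced padding
}

pub fn detect(input: &str) -> Detected {
    // Padding may only appear at the end, and at most twice.
    let body = input.trim_end_matches('=');
    if input.len() - body.len() > 2 {
        return Detected::Invalid;
    }
    let (mut std_only, mut url_only) = (false, false);
    for c in body.chars() {
        match c {
            'A'..='Z' | 'a'..='z' | '0'..='9' => {}
            '+' | '/' => std_only = true,
            '-' | '_' => url_only = true,
            // Includes '=' in the middle of the string.
            _ => return Detected::Invalid,
        }
    }
    match (std_only, url_only) {
        (true, true) => Detected::Mixed,
        (true, false) => Detected::Standard,
        (false, true) => Detected::UrlSafe,
        (false, false) => Detected::Ambiguous,
    }
}
```

Note that the sample string from the Proposed API section, `SGVsbG8rV29ybGQvVGVzdA==`, classifies as `Ambiguous`: it contains none of the four distinguishing characters, so auto-detection alone cannot pick a direction for it.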
Example Implementation (Pseudocode)
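// Target alphabet for the conversion.
enum Format {
    UrlSafe,
    Standard,
}

fn convert_base64_format(input: &str, target: Format) -> String {
    match target {
        Format::UrlSafe => input
            .replace('+', "-")
            .replace('/', "_")
            .trim_end_matches('=') // Optional: remove padding
            .to_string(),
        Format::Standard => {
            let mut out = input.replace('-', "+").replace('_', "/");
            // Add padding until the length is a multiple of 4.
            // Assumes the input length is valid (length % 4 != 1).
            while out.len() % 4 != 0 {
                out.push('=');
            }
            out
        }
    }
}
Alternative Approaches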
- Add flags to existing encode/decode: Add --convert-to-urlsafe flag to encode command
- Separate tool: Create a new base64-normalize or base64-convert tool
- Subcommand: Add convert as a subcommand under base64 (preferred)
Questions for Discussion
- Should we validate the input format before conversion?
- How should we handle padding? (preserve, remove, normalize)
- Should we support auto-detection of input format?
- What should be the default behavior if format is ambiguous?
- Should this be a separate subcommand or integrated into existing encode/decode?
Related Resources
- RFC 4648 - Base64, Base64URL encodings
- Microsoft Graph API conversationId documentation
- Discussion around Base64URL normalization in various ecosystems
Labels: enhancement, feature-request, base64, encoding