Feature Request: Add Base64 ↔ Base64URL Conversion/Normalization #22

@nooscraft

Background

When working with Microsoft Graph API and other services, developers often encounter identifier encoding differences between Base64 and Base64URL formats. A common example is the conversationId field in Microsoft Graph API responses:

  • Microsoft Graph API returns conversationId values in Base64URL (URL-safe) format (uses - and _)
  • Outlook UI may display the same identifier in Base64 format (uses + and /)

These are functionally identical encodings of the same binary data, but the character differences (+ vs. -, / vs. _) can cause issues when:

  • Comparing conversation IDs from different sources
  • Filtering/querying by conversation ID
  • Matching identifiers between systems

Current State

The ut base64 tool currently supports:

  • ✅ Encoding to Base64 (standard)
  • ✅ Encoding to Base64URL (with --urlsafe flag)
  • ✅ Decoding from Base64 (standard)
  • ✅ Decoding from Base64URL (with --urlsafe flag)

Problem

To convert between Base64 and Base64URL formats, users currently need to:

  1. Decode the Base64 string (requires valid encoding)
  2. Re-encode with the other format

Limitations:

  • This approach requires the Base64 string to be valid and decodable
  • It's a two-step process that's not intuitive
  • For identifier normalization (like conversationId), you often just need character substitution without decoding

Proposed Solution

Add a new subcommand convert (or normalize) to the base64 tool that directly converts between Base64 and Base64URL formats without decoding/encoding.

Proposed API

# Convert Base64 to Base64URL ("+/8=" is standard Base64 for the bytes 0xFB 0xFF)
ut base64 convert "+/8=" --to urlsafe

# Convert Base64URL to Base64
ut base64 convert "-_8=" --to standard

# Or with shorthand flags
ut base64 convert "+/8=" --urlsafe
ut base64 convert "-_8=" --standard

Character Conversion Rules

  • Base64 → Base64URL:

    • + → -
    • / → _
    • Padding (=) can be preserved or removed (configurable)
  • Base64URL → Base64:

    • - → +
    • _ → /
    • Padding (=) should be preserved if present
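
The padding choices in the rules above could be modelled as an explicit policy. The following is a minimal sketch, not the tool's actual code: the `Padding` enum, `apply_padding`, and `to_urlsafe` are hypothetical names introduced only for illustration.

```rust
/// How to treat trailing `=` padding (hypothetical option set, not part of `ut`).
#[derive(Clone, Copy)]
enum Padding {
    Preserve,  // leave the input's padding untouched
    Remove,    // strip all trailing '='
    Normalize, // strip, then re-pad to a multiple of 4
}

/// Applies the padding policy to an already-substituted string.
fn apply_padding(s: &str, policy: Padding) -> String {
    match policy {
        Padding::Preserve => s.to_string(),
        Padding::Remove => s.trim_end_matches('=').to_string(),
        Padding::Normalize => {
            let mut out = s.trim_end_matches('=').to_string();
            // A remainder of 1 cannot occur in valid Base64, so it is left as-is.
            match out.len() % 4 {
                2 => out.push_str("=="),
                3 => out.push('='),
                _ => {}
            }
            out
        }
    }
}

/// Base64 → Base64URL by character substitution, then the chosen padding policy.
fn to_urlsafe(input: &str, policy: Padding) -> String {
    apply_padding(&input.replace('+', "-").replace('/', "_"), policy)
}
```

With this shape, `to_urlsafe("+/8=", Padding::Remove)` would yield `-_8`, while `Padding::Preserve` would keep the trailing `=`.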

Use Cases

1. Microsoft Graph API Conversation IDs

# Normalize conversationId from Graph API to match Outlook UI format
ut base64 convert "AAABAI..." --to standard

# Normalize conversationId from Outlook UI to match Graph API format  
ut base64 convert "AAABAI..." --to urlsafe

2. JWT Token Format Conversion

JWT segments (header, payload, signature) are Base64URL-encoded without padding, but sometimes need to be converted to standard Base64 for compatibility with systems that only accept the standard alphabet.

3. API Integration

When integrating with APIs that use different Base64 encoding variants, quick conversion is needed without understanding the underlying data.

4. Identifier Matching

Comparing identifiers from different sources that use different Base64 encoding variants.
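
For the matching use case, one option is to reduce both identifiers to a single canonical form before comparing. The sketch below assumes the two identifiers encode the same bytes and differ only in alphabet or padding. `canonical_id` and `same_id` are hypothetical helper names, and choosing unpadded URL-safe as the canonical form is an arbitrary assumption.

```rust
/// Reduces an identifier to a canonical form: URL-safe alphabet, no padding.
/// Pure character substitution; the identifier is never decoded.
fn canonical_id(id: &str) -> String {
    id.trim_end_matches('=')
        .chars()
        .map(|c| match c {
            '+' => '-',
            '/' => '_',
            other => other,
        })
        .collect()
}

/// Returns true when two identifiers differ only by Base64 variant or padding.
fn same_id(a: &str, b: &str) -> bool {
    canonical_id(a) == canonical_id(b)
}
```

Because nothing is decoded, this also works on identifiers that are not strictly valid Base64, which is the situation the issue describes.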

Implementation Considerations

  1. Character substitution only: The conversion should be a simple string transformation without decoding/encoding
  2. Padding handling: Consider whether to preserve, remove, or normalize padding (=)
  3. Validation: Should we validate that the input is valid Base64 before conversion?
  4. Edge cases: Handle strings that contain both Base64 and Base64URL characters
  5. Auto-detection: Optionally detect the input format and convert to the opposite format
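
Points 3 to 5 could share one classification pass over the input. This is a rough sketch under the assumption that detection looks only at which alphabet-specific characters appear. The `Detected` enum and `detect_format` are hypothetical names, and the check deliberately ignores length and the number of trailing `=` characters.

```rust
/// Result of inspecting an input string (hypothetical classification).
#[derive(Debug, PartialEq)]
enum Detected {
    Standard,  // contains '+' or '/', but no '-' or '_'
    UrlSafe,   // contains '-' or '_', but no '+' or '/'
    Ambiguous, // only characters shared by both alphabets
    Mixed,     // characters from both alphabets (the edge case in point 4)
    Invalid,   // a character outside both alphabets, or '=' before the end
}

fn detect_format(input: &str) -> Detected {
    // Trailing padding is ignored; a '=' anywhere else falls through to Invalid.
    let body = input.trim_end_matches('=');
    let mut has_std = false;
    let mut has_url = false;
    for c in body.chars() {
        match c {
            'A'..='Z' | 'a'..='z' | '0'..='9' => {}
            '+' | '/' => has_std = true,
            '-' | '_' => has_url = true,
            _ => return Detected::Invalid,
        }
    }
    match (has_std, has_url) {
        (true, true) => Detected::Mixed,
        (true, false) => Detected::Standard,
        (false, true) => Detected::UrlSafe,
        (false, false) => Detected::Ambiguous,
    }
}
```

A string such as `SGVsbG8rV29ybGQvVGVzdA==` is classified as `Ambiguous` because it contains none of `+`, `/`, `-`, or `_`. For such inputs, conversion in either direction only affects padding.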

Example Implementation (Pseudocode)

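/// Target alphabet for the conversion.
enum Format {
    UrlSafe,
    Standard,
}

/// Converts between the Base64 and Base64URL alphabets by character
/// substitution only; the input is never decoded.
fn convert_base64_format(input: &str, target: Format) -> String {
    match target {
        Format::UrlSafe => input
            .replace('+', "-")
            .replace('/', "_")
            .trim_end_matches('=') // Optional: remove padding
            .to_string(),
        Format::Standard => {
            let mut out = input.replace('-', "+").replace('_', "/");
            // Add padding if needed (assumes the length is never 1 mod 4, as in valid Base64)
            while out.len() % 4 != 0 {
                out.push('=');
            }
            out
        }
    }
}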

Alternative Approaches

  1. Add flags to existing encode/decode: Add --convert-to-urlsafe flag to encode command
  2. Separate tool: Create a new base64-normalize or base64-convert tool
  3. Subcommand: Add convert as a subcommand under base64 (preferred)

Questions for Discussion

  1. Should we validate the input format before conversion?
  2. How should we handle padding? (preserve, remove, normalize)
  3. Should we support auto-detection of input format?
  4. What should be the default behavior if format is ambiguous?
  5. Should this be a separate subcommand or integrated into existing encode/decode?

Labels: enhancement, feature-request, base64, encoding
