Ldrozdz93/azure blob storage source #1
| Example connection string: | ||
| ```text | ||
| DefaultEndpointsProtocol=https;AccountName=myaccount;AccountKey=mykey;EndpointSuffix=core.windows.net |
Check failure
Code scanning / check-spelling
Unrecognized Spelling Error
| container_name = "logs" | ||
| [sources.azure_logs.queue] | ||
| queue_name = "eventgrid" |
Check failure
Code scanning / check-spelling
Unrecognized Spelling Error
Pull request overview
This pull request introduces a new azure_blob source for Vector that enables reading logs from Azure Blob Storage via Event Grid notifications delivered through Azure Storage Queues.
Changes:
- New Azure Blob Storage source with queue-based event processing
- Support for compression (gzip, zstd), multiple codecs, and multiline aggregation
- Comprehensive unit and integration tests
- Documentation files and configuration examples
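The compression support listed above could, for instance, rely on magic-byte sniffing. The following is a minimal sketch under that assumption, not the PR's implementation: `Compression` and `detect_compression` are hypothetical names, and the actual source may instead use blob metadata or the file extension.

```rust
/// Compression formats named in this PR, plus `None` for uncompressed data.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Compression {
    Gzip,
    Zstd,
    None,
}

/// Guess the compression from the first bytes of a blob.
/// Hypothetical helper: the PR's real detection logic is not shown on this page.
fn detect_compression(head: &[u8]) -> Compression {
    // gzip members start with the two-byte magic 0x1f 0x8b (RFC 1952).
    const GZIP_MAGIC: [u8; 2] = [0x1f, 0x8b];
    // zstd frames start with magic number 0xFD2FB528, stored little-endian (RFC 8878).
    const ZSTD_MAGIC: [u8; 4] = [0x28, 0xb5, 0x2f, 0xfd];

    if head.starts_with(&GZIP_MAGIC) {
        Compression::Gzip
    } else if head.starts_with(&ZSTD_MAGIC) {
        Compression::Zstd
    } else {
        // Anything else, including an empty blob, is treated as uncompressed.
        Compression::None
    }
}
```

Sniffing only the first four bytes keeps detection cheap, since the blob does not have to be buffered before a decoder is chosen.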
Reviewed changes
Copilot reviewed 20 out of 21 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| src/sources/azure_blob/mod.rs | Main source implementation with streaming and event processing |
| src/sources/azure_blob/queue.rs | Queue integration, blob retrieval, and Event Grid message processing |
| src/sources/azure_blob/test.rs | Unit tests for compression detection and blob processing |
| src/sources/azure_blob/integration_tests.rs | Integration tests covering various scenarios |
| src/internal_events/azure_queue.rs | Internal event definitions for metrics and logging |
| website/cue/reference/components/sources/azure_blob.cue | Component documentation and metadata |
| Cargo.toml | Dependency and feature flag additions |
| tests/integration/azure/config/*.yaml | Integration test configuration |
| testing/github-XXXXX/* | Test artifacts and documentation (should be removed) |
src/internal_events/mod.rs
Outdated
| pub(crate) use self::aws_kinesis_firehose::*; | ||
| #[cfg(any(feature = "sources-aws_s3", feature = "sources-aws_sqs",))] | ||
| pub(crate) use self::aws_sqs::*; | ||
| // #[cfg(feature = "sources-azure_blob")] |
Copilot
AI
Jan 20, 2026
The commented-out code and feature flag comments should be either removed or properly uncommented. Lines 157-158 and 190-191 contain inconsistent commenting that suggests uncertainty about whether these should be conditionally compiled.
| // #[cfg(feature = "sources-azure_blob")] |
src/sources/azure_blob/queue.rs
Outdated
| ).await { | ||
| Ok(Some(bp)) => yield bp, | ||
| Ok(None) => trace!("Message {msg_id} is ignored, \ | ||
| no blob stream stream created from it. \ |
Copilot
AI
Jan 20, 2026
There's a typo in the log message: "no blob stream stream created" should be "no blob stream created" (remove duplicate "stream").
| no blob stream stream created from it. \ | |
| no blob stream created from it. \ |
| pub struct ClientCredentials { | ||
| /// Tenant ID for Azure authentication. | ||
| pub tenant_id: String, | ||
| /// Client ID for Azure authentication. | ||
| pub client_id: String, | ||
| /// Client secret for Azure authentication. | ||
| pub client_secret: String, | ||
| } |
Copilot
AI
Jan 20, 2026
The fields tenant_id, client_id, and client_secret in ClientCredentials struct should be public to allow configuration deserialization. Without public visibility, users won't be able to configure these authentication credentials.
PR_CHECKLIST.md
Outdated
| --- | ||
| **Last Updated:** 2026-01-01 | ||
| **Status:** Ready for pre-PR work after GitHub issue approval |
Copilot
AI
Jan 20, 2026
The PR includes testing artifacts and internal documentation that should not be committed to the repository. Files like PR_CHECKLIST.md, testing/github-XXXXX/, and test-results.md appear to be development/testing artifacts rather than production code or official documentation.
| **Status:** Ready for pre-PR work after GitHub issue approval |
src/internal_events/azure_queue.rs
Outdated
| counter!( | ||
| "component_errors_total", | ||
| "error_code" => "failed_deleting_azure_queue_event", | ||
| "error_type" => error_type::WRITER_FAILED, |
Copilot
AI
Jan 20, 2026
The error_type is inconsistent: "WRITER_FAILED" is used for a queue message deletion failure, but this should likely be "ACKNOWLEDGMENT_FAILED" to match the error message on line 99.
| "error_type" => error_type::WRITER_FAILED, | |
| "error_type" => error_type::ACKNOWLEDGMENT_FAILED, |
src/lib.rs
Outdated
| #[cfg(feature = "aws-config")] | ||
| pub mod aws; | ||
| #[allow(unreachable_pub)] | ||
| // pub mod azure; |
Copilot
AI
Jan 20, 2026
The commented-out code should be removed or uncommented with proper feature flags. Having commented-out module declarations in production code can lead to confusion and maintenance issues.
| // pub mod azure; |
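As a rough illustration of what the review comments on `src/lib.rs` and `src/internal_events/mod.rs` ask for, the sketch below gates a module and its re-export behind the same feature flag instead of leaving either one commented out. The module body and the `azure_blob_enabled` helper are hypothetical; only the `sources-azure_blob` feature name comes from the diff.

```rust
// Compiled only when the `sources-azure_blob` feature is enabled.
#[cfg(feature = "sources-azure_blob")]
mod azure_queue {
    // Hypothetical placeholder standing in for the real internal events.
    pub fn component() -> &'static str {
        "azure_blob"
    }
}

// The re-export carries the same gate as the module, so the two can never
// disagree about whether the module exists.
#[cfg(feature = "sources-azure_blob")]
#[allow(unused_imports)]
pub(crate) use self::azure_queue::*;

/// Reports at compile time whether the feature was switched on.
fn azure_blob_enabled() -> bool {
    cfg!(feature = "sources-azure_blob")
}
```

When the feature is off, both gated items are removed before compilation, so no dead or commented-out declarations remain in the source.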
5d8d878 to e962d8b
A new `azure_blob` source for reading logs from Azure Blob Storage containers via Azure Storage Queue notifications (Event Grid). Designed for feature parity with the existing `aws_s3` source.

Key features:
- Event-driven architecture using Azure Event Grid via Storage Queue
- Connection string authentication
- Configurable compression (gzip, zstd) with auto-detection
- Configurable framing (newline-delimited, character-delimited, etc.)
- Multiline aggregation for stack traces and multi-line logs
- Event metadata enrichment (container, blob, timestamp)
- Acknowledgement support
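To make the multiline aggregation feature concrete, here is a minimal sketch assuming one simple rule: a line that starts with whitespace continues the previous event, as stack-trace frames usually do. `aggregate_multiline` is a hypothetical helper, and the PR's actual aggregation is configurable and may use different rules.

```rust
/// Merge continuation lines (those starting with whitespace) into the
/// preceding line, so a stack trace becomes a single event.
/// Hypothetical helper for illustration only.
fn aggregate_multiline<'a, I>(lines: I) -> Vec<String>
where
    I: IntoIterator<Item = &'a str>,
{
    let mut events: Vec<String> = Vec::new();
    for line in lines {
        // An indented line is treated as a continuation of the current event.
        let is_continuation = line.starts_with(char::is_whitespace);
        if is_continuation && !events.is_empty() {
            let current = events.last_mut().expect("events is non-empty");
            current.push('\n');
            current.push_str(line);
        } else {
            // Any other line (or a leading continuation line) starts a new event.
            events.push(line.to_string());
        }
    }
    events
}
```

With this rule, an exception header followed by two indented frames and then an unrelated log line yields two events: the whole trace, then the unrelated line.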
e962d8b to 8a72d0d
Summary
This PR adds an Azure Blob Storage source. From the user's perspective, it is intended to work similarly to the AWS S3 source.
Vector configuration
How did you test this PR?
Change Type
Is this a breaking change?
Does this PR include user facing changes?
`no-changelog` label to this PR.
References
Notes
`make build-licenses` was run to regenerate the license inventory.