@rjenkins rjenkins commented Jan 13, 2026

Adds a new File Receiver to Rotel that reads and tails log files and converts them to OpenTelemetry logs. This receiver enables Rotel to ingest logs directly from files (like nginx access logs) without requiring a separate log shipper.

Key capabilities:

  • Glob pattern-based file discovery with include/exclude support
  • Native file system watching (inotify on Linux, FSEvents on macOS) with automatic fallback to polling
  • Built-in parsers for JSON, nginx access/error logs, and custom regex patterns
  • Offset persistence for resume-after-restart support
  • Log rotation handling via inode-based file identification
  • Configurable concurrent file processing with backpressure handling
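
As a rough illustration of the inode-based identification mentioned above, the following sketch derives a file identity from the device ID and inode number on Unix. The `FileId` type and `file_id` function are hypothetical names for this example, not Rotel's actual API.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical identity key: (device ID, inode number).
/// It stays the same when a file is renamed on the same filesystem.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct FileId {
    pub dev: u64,
    pub ino: u64,
}

/// Look up the identity of the file currently at `path` (Unix only).
#[cfg(unix)]
pub fn file_id(path: &Path) -> io::Result<FileId> {
    use std::os::unix::fs::MetadataExt;
    let meta = fs::metadata(path)?;
    Ok(FileId {
        dev: meta.dev(),
        ino: meta.ino(),
    })
}
```

With a key like this, a rename-style rotation (`access.log` moved to `access.log.1`, then a fresh `access.log` created) keeps the old identity on the renamed file and gives the new file a different one, so stored offsets can follow the data rather than the path.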

Design Overview

The receiver uses a coordinator/worker architecture.

The coordinator runs on a single OS thread and maintains exclusive ownership of file state. Workers handle blocking file I/O via tokio::task::spawn_blocking. The design ensures that at most one work item per file is in flight at any time and provides backpressure through bounded channels.
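
A minimal, dependency-free sketch of the two guarantees described above (one in-flight work item per file, and backpressure via a bounded queue). It uses `std::sync::mpsc::sync_channel` in place of the async channels in the PR, and `Coordinator`, `WorkItem`, `try_dispatch`, and `complete` are hypothetical names for illustration only.

```rust
use std::collections::HashSet;
use std::sync::mpsc::{sync_channel, Receiver, SyncSender, TrySendError};

/// Hypothetical work item: "read new data from this file".
#[derive(Debug, PartialEq)]
pub struct WorkItem {
    pub path: String,
}

/// Owns all file state; nothing else mutates `in_flight`.
pub struct Coordinator {
    in_flight: HashSet<String>,
    work_tx: SyncSender<WorkItem>,
}

impl Coordinator {
    /// `capacity` bounds how many work items may be queued for workers.
    pub fn new(capacity: usize) -> (Self, Receiver<WorkItem>) {
        let (work_tx, work_rx) = sync_channel(capacity);
        (Self { in_flight: HashSet::new(), work_tx }, work_rx)
    }

    /// Returns true if a work item was queued. Returns false if the file
    /// already has work in flight, the queue is full, or workers are gone.
    pub fn try_dispatch(&mut self, path: &str) -> bool {
        if self.in_flight.contains(path) {
            return false; // never two work items for the same file
        }
        match self.work_tx.try_send(WorkItem { path: path.to_string() }) {
            Ok(()) => {
                self.in_flight.insert(path.to_string());
                true
            }
            // Full queue is the backpressure signal: retry on a later tick.
            Err(TrySendError::Full(_)) | Err(TrySendError::Disconnected(_)) => false,
        }
    }

    /// Called when a worker reports that it finished with `path`.
    pub fn complete(&mut self, path: &str) {
        self.in_flight.remove(path);
    }
}
```

Because only the coordinator touches `in_flight`, no locking is needed around it; workers communicate solely through the channel and their completion results.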

Files Changed

Core receiver implementation (~5,800 lines):

  • src/receivers/file/receiver.rs - Main coordinator/worker logic
  • src/receivers/file/input/ - File discovery, reading, inode-based identification
  • src/receivers/file/parser/ - JSON, nginx, regex parsers
  • src/receivers/file/persistence/ - File-based offset storage
  • src/receivers/file/watcher/ - Native and poll-based file watchers
  • src/init/file_receiver.rs - CLI argument definitions
  • src/init/agent.rs, args.rs, activation.rs - Wiring into Rotel startup
  • src/bounded_channel.rs - Bounded channel utility
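
To give a feel for what file-based offset storage can look like, here is a small sketch that saves and reloads per-file read offsets. The on-disk format (one `dev ino offset` line per file) and the names `Offsets`, `save_offsets`, and `load_offsets` are assumptions made for this example; the PR's actual format is not shown in this conversation.

```rust
use std::collections::BTreeMap;
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical store: (device ID, inode) -> byte offset already consumed.
pub type Offsets = BTreeMap<(u64, u64), u64>;

/// Write all offsets to a temp file, then rename it over the target so a
/// crash mid-write does not leave a half-written offsets file behind.
pub fn save_offsets(path: &Path, offsets: &Offsets) -> io::Result<()> {
    let mut out = String::new();
    for ((dev, ino), offset) in offsets {
        out.push_str(&format!("{dev} {ino} {offset}\n"));
    }
    let tmp = path.with_extension("tmp");
    fs::write(&tmp, out)?;
    fs::rename(&tmp, path)
}

/// Load offsets; a missing file means "first start" and yields an empty map.
/// Malformed lines are skipped rather than treated as fatal.
pub fn load_offsets(path: &Path) -> io::Result<Offsets> {
    let text = match fs::read_to_string(path) {
        Ok(text) => text,
        Err(e) if e.kind() == io::ErrorKind::NotFound => return Ok(Offsets::new()),
        Err(e) => return Err(e),
    };
    let mut offsets = Offsets::new();
    for line in text.lines() {
        let mut parts = line.split_whitespace().map(|p| p.parse::<u64>());
        if let (Some(Ok(dev)), Some(Ok(ino)), Some(Ok(offset))) =
            (parts.next(), parts.next(), parts.next())
        {
            offsets.insert((dev, ino), offset);
        }
    }
    Ok(offsets)
}
```

On restart, a receiver built this way would look up each discovered file's identity in the loaded map and seek to the stored offset instead of re-reading from the beginning.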

Testing & benchmarking:

  • scripts/benchmark-file-receiver.sh - Performance benchmark comparing Rotel with the OTel Collector
  • scripts/verify-file-receiver.sh - Functional verification script
  • scripts/nginx-log-generator.sh - Test data generator
  • otel_benchmark_builder/ - OTel collector build for comparison testing

Documentation:

  • README.md - File Receiver configuration reference

@mheffner mheffner left a comment

There is a lot here obviously, so I only did a cursory scan and have some higher-level questions. This may be a good one to have some high-level arch diagrams in the docs; maybe Claude could generate them? Happy to continue looking at specific portions, but I imagine lots of testing will help here.

One thought of a possible future improvement would be isolation of the components by unique device IDs monitored. The scenario I was thinking of was if you had files monitored across two attached EBS disks, if one were to lock up/block, would you want to continue operating on the other disk? 🤷 Definitely an optimization for the future, but just a thought I had looking.

Good stuff!

self.process_active_file(path);
}

// Mark files as rotated if they're no longer at a glob-matching path

Could a file be rotated, but still match a glob-matching path?
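
To make the scenario in this question concrete: whether a rotated file still matches depends on the include pattern. The sketch below uses a deliberately tiny matcher that supports only `*` (it is not the glob implementation used in the PR, and `wildcard_match` is a hypothetical helper) to show that a pattern such as `access.log*` keeps matching `access.log.1` after a rename, while `*.log` does not.

```rust
/// Minimal matcher supporting only `*` (any run of characters, including
/// none). Not a full glob: no `?`, character classes, or path awareness.
pub fn wildcard_match(pattern: &str, text: &str) -> bool {
    let (p, t) = (pattern.as_bytes(), text.as_bytes());
    let (mut pi, mut ti) = (0, 0);
    // (pattern index just after the last `*`, text index that `*` has absorbed up to)
    let mut star: Option<(usize, usize)> = None;

    while ti < t.len() {
        if pi < p.len() && p[pi] == b'*' {
            // Start with `*` matching nothing; remember where to backtrack.
            star = Some((pi + 1, ti));
            pi += 1;
        } else if pi < p.len() && p[pi] == t[ti] {
            pi += 1;
            ti += 1;
        } else if let Some((sp, st)) = star {
            // Mismatch: let the last `*` absorb one more character.
            pi = sp;
            ti = st + 1;
            star = Some((sp, st + 1));
        } else {
            return false;
        }
    }
    // Any trailing `*` can match the empty remainder.
    while pi < p.len() && p[pi] == b'*' {
        pi += 1;
    }
    pi == p.len()
}
```

Under a pattern like the first one, a purely path-based check ("no longer at a glob-matching path") would not flag the renamed file, so some identity-based comparison would be needed to tell the rotated file from the live one.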

@rjenkins rjenkins marked this pull request as ready for review January 15, 2026 22:24

@mheffner mheffner left a comment

Some comments here, a lot we discussed already though. I'm good with getting this in and continual testing as we expand use cases.

As to the generated benchmarks, I'd be more inclined to remove them to reduce the size of the repo. I find that stuff can get stale over time, so it may be better to track it in different repos (esp. since some of it is Go).

// Main event loop - flume channels support async recv directly, no bridge needed
loop {
    select! {
        _ = cancel.cancelled() => {

Possibly consider adding some bias to this select?

}

// Process completed workers
Some(result) = worker_futures.next(), if !worker_futures.is_empty() => {

I wonder if moving this arm above the previous one would, with a biased select, help clean up the old workers before spawning a new worker for the same file?
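
For context on the suggestion: `tokio::select!` normally picks a random arm to poll first, and adding `biased;` as its first token makes it poll arms top to bottom. The sketch below imitates that fixed ordering with plain `std` channels and `try_recv` (it is not tokio itself), with cancellation first, completed workers second, and new file events last. The `Event` enum and `next_event` function are hypothetical names for illustration.

```rust
use std::sync::mpsc::Receiver;

#[derive(Debug, PartialEq)]
pub enum Event {
    Cancelled,
    WorkerDone(String),
    FileChanged(String),
    Idle,
}

/// Check event sources in a fixed priority order, mirroring the
/// top-to-bottom polling that `biased;` gives a `select!` block.
pub fn next_event(
    cancel: &Receiver<()>,
    done: &Receiver<String>,
    changed: &Receiver<String>,
) -> Event {
    // 1. Shutdown always wins.
    if cancel.try_recv().is_ok() {
        return Event::Cancelled;
    }
    // 2. Reap finished workers before looking at new work, so a file's
    //    in-flight marker is cleared before another item is spawned for it.
    if let Ok(path) = done.try_recv() {
        return Event::WorkerDone(path);
    }
    // 3. Only then consider new file-change notifications.
    if let Ok(path) = changed.try_recv() {
        return Event::FileChanged(path);
    }
    Event::Idle
}
```

One trade-off to keep in mind with any fixed ordering is starvation: a source that is always ready can keep lower-priority arms from ever running, so the order has to be chosen deliberately.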


let payload_msg = payload::Message::new(None, vec![resource_logs], None);

match logs_output.send(payload_msg).await {

This could block here and prevent the other select arms from running?
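
One possible way to avoid stalling the loop on a full output channel, sketched here with `std` channels rather than the async channel in the PR: queue payloads locally and flush them with a non-blocking `try_send`, stopping as soon as the channel reports it is full. The `flush_pending` helper is a hypothetical name, and this is only one of several options (another would be to make the send its own select arm).

```rust
use std::collections::VecDeque;
use std::sync::mpsc::{SyncSender, TrySendError};

/// Send as many queued payloads as the bounded channel will accept
/// without blocking. Returns how many were sent.
pub fn flush_pending<T>(pending: &mut VecDeque<T>, out: &SyncSender<T>) -> usize {
    let mut sent = 0;
    while let Some(item) = pending.pop_front() {
        match out.try_send(item) {
            Ok(()) => sent += 1,
            Err(TrySendError::Full(item)) => {
                // Downstream is busy: put the item back, keep order, retry later.
                pending.push_front(item);
                break;
            }
            Err(TrySendError::Disconnected(_)) => {
                // Receiver is gone; nothing queued can ever be delivered.
                pending.clear();
                break;
            }
        }
    }
    sent
}
```

To preserve backpressure, a loop using this approach would also need to stop dispatching new work while `pending` is non-empty; otherwise the local queue simply becomes an unbounded buffer.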

severity_number,
severity_text,
body: Some(AnyValue {
    value: Some(any_value::Value::StringValue(line)),

Maybe add a TODO comment here for later about whether to replace the body with something else from the parsed result; for nginx logs it is probably fine to keep it the same.

base64_decode_impl(s.as_bytes())
}

// Simple base64 implementation to avoid adding another dependency

I believe we already have base64 as a dep?

Contributor Author

Yeah I will remove this in a follow up PR 👍

@rjenkins
Contributor Author

> Some comments here, a lot we discussed already though. I'm good with getting this in and continual testing as we expand use cases.
>
> As to the generated benchmarks, I'd be more inclined to remove them to reduce size of repo. I find that stuff can get stale over time, so may be better to track in different repos (esp since some is go).

👍 I've removed the generated code.

@rjenkins rjenkins merged commit 95234ec into main Jan 16, 2026
8 checks passed