@mephinet mephinet commented Feb 7, 2018

We are using InfluxDB::Writer::RememberingFileTailer on our production system under heavy load, with many short-lived parallel processes writing into the stats directory. In this setup, we have witnessed two different race conditions:

Scenario 1

The process writing stats starts and immediately writes a few lines. Later, the IO::Async::File instance sees the directory's mtime change and calls watch_dir, which calls setup_file_watcher for the process's stats file. A new IO::Async::FileStream instance is created for this stats file, but because it calls seek_to_last("\n"), all lines written before that point are skipped. filetailer.t tests this scenario. The solution is to always read new files from the beginning.
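The effect can be reproduced outside of IO::Async. The following is a minimal Python sketch, not the module's Perl code: `tail_new_file`, `demo`, and the file name `1234.stats` are hypothetical names chosen for illustration, and seeking to the end of the file only approximates what seek_to_last("\n") does when the file ends with a newline.

```python
import os
import tempfile


def tail_new_file(path, from_start):
    """Return the lines a tailer would see right after attaching to `path`.

    from_start=False approximates seeking past the last newline
    (the file here always ends with "\n"); from_start=True reads
    the file from the beginning.
    """
    with open(path, "r") as fh:
        if not from_start:
            fh.seek(0, os.SEEK_END)
        return fh.read().splitlines()


def demo():
    with tempfile.TemporaryDirectory() as stats_dir:
        path = os.path.join(stats_dir, "1234.stats")
        # The stats-writing process writes before any watcher exists.
        with open(path, "w") as fh:
            fh.write("cpu 1\ncpu 2\n")
        # The directory watcher notices the new file only afterwards.
        skipped = tail_new_file(path, from_start=False)
        kept = tail_new_file(path, from_start=True)
        return skipped, kept
```

With the seek in place the early lines never reach the tailer; reading from the start returns both of them.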

Scenario 2

When many stats-writing processes are created, the chance rises that the death of a process is noticed by watch_dir / setup_file_watcher before all data in that process's stats file has been read. In this case, the IO::Async::FileStream watcher is removed from the loop before it has had the chance to call buffer_push in its on_read. The solution is to call read_more until EOF is reached.
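The drain-before-remove idea can be sketched in Python as follows. This is an illustration under assumptions, not the Perl implementation: the `Tailer` class, its `close(drain=...)` parameter, and the tiny chunk size are hypothetical, and `read_more` here only borrows the name of the IO::Async::FileStream method.

```python
import os
import tempfile


class Tailer:
    """Toy file tailer that reads in small chunks (hypothetical helper)."""

    def __init__(self, path, chunk_size=6):
        self.fh = open(path, "rb")
        self.chunk_size = chunk_size
        self.pending = b""   # bytes after the last complete line
        self.lines = []      # complete lines handed on so far

    def read_more(self):
        """Read one chunk; return False once EOF is reached."""
        chunk = self.fh.read(self.chunk_size)
        if not chunk:
            return False
        self.pending += chunk
        *complete, self.pending = self.pending.split(b"\n")
        self.lines.extend(line.decode() for line in complete)
        return True

    def close(self, drain):
        """Stop tailing; with drain=True, read up to EOF first."""
        if drain:
            while self.read_more():
                pass
        self.fh.close()
        return self.lines


def demo():
    with tempfile.TemporaryDirectory() as stats_dir:
        path = os.path.join(stats_dir, "1234.stats")
        with open(path, "wb") as fh:
            fh.write(b"cpu 1\ncpu 2\ncpu 3\n")
        results = []
        for drain in (False, True):
            tailer = Tailer(path)
            tailer.read_more()  # only one read happens before the writer dies
            results.append(tailer.close(drain=drain))
        return results
```

Without draining, only the first line is delivered; with draining, all three lines are read before the watcher goes away.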
