diff --git a/evobench-evaluator/README.md b/evobench-evaluator/README.md
new file mode 100644
index 0000000..cb2e2f8
--- /dev/null
+++ b/evobench-evaluator/README.md
@@ -0,0 +1,6 @@
+XXX Todo
+
+* [usage](docs/usage.md)
+* [overview](docs/overview.md)
+* [hacking](docs/hacking.md)
+
diff --git a/evobench-evaluator/docs/hacking.md b/evobench-evaluator/docs/hacking.md
index 3609115..341b4f7 100644
--- a/evobench-evaluator/docs/hacking.md
+++ b/evobench-evaluator/docs/hacking.md
@@ -6,19 +6,22 @@ Also see the [overview](overview.md).
 
 ### Style / details
 
-* Types with names ending in "Opts" (or also "Opt" XX) are generally
-  (XX?) precursor types (at least if a sister type without the "Opts"
-  suffix exists): used for configuration or command line options, but
-  translated before use.
-
-* Using `Arc` for the parts that come from the config or are derived
-  from it during load time, as that process is quite a bit convoluted,
-  and worse, there's config file reload, too. It might still be
-  feasible to use references instead, but so what. But, trying to use
-  `clone_arc()` (from `src/utillib/arc.rs`) consistently whenever an
-  `Arc` is cloned, for clarity and easy searching when interested
-  where it happens. Please keep this up.
-
+* Types with names ending in "Opts" (or "Opt", if they only contain a
+  single option) are types directly taking options from humans, either
+  via the command line (`Clap`) or config files (`serde`).
+
+  Sometimes they are used by the application as is. Sometimes they are
+  verified and translated before use; the types they are translated to
+  do *not* use names ending in "Opts" (but rather, generally,
+  "Options").
+
+  Types holding configuration that is generated by the program
+  generally use names ending in "Options", but never "Opts".
+
+* When using `Arc` (e.g. for the parts that come from the config,
+  which have convoluted lifetimes due to reloading of the config at
+  runtime), use `clone_arc()` (from `src/utillib/arc.rs`) to clone
+  it, for clarity and for easy searching when looking for where that
+  happens.
 
 ## Specifics
 
diff --git a/evobench-evaluator/docs/internals/evaluator/index.md b/evobench-evaluator/docs/internals/evaluator/index.md
new file mode 100644
index 0000000..319580d
--- /dev/null
+++ b/evobench-evaluator/docs/internals/evaluator/index.md
@@ -0,0 +1,308 @@
+# How `evobench-evaluator` works internally
+
+## Statistics levels
+
+1. The benchmark log file resulting from a benchmarking run is
+   processed into statistics called "single" (for "single run").
+   Probe timings are collected into a tree so that for each dynamic
+   location of a probe within the runtime call graph, a path (like a
+   backtrace, but containing probe names, not function names) can be
+   derived. Timings are collected for each such location within each
+   thread (optionally), across all threads, and for each probe
+   irrespective of its location in the call graph, and are
+   represented with statistical values (count, sum, average, standard
+   deviation, median, percentiles) as a row in the Excel file; for
+   flamegraphs, only the path-based representation is used.
+
+2. If there is an interest in detecting performance deviations,
+   multiple benchmarking runs (e.g. 5 or 10) should be executed for a
+   single combination of commit id of the target project and
+   benchmarking invocation parameters (directory within the target
+   project, command and arguments, and environment variables if any),
+   so that statistical significance of a deviation can be calculated.
+   `evobench-evaluator` is run with the `summary` subcommand to
+   calculate this second statistical level: statistics over a
+   particular result of each benchmarking run (example: take the
+   *median* values of each probe location of each run, then calculate
+   the count, sum, average, standard deviation, median and
+   percentiles over *those*). See the sketch after this list.
+
+3. Then, given benchmarking logs from multiple commit ids (with
+   multiple runs each), a trend or graph can be derived, or a
+   performance deviation calculated and reported. This third level is
+   not implemented yet (but much has been prepared for it already).
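+A minimal sketch of the level-2 idea (hypothetical, simplified data;
+the real implementation works over the `Timing` fields and the
+`StatsField` selections described below, and assumes at least one
+run):
+
+```rust
+/// Median of a non-empty, ascending slice. Illustrative only.
+fn median(sorted: &[f64]) -> f64 {
+    let n = sorted.len();
+    if n % 2 == 1 {
+        sorted[n / 2]
+    } else {
+        (sorted[n / 2 - 1] + sorted[n / 2]) / 2.0
+    }
+}
+
+/// Level 2: given the timings of one probe location for several
+/// runs, take the median per run, then summarize over those medians.
+/// Returns (average, standard deviation, median).
+fn summary_over_runs(per_run_timings: &[Vec<f64>]) -> (f64, f64, f64) {
+    let mut medians: Vec<f64> = per_run_timings
+        .iter()
+        .map(|run| {
+            let mut v = run.clone();
+            v.sort_by(|a, b| a.total_cmp(b));
+            median(&v)
+        })
+        .collect();
+    medians.sort_by(|a, b| a.total_cmp(b));
+    let count = medians.len() as f64;
+    let average = medians.iter().sum::<f64>() / count;
+    let variance =
+        medians.iter().map(|m| (m - average).powi(2)).sum::<f64>() / count;
+    (average, variance.sqrt(), median(&medians))
+}
+```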
+## Types
+
+### `options.rs`
+
+The evaluator translates to Excel or flamegraph files (and in the
+FUTURE: caches, graphs, perhaps reports).
+
+It can translate to both of those output types in the same run: the
+paths are specified in `OutputOpts`. They are given as options on the
+same level (via `#[clap(flatten)]` from the
+[clap](https://crates.io/crates/clap) command line parser crate) as
+the parameters for the evaluation, which are in `EvaluationOpts`.
+
+`OutputOpts` is checked and converted to `CheckedOutputOptions` before
+use, which wraps an `OutputVariants`, a parameterized type that holds
+the Excel and flamegraph variants of data through the pipeline.
+
+#### StatsField
+
+When summarizing data (i.e. level 2 or 3 as described in
+[Statistics levels](#statistics-levels) above), but also when
+generating flamegraphs, a decision has to be taken about which
+statistical number to build the higher-level statistical evaluation
+over. The selection of the field is represented by the
+`stats::StatsField` enum type; its type parameter is an integer
+giving how many tiles are used in the statistics; currently
+`evobench-evaluator` uses 101 everywhere (percentiles, 0..100
+inclusive). To be usable as a command line option, it implements
+`FromStr`, i.e. can be created from a string (like "average",
+"stdev", "10").
+
+This field is used in the types `evaluator::options::FlameFieldOpt`
+(choice of field for the flamegraph output),
+`evaluator::options::FieldSelectorDimension3Opt` (choice of field for
+the level 2 statistics (summary)), and
+`evaluator::options::FieldSelectorDimension4Opt` (choice of field for
+the unfinished level 3 statistics). The point of these wrapper types
+is to hold both the help text and the default value for `clap`, as
+well as to disambiguate the option usage in the code.
+
+## Processing chain
+
+### 1. Parsing and tree building
+
+This part of the processing is done by the code in
+[evaluator/data/](../../../src/evaluator/data/mod.rs).
+
+1. Parsing:
+
+   The benchmarking log files are currently in an NLJSON-based
+   format, with version and context information at the beginning,
+   optionally zstd compressed. The log lines are parsed into a vector
+   of [`LogMessage`](../../../src/evaluator/data/log_message.rs)
+   values, which contain
+   [`Timing`](../../../src/evaluator/data/log_message.rs) records for
+   probes, held by a
+   [`LogData`](../../../src/evaluator/data/log_data.rs) instance.
+
+   Note that `Timing` records contain just a single absolute data
+   point (but for multiple different kinds of values, e.g. real time,
+   cpu time etc.); it is by later pairing up the `Timing` records for
+   the start (logged from the object constructor) and the end (logged
+   from the object destructor) of the same scope (identified by the
+   scope name, which must be unique!) and taking the difference that
+   the cost becomes known, as in the following sketch.
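+   A minimal sketch of that pairing step (hypothetical, simplified
+   types; the real `Timing` records carry several value kinds as well
+   as thread and call context information):
+
+   ```rust
+   use std::collections::HashMap;
+
+   /// Illustrative only, standing in for the real `Timing` record.
+   struct Timing {
+       scope: String,
+       is_start: bool,
+       real_ns: u64,
+   }
+
+   /// Pair up start/end records by scope name and return the cost
+   /// (end minus start) per scope.
+   fn scope_costs(timings: &[Timing]) -> Vec<(String, u64)> {
+       let mut open: HashMap<&str, u64> = HashMap::new();
+       let mut costs = Vec::new();
+       for t in timings {
+           if t.is_start {
+               open.insert(t.scope.as_str(), t.real_ns);
+           } else if let Some(start) = open.remove(t.scope.as_str()) {
+               costs.push((t.scope.clone(), t.real_ns - start));
+           }
+       }
+       costs
+   }
+   ```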
+   This design (calculating the difference during evaluation, not at
+   recording time) was chosen to keep the cost of logging low; but
+   the absolute timings could potentially also allow for additional
+   evaluations (e.g. end of scope to end of parent scope) or event
+   correlations (not currently done).
+
+2. Tree building:
+
+   Then `LogMessage` entries for probes (more precisely, references
+   to their `Timing` parts, with the timings for the scope start and
+   end for each probe paired up) are collected into a
+   [`LogDataTree`](../../../src/evaluator/data/log_data_tree.rs). Both
+   the `LogData` and the derived `LogDataTree` are bundled in a
+   [`LogDataAndTree`](../../../src/evaluator/data/log_data_and_tree.rs)
+   instance.
+
+   [evaluator/data/log_data_tree.rs (`path_string()` on `Span`)](../../../src/evaluator/data/log_data_tree.rs)
+   also contains the code to turn a location in the tree into a path
+   ("probe-span backtrace").
+
+### 2. Path index, calculating statistics, collection into tables
+
+#### Path index
+
+The `LogDataAndTree` structure from the previous step contains all the
+original, individual `Timing` records, two for each logging probe
+encounter (the `EVOBENCH_SCOPE_EVERY` probes only log once for every n
+encounters): one for the start, and one for when the scope ends and
+the destructor runs. The tree just holds them together according to
+the dynamic context (thread, then call context). This detail data now
+needs to be condensed into descriptive statistics.
+
+There are multiple ways in which the tree could be condensed:
+
+- One might wish to know the total cost of a particular scope,
+  irrespective of its dynamic context (i.e. regardless of where it
+  was called from).
+
+- Or one might wish to know the total cost of a particular scope *in
+  a particular calling context*. In that case,
+
+  - one might also care about which thread that context (call path)
+    was executed on,
+  - or one might just want to know the total cost of the same call
+    path across all threads.
+
+The tree in `LogDataAndTree` has the most precise location
+information. Some of that location information needs to be ignored
+when collecting the `Timing` entries for the statistics, depending on
+the interest as listed above.
+
+In each case, a human-readable description of what the statistics
+were calculated over (the location or overlaid locations in the tree)
+is needed. A path string with separators and a few more features is
+chosen for this; the `evaluator::data::Span::path_string` method
+produces those strings. (For performance reasons, it generates these
+strings into a string behind a mutable reference, and for that reason
+there is no custom type definition for those path strings.) This
+method takes a `PathStringOptions` value to specify the details of
+how the path should be generated, e.g. whether the thread should be
+mentioned or not, etc. The same location (represented by an
+`evaluator::data::Span`) could produce path strings as different as:
+
+1. With the specific thread (threads are numbered in order of new
+   thread ids in timings occurring in the log, starting from 0):
+
+        N:thread00 > main|main > sum_of_fibs|all > sum_of_fibs n=22 > sum_of_fibs|body > main|fib > fib|fib
+
+2. Union across all threads:
+
+        A:thread > main|main > sum_of_fibs|all > sum_of_fibs n=22 > sum_of_fibs|body > main|fib > fib|fib
+
+3. The same path in reverse order:
+
+        AR:fib|fib < main|fib < sum_of_fibs|body < sum_of_fibs n=22 < sum_of_fibs|all < main|main < thread
+
+4. Or ignoring the location altogether (only showing the probe name):
+
+        fib|fib
+
+Path 1 will represent the fewest data points since it is the most
+specific; 2 and 3 (representing the same data points) represent
+possibly more points, since those paths potentially cover multiple
+threads; 4 represents the most data points.
+
+So, to collect the data points, for each point (again, represented by
+an `evaluator::data::Span`) the path is calculated according to a
+chosen `PathStringOptions` value, then the path is keyed into a hash
+map, and a reference to that `Span` is added to a vector held in that
+map. Afterwards, for each entry in the map, the statistics over its
+vector can be calculated. This map is wrapped in the
+`evaluator::IndexByCallPath` type, and the indexing happens in the
+`evaluator::IndexByCallPath::from_logdataindex` method, sketched
+below.
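+A minimal sketch of that indexing step (hypothetical, simplified
+types; the real code lives in
+`evaluator::IndexByCallPath::from_logdataindex`):
+
+```rust
+use std::collections::HashMap;
+
+// Illustrative only: stands in for `evaluator::data::Span`.
+struct Span { /* location in the tree, paired timings, ... */ }
+
+fn index_by_call_path<'s>(
+    spans: &'s [Span],
+    path_string: impl Fn(&Span) -> String, // per `PathStringOptions`
+) -> HashMap<String, Vec<&'s Span>> {
+    let mut index: HashMap<String, Vec<&Span>> = HashMap::new();
+    for span in spans {
+        // Key the generated path, collect references to the spans;
+        // statistics are later calculated per entry over its vector.
+        index.entry(path_string(span)).or_default().push(span);
+    }
+    index
+}
+```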
+Some of the parameters for generating the paths can be chosen via
+command line arguments to `evobench-evaluator` (for the `single` or
+`summary` subcommands). But for Excel output, multiple indexing
+passes are done with different `PathStringOptions` values to fill the
+`IndexByCallPath` with entries for different use cases at once: the
+resulting Excel sheets are "multi use" in this regard; the path
+formats are chosen so that the generated paths are not ambiguous
+across those cases.
+
+#### Calculating statistics
+
+So each path in `evaluator::IndexByCallPath` maps to the vector of
+spans of timings for that path. Statistics are calculated for each of
+those vectors, separately for each of the fields in the timings that
+the user (explicitly or implicitly) is interested in. (We have 2
+dimensions of statistical output here: the paths are one dimension,
+the field is the second (although that one has a statically fixed
+selection of values--"real time", "cpu time" etc.).)
+
+Remember, the `Timing` records contain all the kinds of timings that
+are collected: real time, cpu time, system time, multiple kinds of
+context switches, and more. Some are not generated on macOS; thus the
+currently extracted values are just the real, cpu and system times,
+plus a sum of all kinds of context switches.
+
+For each of those timing kinds, separate statistics are
+calculated. For Excel output, the statistics for all timing kinds are
+integrated into the same file as separate worksheets. For
+flamegraphs, a separate SVG file is generated for each kind, adding
+the timing kind name to the file name (like `single-real time.svg`,
+`single-cpu time.svg`, etc.).
+
+The `evaluator::AllFieldsTable` struct has the job of holding all 4
+statistics kinds.
+
+The `evaluator::AllFieldsTableWithOutputPathOrBase` struct bundles
+that with the output path (XXX: what is the logic exactly with
+`is_final_file`?). Those instances are specific to one of the output
+formats (Excel, flamegraphs); the program evaluates separate ones
+because the path syntax needs to be different for flamegraphs (to
+follow the format required by the
+[inferno](https://crates.io/crates/inferno) crate), and also because
+for flamegraphs only one kind of path is generated (and also
+influenced by flamegraph-specific user options?).
+
+The `evaluator::AllOutputsAllFieldsTable` struct bundles the separate
+`evaluator::AllFieldsTableWithOutputPathOrBase` instances for all
+requested output formats.
+
+The 3 structs above (`evaluator::AllFieldsTable` /
+`evaluator::AllFieldsTableWithOutputPathOrBase` /
+`evaluator::AllOutputsAllFieldsTable`) are type-parameterized with a
+kind type. Current such types (implementors of
+`evaluator::AllFieldsTableKind`) are `SingleRunStats`, `SummaryStats`
+and `TrendStats`; they are currently all empty marker types, used
+just to mark the structs to clarify what kind of statistical results
+they hold, as in the following sketch.
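+A minimal sketch of that marker-type pattern (illustrative, not the
+actual definitions):
+
+```rust
+use std::marker::PhantomData;
+
+trait AllFieldsTableKind {}
+
+// Empty marker types, only used to tag what a table holds.
+struct SingleRunStats;
+struct SummaryStats;
+impl AllFieldsTableKind for SingleRunStats {}
+impl AllFieldsTableKind for SummaryStats {}
+
+struct AllFieldsTable<K: AllFieldsTableKind> {
+    // ... the four per-timing-kind tables ...
+    kind: PhantomData<K>,
+}
+```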
+The `evaluator::AllOutputsAllFieldsTable` instance is then written to
+files via its `write_to_files` method.
+
+
+XXXWRONG a single choice is taken, and can be specified on the
+command line via `evaluator::options::FlameFieldOpt`, which was
+mentioned in the "StatsField" section above; although currently the
+program still evaluates the statistics for all 4 kinds first and only
+then picks the chosen one for the flamegraphs.
+
+
+The data structure to hold the 4 kinds of data points:
+`AllOutputsAllFieldsTable`
+
+
+XXX move this OUT of src/evaluator/data/?! And anyway, why do I do it
+a bit kitschy here? Where to?
+`AllOutputsAllFieldsTable::from_log_data_tree` is the next step --
+which is in src/evaluator/
+
+3. Path index:
+
+   After building the tree, an index over all paths is created.
+   XXX (which types, and why? what is different from the tree directly?)
+
+### 2.
+
+### X.
+
+`AllOutputsAllFieldsTable`
+
+
+### X. Creating the outputs
+
+`StatsField`
+
+
+#### Excel
+
+
+
+#### Flamegraphs
+
+The [inferno](https://crates.io/crates/inferno) library used for
+generating the flamegraphs requires a format where parent scopes'
+timing numbers do not include the numbers of the child scopes. This
+is unlike the Excel files, where a parent scope is shown with the
+whole cost for that scope, regardless of which child scopes there may
+be--which is both more natural when child scopes can be added to or
+removed from the project over time, and also because the child scopes
+are not immediately visible when reading the Excel file (those scopes
+are on different rows in the sheet). The function
+[`fix_tree`](../../../src/evaluator/all_outputs_all_fields_table.rs)
+converts from the child-inclusive to this child-exclusive format, as
+sketched below.
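+A minimal sketch of that conversion (hypothetical, simplified tree;
+the real `fix_tree` works on the table data described above):
+
+```rust
+// Illustrative only.
+struct Node {
+    own_cost: u64,
+    children: Vec<Node>,
+}
+
+/// Convert child-inclusive costs (Excel style) into child-exclusive
+/// costs (inferno style) by subtracting, at each node, the (still
+/// inclusive) costs of its direct children before recursing.
+fn fix_tree(node: &mut Node) {
+    let children_sum: u64 = node.children.iter().map(|c| c.own_cost).sum();
+    // Saturate to guard against measurement noise making the
+    // children's sum exceed the parent's cost.
+    node.own_cost = node.own_cost.saturating_sub(children_sum);
+    for child in &mut node.children {
+        fix_tree(child);
+    }
+}
+```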
+The processing is as follows:
+
+1. First, the same code path as for Excel is used to generate an
+   `AllOutputsAllFieldsTable<_>`. XXX
diff --git a/evobench-evaluator/docs/internals/index.md b/evobench-evaluator/docs/internals/index.md
new file mode 100644
index 0000000..bdad401
--- /dev/null
+++ b/evobench-evaluator/docs/internals/index.md
@@ -0,0 +1,35 @@
+# Source directory overview
+
+## `bin` subdirectory
+
+The source files representing program binaries.
+
+These are the user-relevant programs:
+
+* [`bin/evobench-evaluator.rs`](../../src/bin/evobench-evaluator.rs):
+  produces human-readable outputs from benchmarking log files; does
+  not know about where to place files (needs explicit paths), and
+  doesn't know about running benchmarks
+* [`bin/evobench-run.rs`](../../src/bin/evobench-run.rs): runs
+  benchmarking jobs, i.e. produces benchmarking log files in a
+  structured and automatic way (i.e. offers a service plus tools to
+  change and query the service status); calls `evobench-evaluator` to
+  turn them into human-readable outputs.
+
+Other programs (not normally in use, feel free to ignore):
+
+* [`bin/jobqueue.rs`](../../src/bin/jobqueue.rs): a general purpose
+  program to work with queues (just an application of the
+  `key_val_fs` module, perhaps generally useful?)
+* [`bin/trying-git.rs`](../../src/bin/trying-git.rs): a program to
+  play with git graphs, mostly to verify the workings of the `git`
+  module.
+
+## Other subdirectories
+
+* [`serde/`](../../src/serde/mod.rs): custom types used in config
+  files and other places with user interaction via text
+* [`key_val_fs/`](../../src/key_val_fs/mod.rs): a simple key-value
+  database via files, and a queue implementation on top
+* [`stats/`](../../src/stats/mod.rs): simple statistics, keeping
+  track of the unit (ns, us, counts) via the type system
+* [`tables/`](../../src/tables/mod.rs): tabular output for Excel;
+  works with [`stats/`](../../src/stats/mod.rs), keeping track of the
+  unit (ns, us, counts) via the type system
+* [`evaluator/`](../../src/evaluator/mod.rs): the meat of the
+  `evobench-evaluator` tool
+* [`run/`](../../src/run/mod.rs): the meat of the `evobench-run` tool
+
+(There are some more, utilities without group documentation:
+[`date_and_time/`](../../src/date_and_time/mod.rs),
+[`utillib/`](../../src/utillib/mod.rs),
+[`io_utils/`](../../src/io_utils/mod.rs).)
+
+## Tool internals documentation
+
+* [evobench-evaluator](evaluator/index.md)
+
+* [evobench-run](runner/index.md)
diff --git a/evobench-evaluator/docs/internals/runner/index.md b/evobench-evaluator/docs/internals/runner/index.md
new file mode 100644
index 0000000..e1a2257
--- /dev/null
+++ b/evobench-evaluator/docs/internals/runner/index.md
@@ -0,0 +1,2 @@
+# How `evobench-run` works internally
+
diff --git a/evobench-evaluator/docs/overview.md b/evobench-evaluator/docs/overview.md
index 5cdb615..cac827c 100644
--- a/evobench-evaluator/docs/overview.md
+++ b/evobench-evaluator/docs/overview.md
@@ -24,15 +24,19 @@ benchmarking results).
 The tool has various subcommands, for polling a repository for
 changes, inserting jobs, listing them, and running them
 (daemon). Run it with `--help`.
 
-It has a concept of a "key", which is all pieces of information that
-influence a benchmarking run (which commit of the target project was
-run, with which custom parameters, in which queuing context
-(configurable), and on which machine/OS (but which is not currently
-used as results are currently only stored locally)).
-
-It currently runs the `evobench-evaluator` after each finished job run
-to evaluate the results of the run and also generate summary
-statistics across all runs for the same "key".
+It runs the `evobench-evaluator` after each finished job run to
+evaluate the results of the run, and also to generate summary
+statistics across all runs for the same key. The key here is the set
+of pieces of information about a benchmarking run that identify the
+experiment that the run belongs to. The experiment is the measurement
+of performance for a commit id of the target project, on a particular
+piece of hardware and OS, for a particular set of custom parameters
+for a particular benchmarking invocation (as defined by the target
+project), and optionally the queueing context (e.g. runs during the
+night, while other services are shut down, can be configured as a
+different experiment from runs during multi-use times). Multiple runs
+are executed for each experiment (the number is configurable) to
+allow calculating the statistical significance of performance
+deviations. A sketch of the pieces making up such a key:
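+
+```rust
+// Illustrative only: hypothetical names, not the actual types used
+// by evobench-run.
+struct ExperimentKey {
+    /// Commit of the target project that is benchmarked
+    commit_id: String,
+    /// Hardware and OS identification
+    machine_and_os: String,
+    /// Custom parameters for the benchmarking invocation
+    custom_parameters: Vec<(String, String)>,
+    /// Which benchmarking invocation, as defined by the target project
+    invocation: String,
+    /// Optional queueing context (e.g. nightly vs. multi-use times)
+    queueing_context: Option<String>,
+}
+```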
 
 ### Configuration
 
@@ -66,10 +70,10 @@
 sub directory with the timestamp of the start of the run as the
 directory name and holding the results for that run. The files are:
 
 `bench_output.log.zstd`
-: contents of what the target app wrote to $BENCH_OUTPUT_LOG
+: contents of what the target app wrote to the path in `$BENCH_OUTPUT_LOG`
 
 `evobench.log.zstd`
-: contents of what evobench-probes wrote to $EVOBENCH_LOG
+: contents of what evobench-probes wrote to the path in `$EVOBENCH_LOG`
 
 `single.xlsx`
 : statistical results of the run, extracted from `evobench.log.zstd`
diff --git a/evobench-evaluator/src/ctx.rs b/evobench-evaluator/src/ctx.rs
index 5595284..816e374 100644
--- a/evobench-evaluator/src/ctx.rs
+++ b/evobench-evaluator/src/ctx.rs
@@ -1,3 +1,8 @@
+//! A shorter way to add error context information when using the `anyhow` crate.
+//!
+//! Instead of `.with_context(|| anyhow!("while doing {}", 1 + 1))`, this allows writing
+//! `.map_err(ctx!("while doing {}", 1 + 1))`.
+
 #[macro_export]
 macro_rules! ctx {
     ($fmt:tt) => {
diff --git a/evobench-evaluator/src/date_and_time/mod.rs b/evobench-evaluator/src/date_and_time/mod.rs
index 431118a..17f4bdd 100644
--- a/evobench-evaluator/src/date_and_time/mod.rs
+++ b/evobench-evaluator/src/date_and_time/mod.rs
@@ -1,3 +1,5 @@
+//! Date/time handling extensions and utilities.
+
 pub mod system_time_with_display;
 pub mod time_ranges;
 pub mod unixtime;
diff --git a/evobench-evaluator/src/digit_num.rs b/evobench-evaluator/src/digit_num.rs
index c004fd5..d55098f 100644
--- a/evobench-evaluator/src/digit_num.rs
+++ b/evobench-evaluator/src/digit_num.rs
@@ -1,4 +1,6 @@
-//! Numbers based on decimal digits, for testing only
+//! Numbers based on decimal digits, for correctness and
+//! introspection, not performance. Useful for writing some kinds of
+//! tests.
 
 use std::{fmt::Display, io::Write};
 
diff --git a/evobench-evaluator/src/evaluator/all_fields_table.rs b/evobench-evaluator/src/evaluator/all_fields_table.rs
index ab2223c..4474b63 100644
--- a/evobench-evaluator/src/evaluator/all_fields_table.rs
+++ b/evobench-evaluator/src/evaluator/all_fields_table.rs
@@ -110,8 +110,9 @@ fn table_for_field<'key, K: KeyDetails>(
 /// name suggests, also what rows are generated, since the grouping of
 /// the measurements depends on the set of generated key
 /// strings. (This only contains the runtime data, but unlike what the
-/// name suggests, actually there is no static data for the key
-/// column?)
+/// name suggests, actually there is no static data for the key column
+/// in the output? (PS. But there is the definition of the trait
+/// `KeyDetails` below; the name can't conflict with that.))
 #[derive(Clone, PartialEq, Debug)]
 pub struct KeyRuntimeDetails {
     /// The separators to use
diff --git a/evobench-evaluator/src/evaluator/all_outputs_all_fields_table.rs b/evobench-evaluator/src/evaluator/all_outputs_all_fields_table.rs
index 120651f..baac57c 100644
--- a/evobench-evaluator/src/evaluator/all_outputs_all_fields_table.rs
+++ b/evobench-evaluator/src/evaluator/all_outputs_all_fields_table.rs
@@ -275,9 +275,10 @@ impl AllOutputsAllFieldsTable {
 
         for table in tables {
             if table.table_key_vals(flame_field).next().is_none() {
-                // Attempting to generate flame graphs
-                // without data is giving an error from
-                // the library, thus skip this table
+                // The table has no rows.
+                // `inferno` gives errors when attempting
+                // to generate flame graphs without data,
+                // thus skip this table
                 continue;
             }
 
@@ -297,10 +298,10 @@ impl AllOutputsAllFieldsTable {
                     .collect()
             };
 
-            // inferno is really fussy, apparently it
+            // `inferno` is really fussy, apparently it
             // gives a "No stack counts found" error
-            // whenever it's missing any line with a
-            // ";" in it, thus check:
+            // whenever it's missing any line with a ";"
+            // in it, thus check:
             if !lines.iter().any(|s| s.contains(';')) {
                 eprintln!(
                     "note: there are no lines with ';' to be fed to inferno, \
diff --git a/evobench-evaluator/src/evaluator/data/log_data_tree.rs b/evobench-evaluator/src/evaluator/data/log_data_tree.rs
index f70dfa3..856b71d 100644
--- a/evobench-evaluator/src/evaluator/data/log_data_tree.rs
+++ b/evobench-evaluator/src/evaluator/data/log_data_tree.rs
@@ -1,4 +1,5 @@
-//! Build tree and index for making summaries.
+//! Build tree and index for making statistical evaluations over
+//! probes in their call context.
 
 //! `Timing` and contextual info remains in the parsed log file
 //! (`Vec<LogMessage>`), the index just references into those.
diff --git a/evobench-evaluator/src/evaluator/data/mod.rs b/evobench-evaluator/src/evaluator/data/mod.rs
index bc5e105..d7b553a 100644
--- a/evobench-evaluator/src/evaluator/data/mod.rs
+++ b/evobench-evaluator/src/evaluator/data/mod.rs
@@ -1,3 +1,5 @@
+//! The benchmarking log file parser and log data tree representation.
+
 pub mod log_data;
 pub mod log_data_and_tree;
 pub mod log_data_tree;
diff --git a/evobench-evaluator/src/evaluator/mod.rs b/evobench-evaluator/src/evaluator/mod.rs
index 0d95b71..24a25cf 100644
--- a/evobench-evaluator/src/evaluator/mod.rs
+++ b/evobench-evaluator/src/evaluator/mod.rs
@@ -1,6 +1,6 @@
-//! The core evobench-evaluator functionality (i.e. excl. more general
-//! library files, and excl. the main driver program at
-//! src/bin/evobench-evaluator)
+//! The core `evobench-evaluator` functionality (i.e. excl. more
+//! general library files, and excl. the main driver program at
+//! `src/bin/evobench-evaluator.rs`)
 
 pub mod all_fields_table;
 pub mod all_outputs_all_fields_table;
diff --git a/evobench-evaluator/src/evaluator/options.rs b/evobench-evaluator/src/evaluator/options.rs
index f4476b2..b6c00e5 100644
--- a/evobench-evaluator/src/evaluator/options.rs
+++ b/evobench-evaluator/src/evaluator/options.rs
@@ -17,6 +17,11 @@ pub const TILE_COUNT: usize = 101;
 pub struct EvaluationOpts {
     /// The width of the column with the probes path, in characters
     /// (as per Excel's definition of characters)
+    // (This is for Excel output only; could there be a better place?
+    // But the `OutputOpts` are currently just options on the same
+    // level (via flatten); only subcommands would allow a selective
+    // choice, and there may be no way to specify multiple subcommands
+    // in the same command invocation.)
     #[clap(short, long, default_value = "100")]
     pub key_width: f64,
 
@@ -46,7 +51,7 @@ pub struct OutputOpts {
     flame: Option<PathBuf>,
 }
 
-/// Do not use for level 0 (i.e. `single` subcommand), there  sum must
+/// Do not use for level 0 (i.e. the `single` subcommand); there, sum must
 /// always be used!
 #[derive(clap::Args, Debug)]
 pub struct FlameFieldOpt {
diff --git a/evobench-evaluator/src/git.rs b/evobench-evaluator/src/git.rs
index 75bf37f..efd9de9 100644
--- a/evobench-evaluator/src/git.rs
+++ b/evobench-evaluator/src/git.rs
@@ -1,3 +1,6 @@
+//! Parse Git history, for tracking performance changes across
+//! that history. Should perhaps be moved into the `run-git` crate.
+
 use std::{
     collections::{BTreeSet, HashMap},
     fmt::{Debug, Display},
diff --git a/evobench-evaluator/src/io_utils/mod.rs b/evobench-evaluator/src/io_utils/mod.rs
index 5ba4c59..109f3c5 100644
--- a/evobench-evaluator/src/io_utils/mod.rs
+++ b/evobench-evaluator/src/io_utils/mod.rs
@@ -1,3 +1,5 @@
+//! Various utilities in the area of I/O.
+
 pub mod bash;
 pub mod capture;
 pub mod div;
diff --git a/evobench-evaluator/src/join.rs b/evobench-evaluator/src/join.rs
index 3c05839..6273969 100644
--- a/evobench-evaluator/src/join.rs
+++ b/evobench-evaluator/src/join.rs
@@ -1,3 +1,5 @@
+//! Joins (intersections) of sorted sequences.
+
 use itertools::{EitherOrBoth, Itertools};
 
 #[derive(Debug, PartialEq)]
diff --git a/evobench-evaluator/src/key_val_fs/mod.rs b/evobench-evaluator/src/key_val_fs/mod.rs
index f9c9c0a..a5b02df 100644
--- a/evobench-evaluator/src/key_val_fs/mod.rs
+++ b/evobench-evaluator/src/key_val_fs/mod.rs
@@ -1,8 +1,8 @@
 //! Simple filesystem based key-value database using a separate file
 //! in the file system per mapping, and offering locking operations on
 //! each entry for mutations/deletions. The goal of this library is
-//! not speed, but reliability, locking features, and ease to inspect
-//! the state with standard command line tools.
+//! not speed, but reliability, locking features, and ease of
+//! inspection of the state with standard command line tools.
 
 pub mod as_key;
 pub mod key_val;
diff --git a/evobench-evaluator/src/linear.rs b/evobench-evaluator/src/linear.rs
index 26519e8..33b105a 100644
--- a/evobench-evaluator/src/linear.rs
+++ b/evobench-evaluator/src/linear.rs
@@ -1,12 +1,14 @@
-//! Run-time "linear types"--warning in the `Drop` implementation in
-//! release builds, panic in debug builds. Attempts at using the idea
-//! of panicking in const does not work in practice (it appears that
-//! drop templates are instantiated before being optimized away, hence
-//! e.g. returning them in `Result::Ok` is not possible since it
-//! apparently unconditionally instantiates the drop for Result which
-//! instantiates the drop for the Ok value even if never used, but I
-//! did not manage to analyze the binary to detect drop use after the
-//! optimizer either).
+//! Run-time "linear types"--panic in the `Drop` implementation in
+//! debug builds, optionally only a warning in release builds
+//! (depending on the `fatal` runtime value in the type).
+//!
+//! Attempts at using the idea of panicking in const do not work in
+//! practice (it appears that drop templates are instantiated before
+//! being optimized away; hence e.g. returning such values in
+//! `Result::Ok` is not possible, since that apparently
+//! unconditionally instantiates the drop for `Result`, which
+//! instantiates the drop for the `Ok` value even if never used--but I
+//! did not manage to analyze the binary to detect drop use after the
+//! optimizer either).
 
 //! Only token types are supported: embedding such a token type inside
 //! a larger data structure makes the larger data structure run-time
@@ -16,8 +18,8 @@
 //! containing data structure is needed, which might be cleaner?
 
 //! Original idea and partially code came from
-//! https://jack.wrenn.fyi/blog/undroppable/ and
-//! https://geo-ant.github.io/blog/2024/rust-linear-types-use-once/,
+//! <https://jack.wrenn.fyi/blog/undroppable/> and
+//! <https://geo-ant.github.io/blog/2024/rust-linear-types-use-once/>,
 //! but again, doesn't appear to work in practice. There are also some
 //! other crates going the runtime route, maybe the most-used one
 //! being .
diff --git a/evobench-evaluator/src/path_util.rs b/evobench-evaluator/src/path_util.rs
index aff8788..cbb5f4c 100644
--- a/evobench-evaluator/src/path_util.rs
+++ b/evobench-evaluator/src/path_util.rs
@@ -1,3 +1,6 @@
+//! Utilities to make working with `PathBuf` / `&Path` more
+//! productive.
+
 use std::{
     ffi::{OsStr, OsString},
     os::unix::prelude::OsStringExt,
diff --git a/evobench-evaluator/src/rayon_util.rs b/evobench-evaluator/src/rayon_util.rs
index 55302d2..4c0aa99 100644
--- a/evobench-evaluator/src/rayon_util.rs
+++ b/evobench-evaluator/src/rayon_util.rs
@@ -1,3 +1,6 @@
+//! Utilities to make working with the
+//! [rayon](https://crates.io/crates/rayon) crate more productive.
+
 pub trait ParRun {
     type Output;
     fn par_run(self) -> Self::Output;
diff --git a/evobench-evaluator/src/run/mod.rs b/evobench-evaluator/src/run/mod.rs
index 553e7fc..29a7d58 100644
--- a/evobench-evaluator/src/run/mod.rs
+++ b/evobench-evaluator/src/run/mod.rs
@@ -1,3 +1,7 @@
+//! The core `evobench-run` functionality (i.e. excl. more general
+//! library files, and excl. the main driver program at
+//! `src/bin/evobench-run.rs`)
+
 pub mod benchmarking_job;
 pub mod config;
 pub mod custom_parameter;
diff --git a/evobench-evaluator/src/stats/mod.rs b/evobench-evaluator/src/stats/mod.rs
index d6b8ec0..604186a 100644
--- a/evobench-evaluator/src/stats/mod.rs
+++ b/evobench-evaluator/src/stats/mod.rs
@@ -1,5 +1,6 @@
 //! Simple statistics (count, average, standard deviation, median and
-//! percentiles), with strong typing, and ability to handle weighted
+//! percentiles), with the number unit (`ViewType`) and the tile count
+//! verified in the type system, and the ability to handle weighted
 //! values.
 
 pub mod average;
diff --git a/evobench-evaluator/src/tables/mod.rs b/evobench-evaluator/src/tables/mod.rs
index ca336e0..2def033 100644
--- a/evobench-evaluator/src/tables/mod.rs
+++ b/evobench-evaluator/src/tables/mod.rs
@@ -1,3 +1,28 @@
+//! A flexible way to generate files with tabular data.
+//!
+//! [`table_view`](table_view.rs) defines the `TableViewRow` and
+//! `TableView` traits that declare a tabular data representation.
+//!
+//! [`table_field_view`](table_field_view.rs): XXX
+//!
+//! [`table`](table.rs): defines `Table`, a concrete implementation of
+//! `TableView` that holds rows pairing a string key (representing the
+//! first column) with a value that implements `TableViewRow`
+//! (representing the remaining columns). `Table` is also
+//! parameterized with a `TableKind` type for type safety and to carry
+//! metadata (used to represent the RealTime, CpuTime, SysTime and
+//! CtxSwitches tables, see
+//! [../evaluator/all_fields_table.rs](../evaluator/all_fields_table.rs)).
+//!
+//! [`change`](change.rs) is an abstraction for values that represent
+//! change, with formatting indicating positive/negative change; it is
+//! used by the `change()` method on `Table` to produce a table that
+//! represents the change between two tables.
+//!
+//! [`excel_table_view`](excel_table_view.rs): takes a sequence of
+//! values implementing `TableView` and converts them to an Excel file
+//! with a worksheet for each.
+
 pub mod change;
 pub mod excel_table_view;
 pub mod table;
diff --git a/evobench-evaluator/src/times.rs b/evobench-evaluator/src/times.rs
index 452d881..ddbb6d9 100644
--- a/evobench-evaluator/src/times.rs
+++ b/evobench-evaluator/src/times.rs
@@ -1,3 +1,10 @@
+//! Time durations in microseconds and nanoseconds, plus conversions
+//! between them, as well as traits for a common `u64`-based
+//! representation, and for getting the unit as a human-readable
+//! string from the type for doing type-safe statistics (works with
+//! the `stats` module). Also includes formatting as strings in
+//! milliseconds, padded to the precision.
+
 use std::fmt::{Debug, Display};
 use std::ops::{Add, Sub};
 
diff --git a/evobench-evaluator/src/utillib/mod.rs b/evobench-evaluator/src/utillib/mod.rs
index 1a828b8..5c5dd18 100644
--- a/evobench-evaluator/src/utillib/mod.rs
+++ b/evobench-evaluator/src/utillib/mod.rs
@@ -1,3 +1,5 @@
+//! Various utilities
+
 pub mod arc;
 pub mod bool_env;
 pub mod exit_status_ext;
diff --git a/evobench-evaluator/src/zstd_file.rs b/evobench-evaluator/src/zstd_file.rs
index 70ca746..563e79f 100644
--- a/evobench-evaluator/src/zstd_file.rs
+++ b/evobench-evaluator/src/zstd_file.rs
@@ -1,3 +1,5 @@
+//! Transparent ZSTD decompression, as well as compression, for files.
+
 use std::{
     ffi::{OsStr, OsString},
     fs::File,