Conversation

@ugognw
Contributor

@ugognw ugognw commented Jun 19, 2025

Summary

Major changes:

Todos

  • validate data

  • scrutinize dataset format

  • finalize benchmarking statistics format

  • remove redundant tests

  • add more granular unit tests

  • optimize number of test cases/time

  • fix PhreeqcEOS benchmarking

  • add docs under Electrolyte Modeling Engines

  • remove logic related to _get_activity_coefficient_pitzer?

  • convert % concentration in datasets to mol/L in load_dataset

  • Update changelog

  • Google format doc strings added.

  • Code linted with ruff. (For guidance in fixing rule violations, see the rule list.)

  • Type annotations included. Check with mypy.

  • Tests added for new features/fixes.

  • I have run the tests locally and they passed.

Happy to hear any feedback on the current direction!

@codecov

codecov bot commented Jun 19, 2025

Codecov Report

Attention: Patch coverage is 90.55118% with 12 lines in your changes missing coverage. Please review.

Project coverage is 80.01%. Comparing base (b361c6a) to head (9e21eb5).
Report is 3 commits behind head on main.

Files with missing lines Patch % Lines
src/pyEQL/benchmark.py 90.55% 6 Missing and 6 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #252      +/-   ##
==========================================
- Coverage   81.79%   80.01%   -1.79%     
==========================================
  Files           9       10       +1     
  Lines        1494     1621     +127     
  Branches      257      282      +25     
==========================================
+ Hits         1222     1297      +75     
- Misses        226      277      +51     
- Partials       46       47       +1     

☔ View full report in Codecov by Sentry.

@rkingsbury
Member

Wow, thank you for the effort on this @ugognw! It looks well thought out. I'm traveling this week and part of next week but will dedicate some time to reviewing in detail as soon as I can.

@ugognw ugognw force-pushed the benchmarking-suite branch from c1da2ed to 5735dea Compare June 22, 2025 19:46
@ugognw
Contributor Author

ugognw commented Jun 23, 2025

Thanks! I'd like to think so! Sounds good. I look forward to hearing your feedback!

ugognw and others added 26 commits June 22, 2025 23:53
Signed-off-by: Ugochukwu Nwosu <ugn@sfu.ca>
Replace ReferenceData with _BenchmarkEntry. This named tuple now separates solution
and solute properties which enables the separate retrieval of each type of property.
An analogous _get_solution_property method has been implemented.

_get_reference is renamed to _get_dataset as the data structure returned by the method
defines a general solution/solute property data set (reference data + Solution object
which can be used for benchmarking).

benchmark_engine functionality moved to report_results for simplicity. If more fine-tuned
control over data set partitioning or result reporting is desired, this can be implemented
quite readily by undoing this change.

Signed-off-by: Ugochukwu Nwosu <ugognw@gmail.com>
Solution creation is now handled when reference files are read.

Benchmarking is parametrized by engine.

Signed-off-by: Ugochukwu Nwosu <ugognw@gmail.com>
Signed-off-by: Ugochukwu Nwosu <ugognw@gmail.com>
Although EOS subclasses define get_activity_coefficient to return the
activity coefficients of individual solutes, one can only measure the
mean activity coefficient experimentally. As such, to benchmark against
experimental data, we must use the definition of the mean activity
coefficient in terms of the activity coefficients of the individual ions.
Alternatively, this function could be implemented as a property
(Solution.mean_activity) or a method (Solution.get_mean_activity).

Signed-off-by: Ugochukwu Nwosu <ugognw@gmail.com>
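The combination rule described above is the standard definition of the mean activity coefficient. A minimal sketch (function name and signature are illustrative, not pyEQL's API):

```python
# Mean activity coefficient from single-ion activity coefficients:
# gamma_pm = (gamma_+**nu_+ * gamma_-**nu_-) ** (1 / (nu_+ + nu_-))
def mean_activity_coefficient(gamma_cation: float, nu_cation: int,
                              gamma_anion: float, nu_anion: int) -> float:
    """Geometric-mean combination of single-ion activity coefficients."""
    nu_total = nu_cation + nu_anion
    return (gamma_cation ** nu_cation * gamma_anion ** nu_anion) ** (1 / nu_total)

# For a 1:1 salt (e.g. NaCl) with equal ion coefficients, the mean
# activity coefficient equals either single-ion value (~0.8 here).
gamma_pm = mean_activity_coefficient(0.8, 1, 0.8, 1)
```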
Publicize functions/classes for processing input/output data models

Return values from main function (renamed to benchmark_engine for clarity) and pass
engine and source as arguments.

Store results instead of printing to console.

Parse Solution dictionary into Solution object when reading dataset from file.

Signed-off-by: Ugochukwu Nwosu <ugognw@gmail.com>
Signed-off-by: Ugochukwu Nwosu <ugognw@gmail.com>
Signed-off-by: Ugochukwu Nwosu <ugognw@gmail.com>
Signed-off-by: Ugochukwu Nwosu <ugognw@gmail.com>
Signed-off-by: Ugochukwu Nwosu <ugognw@gmail.com>
Signed-off-by: Ugochukwu Nwosu <ugognw@gmail.com>
Signed-off-by: Ugochukwu Nwosu <ugognw@gmail.com>
Passing a string corresponding to an internal source name retrieves the Path to
the data file (distributed with package). All files are loaded in the same way.

Signed-off-by: Ugochukwu Nwosu <ugognw@gmail.com>
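The internal-name lookup might be sketched as follows; the registry contents, file names, and the `package` default are illustrative assumptions, not the actual pyEQL layout:

```python
from importlib import resources

def resolve_source(source: str, package: str = "pyEQL"):
    """Map an internal source name to a packaged data file.

    Any string not found in the (assumed) internal registry is treated
    as an external file path and returned unchanged.
    """
    internal = {"CRC": "crc_conductivity.json"}  # hypothetical registry
    if source in internal:
        # resources.files resolves data shipped inside the package.
        return resources.files(package) / "data" / internal[source]
    return source
```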
Signed-off-by: Ugochukwu Nwosu <ugognw@gmail.com>
Signed-off-by: Ugochukwu Nwosu <ugognw@gmail.com>
...to list[tuple[str, Quantity]]

Signed-off-by: Ugochukwu Nwosu <ugognw@gmail.com>
Signed-off-by: Ugochukwu Nwosu <ugognw@gmail.com>
Signed-off-by: Ugochukwu Nwosu <ugognw@gmail.com>
The new RMSE is more robust in the case that property values are Python
primitives.

Property getters are now called to retrieve properties. This fixes a bug
where the getter was returned instead of the value.

Variables are renamed (from property) to avoid shadowing a Python built-in.

Enumerating solution data (in get_dataset) yields the right number of items
to unpack.

Signed-off-by: Ugochukwu Nwosu <ugognw@gmail.com>
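An RMSE that is robust to plain Python numbers can be sketched like this (not the actual pyEQL implementation); using `** 0.5` rather than `math.sqrt` keeps the arithmetic duck-typed:

```python
def rmse(calculated, reference):
    """Root-mean-square error over paired values (plain numbers)."""
    if len(calculated) != len(reference):
        raise ValueError("calculated and reference must be the same length")
    # Element-wise squared errors, then the mean, then the square root.
    squared = [(c - r) ** 2 for c, r in zip(calculated, reference)]
    return (sum(squared) / len(squared)) ** 0.5
```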
Signed-off-by: Ugochukwu Nwosu <ugognw@gmail.com>
Signed-off-by: Ugochukwu Nwosu <ugognw@gmail.com>
Signed-off-by: Ugochukwu Nwosu <ugognw@gmail.com>
Improve docstrings

Signed-off-by: Ugochukwu Nwosu <ugognw@gmail.com>
Salt takes no extra kwargs. reduce takes no kwargs.

Signed-off-by: Ugochukwu Nwosu <ugognw@gmail.com>
Signed-off-by: Ugochukwu Nwosu <ugognw@gmail.com>
Signed-off-by: Ugochukwu Nwosu <ugognw@gmail.com>
ugognw added 23 commits June 22, 2025 23:53
Users should be able to select which properties as well as which solutions
(compositions, thermodynamic state) to benchmark against. This will also
enable speed-ups in testing.

Signed-off-by: Ugochukwu Nwosu <ugognw@gmail.com>
Implementing BenchmarkEntry.solute_properties/solution_properties in this way
will enable speed-ups in testing when calculating error statistics between
calculated and reference datasets (list[BenchmarkEntry]) as data lookup from a
dictionary by key has O(1) time complexity (one will have the name of the property
when calculating error) while finding a specific item in a list has O(N) time
complexity.

Accordingly, load_dataset now expects solute/solution properties within data files
to be structured as dictionaries instead of property-value 2-tuples and accepts solutions,
solute_properties, and solution_properties as keyword-only arguments. Whether data for a
solution is included is determined by its components, molar concentration, temperature,
and pressure.

Signed-off-by: Ugochukwu Nwosu <ugognw@gmail.com>
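The complexity argument above can be illustrated directly (the property names are placeholders):

```python
# Property data as (name, value) 2-tuples versus a dict keyed by name.
as_tuples = [("dielectric_constant", 78.3), ("conductivity", 12.6)]
as_dict = dict(as_tuples)

# O(N): linear scan through the list for a named property.
value_scan = next(v for k, v in as_tuples if k == "conductivity")

# O(1): direct hash lookup by property name.
value_hash = as_dict["conductivity"]

assert value_scan == value_hash
```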
Encapsulating the logic for creating dictionary keys in a function and creating
a type alias for its type should minimize the refactoring required if the
format is changed and appease mypy. This also makes comparing Solutions
for benchmarking purposes considerably simpler.

load_dataset now returns a dictionary with keys indicating the Solution for which
the data applies as generated by _create_solution_key.

Internal data sources are now opened more reliably using resources.files.

Generating the list of SolutionKeys corresponding to the `solutions` parameter in advance
simplifies the logic required to check for the Solution in `solutions`.

Signed-off-by: Ugochukwu Nwosu <ugognw@gmail.com>
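One way such a key could look, assuming it is built from the solute composition and thermodynamic state (the alias shape, function name, and string-valued amounts are all illustrative assumptions):

```python
# Hypothetical SolutionKey: a hashable tuple so two Solutions with the
# same composition, temperature, and pressure compare equal regardless
# of which engine produced them.
SolutionKey = tuple[tuple[tuple[str, str], ...], str, str]

def _create_solution_key(components: dict[str, str],
                         temperature: str, pressure: str) -> SolutionKey:
    # Sort components so key equality is independent of insertion order.
    return (tuple(sorted(components.items())), temperature, pressure)

k1 = _create_solution_key({"Na+": "1 mol/L", "Cl-": "1 mol/L"}, "298.15 K", "1 atm")
k2 = _create_solution_key({"Cl-": "1 mol/L", "Na+": "1 mol/L"}, "298.15 K", "1 atm")
assert k1 == k2  # order-independent and usable as a dict key
```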
Using the same data structures for engine and reference data is desired because
it will separate the logic for retrieving properties and calculating stats and
simplify the retrieval of corresponding items in calculate_stats.

This implementation ensures that only one Solution object is created for each
Solution across all benchmark datasets and that each BenchmarkEntry object is
associated with a unique Solution. In this way, a single dataset can be generated
for the engine, which can then be benchmarked against each reference dataset.

Signed-off-by: Ugochukwu Nwosu <ugognw@gmail.com>
Accepting two identical data structures generalizes calculate_stats for
all combinations of sources of data (e.g., reference-reference, engine-engine,
engine-reference). An additional benefit is that now, from the signature, it
is clearer to users how to organize the two sets of data to be compared.
Previously, engine data was implicitly generated using the Solution objects in
each BenchmarkEntry. As mentioned previously, this now removes the logic for
retrieving solute and solution properties.

Signed-off-by: Ugochukwu Nwosu <ugognw@gmail.com>
Signed-off-by: Ugochukwu Nwosu <ugognw@gmail.com>
Signed-off-by: Ugochukwu Nwosu <ugognw@gmail.com>
Signed-off-by: Ugochukwu Nwosu <ugognw@gmail.com>
Signed-off-by: Ugochukwu Nwosu <ugognw@gmail.com>
Signed-off-by: Ugochukwu Nwosu <ugognw@gmail.com>
FormulaDict was initially used in order to automatically standardize how solutes
appear in solution compositions and solute property lists. However, in addition
to standardizing ion formulas, FormulaDict also sorts its internal copy of the
dictionary by its values. This led to an error in the previous implementation
because the values are dictionaries, and dictionaries don't implement the required
methods for comparison with '<' and '>'.

Instead, we manually call standardize_formula in the public functions where
malformed formulas are likely: when loading datasets and when
creating a BenchmarkEntry from a Solution.

Bugs: initialize property data as dictionaries in create_entry; fix indices and
data type in _create_engine_dataset; initialize data pair lists in calculate_stats.

Signed-off-by: Ugochukwu Nwosu <ugognw@gmail.com>
Signed-off-by: Ugochukwu Nwosu <ugognw@gmail.com>
The concentration for Solution objects that are created in _create_engine_dataset
is determined from the SolutionKey. Without units, the amounts are not interpreted
as concentrations and the Solution does not have the correct composition.

Signed-off-by: Ugochukwu Nwosu <ugognw@gmail.com>
The three load_dataset unit tests are just outlines and are expected to fail.
The explicit test cases for test_should_benchmark_all_engines are intended to
eventually overlap with those in test_conductivity and eventually replace them.

Signed-off-by: Ugochukwu Nwosu <ugognw@gmail.com>
Signed-off-by: Ugochukwu Nwosu <ugognw@gmail.com>
The number of moles of solvent depends on the engine, but in the capacity
that SolutionKey will be used, two Solutions with the same solute
concentrations but differing engines (and thus differing moles of solvent)
should be considered identical if they are at the same state (temperature
and pressure).

Signed-off-by: Ugochukwu Nwosu <ugognw@gmail.com>
Signed-off-by: Ugochukwu Nwosu <ugognw@gmail.com>
Signed-off-by: Ugochukwu Nwosu <ugognw@gmail.com>
Signed-off-by: Ugochukwu Nwosu <ugognw@gmail.com>
Updated module docstring variable for consistency.

Require both _create_engine_dataset arguments.

For now, only require that the metric argument to calculate_stats can handle
Quantity arguments, so that we can ensure that units are handled correctly.
Note that the property values in BenchmarkEntry.solute_data and
BenchmarkEntry.solution_data are expected to be Quantity objects.

Also, the additional conditional statements in calculate_stats handle the
edge cases where the return values of _get_solute_property and
_get_solution_property are None and where no data in `calculated`
corresponds to `reference`. Because this may be of interest to the user, the
event is logged at the INFO level.

Signed-off-by: Ugochukwu Nwosu <ugognw@gmail.com>
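The skip-and-log behavior described above might look roughly like this (function and variable names are assumptions, not the actual calculate_stats internals):

```python
import logging

logger = logging.getLogger(__name__)

def pair_values(reference: dict, calculated: dict) -> list[tuple]:
    """Collect (calculated, reference) pairs, skipping missing data.

    Entries with no calculated counterpart are logged at INFO so the
    user can see which reference points were dropped from the stats.
    """
    pairs = []
    for name, ref_value in reference.items():
        calc_value = calculated.get(name)
        if calc_value is None:
            logger.info("No calculated value for %r; skipping", name)
            continue
        pairs.append((calc_value, ref_value))
    return pairs
```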
Signed-off-by: Ugochukwu Nwosu <ugognw@gmail.com>
Signed-off-by: Ugochukwu Nwosu <ugognw@gmail.com>
Signed-off-by: Ugochukwu Nwosu <ugognw@gmail.com>
@ugognw ugognw force-pushed the benchmarking-suite branch from fb3df4c to 82e06df Compare June 23, 2025 06:53
@rkingsbury
Member


Update: I will do my best to look at this next week. I have some proposals due at the end of this week that will keep me busy until then. Thanks for your patience!

