tests: Unify LGBM test datasets

# Issue

_Originally posted by @AndreFCruz in https://github.com/feedzai/feedzai-openml-java/pull/116#discussion_r889007891_:

> It was generated with the following ipython notebook. Should we commit this?
[generate-fairgbm-sensitive-attribute.zip](https://github.com/feedzai/feedzai-openml-java/files/8833075/generate-fairgbm-sensitive-attribute.zip)

-----

# Request

The test datasets were modified in order to test fairgbm. The new column generated by André should be documented in https://raw.githubusercontent.com/feedzai/feedzai-openml-java/master/openml-lightgbm/lightgbm-provider/src/test/resources/test_data/stats.org, the file that describes how the features were generated.

Also, when @shenggwang introduced tests for the explanations/contributions, he added a whole new group of test sets derived from the initial test datasets that I had created, so we now have two similar but different collections of datasets.

I suggest unifying all of those test resources to avoid redundant test payloads in the repo. Sheng took the time to explain how to generate his test sets in Python, with code that can be re-run in the future, and André also used Python code to generate the updated test set. Given that, I suggest getting rid of the older datasets I generated (described in stats.org, which relies on Excel formulas instead), regenerating Sheng's datasets with the new Python code from André, and refactoring the tests to use the new test sets. The updated Python code info should also be added to the README.
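To make the "code that can be executed in the future" idea concrete, here is a minimal sketch of what a reproducible generation step could look like. It is not André's notebook and does not reflect how the actual column was built: the column name `sensitive_group`, the group labels, the seed, and both helper functions are hypothetical, chosen only for illustration.

```python
import csv
import io
import random


def add_sensitive_attribute(rows, column, groups, seed):
    """Return copies of `rows` with a randomly assigned group column appended.

    Hypothetical helper: the real notebook may derive the column differently.
    """
    rng = random.Random(seed)  # fixed seed so regenerated files are identical
    return [{**row, column: rng.choice(groups)} for row in rows]


def regenerate_csv(source_text, column="sensitive_group", groups=("A", "B"), seed=42):
    """Read CSV text, append the new column, and return the updated CSV text."""
    reader = csv.DictReader(io.StringIO(source_text))
    fieldnames = list(reader.fieldnames or []) + [column]
    updated = add_sensitive_attribute(list(reader), column, groups, seed)

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(updated)
    return out.getvalue()


if __name__ == "__main__":
    # Tiny made-up sample; the real test datasets have different columns.
    sample = "f1,label\n1.0,0\n2.5,1\n"
    print(regenerate_csv(sample))
```

Seeding the generator explicitly means that re-running the script yields byte-identical files, so regenerated test resources would not show up as spurious diffs in the repo.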

tests: Unify LGBM test datasets #128