Open
Description
Currently, it is possible to construct a dataset where, even with alpha=1, a fixed or random column is chosen because of the penalties in the quadratic term. This contradicts the docstring, which claims:
alpha:
Hyperparameter between 0 and 1 that controls the relative weight of
the relevance and redundancy terms.
``alpha=0`` places no weight on the quality of the features,
therefore the features will be selected as to minimize the
redundancy without any consideration to quality.
``alpha=1`` places the maximum weight on the quality of the features,
and therefore will be equivalent to using
:class:`sklearn.feature_selection.SelectKBest`.
One solution is to multiply the quadratic (redundancy) terms by (1 - alpha) so that they are zeroed when alpha=1.
For instance, we could replace
dwave-scikit-learn-plugin/dwave/plugins/sklearn/transformers.py
Lines 210 to 212 in 7db7d03
# our objective
# we multiply by 2 because the matrix is symmetric
np.fill_diagonal(correlations, correlations[:, -1] * (-2 * alpha * num_features))
with
diag = np.array(correlations[:, -1] * (-2 * alpha * num_features), copy=True)
# scale the redundancy terms so that they vanish when alpha=1
correlations *= (1 - alpha)
# our objective
# we multiply by 2 because the matrix is symmetric
np.fill_diagonal(correlations, diag)
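To illustrate the effect of the proposed scaling, here is a minimal sketch on a toy matrix. It is not the plugin's code: `build_objective_matrix` and its `scale_redundancy` flag are hypothetical names introduced for this example, and the sketch assumes that `correlations` is a square symmetric matrix whose last row and column hold each feature's correlation with the target, and that only the feature-by-feature block ends up in the objective.

```python
import numpy as np


def build_objective_matrix(correlations, alpha, scale_redundancy):
    """Hypothetical helper mirroring the snippets in this issue.

    Assumes the last row/column of `correlations` holds each feature's
    correlation with the target.
    """
    corr = np.array(correlations, dtype=float, copy=True)
    num_features = corr.shape[0] - 1

    # linear (relevance) terms; factor 2 because the matrix is symmetric
    diag = corr[:, -1] * (-2 * alpha * num_features)

    if scale_redundancy:
        # proposed change: redundancy terms vanish when alpha == 1
        corr *= (1 - alpha)

    np.fill_diagonal(corr, diag)

    # assumption: only the feature-by-feature block enters the objective
    return corr[:-1, :-1]


# toy correlations for 3 features plus the target (last row/column)
toy = np.array([
    [1.0, 0.9, 0.1, 0.8],
    [0.9, 1.0, 0.2, 0.7],
    [0.1, 0.2, 1.0, 0.3],
    [0.8, 0.7, 0.3, 1.0],
])

current = build_objective_matrix(toy, alpha=1.0, scale_redundancy=False)
proposed = build_objective_matrix(toy, alpha=1.0, scale_redundancy=True)

off_diag = ~np.eye(3, dtype=bool)
print("current has quadratic terms at alpha=1:", np.any(current[off_diag] != 0))
print("proposed has no quadratic terms at alpha=1:", np.all(proposed[off_diag] == 0))
```

Under these assumptions, the linear terms are identical in both variants, and only the off-diagonal (redundancy) entries differ. With the scaling applied, the objective at alpha=1 depends on the relevance terms alone, which is what the docstring describes.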