Test L0, L1, L2, and dropout regularization approaches #2

As per [issue #391 in policyengine-us-data](https://github.com/PolicyEngine/policyengine-us-data/issues/391).

This story can be closed when we feel good about the value of each of the L0 parameters, which were hastily set during its initial integration. There is a risk of optimizing on future target holdouts, but the algorithm it will be competing against has been tuned to the full set as well, so they should be on even footing. The idea is to get the L0 approach "working well" without explicit tuning, and then do holdout validation to get an (imperfect) idea of the out-of-sample performance.

Optionally, grab some ideas from the [L0 paper](https://arxiv.org/abs/1712.01312):

- It might be worth chatting with an AI about whether our current code faithfully implements this method.
- Section 4 lists some initial parameter values. In one case, the L0 parameter lambda was set to 0.001/N, where N is the number of records. I believe that's close to what is in the code now if N is taken to be the number of targets.
- Evaluate our dropout approach and how well it plays with L0. I think it's going to be contradictory, and we're going to want to remove it, but I don't know for sure.
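
As a reference point for checking our implementation, here is a minimal sketch of a hard concrete gate and the expected L0 penalty with lambda = 0.001/N. The constants `GAMMA`, `ZETA`, and `BETA` are the defaults suggested in the paper and are assumed here, not read from our code; the function names are hypothetical, and whether N should be records or targets is still the open question above.

```python
import math

# Stretch limits and temperature suggested in the L0 paper.
# Assumed values, not taken from our code.
GAMMA, ZETA, BETA = -0.1, 1.1, 2.0 / 3.0


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def hard_concrete_gate(log_alpha, u, beta=BETA, gamma=GAMMA, zeta=ZETA):
    """Gate value in [0, 1] given uniform noise u in (0, 1)."""
    s = sigmoid((math.log(u) - math.log(1.0 - u) + log_alpha) / beta)
    stretched = s * (zeta - gamma) + gamma  # stretch to (gamma, zeta)
    return min(1.0, max(0.0, stretched))    # clip back to [0, 1]


def prob_nonzero(log_alpha, beta=BETA, gamma=GAMMA, zeta=ZETA):
    """Probability that the gate is non-zero."""
    return sigmoid(log_alpha - beta * math.log(-gamma / zeta))


def expected_l0_penalty(log_alphas, n, scale=0.001):
    """lambda times the expected number of open gates, lambda = scale / n."""
    lam = scale / n
    return lam * sum(prob_nonzero(a) for a in log_alphas)
```

Comparing these three functions against the corresponding lines in our code would be a quick way to spot a sign or constant that drifted during the initial integration.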

The paper has a way to mix L2 regularization (closer to our dropout's goals) just on the non-zero parameters. How easy would that be to add?
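
To gauge the effort, here is a rough sketch of one way this could look: weight each squared parameter by the probability that its gate is open, so that closed gates contribute almost nothing to the L2 term. This follows the spirit of the paper's combined penalty but is not claimed to match its exact form; `gated_l2_penalty`, `lam_l2`, and the hard concrete constants are assumptions for illustration.

```python
import math

# Hard concrete constants suggested in the L0 paper (assumed, not from our code).
GAMMA, ZETA, BETA = -0.1, 1.1, 2.0 / 3.0


def prob_nonzero(log_alpha, beta=BETA, gamma=GAMMA, zeta=ZETA):
    """Probability that a hard concrete gate is non-zero."""
    shifted = log_alpha - beta * math.log(-gamma / zeta)
    return 1.0 / (1.0 + math.exp(-shifted))


def gated_l2_penalty(weights, log_alphas, lam_l2):
    """L2 penalty where each weight counts in proportion to P(gate != 0)."""
    return lam_l2 * sum(
        prob_nonzero(a) * w * w for w, a in zip(weights, log_alphas)
    )
```

If the gate probabilities are already computed for the L0 term, this adds only one extra weighted sum to the loss.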

Though not mentioned in the paper, "annealing" starts the temperature parameter at a higher value and lowers it in later epochs. It might be worth investigating this option.
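
A minimal sketch of such a schedule, assuming a geometric decay between two temperatures. The start value of 2.0 and the end value of 2/3 are placeholders chosen for illustration, and `annealed_temperature` is a hypothetical helper, not something in our code.

```python
def annealed_temperature(epoch, total_epochs, start=2.0, end=2.0 / 3.0):
    """Geometrically decay the temperature from start to end over training."""
    # Fraction of training completed, clamped to [0, 1].
    frac = epoch / max(total_epochs - 1, 1)
    frac = min(max(frac, 0.0), 1.0)
    return start * (end / start) ** frac
```

The training loop would call this once per epoch and pass the result to the gate in place of a fixed temperature.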

I noticed that the algorithm is still improving at 400 epochs, and I am curious how different the result would be if it were allowed to run longer. What if we decreased the learning rate at some point during the training?
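
One simple option is a step decay, sketched below. The base learning rate, the milestone epochs, and the decay factor are all placeholder values to experiment with, and `step_decay_lr` is a hypothetical helper.

```python
def step_decay_lr(epoch, base_lr, milestones, factor=0.1):
    """Multiply the learning rate by factor at each milestone epoch reached."""
    drops = sum(1 for m in milestones if epoch >= m)
    return base_lr * factor ** drops
```

For example, `step_decay_lr(epoch, 0.1, [400, 600])` keeps the rate at 0.1 through epoch 399 and cuts it tenfold at epochs 400 and 600. If the training loop is in PyTorch, `torch.optim.lr_scheduler.MultiStepLR` provides the same behavior.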

How sensitive is the algorithm to leaving out a set of targets? Do we need to set lambda dynamically?
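
A sketch of how a holdout experiment could be laid out, assuming lambda stays tied to the number of targets as 0.001/N. Under that assumption lambda rescales automatically whenever targets are held out, which is one form of setting it dynamically. The fold count, the seed, and the helper names are hypothetical.

```python
import random


def target_folds(n_targets, k, seed=0):
    """Shuffle target indices and split them into k roughly equal folds."""
    idx = list(range(n_targets))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def holdout_plan(n_targets, k, scale=0.001, seed=0):
    """For each fold, list the held-out targets and the rescaled lambda."""
    plan = []
    for held_out in target_folds(n_targets, k, seed):
        n_train = n_targets - len(held_out)
        plan.append({
            "held_out": sorted(held_out),
            "n_train": n_train,
            "lam": scale / n_train,  # lambda follows the training target count
        })
    return plan
```

Running the calibration once per entry and comparing error on the held-out targets against error on the training targets would give a first read on sensitivity.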
