It's too easy to apply invalid selections to an output

When applying selections to an output using `epymorph.tools.data.munge()` or any of the tools that use it like `out.plot.line()`, a validation step checks that the selections refer to the same set of objects as were used to create the output in the first place. For instance, it would be invalid to perform a simulation using an SIR model and then try to plot its output by selecting the H compartment from an SIRH model.

At present this validation relies on object identity, so selections are only valid if they were created from the exact same GeoScope/TimeFrame/CompartmentModel object that was used to produce the output.
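
To make the identity requirement concrete, here is a minimal sketch that uses stand-in classes rather than epymorph's real types. `FakeScope`, `FakeSelection`, and `validate` are hypothetical names invented for illustration, and the real validation logic may differ in detail. The point is only that two objects with equal contents still fail an identity check.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FakeScope:
    """Hypothetical stand-in for a GeoScope; not epymorph's real class."""

    states: tuple[str, ...]
    year: int


@dataclass(frozen=True)
class FakeSelection:
    """Hypothetical stand-in for a selection that remembers its source object."""

    source: FakeScope


def validate(selection: FakeSelection, output_scope: FakeScope) -> bool:
    # Identity check: passes only when both names refer to the very same object.
    return selection.source is output_scope


scope_a = FakeScope(("AZ",), 2020)
scope_b = FakeScope(("AZ",), 2020)  # same contents, but a different object

same_object_ok = validate(FakeSelection(scope_a), scope_a)  # selection from scope_a
equal_copy_ok = validate(FakeSelection(scope_b), scope_a)  # selection from scope_b
contents_equal = scope_a == scope_b  # dataclass equality compares field values
```

Here `contents_equal` is true while `equal_copy_ok` is false, which is the situation a user hits when a scope is rebuilt with the same arguments.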

In itself this isn't a problem; however, the current API makes it very easy to get this wrong. For example, the following will fail during the plotting step:

```python
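# Imports assumed here, following epymorph's usual usage pattern.
import numpy as np
from epymorph.kit import *
from epymorph.adrio import acs5

stored_sim_results = []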
for frac in [0.01, 0.025, 0.05]:
    rume = SingleStrataRUME.build(
        ipm=ipm.SIRS(),
        mm=mm.No(),
        scope=StateScope.in_states(["AZ"], year=2020),
        time_frame=TimeFrame.rangex("2020-01-01", "2021-01-01"),
        init=init.Proportional(ratios=np.array([1 - frac, frac, 0])),
        params={
            "beta": 0.4,
            "gamma": 1 / 4,
            "xi": 1 / 120,
            "population": acs5.Population(),
        },
    )
    sim = BasicSimulator(rume)
    out = sim.run()
    stored_sim_results.append(out)

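# NOTE: `rume` below is whatever the first loop left behind (its last iteration),
# not necessarily the RUME that produced `out`.
for out in stored_sim_results: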
    out.plot.line(
        geo=rume.scope.select.all(),
        time=rume.time_frame.select.all(),
        quantity=rume.ipm.select.events("S->I"),
    )
```

It fails with an error like:

```
ValueError: When applying a geo selection, please create that selection from the same scope you are applying it to.
```

In this case, the selections used in the plotting loop (like `rume.scope.select.all()`) use the value of `rume` from the last iteration of the first loop. That's not the same RUME that produced the earlier outputs, and so `rume.scope` is not the same object as `out.rume.scope`. I'll note that this is a very easy trap to fall into as code evolves over time -- the above would have worked just fine if there were no loop, or only one value to loop over. It would be quite surprising for the addition of a loop to cause this kind of issue. It's also not easy to debug at a glance (it doesn't _look_ obviously wrong), and the error produced isn't very helpful.
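
The trap can be reproduced without epymorph at all. The sketch below uses hypothetical stand-ins (`FakeRume`, `FakeOutput`, and its `check` method are invented for illustration) to show the stale loop variable failing an identity check, and the same loop passing once each selection is derived from the output's own RUME.

```python
class FakeRume:
    """Hypothetical stand-in for a RUME; each instance owns a fresh scope object."""

    def __init__(self) -> None:
        self.scope = object()


class FakeOutput:
    """Hypothetical stand-in for an output that remembers the RUME that made it."""

    def __init__(self, rume: FakeRume) -> None:
        self.rume = rume

    def check(self, scope: object) -> None:
        # Identity-based validation, mirroring the behavior described above.
        if scope is not self.rume.scope:
            raise ValueError("selection was created from a different scope")


outputs = []
for _ in range(3):
    rume = FakeRume()  # a new RUME (and scope) on every iteration
    outputs.append(FakeOutput(rume))

# The trap: `rume` still refers to the RUME from the last iteration.
failures = 0
for out in outputs:
    try:
        out.check(rume.scope)
    except ValueError:
        failures += 1

# The workaround: take the scope from the output's own RUME.
for out in outputs:
    out.check(out.rume.scope)  # never raises
```

Applied to the example above, the workaround would be to build each selection from `out.rume` inside the plotting loop, for instance `out.rume.scope.select.all()` in place of `rume.scope.select.all()`, and likewise for the time frame and IPM.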

In essence we're requiring the user to keep track of which selections can be used with which outputs, and to be careful not to mix them up. It's possible to envision a redesign of this interface that relieves the user of this burden, but we'd have to carefully consider the trade-offs.
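
As one illustration of what such a redesign might look like (purely a sketch, not a proposal for the actual API), the output could accept a function that builds the selection from the output's own scope, so that the user never handles the scope object directly. Every name here (`FakeScope`, `FakeSelection`, `FakeOutput`, `select_all`, `nodes_for`) is hypothetical.

```python
from typing import Callable


class FakeSelection:
    """Hypothetical selection that records the scope it was created from."""

    def __init__(self, source: "FakeScope", nodes: list[str]) -> None:
        self.source = source
        self.nodes = nodes


class FakeScope:
    """Hypothetical stand-in for a GeoScope."""

    def __init__(self, nodes: list[str]) -> None:
        self.nodes = list(nodes)

    def select_all(self) -> FakeSelection:
        return FakeSelection(self, list(self.nodes))


class FakeOutput:
    """Hypothetical output that resolves selections against its own scope."""

    def __init__(self, scope: FakeScope) -> None:
        self.scope = scope

    def nodes_for(self, geo: Callable[[FakeScope], FakeSelection]) -> list[str]:
        # The callable receives this output's scope, so the resulting
        # selection passes the identity check by construction.
        selection = geo(self.scope)
        if selection.source is not self.scope:
            raise ValueError("selection was created from a different scope")
        return selection.nodes


outputs = [FakeOutput(FakeScope(["AZ"])) for _ in range(3)]
selected = [out.nodes_for(geo=lambda scope: scope.select_all()) for out in outputs]
```

One trade-off of this shape is that selections become less convenient to create once and pass around as plain values, which is part of what would need weighing.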

It's too easy to apply invalid selections to an output #232

Sub-issues

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

It's too easy to apply invalid selections to an output #232

Description

Sub-issues

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions