Conversation

Collaborator

@jmafoster1 jmafoster1 commented Dec 12, 2025

Closes #245. As pointed out in #370, the CausalSpecification class is redundant. It only took me half an hour to remove it entirely, so I thought I'd submit a draft PR so we have something to compare #370 to.

EDIT: Following on from our meeting this morning, I have decided to go with this PR rather than #370 as it is truer to our original vision for the CTF. I have also updated the tutorials to reflect this. I checked through the documentation, but I don't think there's anything to update. There is a "causal specification" page, but it just talks about the modelling scenario and the DAG, which are still relevant. There is no mention of a wrapper class or even really how the two elements are handled specifically in the CTF - it's just high-level background - so I think it's fine (and indeed necessary) to leave as is.


github-actions bot commented Dec 12, 2025

🦙 MegaLinter status: ✅ SUCCESS

Descriptor Linter Files Fixed Errors Elapsed time
✅ PYTHON black 32 0 0.84s
✅ PYTHON pylint 32 0 5.13s

See detailed report in MegaLinter reports


@jmafoster1 jmafoster1 mentioned this pull request Dec 12, 2025

codecov bot commented Dec 12, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 97.29%. Comparing base (045246d) to head (5aa34ac).
⚠️ Report is 1 commit behind head on main.

Additional details and impacted files


@@            Coverage Diff             @@
##             main     #371      +/-   ##
==========================================
+ Coverage   96.64%   97.29%   +0.64%     
==========================================
  Files          28       27       -1     
  Lines        1608     1550      -58     
==========================================
- Hits         1554     1508      -46     
+ Misses         54       42      -12     
Files with missing lines Coverage Δ
causal_testing/estimation/ipcw_estimator.py 99.26% <ø> (ø)
causal_testing/main.py 100.00% <100.00%> (ø)
causal_testing/specification/causal_dag.py 98.25% <100.00%> (-0.64%) ⬇️
causal_testing/specification/scenario.py 100.00% <100.00%> (+31.25%) ⬆️
...sal_testing/surrogate/causal_surrogate_assisted.py 100.00% <100.00%> (ø)
...l_testing/surrogate/surrogate_search_algorithms.py 98.50% <100.00%> (ø)
causal_testing/testing/causal_effect.py 98.55% <ø> (ø)
causal_testing/testing/causal_test_case.py 100.00% <ø> (ø)
causal_testing/testing/metamorphic_relation.py 100.00% <100.00%> (ø)

... and 1 file with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d274678...5aa34ac.


@jmafoster1 jmafoster1 marked this pull request as ready for review December 12, 2025 13:04
@jmafoster1 jmafoster1 requested a review from f-allian December 12, 2025 13:13
Comment on lines +30 to +32
- name: Register Jupyter Kernel
run: |
python -m ipykernel install --user --name python3
Collaborator

@f-allian f-allian Dec 12, 2025


@jmafoster1 What do we need this for? Is it for the Jupyter Notebook test?

Collaborator Author

@jmafoster1 jmafoster1 Dec 12, 2025


Otherwise it says "no kernel found" when you try to run the notebooks

Collaborator

@f-allian f-allian left a comment


  1. The two images generated by this script should also be updated: images/schematic.tex
  2. References to the auto-generated causal specification API need to be removed, as they're causing the docs build to fail (see image below).

[image]

  3. We also need to remove other instances of causal specification in the docs, otherwise it's just confusing to reference it, especially as it's not used anywhere in the codebase. A quick search shows the following files:
  • Module documentation: docs/source/modules/causal_specification.rst
  • Causal specification page: causal_specification.rst:1-90
  • Tutorial notebook: step-by-step creation in the Poisson line process tutorial, poisson_line_process_tutorial.ipynb:345-347
  • Background documentation: referenced in the framework overview, background.rst:89-90
  • Glossary: definition entry, glossary.rst:20-22
  • Causal testing module: referenced as a key ingredient, causal_testing.rst:6-8
  4. The tests/tutorial_tests/test_tutorial.py script is also showing a couple of Jupyter/Windows-specific errors (see image below).

[image]

Adding the following to the very top of this script suppresses those warnings:

import os
os.environ["JUPYTER_PLATFORM_DIRS"] = "1"

import sys
import warnings
import asyncio

warnings.filterwarnings(
    "ignore",
    category=DeprecationWarning,
    message=r"Jupyter is migrating.*"
)

if sys.platform.startswith("win"):
    asyncio.set_event_loop_policy(asyncio.WindowsSelectorEventLoopPolicy())

import pathlib
import pytest
import nbformat
from nbclient.client import NotebookClient

NOTEBOOK_DIR = pathlib.Path(__file__).parent.parent.parent / "docs" / "source" / "tutorials"
NOTEBOOK_FILES = list(NOTEBOOK_DIR.rglob("[!.]*/*.ipynb"))

.... rest of the tests below
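As a side note on the discovery line above: `rglob` prefixes the pattern with `**/`, so `"[!.]*/*.ipynb"` only matches notebooks whose immediate parent directory does not start with a dot (e.g. it skips `.ipynb_checkpoints`). A self-contained sketch against a temporary directory (hypothetical file names, not the real docs tree):

```python
import pathlib
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    root = pathlib.Path(tmp)

    # A visible tutorials directory containing a notebook...
    (root / "tutorials").mkdir()
    (root / "tutorials" / "example.ipynb").write_text("{}")

    # ...and a hidden checkpoints directory that should be skipped.
    (root / ".ipynb_checkpoints").mkdir()
    (root / ".ipynb_checkpoints" / "example.ipynb").write_text("{}")

    # rglob turns the pattern into "**/[!.]*/*.ipynb", so "[!.]*" must match
    # the immediate parent directory, excluding names starting with ".".
    matches = sorted(
        p.relative_to(root).as_posix() for p in root.rglob("[!.]*/*.ipynb")
    )
    print(matches)  # → ['tutorials/example.ipynb']
```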

@jmafoster1 jmafoster1 force-pushed the jmafoster1/remove-causal-spec branch from f982b60 to 6de85f3 on December 12, 2025 15:27
@jmafoster1
Collaborator Author

I've done everything in here except schematic.tex. For me, I think this is still valid, since the causal specification is more of a conceptual construct than a programmatic one - any testing framework must have a specification, as it's how the expected behaviour is specified. Since, even in the paper, it's just a pair, it wouldn't typically be worthy of its own class in an implementation anyway.

@jmafoster1 jmafoster1 requested a review from f-allian December 12, 2025 16:33
@f-allian
Collaborator

@jmafoster1 Thanks Michael, looks good.

I've done everything in here except schematic.tex. For me, I think this is still valid, since the causal specification is more of a conceptual construct than a programmatic one - any testing framework must have a specification, as it's how the expected behaviour is specified. Since, even in the paper, it's just a pair, it wouldn't typically be worthy of its own class in an implementation anyway.

I'm against keeping the specification in the diagram though. It doesn't make sense to have a user-facing workflow diagram that includes a component that doesn't exist anywhere in the codebase. Almost all users won't have to know or interact with the concept of a causal specification, so it'll be confusing. But if they do for whatever reason, we could always just point them to the relevant papers either in the tutorials or README. - e.g. if a new user that's unfamiliar with the codebase looks at this diagram and asks 'how do I create a causal specification?' what do we tell them?

If you're really against removing it, a compromise could be to add a caption in the diagram to highlight that the causal specification is implicit and conceptual or something along those lines.

Maybe @neilwalkinshaw has some thoughts?

@jmafoster1
Collaborator Author

Could put a dotted line around it? From a technical perspective, I think it's nice to indicate what forms the specification. From a personal perspective, removing it will take ages because TikZ.

@jmafoster1
Collaborator Author

@f-allian I redrew the schematic diagram. I think this is a more accurate encoding of the workflow. The modelling scenario (often implicit) feeds into the causal test cases (generally through the expected behaviour), statistical estimand (the modelling scenario may explicitly adjust for certain variables), and the causal estimate (through filtering of the data). What do you think?

@f-allian
Collaborator

@jmafoster1 I think we need to make sure the workflow represents your current main.py. At the moment, the diagram makes it look like the causal test case and modelling scenario are unrelated to the input DAG/test data and are being used in parallel. Something like this makes the most sense to me (either all flowing from left to right, or all flowing from top to bottom):

  1. DAG + Test Data -> Modelling Scenario
  2. DAG + Modelling Scenario + Test Data -> Causal Test Case
  3. Causal Test Case -> Statistical Estimand
  4. Statistical Estimand -> Causal Estimate
  5. Causal Estimate -> Test Oracle
  6. Test Oracle -> Test Outcomes

Also, if possible, a legend of some sort would be very useful, e.g. the DAG and test data are "Inputs", the modelling scenario + causal test case + statistical estimand + causal estimate could be "Causal Testing", the test oracle as "Verification" and finally the test outcome is the "Output" and could show "Pass" and "Fail" below the emojis for clarity. The legend labels could be in different colours too to help accessibility, as our diagram is very B+W at the moment.

Could you also please make the arrow-heads slightly larger so they're more visible?

@jmafoster1
Collaborator Author

I think we diverge slightly on our goals for the diagram. My original idea was to give a very high level conceptual mapping between the various components of the CTF and Figure 1 of this paper. Since we operate in a very different context to traditional software testing, I think it's good to show which components form the specification, tests, oracle, etc., how they feed into each other, and what the user needs to provide. This is why I wanted to keep the "causal specification" box even though we got rid of the actual class for this. In this way, all the "root nodes" represent inputs from the user (although we could possibly have DAG --> test case as an optional arrow) and everything else is performed automatically by the CTF.

Your proposal is perhaps a more accurate representation of how main interacts with the various components, but I think this operates at a more technical level and obscures the conceptual mapping somewhat. I'm also not sure about your edge 2 since you can formulate the DAG, data, and test case in any order (e.g. collect the test data and then form causal tests around this, or formulate your test cases and then collect data to evaluate them). I wouldn't want the diagram to imply that there's a single "correct" way to do it as this may give people a false impression that the CTF is only useful for one scenario. For edge 3, I think we need to have "DAG + Test case --> Estimand" since this is the identification step. We need to know the treatment and outcome variables (from the test case) and also the causal structure of the system (from the DAG) in order to do this.


Development

Successfully merging this pull request may close these issues.

Causal Specification Refactor?

3 participants