Skip to content

Conversation

@XuehaiPan
Copy link
Contributor

@XuehaiPan XuehaiPan commented Dec 13, 2025

Description

Fixes #5926

For !defined(PYBIND11_HAS_SUBINTERPRETER_SUPPORT), the previous behavior is unchanged.

For defined(PYBIND11_HAS_SUBINTERPRETER_SUPPORT), the pybind11 internals is already a per-interpreter-dependent singleton stored in the interpreter's state dict. This PR adds a new map to the interpreter-dependent internals from the gil_safe_call_once_and_store's address to the result, where the gil_safe_call_once_and_store is a static object that has a fixed address.

UPDATE: The storage map is moved to a separate capsule in the per-interpreter state dict instead of putting it as a member of internals. Because the per-interpreter internals are leaked in the program, the storage map is never GC-ed on (sub-)interpreter shutdown.

Suggested changelog entry:

  • Add per-interpreter storage for gil_safe_call_once_and_store to make it safe under multi-interpreters.

@XuehaiPan XuehaiPan force-pushed the subinterp-call-once-and-store branch 3 times, most recently from 58834ac to a46973b Compare December 14, 2025 04:07
@XuehaiPan XuehaiPan force-pushed the subinterp-call-once-and-store branch from a46973b to e741760 Compare December 14, 2025 04:08
@XuehaiPan XuehaiPan marked this pull request as ready for review December 14, 2025 06:47
Comment on lines 599 to 608
// FIXME: We cannot use `get_num_interpreters_seen() > 1` here to create a fast path for
// the multi-interpreter case. The singleton may be initialized by a subinterpreter not the
// main interpreter.
//
// For multi-interpreter support, the subinterpreters can be initialized concurrently, and
// the first time this function may not be called in the main interpreter.
// For example, a clean main interpreter that does not import any pybind11 module and then
// spawns multiple subinterpreters using `InterpreterPoolExecutor` that each imports a
// pybind11 module concurrently.
if (get_num_interpreters_seen() > 1) {
Copy link
Contributor Author

@XuehaiPan XuehaiPan Dec 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @b-pass @rwgk for thoughts about this, see the code comment referenced for more details.

I test the patch in this PR in:

Most things works fine except test_import_in_subinterpreter_before_main:

run_in_subprocess(
    """
    with contextlib.closing(interpreters.create()) as subinterpreter:
        subinterpreter.exec('import optree')

    import optree
    """
)

If I remove the get_num_interpreters_seen() > 1 condition, my import test works but the cpptest in pybind11 CI breaks because internals_singleton_pp_ is never initialized. For instance, the memory address get_internals() should be a shared constant for differenet pybind11 modules in a single program.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

get_num_interpreters_seen() > 1 check is meant to keep the "there are no subinterpreters" case (the most common case) as fast as possible.

Can you help me understand the problem that concurrent subinterpreter imports is causing?

I'm not understanding what exactly the problem is... doing concurrent imports of pybind11 modules each in its own subinterpreter should still result in each subinterpreter getting its own internals (as it should). The one that gets to this code first would populate the internals_singleton_pp_, but this does not have to be the main interpreter, and the next time get_pp is called from that first subinterprter it will take the other (> 1) code path. Since get_num_interpreters_seen() never decreases (it is "seen" not "currently alive"), once it goes up to 2 the inner code path will be used and what is stored in the internals_singleton_pp_ doesn't matter for the rest of the program.

What's the purpose of get_pp_for_main_interpreter() and the associated once flag?

Copy link
Contributor Author

@XuehaiPan XuehaiPan Dec 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me elaborate more about my failed test. I have a C++ singleton instance using gil_safe_call_once_and_store to ensure exactly one instance per-interpreter. With the patch in this PR, the call-once result is stored in the internals (*get_pp()), where the internals is stored in the interpreter's state dict.

MyClass &get_singleton() {
    PYBIND11_CONSTINIT static py::gil_safe_call_once_and_store<MyClass> storage;
    return storage
        .call_once_and_store_result([]() -> MyClass {
            MyClass instance{};

            {
                // initialize
                ...
            }
            return instance;
        })
        .get_stored();
}
  • OK - test_import_in_subinterpreters_concurrently: import the module in multiple subinterpreters concurrently, without any import for a pybind11 module in the main interpreter.
  • OK - test_import_in_subinterpreter_after_main: import the module in the main interpreter, then import in a subinterpreter.
  • FAIL - test_import_in_subinterpreter_before_main: import the module in a subinterpreter (the main interpreter does not import any pybind11 module yet), then import the module in the main interpreter.

    The import succeeds in the subinterpreter, but the instance is ill-initialized in the main interpreter. If I remove the get_num_interpreters_seen() > 1 entirely or initialize the atomic counter for seen interpreters with 2 instead of 0, my test passes.


What's the purpose of get_pp_for_main_interpreter() and the associated once flag?

Just to ensure the singleton is initialized once. I'm not sure if the scoped GIL is still working when there are multiple interpreters running. I can revert the call_once approach back to scopded GIL approach.

@XuehaiPan XuehaiPan force-pushed the subinterp-call-once-and-store branch from 3a6ec0e to 0830872 Compare December 17, 2025 04:09
@XuehaiPan XuehaiPan force-pushed the subinterp-call-once-and-store branch from 0830872 to ac02a32 Compare December 17, 2025 04:11
@XuehaiPan
Copy link
Contributor Author

I have updated the approach to move the storage map as a separate capsule in the per-interpreter state dict instead of putting it in the internals. The PR description is also updated.

The CI tests are looking good so far. But I found some concurrency bugs for this patch in my downstream PR metaopt/optree#245. I'm not sure if it is coming from pybind11 or the Python core itself.

I will do further investigation about these.


Bug 1: Duplicate native enum registration.

ImportError: pybind11::native_enum<...>("PyTreeKind") is already registered!
Test Case
def test_import_in_subinterpreter_before_main():
    # OK
    check_script_in_subprocess(
        textwrap.dedent(
            """
            import contextlib
            import gc
            from concurrent import interpreters

            subinterpreter = None
            with contextlib.closing(interpreters.create()) as subinterpreter:
                subinterpreter.exec('import optree')

            import optree

            del optree, subinterpreter
            for _ in range(10):
                gc.collect()
            """,
        ).strip(),
        output='',
        rerun=NUM_FLAKY_RERUNS,
    )

    # FAIL
    check_script_in_subprocess(
        textwrap.dedent(
            f"""
            import contextlib
            import gc
            import random
            from concurrent import interpreters

            subinterpreter = subinterpreters = stack = None
            with contextlib.ExitStack() as stack:
                subinterpreters = [
                    stack.enter_context(contextlib.closing(interpreters.create()))
                    for _ in range({NUM_FUTURES})
                ]
                random.shuffle(subinterpreters)
                for subinterpreter in subinterpreters:
                    subinterpreter.exec('import optree')

            import optree

            del optree, subinterpreter, subinterpreters, stack
            for _ in range(10):
                gc.collect()
            """,
        ).strip(),
        output='',
        rerun=NUM_FLAKY_RERUNS,
    )

    # FAIL
    check_script_in_subprocess(
        textwrap.dedent(
            f"""
            import contextlib
            import gc
            import random
            from concurrent import interpreters

            subinterpreter = subinterpreters = stack = None
            with contextlib.ExitStack() as stack:
                subinterpreters = [
                    stack.enter_context(contextlib.closing(interpreters.create()))
                    for _ in range({NUM_FUTURES})
                ]
                random.shuffle(subinterpreters)
                for subinterpreter in subinterpreters:
                    subinterpreter.exec('import optree')

                import optree

            del optree, subinterpreter, subinterpreters, stack
            for _ in range(10):
                gc.collect()
            """,
        ).strip(),
        output='',
        rerun=NUM_FLAKY_RERUNS,
    )
Traceback
__________________________________________________________________ test_import_in_subinterpreter_before_main ___________________________________________________________________

    def test_import_in_subinterpreter_before_main():
        check_script_in_subprocess(
            textwrap.dedent(
                """
                import contextlib
                import gc
                from concurrent import interpreters
    
                subinterpreter = None
                with contextlib.closing(interpreters.create()) as subinterpreter:
                    subinterpreter.exec('import optree')
    
                import optree
    
                del optree, subinterpreter
                for _ in range(10):
                    gc.collect()
                """,
            ).strip(),
            output='',
            rerun=NUM_FLAKY_RERUNS,
        )
    
>       check_script_in_subprocess(
            textwrap.dedent(
                f"""
                import contextlib
                import gc
                import random
                from concurrent import interpreters
    
                subinterpreter = subinterpreters = stack = None
                with contextlib.ExitStack() as stack:
                    subinterpreters = [
                        stack.enter_context(contextlib.closing(interpreters.create()))
                        for _ in range({NUM_FUTURES})
                    ]
                    random.shuffle(subinterpreters)
                    for subinterpreter in subinterpreters:
                        subinterpreter.exec('import optree')
    
                import optree
    
                del optree, subinterpreter, subinterpreters, stack
                for _ in range(10):
                    gc.collect()
                """,
            ).strip(),
            output='',
            rerun=NUM_FLAKY_RERUNS,
        )


concurrent/test_subinterpreters.py:267: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

script = "import contextlib\nimport gc\nimport random\nfrom concurrent import interpreters\n\nsubinterpreter = subinterpreters ...optree')\n\nimport optree\n\ndel optree, subinterpreter, subinterpreters, stack\nfor _ in range(10):\n    gc.collect()"

    def check_script_in_subprocess(script, /, *, output, env=None, cwd=TEST_ROOT, rerun=1):
        result = ''
        for _ in range(rerun):
            try:
                result = subprocess.check_output(
                    [sys.executable, '-Walways', '-Werror', '-c', script],
                    stderr=subprocess.STDOUT,
                    text=True,
                    encoding='utf-8',
                    cwd=cwd,
                    env={
                        key: value
                        for key, value in (env if env is not None else os.environ).items()
                        if not key.startswith(('PYTHON', 'PYTEST', 'COV_'))
                    },
                )
            except subprocess.CalledProcessError as ex:
>               raise CalledProcessError(ex.returncode, ex.cmd, ex.output, ex.stderr) from None
E               helpers.CalledProcessError: Command '['/home/PanXuehai/Projects/optree/venv/bin/python3', '-Walways', '-Werror', '-c', "import contextlib\nimport gc\nimport random\nfrom concurrent import interpreters\n\nsubinterpreter = subinterpreters = stack = None\nwith contextlib.ExitStack() as stack:\n    subinterpreters = [\n        stack.enter_context(contextlib.closing(interpreters.create()))\n        for _ in range(16)\n    ]\n    random.shuffle(subinterpreters)\n    for subinterpreter in subinterpreters:\n        subinterpreter.exec('import optree')\n\nimport optree\n\ndel optree, subinterpreter, subinterpreters, stack\nfor _ in range(10):\n    gc.collect()"]' returned non-zero exit status 1.
E               Output:
E               Traceback (most recent call last):
E                 File "<string>", line 16, in <module>
E                   import optree
E                 File "/home/PanXuehai/Projects/optree/optree/__init__.py", line 17, in <module>
E                   from optree import accessors, dataclasses, functools, integrations, pytree, treespec, typing
E                 File "/home/PanXuehai/Projects/optree/optree/accessors.py", line 25, in <module>
E                   import optree._C as _C
E               ImportError: pybind11::native_enum<...>("PyTreeKind") is already registered!

_          = 0
cwd        = PosixPath('/home/PanXuehai/Projects/optree/tests')
env        = None
output     = ''
rerun      = 8
result     = ''
script     = "import contextlib\nimport gc\nimport random\nfrom concurrent import interpreters\n\nsubinterpreter = subinterpreters ...optree')\n\nimport optree\n\ndel optree, subinterpreter, subinterpreters, stack\nfor _ in range(10):\n    gc.collect()"

helpers.py:187: CalledProcessError

Bug 2: Thread-safety issues for multiple-interpreters are ever created.

  • Case 1: No multiple interpreters, only the main interpreter. Everything works fine.

    # Run tests separately
    make test PYTESTOPTS='-k "subinterpreter"'       # OK, no threading concurrency tests
    make test PYTESTOPTS='-k "not subinterpreter"'   # OK, the program only has the main interpreter
  • Case 2: Create multiple interpreters, and destroy them before doing threading concurrency tests (only the main interpreter is alive during the concurrency tests).

    The C++ objects are locked with critical sections, and the GIL is enabled during the test.

    # There is only the main interpreter alive during threading concurrency tests
    #
    #     ```python
    #     with contextlib.closing(concurrent.interpreters.create()) as subintrepreter:
    #         # do something
    #         ...
    #
    #     # at this point, only the main interpreter is alive
    #
    #     # threading tests
    #     ...
    #     ```
    
    # Run tests alltogather
    make test  # subinterpreter tests all passed, but threading concurrency tests produce incorrect results

@rwgk
Copy link
Collaborator

rwgk commented Dec 21, 2025

Quick update regarding Move Subinterpreter:

The test still hangs with these most-recent-as-of-now cpython commits (using this pybind11 PR, but Move Subinterpreter re-enabled):

  • cpython 3.14 362d11e5454 (2025-12-21T06:52:55Z)
  • cpython main 09044dd42b5 (2025-12-21T16:58:07Z)

Copy link
Collaborator

@rwgk rwgk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great to me, thanks for the work!

I'll wait for a review from @b-pass before merging.

@b-pass
Copy link
Contributor

b-pass commented Dec 21, 2025

Looks good to me.

I haven't had time to read the Cursor outputs in detail for the subinterpreter test, and I haven't so far been able to get a 3.14.2t install functioning to reproduce the problem, but I should be able to sort that out soon. So I'll continue to work on that failing test in #5940.

@XuehaiPan
Copy link
Contributor Author

I added three tests I mentioned in #5933 (comment), and they failed in CI. I can disable them for now if they are kind of out of the scope of this PR.

// This is only valid if there is a single interpreter. Otherwise, it is not used.
// WARNING: We cannot use thread local cache similar to `internals_pp_manager::internals_p_tls`
// because the thread local storage cannot be explicitly invalidated when interpreters
// are destroyed (unlike `internals_pp_manager` which has explicit hooks for that).
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If internals_pp_manager can do it, why can't we?

Should the map be a member of local_internals so we can benefit from the caching and finalization logic that already exists there? We can add things to local internals without bumping the internals version.

Copy link
Contributor Author

@XuehaiPan XuehaiPan Dec 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there are still some bugs with the internals_pp_manager. internals and local_internals are not reliable for now. There is still some work that needs to be done to ensure it works correctly under the multiple-interpreters case. So I added this warning comment, and maybe we can do some improvement in the future. The first priority for this PR is to make things work right.

For example, in the comment #5933 (comment) and the CI failure https://github.com/pybind/pybind11/actions/runs/20414341520?pr=5933, py::native_enum<...> in import mod complains that the enum type is already registered. However, the registration map is stored in internals and local_internals, which should be per-interpreter dependent.

if (detail::get_local_type_info(typeid(EnumType)) != nullptr
|| detail::get_global_type_info(typeid(EnumType)) != nullptr) {
pybind11_fail(
"pybind11::native_enum<...>(\"" + enum_name_encoded
+ "\") is already registered as a `pybind11::enum_` or `pybind11::class_`!");
}
if (detail::global_internals_native_enum_type_map_contains(enum_type_index)) {
pybind11_fail("pybind11::native_enum<...>(\"" + enum_name_encoded
+ "\") is already registered!");
}

This indicates there exist bugs in internals_pp_manager. The bugs are not related to gil_safe_call_once_and_store.

I suspect that the unref and destroy hooks for internal_pp_manager only work for interpreters created by the pybind11::subinterpreter API. The tls storage will be malformed when the interpreters are created/destroyed from the Python side. For example, using concurrent.interpreters and concurrent.futures.InterpreterPoolExecutor and internal_pp_manager is not aware that the interpreter is destroyed from the Python side.

cc @b-pass

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect that the unref and destroy hooks for internal_pp_manager only work for interpreters created by the pybind11::subinterpreter API. The tls storage will be malformed when the interpreters are created/destroyed from the Python side.

That does seem likely. Can we fix it for everyone, instead of working around it for this specific case while leaving a problem in other areas? I believe everything would work out if we do per-interpreter cleanup from the internals capsule destructor, since clearing the interpreter state dict is one of the last actions on interpreter shutdown.

call_once_storage<T> *value = nullptr;
{
gil_scoped_acquire gil_acq;
// Only one thread will enter here at a time.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't seem true if we're talking about a free-threading build of Python? In general the assumption in this code that gil_scoped_acquire enforces mutual exclusion is not valid on free-threaded Python.

I think get_or_create_call_once_storage_map() needs to be reworked to do something like a test-and-test-and-set, i.e.: If the dict entry doesn't exist, then create the map and use PyDict_SetDefault[Ref]; if setdefault returns a different map, use that and free the one you just created. You will also need a lock to synchronize updates to the map. It might be easier to use a Python dict (which comes with a built-in lock) instead of a std::unordered_map, with pointer-cast-to-integer keys and capsule values. The capsule destructor would finalize a single element instead of all of them.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will change to per-storage capsule approach rather than one capsule of a collection of storages.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment here still says "only one thread will enter at a time" which is not true.

rwgk added a commit to rwgk/pybind11 that referenced this pull request Dec 22, 2025
This commit improves the C++ test infrastructure to ensure test output
is visible in CI logs, and disables a test that hangs on free-threaded
Python 3.14+.

Changes:

## CI/test infrastructure improvements

- .github/workflows: Added `timeout-minutes: 3` to all C++ test steps
  to prevent indefinite hangs.

- tests/**/CMakeLists.txt: Added `USES_TERMINAL` to C++ test targets
  (cpptest, test_cross_module_rtti, test_pure_cpp) to ensure output is
  shown immediately rather than buffered and possibly lost on crash/timeout.

- tests/test_with_catch/catch.cpp: Added a custom Catch2 progress reporter
  with timestamps, Python version info, and a SIGTERM handler to make test
  execution and failures clearly visible in CI logs.

## Disabled hanging test

- The "Move Subinterpreter" test is disabled on free-threaded Python 3.14+
  due to a hang in Py_EndInterpreter() when the subinterpreter is destroyed
  from a different thread than it was created on. Work on fixing the
  underlying issue will continue under PR pybind#5940.

Context: We were in the dark for months (since we started testing with
Python 3.14t) because CI logs gave no clue about the root cause of hangs.
This led to ignoring intermittent hangs (mostly on macOS). Our hand was
forced only with the Python 3.14.1 release, when hangs became predictable
on all platforms.

For the full development history of these changes, see PR pybind#5933.
@rwgk
Copy link
Collaborator

rwgk commented Dec 22, 2025

After seeing @oremanj's comments, I decided to break out #5942, so we get rid of the 90 minute hangs asap, and keep this PR focused.

I'll merge #5942 when I see that the CI is green there.

@XuehaiPan there will be some conflicts because I added a couple commits under #5942 that are not in this PR (mainly to report SKIPPING the Move Interpreter test). To resolve them, simply git checkout the affected files from then master.

@XuehaiPan
Copy link
Contributor Author

there will be some conflicts because I added a couple commits under #5942 that are not in this PR

There are no merge conflicts at this time. PR #5942 does not involve multiple-interpreters tests.

@rwgk
Copy link
Collaborator

rwgk commented Dec 22, 2025

there will be some conflicts because I added a couple commits under #5942 that are not in this PR

There are no merge conflicts at this time. PR #5942 does not involve multiple-interpreters tests.

I haven't merged yet; the CI for 5942 is still running.

This PR currently overlaps with 5942. But after 5942 is merged into master, and you update this PR by merging master, the overlap will be gone, and this PR will be much more focused (smaller) again.

rwgk added a commit that referenced this pull request Dec 22, 2025
…p hanging Move Subinterpreter test (#5942)

* Improve C++ test infrastructure and disable hanging test

This commit improves the C++ test infrastructure to ensure test output
is visible in CI logs, and disables a test that hangs on free-threaded
Python 3.14+.

Changes:

## CI/test infrastructure improvements

- .github/workflows: Added `timeout-minutes: 3` to all C++ test steps
  to prevent indefinite hangs.

- tests/**/CMakeLists.txt: Added `USES_TERMINAL` to C++ test targets
  (cpptest, test_cross_module_rtti, test_pure_cpp) to ensure output is
  shown immediately rather than buffered and possibly lost on crash/timeout.

- tests/test_with_catch/catch.cpp: Added a custom Catch2 progress reporter
  with timestamps, Python version info, and a SIGTERM handler to make test
  execution and failures clearly visible in CI logs.

## Disabled hanging test

- The "Move Subinterpreter" test is disabled on free-threaded Python 3.14+
  due to a hang in Py_EndInterpreter() when the subinterpreter is destroyed
  from a different thread than it was created on. Work on fixing the
  underlying issue will continue under PR #5940.

Context: We were in the dark for months (since we started testing with
Python 3.14t) because CI logs gave no clue about the root cause of hangs.
This led to ignoring intermittent hangs (mostly on macOS). Our hand was
forced only with the Python 3.14.1 release, when hangs became predictable
on all platforms.

For the full development history of these changes, see PR #5933.

* Add test summary to progress reporter

Print the total number of test cases and assertions at the end of the
test run, making it easy to spot if tests are disabled or added.

Example output:
  [  PASSED  ] 20 test cases, 1589 assertions.

* Add PYBIND11_CATCH2_SKIP_IF macro to skip tests at runtime

Catch2 v2 doesn't have native skip support (v3 does with SKIP()).
This macro allows tests to be skipped with a visible message while
still appearing in the test list.

Use this for the Move Subinterpreter test on free-threaded Python 3.14+
so it shows as skipped rather than being conditionally compiled out.

Example output:
  [ RUN      ] Move Subinterpreter
  [ SKIPPED ] Skipped on free-threaded Python 3.14+ (see PR #5940)
  [       OK ] Move Subinterpreter

* Fix clang-tidy bugprone-macro-parentheses warning in PYBIND11_CATCH2_SKIP_IF
@rwgk
Copy link
Collaborator

rwgk commented Dec 22, 2025

@XuehaiPan I just merged #5942 (the CI was green). There is now a conflict showing here, as expected. Could you please git merge master here, to clean up this PR? Just copy tests/test_with_catch/catch.cpp from master to resolve the conflict.

@rwgk
Copy link
Collaborator

rwgk commented Dec 22, 2025

Quick note: I just removed the "piggy-backed" section from the PR description.

call_once_storage<T> *value = nullptr;
{
gil_scoped_acquire gil_acq;
// Only one thread will enter here at a time.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment here still says "only one thread will enter at a time" which is not true.

{
// Disable the destructor of our unused capsule to prevent double-free:
// unique_ptr will clean up the storage on function exit, and the capsule should not.
if (PyCapsule_SetDestructor(new_capsule.ptr(), nullptr) != 0) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be simpler (since it can't fail) to release the unique_ptr here, effectively passing ownership to the new capsule.

"(clear destructor of unused capsule)");
throw error_already_set();
}
// Get the storage pointer from the existing capsule.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This duplicates the block of code starting at line 264; can you rearrange so it only has to be written once? Something like:

  • try to get capsule from dict
  • if unsuccessful, try to add capsule to dict, or obtain the concurrently-added one
  • regardless of which of the above paths is taken, wind up with a reference to the right capsule object; pull the pointer from it and return

I think that would simplify the control flow here a good deal. You could also treat a newly created capsule as taking ownership of the storage as soon as it has been successfully created, so you don't need to switch between returning the unique_ptr.release() on one path vs the capsule pointer on another.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[BUG]: pybind11::gil_safe_call_once_and_store is not safe for Python subinterpreters

4 participants