[CLI] Move native file locking into workers #2997

brandonpayton · 2025-12-08T22:05:26Z

Motivation for the change, related issues

To fix native file locking on Windows, we are moving native file locking into workers.

More details coming...

Implementation details

TBD

Testing Instructions (or ideally a Blueprint)

TBD

brandonpayton · 2025-12-09T05:33:28Z

We can do this with a FileLockManagerForPosix and a FileLockManagerForWindows and fall back to the FileLockManagerForNode if the native locking API is not available.

I'm working on the implementation for both. The main things that require care are:

Treat zero-length ranges as extending to the end of the file
Release locks when a related file descriptor is closed
Release locks when the process exits (or in this case, when a PHP request is completed)
Implement fcntl() semantics
- Release fcntl() locks when any file descriptor for the locked file is closed by the locking process.
- Locked ranges can be unlocked or merged piece by piece. In contrast, Windows locking requires that an unlock range corresponds exactly to the locked range.
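As a minimal sketch of the first item above, a zero-length request can be normalized into an open-ended range before any overlap checks run. The `ByteRange` type and both helper names are hypothetical and are not taken from this PR. Negative lengths, which fcntl() also permits, are rejected here for brevity.

```typescript
// Hypothetical types and helpers for illustration; not code from this PR.
interface ByteRange {
  start: number;
  end: number; // exclusive; Infinity means "through the end of the file"
}

// Convert a (start, length) request into a range, treating length 0 as
// "from start to the end of the file, however large it grows".
function normalizeRange(start: number, length: number): ByteRange {
  if (start < 0 || length < 0) {
    throw new RangeError("start and length must be non-negative");
  }
  return { start, end: length === 0 ? Infinity : start + length };
}

// Two half-open ranges overlap when each one starts before the other ends.
function rangesOverlap(a: ByteRange, b: ByteRange): boolean {
  return a.start < b.end && b.start < a.end;
}
```

With this shape, a zero-length lock starting at offset 10 conflicts with any later range, while adjacent ranges such as [0, 10) and [10, 15) do not conflict.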

For Posix, we can keep it fairly simple for fcntl() by keeping track of which files a process has locked via fcntl() and then unlocking the entire range via fcntl() when locks need to be released.
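The POSIX bookkeeping described above could look roughly like the sketch below. It is an assumption-laden illustration: the `Pid` and `Path` aliases, the `PosixLockTracker` class, and the `unlockWholeFile` callback (a stand-in for the real native fcntl() unlock call) are all hypothetical names.

```typescript
// Hypothetical aliases and class for illustration; not code from this PR.
type Pid = number;
type Path = string;

class PosixLockTracker {
  private lockedPathsByPid = new Map<Pid, Set<Path>>();
  private unlockWholeFile: (pid: Pid, path: Path) => void;

  // unlockWholeFile stands in for the native "unlock the entire range" call.
  constructor(unlockWholeFile: (pid: Pid, path: Path) => void) {
    this.unlockWholeFile = unlockWholeFile;
  }

  // Remember that this process holds at least one fcntl() lock on the file.
  recordLock(pid: Pid, path: Path): void {
    let paths = this.lockedPathsByPid.get(pid);
    if (!paths) {
      paths = new Set<Path>();
      this.lockedPathsByPid.set(pid, paths);
    }
    paths.add(path);
  }

  // fcntl() semantics: closing any descriptor for the file drops all of the
  // process's locks on that file.
  releaseOnFdClose(pid: Pid, path: Path): void {
    const paths = this.lockedPathsByPid.get(pid);
    if (!paths || !paths.has(path)) return;
    this.unlockWholeFile(pid, path);
    paths.delete(path);
    if (paths.size === 0) this.lockedPathsByPid.delete(pid);
  }

  // Process exit (or a completed PHP request) drops every lock it held.
  releaseOnProcessExit(pid: Pid): void {
    const paths = this.lockedPathsByPid.get(pid);
    if (!paths) return;
    paths.forEach((path) => this.unlockWholeFile(pid, path));
    this.lockedPathsByPid.delete(pid);
  }
}
```

Because only the set of locked files is tracked, not individual ranges, release always unlocks the whole file, which matches the "keep it fairly simple" approach described above.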

For Windows, implementing fcntl() semantics is more complicated. We'll have to maintain a collection of the ranges locked per file in order to be able to unlock those ranges. If a caller wants to unlock part of a range, we'll have to unlock the entire range and then obtain locks for the remaining portions of the original locked range. For shared locks, we can obtain the new ranges before releasing the original range, but for exclusive locks, we'll have to release the original range before attempting to obtain locks on the remaining ranges. (This is according to a Google answer about whether Windows allows overlapping exclusive locks by the same process.)
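The partial-unlock dance for Windows can be sketched as a pure planning function that computes which ranges remain and in what order to lock and unlock. All names here (`ByteRange`, `remainingAfterUnlock`, `planPartialUnlock`) are hypothetical, and the exclusive-lock ordering relies on the unverified claim above that Windows rejects overlapping exclusive locks from the same process.

```typescript
// Hypothetical types and helpers for illustration; not code from this PR.
interface ByteRange {
  start: number;
  end: number; // exclusive
}
type LockKind = "shared" | "exclusive";
interface Step {
  op: "lock" | "unlock";
  range: ByteRange;
}

// Portions of `locked` that must stay locked after `unlock` is removed.
function remainingAfterUnlock(locked: ByteRange, unlock: ByteRange): ByteRange[] {
  const remaining: ByteRange[] = [];
  if (unlock.start > locked.start) {
    remaining.push({ start: locked.start, end: Math.min(unlock.start, locked.end) });
  }
  if (unlock.end < locked.end) {
    remaining.push({ start: Math.max(unlock.end, locked.start), end: locked.end });
  }
  return remaining;
}

// Windows must unlock exactly the range it locked, so a partial unlock is
// planned as "unlock the original range" plus "lock what remains".
function planPartialUnlock(kind: LockKind, locked: ByteRange, unlock: ByteRange): Step[] {
  const relocks: Step[] = remainingAfterUnlock(locked, unlock).map(
    (range): Step => ({ op: "lock", range })
  );
  const release: Step = { op: "unlock", range: locked };
  // Shared: take the new locks first so the file is never left unprotected.
  // Exclusive: release first, accepting a brief unprotected window.
  return kind === "shared" ? relocks.concat([release]) : [release].concat(relocks);
}
```

For example, unlocking bytes 40 to 60 of an exclusive lock on 0 to 100 yields an unlock of the full range followed by locks on 0 to 40 and 60 to 100.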

The good news is that we are already tracking locked ranges in the FileLockManagerForNode. The work for the Windows locks shouldn't be that different.

cc @adamziel

brandonpayton · 2025-12-10T06:03:14Z

I roughed out native FileLockManagers for POSIX and Windows, but they are as yet untested. Tomorrow, I plan to start by adapting the native locking tests and testing these new classes.

adamziel · 2025-12-10T17:01:57Z

The pre-requisites for this one seem to be mostly in place. The CLI spawn handler now creates a new OS process for any spawned PHP subprocess. The request handler still uses multiple PHP instances, but can be tuned down by adding maxPhpInstances: 1, to every bootWordPress* call in worker-thread-v*.ts files – which we could do in this PR.

In #3014, I'm exploring a CI stress test to confirm multiple workers are indeed used for handling concurrent requests.

brandonpayton · 2025-12-10T21:20:52Z

I made the FileLockManager test suite declarations reusable so they could test FileLockManagerForPosix and FileLockManagerForWindows... except I forgot we need to run multiple workers to properly test the native locking, so I'm reworking the tests to do that.

It shouldn't be too hard. Without additional work, Vitest doesn't really support tests creating workers based on TS modules, but I think we might just require Node.js v22 for these tests and create workers using TS type stripping flags and our custom unbuilt module loader. The main thing we want to test is that all the file lock managers function properly across workers.

Note: I renamed FileLockManagerForNode to FileLockManagerInMemory because it no longer implements any native locking and completely relies on in-memory lock tracking.

adamziel · 2025-12-10T22:27:31Z

@brandonpayton we have a test that creates a new Worker; a search for new Worker should surface it. tl;dr it uses execArgv to strip types and use our custom loader. Or we could ditch vitest for that one as you've said, anything goes.

brandonpayton · 2025-12-10T23:49:06Z

> @brandonpayton we have a test that creates a new Worker; a search for new Worker should surface it. tl;dr it uses execArgv to strip types and use our custom loader.

@adamziel I remember something like that too but didn't find an example in files with *.spec.*. It might be in another kind of test... Regardless, this is what I locally did with one test for this PR. Now, it just needs to be used across tests. Comlink continues to be a huge help, especially the sync remote API stuff (because it doesn't require making a sync interface Promised).

> Or we could ditch vitest for that one as you've said, anything goes.

👍 thanks

brandonpayton · 2025-12-11T05:47:57Z

@adamziel My mental model here was incorrect. I'd been implementing the FileLockManagerForPosix and the FileLockManagerForWindows as if the individual workers were actually separate processes. If they were separate processes, native OS locking would naturally enforce locks between the different workers.

But workers from the Node.js worker_threads module are just threads. I think that has been clear to me in the past, so I'm not sure how I became mistaken in this case.

Because all workers share the same native OS process ID, the native OS locking facilities cannot protect the workers from themselves because all workers appear as part of the same process.

I think what this means is that the platform-specific FileLockManagers have to wrap a FileLockManagerInMemory that continues to be shared across workers. The platform-specific lock managers can still run within each worker, but in order to avoid corruption due to php-wasm workers within the same process, they'll have to coordinate with the FileLockManagerInMemory.

I think the following may work for each native lock manager:

Attempt to obtain a requested lock from the in-memory lock manager. If that fails, deny the lock.
If the in-memory lock is obtained, request a corresponding native lock. Wait until the lock is granted.
If the native lock is denied, release the corresponding in-memory lock.
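The three steps above can be sketched as a small function layered over two lock managers. This is an illustration under stated assumptions: the `LockRequest` and `LockLayer` shapes and the `acquireLayeredLock` name are hypothetical, and any blocking wait for the native lock is assumed to happen inside the native layer's `tryLock()`.

```typescript
// Hypothetical interfaces for illustration; not the PR's actual API.
interface LockRequest {
  path: string;
  start: number;
  length: number;
  exclusive: boolean;
}
interface LockLayer {
  tryLock(request: LockRequest): boolean;
  unlock(request: LockRequest): void;
}

// 1. Ask the shared in-memory manager first; deny if it refuses.
// 2. Ask the native layer for the corresponding OS lock.
// 3. If the native lock is denied (or throws), roll back the in-memory lock.
function acquireLayeredLock(
  inMemory: LockLayer,
  native: LockLayer,
  request: LockRequest
): boolean {
  if (!inMemory.tryLock(request)) {
    return false;
  }
  let granted = false;
  try {
    granted = native.tryLock(request);
  } finally {
    if (!granted) {
      inMemory.unlock(request);
    }
  }
  return granted;
}
```

The `finally` block keeps the two layers consistent even if the native call throws rather than returning a result.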

There are a bunch of fcntl() subtleties this does not respect (e.g., unlocking, upgrading, or downgrading only part of a locked range). But for the sake of SQLite, it looks like SQLite should only require adding and removing whole-and-exact ranges, as mentioned above. (The unixLock() function is implemented here.)

brandonpayton · 2025-12-11T05:52:18Z

NOTE: I tested lockFileExSync() from the fs-ext-extra-prebuilt package in Windows, and lockFileExSync() and unlockFileExSync() do not appear to have a return value, even though the types say they return a number. The functions do not return a number but rather throw errors to indicate failure.
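Given that failures surface as thrown errors rather than return values, callers may want a thin wrapper that turns a throwing call into an explicit result. The sketch below is generic on purpose: `LockResult` and `callThrowingLockApi` are hypothetical names, and the arguments to `lockFileExSync()` are not shown because its signature is not given in this thread.

```typescript
// Hypothetical helper for illustration; not code from this PR.
type LockResult = { ok: true } | { ok: false; error: unknown };

// Run a native call that signals failure by throwing and report the outcome
// as a value, ignoring whatever the call itself returns.
function callThrowingLockApi(nativeCall: () => unknown): LockResult {
  try {
    nativeCall();
    return { ok: true };
  } catch (error) {
    return { ok: false, error };
  }
}
```

A caller would pass a closure that invokes the native function, then branch on `ok` instead of inspecting a numeric return value.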

adamziel · 2025-12-11T11:31:48Z

@brandonpayton I'm not convinced going through a central file lock manager is the way to go. The native OS mechanics depend on separate PIDs for enforcing locks. Let's turn over every stone we have and try to find a way to use separate PIDs here before shifting direction. For example, could an alternative technique of spawning workers do the trick?

This came out of an LLM, it's untested but seems promising. Note fork does not actually use fork(2) but seems to spawn a separate process. Who picked that name?!

The child_process.fork() method is a special case of child_process.spawn()
Unlike the fork(2) POSIX system call, child_process.fork() does not clone the current process.

IPC messaging seems available on all OSes:

// parent.ts
import { fork, ChildProcess } from "node:child_process";
import { resolve } from "node:path";

const childPath = resolve(__dirname, "child.js"); // compiled JS

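// fork() takes the module path as its first argument and always runs it with
// the current Node.js executable, so process.argv[0] must not be passed here.
const child: ChildProcess = fork(childPath, process.argv.slice(2), {
  execArgv: process.execArgv, // reuse the parent's Node.js flags
  stdio: ["inherit", "inherit", "inherit", "ipc"], // "ipc" is the important part
});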

console.log("Parent PID:", process.pid, "Child PID:", child.pid);

child.on("message", (msg: unknown) => {
  console.log("Parent got message from child:", msg);
});

child.on("exit", (code) => {
  console.log("Child exited with code", code);
});

// send something to the child
child.send({ type: "hello", payload: "from parent" });

// child.ts

console.log("Child PID:", process.pid);

// type-safe-ish helper
type IPCProcess = NodeJS.Process & {
  send?: (message: unknown) => void;
};

const ipcProcess = process as IPCProcess;

process.on("message", (msg: unknown) => {
  console.log("Child got message from parent:", msg);

  // reply
  ipcProcess.send?.({
    type: "reply",
    payload: "got it",
  });
});

brandonpayton · 2025-12-11T13:59:14Z

@adamziel thanks for the encouragement. Separate processes are preferable, and I will explore that direction first.

I'd looked at the built-in child_process and cluster packages yesterday but landed on the shared lock manager. We discussed using the in-memory lock manager as a fallback in cases where native locking is unavailable. But what do you think about just requiring native locking now that we have prebuilt binaries for common platforms? Based on what you've done with prebuilds, would anyone with a less common platform still be able to build fs-ext-extra-prebuilt via node-gyp if they had the right tools?

adamziel · 2025-12-11T16:41:11Z

Yeah let's just require native locks 👍 we can always add more prebuilds if anyone struggles.

I am not sure if the IPC module supports synchronous communication. But even if it doesn't, we can mix techniques: for example, spin up a worker with which we can exchange messages synchronously, and then use that worker to communicate asynchronously with a process.

brandonpayton · 2025-12-11T17:12:10Z

@adamziel OK, cool, that simplifies things a bit. Independent worker processes will still have access to synchronous native file locking APIs used by the system call overrides.

Do you know of any other reason we need synchronous communication between independent worker processes and Playground CLI?

This currently seems like a really good direction to me.

One open question I have is what this change would mean for XDebug support. It actually might not matter at all since XDebug creates the network connection to the debugger, not the other way around.

@mho22, we are exploring moving php-wasm workers into separate processes for Playground CLI. Do you have any concerns about how that might affect XDebug support?

adamziel · 2025-12-11T17:30:27Z

As for other reasons for sync communication - that's mostly to avoid dealing with asyncify. I use it in an upcoming gethostbyname PR. Is it troublesome to keep around?

mho22 · 2025-12-11T17:48:25Z

@brandonpayton I don't think something will change for Xdebug.

brandonpayton · 2025-12-11T19:36:51Z

> As for other reasons for sync communication - that's mostly to avoid dealing with asyncify. I use it in an upcoming gethostbyname PR. Is it troublesome to keep around?

I don't think it will be troublesome. It seems like a good idea to make each php-wasm worker process look like this:

Single php-wasm worker process
- Main thread (setup and asynchronous operations)
  - Worker thread that runs php-wasm and can block, waiting for the main thread to perform async operations.

Even before considering blocking for gethostbyname(), it seemed like a good idea to keep the main thread unblocked since Node.js is designed around an async event loop.

I'm working on this today.

brandonpayton · 2025-12-12T06:38:05Z

@adamziel, as we discussed elsewhere, I'm testing the native locking approach via a multi-process unit test setup to validate this direction before doing the work to move Playground CLI from workers to child processes.

There is an initial comlink adapter working for IPC between parent and child Node.js processes. It is partially AI-generated and may need some work if we want to use it elsewhere, but it is good enough to run the tests. There are some failures for the POSIX file lock manager tests. I plan to fix those first to confirm the test set is good and then run the same test suite against the Windows file lock manager.

Planning to continue this in the morning.

brandonpayton · 2025-12-12T06:39:55Z

The relevant test files are under packages/php-wasm/node/src/test/.

brandonpayton · 2025-12-13T08:02:11Z

The POSIX file lock manager tests are running well locally. There seems to be a path-related issue breaking the Windows tests. Planning to work on this tomorrow and see how the file lock manager is looking on Windows.

WIP: Make separate FileLockManagers for native locking

f32a8c6

brandonpayton self-assigned this Dec 8, 2025

brandonpayton added the [Type] Bug, [Focus] Windows Support, [Package][@php-wasm] Node, and [Package][@wp-playground] CLI labels Dec 8, 2025

brandonpayton added 6 commits December 8, 2025 20:33

Rename releaseLocksForProcessFd to releaseLocksOnFdClose

f4b7bd1

Restore tests that were accidentally left commented out in project.json

aac607e

Fix type errors in file-lock-manager-for-node tests

a6a2d25

Skip native file locking tests because they'll be moved to POSIX and Windows FileLockManager tests

872e8a3

Fix posix manager class name

69302cf

Add tracking and cleanup for POSIX-native whole file locks

f4963d6

Implement release-on-close and release-on-exit for FileLockManagerForPosix

6023b7d

brandonpayton force-pushed the playground-cli/move-native-locking-into-workers branch from 60a6f0b to 6023b7d Compare December 9, 2025 18:10

brandonpayton added 8 commits December 9, 2025 23:40

Explain Path, Pid, and Fd types

d65feca

Use Path type instead of string in POSXI lock manager

8a02e76

Implement whole-file locking for Windows

2319613

Add POSIX lock manager TODO

d713beb

Cleanup relevant lock records on FD close for POSIX

47a4c1c

Implement whole-file lock cleanup on FD close and process exit

33d8d8f

Update POSIX lock manager to treat zero length ranges as covering entire remaining range

222273d

Implement initial fcntl() for Windows along with cleanup on FD close and process exit

00668c4

Address typechecking errors

9ab68d9

brandonpayton added 4 commits December 10, 2025 12:27

Update fs-ext-extra-prebuilt

7c22502

Use fcntlSync() with start/end params

677bca0

Declare tests for FileLockManager for Windows and POSIX

0d08953

Rename FileLockManagerForNode to FileLockManagerInMemory

f29e544

Add JSPI tests for POSIX and Windows file lock managers

68225e6

brandonpayton added 3 commits December 12, 2025 01:29

Move in-memory FileLockManager to php-wasm/universal and restore tests

965bbd2

Fix lockFileByteRange fcntl() lock type bug

7606626

Use multi-process tests for native FileLockManagers

cd35ff6

adamziel and others added 5 commits December 12, 2025 10:56

Merge branch 'php-wasm/node/testable-syscall-overrides' into playground-cli/move-native-locking-into-workers

50f8b30

Fixes for tests. Still a few broken tests remaining.

ce260d1

Merge branch 'php-wasm/node/testable-syscall-overrides' into playground-cli/move-native-locking-into-workers

0404fa1

FIx remaining failing tests

20371d1

Remove in-memory lock manager and fix other lint/type errors

b8612bd

4 participants