docs: More accurate NPROC documentation #239

timmc-edx · 2025-03-31T17:18:07Z

No description provided.

MoisesGSalas

Hi @timmc-edx, sorry for the late response.

I left a question in my comment. I'm mostly thinking about running codejail in a Kubernetes cluster with multiple independent Open edX instances.

MoisesGSalas · 2025-04-02T16:08:35Z

README.rst

+  * The ``NPROC`` limit constrains the ability of the *current* process to
+    create new threads and processes, but the usage count (how many processes
+    already exist) is the sum across *all* processes with the same UID, even in
+    other containers on the same host where the UID may be mapped to a different
+    username. This constraint also applies to the app user due to how the
+    rlimits are applied. Even if UIDs are chosen so they aren't used by other
+    software on the host, multiple codejail sandbox processes on the same host
+    will share this usage pool and can reduce each other's ability to create
+    processes. In this situation, ``NPROC`` will need to be set higher than it
+    would be for a single codejail instance taking a single request at a time.


So if I'm getting this right: if the app user spawns multiple sandboxes (for example, the codejail service handling multiple requests), the process pool will be shared between them. Not only that, but the same pool will also be shared across different containers on the same host. Is that correct? If so, and one codejail instance is running alongside other instances, could setting NPROC to a low value cause it to always fail?

timmc-edx · Apr 2, 2025

That's correct, yes. It's a fundamental limitation of how rlimits operate. One option would be to ensure that your codejail pods are spread out over several hosts (using Kubernetes' anti-affinity mechanism). Also see the notes here on how to choose UIDs for the app and sandbox users: https://github.com/openedx/codejail-service/blob/main/docs/deployment.rst#app-user-uid

I think a longer-term solution would be to replace the current codejail mechanism with something that spins up a container per execution (giving better memory confinement) and that also uses systemd's virtual-user mechanism (which creates an ephemeral user with a randomized UID, for better NPROC isolation).
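
As a rough illustration of the shared usage pool discussed above, here is a small Python sketch. It is a simplified model, not the kernel's or codejail's actual logic: the function names, pod names, and numbers are all hypothetical, and it ignores details such as privileged processes being exempt from the limit.

```python
from typing import Mapping


def can_create_process(nproc_limit: int, existing_by_container: Mapping[str, int]) -> bool:
    """Simplified model of the NPROC check for one UID on one host.

    The limit belongs to the process that is trying to fork, but the usage
    count is summed over every container that runs processes under that UID.
    """
    return sum(existing_by_container.values()) < nproc_limit


def min_nproc_for_host(procs_per_execution: int, concurrent_executions: int) -> int:
    """Smallest limit that lets every concurrent execution reach its peak (assumed model)."""
    return procs_per_execution * concurrent_executions


# Hypothetical numbers: each sandbox execution peaks at 3 processes.
alone = can_create_process(5, {"pod-a": 3})
crowded = can_create_process(5, {"pod-a": 3, "pod-b": 3})
print(alone, crowded)  # True False
print(min_nproc_for_host(3, 2))  # 6
```

Under these assumptions, a limit that is comfortable for a single pod starts failing once a second pod using the same UID lands on the same host, which is why the limit has to account for everything sharing that UID on the host.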

docs: More accurate NPROC documentation

8c2ea5b

timmc-edx mentioned this pull request Mar 31, 2025

Allow setting low NPROC in codejail-service edx/edx-arch-experiments#983

Closed

8 tasks

MoisesGSalas reviewed Apr 2, 2025

MoisesGSalas mentioned this pull request Apr 2, 2025

fix: set FLASK_APP_SETTINGS env for k8s-deployment eduNEXT/tutor-contrib-codejail#65

Merged

MoisesGSalas approved these changes Apr 3, 2025

timmc-edx merged commit cc731d4 into master Apr 3, 2025
4 checks passed

timmc-edx deleted the timmc/doc-nproc branch April 3, 2025 19:37
