[WIP] Fix @Shopify/ruby/issues/703 -- remove the temporary workaround in zjit/src/hir.rs and fix the crash #3
Draft: Copilot wants to merge 6 commits into master from copilot/fix-fd60cfc5-a137-4d1d-b267-8759e42da1e8
Don't abort the entire compilation. Fix Shopify#700
Copilot stopped work on behalf of tekknolagi due to an error (August 25, 2025 15:11)
tekknolagi pushed a commit that referenced this pull request on Aug 25, 2025
If we malloc when the current Ractor is locked, we can deadlock because
GC requires VM lock and Ractor barrier. If another Ractor is waiting on
this Ractor lock, then it will deadlock because the other Ractor will
never join the barrier.
For example, this script deadlocks:
    r = Ractor.new do
      loop do
        Ractor::Port.new
      end
    end

    100000.times do |i|
      r.send(nil)
      puts i
    end
On debug builds, it fails with this assertion error:
vm_sync.c:75: Assertion Failed: vm_lock_enter:cr->sync.locked_by != rb_ractor_self(cr)
On non-debug builds, we can see that it deadlocks in the debugger:
Main Ractor:
frame #3: 0x000000010021fdc4 miniruby`rb_native_mutex_lock(lock=<unavailable>) at thread_pthread.c:115:14
frame #4: 0x0000000100193eb8 miniruby`ractor_send0 [inlined] ractor_lock(r=<unavailable>, file=<unavailable>, line=1180) at ractor.c:73:5
frame #5: 0x0000000100193eb0 miniruby`ractor_send0 [inlined] ractor_send_basket(ec=<unavailable>, rp=0x0000000131092840, b=0x000000011c63de80, raise_on_error=true) at ractor_sync.c:1180:5
frame #6: 0x0000000100193eac miniruby`ractor_send0(ec=<unavailable>, rp=0x0000000131092840, obj=4, move=<unavailable>, raise_on_error=true) at ractor_sync.c:1211:5
Second Ractor:
frame #2: 0x00000001002208d0 miniruby`rb_ractor_sched_barrier_start [inlined] rb_native_cond_wait(cond=<unavailable>, mutex=<unavailable>) at thread_pthread.c:221:13
frame #3: 0x00000001002208cc miniruby`rb_ractor_sched_barrier_start(vm=0x000000013180d600, cr=0x0000000131093460) at thread_pthread.c:1438:13
frame #4: 0x000000010028a328 miniruby`rb_vm_barrier at vm_sync.c:262:13 [artificial]
frame #5: 0x00000001000dfa6c miniruby`gc_start [inlined] rb_gc_vm_barrier at gc.c:179:5
frame #6: 0x00000001000dfa68 miniruby`gc_start [inlined] gc_enter(objspace=0x000000013180fc00, event=gc_enter_event_start, lock_lev=<unavailable>) at default.c:6636:9
frame #7: 0x00000001000dfa48 miniruby`gc_start(objspace=0x000000013180fc00, reason=<unavailable>) at default.c:6361:5
frame #8: 0x00000001000e3fd8 miniruby`objspace_malloc_increase_body [inlined] garbage_collect(objspace=0x000000013180fc00, reason=512) at default.c:6341:15
frame #9: 0x00000001000e3fa4 miniruby`objspace_malloc_increase_body [inlined] garbage_collect_with_gvl(objspace=0x000000013180fc00, reason=512) at default.c:6741:16
frame #10: 0x00000001000e3f88 miniruby`objspace_malloc_increase_body(objspace=0x000000013180fc00, mem=<unavailable>, new_size=<unavailable>, old_size=<unavailable>, type=<unavailable>) at default.c:8007:13
frame #11: 0x00000001000e3c44 miniruby`rb_gc_impl_malloc [inlined] objspace_malloc_fixup(objspace=0x000000013180fc00, mem=0x000000011c700000, size=12582912) at default.c:8085:5
frame #12: 0x00000001000e3c30 miniruby`rb_gc_impl_malloc(objspace_ptr=0x000000013180fc00, size=12582912) at default.c:8182:12
frame #13: 0x00000001000d4584 miniruby`ruby_xmalloc [inlined] ruby_xmalloc_body(size=<unavailable>) at gc.c:5128:12
frame #14: 0x00000001000d4568 miniruby`ruby_xmalloc(size=<unavailable>) at gc.c:5118:34
frame #15: 0x00000001001eb184 miniruby`rb_st_init_existing_table_with_size(tab=0x000000011c2b4b40, type=<unavailable>, size=<unavailable>) at st.c:559:39
frame #16: 0x00000001001ebc74 miniruby`rebuild_table_if_necessary [inlined] rb_st_init_table_with_size(type=0x00000001004f4a78, size=524287) at st.c:585:5
frame #17: 0x00000001001ebc5c miniruby`rebuild_table_if_necessary [inlined] rebuild_table(tab=0x000000013108e2f0) at st.c:753:19
frame #18: 0x00000001001ebbfc miniruby`rebuild_table_if_necessary(tab=0x000000013108e2f0) at st.c:1125:9
frame #19: 0x00000001001eba08 miniruby`rb_st_insert(tab=0x000000013108e2f0, key=262144, value=4767566624) at st.c:1143:5
frame #20: 0x0000000100194b84 miniruby`ractor_port_initialzie [inlined] ractor_add_port(r=0x0000000131093460, id=262144) at ractor_sync.c:399:9
frame #21: 0x0000000100194b58 miniruby`ractor_port_initialzie [inlined] ractor_port_init(rpv=4750065560, r=0x0000000131093460) at ractor_sync.c:87:5
frame #22: 0x0000000100194b34 miniruby`ractor_port_initialzie(self=4750065560) at ractor_sync.c:103:12
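The two backtraces describe a circular wait. As a rough illustration only, the cycle can be written down as a toy wait-for graph and checked for a loop. This is a sketch, not VM code: the symbol names and the `deadlock_cycle` helper are hypothetical, chosen to mirror the frames above (`ractor_lock` in the main Ractor, `rb_vm_barrier` reached from `ruby_xmalloc` in the second Ractor).

```ruby
# Toy wait-for graph: each key is blocked on its value.
# All names here are illustrative labels, not identifiers from the Ruby VM.
WAITS_FOR = {
  main_ractor:   :ractor_lock,   # ractor_send0 blocks in ractor_lock
  ractor_lock:   :second_ractor, # the lock is held by the second Ractor
  second_ractor: :vm_barrier,    # malloc triggered GC, which starts a VM barrier
  vm_barrier:    :main_ractor    # the barrier completes only once every Ractor joins
}.freeze

# Follows the wait-for edges from +start+ and returns the cycle as an array
# of nodes if one is reached, or nil if the chain ends without looping.
def deadlock_cycle(graph, start)
  seen = []
  node = start
  while node
    return seen[seen.index(node)..] + [node] if seen.include?(node)
    seen << node
    node = graph[node]
  end
  nil
end

cycle = deadlock_cycle(WAITS_FOR, :main_ractor)
puts cycle.join(" -> ") if cycle
```

Removing any one edge breaks the cycle in this model. In the VM that would correspond, for example, to not allocating while the Ractor lock is held, though which edge the actual fix removes is not stated here.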
6d2afdc to fa348b6
tekknolagi pushed a commit that referenced this pull request on Oct 9, 2025
We need to free the current_block_exits in parse_program when we're done
with it to prevent memory leaks. This fixes the following memory leak detected
when running Ruby using `RUBY_FREE_AT_EXIT=1 ruby -nc -e "break"`:
Direct leak of 32 byte(s) in 1 object(s) allocated from:
#0 0x5bd3c5bc66c8 in realloc (miniruby+0x616c8) (BuildId: ruby/prism@ba6a96e5a060)
#1 0x5bd3c5f91fd9 in pm_node_list_grow prism/templates/src/node.c.erb:35:40
#2 0x5bd3c5f91e9d in pm_node_list_append prism/templates/src/node.c.erb:48:9
#3 0x5bd3c6001fa0 in parse_block_exit prism/prism.c:15788:17
#4 0x5bd3c5fee155 in parse_expression_prefix prism/prism.c:19221:50
#5 0x5bd3c5fe9970 in parse_expression prism/prism.c:22235:23
#6 0x5bd3c5fe0586 in parse_statements prism/prism.c:13976:27
#7 0x5bd3c5fd6792 in parse_program prism/prism.c:22508:40
ruby/prism@fdf9b8d24a
tekknolagi pushed a commit that referenced this pull request on Oct 23, 2025
When RUBYOPT is invalid, it raises an error which causes moreswitches
to leak memory. It can be seen when building with LSAN enabled:
$ RUBY_FREE_AT_EXIT=1 RUBYOPT=f ruby
ruby: invalid option -f (-h will show valid options) (RuntimeError)
Direct leak of 16 byte(s) in 1 object(s) allocated from:
#0 0x618cef8efa23 in malloc (miniruby+0x64a23)
#1 0x618cefa0e8d8 in rb_gc_impl_malloc gc/default/default.c:8182:5
#2 0x618cef9f7f01 in ruby_xmalloc2_body gc.c:5182:12
#3 0x618cef9f7eac in ruby_xmalloc2 gc.c:5176:34
#4 0x618cefb547b2 in moreswitches ruby.c:919:18
#5 0x618cefb526fe in process_options ruby.c:2350:9
#6 0x618cefb524ac in ruby_process_options ruby.c:3202:12
#7 0x618cef9dc11f in ruby_options eval.c:119:16
#8 0x618cef8f2fb5 in rb_main main.c:42:26
#9 0x618cef8f2f59 in main main.c:62:12
tekknolagi pushed a commit that referenced this pull request on Nov 4, 2025
We can avoid taking this barrier if we're not incremental marking or lazy sweeping. I found this was taking a significant amount of samples when profiling `Psych.load` in multiple ractors due to the vm barrier.

With this change, we get significant improvements in ractor benchmarks that allocate lots of objects.

-- Psych.load benchmark --

```
Before:              After:
r:  itr:  time       r:  itr:  time
0   #1:   960ms      0   #1:   943ms
0   #2:   979ms      0   #2:   939ms
0   #3:   968ms      0   #3:   948ms
0   #4:   963ms      0   #4:   946ms
0   #5:   964ms      0   #5:   944ms
1   #1:   947ms      1   #1:   940ms
1   #2:   950ms      1   #2:   947ms
1   #3:   962ms      1   #3:   950ms
1   #4:   947ms      1   #4:   945ms
1   #5:   947ms      1   #5:   943ms
2   #1:   1131ms     2   #1:   1005ms
2   #2:   1153ms     2   #2:   996ms
2   #3:   1155ms     2   #3:   1003ms
2   #4:   1205ms     2   #4:   1012ms
2   #5:   1179ms     2   #5:   1012ms
4   #1:   1555ms     4   #1:   1209ms
4   #2:   1509ms     4   #2:   1244ms
4   #3:   1529ms     4   #3:   1254ms
4   #4:   1512ms     4   #4:   1267ms
4   #5:   1513ms     4   #5:   1245ms
6   #1:   2122ms     6   #1:   1584ms
6   #2:   2080ms     6   #2:   1532ms
6   #3:   2079ms     6   #3:   1476ms
6   #4:   2021ms     6   #4:   1463ms
6   #5:   1999ms     6   #5:   1461ms
8   #1:   2741ms     8   #1:   1630ms
8   #2:   2711ms     8   #2:   1632ms
8   #3:   2688ms     8   #3:   1654ms
8   #4:   2641ms     8   #4:   1684ms
8   #5:   2656ms     8   #5:   1752ms
```
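To make the size of the improvement easier to read off, the following sketch averages the five iterations for each value of `r` and prints the before/after ratio. The timings are copied from the benchmark output above. The `mean` helper and the constant names are my own, and reading `r` as the number of Ractors is an assumption, since the commit message does not define the column.

```ruby
# Timings in milliseconds, copied from the Psych.load benchmark output.
# Keys are the values of the "r" column (assumed to be the Ractor count).
BEFORE = {
  0 => [960, 979, 968, 963, 964],
  1 => [947, 950, 962, 947, 947],
  2 => [1131, 1153, 1155, 1205, 1179],
  4 => [1555, 1509, 1529, 1512, 1513],
  6 => [2122, 2080, 2079, 2021, 1999],
  8 => [2741, 2711, 2688, 2641, 2656]
}.freeze

AFTER = {
  0 => [943, 939, 948, 946, 944],
  1 => [940, 947, 950, 945, 943],
  2 => [1005, 996, 1003, 1012, 1012],
  4 => [1209, 1244, 1254, 1267, 1245],
  6 => [1584, 1532, 1476, 1463, 1461],
  8 => [1630, 1632, 1654, 1684, 1752]
}.freeze

# Arithmetic mean of a list of timings, as a Float.
def mean(values)
  values.sum.to_f / values.size
end

BEFORE.each_key do |r|
  before = mean(BEFORE[r])
  after = mean(AFTER[r])
  puts format("r=%d  before=%.1fms  after=%.1fms  speedup=%.2fx",
              r, before, after, before / after)
end
```

On these numbers the gap widens as `r` grows, which is consistent with the barrier being a contention point across Ractors.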
tekknolagi pushed a commit that referenced this pull request on Nov 10, 2025
We were seeing errors like:

```
* thread #8, stop reason = EXC_BAD_ACCESS (code=1, address=0x803)
  * frame #0: 0x00000001001fe944 ruby`rb_st_lookup(tab=0x00000000000007fb, key=1, value=0x00000001305b7490) at st.c:1066:22
    frame #1: 0x000000010002d658 ruby`remove_class_from_subclasses [inlined] class_get_subclasses_for_ns(tbl=0x00000000000007fb, ns_id=1) at class.c:604:9
    frame #2: 0x000000010002d650 ruby`remove_class_from_subclasses(tbl=0x00000000000007fb, ns_id=1, klass=4754039232) at class.c:620:34
    frame #3: 0x000000010002c8a8 ruby`rb_class_classext_free_subclasses(ext=0x000000011b5ce1d8, klass=4754039232, replacing=<unavailable>) at class.c:700:9
    frame #4: 0x000000010002c760 ruby`rb_class_classext_free(klass=4754039232, ext=0x000000011b5ce1d8, is_prime=true) at class.c:105:5
    frame #5: 0x00000001000e770c ruby`classext_free(ext=<unavailable>, is_prime=<unavailable>, namespace=<unavailable>, arg=<unavailable>) at gc.c:1231:5 [artificial]
    frame #6: 0x000000010002d178 ruby`rb_class_classext_foreach(klass=<unavailable>, func=(ruby`classext_free at gc.c:1228), arg=0x00000001305b75c0) at class.c:518:5
    frame #7: 0x00000001000e745c ruby`rb_gc_obj_free(objspace=0x000000012500c400, obj=4754039232) at gc.c:1282:9
    frame #8: 0x00000001000e70d4 ruby`gc_sweep_plane(objspace=0x000000012500c400, heap=<unavailable>, p=4754039232, bitset=4095, ctx=0x00000001305b76e8) at default.c:3482:21
    frame #9: 0x00000001000e6e9c ruby`gc_sweep_page(objspace=0x000000012500c400, heap=0x000000012500c540, ctx=0x00000001305b76e8) at default.c:3567:13
    frame #10: 0x00000001000e51d0 ruby`gc_sweep_step(objspace=0x000000012500c400, heap=0x000000012500c540) at default.c:3848:9
    frame #11: 0x00000001000e1880 ruby`gc_continue [inlined] gc_sweep_continue(objspace=0x000000012500c400, sweep_heap=0x000000012500c540) at default.c:3931:13
    frame #12: 0x00000001000e1754 ruby`gc_continue(objspace=0x000000012500c400, heap=0x000000012500c540) at default.c:2037:9
    frame #13: 0x00000001000e10bc ruby`newobj_cache_miss [inlined] heap_prepare(objspace=0x000000012500c400, heap=0x000000012500c540) at default.c:2056:5
    frame #14: 0x00000001000e1074 ruby`newobj_cache_miss [inlined] heap_next_free_page(objspace=0x000000012500c400, heap=0x000000012500c540) at default.c:2280:9
    frame #15: 0x00000001000e106c ruby`newobj_cache_miss(objspace=0x000000012500c400, cache=0x0000600001b00300, heap_idx=2, vm_locked=false) at default.c:2387:38
    frame #16: 0x00000001000e0d28 ruby`newobj_alloc(objspace=<unavailable>, cache=<unavailable>, heap_idx=<unavailable>, vm_locked=<unavailable>) at default.c:2411:15 [artificial]
    frame #17: 0x00000001000d7214 ruby`newobj_of [inlined] rb_gc_impl_new_obj(objspace_ptr=<unavailable>, cache_ptr=<unavailable>, klass=<unavailable>, flags=<unavailable>, wb_protected=<unavailable>, alloc_size=<unavailable>) at default.c:2490:15
    frame #18: 0x00000001000d719c ruby`newobj_of(cr=<unavailable>, klass=4313971728, flags=258, wb_protected=<unavailable>, size=<unavailable>) at gc.c:995:17
    frame #19: 0x00000001000d73ec ruby`rb_wb_protected_newobj_of(ec=<unavailable>, klass=<unavailable>, flags=<unavailable>, size=<unavailable>) at gc.c:1044:12 [artificial]
    frame #20: 0x0000000100032d34 ruby`class_alloc0(type=<unavailable>, klass=4313971728, namespaceable=<unavailable>) at class.c:803:5
```
XrXr pushed a commit that referenced this pull request on Nov 13, 2025
We don't decrement the super and module subclasses count for iclasses that
are having their classext replaced. This causes the reference count to be
incorrect and leak memory.
The following script demonstrates the memory leak:
    module Foo
      refine(Object) do
        define_method(:<=) {}
      end
    end

    class Bar
      include Comparable
    end
With RUBY_FREE_AT_EXIT and ASAN, we can see many memory leaks, including:
Direct leak of 16 byte(s) in 1 object(s) allocated from:
#0 0x599f715adca2 in calloc (miniruby+0x64ca2)
#1 0x599f716bd779 in calloc1 gc/default/default.c:1495:12
#2 0x599f716d1370 in rb_gc_impl_calloc gc/default/default.c:8216:5
#3 0x599f716b8ab1 in ruby_xcalloc_body gc.c:5221:12
#4 0x599f716b269c in ruby_xcalloc gc.c:5215:34
#5 0x599f715eab23 in class_alloc0 class.c:790:22
#6 0x599f715e4bec in class_alloc class.c:836:12
#7 0x599f715e60c9 in module_new class.c:1693:17
#8 0x599f715e60a2 in rb_module_new class.c:1701:12
#9 0x599f715e6303 in rb_define_module class.c:1733:14
#10 0x599f715ebc5f in Init_Comparable compar.c:315:22
#11 0x599f716e35f5 in rb_call_inits inits.c:32:5
#12 0x599f7169cbfd in ruby_setup eval.c:88:9
#13 0x599f7169cdac in ruby_init eval.c:100:17
#14 0x599f715b0fa9 in rb_main main.c:41:5
#15 0x599f715b0f59 in main main.c:62:12
#16 0x739b2f02a1c9 in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
#17 0x739b2f02a28a in __libc_start_main csu/../csu/libc-start.c:360:3
#18 0x599f7157c424 in _start (miniruby+0x33424)
tekknolagi pushed a commit that referenced this pull request on Nov 15, 2025
These tests use MN threads, but the NT (native thread) is not freed for MN
threads, so LSAN reports it as a memory leak. For example:
#1 0x62ee7bc67e99 in calloc1 gc/default/default.c:1495:12
#2 0x62ee7bc7ba00 in rb_gc_impl_calloc gc/default/default.c:8216:5
#3 0x62ee7bc631d1 in ruby_xcalloc_body gc.c:5221:12
#4 0x62ee7bc5cdbc in ruby_xcalloc gc.c:5215:34
#5 0x62ee7bdea4c6 in native_thread_alloc thread_pthread.c:2187:35
#6 0x62ee7bdec31b in native_thread_check_and_create_shared thread_pthread_mn.c:429:39
#7 0x62ee7bdea484 in native_thread_create_shared thread_pthread_mn.c:531:12
#8 0x62ee7bdea1da in native_thread_create thread_pthread.c:2403:16
#9 0x62ee7bdde2eb in thread_create_core thread.c:884:11
#10 0x62ee7bde4466 in thread_initialize thread.c:992:16
Thanks for asking me to work on this. I will get started on it and keep this PR's description up to date as I form a plan and make progress.
Original description:
✨ Let Copilot coding agent set things up for you — coding agent works faster and does higher quality work when set up for your repo.