Go version
go version go1.25.5 darwin/arm64
Output of go env in your module/workspace:
AR='ar'
CC='cc'
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_ENABLED='1'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
CXX='c++'
GCCGO='gccgo'
GO111MODULE=''
GOARCH='arm64'
GOARM64='v8.0'
GOAUTH='netrc'
GOBIN=''
GOCACHE='/Users/evan.jones/Library/Caches/go-build'
GOCACHEPROG=''
GODEBUG=''
GOENV='/Users/evan.jones/Library/Application Support/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFIPS140='off'
GOFLAGS=''
GOGCCFLAGS='-fPIC -arch arm64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -ffile-prefix-map=/var/folders/pp/tvwz4y2x2qz97pf8bftqxhrw0000gp/T/go-build2627804322=/tmp/go-build -gno-record-gcc-switches -fno-common'
GOHOSTARCH='arm64'
GOHOSTOS='darwin'
GOINSECURE=''
GOMOD='/Users/evan.jones/unfairlocks/go.mod'
GOMODCACHE='/Users/evan.jones/go/pkg/mod'
GOOS='darwin'
GOPATH='/Users/evan.jones/go'
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/opt/homebrew/Cellar/go/1.25.5/libexec'
GOSUMDB='sum.golang.org'
GOTELEMETRY='on'
GOTELEMETRYDIR='/Users/evan.jones/Library/Application Support/go/telemetry'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/opt/homebrew/Cellar/go/1.25.5/libexec/pkg/tool/darwin_arm64'
GOVCS=''
GOVERSION='go1.25.5'
GOWORK=''
PKG_CONFIG='pkg-config'

What did you do?
A contended RWMutex allows readers to starve waiting writers for an extremely long time (e.g. >10 seconds). I believe this is because RWMutex.Unlock() first unblocks all readers, before checking whether there are any waiting writers. I think this disagrees with the documentation of RLock: "a blocked Lock call excludes new readers from acquiring the lock".
We observed this causing some goroutines on an "overloaded" server to be blocked for a very long time (>1 second). For this scenario, I think it would be better if Unlock() did not unblock readers when another writer is waiting. Continuously arriving writers could then starve readers instead, but I think that would better match the package documentation.
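For context, the ordering in question looks roughly like the following simplified paraphrase of RWMutex.Unlock. This is only a sketch: the type, field names, and channel are stand-ins, the real code in sync/rwmutex.go uses runtime semaphores, and Lock/RLock are omitted.

package sketch

import (
	"sync"
	"sync/atomic"
)

const maxReaders = 1 << 30

// toyRWLock paraphrases only the parts of sync.RWMutex that matter here.
type toyRWLock struct {
	w           sync.Mutex    // held by the active writer; waiting writers queue on it
	readerCount atomic.Int32  // negative while a writer holds the lock
	readerSem   chan struct{} // stand-in for the runtime reader semaphore
}

// unlock mirrors the ordering of RWMutex.Unlock: blocked readers are released
// before the next waiting writer is allowed to proceed.
func (rw *toyRWLock) unlock() {
	// "Announce to readers there is no active writer." From this point on,
	// newly arriving readers can acquire the read lock immediately.
	r := rw.readerCount.Add(maxReaders)

	// Unblock every reader that queued up while the writer held the lock.
	for i := 0; i < int(r); i++ {
		rw.readerSem <- struct{}{}
	}

	// "Allow other writers to proceed." The next writer still has to wait
	// for the readers released above, plus any new readers that get in
	// before it is scheduled, to finish.
	rw.w.Unlock()
}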
The attached program simulates how Datadog's metrics package datadog-go uses an RWMutex to guard a map of counters in aggregator.count (a rough sketch follows this list):
- RLock()
- Read the map. If the counter key exists, increment it.
- RUnlock()
- If the key did not exist:
  - Lock()
  - Check for the key again; increment it or insert the new key.
  - Unlock()
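A minimal sketch of that pattern (hypothetical type and field names, not the datadog-go code; it stores pointers to atomic counters so existing keys can be incremented without the write lock):

package sketch

import (
	"sync"
	"sync/atomic"
)

// counterMap mirrors the read-then-upgrade pattern above.
type counterMap struct {
	mu     sync.RWMutex
	counts map[string]*atomic.Int64
}

func newCounterMap() *counterMap {
	return &counterMap{counts: make(map[string]*atomic.Int64)}
}

func (c *counterMap) increment(key string) {
	// Fast path: most keys already exist, so only the read lock is taken.
	c.mu.RLock()
	ctr, ok := c.counts[key]
	c.mu.RUnlock()
	if ok {
		ctr.Add(1)
		return
	}

	// Slow path: take the write lock and check again, because another
	// goroutine may have inserted the key in the meantime.
	c.mu.Lock()
	if ctr, ok = c.counts[key]; !ok {
		ctr = new(atomic.Int64)
		c.counts[key] = ctr
	}
	c.mu.Unlock()
	ctr.Add(1)
}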
I think the following is happening in both the attached demo program and the production server:
- A writer acquires the RWMutex.Lock().
- Many more writers block in the internal Mutex.Lock().
- Many readers block in RWMutex.RLock().
- The writer calls RWMutex.Unlock(). It announces to readers that there is no active writer (the "Announce to readers there is no active writer." step in RWMutex.Unlock).
- New readers can now acquire RLock(). Since requests are continuously arriving, there are always running readers.
- The writer unblocks all blocked readers with a for loop in Unlock.
- The writer unblocks the next waiting writer ("Allow other writers to proceed.").
- The unblocked writer gets scheduled, and only then does it block new readers.
- The next writer must now wait for all running readers to finish before it can finally enter the critical section.
- Repeat this for the next writer. The result is that it takes ~100 ms to get to each writer in the queue.
Demo program: https://github.com/evanj/unfairlocks/blob/main/unfairlocks.go#L47
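A much smaller, self-contained sketch of the same idea (not the linked program; the worker counts, key distribution, and thresholds are arbitrary, and the exact delays will vary by machine):

package main

import (
	"fmt"
	"math/rand"
	"sync"
	"time"
)

// Many goroutines mostly take the read lock, while a steady trickle of new
// keys forces write-lock acquisitions; the slowest write-lock wait is reported.
func main() {
	const (
		workers  = 64
		requests = 200_000 // per worker
		hotKeys  = 100     // keys that already exist (read path)
	)

	var (
		mu     sync.RWMutex
		counts = map[string]int{}

		slowMu  sync.Mutex
		slowest time.Duration
	)
	for i := 0; i < hotKeys; i++ {
		counts[fmt.Sprintf("hot-%d", i)] = 0
	}

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func(w int) {
			defer wg.Done()
			r := rand.New(rand.NewSource(int64(w)))
			for i := 0; i < requests; i++ {
				key := fmt.Sprintf("hot-%d", r.Intn(hotKeys))
				if r.Intn(1000) == 0 {
					key = fmt.Sprintf("new-%d-%d", w, i) // forces the write path
				}

				mu.RLock()
				_, ok := counts[key]
				mu.RUnlock()
				if ok {
					continue // the real code increments an existing counter here
				}

				start := time.Now()
				mu.Lock() // the wait this issue is about
				counts[key]++
				mu.Unlock()

				if d := time.Since(start); d > 10*time.Millisecond {
					slowMu.Lock()
					if d > slowest {
						slowest = d
					}
					slowMu.Unlock()
				}
			}
		}(w)
	}
	wg.Wait()
	fmt.Println("slowest write-lock wait:", slowest)
}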
What did you see happen?
In servers that use this in a very hot loop, we occasionally see "stuck" goroutines that are blocked for > 1 second.
The attached demo program prints "slow" increment requests. When I run it with increasing numbers of requests, the slowest increment time keeps growing; the waiting time appears to be effectively unbounded. I can make an increment block nearly forever by continuously adding more simulated requests.
$ go run . -requests=50000
...
Shard shard-35 increment duration: 598.276583ms
$ go run . -requests=500000
...
Shard shard-32 increment duration: 6.82352725s
$ go run . -requests=5000000
...
Shard shard-31 increment duration: 17.665173458s
This output shows the waiting time increases as I add more requests. The last line shows some goroutines were blocked for up to ~17 seconds. With this particular program and configuration, that is the worst delay I can observe: beyond that point, the queue of blocked writers has drained.
What did you expect to see?
The mutex should sometimes be unfair, but not exceptionally unfair, and it should not starve writers essentially forever. The demo program also prints the timing of a version that only uses a plain Mutex, and that version only shows waits up to ~200 ms.
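For comparison, the Mutex-only variant amounts to roughly the following (again a hypothetical sketch, not the demo program's exact code):

package sketch

import "sync"

// mutexCounterMap is the plain-Mutex comparison: every increment takes the
// exclusive lock, so writers cannot be starved by readers.
type mutexCounterMap struct {
	mu     sync.Mutex
	counts map[string]int64
}

func newMutexCounterMap() *mutexCounterMap {
	return &mutexCounterMap{counts: make(map[string]int64)}
}

func (c *mutexCounterMap) increment(key string) {
	c.mu.Lock()
	c.counts[key]++ // inserts the key if it does not exist yet
	c.mu.Unlock()
}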