📖 Analogy
Picture a workshop with a few workbenches (the Ps — exactly GOMAXPROCS of them), a pool of workers (the OS threads, Ms), and a big pile of job tickets (goroutines, Gs). A worker can only work at a bench, and each bench keeps its own short stack of next tickets. If a worker has to leave for a long phone call (a blocking syscall), they hand their bench to another worker so the queued tickets keep moving. A bench that runs out of tickets doesn’t idle — its worker walks over and steals half of a busy bench’s stack. And a floor manager (sysmon) roams without a bench, tapping anyone who’s hogged a bench too long and reclaiming benches from workers stuck on the phone.
This page assumes the concurrency scheduler page, which introduces the G-M-P model and GOMAXPROCS. Here we go beneath it: the monitor thread, preemption, the netpoller, and stealing.
G, M, P — the quick recap
- G: a goroutine, meaning its stack, instruction pointer, and scheduling state.
- M: a "machine," i.e. an OS thread. Only an M can execute code.
- P: a logical processor, i.e. a scheduling context holding a local run queue and the mcache. An M must hold a P to run Gs. There are GOMAXPROCS Ps, which bounds parallelism.
```mermaid
graph TD
    subgraph P0["P (local run queue)"]
        G1["G"] --- G2["G"] --- G3["G"]
    end
    subgraph P1["P (local run queue)"]
        G4["G"] --- G5["G"]
    end
    M0["M (OS thread)"] --> P0
    M1["M (OS thread)"] --> P1
    GQ["global run queue"] -.->|"refilled from / stolen to"| P0
    GQ -.-> P1
    SM["sysmon (no P)"] -.->|"preempt · retake P · netpoll"| M0
```
Handoff: why a blocking syscall doesn’t stall everything
The P/M split exists for one big reason. When a goroutine makes a blocking syscall, its M parks in the kernel and can’t run anything. If work lived on the M, the whole local queue would freeze. Instead the work lives on the P: the runtime detaches the P from the blocked M and hands it to another M (waking a parked one or spawning a new one), which immediately resumes the queued goroutines. When the syscall returns, the original M tries to reacquire a P; if none is free, its goroutine goes to the global queue and the M parks.
This handoff is why a handful of blocking calls don’t tank throughput — but it’s also why a program that spawns thousands of simultaneously blocking syscalls can create thousands of Ms.
Work-stealing with handoff
There’s no central dispatcher assigning goroutines to Ps. Instead, when a P’s local run queue empties, its M looks for work in order:
- its own local queue (empty),
- the global run queue,
- the netpoller (goroutines whose I/O just became ready),
- steal ~half of a randomly chosen other P’s local queue.
Stealing half (not one) means the P that just stole immediately has enough work to be stolen from in turn, so load spreads across idle Ps in a few hops. New goroutines go on the creator's local queue (fast, no lock); when that queue is full, the overflow spills to the global queue.
sysmon and preemption
A normal scheduling decision happens at a safe point — typically a function call. But some things can’t wait for the running goroutine to be polite:
- a goroutine in a tight loop with no calls would never yield;
- an M stuck in a long syscall holds resources;
- the network needs polling and the GC needs triggering on time.
So Go runs sysmon, a special M with no P that loops forever (backing off when idle). sysmon:
- retakes Ps from Ms blocked in syscalls for too long,
- marks long-running goroutines for preemption,
- polls the network (the netpoller integrates epoll/kqueue/IOCP so a blocked socket parks its goroutine instead of its thread),
- triggers GC and memory scavenging on timers.
Since Go 1.14, preemption is asynchronous: sysmon sends a signal (SIGURG on Unix) to a thread running a hog goroutine; the runtime handler stops it at a safe point. Before 1.14, a for {} with no calls could pin a P forever and even stall the GC — now it can’t.
Observing the scheduler
You can’t watch the run queues from pure Go, but you can see the moving parts — GOMAXPROCS, thread and goroutine counts — and the effect of yielding:
```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

func main() {
	fmt.Println("GOMAXPROCS (P count):", runtime.GOMAXPROCS(0))
	fmt.Println("NumCPU:", runtime.NumCPU())
	fmt.Println("goroutines at start:", runtime.NumGoroutine())

	var wg sync.WaitGroup
	for i := 0; i < 1000; i++ {
		wg.Add(1)
		go func(n int) {
			defer wg.Done()
			// A little CPU work; the scheduler multiplexes 1000 Gs onto GOMAXPROCS Ps.
			sum := 0
			for j := 0; j < 1000; j++ {
				sum += j
			}
			_ = sum // the result is discarded; the loop only simulates work
			if n == 0 {
				// A snapshot taken whenever G 0 happens to run, not a true peak.
				fmt.Println("goroutines alive when G 0 runs (incl. this one):", runtime.NumGoroutine())
			}
			runtime.Gosched() // voluntarily yield this G back to the scheduler
		}(i)
	}
	wg.Wait()
	fmt.Println("goroutines after wait:", runtime.NumGoroutine())
}
```
runtime.Gosched() is the explicit “yield the P now” call. You rarely need it — the scheduler preempts for you — but it makes the cooperative side of scheduling concrete.
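To watch the scheduler itself, use the schedtrace debug knob. It is set through the GODEBUG environment variable when the program runs (no rebuild needed), so it is not something you can try in the playground: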
```sh
# Print scheduler state every 1000ms: run queue sizes, idle Ps/Ms, etc.
GODEBUG=schedtrace=1000 ./myprogram
# SCHED 1003ms: gomaxprocs=8 idleprocs=6 threads=12 spinningthreads=1 runqueue=0 ...

# Add scheddetail=1 for per-P/per-M breakdowns.
GODEBUG=schedtrace=1000,scheddetail=1 ./myprogram
```
Reference
| Term | Meaning |
|---|---|
| G / M / P | Goroutine / OS thread / logical processor |
| GOMAXPROCS | Number of Ps (parallelism bound) |
| Local run queue | Per-P queue (lock-free fast path) |
| Global run queue | Shared overflow queue |
| Handoff | Detach P from a syscall-blocked M to another M |
| Work-stealing | Idle P steals ~half a victim P’s queue |
| netpoller | epoll/kqueue/IOCP integration; parks Gs on I/O |
| sysmon | P-less monitor: preemption, retake, netpoll, GC timers |
| Async preemption | SIGURG-based interruption of hog goroutines (1.14+) |
🐹 GOMAXPROCS, blocking, and the netpoller
Three practical takeaways. First, GOMAXPROCS defaults to NumCPU, which is usually right. In containers on Go versions before 1.25, set it to your CPU limit (or use automaxprocs) so the runtime doesn't over-schedule; Go 1.25 and later take the cgroup CPU limit into account by default on Linux. Second, network I/O doesn't burn a thread: the netpoller parks the goroutine and frees the M, so "a goroutine per connection" scales to hundreds of thousands. Third, blocking syscalls (file I/O, cgo, some DNS) do tie up an M. Handoff keeps the P busy, but the thread itself stays blocked, so a flood of such calls can spawn many threads. Bound that concurrency with a semaphore or worker pool.
⚠️ Fairness is good now, but starvation patterns remain
Async preemption (1.14+) killed the classic “tight loop pins a CPU and stalls GC” bug — but a few sharp edges remain. cgo and blocking syscalls run outside Go’s preemption, so a long C call holds its M the whole time. runtime.LockOSThread pins a goroutine to its M (needed for some OS/graphics APIs) and that M won’t run other Gs until you unlock. And runtime.Gosched() is almost never the fix for a performance problem — if goroutines aren’t progressing, look for blocking calls, lock contention, or unbounded goroutine creation, not missing yields. Verify with schedtrace and the execution tracer.
See also
- the scheduler (concurrency): the G-M-P model and GOMAXPROCS at the user level.
- runtime introspection: schedtrace, NumGoroutine, the execution tracer.
- goroutines (concurrency): what's being scheduled, and leak avoidance.
- the garbage collector: why timely preemption matters for GC.
Next: observing all of this from inside your program — runtime introspection.
Check your understanding
1. What do G, M, and P stand for in the Go scheduler?
Goroutines (G) run on OS threads (M), but only while the M holds a P. A P is a logical processor: it owns a local run queue of runnable goroutines and the mcache. The number of Ps is GOMAXPROCS, which bounds how many goroutines run in parallel.
2. Why does a P need to exist separately from an M?
If a goroutine makes a blocking syscall, its M parks in the kernel. The P it was holding is detached and handed to another (or new) M so the local run queue keeps executing. This handoff is why a few blocking syscalls don't stall all your goroutines — the P, not the M, owns the schedulable work.
3. What is work-stealing?
To balance load without a central dispatcher, an idle P checks the global queue and the netpoller, then steals ~half of a randomly chosen victim P's local run queue. This keeps all Ps busy with minimal coordination, the core of the scheduler's scalability.
4. What does sysmon do?
sysmon (system monitor) is a special M that runs in a loop without needing a P. It handles the things that can't wait for a normal scheduling point: marking long-running goroutines for preemption, taking back Ps from Ms blocked in syscalls, network polling, and triggering GC or scavenging on timers.
5. How does Go preempt a goroutine that never makes a function call (e.g. a tight math loop), as of Go 1.14+?
Before Go 1.14, preemption was cooperative — only at function-call safe points — so a tight loop with no calls could monopolize a P. Go 1.14 added asynchronous preemption: sysmon sends a signal (SIGURG on Unix) to the running thread, and the runtime stops the goroutine at a safe point, guaranteeing fairness and timely GC.