Go’s garbage collector has improved dramatically over the years — sub-millisecond pause times are now the norm. But “the GC is fast” is not a license to ignore allocation patterns. In high-throughput services processing millions of events per day, GC pressure compounds: more allocations mean more GC cycles, more cycles mean more CPU stolen from your hot path.
What to Instrument First
Before optimizing anything, measure. The three metrics that matter most:
go_gc_duration_seconds — how long GC pauses are
go_memstats_alloc_bytes — live heap size
go_memstats_sys_bytes — total memory requested from the OS

A sys_bytes that rises and stays high after GC cycles usually means you're holding references to memory you don't need. A high allocation rate against a small live heap (compare the cumulative go_memstats_alloc_bytes_total to go_memstats_alloc_bytes) means your working set is small but your churn is high — the GC is working hard just to keep up.
The Escape Analysis Tool
Before profiling, use the compiler’s escape analysis to understand what’s going on:
```
go build -gcflags="-m=2" ./... 2>&1 | grep "escapes to heap"
```

Heap escapes are not always bad, but understanding why an allocation escapes tells you where to focus. The most common culprits:
Interface boxing — storing a concrete value in an interface heap-allocates it in most cases (pointers, and a handful of cached small values, are the exceptions).
Closures capturing variables — the captured variable moves to the heap.
Slices exceeding stack limits — the compiler stack-allocates a slice's backing array only when its size is a small compile-time constant (roughly 64KB by default); anything larger, or any non-constant size, goes to the heap.
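All three culprits can be seen directly in the compiler's escape notes. A small sketch — the function names are invented for illustration; build it with -gcflags=-m to see which lines the compiler flags:

```go
package main

import "fmt"

// fmt.Println takes ...any, so the int is boxed into an interface
// and the compiler reports it escaping to the heap.
func boxed(n int) {
	fmt.Println(n)
}

// The captured variable n outlives counter's stack frame, so it
// is moved to the heap.
func counter() func() int {
	n := 0
	return func() int {
		n++
		return n
	}
}

// The backing array exceeds the implicit stack-allocation limit,
// so it is heap-allocated.
func big() []byte {
	return make([]byte, 128*1024)
}

func main() {
	boxed(1)
	c := counter()
	fmt.Println(c(), len(big()))
}
```

Exact diagnostic wording varies between Go versions, but each of these sites shows up in the -m output.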
The sync.Pool Pattern
For services with a high rate of short-lived, same-shape allocations, sync.Pool is the right tool:

```go
var bufPool = sync.Pool{
	New: func() any {
		b := make([]byte, 0, 4096)
		return &b // pool pointers, not slice headers, to avoid an extra allocation on Put
	},
}

func processEvent(data []byte) error {
	bp := bufPool.Get().(*[]byte)
	defer func() {
		*bp = (*bp)[:0] // reset length, keep capacity
		bufPool.Put(bp)
	}()
	buf := append(*bp, data...)
	*bp = buf // keep the (possibly grown) backing array for reuse
	// ... process buf
	return nil
}
```

The GC drains sync.Pool (since Go 1.13, a victim cache lets pooled objects survive one extra cycle) — which is the point. Pools reduce allocation pressure without leaking memory. Don't use a pool for objects with non-trivial cleanup requirements; the GC will discard them at an unpredictable time.
Profiling in Production
pprof over HTTP is safe to enable in production when it is bound to loopback or gated behind a tight IP allowlist:

```go
import _ "net/http/pprof"

go func() {
	// Bind to loopback only; never expose pprof on a public interface.
	log.Println(http.ListenAndServe("127.0.0.1:6060", nil))
}()
```

Capture a heap profile during a high-traffic window (append ?seconds=30 to the URL for a 30-second delta profile instead of a point-in-time snapshot):

```
go tool pprof -http=:8080 http://localhost:6060/debug/pprof/heap
```

The flame graph view makes allocation hot spots immediately visible. Use the inuse_space sample index to see what's currently live; use alloc_space to see cumulative pressure.
The GOGC Lever
GOGC controls when the GC triggers, expressed as a percentage of heap growth since the last collection. The default is 100 (trigger when heap doubles).
For latency-sensitive services with a small working set, a lower GOGC (e.g. GOGC=50) trades CPU for more frequent but shorter GC cycles. For throughput-first services with a large working set, raising GOGC reduces GC frequency at the cost of higher peak memory.
There is no universal right value. Measure your specific workload.