Importing net/http/pprof adds HTTP endpoints that serve profile data you can view or chart with a command line tool. The block and mutex profiles only contain data after the matching runtime.Set* function has been called.
import _ "net/http/pprof"
If your application is not already running an http or https server, add net/http
to the program imports and the following code to the start of the main
function:
go http.ListenAndServe(":8550", nil)
// SetBlockProfileRate controls the fraction of goroutine blocking events that are reported in the blocking profile.
// The profiler aims to sample an average of one blocking event per rate nanoseconds spent blocked.
// To include every blocking event in the profile, pass rate = 1. To turn off profiling entirely, pass rate <= 0.
runtime.SetBlockProfileRate(1)
// SetMutexProfileFraction controls the fraction of mutex contention events that are reported in the mutex profile.
// On average 1/rate events are reported. The previous rate is returned.
// To turn off profiling entirely, pass rate 0. To just read the current rate, pass rate < 0. (For n>1 the details of sampling may change.)
// You need a rate greater than 0 to gather contended mutex data.
// With a rate greater than 1 only about 1/rate of the events are reported; a rate of 1 reports every event.
// For example, if a caller holds a lock for 1s while 5 other goroutines are waiting for the entire second to acquire the lock, its unlock call stack will report 5s of contention.
runtime.SetMutexProfileFraction(1)
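Putting the setup together, here is a minimal sketch of a main function that serves the pprof endpoints and enables the block and mutex profiles. Port 8550 matches the examples below; everything else (the logging, the placeholder select) is an assumption, not code from a real program.

package main

import (
    "log"
    "net/http"
    _ "net/http/pprof" // registers the /debug/pprof/* handlers on http.DefaultServeMux
    "runtime"
)

func main() {
    // Report every blocking event and every mutex contention event.
    runtime.SetBlockProfileRate(1)
    runtime.SetMutexProfileFraction(1)

    // Serve the pprof endpoints in the background.
    go func() {
        log.Println(http.ListenAndServe(":8550", nil))
    }()

    // ... the rest of your program ...
    select {} // placeholder that keeps this sketch running
}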
pprof is a tool that runs on any machine with Go installed and fetches the profile files over HTTP from the server you added to your program.
- goroutine stack trace:
go tool pprof "http://127.0.0.1:8550/debug/pprof/goroutine"
- memory heap sample:
go tool pprof "http://127.0.0.1:8550/debug/pprof/heap"
- sampling of all past memory allocations:
go tool pprof "http://127.0.0.1:8550/debug/pprof/allocs"
- threadcreate stack trace:
go tool pprof "http://127.0.0.1:8550/debug/pprof/threadcreate"
- blocking of synchronization primitives, available with runtime.SetBlockProfileRate(1):
go tool pprof "http://127.0.0.1:8550/debug/pprof/block"
- holders of contended mutexes, available with runtime.SetMutexProfileFraction(1):
go tool pprof "http://127.0.0.1:8550/debug/pprof/mutex"
In the contended mutex report, the time in the left column tells you quickly whether any mutexes are contended.
Showing nodes accounting for 86.11hrs, 99.49% of 86.55hrs total
Dropped 117 nodes (cum <= 0.43hrs)
flat flat% sum% cum cum%
51.76hrs 59.81% 59.81% 51.76hrs 59.81% runtime.selectgo
34.35hrs 39.69% 99.49% 34.35hrs 39.69% sync.(*Cond).Wait
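For reference, here is a minimal sketch of the kind of code that produces contention like the report above: one goroutine holds a lock for a second while five others wait for it, mirroring the example in the SetMutexProfileFraction notes. All names and timings are made up for illustration.

package main

import (
    "sync"
    "time"
)

var mu sync.Mutex

func slowHolder() {
    mu.Lock()
    time.Sleep(time.Second) // hold the lock for a full second
    mu.Unlock()             // contention is attributed to this unlock's call stack
}

func waiter() {
    mu.Lock() // blocks here while slowHolder sleeps
    mu.Unlock()
}

func main() {
    go slowHolder()
    time.Sleep(10 * time.Millisecond) // let slowHolder take the lock first
    for i := 0; i < 5; i++ {
        go waiter() // five goroutines each wait about a second: roughly 5s of reported contention
    }
    time.Sleep(2 * time.Second) // give everything time to finish
}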
Contended does not mean locked forever; contended mutexes are just busy. Finding a mutex that you never unlocked takes a different approach:
To find a locked mutex by line number, download the goroutine stack and look at every running goroutine:
curl -o goroutine_stack.txt "http://127.0.0.1:8550/debug/pprof/goroutine?debug=2"
Then read it with less goroutine_stack.txt.
Start at the top of the file and search for Lock, and you will find something similar to goroutine 1242 [sync.Mutex.Lock, 1039 minutes]:. Near the end of that line is a duration showing how long the goroutine has been waiting. Beneath that line is the call stack with line numbers that lets you find the locked mutex in your code.
The first block containing Lock, and a duration is usually one of the oldest waiters, which leads you to the mutex that was never unlocked.
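As an illustration of the bug this search finds, here is a small made-up example of a code path that returns without unlocking. Every later caller of Lock shows up in the debug=2 dump with a growing [sync.Mutex.Lock, N minutes] wait time.

package main

import "sync"

var stateMu sync.Mutex

// Buggy: the early return leaves stateMu locked forever, so every later
// Lock call appears in the debug=2 dump as [sync.Mutex.Lock, ...].
func update(ok bool) {
    stateMu.Lock()
    if !ok {
        return // BUG: missing stateMu.Unlock()
    }
    // ... update shared state ...
    stateMu.Unlock()
}

// Fixed: defer guarantees the unlock on every return path.
func updateFixed(ok bool) {
    stateMu.Lock()
    defer stateMu.Unlock()
    if !ok {
        return
    }
    // ... update shared state ...
}

func main() {
    update(false) // returns with stateMu still locked
    update(true)  // blocks forever on stateMu.Lock; in a real server this goroutine just waits
}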
The duration that the mutex has been locked is not available when you get a stack trace by holding ctrl and pressing \ to exit the running program. You must download the goroutine stack with debug=2 to view that information.
First install graphviz, which pprof uses to draw the charts from the profile data: sudo yum install graphviz.
Then run pprof with any profile (goroutine, heap, allocs, threadcreate, block, or mutex) and the additional -http parameter: go tool pprof -http 0.0.0.0:8999 "http://127.0.0.1:8550/debug/pprof/goroutine".
Open a browser to the IP address of the machine running go tool pprof on port 8999.
- View all in-use functions hierarchically with
go tool pprof -http 0.0.0.0:8999 "http://127.0.0.1:8550/debug/pprof/goroutine"
by clicking View -> Graph.

The CPU profile is not available as a Profile. It has a special API, the StartCPUProfile and StopCPUProfile functions, because it streams output to a writer during profiling. That means you need to write code that starts and stops output to a file, which you can then read with go tool pprof.
- Add runtime.SetCPUProfileRate to the start of the main function.
// SetCPUProfileRate sets the CPU profiling rate to hz samples per second. If hz <= 0, SetCPUProfileRate turns off profiling.
// If the profiler is on, the rate cannot be changed without first turning it off.
runtime.SetCPUProfileRate(27)
- Import runtime/pprof:
import "runtime/pprof"
- Start and end CPU profiling in your program with:
f, perr := os.Create("/tmp/cpu.pprof")
if perr != nil {
    fmt.Println(perr)
    os.Exit(1)
}
if perr = pprof.StartCPUProfile(f); perr != nil {
    fmt.Println(perr)
    os.Exit(1)
}
// the code you want to profile
pprof.StopCPUProfile()
f.Close()
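Putting the three steps together, here is a minimal sketch of a program that writes a CPU profile for its whole run. The output path /tmp/cpu.pprof and the 27 Hz rate are just the values used above; the work function is a stand-in for your own code.

package main

import (
    "fmt"
    "os"
    "runtime"
    "runtime/pprof"
)

func main() {
    runtime.SetCPUProfileRate(27) // optional: the rate StartCPUProfile uses by default is 100 Hz

    f, err := os.Create("/tmp/cpu.pprof")
    if err != nil {
        fmt.Println(err)
        os.Exit(1)
    }
    defer f.Close()

    if err := pprof.StartCPUProfile(f); err != nil {
        fmt.Println(err)
        os.Exit(1)
    }
    defer pprof.StopCPUProfile() // flushes the profile when main returns

    work() // the code you want to profile
}

// work is a stand-in for the code being profiled.
func work() {
    sum := 0
    for i := 0; i < 100000000; i++ {
        sum += i
    }
    fmt.Println(sum)
}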
Then you can view flame graphs of CPU usage with go tool pprof -http 0.0.0.0:8999 /tmp/cpu.pprof.
Without -http, go tool pprof is an interactive command line tool; once it is open, type help for more information.
------------------------------------------------------------------------------
IMPORTANT!
Use -sample_index=alloc_space; the default, -sample_index=inuse_space, does not show all of the memory leaks that cause out-of-memory errors.
go tool pprof -sample_index=alloc_space "http://127.0.0.1:8550/debug/pprof/heap"
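As an illustration of the difference between the two sample indexes, here is a small made-up example of allocation churn: inuse_space stays low because each decoded slice becomes garbage, while alloc_space keeps growing. The gob decoding is only chosen because it produces the same reflect.unsafe_NewArray frames as the sample output below; the sizes and loop counts are arbitrary.

package main

import (
    "bytes"
    "encoding/gob"
    "log"
)

func main() {
    // Encode a slice once.
    var buf bytes.Buffer
    if err := gob.NewEncoder(&buf).Encode(make([]int, 100000)); err != nil {
        log.Fatal(err)
    }
    data := buf.Bytes()

    // Decode it repeatedly: every iteration allocates a fresh backing array
    // (reflect.unsafe_NewArray in the heap profile). inuse_space stays small
    // because the old slices become garbage, but alloc_space keeps growing.
    for i := 0; i < 10000; i++ {
        var out []int
        if err := gob.NewDecoder(bytes.NewReader(data)).Decode(&out); err != nil {
            log.Fatal(err)
        }
    }
}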
This is example output from the memory heap profiling tool in pprof.
go tool pprof "http://127.0.0.1:8550/debug/pprof/heap"
Fetching profile over HTTP from 127.0.0.1:8550/debug/pprof/heap
Saved profile in /Users/zip/pprof/pprof.listener.alloc_objects.alloc_space.inuse_objects.inuse_space.001.pb.gz
File: listener
Type: inuse_space
Time: Aug 5, 2023 at 7:18pm (CDT)
Entering interactive mode (type "help" for commands, "o" for options)
top - show functions using the most memory
(pprof) top
Showing nodes accounting for 159.41MB, 98.13% of 162.44MB total
Dropped 25 nodes (cum <= 0.81MB)
Showing top 10 nodes out of 33
flat flat% sum% cum cum%
155.61MB 95.80% 95.80% 155.61MB 95.80% reflect.unsafe_NewArray
3.29MB 2.03% 97.82% 3.29MB 2.03% encoding/gob.(*encBuffer).writeByte
0.50MB 0.31% 98.13% 1MB 0.62% main.collectors_read_journal_entry
0 0% 98.13% 156.11MB 96.11% encoding/gob.(*Decoder).Decode
0 0% 98.13% 156.11MB 96.11% encoding/gob.(*Decoder).DecodeValue
0 0% 98.13% 156.11MB 96.11% encoding/gob.(*Decoder).decOpFor.func2
0 0% 98.13% 155.61MB 95.80% encoding/gob.(*Decoder).decOpFor.func3
0 0% 98.13% 155.61MB 95.80% encoding/gob.(*Decoder).decOpFor.func4
0 0% 98.13% 74.01MB 45.56% encoding/gob.(*Decoder).decodeArrayHelper
0 0% 98.13% 156.11MB 96.11% encoding/gob.(*Decoder).decodeMap
tree - show a text rendering of call graph
peek - show callers/callees of functions
(pprof) peek reflect.unsafe_NewArray
Showing nodes accounting for 162.44MB, 100% of 162.44MB total
----------------------------------------------------------+-------------
flat flat% sum% cum cum% calls calls% + context
----------------------------------------------------------+-------------
155.61MB 100% | reflect.MakeSlice
155.61MB 95.80% 95.80% 155.61MB 95.80% | reflect.unsafe_NewArray
----------------------------------------------------------+-------------
(pprof) peek reflect.MakeSlice
Showing nodes accounting for 162.44MB, 100% of 162.44MB total
----------------------------------------------------------+-------------
flat flat% sum% cum cum% calls calls% + context
----------------------------------------------------------+-------------
155.61MB 100% | encoding/gob.(*Decoder).decodeSlice
0 0% 0% 155.61MB 95.80% | reflect.MakeSlice
155.61MB 100% | reflect.unsafe_NewArray
----------------------------------------------------------+-------------
You can view execution trace data with the go tool trace command. To collect a 5-second execution trace and open it:
curl -o trace.out "http://127.0.0.1:8550/debug/pprof/trace?seconds=5"
go tool trace trace.out
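If you prefer to collect the trace from inside the program instead of over HTTP, the runtime/trace package provides Start and Stop. A minimal sketch, with an assumed output path:

package main

import (
    "fmt"
    "os"
    "runtime/trace"
    "time"
)

func main() {
    f, err := os.Create("/tmp/trace.out")
    if err != nil {
        fmt.Println(err)
        os.Exit(1)
    }
    defer f.Close()

    if err := trace.Start(f); err != nil {
        fmt.Println(err)
        os.Exit(1)
    }
    defer trace.Stop() // flushes the trace when main returns

    time.Sleep(5 * time.Second) // stand-in for the work you want to trace
}

View the resulting file the same way: go tool trace /tmp/trace.out.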
Hold ctrl and press \ to make the running program print the stack of every goroutine and exit.
Remember that the duration that the mutex has been locked is not available when you get the stack trace by holding ctrl and pressing \ to exit the running program. You must download the goroutine stack with debug=2 to view that information.
Download the goroutine stack; it is similar to the output from a Go program that crashes and lists every running goroutine on exit.
curl -o goroutine_stack.txt "http://127.0.0.1:8550/debug/pprof/goroutine?debug=2"
This is useful for solving the error runtime: goroutine stack exceeds 1000000000-byte limit. Reading the goroutine stack shows you which functions are being called or started too often.
// this fails with: runtime: goroutine stack exceeds 1000000000-byte limit
func a() {
    if queue_item_exists {
        // process queue item
        a()
    } else {
        time.Sleep(time.Second)
        a()
    }
}

// this works: each call returns after starting a new goroutine, so no stack grows without bound
func a() {
    if queue_item_exists {
        // process queue item
        go a()
    } else {
        time.Sleep(time.Second)
        go a()
    }
}
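A plain loop also avoids the growing stack, without handing the work to a new goroutine each time; this variant uses the same assumed queue_item_exists flag and is only a sketch:

// this also works: a loop never grows the stack and stays on one goroutine
func a() {
    for {
        if queue_item_exists {
            // process queue item
        } else {
            time.Sleep(time.Second)
        }
    }
}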