perf is very handy, it comes with linux and works without a lot of setup (contrast: oprofile, systemtap)
probes work for kernel and userspace and you can record function arguments and local frame and other variables in the trace
you can use perf script to view the output of such traces, and it will display the variables bundled with the trace, by name, and with the format given when the probe was added
recently someone was asked to record what happened in systemd at source_dispatch, in sd-event; to see what the value of s->description was at the time, they were advised to use gdb
this kind of information is easy to record with perf
add the probe
perf probe -x /usr/lib/systemd/systemd --add 'source_dispatch s->description:string'
record the probe
perf record -e probe_systemd:source_dispatch -a
stop the record whenever appropriate
display the probe
perf script
small sample of trace recorded with this probe
systemd 1 [000] 2837.388006: probe_systemd:source_dispatch: (800b4340) description="automount-expire"
systemd 1 [000] 2838.637992: probe_systemd:source_dispatch: (800b4340) description="automount-expire"
systemd 1 [000] 2839.887996: probe_systemd:source_dispatch: (800b4340) description="automount-expire"
systemd 1 [001] 2840.138399: probe_systemd:source_dispatch: (800b4340) description="manager-notify"
systemd 1 [000] 2841.137888: probe_systemd:source_dispatch: (800b4340) description="automount-expire"
systemd 1 [000] 2842.387882: probe_systemd:source_dispatch: (800b4340) description="automount-expire"
you'll need debug symbols, various ways to do that, on fedora dnf debuginfo-install package
will install pretty much everything required for a specific package
finding the trace points
perf probe -x /usr/lib/systemd/systemd -v -F
will output a list of functions that are probable, part of which look like
...
socket_trigger_notify.lto_priv.382
socket_unwatch_control_pid.lto_priv.861
socket_unwatch_fds
socket_watch_fds.lto_priv.864
socketpair@plt
sort_chain
source_disconnect
source_dispatch.lto_priv.921
source_free.lto_priv.267
source_io_register
source_io_unregister
source_new.lto_priv.266
source_set_pending.lto_priv.630
specifier_boot_id
specifier_cgroup
...
the part we're interested is right in the middle, source_dispatch
finding variables available in function at trace point
perf probe -x /usr/lib/systemd/systemd -v -L source_dispatch
entire output reproduced here for clarity
Open Debuginfo file: /usr/lib/debug/usr/lib/systemd/systemd.debug
fname: src/libsystemd/sd-event/sd-event.c, lineno:2077
New line range: 2077 to 2147483647
path: src/libsystemd/sd-event/sd-event.c
<source_dispatch@/usr/src/debug/systemd-222/src/libsystemd/sd-event/sd-event.c:0>
0 static int source_dispatch(sd_event_source *s) {
int r = 0;
3 assert(s);
4 assert(s->pending || s->type == SOURCE_EXIT);
6 if (s->type != SOURCE_DEFER && s->type != SOURCE_EXIT) {
7 r = source_set_pending(s, false);
8 if (r < 0)
return r;
}
12 if (s->type != SOURCE_POST) {
sd_event_source *z;
Iterator i;
/* If we execute a non-post source, let's mark all
* post sources as pending */
19 SET_FOREACH(z, s->event->post_sources, i) {
20 if (z->enabled == SD_EVENT_OFF)
continue;
23 r = source_set_pending(z, true);
24 if (r < 0)
return r;
}
}
29 if (s->enabled == SD_EVENT_ONESHOT) {
30 r = sd_event_source_set_enabled(s, SD_EVENT_OFF);
31 if (r < 0)
return r;
}
35 s->dispatching = true;
37 switch (s->type) {
case SOURCE_IO:
40 r = s->io.callback(s, s->io.fd, s->io.revents, s->userdata);
break;
case SOURCE_TIME_REALTIME:
case SOURCE_TIME_BOOTTIME:
case SOURCE_TIME_MONOTONIC:
case SOURCE_TIME_REALTIME_ALARM:
case SOURCE_TIME_BOOTTIME_ALARM:
48 r = s->time.callback(s, s->time.next, s->userdata);
break;
case SOURCE_SIGNAL:
52 r = s->signal.callback(s, &s->signal.siginfo, s->userdata);
break;
case SOURCE_CHILD: {
bool zombie;
zombie = s->child.siginfo.si_code == CLD_EXITED ||
59 s->child.siginfo.si_code == CLD_KILLED ||
s->child.siginfo.si_code == CLD_DUMPED;
62 r = s->child.callback(s, &s->child.siginfo, s->userdata);
/* Now, reap the PID for good. */
65 if (zombie)
66 waitid(P_PID, s->child.pid, &s->child.siginfo, WNOHANG|WEXITED);
break;
}
case SOURCE_DEFER:
r = s->defer.callback(s, s->userdata);
break;
case SOURCE_POST:
r = s->post.callback(s, s->userdata);
break;
case SOURCE_EXIT:
80 r = s->exit.callback(s, s->userdata);
break;
case SOURCE_WATCHDOG:
case _SOURCE_EVENT_SOURCE_TYPE_MAX:
case _SOURCE_EVENT_SOURCE_TYPE_INVALID:
86 assert_not_reached("Wut? I shouldn't exist.");
}
89 s->dispatching = false;
91 if (r < 0) {
92 if (s->description)
93 log_debug_errno(r, "Event source '%s' returned error, disabling: %m", s->description);
else
95 log_debug_errno(r, "Event source %p returned error, disabling: %m", s);
}
98 if (s->n_ref == 0)
99 source_free(s);
else if (r < 0)
101 sd_event_source_set_enabled(s, SD_EVENT_OFF);
103 return 1;
104 }
static int event_prepare(sd_event *e) {
int r;
the value of s->description is needed and the syntax for dereference basically matches C's (s->description) and we can pass some help to script, which will be run to read the trace, that indicates it is a printable string (:string)
add the probe
perf probe -x /usr/lib/systemd/systemd -v --add 'source_dispatch s->description:string'
perf prints some useful information when you add this type of probe
probe-definition(0): source_dispatch s->description:string
symbol:source_dispatch file:(null) line:0 offset:0 return:0 lazy:(null)
parsing arg: s->description:string into type:string s, description(1)
1 arguments
Open Debuginfo file: /usr/lib/debug/usr/lib/systemd/systemd.debug
Try to find probe point from debuginfo.
Matched function: source_dispatch
Probe point found: source_dispatch+0
Searching 's' variable in context.
Converting variable s into trace event.
converting description in s
description type is (null).
Found 1 probe_trace_events.
Opening /sys/kernel/debug/tracing/uprobe_events write=1
Added new event:
Writing event: p:probe_systemd/source_dispatch /usr/lib/systemd/systemd:0x2d340 description=+0(+16(%ax)):string
probe_systemd:source_dispatch (on source_dispatch in /usr/lib/systemd/systemd with description=s->description:string)
You can now use it in all perf tools, such as:
perf record -e probe_systemd:source_dispatch -aR sleep 1
look at list of probes
perf probe -l
probe_systemd:source_dispatch (on source_dispatch@sd-event/sd-event.c in /usr/lib/systemd/systemd with description)
since in this case the program is already running and this trace point is already being triggered all the time we just record it immediately and don't use the suggested invocation
perf record -e probe_systemd:source_dispatch -a
perf records perf.data to the local directory and that is where perf report/script will read it from by default
have perf run the script that decodes the trace parameters for nice display
perf script
remove probe
perf probe --del probe_systemd:source_dispatch
one thing that is kind of confusing is that the man pages turn -> into the unicode code point →
if you spot any errors or have a suggestion feel free to contact
Thanks.
fix:
LC_ALL=C man perf-probe
:)