The current vmm trace ring buffer implementation is in vmm_sol_dev.c. There is one global ring buffer with a maximum size of 1 MB. Each message in the ring buffer is 280 bytes, allowing 3744 messages in the buffer.
> ::sizeof vmm_trace_dmsg_t |=D
280
> ::print -t vmm_trace_dmsg_t
vmm_trace_dmsg_t {
    timespec_t timestamp {
        time_t tv_sec
        long tv_nsec
    }
    char [256] buf
    struct vmm_trace_dmsg *next
}
Questionable implementation decisions include:
- There is one global ring buffer and messages are not tagged with the vmm instance that logged them.
- Each node in the buffer is dynamically allocated until the 1 MB limit is reached, leading to rather complicated allocation logic.
- The ring buffer memory is not freed when vmm unloads.
Currently there are no callers of vmm_trace_log(), which means the code that is present is dead code.
The tracing facility needs to fulfill the following requirements:
- Each call must be relatively light-weight.
- Each message needs to be attributable to a particular instantiation of a virtual machine.
- Each message needs a time stamp.
- Each message needs a severity.
- Messages must survive the death of the VM and the non-global zone that contains the VM.
- Messages must be easily accessible through mdb.
- Messages must be retrievable via vmm_ioctl() or similar.
There will be a single global ring buffer that is allocated and initialized as the first virtual machine starts. The number of entries in the ring buffer will be tunable up until the first VM is started.
Trace messages are inserted into the global ring buffer with:
void vmm_trace_log(struct vm *vm, vmm_trace_flags_t severity,
const char *fmt, ...);
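For example, a call site might look like this (the message itself is purely illustrative):

	vmm_trace_log(vm, VTF_DEBUG, "vcpu %d entered the run loop", vcpuid);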
As noted above, the number of trace messages kept is tunable until the first VM is started:
int vmm_trace_log_count = 5000; /* Arbitrary, about 1.5 MB */
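A minimal sketch of that allocation, assuming a vmm_trace_alloc() helper called as the first VM is created and a global vmm_trace_lock (both names are illustrative; vmm_debug_rbuf and vmm_debug_rbuf_entries are the debugging symbols described below):

static kmutex_t vmm_trace_lock;		/* illustrative global lock */

static void
vmm_trace_alloc(void)
{
	mutex_enter(&vmm_trace_lock);
	if (vmm_debug_rbuf == NULL) {
		/* Snapshot the tunable; it is fixed from this point on. */
		vmm_debug_rbuf_entries = vmm_trace_log_count;
		vmm_debug_rbuf = kmem_zalloc(
		    vmm_debug_rbuf_entries * sizeof (vmm_trace_node_t),
		    KM_SLEEP);
	}
	mutex_exit(&vmm_trace_lock);
}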
The following symbols will be helpful for debugging:
- vmm_trace_node_t *vmm_debug_rbuf is the ring buffer.
- int vmm_debug_rbuf_entries is the number of elements in vmm_debug_rbuf.
- list_t vmm_trace_heads keeps track of per-VM vmm_trace_head_t structures. This helps you find trace messages for VMs that no longer exist but still have messages in the ring buffer.
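For example, something along these lines could be used to inspect the first ring buffer slot and the known trace heads (illustrative; the exact dcmds will depend on what debugger support is added):

> *vmm_debug_rbuf::print -t vmm_trace_node_t
> vmm_trace_heads::walk list | ::print vmm_trace_head_t vth_name vth_zoneid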
Messages can be read by user space with:
int vm_get_next_log(struct vmctx *ctx, vmm_trace_flags_t severities,
vmm_trace_dmsg_t *msg);
Important declarations shared between user space and kernel space are:
typedef enum {
	/* logging levels */
	VTF_DEBUG = 0x01,
	VTF_ERROR = 0x02,
} vmm_trace_flags_t;

typedef struct vmm_trace_dmsg {
	timespec_t vtd_timestamp;
	char vtd_buf[256];		/* Always nul terminated */
	vmm_trace_flags_t vtd_flags;
} vmm_trace_dmsg_t;
Each ring buffer entry will be:
typedef struct vmm_trace_node {
	vmm_trace_dmsg_t vtn_msg;
	vmm_trace_head_t *vtd_vm_head;	/* Which VM was this from? */
	list_node_t vtd_vm_linkage;	/* Linkage on the list in vtd_vm_head */
} vmm_trace_node_t;
Each struct vm will reference a newly allocated vmm_trace_head_t. vmm_trace_log() will update vtd_vm_head to reference the per-VM vmm_trace_head_t and insert this message's vmm_trace_node_t at the tail of the vth_msgs list in that vmm_trace_head_t.
typedef struct vmm_trace_head {
	vmm_t *vth_vm;			/* If NULL, candidate for destruction */
	char vth_name[32];		/* VM name */
	zoneid_t vth_zoneid;
	zonedid_t vth_zonedid;
	list_t vth_msgs;
	kmutex_t vth_lock;
	kcondvar_t vth_cv;		/* To signal vm_get_next_log() */
	vmm_trace_dmsg_t *vth_nextmsg;	/* To help vm_get_next_log() */
	/* XXX included for discussion. Not sure how useful they are. */
	struct timeval vth_create_time;
	struct timeval vth_destroy_time;
} vmm_trace_head_t;
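A sketch of the per-VM setup as a VM is created. The helper name is illustrative, vmm_trace_lock is the illustrative global lock from the allocation sketch above, and membership on vmm_trace_heads assumes a list_node_t linkage field not shown in the struct:

static vmm_trace_head_t *
vmm_trace_head_alloc(vmm_t *vm, const char *name, zoneid_t zid)
{
	vmm_trace_head_t *vth;

	vth = kmem_zalloc(sizeof (*vth), KM_SLEEP);
	vth->vth_vm = vm;
	(void) strlcpy(vth->vth_name, name, sizeof (vth->vth_name));
	vth->vth_zoneid = zid;
	mutex_init(&vth->vth_lock, NULL, MUTEX_DRIVER, NULL);
	cv_init(&vth->vth_cv, NULL, CV_DRIVER, NULL);
	list_create(&vth->vth_msgs, sizeof (vmm_trace_node_t),
	    offsetof(vmm_trace_node_t, vtd_vm_linkage));

	/* Publish the head on the global list of trace heads. */
	mutex_enter(&vmm_trace_lock);
	list_insert_tail(&vmm_trace_heads, vth);
	mutex_exit(&vmm_trace_lock);

	return (vth);
}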
The vtd_flags field will be used to track logging levels and perhaps other things. vmm_trace_log()'s severity argument will be masked to only accept the logging severity bits.
Each vmm_trace_head_t
is a member of the vmm_trace_heads
linked list. As
the corresponding bhyve instance is being destroyed, the vmm_trace_head_t
becomes a candidate for destruction. It will only be destroyed when it no
longer references any messages.
There will be a global lock that is held during the following operations:
- During allocation of the ring buffer.
- As vmm_trace_log() selects the next vmm_trace_dmsg_t to use. It is not held as the vmm_trace_dmsg_t is updated.
- As nodes are added to or removed from vmm_trace_heads.
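A sketch of vmm_trace_log() under these rules, including the severity masking described above. The vmm_trace_lock global, the vmm_trace_next index, and the vmm_trace_head_of() accessor are illustrative names, as is the detail of detaching a recycled slot from its previous VM's list once the ring wraps:

static uint_t vmm_trace_next;		/* illustrative: next slot to use */

void
vmm_trace_log(struct vm *vm, vmm_trace_flags_t severity, const char *fmt, ...)
{
	vmm_trace_node_t *node;
	va_list ap;

	/* Select the next slot with the global lock held. */
	mutex_enter(&vmm_trace_lock);
	node = &vmm_debug_rbuf[vmm_trace_next++ % vmm_debug_rbuf_entries];
	mutex_exit(&vmm_trace_lock);

	/* If the ring has wrapped, detach the slot from its old VM's list. */
	if (node->vtd_vm_head != NULL)
		vmm_th_remove_node(node->vtd_vm_head, node);

	/* Fill in the message; the global lock is not held for this. */
	gethrestime(&node->vtn_msg.vtd_timestamp);
	node->vtn_msg.vtd_flags = severity & (VTF_DEBUG | VTF_ERROR);
	va_start(ap, fmt);
	(void) vsnprintf(node->vtn_msg.vtd_buf,
	    sizeof (node->vtn_msg.vtd_buf), fmt, ap);
	va_end(ap);

	/* Attribute the message to this VM and queue it for readers. */
	node->vtd_vm_head = vmm_trace_head_of(vm);	/* illustrative accessor */
	vmm_th_add_node(node->vtd_vm_head, node);
}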
The vth_lock in a vmm_trace_head_t is held during the following operations:
- While the vm_get_next_log() ioctl is reading vth_nextmsg.
- While the vm_get_next_log() ioctl is copying a message.
- While a vmm_trace_node_t is being removed from or added to its vth_msgs list (e.g. via vmm_th_remove_node() and vmm_th_add_node()).
The vth_cv is signaled by vmm_th_add_node() when it changes vth_nextmsg from NULL to point at a new message. It will also be signaled as the VM is destroyed so that the ioctl may return an error.
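A sketch of vmm_th_add_node(), assuming vth_nextmsg tracks the oldest message not yet handed to the reader:

static void
vmm_th_add_node(vmm_trace_head_t *vth, vmm_trace_node_t *node)
{
	mutex_enter(&vth->vth_lock);
	list_insert_tail(&vth->vth_msgs, node);

	/* Wake a reader blocked in vm_get_next_log() if it had run dry. */
	if (vth->vth_nextmsg == NULL) {
		vth->vth_nextmsg = &node->vtn_msg;
		cv_signal(&vth->vth_cv);
	}
	mutex_exit(&vth->vth_lock);
}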
As vmm_th_remove_node() is removing a node, it will check to see if the vmm_trace_head_t should be destroyed. It should be destroyed if vth_vm is NULL and the vth_msgs list is empty.
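A sketch of that check; vmm_trace_head_free() is an illustrative helper that would undo the initialization shown earlier and unlink the head from vmm_trace_heads under the global lock:

static void
vmm_th_remove_node(vmm_trace_head_t *vth, vmm_trace_node_t *node)
{
	boolean_t destroy;

	mutex_enter(&vth->vth_lock);
	list_remove(&vth->vth_msgs, node);
	node->vtd_vm_head = NULL;

	/* The head may go away once its VM is gone and no messages remain. */
	destroy = (vth->vth_vm == NULL && list_is_empty(&vth->vth_msgs));
	mutex_exit(&vth->vth_lock);

	if (destroy)
		vmm_trace_head_free(vth);
}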
zhyve will gain a logging thread that will loop on calls to vm_get_next_log(), writing the received messages to stderr. Per OS-6831, these messages will find their way into the platform log at <zonepath>/logs/platform.log.
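A sketch of that thread, assuming vm_get_next_log() blocks until a message is available and returns non-zero once the VM has been destroyed (the timestamp and severity formatting are illustrative):

static void *
vmm_log_thread(void *arg)
{
	struct vmctx *ctx = arg;
	vmm_trace_dmsg_t msg;

	while (vm_get_next_log(ctx, VTF_DEBUG | VTF_ERROR, &msg) == 0) {
		(void) fprintf(stderr, "[%lld.%09ld] %s %s\n",
		    (long long)msg.vtd_timestamp.tv_sec,
		    msg.vtd_timestamp.tv_nsec,
		    (msg.vtd_flags & VTF_ERROR) ? "ERROR" : "DEBUG",
		    msg.vtd_buf);
	}

	/* vm_get_next_log() fails once the VM is destroyed; exit then. */
	return (NULL);
}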