The current vmm trace ring buffer implementation is in vmm_sol_dev.c. There is one global ring buffer with a maximum size of 1 MB. Each message in the ring buffer is 280 bytes, allowing 3744 messages in the buffer.
> ::sizeof vmm_trace_dmsg_t |=D
280
> ::print -t vmm_trace_dmsg_t
vmm_trace_dmsg_t {
    timespec_t timestamp {
        time_t tv_sec
        long tv_nsec
    }
    char [256] buf
    struct vmm_trace_dmsg *next
}
Questionable implementation decisions include:
- There is one global ring buffer, and messages are not tagged with the vmm instance that logged them.
- Each node in the buffer is dynamically allocated until the 1 MB limit is reached, leading to rather complicated allocation logic.
- The ring buffer memory is not freed when vmm unloads.
Currently there are no callers of vmm_trace_log(), which means what is present
is dead code.
The tracing facility needs to fulfill the following requirements:
- Each call must be relatively light-weight.
- Each message needs to be attributable to a particular instantiation of a virtual machine.
- Each message needs a time stamp.
- Each message needs a severity.
- Messages must survive the death of the VM and the non-global zone that contains the VM.
- Messages must be easily accessible through mdb.
- Messages must be retrievable via vmm_ioctl() or similar.
There will be a single global ring buffer that is allocated and initialized as the first virtual machine starts. The number of entries in the ring buffer will be tunable up until the first VM is started.
Trace messages are inserted into the global ring buffer with:
void vmm_trace_log(struct vm *vm, vmm_trace_flags_t severity,
const char *fmt, ...);
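For illustration only (there are no callers today), a call site might look like the following; the message text and the vcpuid and port variables are hypothetical:
    vmm_trace_log(vm, VTF_DEBUG, "vcpu %d: unhandled ioport 0x%x read",
        vcpuid, port);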
The number of trace messages kept is tunable until the first VM is started.
int vmm_trace_log_count = 5000; /* Arbitrary, about 1.5 MB */
The following symbols will be helpful for debugging:
- vmm_trace_node_t *vmm_debug_rbuf is the ring buffer.
- int vmm_debug_rbuf_entries is the number of elements in vmm_debug_rbuf.
- list_t vmm_trace_heads keeps track of per-VM vmm_trace_head_ts. This helps you find trace messages for VMs that no longer exist but still have messages in the ring buffer.
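As a rough sketch, under the assumptions that allocation happens from a vmm_trace_init() helper and that vmm_trace_head_t gains a vth_linkage list node (neither of which is specified above), the first-VM-start allocation could look like:

#include <sys/kmem.h>
#include <sys/list.h>
#include <sys/sysmacros.h>

int vmm_trace_log_count = 5000;     /* Tunable until the first VM starts */
vmm_trace_node_t *vmm_debug_rbuf;
int vmm_debug_rbuf_entries;
list_t vmm_trace_heads;

/* Hypothetical helper, called with the global trace lock held. */
static void
vmm_trace_init(void)
{
    /* Freeze the tunable and size the ring buffer from it. */
    vmm_debug_rbuf_entries = vmm_trace_log_count;
    vmm_debug_rbuf = kmem_zalloc(
        vmm_debug_rbuf_entries * sizeof (vmm_trace_node_t), KM_SLEEP);

    /* Assumes a vth_linkage list_node_t is added to vmm_trace_head_t. */
    list_create(&vmm_trace_heads, sizeof (vmm_trace_head_t),
        offsetof(vmm_trace_head_t, vth_linkage));
}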
Messages can be read by user space with:
int vm_get_next_log(struct vmctx *ctx, vmm_trace_flags_t severities,
vmm_trace_dmsg_t *msg);
Important declarations shared between user space and kernel space are:
typedef enum {
    /* logging levels */
    VTF_DEBUG = 0x01,
    VTF_ERROR = 0x02,
} vmm_trace_flags_t;
typedef struct vmm_trace_dmsg {
    timespec_t vtd_timestamp;
    char vtd_buf[256];              /* Always nul terminated */
    vmm_trace_flags_t vtd_flags;
} vmm_trace_dmsg_t;
Each ring buffer entry will be:
typedef struct vmm_trace_node {
    vmm_trace_dmsg_t vtn_msg;
    vmm_trace_head_t *vtd_vm_head;  /* Which VM was this from? */
    list_node_t vtd_vm_linkage;     /* Linkage on vtd_vm_head's message list */
} vmm_trace_node_t;
Each struct vm will reference a newly allocated vmm_trace_head_t.
vmm_trace_log() will update vtd_vm_head to reference the per-VM
vmm_trace_head_t and insert this vmm_trace_dmsg_t at the tail of a list in
vmm_trace_head_t.
typedef struct vmm_trace_head {
    vmm_t *vth_vm;                  /* If NULL, candidate for destruction */
    char vth_name[32];              /* VM name */
    zoneid_t vth_zoneid;
    zoneid_t vth_zonedid;           /* zone debug ID */
    list_head_t vth_msgs;
    kmutex_t vth_lock;
    kcondvar_t vth_cv;              /* To signal vm_get_next_log() */
    vmm_trace_dmsg_t *vth_nextmsg;  /* To help vm_get_next_log() */
    /* XXX included for discussion. Not sure how useful they are. */
    struct timeval vth_create_time;
    struct timeval vth_destroy_time;
} vmm_trace_head_t;
The vtd_flags field will be used to track logging levels and perhaps other
things. vmm_trace_log()'s severity argument will be masked to only accept
the logging severity bits.
Each vmm_trace_head_t is a member of the vmm_trace_heads linked list. As
the corresponding bhyve instance is being destroyed, the vmm_trace_head_t
becomes a candidate for destruction. It will only be destroyed when it no
longer references any messages.
There will be a global lock that is held during the following operations:
- During allocation of the ring buffer.
- As vmm_trace_log() selects the next vmm_trace_dmsg_t to use. It is not held as the vmm_trace_dmsg_t is updated (see the sketch after this list).
- As nodes are added to or removed from vmm_trace_heads.
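Continuing in the same kernel context, a sketch of vmm_trace_log() under those rules follows. The vmm_trace_lock and vmm_debug_rbuf_next names, the vmm_th_add_node()/vmm_th_remove_node() signatures, and the choice to fill in the message without the global lock held are assumptions, not settled interface:

static kmutex_t vmm_trace_lock;     /* Hypothetical global trace lock */
static int vmm_debug_rbuf_next;     /* Hypothetical next-slot index */

void
vmm_trace_log(struct vm *vm, vmm_trace_flags_t severity, const char *fmt, ...)
{
    vmm_trace_node_t *node;
    va_list ap;

    /* Mask off everything but the logging severity bits. */
    severity &= (VTF_DEBUG | VTF_ERROR);

    mutex_enter(&vmm_trace_lock);
    if (vmm_debug_rbuf == NULL)
        vmm_trace_init();
    /* Claim the next slot, wrapping when the end of the buffer is hit. */
    node = &vmm_debug_rbuf[vmm_debug_rbuf_next];
    vmm_debug_rbuf_next = (vmm_debug_rbuf_next + 1) % vmm_debug_rbuf_entries;
    mutex_exit(&vmm_trace_lock);

    /* A recycled slot is first unlinked from the VM that last used it. */
    if (node->vtd_vm_head != NULL)
        vmm_th_remove_node(node);

    /* The message itself is filled in without the global lock held. */
    gethrestime(&node->vtn_msg.vtd_timestamp);
    node->vtn_msg.vtd_flags = severity;
    va_start(ap, fmt);
    (void) vsnprintf(node->vtn_msg.vtd_buf,
        sizeof (node->vtn_msg.vtd_buf), fmt, ap);
    va_end(ap);

    /* Link the node onto this VM's vmm_trace_head_t under vth_lock. */
    vmm_th_add_node(vm, node);
}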
The vth_lock in a vmm_trace_head_t is held during the following operations:
- While the vm_get_next_log() ioctl is reading vth_nextmsg.
- While the vm_get_next_log() ioctl is copying a message.
- While a vmm_trace_node_t is being removed from or added to its vth_msgs list (e.g. via vmm_th_remove_node() and vmm_th_add_node()).
The vth_cv is signaled by vmm_th_add_node() when it changes vth_nextmsg
from NULL to a valid message. It will also be signaled as the VM is destroyed so that the ioctl
may return an error.
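A sketch of vmm_th_add_node() consistent with that is below; the function signature and the vm_trace_head() accessor are assumptions:

void
vmm_th_add_node(struct vm *vm, vmm_trace_node_t *node)
{
    /* vm_trace_head() is a hypothetical accessor for the per-VM head. */
    vmm_trace_head_t *vth = vm_trace_head(vm);

    mutex_enter(&vth->vth_lock);
    node->vtd_vm_head = vth;
    list_insert_tail(&vth->vth_msgs, node);
    /* Wake a vm_get_next_log() caller that had run out of messages. */
    if (vth->vth_nextmsg == NULL) {
        vth->vth_nextmsg = &node->vtn_msg;
        cv_signal(&vth->vth_cv);
    }
    mutex_exit(&vth->vth_lock);
}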
As vmm_th_remove_node() is removing a node, it will check to see if the
vmm_trace_head_t should be destroyed. It should be destroyed if vth_vm is
NULL and the vth_msgs list is empty.
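A matching sketch of vmm_th_remove_node(), including the destruction check, might look like this (advancing vth_nextmsg when its node is recycled is omitted for brevity, and the details beyond the names given above are assumptions):

void
vmm_th_remove_node(vmm_trace_node_t *node)
{
    vmm_trace_head_t *vth = node->vtd_vm_head;
    boolean_t destroy;

    mutex_enter(&vth->vth_lock);
    list_remove(&vth->vth_msgs, node);
    node->vtd_vm_head = NULL;
    /* Destroy the head once its VM is gone and no messages remain. */
    destroy = (vth->vth_vm == NULL && list_is_empty(&vth->vth_msgs));
    mutex_exit(&vth->vth_lock);

    if (destroy) {
        /* Unlinking from vmm_trace_heads requires the global lock. */
        mutex_enter(&vmm_trace_lock);
        list_remove(&vmm_trace_heads, vth);
        mutex_exit(&vmm_trace_lock);
        cv_destroy(&vth->vth_cv);
        mutex_destroy(&vth->vth_lock);
        kmem_free(vth, sizeof (*vth));
    }
}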
zhyve will gain a logging thread that will loop on calls to
vm_get_next_log(), writing the received messages to stderr. Per OS-6831,
these messages will find their way into the platform log at
<zonepath>/logs/platform.log.
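A sketch of that logging thread is below, assuming vm_get_next_log() blocks until a message is available and returns 0 on success and non-zero once the VM has been destroyed:

#include <stdio.h>

/* Hypothetical zhyve logging thread body; ctx is the bhyve vmctx handle. */
static void *
zhyve_log_thread(void *arg)
{
    struct vmctx *ctx = arg;
    vmm_trace_dmsg_t msg;

    /* Ask for both severities; finer filtering could be done here. */
    while (vm_get_next_log(ctx, VTF_DEBUG | VTF_ERROR, &msg) == 0) {
        (void) fprintf(stderr, "[%lld.%09ld] %s%s\n",
            (long long)msg.vtd_timestamp.tv_sec,
            (long)msg.vtd_timestamp.tv_nsec,
            (msg.vtd_flags & VTF_ERROR) ? "error: " : "",
            msg.vtd_buf);
    }
    return (NULL);
}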