Side note: all available resource metrics are documented here:
These isolators keep a task's files on disk isolated from both the host system and other running tasks.
Generic POSIX-compatible file isolation. Essentially creates a directory owned by the task user/group.
// TODO(hausdorff): (MESOS-5462) For now the Windows isolators are essentially
// direct copies of their POSIX counterparts. In the future, we expect to
// refactor the POSIX classes into platform-independent base class, with
// Windows and POSIX implementations. For now, we leave the Windows
// implementations as inheriting from the POSIX implementations.
Linux-specific isolation using mount namespaces.
// This isolator is to be used when all containers share the host's
// filesystem. It supports creating mounting "volumes" from the host
// into each container's mount namespace. In particular, this can be
// used to give each container a "private" system directory, such as
// /tmp and /var/tmp.
Deprecated in favor of filesystem/linux.
These isolators ensure that a task stays within its resource allocation at runtime, and they also report runtime usage metrics for the given resource.
Performs no actual resource isolation, but does report usage metrics.
Metrics: CPU user time and system time. See: https://github.com/apache/mesos/blob/037a346a205ad7bdba99d771855f8caeea835d4a/src/usage/usage.cpp#L35
Performs no actual resource isolation, but does report usage metrics.
Metrics: mem_rss_bytes. See: https://github.com/apache/mesos/blob/037a346a205ad7bdba99d771855f8caeea835d4a/src/usage/usage.cpp#L35
Uses du -k -s to ensure tasks stay within disk usage limits.
Can Kill Tasks? Yes
Metrics: disk_limit_bytes, disk_used_bytes
// This isolator monitors the disk usage for containers, and reports
// ContainerLimitation when a container exceeds its disk quota. This
// leverages the DiskUsageCollector to ensure that we don't induce too
// much CPU usage and disk caching effects from running 'du' too
// often.
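The du-based check described above can be sketched roughly as follows. This is a minimal illustration, not the isolator's actual code: the function names disk_used_bytes and exceeds_quota are hypothetical, and the sketch runs du once on demand rather than throttling it the way the DiskUsageCollector is described as doing.

```python
import subprocess


def disk_used_bytes(path: str) -> int:
    """Return the disk usage of `path` in bytes, as reported by `du -k -s`."""
    # `du -k -s <path>` prints one line: "<kilobytes><TAB><path>".
    out = subprocess.run(
        ["du", "-k", "-s", path],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    kilobytes = int(out.split()[0])
    return kilobytes * 1024


def exceeds_quota(path: str, disk_limit_bytes: int) -> bool:
    """Hypothetical check: True when the sandbox uses more than its limit."""
    return disk_used_bytes(path) > disk_limit_bytes
```

Because du walks the whole directory tree on every call, a real monitor would want to rate-limit these calls, which is the concern the quoted comment raises.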
Alias for posix/disk
Can Kill Tasks? Yes
The XFS Disk isolator uses XFS project quotas to track the disk space used by each container sandbox and to enforce the corresponding disk space allocation. Write operations performed by tasks exceeding their disk allocation will fail with an EDQUOT error. The task will not be terminated by the containerizer.
The XFS disk isolator is functionally similar to the POSIX disk isolator but avoids the cost of repeatedly running du. Although the two will not interfere with each other, using them together is not recommended.
Metrics: disk_limit_bytes, disk_used_bytes
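Since the XFS isolator fails writes with EDQUOT instead of killing the task, a task may want to recognise that error explicitly. The sketch below shows one way a task could do so; the helper names is_quota_error and append_or_report are hypothetical and are not part of Mesos.

```python
import errno


def is_quota_error(exc: OSError) -> bool:
    """True when an OS error reports that the disk quota was exceeded."""
    return exc.errno == errno.EDQUOT


def append_or_report(path: str, data: bytes) -> bool:
    """Append `data` to `path`; return False if the quota was hit.

    Any other OSError is re-raised unchanged.
    """
    try:
        with open(path, "ab") as f:
            f.write(data)
            f.flush()
    except OSError as exc:
        if is_quota_error(exc):
            return False
        raise
    return True
```

A task using something like this can back off, rotate logs, or exit cleanly when its sandbox is full, rather than treating the failure as an unknown I/O error.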
// A basic MesosIsolatorProcess that keeps track of the pid but
// doesn't do any resource isolation. Subclasses must implement
// usage() for their appropriate resource(s).
//
// TODO(hausdorff): (MESOS-5462) For now the Windows isolators are essentially
// direct copies of their POSIX counterparts. In the future, we expect to
// refactor the POSIX classes into platform-independent base class, with
// Windows and POSIX implementations. For now, we leave the Windows
// implementations as inheriting from the POSIX implementations.
Uses the cgroups cpu and cpuacct subsystems:
cpu
Cgroups can be guaranteed a minimum number of "CPU shares"
when a system is busy. This does not limit a cgroup's CPU
usage if the CPUs are not busy.
Further information can be found in the kernel source file
Documentation/scheduler/sched-bwc.txt.
cpuacct
This provides accounting for CPU usage by groups of tasks.
Further information can be found in the kernel source file
Documentation/cgroup-v1/cpuacct.txt.
(from cgroups(7) man page)
// Use the Linux cpu cgroup controller for cpu isolation which uses the
// Completely Fair Scheduler (CFS).
// - cpushare implements proportionally weighted scheduling.
// - cfs implements hard quota based scheduling.
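To make the difference between the two modes concrete, here is a small arithmetic sketch. It assumes the standard cgroup v1 semantics: cpu.shares are relative weights that only matter under contention, and cpu.cfs_quota_us / cpu.cfs_period_us is a hard cap. The function names are illustrative only.

```python
from typing import Dict


def contended_cpu_fraction(shares: Dict[str, int]) -> Dict[str, float]:
    """Fraction of CPU time each cgroup gets when all of them are busy.

    Shares are weights: a cgroup's fraction is its shares divided by the
    sum of the shares of all competing cgroups.
    """
    total = sum(shares.values())
    return {name: value / total for name, value in shares.items()}


def cfs_cpu_cap(quota_us: int, period_us: int) -> float:
    """Hard cap, in CPUs, implied by a CFS quota and period."""
    return quota_us / period_us


# Two busy containers weighted 1024 and 3072 split the CPU 25% / 75%.
fractions = contended_cpu_fraction({"a": 1024, "b": 3072})

# A 50 ms quota per 100 ms period caps the cgroup at half a CPU,
# even if the rest of the machine is idle.
cap = cfs_cpu_cap(50_000, 100_000)
```

With shares alone, container "a" could still use the whole machine when "b" is idle. With a CFS quota it cannot exceed its cap.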
Metrics: processes, threads, cpus_user_time_secs, cpus_system_time_secs
Additional metrics when using CFS: cpus_nr_periods, cpus_nr_throttled, cpus_throttled_time_secs
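The CFS metrics above correspond closely to the fields of the cgroup v1 cpu.stat file. The sketch below parses that file's text into the three metric names. It assumes that throttled_time is reported in nanoseconds and that the metrics map one-to-one onto the file's fields. The function names are hypothetical.

```python
from typing import Dict


def parse_cpu_stat(text: str) -> Dict[str, int]:
    """Parse "key value" lines, as found in a cgroup v1 cpu.stat file."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    return stats


def cfs_metrics(cpu_stat_text: str) -> Dict[str, float]:
    """Map cpu.stat fields onto the CFS metric names (assumed mapping)."""
    stats = parse_cpu_stat(cpu_stat_text)
    return {
        "cpus_nr_periods": stats["nr_periods"],
        "cpus_nr_throttled": stats["nr_throttled"],
        # Assumption: throttled_time is in nanoseconds.
        "cpus_throttled_time_secs": stats["throttled_time"] / 1e9,
    }


sample = "nr_periods 100\nnr_throttled 7\nthrottled_time 2500000000\n"
metrics = cfs_metrics(sample)
```

A high ratio of cpus_nr_throttled to cpus_nr_periods suggests the task is regularly hitting its CFS quota.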
// This isolator uses the cgroups devices subsystem to
// restrict access to devices in `/dev`. A small set of
// default devices are whitelisted upon container creation,
// and access to all other devices is restricted. It is
// assumed that other isolators will be used to allow / deny
// access to devices outside the default whitelist.
devices
This supports controlling which tasks may create (mknod)
devices as well as open them for reading or writing. The
policies may be specified as whitelists and blacklists.
Hierarchy is enforced, so new rules must not violate existing
rules for the target or ancestor cgroups.
Further information can be found in the kernel source file
Documentation/cgroup-v1/devices.txt.
(from cgroups(7) man page)
Metrics: none
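As an illustration of what a whitelist entry looks like, the devices controller takes rules of the form "type major:minor access", where type is a, b or c, major and minor may be *, and access is any combination of r, w and m (mknod). The sketch below parses and evaluates one rule. It is only a model of the rule format, not the isolator's code, and the names DeviceRule, parse_device_rule and allows are hypothetical.

```python
from typing import NamedTuple


class DeviceRule(NamedTuple):
    dev_type: str  # "a" (all), "b" (block) or "c" (character)
    major: str     # device major number, or "*"
    minor: str     # device minor number, or "*"
    access: str    # subset of "rwm"


def parse_device_rule(entry: str) -> DeviceRule:
    """Parse an entry such as "c 1:3 rwm" into a DeviceRule."""
    dev_type, numbers, access = entry.split()
    major, minor = numbers.split(":")
    if dev_type not in ("a", "b", "c"):
        raise ValueError("unknown device type: " + dev_type)
    if not access or set(access) - set("rwm"):
        raise ValueError("invalid access string: " + access)
    return DeviceRule(dev_type, major, minor, access)


def allows(rule: DeviceRule, dev_type: str, major: int, minor: int,
           access: str) -> bool:
    """True when `rule` permits `access` on the given device."""
    return (
        rule.dev_type in ("a", dev_type)
        and rule.major in ("*", str(major))
        and rule.minor in ("*", str(minor))
        and set(access) <= set(rule.access)
    )


# /dev/null is character device 1:3.
dev_null = parse_device_rule("c 1:3 rwm")
```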
Uses the cgroups memory subsystem:
memory
The memory controller supports reporting and limiting of
process memory, kernel memory, and swap used by cgroups.
Further information can be found in the kernel source file
Documentation/cgroup-v1/memory.txt.
Can Kill Tasks? Yes
Metrics:
mem_total_bytes
// Total memory + swap usage. This is set if swap is enabled.
mem_total_memsw_bytes
// Hard memory limit for a container.
mem_limit_bytes
// Soft memory limit for a container.
mem_soft_limit_bytes
// Broken out memory usage information: pagecache, rss (anonymous),
// mmaped files and swap.
// TODO(chzhcn) mem_file_bytes and mem_anon_bytes are deprecated in
// 0.23.0 and will be removed in 0.24.0.
mem_file_bytes
mem_anon_bytes
// mem_cache_bytes is added in 0.23.0 to represent page cache usage.
mem_cache_bytes
// Since 0.23.0, mem_rss_bytes is changed to represent only
// anonymous memory usage. Note that neither its requiredness, type,
// name nor numeric tag has been changed.
mem_rss_bytes
mem_mapped_file_bytes
// This is only set if swap is enabled.
mem_swap_bytes
mem_unevictable_bytes
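Several of the broken-out metrics above resemble fields in the cgroup v1 memory.stat file. The sketch below shows one plausible mapping. It is an assumption for illustration only: the real isolator may read different keys (for example the hierarchical total_* counters), and the names STAT_TO_METRIC and memory_metrics are hypothetical.

```python
from typing import Dict

# Assumed mapping from memory.stat keys to metric names.
STAT_TO_METRIC = {
    "cache": "mem_cache_bytes",
    "rss": "mem_rss_bytes",
    "mapped_file": "mem_mapped_file_bytes",
    "swap": "mem_swap_bytes",
    "unevictable": "mem_unevictable_bytes",
}


def memory_metrics(memory_stat_text: str) -> Dict[str, int]:
    """Extract the broken-out memory metrics from memory.stat text.

    Keys missing from the input (such as swap when swap accounting is
    disabled) are simply left out of the result.
    """
    metrics = {}
    for line in memory_stat_text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue
        key, value = parts
        metric = STAT_TO_METRIC.get(key)
        if metric is not None:
            metrics[metric] = int(value)
    return metrics


sample = "cache 4096\nrss 8192\nmapped_file 1024\nunevictable 0\n"
metrics = memory_metrics(sample)
```

The sample has no swap line, so mem_swap_bytes is absent from the result, which mirrors the note that the field is only set if swap is enabled.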
The cgroups/net_cls isolator allows operators to provide network performance isolation and network segmentation for containers within a Mesos cluster.
Metrics: none
TODO
See docker/runtime below. Same concept, but for appc images.
Metrics: none
The Docker Runtime isolator supports runtime configurations from the Docker image (e.g., Entrypoint/Cmd, Env, etc.). This isolator is tied to --image_providers=docker: if --image_providers contains docker, this isolator must be used; otherwise, the agent will refuse to start.
To enable the Docker Runtime isolator, append docker/runtime to the --isolation flag when starting the agent.
Currently, the Docker image's default Entrypoint, Cmd, Env, and WorkingDir are supported by the Docker Runtime isolator. Users can specify a CommandInfo to override the default Entrypoint and Cmd in the image (see below for details). The CommandInfo should be inside either TaskInfo or ExecutorInfo (depending on whether the task is a command task or uses a custom executor, respectively).
// The docker runtime isolator is responsible for preparing mesos
// container by merging runtime configuration specified by user
// and docker image default configuration.
Metrics: none
Allows using Docker Volumes within Mesos. Read docs here
Metrics: none
TODO
TODO
PID namespaces isolate the process ID number space, meaning that
processes in different PID namespaces can have the same PID. PID
namespaces allow containers to provide functionality such as
suspending/resuming the set of processes in the container and
migrating the container to a new host while the processes inside the
container maintain the same PIDs.
PIDs in a new PID namespace start at 1, somewhat like a standalone
system, and calls to fork(2), vfork(2), or clone(2) will produce
processes with PIDs that are unique within the namespace.
(from pid_namespaces(7) man page)
Metrics: none
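One way to observe PID namespace isolation on Linux is to compare the /proc/<pid>/ns/pid symlinks of two processes, which have the form pid:[<inode>]. Processes in the same PID namespace share the same inode number. The sketch below is an illustration under that assumption, with hypothetical function names. Reading another process's link may require sufficient privileges.

```python
import os
import re


def parse_ns_link(link: str) -> int:
    """Extract the namespace inode from a link target like "pid:[4026531836]"."""
    match = re.fullmatch(r"pid:\[(\d+)\]", link)
    if match is None:
        raise ValueError("not a PID namespace link: " + link)
    return int(match.group(1))


def pid_namespace_id(pid: int) -> int:
    """Return the PID namespace inode of process `pid` (Linux only)."""
    return parse_ns_link(os.readlink("/proc/{}/ns/pid".format(pid)))


def same_pid_namespace(pid_a: int, pid_b: int) -> bool:
    """True when both processes live in the same PID namespace."""
    return pid_namespace_id(pid_a) == pid_namespace_id(pid_b)
```

Comparing a containerized task's PID (as seen from the host) against the agent's PID with same_pid_namespace would be expected to return False when the task runs in its own PID namespace.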
TODO
TODO