-------------------------------------------------------------
Skipping Docker tests because validation failed
[Error] Failed to execute 'docker version': exited with status exited with status 127
-------------------------------------------------------------
$ sudo ./bin/mesos-tests.sh --gtest_list_tests | grep -i cgroups
ROOT_CGROUPS_Cfs
ROOT_CGROUPS_Cfs_Big_Quota
ROOT_CGROUPS_Sample
CGROUPS_ROOT_PerfRollForward
ROOT_CGROUPS_BalloonFramework
CgroupsAnyHierarchyTest.
ROOT_CGROUPS_Enabled
ROOT_CGROUPS_Subsystems
ROOT_CGROUPS_Mounted
$ grep -R ::testing::Types src/tests -A 1
src/tests/isolator_tests.cpp:typedef ::testing::Types<PosixCpuIsolatorProcess,
src/tests/isolator_tests.cpp- CgroupsCpushareIsolatorProcess> CpuIsolatorTypes;
--
src/tests/isolator_tests.cpp:typedef ::testing::Types<PosixCpuIsolatorProcess> CpuIsolatorTypes;
src/tests/isolator_tests.cpp-#endif // __linux__
--
src/tests/isolator_tests.cpp:typedef ::testing::Types<PosixMemIsolatorProcess,
src/tests/isolator_tests.cpp- CgroupsMemIsolatorProcess> MemIsolatorTypes;
--
bmahler / reconciliation.md
Last active August 29, 2015 14:07
Reconciliation

There's no getting around it: frameworks on Mesos are distributed systems.

Distributed systems must deal with failures and partitions (the two are indistinguishable from a system's perspective).

Concretely, what does this mean for frameworks? Mesos uses an actor-like, message-passing programming model in which messages are delivered at-most-once. (Exceptions to this include task status updates, most of
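
The effect of at-most-once delivery can be illustrated with a small simulation. This is only a sketch under stated assumptions: `LossyChannel`, `missing_messages`, and `resend_until_consistent` are hypothetical names invented for illustration and are not part of any Mesos API. The channel delivers each message zero or one times, so the sender's view and the receiver's view can drift apart until the sender compares the two and resends what was lost.

```python
import random


class LossyChannel:
    """Hypothetical channel with at-most-once delivery (not a Mesos API)."""

    def __init__(self, drop_probability, seed=0):
        self.drop_probability = drop_probability
        self.rng = random.Random(seed)
        self.delivered = []

    def send(self, message):
        # The message arrives once or is silently dropped; it is never
        # duplicated and the channel itself never retries.
        if self.rng.random() >= self.drop_probability:
            self.delivered.append(message)


def missing_messages(sent, channel):
    """Return the messages the receiver never saw, in send order."""
    seen = set(channel.delivered)
    return [message for message in sent if message not in seen]


def resend_until_consistent(sent, channel, max_rounds=100):
    """Resend dropped messages until both sides agree; return rounds used."""
    rounds = 0
    missing = missing_messages(sent, channel)
    while missing:
        if rounds == max_rounds:
            raise RuntimeError("views still differ after max_rounds")
        for message in missing:
            channel.send(message)
        rounds += 1
        missing = missing_messages(sent, channel)
    return rounds


if __name__ == "__main__":
    channel = LossyChannel(drop_probability=0.5, seed=1)
    sent = [f"task-{i}" for i in range(20)]
    for message in sent:
        channel.send(message)
    print("lost on first attempt:", len(missing_messages(sent, channel)))
    print("rounds needed to converge:", resend_until_consistent(sent, channel))
```

The point of the sketch is that the channel gives the sender no signal when a message is lost, so consistency has to come from an explicit comparison of state rather than from the transport.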

bmahler / yosemite
Created October 18, 2014 22:28
Yosemite
[ RUN ] VersionTest.Parse
terminate called after throwing an instance of 'boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<boost::bad_lexical_cast> >'
what(): bad lexical cast: source type value could not be interpreted as target
*** Aborted at 1413669921 (unix time) try "date -d @1413669921" if you are using GNU date ***
PC: @ 0x7fff9570e282 __pthread_kill
*** SIGABRT (@0x7fff9570e282) received by PID 51428 (TID 0x7fff78786300) stack trace: ***
@ 0x7fff91106f1a _sigtramp
@ 0x100aef30f (anonymous namespace)::get_safe_base_mutex()::safe_base_mutex
@ 0x7fff9543ab73 abort
@ 0x100a203ab __gnu_cxx::__verbose_terminate_handler()
$ grep ci-tagupdate-stryker.2014.10.21T22.57.44-1414434875917-1-10-us_west_2c scheduler_log
DEBUG [2014-10-30 19:01:27,762] com.hubspot.singularity.scheduler.SingularityNewTaskChecker: Got task state UNHEALTHY_KILL_TASK for task ci-tagupdate-stryker.2014.10.21T22.57.44-1414434875917-1-10-us_west_2c in 00:00.004
DEBUG [2014-10-30 19:01:52,303] com.hubspot.singularity.scheduler.SingularityCleaner: Killing a task SingularityTaskCleanup [user=Optional.absent(), cleanupType=UNHEALTHY_NEW_TASK, timestamp=1414695687762, taskId=ci-tagupdate-stryker.2014.10.21T22.57.44-1414434875917-1-10-us_west_2c] immediately because of its cleanup type
DEBUG [2014-10-30 19:01:52,304] com.hubspot.singularity.scheduler.SingularityCleaner: TaskCleanup SingularityTaskCleanup [user=Optional.absent(), cleanupType=UNHEALTHY_NEW_TASK, timestamp=1414695687762, taskId=ci-tagupdate-stryker.2014.10.21T22.57.44-1414434875917-1-10-us_west_2c] had LB state NOT_LOAD_BALANCED after 00:00.000
INFO [2014-10-30 19:01:52,304] com.hubspot.singularity.m
I1030 19:23:53.686470 1592 master.cpp:3349] Performing explicit task state reconciliation for 12 tasks of framework Singularity
I1030 19:23:53.696506 1592 master.cpp:3349] Performing explicit task state reconciliation for 12 tasks of framework Singularity
I1030 19:23:53.706125 1592 master.cpp:3349] Performing explicit task state reconciliation for 12 tasks of framework Singularity
I1030 19:23:53.715679 1592 master.cpp:3349] Performing explicit task state reconciliation for 12 tasks of framework Singularity
I1030 19:23:53.726356 1592 master.cpp:3349] Performing explicit task state reconciliation for 11 tasks of framework Singularity
I1030 19:23:53.735402 1592 master.cpp:3349] Performing explicit task state reconciliation for 11 tasks of framework Singularity
I1030 19:23:53.744413 1592 master.cpp:3349] Performing explicit task state reconciliation for 11 tasks of framework Singularity
I1030 19:23:53.753397 1592 master.cpp:3349] Performing explicit task state reconciliation for 11 tasks of framework Sing
bmahler / operational-guide.md
Last active August 29, 2015 14:13
Operational Guide

Changing the master quorum

Currently, the master uses a Paxos-based replicated log as its storage backend (--registry=replicated_log is the only storage backend supported). Each master participates in the ensemble as a log replica. The --quorum flag must be set to the size of a majority of the masters.

The following table shows the tolerance to master failures, for each quorum size:

| Masters | Quorum Size | Failure Tolerance |
|---------|-------------|-------------------|
| 1       | 1           | 0                 |
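
The relationship between the columns can be sketched in a few lines, assuming "majority" means strictly more than half of the masters. The function names below are illustrative and are not part of Mesos; the sketch only reproduces the arithmetic behind the table.

```python
def quorum_size(masters):
    """Smallest number of masters that is strictly more than half."""
    if masters < 1:
        raise ValueError("at least one master is required")
    return masters // 2 + 1


def failure_tolerance(masters):
    """Masters that can fail while a quorum can still be formed."""
    return masters - quorum_size(masters)


if __name__ == "__main__":
    for masters in range(1, 6):
        print(masters, quorum_size(masters), failure_tolerance(masters))
```

For one master this gives a quorum size of 1 and a failure tolerance of 0, matching the row above. Note that under this arithmetic an even number of masters tolerates no more failures than the odd number just below it (4 masters and 3 masters both tolerate 1 failure).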
bmahler / gist:d9c5ab9ab30124ffa8d9
Last active August 29, 2015 14:18
Contributor's Guide

If you are making your first contributions, please review the instructions for making a contribution.

This document is an attempt to capture a shared set of values, practices, and learnings. Even though a lot of this may seem obvious, there is value in establishing a more formal reference: to arrive at an agreed-upon set of values, to help new contributors ramp up in the project, to foster discussion, etc.

Engineering Principles and Practices

Many companies rely on Mesos as a foundational layer of their software infrastructure, so it is imperative that we ship high-quality, robust code. We aim to foster a culture where we can trust and rely upon the work of the community.

bmahler / ev.c
Created October 9, 2015 18:51
Sleeps injected into ev.c
/*
* libev event processing core, watcher management
*
* Copyright (c) 2007,2008,2009,2010,2011,2012,2013 Marc Alexander Lehmann <[email protected]>
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without modifica-
* tion, are permitted provided that the following conditions are met:
*
* 1. Redistributions of source code must retain the above copyright notice,