Ken Sipe kensipe

Intro

This screencast demonstrates how to set up a Mesosphere cluster for analytics. We will show how to provision Mesosphere on Google Compute Engine (GCE) and install Mesos-DNS, HDFS, and Spark.

We will start by setting up Mesosphere on GCE, pointing our browser at google.mesosphere.com.

GCE setup

set up through the wizard
download and run OpenVPN
see the Mesos UI

Mesos-DNS

Scripts for setting up

sudo mkdir /etc/mesos-dns
sudo vi /etc/mesos-dns/config.json

config.json
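
A minimal sketch of what config.json might contain, assuming a single master at the placeholder address 10.0.0.1 with ZooKeeper on port 2181. The field names follow the Mesos-DNS configuration format; every value (addresses, resolver, domain, email) and the CONFIG_DIR variable are illustrative assumptions, not values taken from this cluster.

```shell
# Sketch only: writes an example Mesos-DNS config. All values are placeholders.
# CONFIG_DIR is a hypothetical variable; for a real install point it at
# /etc/mesos-dns and run with sufficient privileges.
CONFIG_DIR="${CONFIG_DIR:-./mesos-dns}"
mkdir -p "$CONFIG_DIR"

cat > "$CONFIG_DIR/config.json" <<'EOF'
{
  "zk": "zk://10.0.0.1:2181/mesos",
  "masters": ["10.0.0.1:5050"],
  "refreshSeconds": 60,
  "ttl": 60,
  "domain": "mesos",
  "port": 53,
  "resolvers": ["8.8.8.8"],
  "timeout": 5,
  "email": "root.mesos-dns.mesos"
}
EOF
```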

SSH steps

go to the AWS console and filter by the name of your cluster
find your master (it will be the one with a public IP and a security group whose name includes MasterSecurityGroup)
get its public DNS
add the following to ~/.ssh/config

	# replace the Host value with your master's public DNS
	Host ec2-52-25-163-225.us-west-2.compute.amazonaws.com
	        Compression yes
	        ForwardAgent yes

sudo apt-key adv --keyserver keyserver.ubuntu.com --recv E56151BF
DISTRO=$(lsb_release -is | tr '[:upper:]' '[:lower:]')
CODENAME=$(lsb_release -cs)

Add the repository

echo "deb http://repos.mesosphere.io/${DISTRO} ${CODENAME} main" | sudo tee /etc/apt/sources.list.d/mesosphere.list
sudo apt-get -y update

Unreachable Strategy

For Marathon to provide partition-aware unreachable strategy support, two high-level events must occur: 1) Mesos needs to communicate that a task is unreachable and 2) Marathon must respond to that event if it remains unresolved within a specified amount of time. Each of these events has configuration options and DC/OS system defaults, which are worth reviewing in order to fully understand how and when an unreachable task will be managed by Marathon.
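
On the Marathon side, the response window is set per app through the unreachableStrategy block of the app definition. The sketch below writes an example definition to a local file; the app id, command, resource sizes, file name, and both timing values are illustrative assumptions rather than recommended or default settings.

```shell
# Sketch only: an example Marathon app definition with an unreachableStrategy.
# inactiveAfterSeconds: how long an instance may be unreachable before Marathon
#   marks it inactive and launches a replacement.
# expungeAfterSeconds: how long before Marathon expunges the unreachable
#   instance; it must not be smaller than inactiveAfterSeconds.
cat > unreachable-demo.json <<'EOF'
{
  "id": "/unreachable-demo",
  "cmd": "sleep 3600",
  "cpus": 0.1,
  "mem": 32,
  "instances": 1,
  "unreachableStrategy": {
    "inactiveAfterSeconds": 300,
    "expungeAfterSeconds": 600
  }
}
EOF
```

A file like this could then be submitted with a POST to Marathon's /v2/apps endpoint.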

Apache Mesos Unreachable Strategies

Apache Mesos's ability to report that a task or node is unreachable is controlled by two mechanisms: 1) the mesos-agent health check and 2) the node rate limiter. For agent health checks, the relevant mesos-master flags are --max_agent_ping_timeouts and --agent_ping_timeout. The Mesos defaults are 5 and 15s respectively, which yields a notification event after 75 seconds by default (assuming the loss of 1 agent). The default for DC/OS for [max_slave_ping_timeouts is 20](https://github.com/dcos/dcos/blob/9
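
The 75-second figure is simply the product of the two flags. A minimal sketch of that arithmetic follows; unreachable_window is a hypothetical helper name, its first argument is the number of allowed ping timeouts, and its second is the ping timeout in seconds.

```shell
# Hypothetical helper: seconds before the master reports an agent unreachable,
# computed as max_agent_ping_timeouts * agent_ping_timeout (in seconds).
unreachable_window() {
    max_timeouts=$1
    ping_timeout_secs=$2
    echo $(( max_timeouts * ping_timeout_secs ))
}

# Mesos defaults from the text above: 5 timeouts of 15 seconds each.
unreachable_window 5 15    # prints 75
```

To estimate the window on another cluster, pass that cluster's own values for both flags.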

	setup: You have a multi-purpose cluster environment used for end-user web traffic and in-house analytics. In this example we have 4 running Docker instances of nginx hosting our web application, fronted by HAProxy, and 2 small and 2 medium instances of YARN running on MapR Hadoop with MapR-FS.

	note: most organizations underutilize their datacenter resources by separating these two concerns. In this demonstration we are co-locating these separate needs.
	<setup scripts>
	1. look at master port 80 (web app)
	- technical dive: look at /etc/haproxy/haproxy.cfg on master

	2. let's run a TeraSort job
	<launch TeraSort job>

	sudo /etc/init.d/mapr-warden restart

	sudo maprcli node list -columns ip
	maprcli volume list -columns n,p

	hadoop fs -ls /var/mapr/local
	hadoop fs -stat /var/mapr/local
	hadoop fs -stat /var/mapr/local/demo-mapr-slave1.c.inbound-bee-664.internal
	hadoop fs -lsr /var/

	{
	"version":"3.0.0",
	"gauges":{
	"api.mesosphere.marathon.core.event.impl.stream.HttpEventStreamActorMetrics.number-of-streams":{
	"value":0
	},
	"jvm.buffers.direct.capacity":{
	"value":856750
	},
	"jvm.buffers.direct.count":{

	#!/bin/bash
	# expects git and aws with prod creds
	# expects to be in the marathon dir or have MARATHON_PROJECT_DIR set

	if [ -z "$MARATHON_PROJECT_DIR" ]; then
	    echo "MARATHON_PROJECT_DIR NOT set... using current directory"
	else
	    # quote the path and stop if the directory cannot be entered
	    pushd "$MARATHON_PROJECT_DIR" || exit 1
	fi