@alexamies
Last active October 5, 2018 22:29
Mesh of HTTP Probes

Monitoring with a grid of HTTP probes

This document describes how to monitor access to a collection of URLs from a
fleet of HTTP probes deployed in multiple geographic locations. The latency of each phase of a request is recorded: DNS lookup, connection establishment, TLS handshake, and data transfer. For background on the HTTP probe, see the post [https://medium.com/google-cloud/identification-of-sources-of-mobile-client-connection-failure-fec9dad8dd13 Identification of sources of mobile client connection failure]. A Kubernetes cluster is deployed to each of several zones, and each cluster runs multiple probes. Each probe instance sends requests to a single URL.
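
The probe (httpprobe.py) takes its timings from curl, whose timers are cumulative: each one is measured from the start of the request. The following is a minimal sketch of how per-phase latencies could be derived from those values. The function name `phase_durations` and the output keys are hypothetical and not part of the probe, which reports the cumulative values unchanged. The sketch also assumes that curl reports `time_appconnect` as 0 when no TLS handshake takes place.

```python
# Hypothetical helper, not part of the gist: converts curl's cumulative
# timers (seconds since the start of the request) into per-phase durations.


def phase_durations(timings):
    """Return the time spent in each phase of the request, in seconds."""
    dns = timings["time_namelookup"]
    tcp = timings["time_connect"] - timings["time_namelookup"]
    # Assumption: time_appconnect is 0 for plain HTTP (no TLS handshake)
    appconnect = timings["time_appconnect"]
    tls = appconnect - timings["time_connect"] if appconnect > 0 else 0.0
    # The request is sent once the last connection-level step has finished
    connected = max(appconnect, timings["time_connect"])
    wait = timings["time_starttransfer"] - connected
    transfer = timings["time_total"] - timings["time_starttransfer"]
    return {
        "dns_lookup": dns,
        "tcp_connect": tcp,
        "tls_handshake": tls,
        "time_to_first_byte": wait,
        "data_transfer": transfer,
    }
```

By construction the phase durations add up to `time_total`, which gives a quick sanity check on the numbers.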

Enable the Stackdriver and Kubernetes Engine APIs in the Cloud Console. Edit the included config.yaml file to set your project ID, the target URLs, and the zones to deploy the probes to.
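
Before running anything against the project, it may help to check the edited configuration. The following is a minimal sketch, assuming the file has already been parsed into a dictionary with the keys that define_metrics.py reads (`projectid`, `targeturls` with `url` and `label`, and `zones`). The function `validate_config` is hypothetical and not part of the gist.

```python
# Hypothetical helper, not part of the gist: checks a parsed config.yaml
# mapping for the keys that define_metrics.py expects.


def validate_config(config):
    """Return a list of human-readable problems; empty if none were found."""
    problems = []
    if not config.get("projectid"):
        problems.append("projectid is missing")
    targets = config.get("targeturls") or []
    if not targets:
        problems.append("targeturls is empty")
    for i, target in enumerate(targets):
        # Each target needs both a URL and a label
        for key in ("url", "label"):
            if not target.get(key):
                problems.append("targeturls[{0}] has no {1}".format(i, key))
    if not config.get("zones"):
        problems.append("zones is empty")
    return problems
```

An empty list means that the three top-level keys are present and that every target has both a URL and a label.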

Create a service account in the console and download its JSON key file into the directory that you run these commands from. Assign the service account the role Project > Owner when you create it.

export GOOGLE_APPLICATION_CREDENTIALS=???.json

Define the Stackdriver custom metrics:

docker build -f Dockerfile-define-metrics -t define_metrics .
docker run -it \
  --env GOOGLE_APPLICATION_CREDENTIALS=$GOOGLE_APPLICATION_CREDENTIALS \
  define_metrics
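
To see which metric types this step creates, the naming scheme used in define_metrics.py (`custom.googleapis.com/<prefix>_<zone>_<label>`) can be reproduced without calling the API. This is a small sketch under that assumption; the function `metric_types` is hypothetical, and the zone and label values in the usage note are placeholders.

```python
# Hypothetical helper, not part of the gist: lists the custom metric types
# that define_metrics.py would create for the given zones and labels.

PREFIXES = ["namelookup", "time_connect", "tls_handshake", "starttransfer",
            "time_total", "responsecode"]


def metric_types(zones, labels):
    """Return one metric type per (zone, label, prefix) combination."""
    return ["custom.googleapis.com/{0}_{1}_{2}".format(prefix, zone, label)
            for zone in zones
            for label in labels
            for prefix in PREFIXES]
```

For example, `metric_types(["us-east1-b"], ["home"])` returns six metric types, one per prefix, so four zones and two target URLs yield 48 descriptors.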

Build and deploy the probe in a Docker container:

docker build -f Dockerfile-httpprobe -t httpprobe .
PROJECT_ID=[Your project]
ZONE=us-east1-b
TARGET_URL=[Your URL]
TARGET_URL_LABEL=[Your label]
docker run -it  --env PROJECT_ID=$PROJECT_ID \
  --env GOOGLE_APPLICATION_CREDENTIALS=$GOOGLE_APPLICATION_CREDENTIALS \
  --env ZONE=$ZONE \
  --env TARGET_URL=$TARGET_URL \
  --env TARGET_URL_LABEL=$TARGET_URL_LABEL \
  httpprobe
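
The probe reads its settings from the environment variables passed to `docker run`. The following sketch shows one way to detect a missing variable before the container starts polling. The function `missing_settings` is hypothetical and not part of httpprobe.py, which calls `os.getenv` directly without any checks.

```python
# Hypothetical helper, not part of the gist: reports which of the
# environment variables used by httpprobe.py are unset or empty.
import os

REQUIRED = ("PROJECT_ID", "ZONE", "TARGET_URL", "TARGET_URL_LABEL")


def missing_settings(environ=None):
    """Return the names of required variables that are unset or empty."""
    environ = os.environ if environ is None else environ
    return [name for name in REQUIRED if not environ.get(name)]
```

Calling `missing_settings()` with no argument inspects the real environment; passing a dictionary makes the check easy to test.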

Upload the image to the Google Container Registry:

gcloud auth configure-docker
TAG=v1
docker tag httpprobe gcr.io/$PROJECT_ID/httpprobe:$TAG
docker push gcr.io/$PROJECT_ID/httpprobe:$TAG

Start up a GKE cluster in each zone:

gcloud components install kubectl
gcloud config set project $PROJECT_ID
gcloud config set compute/zone $ZONE
CLUSTER_NAME=probes-$ZONE
gcloud container clusters create $CLUSTER_NAME --zone $ZONE --num-nodes 1
gcloud container clusters get-credentials $CLUSTER_NAME --zone $ZONE
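
Because the commands above must be repeated for every zone, they can be generated from the zone list. The following is a minimal sketch that only builds the argument lists, using the same `gcloud` subcommands and flags as above; it does not run them. The function `cluster_commands` is hypothetical and not part of the gist.

```python
# Hypothetical helper, not part of the gist: builds the gcloud argument
# lists shown above for each zone, without executing them.


def cluster_commands(zones, num_nodes=1):
    """Return a list of argv lists, two per zone (create, get-credentials)."""
    commands = []
    for zone in zones:
        cluster = "probes-{0}".format(zone)
        commands.append(["gcloud", "container", "clusters", "create", cluster,
                         "--zone", zone, "--num-nodes", str(num_nodes)])
        commands.append(["gcloud", "container", "clusters", "get-credentials",
                         cluster, "--zone", zone])
    return commands
```

Each returned list can be passed to `subprocess.check_call` once you are satisfied with the output.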

Deploy a probe for each URL to each cluster:

TARGET_URL=[Your URL]
TARGET_URL_LABEL=[Your label]
HTTP_PROBE_WORKLOAD=httpprobe-$ZONE-$TARGET_URL_LABEL
kubectl run $HTTP_PROBE_WORKLOAD --image gcr.io/$PROJECT_ID/httpprobe:$TAG \
  --env="PROJECT_ID=$PROJECT_ID" \
  --env GOOGLE_APPLICATION_CREDENTIALS=$GOOGLE_APPLICATION_CREDENTIALS \
  --env ZONE=$ZONE \
  --env TARGET_URL=$TARGET_URL \
  --env TARGET_URL_LABEL=$TARGET_URL_LABEL \
  --limits="cpu=50m,memory=128Mi"
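
The full set of workloads is the cross product of zones and target URLs. The following sketch enumerates them from a parsed configuration, using the workload naming shown above (`httpprobe-$ZONE-$TARGET_URL_LABEL`). The function `probe_workloads` is hypothetical, and the sketch assumes the same dictionary layout that define_metrics.py reads from config.yaml.

```python
# Hypothetical helper, not part of the gist: lists one workload per
# (zone, target URL) pair with the environment each probe needs.
from itertools import product


def probe_workloads(config):
    """Return a list of dicts with the workload name and its environment."""
    workloads = []
    for zone, target in product(config["zones"], config["targeturls"]):
        workloads.append({
            "name": "httpprobe-{0}-{1}".format(zone, target["label"]),
            "env": {
                "PROJECT_ID": config["projectid"],
                "ZONE": zone,
                "TARGET_URL": target["url"],
                "TARGET_URL_LABEL": target["label"],
            },
        })
    return workloads
```

With two zones and two target URLs this yields four workloads, each of which corresponds to one `kubectl run` invocation in the cluster for its zone.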

License

Apache 2.0 Copyright 2018 Google. All rights reserved.

# config.yaml
projectid: [Your project]
targeturls: # URLs for the probes to send HTTP requests to
  - targeturl:
    url: [First URL]
    label: [First label] # Human readable label
  - targeturl:
    url: [Second URL]
    label: [Second label]
zones: # Zones that the probes will be deployed to
  - us-east1-b
  - us-central1-a
  - us-west1-c
  - europe-west1-d

# define_metrics.py
# Copyright 2018 Google. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# Program to define Stackdriver metric descriptors for a fleet of HTTP probes
import yaml
from google.cloud import monitoring_v3

CONFIG_FILE = "config.yaml"


# Define the Stackdriver custom metric descriptors
def create_descriptors(config):
    projectid = config["projectid"]
    targeturls = config["targeturls"]
    zones = config["zones"]
    client = monitoring_v3.MetricServiceClient()
    name = client.project_path(projectid)
    # One set of descriptors per (zone, target URL) pair
    for zone in zones:
        for targeturl in targeturls:
            create_descriptor(client, name, projectid, zone, targeturl, "namelookup",
                              "DNS lookup time", "DOUBLE")
            create_descriptor(client, name, projectid, zone, targeturl, "time_connect",
                              "Connection time", "DOUBLE")
            create_descriptor(client, name, projectid, zone, targeturl, "tls_handshake",
                              "Time for TLS handshake", "DOUBLE")
            create_descriptor(client, name, projectid, zone, targeturl, "starttransfer",
                              "Time until the first byte was about to be transferred",
                              "DOUBLE")
            create_descriptor(client, name, projectid, zone, targeturl, "time_total",
                              "Total time for HTTP request", "DOUBLE")
            create_descriptor(client, name, projectid, zone, targeturl, "responsecode",
                              "HTTP response code", "INT64")


# Define a custom metric of kind GAUGE
def create_descriptor(client, name, projectid, zone, targeturl, prefix,
                      description, value_type):
    url = targeturl["url"]
    label = targeturl["label"]
    metric_shortname = "{0}_{1}_{2}".format(prefix, zone, label)
    metric_name = "projects/{0}/metricDescriptors/custom.googleapis.com%2F{1}".format(
        projectid, metric_shortname)
    metric_type = "custom.googleapis.com/{}".format(metric_shortname)
    metric_description = "{0} for {1} target {2}".format(description, zone, label)
    metric_descriptor = {
        "name": metric_name,
        "type": metric_type,
        "description": metric_description,
        "metric_kind": "GAUGE",
        "value_type": value_type}
    response = client.create_metric_descriptor(name, metric_descriptor)
    # print("create_descriptor {0}, {1}, {2}".format(zone, label, response))


# Read the YAML configuration file
def read_config():
    print("Reading config file\n")
    with open(CONFIG_FILE, 'r') as f:
        config = {}
        try:
            # safe_load parses plain data only and does not build arbitrary objects
            config = yaml.safe_load(f)
        except yaml.YAMLError as e:
            print(e)
    return config


def main():
    print("Setting up probe custom metrics\n")
    config = read_config()
    create_descriptors(config)


if __name__ == '__main__':
    main()
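
# Dockerfile-define-metrics
FROM google/cloud-sdk:latest
ADD define_metrics.py /
ADD *.json /
ADD config.yaml /
RUN pip install pyyaml
RUN pip install --upgrade google-cloud-monitoring
# The client library reads the key file path from the
# GOOGLE_APPLICATION_CREDENTIALS environment variable, and the exec form of
# CMD does not expand variables, so no command-line arguments are passed
CMD [ "python", "./define_metrics.py" ]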

# Dockerfile-httpprobe
FROM google/cloud-sdk:latest
ADD httpprobe.py /
ADD *.json /
RUN pip install --upgrade google-cloud-monitoring
# The probe reads PROJECT_ID, ZONE, TARGET_URL, and TARGET_URL_LABEL from the
# environment, and the exec form of CMD does not expand variables, so no
# command-line arguments are passed
CMD [ "python", "./httpprobe.py" ]

# httpprobe.py
# Copyright 2018 Google. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# Program to measure and report times for DNS lookup, connection establishment,
# TLS handshake, ready to start transfer, and completion of response.
# See https://github.com/reorx/httpstat for a more robust curl wrapper
# See https://cloud.google.com/monitoring/docs/reference/libraries#client-libraries-usage-python
# for monitoring background
# Before running, install the Stackdriver monitoring libraries with
#   pip install --upgrade google-cloud-monitoring
import json
import os
import subprocess
import time

from google.cloud import monitoring_v3

CONFIG_FILE = "config.yaml"


# Run curl against the URL and return its timings and response code as a dict
def httpstat(url):
    # curl --write-out template that prints the results as a JSON object
    write_format = """{\n\"time_namelookup\": %{time_namelookup},
\"response_code\": %{response_code},
\"time_appconnect\": %{time_appconnect},
\"time_connect\": %{time_connect},
\"time_starttransfer\": %{time_starttransfer},
\"time_total\": %{time_total}
}"""
    # Discard the response body; -s -S hides progress but still shows errors
    cmd = ["curl", "-w", write_format, "-o", "/dev/null", url, "-s", "-S"]
    p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    out, err = p.communicate()
    if p.returncode != 0:
        if err:
            print("Error calling command: {}".format(err))
        else:
            print("Error calling command")
        return {}
    try:
        result = json.loads(out)
        return result
    except ValueError as e:
        print("Error parsing json: {}".format(e))
        print(out)
        return {}


# Send an HTTP request to the target URL every minute
def poll():
    client = monitoring_v3.MetricServiceClient()
    project_id = os.getenv("PROJECT_ID")
    zone = os.getenv("ZONE")
    url = os.getenv("TARGET_URL")
    label = os.getenv("TARGET_URL_LABEL")
    project_name = client.project_path(project_id)
    while True:
        data = httpstat(url)
        if len(data) > 0:
            write_metrics(data, client, project_name, zone, label)
        else:
            print("No data to report")
        time.sleep(60)


# Write the metric data to Stackdriver
def write_metrics(data, client, project_name, zone, label):
    now = time.time()
    prefixes = ["namelookup", "time_connect", "tls_handshake", "starttransfer",
                "time_total"]
    keys = ["time_namelookup", "time_connect", "time_appconnect",
            "time_starttransfer", "time_total"]
    for prefix, key in zip(prefixes, keys):
        metric_shortname = "{0}_{1}_{2}".format(prefix, zone, label)
        metric_type = "custom.googleapis.com/{}".format(metric_shortname)
        write_metric_double(metric_type, data[key], now, client, project_name)
    # Use the same zone and label suffix as the responsecode descriptor
    # created by define_metrics.py
    responsecode_type = "custom.googleapis.com/responsecode_{0}_{1}".format(
        zone, label)
    write_metric_int(responsecode_type, data['response_code'], now, client,
                     project_name)
    print("Successfully wrote time series")


# Write a double type time series
def write_metric_double(metric_type, value, now, client, project_name):
    series = monitoring_v3.types.TimeSeries()
    series.metric.type = metric_type
    series.resource.type = "global"
    point = series.points.add()
    point.value.double_value = value
    point.interval.end_time.seconds = int(now)
    point.interval.end_time.nanos = int(
        (now - point.interval.end_time.seconds) * 10**9)
    client.create_time_series(project_name, [series])


# Write an integer type time series
def write_metric_int(metric_type, value, now, client, project_name):
    series = monitoring_v3.types.TimeSeries()
    series.metric.type = metric_type
    series.resource.type = "global"
    point = series.points.add()
    point.value.int64_value = value
    point.interval.end_time.seconds = int(now)
    point.interval.end_time.nanos = int(
        (now - point.interval.end_time.seconds) * 10**9)
    client.create_time_series(project_name, [series])


def main():
    poll()


if __name__ == '__main__':
    main()