@sam-github
Last active February 25, 2016 04:15

Build for Scale

Who am I:

???

  • Describe what I did at StrongLoop

Who are you?

  • Wrote a Node.js app?
  • In production?
  • ... on prem?
  • ... in a cloud?

What is "scale"?

linear increase in capacity with increase in resources

  • What is capacity?
  • What is resources?

What are the 12-factors?

I. Codebase
II. Dependencies
III. Config
IV. Backing Services
V. Build, release, run
VI. Processes
VII. Port binding
VIII. Concurrency
IX. Disposability
X. Dev/prod parity
XI. Logs
XII. Admin processes

???

Describe where from, and why related to scale

I will use as framework to discuss

Packaging Concerns

I. Codebase: One codebase tracked in revision control, many deploys
II. Dependencies: Explicitly declare and isolate dependencies
V. Build, release, run: Strictly separate build and run stages

Configuration Concerns

III. Config: Store config in the environment
IV. Backing Services: Treat backing services as attached resources
VII. Port binding: Export services via port binding
X. Dev/prod parity: Keep development, staging, and production as similar as possible
XI. Logs: Treat logs as event streams

Design Concerns

VI. Processes: Execute the app as one or more stateless processes
VIII. Concurrency: Scale out via the process model
IX. Disposability: Maximize robustness with fast startup and graceful shutdown
XII. Admin processes: Run admin/management tasks as one-off processes

Packaging Concerns

I. Codebase: One codebase tracked in revision control, many deploys

What are people using for version control?

  • git
  • mercurial
  • perforce
  • subversion
  • CVS/RCS/SCCS?
  • ...?

Note: We can git push to Heroku and OpenShift, but not to Bluemix

II. Dependencies: Explicitly declare and isolate dependencies

Node.js/npm makes it difficult to not do this:

  • node will not require "globally" installed packages
  • npm installs its dependencies into the current package

???

  • Elaborate on the no-global-require, many people do not realize

Dependencies, strict vs loose

  • Use loose dependencies: XXX node-semver
   "express": "^4.3"

???

Make the case for loose...
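To make the range concrete, here is a simplified sketch of what a caret range such as "^4.3" accepts. It is not node-semver's implementation: the function names are made up, and it only handles plain x.y.z versions with a major version of 1 or higher (0.x versions follow stricter rules).

```javascript
// Simplified sketch of what "^4.3" allows. This is NOT node-semver's
// implementation: it only handles plain x.y.z versions with major >= 1.
function parseVersion(v) {
  const [major = 0, minor = 0, patch = 0] = v.split('.').map(Number);
  return { major, minor, patch };
}

function satisfiesCaret(version, range) {
  const want = parseVersion(range.replace(/^\^/, ''));
  const have = parseVersion(version);
  if (have.major !== want.major) return false; // a major bump is breaking
  if (have.minor !== want.minor) return have.minor > want.minor;
  return have.patch >= want.patch;
}

console.log(satisfiesCaret('4.3.0', '^4.3'));  // true
console.log(satisfiesCaret('4.13.4', '^4.3')); // true
console.log(satisfiesCaret('4.2.9', '^4.3'));  // false
console.log(satisfiesCaret('5.0.0', '^4.3'));  // false
```

Compatible fixes and features arrive without editing package.json; a breaking major release does not.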

... mesh-models has 250 unique modules as deps, sometimes multiple versions

Dependencies, fix them late

???

  • discuss shrinkwrap
  • discuss git commit
  • discuss npm packfiles

Dependencies, how to deal with in CI

  • Use npm packages for independent code, and try to build independent code
  • Auto-publish your masters to a staging npm registry
  • Resolve packages from the staging registry
  • Use semver during development

???

CI and Node... talk about why it's related

rmg:

For our dependent package builds, for a PR I basically fork off a new universe where that PR is published as the latest version of that package, and then see how the packages that depend on it work in that universe.

I created a docker image specifically for this usage of sinopia: https://hub.docker.com/r/strongloop/ephemeral-npm/

V. Build, release, run: Strictly separate build and run stages

The release stage takes the build produced by the build stage and combines it with the deploy’s current config. The resulting release contains both the build and the config and is ready for immediate execution in the execution environment.

Can be hard to do. At the least, pre-compile your assets at build time, not deploy time:

  • npm install
  • native addon build (unless it's a cloud)
  • template and css pre-compilation

all belong at build, not deploy.

???

Should strong-supervisor be injected? Even for BlueMix, this seems to be a good idea, restarts would be seconds, not 10s of seconds (or more) for a full container restart.

Configuration Concerns

III. Config: Store config in the environment

  • Forces separation of code and config
  • All orchestrators support environment injection
  • Pay attention to orthogonality
  • You can still group ENV vars in files, or point to files (works well with puppet or chef)
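A minimal sketch of the idea, assuming illustrative variable names (PORT, LOG_LEVEL, DB_URL) and made-up defaults: read everything from the environment in one place, fall back to dev-friendly values, and freeze the result so it stays immutable after startup.

```javascript
// Sketch: read all settings from the environment in one place.
// The variable names (PORT, LOG_LEVEL, DB_URL) and defaults are illustrative.
function loadConfig(env) {
  const config = {
    port: Number(env.PORT || 3000),
    logLevel: env.LOG_LEVEL || 'debug',
    dbUrl: env.DB_URL || 'mysql://localhost:3306/dev',
  };
  // Freeze so nothing mutates the config after startup.
  return Object.freeze(config);
}

const config = loadConfig(process.env);
console.log('listening port will be', config.port);
```

Each setting is its own variable, so they stay orthogonal: changing the log level never implies a different database.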

12-factor anti-pattern, do it anyway:

    NODE_ENV=production

???

discuss problems with grouping sets of behaviours into a single variable

defaults should be "dev friendly" - external for this example, more commonly localhost

immutability

Config: TLS

Hard to pass X.509 certs or PFX files...

  1. terminate outside of Node.js (see later)
  2. use an environment variable to specify location of crypto materials
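Option 2 might look like the sketch below. The variable names TLS_KEY_FILE and TLS_CERT_FILE are hypothetical, as is the `tlsOptions` helper; only the file locations travel through the environment, never the key material itself.

```javascript
const fs = require('fs');

// Sketch: the environment carries only the *location* of the key material.
// TLS_KEY_FILE and TLS_CERT_FILE are hypothetical variable names.
function tlsOptions(env, readFile = fs.readFileSync) {
  if (!env.TLS_KEY_FILE || !env.TLS_CERT_FILE) {
    return null; // TLS not configured here; terminate it elsewhere
  }
  return {
    key: readFile(env.TLS_KEY_FILE),
    cert: readFile(env.TLS_CERT_FILE),
  };
}

const options = tlsOptions(process.env);
console.log(options ? 'TLS configured' : 'TLS not configured');
// When options is set: require('https').createServer(options, handler)
```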

Config: Using etcd or redis to fetch configuration

???

Talk about why:

  • consistency, externality, maybe right for you, maybe not, arguably violates 12-factor... that is OK

IV. Backing Services: Treat backing services as attached resources

  • Try to use URL-like variables, instead of env var groups:

      MYSQL=mysql://user:pass@host:port/db
    

instead of

   MYSQL_USER=user
   MYSQL_PASS=pass
   ... etc.

Example: LoopBack

Modify a standard LoopBack application to be configured via environment.

Static server/datasources.json:

exports = module.exports = {
  mysqlDs: {
    host: "demo.strongloop.com",
    port: 3306,
    database: "getting_started_intermediate",
    username: "demo",
    password: "L00pBack",
    name: "mysqlDs",
    connector: "mysql"
  }
}

Example: LoopBack

Dynamic server/datasources.local.js using URL:

exports = module.exports = {
  // Not working consistently (yet?) for all loopback connectors:
  mysqlDs: {
    url: process.env.DB_MYSQLDS ||
      'mysql://demo:L00pBack@demo.strongloop.com:3306/getting_started_intermediate',
    name: 'mysqlDs',
  }
};

Example: LoopBack

Dynamic server/datasources.local.js using URL:

exports = module.exports = {};

// Default used when DB_MYSQLDS is not set in the environment
set(exports, 'mysqlDs',
    'mysql://demo:L00pBack@demo.strongloop.com:3306/getting_started_intermediate'
   );

Example: LoopBack

Dynamic server/datasources.local.js using URL:

function set(config, dsname, def) {
  var env = process.env['DB_' + dsname.toUpperCase()];
  config[dsname] = parse(env || def);
  config[dsname].name = dsname;
  return config;
}

Example: LoopBack

Dynamic server/datasources.local.js using URL:

function parse(uri) {
  var url = require('url');

  var parsed = url.parse(uri);
  var auth = parsed.auth ? parsed.auth.split(':') : [];
  return {
    host: parsed.hostname,
    port: parsed.port,
    database: parsed.pathname ? parsed.pathname.replace(/^\//,'') : undefined,
    username: auth[0],
    password: auth[1],
    connector: parsed.protocol.replace(/:$/, ''),
  };
}
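In current Node.js releases url.parse() is documented as a legacy API. A sketch of the same parse() written against the global URL class follows; `parseWithURL` is a hypothetical name and the sample URL is made up. As with url.parse(), the port comes back as a string.

```javascript
// Sketch: same result shape as parse() above, using the global URL class.
// `parseWithURL` is a hypothetical name; the sample URL is made up.
function parseWithURL(uri) {
  const u = new URL(uri);
  return {
    host: u.hostname,
    port: u.port,
    database: u.pathname ? u.pathname.replace(/^\//, '') : undefined,
    // URL keeps credentials percent-encoded, so decode them.
    username: u.username ? decodeURIComponent(u.username) : undefined,
    password: u.password ? decodeURIComponent(u.password) : undefined,
    connector: u.protocol.replace(/:$/, ''),
  };
}

console.log(parseWithURL('mysql://user:pass@localhost:3306/mydb'));
```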

VII. Port binding: Export services via port binding

Natural for Node.js
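A minimal sketch of why it is natural: the app carries its own HTTP server and binds to whatever port the environment hands it. The PORT variable name and the 3000 default are assumptions; some platforms use a different variable.

```javascript
const http = require('http');

// Sketch: the app is self-contained and exports HTTP by binding to a port
// taken from the environment. PORT and the 3000 default are assumptions.
function resolvePort(env) {
  return Number(env.PORT || 3000);
}

function start(env = process.env) {
  const server = http.createServer((req, res) => {
    res.end('hello\n');
  });
  server.listen(resolvePort(env), () => {
    console.log('listening on port %d', server.address().port);
  });
  return server;
}

// Call start() from the application's entry point.
```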

X. Dev/prod parity: Keep development, staging, and production as similar as possible

Run linux on your laptop :-).

At least, use the same database types.

XI. Logs: Treat logs as event streams

  • Logs go to stdout, consumed by system (or displayed to console)
  • Bluemix requires this
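As a sketch, assuming a made-up `logEvent` helper and field names: write one JSON object per line to stdout and let the platform decide where the stream goes. The app never opens or rotates a log file.

```javascript
// Sketch: one JSON object per line on stdout; the platform routes the stream.
// `logEvent` and its field names are illustrative.
function logEvent(level, msg, fields = {}, out = process.stdout) {
  const record = Object.assign(
    { time: new Date().toISOString(), level, msg },
    fields
  );
  out.write(JSON.stringify(record) + '\n');
}

logEvent('info', 'server started', { port: 3000 });
```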

Design Concerns

Why does Node.js do this well?

Event loop:

  • node to OS: please connect to x.com
  • node to OS: please connect to y.com
  • node to OS: please connect to z.com
  • node to OS: wake me when something happened ... blocks on OS, waiting for notifications, using no CPU...
  • OS to node: got a connection to y.com
  • node to OS: write this data to y.com
  • node to OS: wake me when something happened
  • OS to node: wrote that data
  • ... etc.
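The conversation above can be simulated in a few lines. In this sketch timers stand in for real network I/O, and the `connect` helper and delays are made up; the point is that all three requests are issued before any completion arrives, and completions are handled in whatever order they finish.

```javascript
// Sketch: three simulated "connections" are requested back to back; none
// blocks the others. Timers stand in for real network I/O here.
const order = [];

function connect(host, delayMs) {
  order.push('requested ' + host);
  setTimeout(() => {
    order.push('connected ' + host);
    console.log(order[order.length - 1]);
  }, delayMs);
}

connect('x.com', 30);
connect('y.com', 10);
connect('z.com', 20);
// All three requests are issued before any completion callback runs.
console.log(order.join(', '));
```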

Node.js does not do everything well

Use node for things it's good at: as a data switch, for services with lots of concurrency, and not a lot of CPU per request.

VI. Processes: Execute the app as one or more stateless processes

  • Stateless refers to HTTP requests, each one can be routed to any instance
  • Persistent state must be stored out-of-instance: SQL, NoSQL, ...

A pre-requisite for concurrency/horizontal scaling.

???

LoopBack models are stateless by default

websockets might share state by default, make sure that session state is offloaded in production

discuss granularity: talk about how to build purpose-built components, i.e. services that will scale similarly (they will have similar request times, for example)

VIII. Concurrency: Scale out via the process model

Node is good at this, as are clouds

Node cluster exists, but use production (auto-)scaling instead.

???

discuss node cluster problems:

  • hides behaviour from load balancer
  • hides scaling from auto-scaler
  • uneven balancing on v0.10, complex and unstable in 0.12+

IX. Disposability: Maximize robustness with fast startup and graceful shutdown

Fast means sub-second, not sub-minute.

Use strong-supervisor (or equivalent) even with BlueMix.
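The graceful-shutdown half can be sketched as follows, assuming the platform signals shutdown with SIGTERM. `installShutdown` is a hypothetical helper name, not something strong-supervisor provides.

```javascript
// Sketch: on SIGTERM stop accepting new connections, let in-flight requests
// finish, then exit. `installShutdown` is a hypothetical helper name.
function installShutdown(server, proc = process) {
  proc.once('SIGTERM', () => {
    // close() stops accepting connections; its callback runs once
    // existing connections have ended.
    server.close(() => proc.exit(0));
  });
}

// Usage, with an http.Server created elsewhere:
//   const server = require('http').createServer(handler).listen(port);
//   installShutdown(server);
```

Long-lived keep-alive connections can delay the close callback, so a production version would likely add a timeout before forcing the exit.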

XII. Admin processes: Run admin/management tasks as one-off processes

War Stories

Log/debug message formatting

  • Do not format what you will not use

log.debug('look at this: %s', JSON.stringify(someLargeObject));
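The line above serializes the object even when debug output is switched off. One way around that, sketched here with a made-up `makeLogger` helper and a hypothetical DEBUG_ENABLED variable, is to pass a function so the formatting only happens when the message will actually be written.

```javascript
// Sketch: skip the expensive formatting unless debug output is enabled.
// `makeLogger` and the DEBUG_ENABLED variable are illustrative names.
function makeLogger(enabled, out = console.error) {
  return {
    enabled,
    debug(fmt, makeArg) {
      if (!enabled) return; // nothing is formatted when disabled
      out(fmt, makeArg());
    },
  };
}

const log = makeLogger(process.env.DEBUG_ENABLED === '1');
const someLargeObject = { items: new Array(1000).fill('x') };

// JSON.stringify only runs if debug logging is on.
log.debug('look at this: %s', () => JSON.stringify(someLargeObject));
```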

Do not poll for state

  • Use data sources that can update you on change: evented everywhere, not just the loop!
  • Subscribe to things you are interested in only, if possible.
  • If the data source cannot do this... consider writing a micro service (in node), that can poll as necessary, batch, and implement a subscribe/notify protocol.
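The core of such a micro service could look like this sketch. `ChangeNotifier` and `fetchValue` are hypothetical names; a real service would poll from a single timer and push notifications over the network instead of an in-process EventEmitter.

```javascript
const { EventEmitter } = require('events');

// Sketch of the polling micro service's core: it polls on behalf of everyone
// and notifies subscribers only on change. `ChangeNotifier` and `fetchValue`
// are hypothetical names.
class ChangeNotifier extends EventEmitter {
  constructor(fetchValue) {
    super();
    this.fetchValue = fetchValue;
    this.last = undefined;
  }

  // Call from a single timer, e.g. setInterval(() => notifier.poll(), 1000).
  poll() {
    const value = this.fetchValue();
    if (value !== this.last) {
      this.last = value;
      this.emit('change', value);
    }
  }
}

let current = 'v1';
const notifier = new ChangeNotifier(() => current);
notifier.on('change', (v) => console.log('changed to', v));
notifier.poll(); // emits: first observation
notifier.poll(); // silent: nothing changed
current = 'v2';
notifier.poll(); // emits again
```

Only one component polls the slow source; every consumer stays evented.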

Be aware of the requirements for your data

Know the strengths of your tools. Similar looking things can have very different performance capabilities:

  • Example: mongo vs couch: do you want consistency, or partition tolerance?

Terminate your TLS outside of Node.js

  • Do it at the perimeter if you can. YMMV, but you can see a 10x performance increase in TLS if you do it outside of Node.js.

  • If you require TLS on the wire everywhere: Do it on the host with nginx, and proxy to Node.js on that host, bound to localhost.

  • Hardware acceleration is really useful, but only accelerates the handshake, so you need hardware for high concurrency of unique connections, but not for long-standing high rate connections. Benchmark.

Set your ulimit!

Default limits on descriptors are wrong for high concurrency servers... set them higher.

How high: test what your app can handle, and set them higher than that.

???

Describe what it does, and why.

<!DOCTYPE html>
<html>
<head>
<title>Title</title>
<meta charset="utf-8">
<style>
@import url(https://fonts.googleapis.com/css?family=Yanone+Kaffeesatz);
@import url(https://fonts.googleapis.com/css?family=Droid+Serif:400,700,400italic);
@import url(https://fonts.googleapis.com/css?family=Ubuntu+Mono:400,700,400italic);
body { font-family: 'Droid Serif'; }
h1, h2, h3 {
font-family: 'Yanone Kaffeesatz';
font-weight: normal;
}
.remark-code, .remark-inline-code { font-family: 'Ubuntu Mono'; }
@media print {
.remark-slide-number {
/* hide slide numbers on print/PDF, viewer has its own */
display: none;
}
}
@page {
size: 16cm 12cm;
}
</style>
</head>
<body>
<textarea id="source">
class: center, middle
# Survey of Node.js Process Managers
Who am I:
- email: [email protected]
- github: @sam-github
- twitter: @octetcloud
???
Ask what is background of audience?
---
Process Manager:
- Runs node applications, but does not manage the HTTP
- Tools to interact with the manager
???
Should I contrast java "app servers" and Node.js "process managers"?
---
"Process Managers", or the like:
- BlueMix: <https://console.ng.bluemix.net/>
- API Connect Collective: <XXX>
- strong-pm: <http://strong-pm.io>
- pm2: <https://github.com/Unitech/pm2>
- forever: <https://github.com/foreverjs/forever>
???
@kraman, @sai, are above names correct?
Perhaps I should rename all of the above...
---
Restart on failure
- BlueMix: Yes
- Collective: Yes
- strong-pm: Yes
- pm2: Yes
- forever: Yes
???
A restarter on Bluemix is still worthwhile, because it avoids the droplet being
restarted, which loses all disk state (but you shouldn't use disk anyhow)
---
Graceful/rolling restarts
- BlueMix: Use blue/green deployments
- Collective: XXX
- strong-pm: Yes
- pm2: Yes
- forever: No
???
[blue/green](https://docs.cloudfoundry.org/devguide/deploy-apps/blue-green.html)
- not so easy if you are using all your resources already
- capped by mem usage, across ALL instances
@kraman, @cvignola: collective, how are we going to do this? next release?
---
OS startup script support
- BlueMix: N/A
- Collective: No
- strong-pm: Yes
- pm2: Yes
- forever: No
???
XXX backlog startup script support
---
Dynamic/persistent env variable configuration
- BlueMix: cf env/set-env/unset-env
- Collective: apiconnect env:set/get/list
- strong-pm: slc ctl env-set/get/list
- pm2: As part of ecosystem configuration
- forever: No
???
XXX @kraman pm2: means what?
XXX @kraman apiconnect: what is actual syntax?
---
Log aggregation/rotation
- BlueMix: Yes, but needs a forwarder setup for persistence
- Collective: No
- strong-pm: Yes; log file and syslog
- pm2: Yes; multihost, with rotation. Log file only; no syslog
- forever: No
???
https://docs.cloudfoundry.org/devguide/services/log-management.html
XXX collective, backlog whatever we need to do
XXX pm2, what does this mean? tempted to just say Yes
---
Multiple app support
- BlueMix: Yes
- Collective: Yes
- strong-pm: Yes
- pm2: Yes
- forever: Yes
???
@kraman, you had caveats around these... I think I'll boil down to "yes"
---
Language Support
- BlueMix: Node.js and Java, Go, PHP, Python, Ruby
- Collective: Node.js and Java
- strong-pm: Node.js
- pm2: Node.js (some support for shell commands)
- forever: Any script
???
XXX @sai, @krishna, I don't think this is relevant, and many of the features in
here don't work or work differently depending on the language. Remove?
XXX bluemix: https://www.ng.bluemix.net/docs/cfapps/runtimes.html
---
# Lifecycle Tooling
---
Run app locally
- BlueMix: No
- Collective: No
- strong-pm: slc start
- pm2: pm2 start app.js --name foo
- forever: forever start app.js
---
Build and package repositories
- BlueMix: Yes
- Collective: Yes
- strong-pm: Yes, as npm packfile or into git
- pm2: No
- forever: No
???
- bluemix: uses manifest.yml, otherwise push ./ or a zip file, or a
specific path, see
https://docs.cloudfoundry.org/devguide/deploy-apps/manifest.html and a
.cfignore file
---
Run apps in a container
- BlueMix: Yes (XXX)
- Collective: No
- strong-pm: Yes (Docker)
- pm2: No
- forever: No
???
@kraman, @sai is this relevant? I think we should take it out, deploying docker
containers is more useful than running in a container. Or perhaps bluemix should
be said to do this - it does run in a container.
---
Remote deploy
- BlueMix: Yes
- Collective: Yes
- strong-pm: Yes
- pm2: Yes
- forever: No
---
Multiple deploys/revert
- BlueMix: Yes
- Collective: Yes
- strong-pm: Yes
- pm2: Yes
- forever: No
???
- XXX pm2: ... I assume, can't find in their docs... and what I do see makes
it look like pm2 requires code in git, so reverts by cloning older commits,
which other than being pull, is identical to the strong-pm "push" model of
revert. So, WTF?
- cf. IBM DevOps Services
- collective: in-progress :-(
---
# Clustering & Management
---
Manage remotely
- BlueMix: Yes
- Collective: TBD
- strong-pm: Yes
- pm2: XXX
- forever:
???
Doesn't this half-duplicate the "security" section, which is really remote
deployment?
pm2: does it have remote management?
---
Remote security
- BlueMix: HTTPS auth
- Collective: HTTPS auth
- strong-pm: HTTP auth and HTTP+SSH
- pm2: XXX
- forever: N/A
???
XXX pm2: means what?
---
Internal Clustering
- BlueMix: No
- Collective: No
- strong-pm: Yes, statically and dynamically via slc ctl set-size
- pm2: Yes, statically
- forever: No
???
- @kraman, real command name?
cf push has a --no-start option, useful with cf scale
maybe put up into first section?
---
Load balancer auto-configuration
- BlueMix: Yes
- Collective: Yes
- strong-pm: Yes
- pm2: No
- forever: No
---
Horizontal Scaling
- BlueMix: Yes, by config and cf scale
- Collective: Yes, apiconnect ...
- strong-pm: No
- pm2: No
- forever: No
???
@kraman, actual command?
---
Horizontal Auto-Scaling
- BlueMix: Yes
- Collective: WIP
- strong-pm: No
- pm2: No
- forever: No
???
@sai, how for bluemix?
collective: depends, it's a WIP
---
# Profiling
---
Profiling
- BlueMix: No
- Collective: No
- strong-pm: Yes, CPU and Heap
- pm2: No
- forever: No
---
Profile triggering
- BlueMix: No
- Collective: No
- strong-pm: Trigger profiling based on slow event loop
- pm2: No
- forever: No
---
Debugging
- BlueMix: Yes, in dev-mode can run node-inspector
- Collective: No
- strong-pm: No
- pm2: No
- forever: No
???
- <https://www.ng.bluemix.net/docs/manageapps/app_mng.html>
---
# Metrics
---
Metrics
- BlueMix: CPU, memory, error stacks: yes
- Collective: CPU, memory (WIP)
- strong-pm: CPU, memory, loop, databases, etc.
- pm2: CPU, memory
- forever: No
???
bluemix error stacks: confirm and demo
Monitoring and analytics service
- being rewritten to use appmetrics
- buildpack finds your script and inserts first line
---
Integrate with external metrics
- BlueMix: logs: yes, metrics: XXX
- Collective: No
- strong-pm: logs: file, syslog; metrics: statsd, splunk, syslog, etc.
- pm2: No
- forever: No
???
bluemix?
collective?
https://docs.cloudfoundry.org/devguide/services/log-management.html
---
# Demo
</textarea>
<script src="https://gnab.github.io/remark/downloads/remark-latest.min.js">
</script>
<script>
var slideshow = remark.create();
</script>
</body>
</html>

Process Managers

Demo

Show the API mesh user-experience?

Or the BlueMix experience?

  • restart
  • using sl-run
  • blue/green deployment
  • log forwarding setup
  • basically anything I can think of in the above...

BlueMix - Node.js Instant Runtime

Questions:

  • VCAP_APP_PORT, not PORT?

  • docs say the latest node version will be used by default: I'm getting 0.12, which isn't even the latest stable version, much less the latest version

  • cf files--> agent_api_version, what is this?

  • "restage"?

API Mesh Collectives

Comparison

cf. http://strong-pm.io/compare/:

  • run app locally: yes

  • restart on failure: yes

  • graceful/rolling restarts: ?

  • os startup script support: ?

  • security: ssh and HTTP auth?

  • set env vars: apim env:set/get/list?

  • log aggregation: no?

  • multiple app support: yes

  • language support: java and node

  • build and package: yes (slc build, wlpn-server pack, apim ???)

  • deploy apps into a docker container: no... or do we support docker?

  • remote deploy: yes

  • multiple deploys/revert: ... like strong-pm, not really?

  • clustering: yes (not node cluster, using scaling and load-balancing)

  • resize clusters: deploy-time: ?, start-time: ?, runtime: ?

  • manage remotely: yes

  • load balancer auto-config: yes

  • profiling: no

  • profile triggering: no

  • metrics: no

  • integrate with external: n/a (no metrics)


bajtos commented Feb 19, 2016

Missing LoopBack features:
