@ejwinter
Last active March 6, 2017 14:38
#notes #nfjs

An architect's guide to evaluating cloud services

Matt Stine

War stories about how he has gone about evaluating cloud services over the past several years.

Topics

  • Rationale
  • Evaluation categories
    • Business case
    • Resiliency
    • Security
    • Regulatory compliance
    • Economics
    • Scalability
    • Provider "lock-in"
    • Available tooling
    • Undifferentiated heavy lifting
    • Differentiating features
  • How to create a scorecard based on some examples.

Rationale

Cloud Native Landscape https://github.com/cncf/landscape

There are a lot of things in the landscape to choose from!

Categories

Business case

What business problem are we trying to solve?

How does this type of service address this problem?

  • e.g. relational v columnar v document v key:value databases

Resiliency

What are your resiliency requirements?

  • Do NOT just say "It needs to be 'HA'"
  • False dichotomy between HA and non-HA
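
One way to move past "HA or not" is to state the target as a number and see how much downtime it allows. The sketch below is illustrative only: the helper name and the sample targets are assumptions, not something from the talk.

```python
# Hypothetical helper: turn an availability target into a yearly downtime budget.
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def downtime_hours_per_year(availability_percent: float) -> float:
    """Hours per year the service may be unavailable at the given target."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be a percentage between 0 and 100")
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


# Sample targets (assumed values) to compare side by side.
for target in (99.0, 99.9, 99.99):
    print(f"{target}% -> {downtime_hours_per_year(target):.2f} hours/year")
```

Each extra "nine" cuts the budget by a factor of ten, which makes it a range to choose from rather than a yes/no property.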

Components of Resiliency

  • "Available" means the system is functioning as designed
  • "Consistent" means you get the same response for the same request
    • weak
    • eventual
  • "Partition tolerance" - network partition
    • the system can function if internal communication is disrupted
  • "Durability"

Security

  • security is NOT binary

Authentication

  • how does a client prove their identity?
  • how are credentials provisioned/stored
  • how are credentials delivered?
  • how are credentials rotated? - avoid long living tokens (passwords? certificates?)

Authorization

  • What permission types are supported?
  • Are permissions grouped into roles?
  • Are roles customizable?
  • How are roles assigned to actors?
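
The questions above assume a model where permissions are grouped into roles and roles are assigned to actors. A minimal sketch of that model follows; every role, permission, and actor name in it is hypothetical, and a real service will have its own scheme.

```python
# Hypothetical role and permission names, for illustration only.
ROLE_PERMISSIONS = {
    "reader": {"bucket:read"},
    "writer": {"bucket:read", "bucket:write"},
}

# Hypothetical actors (services or users) and the roles assigned to them.
ACTOR_ROLES = {
    "report-job": {"reader"},
    "ingest-service": {"writer"},
}


def is_allowed(actor: str, permission: str) -> bool:
    """True if any role assigned to the actor grants the permission."""
    roles = ACTOR_ROLES.get(actor, set())
    return any(permission in ROLE_PERMISSIONS.get(role, set()) for role in roles)


print(is_allowed("report-job", "bucket:write"))      # reader role lacks write
print(is_allowed("ingest-service", "bucket:write"))  # writer role grants write
```

When evaluating a service, check whether both tables can be edited by you (custom roles) or only the second one (fixed roles, assignable to actors).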

Most software running in the world is not very secure. The people responsible for securing it don't sleep very well :)

Regulatory compliance

  • "Residency" - where does the data live
  • "Sovereignty" - who has control of the data, and what rules and laws apply to it
    • e.g. Germans' data cannot live outside of Germany

Encryption

Encryption can make you more/less secure.

Compliance tells you how much encryption is needed to be legal.

  • 'Data at rest' - encryption of the data as it sits on the filesystem; this is NOT binary
    • Who owns the data? Who can see it? Who has the keys to the data?
  • 'Data in flight' - how is the data protected when it is transferred over the network?

Auditability

  • What happened?
  • When did it happen?
  • What actor caused it?
  • Where did it happen?
  • Why did it happen?
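
The five questions above map naturally onto the fields of an audit record. A minimal sketch follows, assuming a JSON log line as the output format; the class name, field names, and sample values are all hypothetical.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEvent:
    """One audit record; each field answers one of the five questions."""
    what: str       # what happened
    when: datetime  # when it happened (timezone-aware)
    actor: str      # what actor caused it
    where: str      # where it happened (service, region, host, ...)
    why: str        # why it happened (ticket, request id, policy, ...)

    def to_json(self) -> str:
        record = asdict(self)
        record["when"] = self.when.isoformat()  # datetime is not JSON-serializable
        return json.dumps(record, sort_keys=True)


# Sample values are invented for illustration.
event = AuditEvent(
    what="role 'writer' granted",
    when=datetime(2017, 1, 1, 12, 0, tzinfo=timezone.utc),
    actor="admin@example.com",
    where="storage-service/eu-central",
    why="change request CR-0000",
)
print(event.to_json())
```

A service whose audit log cannot fill in all five fields leaves a gap you will have to close yourself.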

Certification

  • HIPAA
  • SOX

Economics

Who is operating the service? If something goes wrong, whose job is it to fix it?

What is your expected rate of consumption?

How is the service priced/costed? How is money spent to run the service?

Is the equation cost effective as a function of consumption and growth rate?
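
To make "cost as a function of consumption and growth rate" concrete, here is a rough sketch. It assumes a very simple pricing model (a flat monthly fee plus a per-unit price) and compound monthly growth; real provider pricing usually has tiers and discounts, and every number below is invented.

```python
def projected_monthly_costs(
    initial_units: float,
    monthly_growth_rate: float,
    unit_price: float,
    fixed_fee: float,
    months: int,
) -> list[float]:
    """Cost per month, assuming consumption compounds at a constant rate."""
    costs = []
    units = initial_units
    for _ in range(months):
        costs.append(fixed_fee + units * unit_price)
        units *= 1 + monthly_growth_rate
    return costs


# Hypothetical inputs: 1000 units/month today, 10% monthly growth.
costs = projected_monthly_costs(
    initial_units=1000, monthly_growth_rate=0.10,
    unit_price=0.5, fixed_fee=100, months=12,
)
print(f"month 1: {costs[0]:.2f}, month 12: {costs[-1]:.2f}, total: {sum(costs):.2f}")
```

Running the same projection against each candidate's pricing shows where one option stops being cost effective as consumption grows.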

Scalability

Scalability means the system can maintain performance as usage ramps up.

Most software doesn't need to scale.

How is your load/volume expected to grow?

Is your load/volume consistent or bursty? Is it predictable?
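
One rough way to answer "consistent or bursty" from existing traffic data is the peak-to-mean ratio. The sketch below assumes you already have request counts per fixed interval; the function name and the sample series are made up.

```python
from statistics import mean


def peak_to_mean_ratio(requests_per_interval: list[float]) -> float:
    """Ratio of the busiest interval to the average interval.

    Close to 1.0 means steady load; much larger means bursty load.
    """
    if not requests_per_interval:
        raise ValueError("need at least one interval")
    average = mean(requests_per_interval)
    if average == 0:
        raise ValueError("no traffic recorded")
    return max(requests_per_interval) / average


steady = [100, 100, 100, 100]  # invented sample data
bursty = [10, 10, 10, 170]     # same total volume, concentrated in one interval
print(peak_to_mean_ratio(steady), peak_to_mean_ratio(bursty))
```

Both series carry the same total volume, but the bursty one needs capacity for several times the average, which matters when a service is priced by provisioned capacity rather than by use.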

Provider lock-in

You're always locked into something, even if you build it yourself.

How easily can you change your architecture?

Is there a sensible way to leverage multiple providers? Often not, because of the expense.

Is the service supported by open/de facto standards?

Is there a meaningful abstraction layer? This helps us make changes more easily in the future.
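
One way to picture such a layer: application code depends on a small interface of its own, and each provider gets an adapter behind it. The sketch below is a generic illustration; `BlobStore`, `InMemoryBlobStore`, and `archive_report` are hypothetical names, and a real adapter would wrap a provider SDK instead of a dict.

```python
from typing import Protocol


class BlobStore(Protocol):
    """The only storage interface the application code is allowed to see."""

    def put(self, key: str, data: bytes) -> None: ...
    def get(self, key: str) -> bytes: ...


class InMemoryBlobStore:
    """Stand-in adapter; a provider-specific adapter would have the same shape."""

    def __init__(self) -> None:
        self._blobs = {}

    def put(self, key: str, data: bytes) -> None:
        self._blobs[key] = data

    def get(self, key: str) -> bytes:
        return self._blobs[key]


def archive_report(store: BlobStore, report_id: str, body: bytes) -> str:
    """Application logic written against the interface, not a provider."""
    key = f"reports/{report_id}"
    store.put(key, body)
    return key


store = InMemoryBlobStore()
key = archive_report(store, "2017-q1", b"quarterly numbers")
print(key, store.get(key))
```

The layer is only "meaningful" if it stays small; if provider-specific concepts leak through the interface, switching providers still means rewriting the callers.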

Are you subject to "data gravity"? How easy is it to move data around? You are likely to never move it out.

Available tooling

How good is the documentation?

How persistent is the documentation?

Does the service have a "well-designed" API? Do the abstractions map to yours?

Are client libraries available for your language(s) of choice?

Does your app framework of choice (Spring) support the service?

Is good management tooling available?

Is there a management API?

Is there automation tooling available for management?

Provider resiliency

Do you think the company is resilient? Will they suddenly go out of business? What happens in that case?

Undifferentiated heavy lifting

There are gaps between what the service provides and what you need it to do.

How will you close those gaps?

How much will it cost to close them? How much will it cost to keep them closed?

Are there partners (consultants) you can utilize to help close these gaps?

Is the provider trying to close the gaps you have? Will your fillings be fleeting?

Differentiating features

There is a lot of parity out there!

Are they willing to sign my company's BAA?

What if the provider is also a competitor in some fashion?

Interplay

How well does this solution play with my other solutions?

When you pick the best of breed for each box, do you really get the best overall solution?

Look at things in the aggregate!

Scorecards

  • keep them simple
  • simple ranges 1-3, 1-5
  • Stay away from false dichotomies
  • weight priorities
  • call out subcategories when valuable
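
A scorecard along these lines can be as small as a weighted average over a simple range. The sketch below assumes a 1-5 scale; the category names come from the notes above, but the weights, the scores, and the two providers are invented for illustration.

```python
def weighted_score(scores: dict, weights: dict, scale: tuple = (1, 5)) -> float:
    """Weighted average of category scores, all on the same simple scale."""
    low, high = scale
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"no score for: {sorted(missing)}")
    for category, score in scores.items():
        if not low <= score <= high:
            raise ValueError(f"{category} score {score} is outside {low}-{high}")
    total_weight = sum(weights.values())
    return sum(scores[c] * w for c, w in weights.items()) / total_weight


# Hypothetical priorities: higher weight means the category matters more to us.
weights = {"resiliency": 3, "security": 3, "economics": 2, "tooling": 1}

# Hypothetical scores for two made-up candidates.
provider_a = {"resiliency": 4, "security": 3, "economics": 5, "tooling": 2}
provider_b = {"resiliency": 3, "security": 5, "economics": 2, "tooling": 4}

for name, scores in (("A", provider_a), ("B", provider_b)):
    print(f"provider {name}: {weighted_score(scores, weights):.2f}")
```

Keeping the raw per-category scores next to the total helps avoid the best-of-breed trap: a close total with very different category profiles is a prompt to look at the aggregate, not a verdict.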

Documentation often seems to be, and often is, transient. When we have to support and extend systems for years, how should we capture the documentation for the versions we run now?
