@timmow
Created November 17, 2014 17:12
velocity day 1 notes

Steve Shorrock - Life after human error

"someone did something someone else thinks is wrong"

Examples of headlines blaming human error

"Hal 9000 explanation" - not completely telling the truth

lots of examples of words used in failure situations

"limits of my language are limits of my world" - language acts as a constraint

human error - "post hoc social judgement"

difficult to pinpoint human error without exact catalog of things that must be done in each situation

magazine article and white paper taking point further

"be mindful of your mindset, reactions and language" in particular your feelings when things go wrong

"study the system in the context of normal work" - failure is caused by people doing their normal day to day work

need to understand how staff meet demand and handle pressure - human error often caused by pressure

how do people respond to failure? - should adjust and vary performance, make trade offs.

3/5

Aaron Rudger - Keynote Systems - maximise return of digital investments.

Can't understand language children use, and vice versa.

Performance is the same thing - difficult for performance people to get the message across.

43% of managers feel IT hinders business

3/5

Match up business metrics (growth, conversions) to perf metrics (page load time)

Perf metrics don't tell the business anything by themselves. Need to correlate and compare to business context.
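
A minimal sketch of the correlation idea, not something shown in the talk: bucket sessions by page load time and compare the conversion rate per bucket. The session data, the one second bucket size and the function name are all made up for illustration.

```python
from collections import defaultdict

def conversion_by_load_time(sessions, bucket_seconds=1.0):
    """Return {bucket_start_seconds: conversion_rate} for (load_time, converted) pairs."""
    totals = defaultdict(int)
    converted = defaultdict(int)
    for load_time, did_convert in sessions:
        # Floor the load time to the start of its bucket, e.g. 1.7s -> 1.0
        bucket = int(load_time // bucket_seconds) * bucket_seconds
        totals[bucket] += 1
        if did_convert:
            converted[bucket] += 1
    return {b: converted[b] / totals[b] for b in sorted(totals)}

# Hypothetical sessions: (page load time in seconds, did the visit convert?)
sessions = [(0.8, True), (0.9, True), (1.4, True), (1.7, False),
            (2.2, False), (2.9, True), (3.5, False), (3.8, False)]
rates = conversion_by_load_time(sessions)
for bucket, rate in rates.items():
    print(f"{bucket:.0f}-{bucket + 1:.0f}s: {rate:.0%} converted")
```

The load time on its own says little; the conversion rate next to it is the part the business can read.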

Keynote has a new RUM system that does that.

LIGHTNING DEMO: Always Keep an Eye on Your Website Performance - PerfBar Khalid Lafi (WireFilter)

script to add to page to measure things - can add custom metrics - budget of things on page.

Designed for developers so they are mindful of performance metrics on page.

4/5

The Impatience Economy, Where Velocity Creates Value Monica Pal (Aerospike Inc.)

10 days wait for information down to 10 minute expectation

everyone wants everything faster now, and are less patient

1 second faster - 2% conversion increase

advertising, real time bidding big things.

real time analysis + hadoop for processing older data.

2/5

Recruiting for Diversity in Tech Laine Campbell (Pythian)

diversity benefits company, bottom line and everything in the world.

Assume good intentions, culture of forgiveness - don't be afraid to speak up.

Acquired diversity as well as inherent diversity - breaking down in different ways

2D diversity

more diverse companies have higher market caps.

meritocracy is broken

"governing group" reach out to own networks, by definition similar to group.

requiring sixteen hour days is anti diversity.

culture of comfort, not diversity, by just carrying on as normal

Apache foundation used as un-diverse example

goal should be to get applications %s to meet demographic

do this by reaching out to these demographics

need to help these people, mentoring / junior opportunities.

"get people to the table"

once there, how do you get them to stay?

code of conduct.

Eliminate implicit bias.

orchestras - screen around auditions, increased women from 5% to 50%

interviewing groups should be as diverse as possible

5/5

no names / pictures on applications.

"Embracing Your Personal Apocalypse: A Light-Hearted Jaunt Through an Abject Failure Will Pressly (EdgeCast)"

quick hack fix config change, triggered kernel bug

iptables rules confused the issue, all dns failing

talk about recovering from failure rather than details of failure.

keep mind clear and keep perspective.

after recovery - ask "what did we learn?"

3/5

Time For a New Way to Measure User Experience?! Klaus Enzenhofer (Dynatrace)

"the performance impact"

apdex and navigation timings are outdated.

JS heavy apps don't fit these metrics.

Mobile apps don't fit this, browsers don't have nav timing

"user action response time"

errors - look bad.

look at a user's whole journey, not just part of it.

take into account where user is

Dynatrace have a product to do it...

3/5

Better Performance Through Better Design Mark Zeman (SpeedCurve)

Designed a beautiful website, ask to make it fast now.

Need to take perf into account when designing things.

not visual aesthetics or user experience - use knowledge / creativity to solve problem

not the designer's fault, the process's fault.

iterative design process.

need a flexible process.

mindfulness.

don't be dogmatic and follow a process, step back and analyse the process.

have some principles - high level, 5 - 10 of them, performance should be one of these.

example "speed is more important than design embellishment"

"engage quickly and then make it feel like you are there" - travel site, deliver initial content quickly then add rich content after main point of page is loaded.

"small interdisciplinary teams"

"share your knowledge" - experts within team need to work with non experts.

pick a metric and work towards it.

benchmark against competitors.

simplify data and present it in easy to comprehend way.

get knowledge out of head and facilitate performance discussions.

4/5

Continuous and Visible Security Testing with BDD-Security Stephen de Vries (ContinuumSecurity)

make devs responsible for security, run tests continuously

compares QA testing to security testing - similar, but different

Business context / architecture / app features affect threats we look for - threat model

Likely enough / high impact enough that we care about it.

ok to accept certain threats if conscious decision made and documented

password reset account leakage used as an example - different for standard online shopping, compared to dating site for having affairs (ashley madison)

security requirements visible, actionable, up to date, testable, automated

jbehave + selenium, owasp zap

https://github.com/continuumsecurity/bdd-security

uses a port scanning example to verify only certain ports are open
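
A rough sketch of the same kind of check in plain Python, not how bdd-security itself does it: try a TCP connect on each port and fail if anything outside the allowed list answers. The host, port list and function names are assumptions for illustration.

```python
import socket

def open_ports(host, ports, timeout=0.5):
    """Return the set of ports on host that accept a TCP connection."""
    found = set()
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex returns 0 on success and an error code otherwise
            if sock.connect_ex((host, port)) == 0:
                found.add(port)
    return found

def unexpected_ports(host, ports_to_check, allowed):
    """Ports that are open but not on the allowed list; an empty set means pass."""
    return open_ports(host, ports_to_check) - set(allowed)

# Hypothetical policy: only 80 and 443 may be open on this host.
extra = unexpected_ports("127.0.0.1", [22, 80, 443, 8080], allowed={80, 443})
print("unexpected open ports:", sorted(extra))
```

Run from CI, a non-empty result would fail the build, which keeps the requirement visible and testable.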

Also a nessus example, that removes false positives - docs of why false positives are removed.

owasp zap - like charles proxy, focused on security

use selenium + zap api to drive this

create java class with login / logout etc methods

get zap to submit all forms on app, then spider the app so it is in zap's db, then zap will report on security vulnerabilities

can check different users don't see other users' sensitive data

showed the ci process - commit, automated deploy and automated bdd scan from jenkins

similar tools - zap junit

gauntlt (ruby) http://gauntlt.org/

f-secure/mittn python + Burp (proprietary sec scanner)

4/5

Monitoring: The Math Behind Bad Behavior Theo Schlossnagle

not math heavy talk

online / offline models

example of a disk alert, 85%, bumping to 85.1% at 2am to fix in the morning

classification of messy data

example given of a counter which periodically gets reset leading to sawtooth graph

gauge vs rate - disk - gauge is how full, rate is how much added in last hour
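
A small sketch of turning a resetting counter into a rate, assuming samples arrive as (timestamp, value) pairs. Treating any drop as a reset and skipping that interval is one simple policy, not necessarily what the talk recommended.

```python
def counter_to_rates(samples):
    """Turn (timestamp_seconds, counter_value) samples into per-second rates.

    A drop in the counter is treated as a reset: that interval is skipped
    rather than reported as a huge negative rate.
    """
    rates = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        if v1 < v0 or t1 <= t0:
            continue  # counter reset (or bad timestamps): no trustworthy rate
        rates.append((t1, (v1 - v0) / (t1 - t0)))
    return rates

# Hypothetical counter sampled every 60s; it resets between t=120 and t=180.
samples = [(0, 1000), (60, 1600), (120, 2200), (180, 300), (240, 900)]
rates = counter_to_rates(samples)
print(rates)
```

Graphing the rate instead of the raw counter removes the sawtooth.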

categorisation difficult for machines to do, easy for humans to do

bayesian categorization?

signal vs noise

average, median, min / max - horrible things to do

spikes on an api due to cronjobs

graph standard deviation of data as well

need more data more often to lower p value

or from more places - cpu usage on 1 server vs 500

measure residuals from a mean

cyclical data - fourier transform can remove the cycles

exponential windowed / weighted mean

can't do this on historic data

sliding windowed mean

huge window must be kept in memory

lurching window - 3 day buckets - 2 days ago, yesterday, today

each day is exponential weighted mean of the data.
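
To make the memory trade-off concrete, a minimal sketch of the two simpler means mentioned above. The class names and the alpha value are illustrative assumptions; the lurching window is not covered here.

```python
from collections import deque

class ExponentialMean:
    """Exponentially weighted mean: constant memory, updated one sample at a time."""
    def __init__(self, alpha):
        self.alpha = alpha  # weight given to the newest sample (0 < alpha <= 1)
        self.value = None

    def update(self, x):
        if self.value is None:
            self.value = float(x)  # first sample seeds the mean
        else:
            self.value = self.alpha * x + (1 - self.alpha) * self.value
        return self.value

class SlidingMean:
    """Mean over the last `size` samples: the whole window stays in memory."""
    def __init__(self, size):
        self.window = deque(maxlen=size)

    def update(self, x):
        self.window.append(x)  # the oldest sample falls out once the window is full
        return sum(self.window) / len(self.window)

ewm = ExponentialMean(alpha=0.5)
slide = SlidingMean(size=3)
for x in [10, 10, 10, 40]:
    e, s = ewm.update(x), slide.update(x)
print(e, s)
```

The exponential version keeps one number per series, while the sliding version has to hold every sample in the window.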

Cusum test / method
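
A minimal sketch of a one-sided (upper) CUSUM, assuming a known target level. The slack and threshold values are made-up tuning parameters, not numbers from the talk.

```python
def cusum_upper(values, target, slack, threshold):
    """Return the index where the upper CUSUM statistic first exceeds threshold, else None."""
    s = 0.0
    for i, x in enumerate(values):
        # Accumulate only deviations above target + slack; never drop below zero
        s = max(0.0, s + (x - target - slack))
        if s > threshold:
            return i
    return None

values = [10, 11, 9, 10, 12, 13, 13, 14, 13]
print(cusum_upper(values, target=10, slack=1, threshold=5))
```

Small sustained shifts add up and trip the alarm, while a single noisy sample does not.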

tukey test is another one looking at

minus a std dev - leads to negative answers

your data is not a normal distribution
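
A tiny illustration of the negative answer problem, using a made-up skewed sample. Every value is positive, yet mean minus one standard deviation is below zero.

```python
import statistics

# Made-up, heavily skewed latency sample in milliseconds
latencies_ms = [1, 1, 1, 2, 50]

mean = statistics.mean(latencies_ms)
stdev = statistics.pstdev(latencies_ms)  # population standard deviation
lower_band = mean - stdev

print(f"mean={mean:.1f} stdev={stdev:.1f} mean-stdev={lower_band:.1f}")
```

A band that assumes a symmetric, normal shape ends up suggesting a latency that cannot exist.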

if we could keep all the data on performance, we could change our stats

information compression - 1019 goes in 1000 bin with 1056, 2018 in 2000 bin with 2431

summarize as a histogram over 1 minute
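
A sketch of that binning, assuming bins are formed by rounding down to one significant digit, which is consistent with the four numbers above. The real scheme from the talk may use finer bins.

```python
import math
from collections import Counter

def bin_floor(value):
    """Round a positive value down to one significant digit (1019 -> 1000)."""
    magnitude = 10 ** math.floor(math.log10(value))
    return int(value // magnitude * magnitude)

def summarize(values):
    """Histogram of bin -> count, e.g. for one minute of measurements."""
    return Counter(bin_floor(v) for v in values)

hist = summarize([1019, 1056, 2018, 2431])
print(dict(hist))
```

Each minute is then stored as a handful of bin counts instead of every raw measurement.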

once summarized, no longer a distribution

http://www.brendangregg.com/FrequencyTrails/modes.html

5/5

Who's Afraid of the Big Bad Preloader? Yoav Weiss (WL Square)

not many people have heard of preloader

added to browsers in 2008

browser parses into a dom tree

used to work like this

once dom resource url was seen, was added to list of resources to download

script elements are more important than other elements - cause parser to halt when discovered, as executing could change the dom. Also had to wait for any css resources to finish.

This is slow!

minimising number of resources and putting scripts at bottom were workarounds for browsers without a preloader - still good practice but impact not as high.

different terms in different browsers - preloader is vendor neutral

"the greatest browser optimisation of all time" - steve souders

keeps looking at html even while parser is halted on script execution - speculatively downloads things

preprocessing

tokenization

parsing

preloader between tokenization and parsing

more about why script blocks the dom - document.write, create element, style queries (hence waiting for css download)

what can preloader speculatively download? - external css, images

@import - only webkit

video poster - firefox only

chrome and opera now, soon to be firefox

other things are not preloaded

input images

iframe

object

link rel=import

video, audio

css based resources - webfonts, background images not preloaded

js based resources - cannot be preloaded (by definition)
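
A toy model of what a preload scanner does, written with Python's `html.parser`, to show which URLs are visible from markup alone. It is only an illustration. Real browser preloaders differ per vendor, and the tag choices here simply follow the lists above.

```python
from html.parser import HTMLParser

class ToyPreloadScanner(HTMLParser):
    """Collects URLs a preloader might fetch early; everything else is ignored."""
    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "script" and attrs.get("src"):
            self.urls.append(attrs["src"])
        elif tag == "img" and attrs.get("src"):
            self.urls.append(attrs["src"])
        elif tag == "link" and attrs.get("rel") == "stylesheet" and attrs.get("href"):
            self.urls.append(attrs["href"])

page = """
<link rel="stylesheet" href="/site.css">
<script src="/app.js"></script>
<img src="/hero.jpg">
<iframe src="/ad.html"></iframe>
<input type="image" src="/button.png">
<script>document.write('<img src="/late.png">')</script>
"""
scanner = ToyPreloadScanner()
scanner.feed(page)
print(scanner.urls)
```

The image written by the inline script never shows up in the list, because it only exists once the script runs.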

priorities -

context based priority

css, scripts, visible images, non visible images

higher priority for resources in

preloader has no spec, is not an api - don't rely on its behaviour

20% typical improvement on average

critical resources must be in markup - not scripts or css (for critical rendering path)

non critical resources - maybe not - consider taking out of markup

bottom scripts can bubble up the waterfall due to preloader

don't invalidate the dom - rewrite base element, add comments

set charset in http headers (at least in IE)

assume nothing about loading order, don't rely on it!!!!

don't rely on js cookies for image requests - image requests may have been made before js executed

resource hints - link rel=preload / preconnect

get it to download webfonts, prefetch dns, make ssl connection to other domain
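
For reference, a small sketch that prints what such hint tags could look like. The hostnames and font path are placeholders, and `link_tag` is a made-up helper rather than part of any library.

```python
def link_tag(rel, href, **attrs):
    """Build a <link> element string; extra keyword args become attributes."""
    parts = [f'rel="{rel}"', f'href="{href}"']
    for name, value in attrs.items():
        # True means a bare boolean attribute such as crossorigin
        parts.append(name if value is True else f'{name}="{value}"')
    return "<link " + " ".join(parts) + ">"

hints = [
    link_tag("dns-prefetch", "//cdn.example.com"),
    link_tag("preconnect", "https://cdn.example.com"),
    # "as" is a Python keyword, so it is passed through a dict
    link_tag("preload", "/fonts/body.woff2", type="font/woff2",
             crossorigin=True, **{"as": "font"}),
]
print("\n".join(hints))
```

These go in the head of the page so the browser sees them before it discovers the css that references the font.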

future -

more resources, iframes, link rel=import, video poster, input image

css support (imports)

add support for fonts

resource priorities - smarter unimportant scripts, images etc

http2 - sends all resources and priority

server can implement its own preloader

resource priorities proposal

Cognitive Biases in Engineering Organizations Jonathan Klein (Etsy)

http://jikle.in/biases/links

thinking fast and slow - daniel kahneman

projection bias and switching roles

covers planning poker in detail

bandwagon effect - people believing something because other people do

sunk cost

cutting corners - hyperbolic discounting - prefer reward that arrives sooner

fundamental attribution error - emphasis on internal characteristics rather than external - blameless post mortems avoid this

everyone has good intentions, intuition is not perfect

bayesian reasoning

5/5

The Machine is Dead, Long Live the Machine! - Service Resilience and Deployment Automation at The BBC Yavor Atanasov (BBC)

previously - ops / dev separation

longer release cycle

now - no hard limit on technologies, mix up devs / ops, cont delivery

teams own infra and deploys, choose tech used

60000 deploys in 18 months

2 day release process to 10 mins

fat vs thin containers

interested in containers but not using them

use mock tool or docker containers to ensure clean environment

packages same across all envs, use config to change them

full images vs base images

2 snapshots, one for image with software, one on top with config as well.

use cloudformation for deploys

templates versioned along with code

can apply same template to different environments

use troposphere to do this
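
A rough sketch of the one template, many environments idea. It builds the CloudFormation JSON from plain Python dicts rather than troposphere, so it runs without extra packages. It is not the BBC's template, and the resource name, instance type and AMI ids are placeholders.

```python
import json

def build_template():
    """One template for every environment; differences come in as parameters."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Parameters": {
            "Environment": {"Type": "String", "AllowedValues": ["test", "live"]},
            "ImageId": {"Type": "String"},
        },
        "Resources": {
            "LaunchConfig": {
                "Type": "AWS::AutoScaling::LaunchConfiguration",
                "Properties": {
                    "ImageId": {"Ref": "ImageId"},  # filled in per deploy
                    "InstanceType": "t2.micro",
                },
            },
        },
    }

def stack_parameters(environment, image_id):
    """Per-environment values in the key/value list shape CloudFormation accepts."""
    return [
        {"ParameterKey": "Environment", "ParameterValue": environment},
        {"ParameterKey": "ImageId", "ParameterValue": image_id},
    ]

template_body = json.dumps(build_template(), indent=2, sort_keys=True)
test_params = stack_parameters("test", "ami-00000000")  # placeholder AMI ids
live_params = stack_parameters("live", "ami-11111111")
print(template_body)
```

The template file is versioned with the code, and only the parameter values change between environments or between deploys of a new image.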

separate templates for stateless and stateful templates

isolate instances and networks

make sure api limits / resource limits not exceeded

sec groups isolate instances

subnets and acls to isolate groups

vpc

you should use auto scaling groups - multi AZ

use chaos monkey

deployment by updating the image id of autoscaling group

vpn through vpc for ssh access

Synthetic and RUM – The Best of Both Worlds Cliff Crocker (SOASTA), Mark Zeman (SpeedCurve)

rum and synthetic give very different sets of numbers. Why? Should we report both?

normal distributions again - real users are definitely not normally distributed, extremes have effects on whole dataset

empty cache and repeat views - real users will be in between

what is the 1 number to communicate back to the org?

too many metrics

similar to web analytics

different people will have different important numbers

ceo - revenue vs perf

ops - page load

fe dev - start render

design - speed index

revenue per second of page load is a good metric for ceo

competitive benchmarking - synthetic only

should show relative numbers, accuracy of synthetic may not matter

guardian responsive site is a good example

key point - simplify data from many metrics to display

huffington post removed all 3rd parties added back one by one, to count requests

set SLA using RUM, but be specific

median and 95th percentile for visitors in the USA
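
A minimal sketch of computing exactly that pair of numbers from RUM samples. The samples are invented, and the nearest-rank percentile is one common definition among several.

```python
import math
import statistics

def percentile(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]

# Hypothetical RUM samples: (country, page load time in seconds)
samples = [("US", 1.2), ("US", 1.5), ("GB", 2.0), ("US", 1.9), ("US", 2.4),
           ("US", 1.1), ("DE", 3.0), ("US", 9.8), ("US", 1.7), ("US", 1.4),
           ("US", 2.1), ("US", 1.6)]

# Being specific: only visitors in the USA count towards this SLA
us = [t for country, t in samples if country == "US"]
print("US median:", statistics.median(us), "US p95:", percentile(us, 95))
```

The one slow outlier barely moves the median but sets the 95th percentile, which is why reporting both is useful.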

use synthetic to set budgets - js size / css size
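
A small sketch of a budget check that could run after a synthetic test. The budget numbers, asset type names and `over_budget` helper are assumptions for illustration.

```python
# Hypothetical budgets in kilobytes, e.g. derived from a synthetic baseline run.
BUDGET_KB = {"js": 300, "css": 60}

def over_budget(measured_kb, budget_kb=BUDGET_KB):
    """Return {asset_type: kilobytes over budget} for every type that exceeds it."""
    return {kind: size - budget_kb[kind]
            for kind, size in measured_kb.items()
            if kind in budget_kb and size > budget_kb[kind]}

# Asset types without a budget (images here) are ignored
result = over_budget({"js": 340, "css": 48, "images": 900})
print(result)
```

An empty result means the page is within budget.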

rum is good for showing the reasons to use a cdn

work needed to relate site changes to rum / synthetic measures

single page apps make it hard - synthetic only measures first load, not xhr

rum - page or service xhr? dilute overall numbers by counting xhr as a page

designers want to know when does my site become usable

usable somewhere in between first render and page load

easyish to see in film strip view of webpage test, difficult to measure

user timing with js events in synthetic can help with this
