This is a draft list of what we're thinking about measuring in Etsy's native apps.
Currently we're looking at how to measure these things with Espresso and KIF (and whether each metric is even possible to measure in an automated way). We'd like to build internal dashboards and alerts around regressions in these metrics using automated tests. In the future, we'll want to measure most of these things with RUM too.
- App launch time - how long does it take from tapping the icon to being able to interact with the app?
- Time to complete critical flows - using automated testing, how long does it take a user to finish the checkout flow, etc.? (See the Espresso sketch after this list.)
- Battery usage, including radio usage and GPS usage
- Peak memory allocation
- Frame rate - we need to figure out where we're dropping frames (and introducing scrolling jank). We should be able to dig into render, execute, and draw times.
- Memory leaks - using automated testing, can we find flows or actions that trigger a memory leak?
- An app version of Speed Index - visible completion of the above-the-fold screen over time.
- Time it takes for remote images to appear on the screen
- Time between tapping a link and being able to do something on the next screen
- Average time spent looking at spinners
- API performance
- WebView performance
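As a rough illustration of the "time to complete critical flows" item above, here's a minimal sketch of what an Espresso-based timing test could look like. `CheckoutActivity`, the view IDs, and the log tag are hypothetical placeholders rather than our actual app code, and in CI the elapsed time would feed a dashboard instead of being logged.

```kotlin
import android.os.SystemClock
import android.util.Log
import androidx.test.espresso.Espresso.onView
import androidx.test.espresso.action.ViewActions.click
import androidx.test.espresso.assertion.ViewAssertions.matches
import androidx.test.espresso.matcher.ViewMatchers.isDisplayed
import androidx.test.espresso.matcher.ViewMatchers.withId
import androidx.test.ext.junit.rules.ActivityScenarioRule
import androidx.test.ext.junit.runners.AndroidJUnit4
import org.junit.Rule
import org.junit.Test
import org.junit.runner.RunWith

@RunWith(AndroidJUnit4::class)
class CheckoutFlowTimingTest {

    // CheckoutActivity and the view IDs below are placeholders.
    @get:Rule
    val activityRule = ActivityScenarioRule(CheckoutActivity::class.java)

    @Test
    fun timeCheckoutFlow() {
        val start = SystemClock.elapsedRealtime()

        // Walk through the critical flow; Espresso waits for the main
        // thread to go idle before each interaction.
        onView(withId(R.id.add_to_cart)).perform(click())
        onView(withId(R.id.cart)).perform(click())
        onView(withId(R.id.checkout)).perform(click())
        onView(withId(R.id.order_confirmation)).check(matches(isDisplayed()))

        val elapsedMs = SystemClock.elapsedRealtime() - start
        // In CI this number would be reported to a dashboard; here we just log it.
        Log.i("PerfTiming", "checkout flow took ${elapsedMs}ms")
    }
}
```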
Hey Lara,
So, a couple more points on getting these statistics (again, talking only about the Android side of things):
To be clear, there's a lot of work here that needs to be done. Eventually, I'd love for these types of things to be rolled into Android's tooling.
In a super awesome happy fun fun world, you'd be able to insert counters into your traceview/batterystats/heapmanager events that would correlate with the UIAutomator events, so that you could say "taking this action caused these performance events to occur."
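One way to approximate that correlation today is to wrap each scripted action in a named trace section, so the action shows up in the same systrace capture as the framework's own events. A minimal sketch, assuming UIAutomator 2; the UI text and resource names are made up:

```kotlin
import android.os.Trace
import androidx.test.platform.app.InstrumentationRegistry
import androidx.test.uiautomator.By
import androidx.test.uiautomator.UiDevice

// Hypothetical helper: label a scripted action so it appears as a named
// slice alongside the system's own trace events.
inline fun tracedAction(label: String, action: () -> Unit) {
    Trace.beginSection(label)
    try {
        action()
    } finally {
        Trace.endSection()
    }
}

fun runTracedFlow() {
    val device = UiDevice.getInstance(InstrumentationRegistry.getInstrumentation())

    // "Search" and "listing_card" are placeholders for whatever the real UI exposes.
    tracedAction("tap_search") {
        device.findObject(By.text("Search")).click()
    }
    tracedAction("open_listing") {
        device.findObject(By.res("com.example.app", "listing_card")).click()
    }
}
```

This doesn't reach into batterystats or the heap tooling, but it at least puts "the action we took" and "the rendering work that followed" on the same timeline.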
From there, your CI tests become pretty straightforward. Design a typical user action flow made up of individual user inputs (or system inputs, if you're using the accelerometer or something). Then, for each tooling system you're interested in gathering stats for, execute that test. A counter is associated with each event in each tool's output, so at the end of it all you can combine everything into a single parsable timeline of performance across the test.
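For the "single parsable timeline" part, the post-processing could be as simple as tagging every sample from every tool with a timestamp and the step counter it was captured under, then merging and sorting. A rough sketch with made-up record fields and values:

```kotlin
// Hypothetical record: one sample from one tooling system (traceview,
// batterystats, gfxinfo, ...), tagged with the scripted step it maps to.
data class PerfEvent(
    val timestampMs: Long,  // when the sample was taken
    val tool: String,       // which tooling system produced it
    val stepCounter: Int,   // which user/system input it correlates with
    val metric: String,     // e.g. "frame_ms", "heap_kb", "battery_uA"
    val value: Double
)

// Merge every tool's samples into one chronological timeline so a single
// pass can answer "what happened right after step N?".
fun buildTimeline(perTool: List<List<PerfEvent>>): List<PerfEvent> =
    perTool.flatten().sortedBy { it.timestampMs }

fun main() {
    val traceview = listOf(PerfEvent(1_000, "traceview", 1, "frame_ms", 18.0))
    val battery = listOf(PerfEvent(1_050, "batterystats", 1, "battery_uA", 220.0))
    buildTimeline(listOf(traceview, battery)).forEach(::println)
}
```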
From there, standard data processing can help you understand flows, trends, and commonalities.