Last active December 4, 2018 09:03
Random Thoughts on Authentication in REST APIs

Disclaimer: I am not an auth expert, but I know someone who is. All errors and ramblings below are mine, not theirs.

First, some considerations:

  • Is your API a public data service or just the "backend piece" of your own web or mobile app?
  • Are you employing a third party (e.g., "login with Facebook") or are you managing credentials yourself?
  • Basic auth or token auth?
  • Stateless (token fully contains verification info) or stateful (must consult data store every time to verify) tokens?
  • Long-lived tokens or short-lived tokens?
  • How do you (if at all) refresh tokens?
  • How do you (if at all) invalidate tokens?
  • If using encryption, is it symmetric or asymmetric?
  • How do you pass credentials, tokens, or keys? (NOT in query parameters, at least!)
  • Are you careful about your CORS headers?
  • You know what CSRF and XSS attacks are, and whether they apply or not, right?
  • Where on the client do you store (if at all) credentials and/or tokens?

API Purpose

  • If your API is part of a user-facing web or mobile app that you own, then auth is about sessions.
    • CORS required here, but please restrict to your web client's domain.
    • If you are managing credentials yourself:
    • If you aren't:
      • Then three-legged OAuth is great, so understand apps, access tokens, and refresh tokens.
      • Consider an auth provider (e.g. Auth0) or library (e.g. Passport) and let it do the work for you.
  • If your API is a public-facing service or provided to your customers so they can build their own web or mobile apps, then auth is about API keys.
    • Not saying we have to create actual "keys," just saying there are no traditional sessions with login and logout.
    • Keys and tokens are similar (a token may as well be a type of API key), so the discussion about tokens below applies.
    • You should NOT have CORS headers! Make your clients hit the API from their servers, not their web pages.
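
For the "restrict to your web client's domain" point above, here is a minimal sketch of how a server might decide which CORS headers to send. The origin https://app.example.com and the function name cors_headers are made up for illustration; a real framework usually has its own CORS configuration that you should use instead.

```python
from typing import Dict, Optional

# Hypothetical allow-list: only your own web client's origin goes here.
ALLOWED_ORIGINS = {"https://app.example.com"}


def cors_headers(request_origin: Optional[str]) -> Dict[str, str]:
    """Return CORS response headers for an allowed origin, or none at all."""
    if request_origin in ALLOWED_ORIGINS:
        return {
            # Echo the specific origin rather than using "*".
            "Access-Control-Allow-Origin": request_origin,
            # Tell caches that the response depends on the Origin header.
            "Vary": "Origin",
        }
    # Unknown (or missing) origin: send no CORS headers, so browsers block it.
    return {}
```

For the public, server-to-server style of API, the same function with an empty ALLOWED_ORIGINS set amounts to "no CORS headers."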

Basic auth - send username and password on every request

  • Pros:
    • Super easy to implement.
    • Trivial to "revoke": if compromised, just change username and password.
  • Cons:
    • They are the most important credentials: you should minimize the number of times they are passed around.
    • If they are compromised, it's game over, pretty much:
      • Significant damage can be done with stolen credentials.
      • Attacker that steals username/password can use these to change the password and lock out legitimate owner,
        • unless password reset requires email to the person who originally signed up.
    • Slow!
      • Passwords stored as digests with a good hashing algorithm are intentionally very slow to verify, to mitigate brute-force attacks.
      • So basic auth makes the API slow, since you compute the digest on every request.
      • You did hash the password, right? And use lots of iterations?
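
To make the "slow on purpose" point concrete, here is a rough sketch of salted password hashing using PBKDF2 from Python's standard library. The iteration count and the function names are my own assumptions, not a recommendation; check current guidance (or use a dedicated password-hashing library) before picking real parameters.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

# Assumed work factor for illustration only; tune it for your hardware.
ITERATIONS = 200_000


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest) to store instead of the plaintext password."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt per password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the (deliberately slow) digest and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)
```

With basic auth, verify_password runs on every single request, which is exactly the cost the bullets above complain about.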

Token auth - long-lived tokens

  • Pros:
    • Tokens (in general) mean you don't have to pass the username and password around much.
    • Long-lived tokens are convenient for legitimate users: they don't have to "log in" very often, only on expiration.
    • No need for a refresh mechanism: entering username and password once a month or once a year is not a big deal.
      • If password is forgotten, a password reset to the owner's email is fine.
  • Cons:
    • Long-lived tokens increase the chance of compromise.
    • For long-lived tokens there MUST be a panic button to invalidate all tokens.
      • This of course means your tokens are stateful.
      • If tokens are in a whitelist, invalidation is trivial, just remove from the data store, easy.
      • Can also add to a blacklist (but keep the blacklist small by having tokens disappear when they would expire; if they don't expire, good luck).
        • Maybe for long-lived tokens, invalidation via logout time or password reset is better.
      • For whitelists, limit the number of tokens allocated per user so the whitelist does not get too big.

Token auth - very short-lived tokens

  • Pros:
    • If super short-lived, no need for invalidation.
      • The time it takes to fire up an invalidation might be about the same as letting the tokens expire.
  • Cons:
    • Refresh has to be used, since without refresh you are close to basic auth (too many passes of username/password).
      • Refresh has some problems (see below).
      • If you have automatic refresh where a token can be used to get another token, then that's just a long-lived token.

On Passwords vs. Tokens

  • What is the difference between a password and a very, very long-lived token?
    • Well, both have a long life, but passwords still live "longer."
    • A password is the "most important" and "most sensitive" credential.
    • Tokens are meant to be easy to revoke; password changes go through a reset process.
    • You expect to hash/digest a password to protect against brute-force attacks. Tokens should never be subjected to that performance hit: we expect validation to be super performant.
  • What is the difference between a long-lived token and short-lived tokens that you can automatically refresh?
    • I don't know, probably nothing.
  • How do you pass tokens?
    • Generally in the Authorization header with type Bearer, though you could Base64-encode it together with a fake or real username and carry it with Authorization: Basic. 🙂
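
Here is a small sketch of both ways of carrying a token in the Authorization header, plus a server-side helper that pulls the token back out. The function names are hypothetical, and treating the Basic "password" field as the token is just the trick described above, not a standard.

```python
import base64
from typing import Dict, Optional


def bearer_header(token: str) -> Dict[str, str]:
    """The usual way: Authorization: Bearer <token>."""
    return {"Authorization": "Bearer " + token}


def basic_header(username: str, token: str) -> Dict[str, str]:
    """The alternative: Base64 of "username:token" under the Basic scheme."""
    raw = (username + ":" + token).encode("utf-8")
    return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}


def extract_token(authorization: str) -> Optional[str]:
    """Return the token from either header style, or None if malformed."""
    scheme, _, value = authorization.partition(" ")
    if not value:
        return None
    if scheme.lower() == "bearer":
        return value
    if scheme.lower() == "basic":
        try:
            decoded = base64.b64decode(value, validate=True).decode("utf-8")
        except ValueError:  # covers bad Base64 and bad UTF-8
            return None
        _, sep, token = decoded.partition(":")
        return token if sep else None
    return None
```

Note that Base64 is an encoding, not encryption, so both forms need HTTPS just as much.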

How do you invalidate?

If you need a way to invalidate tokens right now, because you suspect a token has been stolen, you need to store some info somewhere and check presented tokens against it. Whether you use a whitelist or blacklist, think about how this info is distributed among microservices. This also means JWTs can't be used to fullest advantage.

Option 1: Whitelist - store all valid tokens

  • This can get huuuuuuuuuge!
    • Use Redis or similar so the tokens can be automatically removed on expiration.
    • Probably not good if tens of millions of valid tokens out there.
  • Can invalidate individual tokens or all tokens for a user.
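
A whitelist might look roughly like the sketch below. It uses an in-memory dict as a stand-in for Redis (where a per-key TTL would do the expiry for you); the class name, method names, and injectable clock are all assumptions made for the example.

```python
import secrets
import time
from typing import Callable, Dict, Optional, Tuple


class TokenWhitelist:
    """In-memory stand-in for a store of all currently valid tokens."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.time):
        self._ttl = ttl_seconds
        self._clock = clock
        # token -> (user_id, expires_at)
        self._tokens: Dict[str, Tuple[str, float]] = {}

    def issue(self, user_id: str) -> str:
        token = secrets.token_urlsafe(24)
        self._tokens[token] = (user_id, self._clock() + self._ttl)
        return token

    def user_for(self, token: str) -> Optional[str]:
        """Return the owning user, or None if unknown, revoked, or expired."""
        entry = self._tokens.get(token)
        if entry is None:
            return None
        user_id, expires_at = entry
        if self._clock() >= expires_at:
            del self._tokens[token]  # lazy cleanup of expired tokens
            return None
        return user_id

    def revoke(self, token: str) -> None:
        """Invalidate one token."""
        self._tokens.pop(token, None)

    def revoke_all(self, user_id: str) -> None:
        """Panic button: invalidate every token belonging to a user."""
        self._tokens = {t: e for t, e in self._tokens.items() if e[0] != user_id}
```

Every request costs a lookup, which is the "stateful" price you pay for being able to revoke instantly.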

Option 2: Blacklist - store only the invalid tokens (that have not expired)

  • Should in theory be smaller than a whitelist.
    • You wouldn't expect people to hit the panic button to invalidate tokens much (unless tokens were good for many years).
  • Can invalidate individual tokens.
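
The blacklist version, sketched under the same assumptions (in-memory dict instead of Redis, made-up names). The one trick is that each revoked entry remembers the token's own expiry time, so it can be dropped once the token would have died anyway.

```python
import time
from typing import Callable, Dict


class TokenBlacklist:
    """In-memory stand-in for a store of revoked-but-not-yet-expired tokens."""

    def __init__(self, clock: Callable[[], float] = time.time):
        self._clock = clock
        # token -> that token's own expiry time
        self._revoked: Dict[str, float] = {}

    def revoke(self, token: str, expires_at: float) -> None:
        # No point remembering a token that has already expired.
        if expires_at > self._clock():
            self._revoked[token] = expires_at

    def is_revoked(self, token: str) -> bool:
        self._prune()
        return token in self._revoked

    def _prune(self) -> None:
        """Forget entries whose tokens have expired on their own."""
        now = self._clock()
        self._revoked = {t: exp for t, exp in self._revoked.items() if exp > now}

    def __len__(self) -> int:
        return len(self._revoked)
```

This only works if the normal expiry check still runs on every token: once an entry is pruned, is_revoked returns False, and it is the expiry check that rejects the token.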

Option 3: Invalidation Timestamp in Database

  • On each logout (panic) or password reset, store the current time in the db for the user.
    • All tokens presented for a user must have an issued_at time that is after the logout time.
  • Can be kept in database or Redis-style persistent cache.
  • Cannot invalidate an individual token; can only invalidate all tokens for a user.
  • If the cache gets lost, what do you assume?
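
The timestamp check is tiny, so here is a sketch of it. Names are hypothetical, and note the assumption baked in: a user with no stored timestamp is treated as "never invalidated," which is exactly the judgment call the last bullet is asking about (failing closed is just as defensible).

```python
from typing import Dict


class InvalidationTimes:
    """Per-user "all tokens issued up to this moment are dead" timestamps."""

    def __init__(self) -> None:
        self._invalidated_at: Dict[str, float] = {}

    def invalidate_user(self, user_id: str, now: float) -> None:
        """Call on logout (panic) or password reset."""
        self._invalidated_at[user_id] = now

    def accepts(self, user_id: str, issued_at: float) -> bool:
        """A token is acceptable only if issued after the last invalidation."""
        cutoff = self._invalidated_at.get(user_id)
        if cutoff is None:
            # Assumption: no record means the user never hit the panic button.
            return True
        return issued_at > cutoff
```

The issued_at value has to come from somewhere trustworthy, such as a signed claim in the token or the row in your token table.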

Other options include blacklisting IPs, but that's part of a different story....

How do you refresh?

Option 1: Require username and password to refresh

  • Only applies to very long-lived tokens.
  • Might be the best choice if tokens have a very long lifetime.
  • Should not be automatic, since that would imply client storage of username/password, which is no good.
  • Hopefully does not kick in when user is in the middle of doing something. :)

Option 2: Token itself can be used to refresh

  • Presenting an unexpired valid token to get a new one is very similar to a long-lived token.
    • If compromised, the attacker gets to keep refreshing and can do damage for a long time (until detected).
    • Perhaps a short-lived token minimizes the chances that a token would get stolen.
      • Though XSS to get tokens from local storage can be done even with short-lived tokens.
  • If token life is too short, clients need frequent refresh and need to be careful not to miss their window.

Option 3: Refresh token

  • A very long-lived (or never expiring) refresh token used only to get new access tokens after expiry is less likely to get compromised because it is used so infrequently.
    • Although XSS can get it if a web client has it in local storage.
  • Losing a refresh token is bad, though it might be rarer.
  • Change of password could invalidate refresh tokens too.
    • Lots of coding here.
  • OAuth uses refresh tokens, but they can be used even if you manage auth yourself.
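
Here is a rough, in-memory sketch of the refresh-token idea when you manage auth yourself: a short-lived access token for everyday requests, plus a long-lived refresh token that is only ever presented to get a new access token. The class and method names are made up, and the password check is assumed to have happened before login is called.

```python
import secrets
import time
from typing import Callable, Dict, Optional, Tuple


class TokenService:
    """Hypothetical issuer of short-lived access tokens and refresh tokens."""

    def __init__(self, access_ttl: float = 300.0, clock: Callable[[], float] = time.time):
        self._access_ttl = access_ttl
        self._clock = clock
        self._access: Dict[str, Tuple[str, float]] = {}  # access token -> (user, expires_at)
        self._refresh: Dict[str, str] = {}               # refresh token -> user

    def _new_access(self, user_id: str) -> str:
        token = secrets.token_urlsafe(24)
        self._access[token] = (user_id, self._clock() + self._access_ttl)
        return token

    def login(self, user_id: str) -> Tuple[str, str]:
        """Return (access_token, refresh_token); call only after verifying the password."""
        refresh = secrets.token_urlsafe(32)
        self._refresh[refresh] = user_id
        return self._new_access(user_id), refresh

    def refresh(self, refresh_token: str) -> Optional[str]:
        """Trade a valid refresh token for a new access token."""
        user_id = self._refresh.get(refresh_token)
        return self._new_access(user_id) if user_id is not None else None

    def user_for(self, access_token: str) -> Optional[str]:
        entry = self._access.get(access_token)
        if entry is None or self._clock() >= entry[1]:
            return None
        return entry[0]

    def revoke_refresh_tokens(self, user_id: str) -> None:
        """E.g. on password change: kill every refresh token for the user."""
        self._refresh = {t: u for t, u in self._refresh.items() if u != user_id}
```

In this sketch, revoking refresh tokens does not kill access tokens that are already out there; they simply run out on their own, which is why the access lifetime is kept short.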

Client-side storage of tokens

Watch out for both XSS and CSRF, among other things.

JWTs (Stateless) vs Opaque (Stateful) Tokens

  • JWTs
    • in theory mean you never need to consult a db or persistent cache.
    • good when you have third-party type things (login via social network, OAuth, Auth0) where statelessness, federation, and distributed auth are a big deal.
  • Opaque tokens (e.g. 24-char secure random string).
    • Database table indexed by token, can store anything you want.
    • Fine if rolling your own (you store credentials and tokens yourself).
    • Easy to implement with Redis:
      • Expire with TTL.
      • Invalidate by removing from store.

Possible Recommendations

These have not been vetted by a security professional; I'm just thinking out loud. (Use at own risk, author not responsible, yada yada.)

  • If you have a public API and you let people log in with 3rd party credentials, go with OAuth. Or better, use a provider like Auth0.
  • If you want to manage credentials yourself:
    • The easiest ways (not necessarily the most secure) would be either:
      • Super long-lived tokens with no refresh.
        • When the token expires, require username/password (login).
        • Although no refresh, we will need to invalidate, so whitelist/blacklist/timestamp whatever....
      • Super short-lived tokens with no invalidation.
        • But refresh with a refresh token you get from user/pass login instead of self-refreshing tokens.
          • Because if there's no invalidation, self-refreshing tokens that are compromised are bad news.
        • Refresh tokens can last for months or years, so the client is responsible for keeping them safe.
          • But maybe then your refresh token needs to be revocable!
            • So invalidation is needed too, but only for the refresh tokens; the access tokens can be JWTs.
    • What about medium life with refresh and invalidation? Dunno.

Further Reading

Comments on this gist are welcome.

