Random Thoughts on Authentication in REST APIs

Disclaimer: I am not an auth expert, but I know someone who is. All errors and ramblings below are mine, not theirs.

First, some considerations:

Is your API a public data service or just the "backend piece" of your own web or mobile app?
Are you employing a third party (e.g., "login with Facebook") or are you managing credentials yourself?
Basic auth or token auth?
Stateless (token fully contains verification info) or stateful (must consult data store every time to verify) tokens?
Long-lived tokens or short-lived tokens?
How do you (if at all) refresh tokens?
How do you (if at all) invalidate tokens?
If using encryption, is it symmetric or asymmetric?
How do you pass credentials, tokens, or keys? (NOT in query parameters, at least!)
Are you careful about your CORS headers?
You know what CSRF and XSS attacks are, and whether they apply or not, right?
Where on the client do you store (if at all) credentials and/or tokens?

API Purpose

If your API is part of a user-facing web or mobile app that you own, then auth is about sessions.
- CORS required here, but please restrict to your web client's domain.
- If you are managing credentials yourself:
  - DO NOT USE JWTs or any stateless solution.
- If you aren't:
  - Then three-legged OAuth is great, so understand apps, access tokens, and refresh tokens.
  - Consider an auth provider (e.g. Auth0) or library (e.g. Passport) and let it do the work for you.
If your API is a public-facing service or provided to your customers so they can build their own web or mobile apps, then auth is about API keys.
- Not saying we have to create actual "keys," just saying there are no traditional sessions with login and logout.
- Keys and tokens are similar (a token may as well be a type of API key), so the discussion about tokens below applies.
- You should NOT have CORS headers! Make your clients hit the API from their servers, not their web pages.
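
To make the "restrict to your web client's domain" advice concrete, here is a minimal sketch of picking CORS response headers. The origin `https://app.example.com` and the helper name `cors_headers` are made up for illustration; a real app would usually lean on its framework's CORS support instead.

```python
from typing import Dict, Optional

# Hypothetical allowlist: only the web client you own.
ALLOWED_ORIGINS = {"https://app.example.com"}


def cors_headers(origin: Optional[str]) -> Dict[str, str]:
    """Return CORS headers for an allowed origin, or none at all."""
    if origin in ALLOWED_ORIGINS:
        # Echo the specific origin (never "*") and tell caches that
        # the response varies by Origin.
        return {"Access-Control-Allow-Origin": origin, "Vary": "Origin"}
    # Unknown origin, or no Origin header: send no CORS headers.
    return {}
```

For the public, server-to-server style of API, the same function with an empty allowlist gives the "no CORS headers" behavior.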

Basic auth - send username and password on every request

Pros:
- Super easy to implement.
- Trivial to "revoke": if compromised, just change username and password.
Cons:
- They are the most important credentials: you should minimize the number of times they are passed around.
- If they are compromised, it's game over, pretty much:
  - Significant damage can be done with stolen credentials.
  - An attacker who steals the username/password can use them to change the password and lock out the legitimate owner,
    - unless password reset requires email to the person who originally signed up.
- Slow!
  - Passwords stored as digests with good hashing algorithm are intentionally very slow to mitigate brute-force attacks
  - So basic auth makes the API slow since you compute the digest every time
  - You did hash the password, right? And use lots of iterations?
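
A rough sketch of why basic auth is slow: every request means decoding the header and re-running a deliberately expensive hash. This uses PBKDF2 from the standard library as one example of a slow hash; the iteration count and the function names are my assumptions, not a vetted recommendation.

```python
import base64
import hashlib
import hmac
import os
from typing import Optional, Tuple

DEFAULT_ITERATIONS = 600_000  # assumed work factor; tune for your hardware


def hash_password(password: str, salt: Optional[bytes] = None,
                  iterations: int = DEFAULT_ITERATIONS) -> Tuple[bytes, bytes]:
    """Return (salt, digest); a fresh random salt is made if none is given."""
    salt = os.urandom(16) if salt is None else salt
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"),
                                 salt, iterations)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes,
                    iterations: int = DEFAULT_ITERATIONS) -> bool:
    """Recompute the slow hash and compare in constant time."""
    _, digest = hash_password(password, salt, iterations)
    return hmac.compare_digest(digest, expected)


def parse_basic_header(header: str) -> Tuple[str, str]:
    """Split 'Basic <base64(user:pass)>' into (username, password)."""
    scheme, _, encoded = header.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic authorization header")
    decoded = base64.b64decode(encoded).decode("utf-8")
    username, sep, password = decoded.partition(":")
    if not sep:
        raise ValueError("missing ':' between username and password")
    return username, password
```

With basic auth, `verify_password` runs on every single request, which is exactly the cost the bullets above complain about.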

Token auth - long-lived tokens

Pros:
- Tokens (in general) mean you don't have to pass the username and password around much.
- Long-lived tokens are convenient for legitimate users: they don't have to "log in" very often, only on expiration.
- No need for a refresh mechanism: entering username and password once a month or once a year is not a big deal.
  - If password is forgotten, a password reset to the owner's email is fine.
Cons:
- Long-lived tokens increase the chance of compromise.
- For long-lived tokens there MUST be a panic button to invalidate all tokens.
  - This of course means your tokens are stateful.
  - If tokens are in a whitelist, invalidation is trivial, just remove from the data store, easy.
  - Can also add to a blacklist (keep the blacklist small by dropping tokens once they would have expired; if they never expire, good luck).
    - Maybe for long-lived tokens, invalidation via logout time or password reset is better
  - For whitelists, limit the number of tokens allocated per user so the whitelist does not get too big.

Token auth - very short-lived tokens

Pros:
- If super short-lived, no need for invalidation.
  - The time it takes to fire up an invalidation might be about the same as letting the tokens expire.
Cons:
- Refresh has to be used, since without refresh you are close to basic auth (too many passes of username/password).
  - Refresh has some problems (see below).
  - If you have automatic refresh where a token can be used to get another token, then that's just a long-lived token.

On Passwords vs. Tokens

What is the difference between a password and a very, very long-lived token?
- Well, both have a long life, but passwords still live "longer."
- A password is the "most important" and "most sensitive" credential.
- Tokens are meant to be easy to revoke; password changes go through a reset process.
- You expect to hash/digest a password (slowly, on purpose) to protect against brute-force attacks; token validation should never hit that kind of performance cost, since we expect it to be super fast.
What is the difference between a long-lived token and short-lived tokens that you can automatically refresh?
- I don't know, probably nothing.
How do you pass tokens?
- Generally in the Authorization header with type Bearer, though you could b64-encode it together with a fake or real username and carry it with Authorization: Basic. 🙂
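
For the common case (Authorization header with type Bearer), pulling the token out on the server might look like this. The function name is made up; most frameworks or auth libraries will do this for you.

```python
from typing import Optional


def extract_bearer_token(authorization_header: Optional[str]) -> Optional[str]:
    """Return the token from 'Bearer <token>', or None if malformed."""
    if not authorization_header:
        return None
    parts = authorization_header.split()
    # Expect exactly two parts: the scheme and the token.
    if len(parts) != 2 or parts[0].lower() != "bearer":
        return None
    return parts[1]
```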

How do you invalidate?

If you need a way to invalidate tokens right now, because you suspect a token has been stolen, you need to store some info somewhere and check presented tokens against it. Whether you use a whitelist or blacklist, think about how this info is distributed among microservices. This also means JWTs can't be used to their fullest advantage, since you are consulting a store anyway.

Option 1: Whitelist - store all valid tokens

This can get huuuuuuuuuge!
- Use Redis or similar so the tokens can be automatically removed on expiration.
- Probably not good if there are tens of millions of valid tokens out there.
Can invalidate individual tokens or all tokens for a user.
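
A minimal in-memory sketch of the whitelist idea, standing in for Redis or a database table. The class and method names are hypothetical, and the per-user cap (evict the token closest to expiry) is just one way to keep the list from growing.

```python
import secrets
import time
from typing import Dict, Optional, Tuple


class TokenWhitelist:
    """Stores every valid token; anything not in the store is rejected."""

    def __init__(self, ttl_seconds: float, max_per_user: int = 5) -> None:
        self.ttl = ttl_seconds
        self.max_per_user = max_per_user
        self._tokens: Dict[str, Tuple[str, float]] = {}  # token -> (user, expires_at)

    def _purge(self, now: float) -> None:
        # Stand-in for the automatic TTL expiry a store like Redis gives you.
        for token in [t for t, (_, exp) in self._tokens.items() if exp <= now]:
            del self._tokens[token]

    def issue(self, user: str, now: Optional[float] = None) -> str:
        now = time.time() if now is None else now
        self._purge(now)
        mine = sorted((exp, t) for t, (u, exp) in self._tokens.items() if u == user)
        while len(mine) >= self.max_per_user:
            _, oldest = mine.pop(0)  # evict the token closest to expiry
            del self._tokens[oldest]
        token = secrets.token_urlsafe(32)
        self._tokens[token] = (user, now + self.ttl)
        return token

    def user_for(self, token: str, now: Optional[float] = None) -> Optional[str]:
        now = time.time() if now is None else now
        entry = self._tokens.get(token)
        if entry is None or entry[1] <= now:
            return None
        return entry[0]

    def revoke(self, token: str) -> None:
        self._tokens.pop(token, None)

    def revoke_all(self, user: str) -> None:
        for token in [t for t, (u, _) in self._tokens.items() if u == user]:
            del self._tokens[token]
```

`revoke` handles the single stolen token; `revoke_all` is the panic button.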

Option 2: Blacklist - store only the invalid tokens (that have not expired)

Should in theory be smaller than a whitelist.
- You wouldn't expect people to hit the panic button to invalidate tokens much (unless tokens were good for many years).
Can invalidate individual tokens.
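
The blacklist version, again as an in-memory sketch with made-up names. It assumes each token carries some identifier and a known expiry time, so a revoked entry can be dropped once the token would have expired anyway.

```python
import time
from typing import Dict, Optional


class TokenBlacklist:
    """Stores only revoked tokens, and only until they would have expired."""

    def __init__(self) -> None:
        self._revoked: Dict[str, float] = {}  # token id -> token's expiry time

    def _purge(self, now: float) -> None:
        # Expired tokens get rejected on expiry alone, so forget them here.
        for token_id in [t for t, exp in self._revoked.items() if exp <= now]:
            del self._revoked[token_id]

    def revoke(self, token_id: str, expires_at: float) -> None:
        self._revoked[token_id] = expires_at

    def is_revoked(self, token_id: str, now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        self._purge(now)
        return token_id in self._revoked

    def __len__(self) -> int:
        return len(self._revoked)
```

The caller still has to check expiry separately; the blacklist only answers "was this unexpired token revoked?"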

Option 3: Invalidation Timestamp in Database

On each logout (panic) or password reset, store the current time in the db for the user.
- All tokens presented for a user must have an issued_at time that is after the logout time.
Can be kept in database or Redis-style persistent cache.
Cannot invalidate an individual token; you can only invalidate all tokens for a user.
If the cache gets lost, what do you assume?
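
A sketch of the timestamp approach, with hypothetical names. It assumes every token carries an issued_at time. Note the choice baked into `is_valid`: a user with no record is treated as never invalidated, which is the "fail open" answer to the lost-cache question and may not be the one you want.

```python
import time
from typing import Dict, Optional


class InvalidationTimestamps:
    """Per-user cutoff: tokens issued at or before it are rejected."""

    def __init__(self) -> None:
        self._invalid_before: Dict[str, float] = {}  # user -> cutoff time

    def invalidate_all(self, user: str, now: Optional[float] = None) -> None:
        # Call this on logout (panic) or password reset.
        self._invalid_before[user] = time.time() if now is None else now

    def is_valid(self, user: str, issued_at: float) -> bool:
        # Assumption: no record means nothing was ever invalidated.
        cutoff = self._invalid_before.get(user, float("-inf"))
        return issued_at > cutoff
```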

Other options include blacklisting IPs, but that's part of a different story....

How do you refresh?

Option 1: Require username and password to refresh

Only applies to very long-lived tokens.
Might be the best choice if tokens have a very long lifetime.
Should not be automatic, since that would imply client storage of username/password, which is no good.
Hopefully does not kick in when user is in the middle of doing something. :)

Option 2: Token itself can be used to refresh

Presenting an unexpired valid token to get a new one is very similar to a long-lived token.
- If compromised, the attacker gets to keep refreshing and can do damage for a long time (until detected).
- Perhaps a short-lived token minimizes the chances that a token would get stolen.
  - Though XSS to get tokens from local storage can be done even with short-lived tokens.
If token life is too short, clients need frequent refresh and need to be careful not to miss their window.

Option 3: Refresh token

A very long-lived (or never expiring) refresh token used only to get new access tokens after expiry is less likely to get compromised because it is used so infrequently.
- Although XSS can get it if a web client has it in local storage.
Losing a refresh token is still bad, but it might be rarer.
Change of password could invalidate refresh tokens too.
- Lots of coding here.
OAuth uses refresh tokens, but they can be used even if you manage auth yourself.
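
Here is a toy version of the refresh-token flow if you manage auth yourself. Everything is in memory and every name is made up; password checking is assumed to have happened before `login` is called, and expired access tokens are never cleaned up.

```python
import secrets
from typing import Dict, Optional, Tuple


class AuthServer:
    """Short-lived access tokens plus a revocable, long-lived refresh token."""

    def __init__(self, access_ttl: float = 300.0) -> None:
        self.access_ttl = access_ttl
        self._refresh: Dict[str, str] = {}               # refresh token -> user
        self._access: Dict[str, Tuple[str, float]] = {}  # token -> (user, expires_at)

    def _new_access(self, user: str, now: float) -> str:
        token = secrets.token_urlsafe(32)
        self._access[token] = (user, now + self.access_ttl)
        return token

    def login(self, user: str, now: float) -> Tuple[str, str]:
        """Assumes the caller already verified the username and password."""
        refresh_token = secrets.token_urlsafe(32)
        self._refresh[refresh_token] = user
        return self._new_access(user, now), refresh_token

    def refresh(self, refresh_token: str, now: float) -> Optional[str]:
        """Trade a still-valid refresh token for a new access token."""
        user = self._refresh.get(refresh_token)
        return None if user is None else self._new_access(user, now)

    def user_for(self, access_token: str, now: float) -> Optional[str]:
        entry = self._access.get(access_token)
        if entry is None or entry[1] <= now:
            return None
        return entry[0]

    def password_changed(self, user: str) -> None:
        # A password change revokes every refresh token for that user.
        for token in [t for t, u in self._refresh.items() if u == user]:
            del self._refresh[token]
```

Access tokens already issued keep working until they expire, which is the tradeoff of skipping invalidation for them.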

Client-side storage of tokens

Watch out for both XSS and CSRF, among other things

Where to store JWTs - Stormpath article
Where to store tokens - Auth0
StackOverflow has a bunch of these questions

JWTs (Stateless) vs Opaque (Stateful) Tokens

JWTs
- in theory mean you never need to consult a db or persistent cache.
- good when you have third-party type things (login via social network, OAuth, Auth0) where statelessness, federation, and distributed auth are a big deal.
Opaque tokens (e.g. 24-char secure random string).
- Database table indexed by token, can store anything you want.
- Fine if rolling your own (you store credentials and tokens yourself).
- Easy to implement with Redis:
  - Expire with TTL.
  - Invalidate by removing from store.
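
To show the stateless vs. opaque difference in code: below, `sign_token`/`verify_token` are a hand-rolled HMAC-signed token that only illustrates the idea behind a JWT (it is NOT the JWT format; use a real JWT library for that), and `new_opaque_token` makes a 24-char random string that means nothing without a lookup. All names are hypothetical.

```python
import base64
import hashlib
import hmac
import json
import secrets
import time
from typing import Optional


def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _unb64(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _signature(payload: str, key: bytes) -> str:
    return _b64(hmac.new(key, payload.encode("utf-8"), hashlib.sha256).digest())


def sign_token(claims: dict, key: bytes) -> str:
    """Stateless: the claims travel inside the token, protected by an HMAC."""
    payload = _b64(json.dumps(claims, sort_keys=True).encode("utf-8"))
    return payload + "." + _signature(payload, key)


def verify_token(token: str, key: bytes,
                 now: Optional[float] = None) -> Optional[dict]:
    """Check signature and 'exp' without touching any data store."""
    payload, sep, sig = token.partition(".")
    if not sep:
        return None
    expected = _signature(payload, key)
    if not hmac.compare_digest(sig.encode("utf-8"), expected.encode("utf-8")):
        return None
    claims = json.loads(_unb64(payload))
    now = time.time() if now is None else now
    if claims.get("exp", 0) <= now:
        return None
    return claims


def new_opaque_token() -> str:
    """Opaque: 18 random bytes become 24 URL-safe characters."""
    return secrets.token_urlsafe(18)
```

The opaque token needs a store like the ones sketched in the invalidation section; the signed one needs only the key, which is also why you can't revoke it without adding state back.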

Possible Recommendations

These have not been vetted by a security professional; I'm just thinking out loud. (Use at own risk, author not responsible, yada yada.)

If you have a public API and you let people log in with 3rd party credentials, go with OAuth. Or better, use a provider like Auth0.
If you want to manage credentials yourself:
- The easiest ways (not necessarily the most secure) would be either:
  - Super long-lived tokens with no refresh.
    - When the token expires, require username/password (login).
    - Although no refresh, we will need to invalidate, so whitelist/blacklist/timestamp whatever....
  - Super short-lived tokens with no invalidation.
    - But refresh with a refresh token you get from user/pass login instead of self-refreshing tokens.
      - Because if there's no invalidation, self-refreshing tokens that are compromised are bad news.
    - Refresh token can last for months or years, so client responsible for keeping them safe.
      - But maybe then your refresh token needs to be revocable!
        So invalidation is needed too, but only for the refresh tokens; the access tokens can be JWTs.
- What about medium life with refresh and invalidation? Dunno.

rtoal/rest-auth-notes.md