See https://github.com/therootcompany/base62-token.js.
// Go-flavored pseudocode: math.RandomInt, base62.Encode, and base62.Decode
// are placeholders, not real packages.
const DICT = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
const PREFIX_LEN = 4
const CHECKSUM_LEN = 6

func GenerateBase62Token(prefix string, length int) string {
	entropy := []byte{}
	for 0..length {
		index := math.RandomInt(62) // 0..61
		char := DICT[index]
		entropy = append(entropy, char)
	}
	chksum := crc32.Checksum(entropy) // uint32
	pad := CHECKSUM_LEN
	chksum62 := base62.Encode(DICT, chksum, pad)
	// ex: "ghp_" + "zQWBuTSOoRi4A9spHcVY5ncnsDkxkJ" + "0mLq17"
	return prefix + string(entropy) + chksum62
}

func VerifyBase62Token(token string) bool {
	// prefix is not used
	entropy := token[PREFIX_LEN:len(token)-CHECKSUM_LEN]
	chksum := base62.Decode(DICT, token[len(token)-CHECKSUM_LEN:]) // uint32
	return crc32.Checksum(entropy) == chksum
}
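The pseudocode above leaves the random source and the base62 helpers abstract. Here is a minimal runnable sketch in Go, assuming CRC-32 with the IEEE polynomial (the pseudocode does not name a variant) and `crypto/rand` as the random source. `encode62` and `decode62` are hypothetical helpers standing in for `base62.Encode` and `base62.Decode`, and the generator returns an error alongside the token, which the pseudocode does not.

```go
package main

import (
	"crypto/rand"
	"fmt"
	"hash/crc32"
	"math/big"
	"strings"
)

const (
	dict        = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
	prefixLen   = 4
	checksumLen = 6
)

// encode62 writes n in base62 using dict, left-padded with "0" to pad characters.
// It assumes pad is wide enough for n; 6 characters always fit a uint32.
func encode62(n uint32, pad int) string {
	out := make([]byte, pad)
	for i := pad - 1; i >= 0; i-- {
		out[i] = dict[n%62]
		n /= 62
	}
	return string(out)
}

// decode62 reads a base62 string back into an integer.
// ok is false if s contains a character outside dict.
func decode62(s string) (n uint64, ok bool) {
	for i := 0; i < len(s); i++ {
		d := strings.IndexByte(dict, s[i])
		if d < 0 {
			return 0, false
		}
		n = n*62 + uint64(d)
	}
	return n, true
}

// GenerateBase62Token returns prefix + n random base62 characters + checksum.
func GenerateBase62Token(prefix string, n int) (string, error) {
	entropy := make([]byte, n)
	for i := range entropy {
		index, err := rand.Int(rand.Reader, big.NewInt(62)) // uniform in 0..61
		if err != nil {
			return "", err
		}
		entropy[i] = dict[index.Int64()]
	}
	chksum := crc32.ChecksumIEEE(entropy)
	return prefix + string(entropy) + encode62(chksum, checksumLen), nil
}

// VerifyBase62Token recomputes the checksum over the entropy part.
// The prefix is skipped by length and not otherwise inspected.
func VerifyBase62Token(token string) bool {
	if len(token) < prefixLen+checksumLen {
		return false
	}
	split := len(token) - checksumLen
	want, ok := decode62(token[split:])
	return ok && uint64(crc32.ChecksumIEEE([]byte(token[prefixLen:split]))) == want
}

func main() {
	token, err := GenerateBase62Token("pre_", 30)
	if err != nil {
		panic(err)
	}
	fmt.Println(token, VerifyBase62Token(token))
}
```

The length check and the `ok` result from `decode62` make malformed input fail verification instead of panicking or being silently accepted.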
The 40-character tokens are broken down into 3 consecutive parts:
pre_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxcccccc
- Prefix: 4-char (ex: ghx_)
- Entropy: 30-char (178-bits + leading 0 padding)
BITS_PER_CHAR = Math.log(62) / Math.log(2) // about 5.9541
BITS_PER_CHAR * 30 // about 178.6258
- Checksum: 6-char CRC32 (32-bits, 4 bytes, 6 base62 characters)
BITS_PER_CHAR * 6 // about 35.7251
Prefix | Entropy | Checksum |
---|---|---|
pre_ | xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx | cccccc |
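Because the prefix and checksum have fixed widths, a token can be split by position alone. A minimal sketch in Go follows; `splitToken` is a hypothetical helper, and the 4- and 6-character widths are taken from the layout above.

```go
package main

import "fmt"

const (
	prefixLen   = 4 // e.g. "ghp_"
	checksumLen = 6 // base62-encoded CRC32
)

// splitToken cuts a token into prefix, entropy, and checksum by position.
// ok is false when the token is too short to contain all three parts.
func splitToken(token string) (prefix, entropy, checksum string, ok bool) {
	if len(token) <= prefixLen+checksumLen {
		return "", "", "", false
	}
	end := len(token) - checksumLen
	return token[:prefixLen], token[prefixLen:end], token[end:], true
}

func main() {
	prefix, entropy, checksum, ok := splitToken("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")
	fmt.Println(prefix, entropy, checksum, ok)
}
```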
See
- https://github.blog/2021-04-05-behind-githubs-new-authentication-token-formats/
- https://github.blog/changelog/2021-09-23-npm-has-a-new-access-token-format/
There appears to be a "de facto" standard that has a few different names:
- Base X
- Bitcoin Base58/Base62
- GMP Base62
- GnuGP Base62
Integer
Buffer:
- jxskiss/base62#1
- https://cs.opensource.google/go/go/+/refs/tags/go1.17.5:src/math/bits/bits_tables.go;drc=refs%2Ftags%2Fgo1.17.5;l=63
- https://github.com/lytics/base62
- https://codeberg.org/ac/base62
Here are 20 Personal Access Tokens (with no privileges) that I generated for inspection. They expired yesterday and were also automatically revoked.
I doubt that it will be possible to determine how GitHub generates the checksum. If I were them, I would scramble the base62 dictionary as a server-side secret, so that attackers still have to run up against rate limits to check tokens rather than gaining the advantage of checking them offline.
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
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
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
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
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
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
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
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
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
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
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
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
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
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
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
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
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
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
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
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
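One way to inspect these tokens is to test a simple hypothesis against them: that the last 6 characters are the CRC-32 (IEEE) of the 30 entropy characters, written in base62 with the `0-9A-Za-z` dictionary and left-padded with `0`. The Go sketch below only reports whether each token matches that hypothesis. It assumes nothing about how GitHub actually computes the checksum, and `expectedChecksum` and `matches` are hypothetical helper names.

```go
package main

import (
	"fmt"
	"hash/crc32"
)

const dict = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"

// expectedChecksum returns the CRC-32 (IEEE) of entropy as 6 base62 characters,
// left-padded with "0". This is the hypothesis under test, not a known GitHub algorithm.
func expectedChecksum(entropy string) string {
	n := crc32.ChecksumIEEE([]byte(entropy))
	out := make([]byte, 6)
	for i := len(out) - 1; i >= 0; i-- {
		out[i] = dict[n%62]
		n /= 62
	}
	return string(out)
}

// matches reports whether a 40-character token ends in the expected checksum.
func matches(token string) bool {
	if len(token) != 40 {
		return false
	}
	return expectedChecksum(token[4:34]) == token[34:]
}

func main() {
	// A few of the sample tokens listed above.
	samples := []string{
		"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",
		"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",
		"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",
	}
	for _, t := range samples {
		fmt.Printf("token=%s computed=%s match=%v\n", t[34:], expectedChecksum(t[4:34]), matches(t))
	}
}
```

A mismatch would not by itself show that the dictionary is scrambled. A different CRC variant, checksum input, or padding rule would produce the same result.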