@coolaj86
Last active December 16, 2022 18:42
for base62 in javascript / go / pseudocode + github base62 token format

Update

See https://github.com/therootcompany/base62-token.js.

Pseudocode

const DICT = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
const PREFIX_LEN = 4
const CHECKSUM_LEN = 6

func GenerateBase62Token(prefix string, entropyLen int) string {
    entropy := []byte{}
    for i in 0..entropyLen {
        // in real code, draw from a cryptographically secure source
        index := math.RandomInt(62) // 0..61
        char := DICT[index]
        entropy = append(entropy, char)
    }
    // the checksum covers the entropy characters only, not the prefix
    chksum := crc32.Checksum(entropy) // uint32

    // left-pad with DICT[0] ("0") so the checksum is always CHECKSUM_LEN chars
    pad := CHECKSUM_LEN
    chksum62 := base62.Encode(DICT, chksum, pad)

    // ex: "ghp_" + "zQWBuTSOoRi4A9spHcVY5ncnsDkxkJ" + "0mLq17"
    return prefix + string(entropy) + chksum62
}

func VerifyBase62Token(token string) bool {
    // prefix is not used
    entropy := token[PREFIX_LEN:len(token)-CHECKSUM_LEN]
    chksum := base62.Decode(DICT, token[len(token)-CHECKSUM_LEN:]) // uint32

    return crc32.Checksum(entropy) == chksum
}
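
The pseudocode above can be turned into a runnable Go sketch. This is one possible reading, not a confirmed description of GitHub's implementation: it assumes the CRC32 uses the IEEE polynomial (`crc32.ChecksumIEEE`) and that the checksum is left-padded with "0". The helper names `encodeBase62` and `decodeBase62` are made up for illustration.

```go
package main

import (
	"crypto/rand"
	"fmt"
	"hash/crc32"
	"math/big"
	"strings"
)

const (
	dict        = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
	prefixLen   = 4
	checksumLen = 6
)

// encodeBase62 writes n in base62, left-padded with dict[0] to at least pad characters.
func encodeBase62(n uint32, pad int) string {
	out := []byte{}
	for v := n; v > 0; v /= 62 {
		out = append([]byte{dict[v%62]}, out...)
	}
	for len(out) < pad {
		out = append([]byte{dict[0]}, out...)
	}
	return string(out)
}

// decodeBase62 reverses encodeBase62; ok is false on characters
// outside dict or on values that do not fit in a uint32.
func decodeBase62(s string) (n uint32, ok bool) {
	var v uint64
	for i := 0; i < len(s); i++ {
		idx := strings.IndexByte(dict, s[i])
		if idx < 0 {
			return 0, false
		}
		v = v*62 + uint64(idx)
		if v > 0xFFFFFFFF {
			return 0, false
		}
	}
	return uint32(v), true
}

// GenerateBase62Token returns prefix + entropyLen random characters + checksum.
func GenerateBase62Token(prefix string, entropyLen int) (string, error) {
	entropy := make([]byte, entropyLen)
	for i := range entropy {
		// rand.Int draws uniformly from [0, 62) using the OS CSPRNG.
		idx, err := rand.Int(rand.Reader, big.NewInt(62))
		if err != nil {
			return "", err
		}
		entropy[i] = dict[idx.Int64()]
	}
	sum := crc32.ChecksumIEEE(entropy) // assumption: IEEE polynomial
	return prefix + string(entropy) + encodeBase62(sum, checksumLen), nil
}

// VerifyBase62Token recomputes the checksum over the entropy part only.
func VerifyBase62Token(token string) bool {
	if len(token) <= prefixLen+checksumLen {
		return false
	}
	split := len(token) - checksumLen
	want, ok := decodeBase62(token[split:])
	return ok && crc32.ChecksumIEEE([]byte(token[prefixLen:split])) == want
}

func main() {
	token, err := GenerateBase62Token("ghp_", 30)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(token), VerifyBase62Token(token)) // prints "40 true"
}
```

Because the checksum depends only on the token's own characters, a check like `VerifyBase62Token` can reject most mistyped or truncated tokens without a database lookup.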

GitHub Token Breakdown

The 40-character tokens are broken down into 3 consecutive parts:

pre_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxcccccc

  • Prefix: 4-char (ex: ghx_)
  • Entropy: 30-char (178-bits + leading 0 padding)
    • BITS_PER_CHAR = Math.log(62) / Math.log(2) // about 5.9541
    • BITS_PER_CHAR * 30 // about 178.6258
  • Checksum: 6-char CRC32 (32 bits, 4 bytes, 6 base62 characters)
    • BITS_PER_CHAR * 5 // about 29.7709 (too few for 32 bits)
    • BITS_PER_CHAR * 6 // about 35.7251

Prefix  Entropy                         Checksum
pre_    xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx  cccccc

See

"Base X" Implementations

There appears to be a "de facto" standard that has a few different names:

Other implementations

Integer

Buffer:

Example Tokens

Here are 20 Personal Access Tokens (with no privileges) that I generated for inspection. They expired yesterday and were also automatically revoked.

I doubt that it will be possible to determine how GitHub generates the checksum. If I were them, I would scramble the base62 dictionary and keep it as a server-side secret, so that attackers still have to run up against rate limits to check tokens rather than having the advantage of checking them offline.

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
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
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
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
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

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
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
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
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
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 

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
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
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
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
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

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
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
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
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
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