Claude Shannon says:
The entropy is a statistical parameter which measures in a certain sense, how much information is produced on the average for each letter of a text in the language. If the language is translated into binary digits (0 or 1) in the most efficient way, the entropy H is the average number of binary digits required per letter of the original language.
-- NIST Special Publication 800-63-1 - Appendix A: Estimating Password Entropy and Strength
Session IDs are often used to identify a user of a web application. If an ID is stolen, for example through a MITM (Man In The Middle) attack, the attacker can hijack the user's session. To keep the ID from being leaked, stolen, or guessed, it must be implemented robustly.
Generally, the ID is generated by a web library, framework, or language runtime, and it is assigned to the client via a cookie when necessary.
Many developers understand well how important it is to control when the ID is assigned in order to avoid a session fixation attack. However, it is just as important to check whether the ID has enough bit strength to withstand a brute-force attack.
OWASP defines the minimum bit length of a session ID as follows:
Session identifiers should be at least 128 bits long to prevent brute-force session guessing attacks.
In brief, what matters for a session ID is its bit length, not how long it looks.
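As a point of reference, here is a minimal sketch of what a 128-bit ID can look like in Python. It assumes the standard library `secrets` module is used as the random source; it is only an illustration, not how any particular framework generates its IDs.

```python
import secrets

# 16 random bytes are 128 bits. Rendered as hexadecimal, each character
# carries 4 bits, so the ID is 32 characters long.
session_id = secrets.token_hex(16)
print(session_id, len(session_id))
```

The visible length (32 characters) says little by itself; the 128 bits come from 32 characters each drawn from 16 possible values.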
As an extreme example, let's suppose that the system generates the following session ID:
a00a0a1001100a000a0aaaa1aa0000aaa00a0a1001100a000a0aaaa1aa0000aa
This session ID is 64 characters long, which is longer than the IDs generated by common web frameworks.
But the ID contains only 3 types of characters: a, 0 and 1. That means the system could be weak against a brute-force attack.
The bit length of this session ID is about 101 bits, calculated with the equation described later. The ID has many characters but not enough strength.
If the library has a bug in its random algorithm so that it only ever selects 3 types of characters, the result is the same, even if the library is meant to generate the ID with a PRNG (Pseudo Random Number Generator) that selects characters randomly from [0-9a-zA-Z].
Therefore, the bit length is calculated from the number of character types and the ID length. If the result is below 128 bits, the ID is weak against brute-force guessing.
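To make the relation between character types and length concrete, the following sketch computes the shortest ID length that reaches 128 bits for a given number of character types. The function name `min_length` and the example character lists are assumptions made for illustration, and the result only holds if every character is chosen uniformly and independently.

```python
import math


def min_length(charset_size, target_bits=128):
    """Smallest length l such that l * log2(charset_size) >= target_bits."""
    return math.ceil(target_bits / math.log2(charset_size))


# Number of character types for three example character lists.
for name, size in [("a, 0, 1", 3), ("0-9a-f", 16), ("0-9a-zA-Z", 62)]:
    print(name, size, min_length(size))
```

With only 3 character types, an ID would need 81 characters to reach 128 bits, so the 64-character example above falls short.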
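Calculate the bit length of a string whose characters are selected randomly from a character list with the Randomly Selected Passwords formula from the NIST appendix cited above, which is based on Claude Shannon's entropy: H = log2(b^l), where b is the number of character types and l is the length of the string.
The following code implements it in Python.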
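import math
hashv = "a00a0a1001100a000a0aaaa1aa0000aaa00a0a1001100a000a0aaaa1aa0000aa"
fb = len(set(hashv))  # b: number of distinct characters in the ID
fl = len(hashv)  # l: length of the ID
# log2(b^l) equals l * log2(b); this form avoids float overflow for long IDs
fH = fl * math.log2(fb)
print(fH)  # about 101.4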
The bit length must be calculated from facts: for example, sample 10,000 IDs to establish what the character list really is and whether the strings generated by the PRNG are biased.
For a security assessment of the session ID bit length generated by a system, it is inappropriate to determine the character list on the basis of only one session ID.
It is also inappropriate to conclude that the system uses the [a-z] character list just because one character from [a-z] (e.g., b) appears once.
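The sampling idea can be sketched roughly as follows. The samples here are simulated with Python's `secrets` module and stand in for IDs collected from a real system; the function name `assess` is hypothetical. The bias check uses the empirical per-character Shannon entropy and assumes characters are independent of each other, which a real generator may not guarantee.

```python
import math
import secrets
from collections import Counter


def assess(samples):
    """Estimate session ID bit length from many sampled IDs of equal length."""
    length = len(samples[0])
    counts = Counter("".join(samples))  # how often each character was observed
    total = sum(counts.values())
    charset = len(counts)  # b, taken from all samples rather than one ID
    upper_bits = length * math.log2(charset)  # assumes a uniform distribution
    # Empirical Shannon entropy per character; a value well below
    # log2(b) hints that some characters are favored over others.
    per_char = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return length, charset, upper_bits, length * per_char


# Simulated samples: 10,000 IDs of 32 hexadecimal characters each.
samples = [secrets.token_hex(16) for _ in range(10000)]
length, charset, upper_bits, empirical_bits = assess(samples)
print(length, charset, upper_bits, round(empirical_bits, 2))
```

If the empirical figure is clearly lower than the uniform one, the generator is probably biased and the uniform estimate overstates the strength.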
The following script is a sample for performing an assessment of a web system.
https://gist.github.com/4k1/3ffba860eaee8667ef6bacfba0c7a84f
How to use:
$ python3 calc_entropy.py www.example.com PHPSESSID
[ ] Collecting PHPSESSID...
[+] Collected.
Length : 32
Charlist : 64
Strength : 320.0 bits
[+] Ok.
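The linked script is not reproduced here. As a rough illustration of the collection step only, the sketch below requests a page repeatedly without sending any cookie back and reads a fresh ID from each Set-Cookie header. The function names `extract_cookie` and `collect_ids` and their parameters are assumptions, not the script's actual interface.

```python
import urllib.request
from http.cookies import SimpleCookie


def extract_cookie(set_cookie_headers, name):
    """Return the value of cookie `name` from Set-Cookie header values, or None."""
    for header in set_cookie_headers:
        jar = SimpleCookie()
        jar.load(header)
        if name in jar:
            return jar[name].value
    return None


def collect_ids(url, name, count=100):
    """Request `url` `count` times and gather the session IDs that are issued."""
    ids = []
    for _ in range(count):
        # The default opener keeps no cookie jar, so every request looks
        # like a new client and should receive a new session ID.
        with urllib.request.urlopen(url) as resp:
            headers = resp.headers.get_all("Set-Cookie") or []
        value = extract_cookie(headers, name)
        if value is not None:
            ids.append(value)
    return ids
```

A call such as `collect_ids("https://www.example.com/", "PHPSESSID", 10000)` would then return the samples to analyze; run this only against systems you are authorized to assess.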
Thanks,
@4k1
https://github.com/4k1
https://twitter.com/aki_81jp