Created
July 21, 2022 04:01
-
-
Save James-E-A/284fa918d232eef08551db860d4686bb to your computer and use it in GitHub Desktop.
stdlib RSA walkthrough
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
### RSA textbook-ish walkthrough ### | |
# This is WORK IN PROGRESS. | |
# Intend to cover: | |
# -- "RSA uses numbers, not words?" (DONE) | |
# -- RSA ACTUAL MATH FORMULA (DONE) | |
# -- Hybrid encryption (TODO) | |
# -- Asymmetric padding (TODO) | |
# -- Forward secrecy, part 1 (TODO) | |
# -- Encapsulation protocols (TODO) | |
# -- Forward secrecy2 part 2 (TODO) | |
# -- PQC, AKA "Teaching RSA is a prank" (TODO) | |
## INTRO ## | |
# RSA lets you "encrypt" a message, to mangle it | |
# into a so-called "ciphertext": | |
# A ciphertext looks like garbage, | |
# but it actually contains a real message! | |
# The only way to "decrypt" it | |
# (turn it back into the original message) | |
# is by knowing a SECRET, and then following | |
# the corresponding un-mangling process | |
# (known as "decryption"). | |
# As a very basic example, you can think of the | |
# newspaper "word jumble" puzzles. | |
# If the "ciphertext" is {GKA-LKI}, | |
# and the SECRET is {G=H;K=O;A=T;L=D;I=G}, | |
# then you can EASILY calculate the original message: | |
# {HOT-DOG}! | |
# What makes RSA interesting is that the SECRET, | |
# which you NEED to restore an RSA-encrypted message, | |
# is NOT NEEDED to RSA-encrypt the message in the first place! | |
# So you can have anyone on earth mangle a message | |
# in such a way that only you can un-mangle it -- | |
# without having to co-ordinate especially with | |
# each person beforehand. This is a HUGE step up | |
# from "codebooks" like the KGB needed to give | |
# their spies for encrypted communications many years ago, | |
# which required both parties to communicate beforehand, | |
# in a place that isn't eavesdropped, | |
# and they both had to keep their "key material" secret afterwards. | |
# With RSA, you can exchange the "key material" for encryption | |
# freely over a public channel as the world watches, without | |
# compromising the scheme! | |
# Technically, everything I described above applies to | |
# ANY "public-key encryption" algorithm, but RSA is the | |
# most famous example of it -- and the math powering it, | |
# uniquely, can be understood by a layman! | |
# "Elliptic-curve"-based schemes like X448 and P-256 ECDH | |
# are much cooler and more efficient, and "error-correcting- | |
# code"-based schemes will keep you safe even from eavesdroppers | |
# who posses quantum computers; but both of these require | |
# graduate-level mathematics even to build a SIMPLE demonstration of. | |
# But they all "work" the same superficially/semantically, | |
# just as a gasoline (petrol), diesel, and electric vehicles | |
# will have the same steering wheel, accelerators, and blinkers. | |
# So, with that said, let's see the "internal combustion" | |
# of RSA in action! | |
## Bootstrap yourself ## | |
# Before you can ask anyone to send you an "RSA-encrypted" message, | |
# you will first have to select two BIG prime numbers. | |
# They should be about the same length. | |
p = 33372027594978156556226010605355114227940760344767554666784520987023841729210037080257448673296881877565718986258036932062711 | |
q = 64135289477071580278790190170577389084825014742943447208116859632024532344630238623598752668347708737661925585694639798853367 | |
# (Extra credit: | |
# -- p and q MUST NOT be the same number. | |
# -- p and q should be ABOUT the same length. | |
# -- q should be a few times larger than p, to make it harder to deduce them) | |
# You will also have to select a so-called "public exponent". | |
# don't worry about this too much for now. | |
# A number of the form 2^n + 1 is generally a good choice, | |
# especially 3 or 65537. The latter is EXTREMELY popular. | |
e = 16385 # This is 0b100000000000001, AKA 2^14 + 1 | |
# (Extra credit: look up these words in quotes: | |
# -- e MUST be "coprime" to p-minus-1 times q-minus-1. | |
# -- (TECHNICALLY, it must be coprime to "euler's totient function" of p times q) | |
# -- e should be pretty small, for faster encryption and decryption. | |
# -- e should have a low "hamming weight", for much faster encryption and decryption.) | |
# You can send your PUBLIC KEY to anyone you like. | |
# They can use it to "encrypt" a message before sending it, | |
# so that no-one but you can make it out | |
# -- especially not a hacker who has wiretapped your e-mail or text messages! | |
public_key = { | |
'n': p * q, | |
'e': e | |
} | |
# Your public key does NOT contain p and q! | |
# It ONLY contains their product, which we'll call "n" | |
# (as well as the boring "e" number). | |
# You need your PRIVATE KEY to decrypt any encrypted message sent to you. | |
# Keep it secure: | |
# -- If you lose it, you will be unable to decrypt encrypted messages sent to you. | |
# Keep it safe: | |
# -- If someone steals it, they will be able to decrypt any encrypted messages sent to you that they harvested via wiretap. | |
private_key = { | |
'p': p, | |
'q': q, | |
'e': e | |
} | |
# Your "e" number is not private, but you will need it for | |
# other things. | |
## PRELIMINARY WRAPPERS ## | |
# RSA doesn't actually encrypt messages. It's just math. | |
# and math works with numbers. | |
# Specifically, discrete math works with "integers". (0, 1, 2, etc.) | |
# However, working with PURE INTEGERS isn't exactly useful. | |
# Even if you gave every word a number (apple=1, zebra=9999999), | |
# sending one of them at a time would be somewhat painful. | |
# And how would you send multi-word messages? | |
# So we take "bytes", which could be ANYTHING, | |
# including files, e-mails, pictures, and text messages, | |
# and crunch them down into integers so that RSA can deal with them. | |
# (If you're taking notes, put an ASTERISK on there^. This is technically a lie. | |
# In reality, we generally use "hybrid encryption" and/or "asymmetric padding" | |
# to make things easier and mitigate certain attacks, but worrying about those | |
# sort of things will probably not be useful when reading this section.) | |
def rsa_encrypt(pk, M): | |
n, e = _get_ne(pk) | |
# Convert message from an octet-string into an integer | |
# -- example: the string "Big Chungus" (in ASCII) | |
# -- becomes the number 80,286,854,914,723,116,587,906,419 | |
# -- We do this because RSA works on integers, fundamentally | |
# -- it's just a math formula, and you can't exponentiate "Big Chungus"! | |
m = _os2ip(M) | |
# PULL THE TRIGGER, DO THE ACTUAL ENCRYPTION | |
# (don't worry, we'll explain this next.) | |
c = _rsa_encrypt((n, e), m) | |
# Convert ciphertext from an integer into octets | |
# -- We should also convert the "ciphertext" integer that RSA gives us | |
# -- into something more amenable for sending over the internet | |
# -- after all, "Ovt Puhathf" (perhaps a "Big Chungus" ciphertext) is just 11 symbols | |
# -- but "96064521145336971767605350" is 26... | |
C = _i2osp(c, _byte_length(n)) | |
# Done | |
return C | |
def rsa_decrypt(sk, C): | |
# -- RSA doesn't TECHNICALLY use p and q for encryption. | |
# -- It needs a related number "d", | |
# -- which you can calculate from p and q and e. | |
# -- "d" is also secret, because you can use it to figure out p and q! | |
n, d = _get_nd(sk) | |
_byte_length(n) | |
# Convert ciphertext from an octet-string into an integer | |
# -- We received this ciphertext over the internet from our | |
# -- communication partner; it's time to feed it into RSA | |
# -- for decryption! | |
c = _os2ip(C) | |
# PULL THE TRIGGER, DO THE ACTUAL DECRYPTION | |
m = _rsa_decrypt((n, d), c) | |
# Convert message from an integer into octets | |
# -- now | |
M = _i2osp(m, _byte_length(n)) | |
# Done | |
return C | |
## RAW FUNCTIONS ## | |
# Basic(ish) arithmetic underlying the RSA cryptosystem | |
def _rsa_encrypt(_pk, m): | |
n, e = _pk | |
# Bounds checking | |
if not 0 <= m < n: | |
raise ValueError("message representative out of range") | |
# Encrypt | |
# -- it's literally just arithmetic! | |
# -- the ciphertext (c) | |
# -- is nothing more than | |
# -- THE REMAINDER when you take | |
# -- the message (m) to the power of the public exponent (e) | |
# -- and divide the public modulus (n) by the result! | |
c = pow(m, e, n) | |
# Done | |
return c | |
def _rsa_decrypt(_sk, c): | |
n, d = _sk | |
# Bounds checking | |
if not 0 <= c < n: | |
raise ValueError("ciphertext representative out of range") | |
# Decrypt | |
# -- just the same! | |
# -- to get the message back, just | |
# -- take THE REMAINDER of | |
# -- the public modulus (n) divided by | |
# -- the ciphertext (c) to the power of the private exponent (d)! | |
m = pow(c, d, n) | |
# Done | |
return m | |
def _rsa(_k, x): | |
# -- in fact, if you wanted to look at it THIS way, | |
# -- encryption and decryption are just the same thing in RSA! | |
modulus, exponent = _k | |
if not 0 <= x < n: | |
raise ValueError("input representative out of range") | |
y = pow(x, exponent, modulus) | |
return y | |
# -- ...but that's kind of goofy, and might be confusing later. | |
# -- nevermind. | |
del _rsa | |
def _rsa_decrypt_mp(sk, c): | |
"https://datatracker.ietf.org/doc/html/rfc8017#page-14" | |
# -- (Don't look at this function, either -- I'm not finished writing it!) | |
raise NotImplemented("TODO") | |
# ANNOYING GARBAGE # | |
# There is no (free) magic in programming. | |
# You can't just write "turn_this_number_into_letters(12345)!" | |
# and have it work automatically. | |
# SOMEONE had to create every one of the random, idiosyncratic functions | |
# that the above cool and useful functions relied on internally. | |
# It's like building a car: you still have to build the engine, | |
# the brakes, the fuel injectors, etc., or the car won't do anything. | |
# Below here, you can watch the sausage being made, if you please -- | |
# to observe the silent pillars holding up the castle above, | |
# just know that the following functions are pretty boring EVEN IF you're a programmer. | |
def _i2osp(i: int, k: int): | |
"""Integer-to-octet-string primitive | |
https://datatracker.ietf.org/doc/html/rfc8017#section-4.1 | |
""" | |
# This function just takes a number i (e.g. 7) | |
# and turns it into an octet-string (e.g. b'\x00\x07') | |
# with the specified number of octets, k (e.g. 2) | |
# It does so in the so-called "big endian" format, | |
# where the smallest digits go into the end of the octet-string | |
# and the biggest digits go into the start of the octet-string. | |
# -- (People like to pretend that this has something to do with | |
# -- CPU architecture internals, but the fact of the matter is | |
# -- that I don't even know what my CPU "endianness" is, and | |
# -- Python doesn't care. Computers are machines that process octets. | |
# -- Personally, I bet we do it this way because it makes more sense | |
# -- in a world where we write the biggest digits first. | |
# -- I don't know of any countries that write "three million and five" | |
# -- as 5000003 in a scientific or academic context...) | |
return int.to_bytes(i, k, 'big') | |
def _os2ip(o: bytes): | |
"""Octet-string-to-integer primitive | |
https://datatracker.ietf.org/doc/html/rfc8017#section-4.2 | |
""" | |
# This function just takes an octet-string like b'\x00\x07' | |
# and turns it into an integer like 7. | |
# It doesn't need to be told how many digits "k" the string should be, | |
# because it can see directly how many digits it IS. | |
# Since it's the opposite twin of _i2osp, it also decodes the | |
# integers in big-endian format so you don't turn 7 into 1792. | |
return int.from_bytes(o, 'big') | |
def _byte_length(i: int): | |
"Returns the minimum number of bytes needed to encode i" | |
# With a number like 1, 2, 3, 99, or 255, this returns 1. | |
# With a number like 256, 300, 400, 4000, or 65535, this returns 2. | |
# With a number like 65536 or 16777215, this returns 3. | |
# And so on... | |
# The way to do this is to just measure the number of bits in the number | |
# and divide it by eight, rounding up. | |
return _ceil_div(i.bit_length(), 8) | |
def _ceil_div(a: int, b: int): | |
"Returns the smallest integer greater than or equal to the ratio of a and b" | |
# -- Speaking of integer-division-rounding-up... | |
# -- Python rounds integer division down. | |
# -- Always. | |
# -- So we have to "trick" it, by dividing NEGATIVE a by b, | |
# -- and then taking the NEGATIVE of the result! | |
# -- If you tried to divide 42 by 10, you'd get 4 plus a remainder of 2, | |
# -- rounding down to 4. | |
# -- If you divide (-42) by 10, however, you get (-5) plus a remainder of 8, | |
# -- rounding down, obviously, to (-5). | |
# -- If you negate this, you get "rounded up" division! | |
# -- (Don't worry about even divisions -- they have a remainder of 0, so | |
# -- the positives and negatives all come out the same, correctly. | |
# -- Negative thirty divided by ten is just negative three plus a remainder of 0.) | |
return -(-a // b) | |
def _get_ne(pk): | |
# -- Defining a dictionary looks GREAT for teaching, back in line #5, | |
# -- but then getting the numbers out you actually have to ACCESS them by keys. | |
# -- This function pulls out "n" and "e" directly and returns them together, | |
# -- so you never have to "access key indices" yourself. | |
# -- That is a Python thing, not a Maths thing, so I wanted to kick the can | |
# -- down the road without forcing the reader to learn about "data structures" | |
# -- or "object-oriented programming" or other Computer Science topics like that. | |
return pk['n'], pk['e'] | |
def _get_nd(sk): | |
# TODO support multiprime RSA | |
# -- OK, this one is even worse. | |
# -- As you saw, we only defined the "private key" as having | |
# -- the private factors "p" and "q" and the public exponent "e". | |
# -- RSA doesn't actually use these numbers to decrypt, though. | |
# -- It uses "d" | |
if ('n' not in sk) and ('p' in sk and 'q' in sk): | |
sk['n'] = sk['p'] * sk['q'] | |
if ('d' not in sk) and (('e' in sk) and ('p' in sk and 'q' in sk)): | |
sk['d'] = pow(e, -1, (sk['p'] - 1) * (sk['q'] - 1)) | |
return sk['n'], sk['d'] | |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment