Skip to content

Instantly share code, notes, and snippets.

@James-E-A
Created July 21, 2022 04:01
Show Gist options
  • Save James-E-A/284fa918d232eef08551db860d4686bb to your computer and use it in GitHub Desktop.
Save James-E-A/284fa918d232eef08551db860d4686bb to your computer and use it in GitHub Desktop.
stdlib RSA walkthrough
### RSA textbook-ish walkthrough ###
# This is WORK IN PROGRESS.
# Intend to cover:
# -- "RSA uses numbers, not words?" (DONE)
# -- RSA ACTUAL MATH FORMULA (DONE)
# -- Hybrid encryption (TODO)
# -- Asymmetric padding (TODO)
# -- Forward secrecy, part 1 (TODO)
# -- Encapsulation protocols (TODO)
# -- Forward secrecy2 part 2 (TODO)
# -- PQC, AKA "Teaching RSA is a prank" (TODO)
## INTRO ##
# RSA lets you "encrypt" a message, to mangle it
# into a so-called "ciphertext":
# A ciphertext looks like garbage,
# but it actually contains a real message!
# The only way to "decrypt" it
# (turn it back into the original message)
# is by knowing a SECRET, and then following
# the corresponding un-mangling process
# (known as "decryption").
# As a very basic example, you can think of the
# newspaper "word jumble" puzzles.
# If the "ciphertext" is {GKA-LKI},
# and the SECRET is {G=H;K=O;A=T;L=D;I=G},
# then you can EASILY calculate the original message:
# {HOT-DOG}!
# What makes RSA interesting is that the SECRET,
# which you NEED to restore an RSA-encrypted message,
# is NOT NEEDED to RSA-encrypt the message in the first place!
# So you can have anyone on earth mangle a message
# in such a way that only you can un-mangle it --
# without having to co-ordinate especially with
# each person beforehand. This is a HUGE step up
# from "codebooks" like the KGB needed to give
# their spies for encrypted communications many years ago,
# which required both parties to communicate beforehand,
# in a place that isn't eavesdropped,
# and they both had to keep their "key material" secret afterwards.
# With RSA, you can exchange the "key material" for encryption
# freely over a public channel as the world watches, without
# compromising the scheme!
# Technically, everything I described above applies to
# ANY "public-key encryption" algorithm, but RSA is the
# most famous example of it -- and the math powering it,
# uniquely, can be understood by a layman!
# "Elliptic-curve"-based schemes like X448 and P-256 ECDH
# are much cooler and more efficient, and "error-correcting-
# code"-based schemes will keep you safe even from eavesdroppers
# who posses quantum computers; but both of these require
# graduate-level mathematics even to build a SIMPLE demonstration of.
# But they all "work" the same superficially/semantically,
# just as a gasoline (petrol), diesel, and electric vehicles
# will have the same steering wheel, accelerators, and blinkers.
# So, with that said, let's see the "internal combustion"
# of RSA in action!
## Bootstrap yourself ##
# Before you can ask anyone to send you an "RSA-encrypted" message,
# you will first have to select two BIG prime numbers.
# They should be about the same length.
p = 33372027594978156556226010605355114227940760344767554666784520987023841729210037080257448673296881877565718986258036932062711
q = 64135289477071580278790190170577389084825014742943447208116859632024532344630238623598752668347708737661925585694639798853367
# (Extra credit:
# -- p and q MUST NOT be the same number.
# -- p and q should be ABOUT the same length.
# -- q should be a few times larger than p, to make it harder to deduce them)
# You will also have to select a so-called "public exponent".
# don't worry about this too much for now.
# A number of the form 2^n + 1 is generally a good choice,
# especially 3 or 65537. The latter is EXTREMELY popular.
e = 16385 # This is 0b100000000000001, AKA 2^14 + 1
# (Extra credit: look up these words in quotes:
# -- e MUST be "coprime" to p-minus-1 times q-minus-1.
# -- (TECHNICALLY, it must be coprime to "euler's totient function" of p times q)
# -- e should be pretty small, for faster encryption and decryption.
# -- e should have a low "hamming weight", for much faster encryption and decryption.)
# You can send your PUBLIC KEY to anyone you like.
# They can use it to "encrypt" a message before sending it,
# so that no-one but you can make it out
# -- especially not a hacker who has wiretapped your e-mail or text messages!
public_key = {
'n': p * q,
'e': e
}
# Your public key does NOT contain p and q!
# It ONLY contains their product, which we'll call "n"
# (as well as the boring "e" number).
# You need your PRIVATE KEY to decrypt any encrypted message sent to you.
# Keep it secure:
# -- If you lose it, you will be unable to decrypt encrypted messages sent to you.
# Keep it safe:
# -- If someone steals it, they will be able to decrypt any encrypted messages sent to you that they harvested via wiretap.
private_key = {
'p': p,
'q': q,
'e': e
}
# Your "e" number is not private, but you will need it for
# other things.
## PRELIMINARY WRAPPERS ##
# RSA doesn't actually encrypt messages. It's just math.
# and math works with numbers.
# Specifically, discrete math works with "integers". (0, 1, 2, etc.)
# However, working with PURE INTEGERS isn't exactly useful.
# Even if you gave every word a number (apple=1, zebra=9999999),
# sending one of them at a time would be somewhat painful.
# And how would you send multi-word messages?
# So we take "bytes", which could be ANYTHING,
# including files, e-mails, pictures, and text messages,
# and crunch them down into integers so that RSA can deal with them.
# (If you're taking notes, put an ASTERISK on there^. This is technically a lie.
# In reality, we generally use "hybrid encryption" and/or "asymmetric padding"
# to make things easier and mitigate certain attacks, but worrying about those
# sort of things will probably not be useful when reading this section.)
def rsa_encrypt(pk, M):
n, e = _get_ne(pk)
# Convert message from an octet-string into an integer
# -- example: the string "Big Chungus" (in ASCII)
# -- becomes the number 80,286,854,914,723,116,587,906,419
# -- We do this because RSA works on integers, fundamentally
# -- it's just a math formula, and you can't exponentiate "Big Chungus"!
m = _os2ip(M)
# PULL THE TRIGGER, DO THE ACTUAL ENCRYPTION
# (don't worry, we'll explain this next.)
c = _rsa_encrypt((n, e), m)
# Convert ciphertext from an integer into octets
# -- We should also convert the "ciphertext" integer that RSA gives us
# -- into something more amenable for sending over the internet
# -- after all, "Ovt Puhathf" (perhaps a "Big Chungus" ciphertext) is just 11 symbols
# -- but "96064521145336971767605350" is 26...
C = _i2osp(c, _byte_length(n))
# Done
return C
def rsa_decrypt(sk, C):
# -- RSA doesn't TECHNICALLY use p and q for encryption.
# -- It needs a related number "d",
# -- which you can calculate from p and q and e.
# -- "d" is also secret, because you can use it to figure out p and q!
n, d = _get_nd(sk)
_byte_length(n)
# Convert ciphertext from an octet-string into an integer
# -- We received this ciphertext over the internet from our
# -- communication partner; it's time to feed it into RSA
# -- for decryption!
c = _os2ip(C)
# PULL THE TRIGGER, DO THE ACTUAL DECRYPTION
m = _rsa_decrypt((n, d), c)
# Convert message from an integer into octets
# -- now
M = _i2osp(m, _byte_length(n))
# Done
return C
## RAW FUNCTIONS ##
# Basic(ish) arithmetic underlying the RSA cryptosystem
def _rsa_encrypt(_pk, m):
n, e = _pk
# Bounds checking
if not 0 <= m < n:
raise ValueError("message representative out of range")
# Encrypt
# -- it's literally just arithmetic!
# -- the ciphertext (c)
# -- is nothing more than
# -- THE REMAINDER when you take
# -- the message (m) to the power of the public exponent (e)
# -- and divide the public modulus (n) by the result!
c = pow(m, e, n)
# Done
return c
def _rsa_decrypt(_sk, c):
n, d = _sk
# Bounds checking
if not 0 <= c < n:
raise ValueError("ciphertext representative out of range")
# Decrypt
# -- just the same!
# -- to get the message back, just
# -- take THE REMAINDER of
# -- the public modulus (n) divided by
# -- the ciphertext (c) to the power of the private exponent (d)!
m = pow(c, d, n)
# Done
return m
def _rsa(_k, x):
# -- in fact, if you wanted to look at it THIS way,
# -- encryption and decryption are just the same thing in RSA!
modulus, exponent = _k
if not 0 <= x < n:
raise ValueError("input representative out of range")
y = pow(x, exponent, modulus)
return y
# -- ...but that's kind of goofy, and might be confusing later.
# -- nevermind.
del _rsa
def _rsa_decrypt_mp(sk, c):
"https://datatracker.ietf.org/doc/html/rfc8017#page-14"
# -- (Don't look at this function, either -- I'm not finished writing it!)
raise NotImplemented("TODO")
# ANNOYING GARBAGE #
# There is no (free) magic in programming.
# You can't just write "turn_this_number_into_letters(12345)!"
# and have it work automatically.
# SOMEONE had to create every one of the random, idiosyncratic functions
# that the above cool and useful functions relied on internally.
# It's like building a car: you still have to build the engine,
# the brakes, the fuel injectors, etc., or the car won't do anything.
# Below here, you can watch the sausage being made, if you please --
# to observe the silent pillars holding up the castle above,
# just know that the following functions are pretty boring EVEN IF you're a programmer.
def _i2osp(i: int, k: int):
"""Integer-to-octet-string primitive
https://datatracker.ietf.org/doc/html/rfc8017#section-4.1
"""
# This function just takes a number i (e.g. 7)
# and turns it into an octet-string (e.g. b'\x00\x07')
# with the specified number of octets, k (e.g. 2)
# It does so in the so-called "big endian" format,
# where the smallest digits go into the end of the octet-string
# and the biggest digits go into the start of the octet-string.
# -- (People like to pretend that this has something to do with
# -- CPU architecture internals, but the fact of the matter is
# -- that I don't even know what my CPU "endianness" is, and
# -- Python doesn't care. Computers are machines that process octets.
# -- Personally, I bet we do it this way because it makes more sense
# -- in a world where we write the biggest digits first.
# -- I don't know of any countries that write "three million and five"
# -- as 5000003 in a scientific or academic context...)
return int.to_bytes(i, k, 'big')
def _os2ip(o: bytes):
"""Octet-string-to-integer primitive
https://datatracker.ietf.org/doc/html/rfc8017#section-4.2
"""
# This function just takes an octet-string like b'\x00\x07'
# and turns it into an integer like 7.
# It doesn't need to be told how many digits "k" the string should be,
# because it can see directly how many digits it IS.
# Since it's the opposite twin of _i2osp, it also decodes the
# integers in big-endian format so you don't turn 7 into 1792.
return int.from_bytes(o, 'big')
def _byte_length(i: int):
"Returns the minimum number of bytes needed to encode i"
# With a number like 1, 2, 3, 99, or 255, this returns 1.
# With a number like 256, 300, 400, 4000, or 65535, this returns 2.
# With a number like 65536 or 16777215, this returns 3.
# And so on...
# The way to do this is to just measure the number of bits in the number
# and divide it by eight, rounding up.
return _ceil_div(i.bit_length(), 8)
def _ceil_div(a: int, b: int):
"Returns the smallest integer greater than or equal to the ratio of a and b"
# -- Speaking of integer-division-rounding-up...
# -- Python rounds integer division down.
# -- Always.
# -- So we have to "trick" it, by dividing NEGATIVE a by b,
# -- and then taking the NEGATIVE of the result!
# -- If you tried to divide 42 by 10, you'd get 4 plus a remainder of 2,
# -- rounding down to 4.
# -- If you divide (-42) by 10, however, you get (-5) plus a remainder of 8,
# -- rounding down, obviously, to (-5).
# -- If you negate this, you get "rounded up" division!
# -- (Don't worry about even divisions -- they have a remainder of 0, so
# -- the positives and negatives all come out the same, correctly.
# -- Negative thirty divided by ten is just negative three plus a remainder of 0.)
return -(-a // b)
def _get_ne(pk):
# -- Defining a dictionary looks GREAT for teaching, back in line #5,
# -- but then getting the numbers out you actually have to ACCESS them by keys.
# -- This function pulls out "n" and "e" directly and returns them together,
# -- so you never have to "access key indices" yourself.
# -- That is a Python thing, not a Maths thing, so I wanted to kick the can
# -- down the road without forcing the reader to learn about "data structures"
# -- or "object-oriented programming" or other Computer Science topics like that.
return pk['n'], pk['e']
def _get_nd(sk):
# TODO support multiprime RSA
# -- OK, this one is even worse.
# -- As you saw, we only defined the "private key" as having
# -- the private factors "p" and "q" and the public exponent "e".
# -- RSA doesn't actually use these numbers to decrypt, though.
# -- It uses "d"
if ('n' not in sk) and ('p' in sk and 'q' in sk):
sk['n'] = sk['p'] * sk['q']
if ('d' not in sk) and (('e' in sk) and ('p' in sk and 'q' in sk)):
sk['d'] = pow(e, -1, (sk['p'] - 1) * (sk['q'] - 1))
return sk['n'], sk['d']
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment