Last active
August 29, 2015 13:57
-
-
Save dualbus/9670831 to your computer and use it in GitHub Desktop.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
#!/bin/sed -f | |
# unary to binary converter tool | |
# | |
# Converts an input resembling a number in a unary base (for example, | |
# 5 represented as ..... i.e. 5 dots), to the representation of the | |
# number in a binary base (using 1 and 0 as the symbols for the | |
# base). | |
# | |
# | |
# This is the outline of the algorithm: | |
# | |
# - For the case of zero (which is the empty string), output 0 and | |
# exit immediately. | |
# | |
# - If the number is not zero, proceed to: | |
# | |
# - Put a | separating the first unit from the rest. So, if you have: | |
# | |
# ..... | |
# | |
# make that: | |
# | |
# .|.... | |
# | |
# - The next part is a loop, which tries to group the units in powers | |
# of two (1, 2, 4, 8, ...). It stops when it can't make a bigger | |
# group. This group will tell us the most significant bit of the | |
# number. For example, if the biggest group we were able to make | |
# was 16, it means that the number is a 5 bit number, with the | |
# following form: 1xxxx. We still don't know the rest of the bits, | |
# but we do know that the highest corresponds to 16. | |
# | |
# - Once we finish with the MSB loop, we'll be left with something | |
# like this (for the case of 8 being the MSB): | |
# | |
# ........|... | |
# | |
# which is the number 1xxx (in fact, it's 1011, but we still don't | |
# know that! :)) | |
# | |
# - The string we have left doesn't tell us much still, just the MSB. | |
# We need to find what the other bits are. We'll start by splitting | |
# the MSB in half, and finding out if it is followed by its half. | |
# If it is, then the next bit is 1, if it isn't, then the next bit | |
# is 0. | |
# | |
# So, it looks something like: | |
# | |
# The first bit is 1: | |
# ........|... -> ........1... | |
# | |
# ..../....1... | |
# ^ cut here | |
# | |
# Is 1 followed by '....'? No, it means the next bit is zero. | |
# Reduce to: | |
# | |
# 1....0... | |
# | |
# and split again: | |
# | |
# 1../..0... | |
# | |
# Is 0 followed by '..'? Yes, it means the next bit is one. | |
# Reduce to: | |
# | |
# 10..1. | |
# | |
# and split again: | |
# | |
# 10./.1. | |
# ^ this 1 | |
# | |
# Is 1 followed by '.'? Yes, it means the next bit is one. | |
# Reduce to: | |
# | |
# 101.1 | |
# | |
# And we're finished. Remove that period: | |
# | |
# 1011 | |
# | |
# Et voilá! | |
# | |
# | |
# So, this is a rough overview of the algorithm. | |
# Greet the user. We want to be polite, right lil' seddy? | |
x; s/.*/Hello, user. This is your input:/; p; x | |
l | |
# The line is empty. This means the number is zero. No further | |
# processing is needed. Just put '0' and branch-the-fcuk-out. | |
/./! { | |
s/$/0/ | |
b | |
} | |
# Isolate the first unit. Poor little unit. It's all alone. We'll try | |
# to pair it with some powers-of-two friends later. | |
s/\(.\)/\1|/ | |
# Confess to the user what we did: | |
x; s/.*/Dear user, we grouped units like this:/; p; x | |
l | |
# Group the units into the biggest power of two we can find. | |
x; s/.*/Lovely user, we are finding out the most significant bit:/; p; x | |
:msb | |
l | |
# Yes, you don't want me to explain this. | |
s/^\([^|]\{1,\}\)|\1/\1\1|/ | |
t msb | |
# The first bit will always be one (except when the number is exactly | |
# 0, but we covered that already). This is because we Do Not Like | |
# Leading Zeroes. No like. Never. | |
s/|/1/ | |
x; s/.*/Magnificent user, we will fill the empty bits according to the position:/; p; x | |
:fill | |
l | |
# Bit is followed by its half, so make the next bit a 1 and remove | |
# the units of the current bit. And continue loop. | |
s/^\([10]*\)\([^10]\{1,\}\)\2\([10]\)\2/\1\3\21/ | |
t fill | |
# Bit is now followed by its half. So sad, where could our bit find | |
# its other half!? Well, let's fill that void with a 0, and continue | |
# our cycle. | |
s/^\([10]*\)\([^10]\{1,\}\)\2\([10]\)/\1\3\20/ | |
t fill | |
# Remove the remaining period (or whatever character remains, if you | |
# were curious enough to see if other characters worked) | |
s/[^10]\([10]\)$/\1/ | |
# Hint, any character except '1', '0', and '|' should work. |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment