Created
August 8, 2020 14:28
-
-
Save irwincong/f191b9a827fd0efa0143718ed0aafeda to your computer and use it in GitHub Desktop.
Initial solution for https://web.archive.org/web/20200618185144/https://techdevguide.withgoogle.com/resources/compress-decompression/
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
#!/usr/bin/env python3 | |
# DATE: 2020.08.08 | |
# My initial solution for: | |
# https://web.archive.org/web/20200618185144/https://techdevguide.withgoogle.com/resources/compress-decompression/ | |
# | |
# Time complexity: O(length of input string). | |
# Storage complexity: I think my storage complexity is related to recursion (stack) depth and string length. | |
# I don't recall how python passes strings, but I hope it is passed by reference instead of by value. | |
# If by reference, then I think I have a O(length of input string) | |
# If not, my solution may have a storage requirement of O(length of input string * stack depth) | |
import sys | |
def decompress(deflated: str, pos: int=0) -> (int, str): | |
inflated = "" | |
N = "" | |
k = pos | |
# NOTE: This is not pythonic, but it works. I considered using enumerate() but | |
# I wouldn't be able to control the string index to "jump forward" on recursion return. | |
while True: | |
try: | |
c = deflated[k] | |
except IndexError: | |
return k, inflated | |
# What follows is probably too heavy on string copy operations because python strings are immutable. | |
# | |
# EDIT: Google's solution used nested generators, and they cited code clarity as the primary rationale | |
# with copy-operations as a secondary reason. | |
if "0" <= c <= "9": | |
N += c | |
elif c == "[": | |
k, s = decompress(deflated, k+1) | |
inflated += int(N) * s # NOTE: Could avoid the mult. operation if we checked for N == 1. | |
N = "" | |
elif c == "]": | |
return k, inflated | |
else: | |
inflated += c | |
k += 1 | |
if __name__ == "__main__": | |
_, t = decompress("10[a]") | |
assert(t == 10*"a") | |
_, t = decompress("2[3[a]b]") | |
assert(t == 2*(3*"a"+"b")) | |
# EDIT: Google's worst-case test case | |
_, t = decompress("1[1[1[1[1[1[1[1[1[1[1[1[1[1[1[1[1[1[1[1[xx]]]]]]]]]]]]]]]]]]]]") | |
assert(t == "xx") |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
I didn't use any of the hints. Taking a look at google's solution (below). Some observations:
sub_times
(their version of myN
), like my solution, assumes a well-formed decimal value. The problem description says the input is always valid according to the other rules; however, this is a little ambiguous because in Google Hint #4, it says to consider edge cases such asa[]b
and0[abc]
. These, by the grammar, are all valid. The grammar, however, also allows for "weird" numbers such as00[abc]
and01[abc]
. Nothing in Google's provided solution says that these edge cases were considered, but, to be frank, I also did not consider these "weird" numbers until after I implemented my solution.isdigit()
instead of"0" <= c <= "9"
. I guess I am not familiar enough with builtin methods to the Pythonstr
classbasestring
. I was not familiar withbasestring
, but by its name I can guess what it is.basestring
is not really used in Python3 - https://docs.python.org/3.0/whatsnew/3.0.html.