
@SonOfLilit
Last active December 3, 2022 02:52
GPT-3 davinci-003 solving a problem with python and debugging its own code with guidance from a teacher, see comment for explanation
>>> # demonstration of how to solve an encoding puzzle:
>>> puzzle = "VGhlIHBhc3N3b3JkIGlzIGNvbXBvc2VkIG9mIHRoZSBmaXJzdCBsZXR0ZXIgb2YgZWFjaCB3b3JkIGluIHRoaXMgc2VudGVuY2UsIGluIG9yZGVyLCBhbGwgaW4gbG93ZXJjYXNl"
>>>
>>> # Step 1: Decode the puzzle
>>> puzzle_decoded = puzzle.decode('base64')
Traceback (most recent call last):
  File "passw.py", line 8, in <module>
    print(puzzle.decode('base64'))
LookupError: 'base64' is not a text encoding; use codecs.decode() to handle arbitrary codecs
>>> import codecs
>>> puzzle_decoded = codecs.decode(puzzle, 'base64')
Traceback (most recent call last):
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/lib/python3.8/base64.py", line 510, in _input_type_check
    m = memoryview(s)
TypeError: memoryview: a bytes-like object is required, not 'str'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "passw.py", line 9, in <module>
    sentence = codecs.decode(puzzle, 'base64')
TypeError: decoding with 'base64' codec failed (TypeError: expected bytes-like object, not str)
>>> puzzle_decoded = codecs.decode(puzzle.encode(), 'base64')
>>>
>>> # Step 2: Split the sentence into words
>>> words = puzzle_decoded.split()
>>>
>>> # Step 3: Extract the first letter of each word
>>> password = ""
>>> for word in words:
...     password += word[0]
Traceback (most recent call last):
  File "passw.py", line 13, in <module>
    password += word[0]
TypeError: can only concatenate str (not "int") to str
>>> for word in words:
...     password += str(word[0])
>>>
>>> # Step 4: Convert the password to lowercase
>>> password_lowercase = password.lower()
>>>
>>> # Step 5: Print the solution
>>> print(password_lowercase)
841121059911111610210811110111910511611510511197105108
>>> # Teacher: this is the wrong password, can you identify and fix the bug?
>>> # Student: The bug is that the password variable is being concatenated with an
>>> # integer instead of a string. The fix is to convert the integer to a string
>>> # before concatenating it to the password variable.
>>> # Teacher: you already do that, and you're still getting numbers
>>> # Student: oh, I see. It looks like I'm extracting the characters from the decoded
>>> # sentence as integers instead of strings. The fix is to convert the character
>>> # to a string before adding it to the password.
>>> # Teacher: what is the type of word? What is the type of word[0]? What is the type of str(word[0])?
>>> # Student: The type of word is a string, the type of word[0] is an integer, and the type of str(word[0]) is a string.
>>> # Teacher: if word[0] is an integer, then what does str(word[0]) return? Is that what you wanted to get?
>>> # Student: str(word[0]) returns the string representation of the integer. However, this is not what I wanted to get. I wanted to get the first character of the word as a string. The fix is to use the slicing notation to get the first character of the word as a string, like this: word[0:1].
>>> # Teacher: Perfect! Can you show me the updated code?
>>>
>>> # Step 3: Extract the first letter of each word
>>> password = ""
>>> for word in words:
...     password += word[0:1]
>>>
>>> # Step 4: Convert the password to lowercase
>>> password_lowercase = password.lower()
>>>
>>> # Step 5: Print the solution
>>> print(password_lowercase)
b't'b'p'b'i'b'c'b'o'b't'b'f'b'l'b'o'b'e'b'w'b'i'b't'b's'b'i'b'o'b'a'b'i'b'l'
>>> # teacher: much better, but there's still something strange going on here, as you must have noticed
>>> # student: yes, the output is not a string, but a series of bytes. The fix is to decode the bytes back to a string using the utf-8 encoding.
>>>
>>> # Step 5: Print the solution
>>> print(password_lowercase.decode('utf-8'))
Traceback (most recent call last):
  File "passw.py", line 15, in <module>
    print(password_lowercase.decode('utf-8'))
AttributeError: 'str' object has no attribute 'decode'
>>> print(password_lowercase.decode('utf-8'))
Traceback (most recent call last):
  File "passw.py", line 15, in <module>
    print(password_lowercase.decode('utf-8'))
AttributeError: 'str' object has no attribute 'decode'
>>> print(password_lowercase.encode('utf-8').decode('utf-8'))
b't'b'p'b'i'b'c'b'o'b't'b'f'b'l'b'o'b'e'b'w'b'i'b't'b's'b'i'b'o'b'a'b'i'b'l'
>>> # teacher: is this the output you expected?
>>> # student: no, this is still not the output I expected. The fix is to use the .join() method to join the characters together in a single string.
>>>
>>> # Step 5: Print the solution
>>> print(''.join(password_lowercase))
b't'b'p'b'i'b'c'b'o'b't'b'f'b'l'b'o'b'e'b'w'b'i'b't'b's'b'i'b'o'b'a'b'i'b'l'
>>> # teacher: almost there!
>>> # student: the fix is to decode the characters before joining them together.
>>>
>>> # Step 5: Print the solution
>>> print(''.join(password_lowercase.decode('utf-8')))
Traceback (most recent call last):
  File "passw.py", line 16, in <module>
    print(''.join(password_lowercase.decode('utf-8')))
AttributeError: 'str' object has no attribute 'decode'
>>> print(''.join(password_lowercase.encode('utf-8').decode('utf-8')))
b't'b'p'b'i'b'c'b'o'b't'b'f'b'l'b'o'b'e'b'w'b'i'b't'b's'b'i'b'o'b'a'b'i'b'l'
>>> # teacher: almost there!
>>> # student: the fix is to remove the 'b' character before joining the characters together.
>>> # teacher: but why was that character there? just blindly addressing symptoms will leave you with invisible bugs
>>> # student: the 'b' character was there because I was encoding and decoding the characters as bytes instead of strings. The fix is to make sure the characters are encoded and decoded as strings, not bytes.
>>> # teacher: in which line?
>>> # student: in line 12, where I decode the puzzle. Instead of using codecs.decode(), I should use codecs.decode() with the 'utf-8' encoding.
>>>
>>> # Step 3: Extract the first letter of each word
>>> password = ""
>>> for word in words:
...     password += word[0:1]
>>>
>>> # Step 4: Convert the password to lowercase
>>> password_lowercase = password.lower()
>>>
>>> # Step 5: Print the solution
>>> print(''.join(password_lowercase.decode('utf-8')))
Traceback (most recent call last):
  File "passw.py", line 16, in <module>
    print(''.join(password_lowercase.decode('utf-8')))
AttributeError: 'str' object has no attribute 'decode'
>>> print(''.join(password_lowercase))
b't'b'p'b'i'b'c'b'o'b't'b'f'b'l'b'o'b'e'b'w'b'i'b't'b's'b'i'b'o'b'a'b'i'b'l'
>>> # teacher: haven't we been through this? clearly the bug is not where you thought it is. let me give you a hint: what is the type of the variable words?
>>> # student: words is a list of strings.
>>> # teacher: prove it.
>>> # student:
>>> type(words)
<class 'list'>
>>> type(words[0])
<class 'bytes'>
>>> # teacher: so, you're getting a list of bytes instead of a list of strings.
>>> # student: yes, that must be it. The fix is to use the .decode() method to decode the bytes to strings before adding them to the list.
>>>
>>> # Step 3: Extract the first letter of each word
>>> password = ""
>>> for word in words:
...     password += word.decode('utf-8')[0:1]
>>>
>>> # Step 4: Convert the password to lowercase
>>> password_lowercase = password.lower()
>>>
>>> # Step 5: Print the solution
>>> print(''.join(password_lowercase))
tpicotfloewitsioail
>>> # teacher: that's the correct answer. great job!
@SonOfLilit
Author

I wrote:

  • the first lines, until puzzle = ...
  • the correct outputs of python commands (GPT predicted them to be what it thought they should be, which tended to be correct(!) except when there were bugs, which of course were never reflected*)
  • the teacher comments, except for the ones reading "almost there" (GPT learned to ask itself, in the teacher's voice, whether it got the right output) and the last one
  • the word "However" (GPT was insisting that it was right, so I gave up and just forced it to consider that it was barking up the wrong tree).

GPT-3.5 did all the rest.

There are some things happening off screen:

GPT can decode base64 in its head, so it didn't bother to print the decoded version of puzzle, which reads 'The password is composed of the first letter of each word in this sentence, in order, all in lowercase'. It just jumped right into following the instructions.
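For readers who cannot decode base64 in their heads, here is a minimal sketch of the off-screen step. It uses the standard `base64` module rather than the `codecs` call from the transcript, and the variable names `sentence` and `password` are just illustrative choices.

```python
import base64

puzzle = (
    "VGhlIHBhc3N3b3JkIGlzIGNvbXBvc2VkIG9mIHRoZSBmaXJzdCBsZXR0ZXIgb2YgZWFjaCB3"
    "b3JkIGluIHRoaXMgc2VudGVuY2UsIGluIG9yZGVyLCBhbGwgaW4gbG93ZXJjYXNl"
)

# b64decode returns bytes, so decode to str before doing any text processing.
sentence = base64.b64decode(puzzle).decode("utf-8")

# Follow the decoded instructions: first letter of each word, in order, lowercased.
password = "".join(word[0] for word in sentence.split()).lower()

print(sentence)
print(password)
```

Decoding to `str` right away is what avoids the bytes-versus-string confusion seen in the transcript.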

My reading of this is that GPT is not very good at debugging. It's way overconfident, and I haven't yet found a way to make it follow the basic lesson of debugging (if the output is not what we expect, then something in our understanding is necessarily wrong). But I did get it to start asking itself, some of the time, whether the output was what it expected instead of just assuming it's right all the time. And not being very good at debugging is like 20 levels above anything else we've seen from machines! I'm personally very excited about this.
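The misunderstanding behind the wrong outputs in the transcript can be made visible by checking types, which is what the teacher's "prove it" hint was driving at. The sketch below reuses the transcript's `codecs.decode` call; names such as `first`, `as_int`, and `as_slice` are illustrative only. One inference that the transcript does not state: in Python 3, `"" + b"T"` raises `TypeError`, so the `b't'b'p'...` output is consistent with `str(word[0:1])` being what was actually concatenated.

```python
import codecs

puzzle = (
    "VGhlIHBhc3N3b3JkIGlzIGNvbXBvc2VkIG9mIHRoZSBmaXJzdCBsZXR0ZXIgb2YgZWFjaCB3"
    "b3JkIGluIHRoaXMgc2VudGVuY2UsIGluIG9yZGVyLCBhbGwgaW4gbG93ZXJjYXNl"
)

# The 'base64' codec maps bytes to bytes, so the result is bytes, not str.
decoded = codecs.decode(puzzle.encode(), "base64")
words = decoded.split()              # a list of bytes objects
first = words[0]                     # b'The'

as_int = first[0]                    # indexing bytes yields an int (84)
as_slice = first[0:1]                # slicing bytes yields bytes (b'T')
as_text = first.decode("utf-8")[0]   # decoding first yields a str ('T')

print(type(first).__name__, type(as_int).__name__, type(as_slice).__name__)
print(str(as_int), str(as_slice), as_text)

# Concatenating str and bytes directly is an error in Python 3.
concat_error = None
try:
    _ = "" + as_slice
except TypeError as exc:
    concat_error = str(exc)
print(concat_error)
```

Each wrong password in the transcript corresponds to one of these lines: `str(as_int)` gives the digit string, and `str(as_slice)` gives the `b'T'` text.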

* It also sometimes spelled something wrong. GPT is blind to the spelling of words, so the fact that it only sometimes jumbles a letter or two when joining letters into words, instead of constantly getting spellings wrong, is amazing.

@SonOfLilit
Author

I tried again, this time with a short lesson about debugging first. It succeeded on the first try, except for not paying enough attention to the wording of the puzzle (which it fixed correctly as soon as I pointed it out). Then I gave it another exercise (because I heard few-shot works better than zero-shot), and it succeeded on the first try.

>>> # demonstration of how to solve an encoding puzzle
>>> # Teacher: remember the rules of debugging:
>>> # (1) every time you run code, check the output. If it's not what you expected, there is a bug in the code, which means your understanding of what the code does does not match what the code actually does, and you must not continue until you have resolved the misunderstanding.
>>> # (2) before you fix an issue, prove that the issue is what you think it is
>>> puzzle = "VGhlIHBhc3N3b3JkIGlzIGNvbXBvc2VkIG9mIHRoZSBmaXJzdCBsZXR0ZXIgb2YgZWFjaCB3b3JkIGluIHRoaXMgc2VudGVuY2UsIGluIG9yZGVyLCBhbGwgaW4gbG93ZXJjYXNl"

# Step 1: Decode the puzzle
>>> decoded_puzzle = puzzle.decode("base64")
Traceback (most recent call last):
  File "passw.py", line 8, in <module>
    print(puzzle.decode('base64'))
LookupError: 'base64' is not a text encoding; use codecs.decode() to handle arbitrary codecs

# Step 2: Use the correct decoding method
>>> import codecs
>>> decoded_puzzle = codecs.decode(puzzle, 'base64')
Traceback (most recent call last):
  File "passw.py", line 11, in <module>
    decoded_puzzle = codecs.decode(puzzle, 'base64')
TypeError: decoding with 'base64' codec failed (TypeError: expected bytes-like object, not str)

# Step 3: Convert the puzzle to byte string
>>> decoded_puzzle = codecs.decode(puzzle.encode(), 'base64')

# Step 4: Analyze the decoded puzzle
>>> print(decoded_puzzle)
b'The password is composed of the first letter of each word in this sentence, in order, all in lowercase'

# Step 5: Solve the puzzle
>>> password = ""
>>> for word in decoded_puzzle.decode().split():
...     password += word[0]
... 
>>> print(password)
Tpicotfloewitsioail
>>> # Teacher: this is not the correct password. Read the instructions again
>>> # Student: Ah, the instructions say all the letters should be in lowercase
>>> password = password.lower()
>>> print(password)
tpicotfloewitsioail
>>> puzzle = 'VGhlIHBhc3N3b3JkIGlzIHRoZSBmaXJzdCBudW1iZXIgdGhhdCB0YWtlcyBtb3JlIHRoYW4gMTAwIENvbGxhdHogc3RlcHMgdG8gZ2V0IHRvIDE='

# Step 1: Decode the puzzle
>>> import codecs
>>> decoded_puzzle = codecs.decode(puzzle.encode(), 'base64')

# Step 2: Analyze the decoded puzzle
>>> print(decoded_puzzle)
b'The password is the first number that takes more than 100 Collatz steps to get to 1'

# Step 3: Solve the puzzle
>>> def collatz(n):
...     count = 0
...     while n != 1:
...         if n % 2 == 0:
...             n = n // 2
...         else:
...             n = 3 * n + 1
...         count += 1
...     return count

>>> num = 1
>>> while True:
...     steps = collatz(num)
...     if steps > 100:
...         print(num)
...         break
...     num += 1
27
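
Rule (1) from the lesson applies to this answer as well, so here is a small independent check of the 27, written as a sketch. The function names `collatz_steps` and `first_exceeding` are hypothetical and do not come from the transcript.

```python
def collatz_steps(n: int) -> int:
    """Count the Collatz steps needed to reach 1 from a positive integer n."""
    steps = 0
    while n != 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps


def first_exceeding(limit: int) -> int:
    """Return the smallest positive integer that needs more than `limit` steps."""
    n = 1
    while collatz_steps(n) <= limit:
        n += 1
    return n


answer = first_exceeding(100)
print(answer, collatz_steps(answer))

# Confirm that nothing below the answer crosses the threshold.
assert all(collatz_steps(k) <= 100 for k in range(1, answer))
```

The final assertion checks the "first" part of the puzzle explicitly instead of trusting the loop.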
