Skip to content

Instantly share code, notes, and snippets.

@VictorTaelin
Last active April 17, 2025 08:38
Show Gist options
  • Save VictorTaelin/45440a737e47b872d7505c6cda27b6aa to your computer and use it in GitHub Desktop.
Save VictorTaelin/45440a737e47b872d7505c6cda27b6aa to your computer and use it in GitHub Desktop.
INVERT A BINARY TREE - $10k AI REASONING CHALLENGE (v2)

THE PROBLEM

🌲 Invert a binary tree! 🌲

Except with 3 catches:

  1. It must invert the keys ("bit-reversal permutation")
  2. It must be a dependency-free, pure recursive function
  3. It must have type Bit -> Tree -> Tree (i.e., a direct recursion with max 1 bit state)

WHY THIS PROBLEM?

  • It is somehow NOT on the internet. (AFAIK)
  • Humans can solve it. (I've done it in ~1h.)
  • It requires reasoning. (My head hurts!)
  • The solution is simple. (7 lines of code!)
  • Obvious pre-requisite to automate CS research.
  • Honestly, it would make me believe I'll be automated.

THE CHALLENGE

I claim no AI will EVER solve this problem. If you prove me wrong, HOC will grant you $10k!

RULES

  • You must give it an approved prompt, nothing else.
  • It must output a correct solution, passing all tests.
  • You can use any software or AI model.
  • You can let it "think" for as long as you want.
  • You can propose a new prompt, as long as:
    • It imposes equivalent restrictions.
    • It clearly doesn't help the AI.
    • Up to 1K tokens, all included.
  • Common sense applies.

APPROVED PROMPTS

Agda Version

{-# OPTIONS --no-termination-check #-}

-- Let Tree be a Perfect Binary Tree:

data Nat : Set where
  Z : Nat
  S : Nat  Nat
{-# BUILTIN NATURAL Nat #-}

data Bit : Set where
  O : Bit
  I : Bit

data Tree (A : Set) : Nat  Set where
  N :  {d}  Tree A d  Tree A d  Tree A (S d)
  L : Nat  Tree A Z

-- Your goal is to implement an 'invert' function that performs a bit-reversal
-- permutation on a Tree, respecting the following limitations:
-- 1. You can NOT define or use any function other than 'invert'.
-- 2. You can NOT use any type not defined above (Nat, Bit and Tree).
-- 3. You can NOT use loops (but you can call 'invert' recursively).
-- 4. You can NOT use mutability. It must be a pure Agda function.
-- 5. You can use 1 bit of state (as an extra argument).
-- 6. You can use pattern-matching and constructors freely.
-- 
-- Example:
-- input  = (N(N(N(L 0)(L 1))(N(L 2)(L 3)))(N(N(L 4)(L 5))(N(L 6)(L 7))))
-- output = (N(N(N(L 0)(L 4))(N(L 2)(L 6)))(N(N(L 1)(L 5))(N(L 3)(L 7))))
-- Because that's the bit-reversal permutation of the original tree.
-- 
-- Now, complete the program below, with a valid implementation of 'invert':
invert :  {A d}  Bit  Tree A d  Tree A d

TypeScript Version

type Nat     = number;
type Bit     = false | true;
type Tree<A> = [Tree<A>, Tree<A>] | Nat;

// Your goal is to implement an 'invert' function that performs a bit-reversal
// permutation on a Tree, respecting the following limitations:
// 1. You can NOT define or use any function other than 'invert'.
// 2. You can NOT use any type not defined above (Nat, Bit and Tree).
// 3. You can NOT use loops (but you can call 'invert' recursively).
// 4. You can NOT use mutability. It must be a pure function.
// 5. You can NOT use primitive JS operators or functions.
// 6. You can use 1 bit of state (as an extra argument).
// 7. You can only use the operations allowed below.
// 
// Operations allowed:
// - Destructing (`const [a,b] = value`)
// - Variables (`const x = value`)
// - Branching (`if (x) { ... } else { ... }`)
// - Recursion (`invert(_, _)')
// - `Array.isArray`
// 
// All other operations are not allowed.
// 
// Example:
// input  = [[[[0,1],[2,3]],[[4,5],[6,7]]]]
// output = [[[[0,4],[2,6]],[[1,5],[3,7]]]]
// Because that's the bit-reversal permutation of the original tree.
// 
// Now, complete the program below, with a valid implementation of 'invert':
function invert<A>(bit: Bit, tree: Tree<A>): Tree<A> {
  ...
}

// A test:
const tree: Tree<Nat> = [[[[0,1],[2,3]],[[4,5],[6,7]]],[[[8,9],[10,11]],[[12,13],[14,15]]]];
console.log(JSON.stringify(invert(true, tree)));

CONCLUSION

✨ If it can't invert a tree, it won't solve P=NP. ✨

@VictorTaelin
Copy link
Author

As per the original post, modifications are allowed, as long as:

It clearly doesn't help the AI.

Given this is such a simple problem, telling it to use two recursive functions is giving it half of the solution, which is a quite significant help to the AI. I'll not consider a prompt that instructs it part of the algorithm as resolving this challenge.

That said, I do think this challenge will be "beaten" by trying enough prompts until one seed gives the right answer, and I'll gladly grant $10k for a solution that abides to the rules (I'm taking this as the cost of truth!!). But I think that'd just reinforce my point, as, ultimately, the solution was brute-force, not "reasoning". (This function can be found by symbolic proof search, with no LLMs whatsoever!)

@VictorTaelin
Copy link
Author

VictorTaelin commented Oct 14, 2024

Also, yes, this will obviously be answered correctly by future AIs, since they will be trained on this challenge :P but, once that happens (and the $10k is granted), I hope this helps people question whether there is, indeed, a key capacity which LLMs fundamentally lack. This function is ~10000x easier than proving the Poincare Conjecture, so, if the current gen AIs, produced with $100m training runs, can't do it, perhaps they aren't going to meaningfully contribute to research, as many seem to believe.

@msukiasyan
Copy link

Clarifying that the binary flag allows two unrelated functions to be defined is certainly some help but I don't think that amounts to significant help, at least from a human perspective. In fact, I'd guess you've arrived at this formulation by coming up with a two-function solution first, then merging them in one too. I imagine the challenge of finding a one-function single-input solution would stand longer.

Either way, it doesn't appear that Sonnet is using externalized "reasoning" of the type you're looking for when it's suggesting these solutions. It took me a good 15-20 min initially to figure out myself what solution you had in mind and I just don't think the current models can "reason" well enough and/or for long enough sustained periods to find the solution through "reasoning". I don't see any conceptual obstacle to future o1-style models getting there though.

@boutell
Copy link

boutell commented Oct 14, 2024

Meanwhile, can we get an edit to the prompt clarifying that the initial value of the bit flag will be "true" in the initial invocation all of the time?

@HamsterofDeath
Copy link

Except with 3 catches:

It must invert the keys ("bit-reversal permutation")
It must be a dependency-free, pure recursive function
It must have type Bit -> Tree -> Tree (i.e., a direct recursion with max 1 bit state)

Can someone explain these?

@boutell
Copy link

boutell commented Oct 17, 2024

@HamsterofDeath sure. Perhaps in the process we'll both get clarification from others who have seen the inside of a comp sci classroom in the last 30+ years.

"It must invert the keys ('bit-reversal permutation')"

First: we're inverting the keys, e.g. the indexes if you count from left to right over all the values, not the values themselves. It's confusing because in this input, given with the problem:

[[[[0,1],[2,3]],[[4,5],[6,7]]],[[[8,9],[10,11]],[[12,13],[14,15]]]];

The keys and the values just happen to be the same.

But in this input, which is just as valid, they are different:

[[3,2],[1,0]]

Here the keys, aka indexes, are (because there are 4 values):

0,1,2,3

And the values are:

3,2,1,0

Second: "inverting" means flipping each binary bit. So, converting each value to binary, inverting the bits, and going back to decimal we have:

0 -> 00 -> 11 -> 3
1 -> 01 -> 10 -> 2
2 -> 10 -> 01 -> 1
3 -> 11 -> 00 -> 0

Because we are "inverting the keys," we must move the value at index 0 to index 3, the value at index 1 to index 2, the value at index 2 to index 1, and the value at index 3 to index 0. So the end result in my tiny example would be:

[[0,1],[2,3]]

(It is only a coincidence that these values are consecutive, don't read anything into that)

"It must be a dependency-free, pure recursive function"

You can't pull in libraries of code from anywhere.

You can't modify any global variables or anything else external to the function itself.

You can only define one function.

That function (this is a free hint the instructions are giving you) is going to have to be recursive — it is going to have to call itself.

"It must have type Bit -> Tree -> Tree (i.e., a direct recursion with max 1 bit state)"

Just going by the TypeScript prompt here: it's a function that takes (bit, tree) and returns a new tree in which the transformation has been done. "bit" is a boolean (true or false), "tree" is an array with two elements either of which could be a number or another, nested array.

Finally, one thing that is not specified:

"The initial value of bit is true"

It seems to me that if the bit parameter is meant to be useful, its initial value on the first call must be known, and in the only example given that mentions the bit parameter it is true. So I think we can assume it starts out as true.

Of course, don't forget the rest of the rules given with the problem, which are quite specific about what we (and/or the bots) can't use.

I hope that this is helpful (and correct).

@VictorTaelin
Copy link
Author

@boutell beautifully explained

@youqad
Copy link

youqad commented Dec 9, 2024

Hi @VictorTaelin, I think the full version of o1 finally got it, on my side! 🎉
https://chatgpt.com/share/675735e2-ae68-800a-9394-6e150f809a69

(also reported here, along with some other failed attempts with other models)

I created a repo to try to automatize the process, but I had no success with claude-3-5-sonnet (including the new version), o1-mini, and even o1-preview.

Unfortunately, I don't have API access to the full o1 yet, so I had to go back to trying by hand (with no system prompt of course, I'm not sure how I can prove that?): it failed on the Haskell prompt but, to my surprise, worked first-try on the Python one!
(and just to be sure, I used a little script to test the proposed solution with Hypothesis, but let me know if I made a mistake)

@TheMagnumOpuss
Copy link

TheMagnumOpuss commented Dec 9, 2024

Hi @VictorTaelin, i believe the o1 pro mode has succeeded, and I think it is following your rules. The prompt used was basically msukiasyan's, but with your correction points taken into account. link: https://chatgpt.com/share/675725a8-1de8-800c-80de-e065b712b867

The model's cutoff is 2023, so this challenge is still new to it.

@lhemerly
Copy link

Using copilot code edit in vscode, main.py:

from typing import List

class Node:
    def __init__(self, key: int, value: int):
        self.key = key
        self.value = value

    def __str__(self):
        return f"{self.key}: {self.value}"

    def __repr__(self):
        return self.__str__()


class Tree:
    def __init__(self, left=None, right=None):
        self.left = left
        self.right = right


def bit_reversal_permutation(n: int) -> List[int]:
    # This function generates a bit-reversal permutation of the given length.
    # @param n: int
    # @return: List[int]
    return []

def invert(tree):
    # This function inverts the tree keys using bit-reversal permutation, but not the values by swapping the keys of the nodes in the tree.
    # @param tree: Tree
    # @return: Tree
    # @example: invert(Tree(Tree(Node(0, 3), Node(1, 2)), Tree(Node(2, 1), Node(3, 0))) -> Tree(Tree(Node(0, 0), Node(1, 1)), Tree(Node(2, 2), Node(3, 3)))
    return tree


def test():
    input = Tree(Tree(Node(0, 3), Node(1, 2)), Tree(Node(2, 1), Node(3, 0)))

    expected_output = Tree(Tree(Node(0, 0), Node(1, 1)), Tree(Node(2, 2), Node(3, 3)))

    print(invert(input) == expected_output)

Prompt in chat:

You should read the code, understand what is needed to pass the test and implement whatever is necessary to make it pass, but it should also pass any general case related. You are allowed to make edits to existing code if needed

Tested on Claude 3.5 Sonnet Preview and GPT4o. Both models implement eq in the classes, implement the bit_reversal_permutation and correctly implement the invert function.

I also tested it on o1 using the chatgpt interface with a bit different prompt (same "code prompt"), you can see it here:
o1 solves bit-reversal inversion of binary tree

A few notes:
I was tinkering with it and forgot to readd a few type hints. My bad, but it worked anyway.
The original example input and output was confusing the LLM a lot. Most thinking was trying to understand it. By migrating to the simpler example with defined classes it performed much better.
It struggled when I did not create the placeholder bit_reversal_permutation function.

My 2 cents:
Even though the problem definition may seem trivial for a CS Grad, I usually find that LLMs perform better when you define the problem as a non-specialist would. By "dumbing down" a few concepts you eventually contribute to attention and CoT. I've seen similar effects on logic benchmarks.

@VictorTaelin
Copy link
Author

Hey, just letting you know I'm very busy right now so I can't read the new replies yet. I think o3 will be able to solve it, so, around its release I'll find some time to read this thread and whoever was the first to point to an AI capable of inventing this function without human intervention/prompting (i.e., proving me wrong) will be granted 10k (it could be as simple as "hey o3 does it (link to a chat without system prompt)". I also think a reasoning AI in a loop with a function calling environment would easily be able to tackle this too, and I'd love to see an attempt like that. In any case around o3's release I'll check this thread again

@akshit-1729
Copy link

hey kimi 1.5 does this : (Click the link to view conversation with Kimi AI Assistant https://kimi.ai/share/cuciunv60ra4c2kt60fg)

the following is the code that it gives (the test cases are added by me, it just gave the invert function as wanted.

def invert(ary, level=0):
if isinstance(ary, int) or len(ary) == 0:
return []
if level == 0:
if len(ary) == 1:
return ary.copy()
mid = len(ary) // 2
left = invert(ary[:mid], level)
right = invert(ary[mid:], level)
return invert([left, right], level=1)
else:
left, right = ary[0], ary[1]
if not left and not right:
return []
head_left = [] if not left else [left[0]]
head_right = [] if not right else [right[0]]
new_left = left[1:] if len(left) > 1 else []
new_right = right[1:] if len(right) > 1 else []
return head_left + head_right + invert([new_left, new_right], level=1)

input1 = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
output1 = invert(input1)
print(output1)

Test with the second example

input2 = [1,1,0, 6, 9, 'D', 'A', 4, 5, 'E', 0, 5, 3, 'F', 8, 8]
output2 = invert(input2)
print(output2)

input3 = [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32]
output3 = invert(input3)
print(output3)

input4 = ['A', 7, 'E', 9, 9, 6, 'F', 'B', 6, 3, 3, 3, 'D', 8, 2, 'E']
output4 = invert(input4)
print(output4)

@TheMagnumOpuss
Copy link

On the first attempt, and I believe o3-mini-high managed to solve it. Here is the link to the conversation. (Sorry for the language, pt-br, but just ask it to translate the text.)

Link: https://chatgpt.com/share/679d3a17-2f14-800c-b423-059104cc1fd5

@afafw
Copy link

afafw commented Feb 1, 2025

o3-mini & o3-mini-high; Vanila prompt, without any additonal system prompt. Ensured everything else was disabled including memory and stuff.
Haven't checked the code yet:
o3-mini Link: https://chatgpt.com/share/679d649f-8298-8002-8bba-d1a25ca68c4d
o3-mini-high: https://chatgpt.com/share/679d658b-2a20-8002-99f3-29275b734182

@afafw
Copy link

afafw commented Feb 1, 2025

o3-mini & o3-mini-high; Vanila prompt, without any additonal system prompt. Ensured everything else was disabled including memory and stuff. Haven't checked the code yet: o3-mini Link: https://chatgpt.com/share/679d649f-8298-8002-8bba-d1a25ca68c4d o3-mini-high: https://chatgpt.com/share/679d658b-2a20-8002-99f3-29275b734182

Update: I think both code generated are WRONG.

Kind of amazing to see o3 still struggle at this.

Will give it a few more attempts.

@afafw
Copy link

afafw commented Feb 1, 2025

Update Update: Tried a few times with the prompt provided. Kimi/R1/O3-mini/O3-mini-high still aren't able to resolve the issue.
Where Kimi/R1 showed similar pattern of which Qwen would solve the problem(which is still wrong).
O3-mini/O3-mini-high. Although having a different way of doing it, but still. It is wrong.
tested on case:
[[[[0,1],[2,3]],[[4,5],[6,7]]],[[[8,9],[10,11]],[[12,13],[14,15]]]]
to:
[[[[0,8],[4,12]],[[2,10],[6,14]]],[[[1,9],[5,13]],[[3,11],[7,15]]]]

@TheMagnumOpuss
Copy link

Finally, we have o3, and here it is in all its glory:
https://chatgpt.com/share/67fff234-1870-800c-9856-15f61dad609a

@TheMagnumOpuss
Copy link

Now the o4-mini, but it searched the internet.
https://chatgpt.com/share/67fff5cf-83ac-800c-ab4e-0a47301f3cac

@TheMagnumOpuss
Copy link

Now the o4-mini-high, and without internet (apparently):
https://chatgpt.com/share/67fff8a3-1154-800c-8dad-ef4aaf13636a

@youqad
Copy link

youqad commented Apr 17, 2025

This challenge is now solved without resorting to the web chat interface! https://gist.github.com/youqad/02a36419cbc4725601bffc05f14b3947

On 9 December, the full o1 solved it (first solution publicly submitted with no system prompt, to my knowledge) on the web interface: https://gist.github.com/VictorTaelin/45440a737e47b872d7505c6cda27b6aa?permalink_comment_id=5328789#gistcomment-5328789

And since then, other people have proposed solutions, but also on the web interface (which doesn't allow us to check the system prompt, unfortunately).

Now, here is a fully working and verifiable solution with o3, based on my repository (https://github.com/youqad/bit-reversal-trees):

📦 Published to https://wandb.ai/ox/bit-reversal-trees/weave/objects/0196425c-0650-7df2-8d57-05d272f6d111/versions/KOTE3EQkUWYSdy046vDxHGbma9A8gkuxXUHNpZ3tHH0 and visible on WandB Weave with Call ID: 0196425c-0650-7df2-8d57-05d272f6d111 🚀

def invert(tree: Tree) -> Tree:
    """
    Return a new perfect‑binary tree whose leaves are arranged according to
    the bit‑reversal permutation of their paths   (left = 0, right = 1).

    Mandatory constraints – all satisfied:
        1.  Only the inner   invert_helper(tree, flag)   is used.
        2.  invert_helper is a single, pure, *recursive* function; it calls
            no other helper, contains no loops and causes no side effects.
        3.  The public call is   invert_helper(tree, True) .
    """

    def invert_helper(t: Tree, flag: bool) -> Tree:
        # -------- leaf ---------------------------------------------------- #
        if isinstance(t, Leaf):
            return t

        # ------------------------------------------------------------------ #
        #  flag == True   : full bit‑reversal of  the  subtree  't'.
        #  flag == False  : *weave* two already bit‑reversed sub‑trees so
        #                   that their leaves are perfectly interleaved
        #                   (LSB that has just become the MSB).
        # ------------------------------------------------------------------ #
        if flag:                                            # bit‑reversal
            left_rev  = invert_helper(t.left,  True)        # reverse both halves
            right_rev = invert_helper(t.right, True)
            # After each half has been bit‑reversed we only have to
            # weave them together; the weaving is performed by a single
            # recursive call in the   flag == False   mode.
            return invert_helper(Node(left_rev, right_rev), False)

        # --------------------- weaving mode ------------------------------- #
        #  Children are leaves → nothing more to weave; just join them.
        if isinstance(t.left, Leaf):                        # depth == 1
            return Node(t.left, t.right)

        #  Non‑trivial case – recurse pairwise on both sides.
        return Node(
            invert_helper(Node(t.left.left,  t.right.left),  False),
            invert_helper(Node(t.left.right, t.right.right), False)
        )

    # single call required by the statement
    return invert_helper(tree, True)

It passes all the Hypothesis tests in https://github.com/youqad/bit-reversal-trees/tree/main/python, so I'm pretty positive it works!
(You can also double-check this with this script)

@youqad
Copy link

youqad commented Apr 17, 2025

Finally, we have o3, and here it is in all its glory: https://chatgpt.com/share/67fff234-1870-800c-9856-15f61dad609a

This doesn't work:
Original : [0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15]
Expected : [0,8,4,12,2,10,6,14,1,9,5,13,3,11,7,15]
But your solution gave : [0,8,2,10,4,12,6,14,1,9,3,11,5,13,7,15]

@youqad
Copy link

youqad commented Apr 17, 2025

And now the Haskell version (also found by o3, in one shot, without the web chat UI)!
https://gist.github.com/youqad/8a7f91650a2da951f4f39455cdccbbe8

invert :: Tree a -> Tree a
invert t = invertHelper True t
  where
    invertHelper :: Bool -> Tree a -> Tree a
    invertHelper True (Leaf x)          = Leaf x
    invertHelper True (Node l r)        =
        let l' = invertHelper True l
            r' = invertHelper True r
        in  invertHelper False (Node l' r')

    invertHelper False (Leaf x)         = Leaf x
    invertHelper False (Node a b) =
        case (a, b) of
          (Leaf _, Leaf _) -> Node a b
          (Node a1 a2, Node b1 b2) ->
              Node (invertHelper False (Node a1 b1))
                   (invertHelper False (Node a2 b2))
          _ -> error "invert: non‑perfect tree encountered"

Fully tested with QuickCheck (cf. https://github.com/youqad/bit-reversal-trees/blob/main/test/DynamicSpec.hs), so the challenge is now solved for Python and Haskell.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment