codegate 2020 teaser / verifier

This is the Write-Up from the CTF codegate 2020 teaser.

concept

This problem analyzes, predicts and executes the user's expression. Various expressions are supported, starting from arithmatic operations to things such as loops, conditionals, and randoms. User can create a variable, assign a value to it, or print its value. The After parsing the input without syntax errors, it predicts the script's behavior by simulating it, preventing malicous behaviours such as reference to an undefined variable, or print of a negative number, which is going to be our main topic. The behaviours of variables are predicted using a domain, which is really a domain of all the numbers that a variable can be.

For example, assigning the random from 1 to 10 will make the variable's domain to [1,10]. Using the ternary operator of x > 5 to x = [1, 10] will split this domain into two, a True domain of x = [6, 10] and a False domain of x = [1, 5]. And adding two domains [1,3] and [2,5] will produce the largest possible domain [3,8].

problem

The problem is compromised of five different python scripts. These scripts use the Python Lex-Yacc library to parse the user input. The prob.py script contains the interactive part, parsing the input and calling a_interp and interp on it. These functions are to be discussed later. The domain.py contains the Interval class, which is used to express the domain of a variable. The lex.py contains the token data for the ply.lex. The parser.py defines the operations of the variables.

The ast.py defines all the operations of the expressions,and prediction functions of it. The prediction functions are called a_interp and the real execution functions are called interp. The user-defined variables are stored as a dictionary, and it references and modify and extend it as needed.

goal

The goal of this problem can be found easily. In theast.py, this piece of code is the juicy part.

class Print(Comm):
    def __init__(self, expr):
        self.expr = expr

    def a_interp(self, env):
        a_val = self.expr.a_interp(env)
        if a_val.infimum < 0:
            raise ValueError("print domain error")
        return env

    def interp(self, env):
        value = self.expr.interp(env)

        if value < 0:
            with open('./flag') as f:
                print(f.read())
        print(value)
        return env

As you can see, the Print.interp method Opens, Reads, and then Writes the flag for you! However, the Print.a_interp method checks the domain of the value that can be printed. And if any negative number can be printed, it raises a ValueError and halts the script.

Thus, the goal should be clear. We have to make a expression that prints a negative number, and that isn't expected to print a negative number to the scripts' prediction.

solve

To achieve the above goal, we focused on finding the expression that is actually different in the prediction time and the runtime. Then, we found a strange point While class on the ast.py.

class While(Comm):
    def __init__(self, cond, comm):
        self.cond = cond
        self.comm = comm

    def a_interp(self, env):
        init_env = deepcopy(env)

        for i in range(3):
            tenv, _ = self.cond.a_interp(env)
            if tenv is not None:
                tenv = self.comm.a_interp(tenv)
            env = env_join(env, tenv)

        tenv, _ = self.cond.a_interp(env)
        if tenv is not None:
            tenv = self.comm.a_interp(tenv)
        env = env_widen(env, tenv)

        tenv, _ = self.cond.a_interp(env)
        if tenv is not None:
            tenv = self.comm.a_interp(tenv)
        env = env_join(init_env, tenv)
        _, fenv = self.cond.a_interp(env)

        if fenv is None:
            raise RuntimeError("loop analysis error")

        return fenv

    def interp(self, env):
        global loop_count
        cond = self.cond.interp(env)

        while cond:
            env = self.comm.interp(env)
            cond = self.cond.interp(env)
            loop_count += 1
            if loop_count > 10000:
                raise RuntimeError("infinite loop error")
        loop_count = 0

        return env

There is a fixed number 3 in the While.a_interp, which is basicly irrelevant to the variables nor the loop_count. It checks only for the first three repeats of the loop. So, we made a simple expression that prints a negative number at the 4th repeat.

i = 2;
[
    i > -5 
    {
        !i; # print
         i = i - 1
    }
]

Simple one that you can test here. i = 2 ; [i > -5 {!i; i = i - 1 }]

However, this was not possible becuase the following env = env_widen(env, tenv) caused the Print Domain Error again. The env_widen() is processed in the Interval.widen() of the domain.py.

def widen(self, other):
    w_infimum = self.infimum if self.infimum <= other.infimum else -inf
    w_supremum = self.supremum if other.supremum <= self.supremum else inf
    return Interval(w_infimum, w_supremum)

In short, if the value of a variable decreases in the 4th repeat, the script assumes that the value of this variable can decrease further, and expands it's domain to -inf. This has to be bypassed. So we wrote an expression that makes a hill, which increases till the 4th repeat, and decreases afterwards. The domain of the variable i will be -inf in the prediction, but it doesn't matter as i is not a number which gets printed.

i = 3;
b = 0;
[
    i < 3
    {
        i == -5 ?
            {
                !b #print
            } :
            {
                .  #do nothing
            }
    };
    b = b + i;
    i = i - 1
]

Simple one that you can test here. i = 3 ; b = 0 ; [ i <= 3 { i == -5 ? {!b} : { . } ; b = b + i ; i = i - 1 } ]

However, This didn't work either. Now, the script was emitting the loop analysis error now. The problem is happening after the env = env_join(init_env, tenv). It joins the domains of each variables from the state after the prediction and the initial state of the loop. Then, _, fenv = self.cond.a_interp(env) checks if there is a case where the loop ends. In that last check, our expression has the domains of the each variables, {'b': [-inf, inf], 'i': [-inf, 3]}. So the expression above can be easily modified to give such a range that doesn't satisfy the condition. Take a look at the following expression.

i = 3;
b = 0;
[
    i > -10 #changed!
    {
        i == -5 ? 
            {
                !b #print
            } :
            {
                .  #do nothing
            }
    };
    b = b + i;
    i = i - 1
]

Simple one that you can test here. i = 3 ; b = 0 ; [ i > -10 { i == -5 ? {!b} : { . } ; b = b + i ; i = i - 1 } ]
This expression should successfully print out -4, thus granting the long waited flag. The payload we succeed for the first time is a bit more complicated, but it's concept were the same.

Simple one that you can test here. i = 0 ; b = 1 ; [ i < 13 { i < 4 ? {b = b + 1} : { i > 9 ? { b = b + 1} : { b= b - 1 } } ; !b; i = i + 1 } ]

Stfort52/verifier.md

codegate 2020 teaser / verifier

concept

problem

goal

solve