@EvanMcBroom
Last active November 3, 2024 03:42
Encrypting Strings at Compile Time

Thank you to SpecterOps for supporting this research and to Duane and Matt for proofreading and editing! Crossposted on the SpecterOps Blog.

TLDR: You may use this header file for reliable compile time string encryption without needing any additional dependencies.

Programmers of DRM software, security products, or other sensitive code bases are commonly required to minimize the number of human-readable strings in binary output files. The goal of the minimization is to hinder others from reverse engineering their proprietary technology.

Common approaches that are taken to meet this requirement often add an additional maintenance burden to the developer and are prone to error. These approaches will be presented along with their drawbacks. An alternative solution will also be presented which targets the following goals:

  • A minimalistic implementation to ease integration into projects
  • A simple usage design to avoid programmer error
  • Built-in randomization to hinder automated string recovery

Common Approaches

Separate utilities are commonly built to precompute obfuscated strings for use in source code. Such tools will generate a header file or other output that must be manually added to and referenced in projects. The use of these tools may be automated with a toolchain but they will not integrate well with IDEs and they are tedious to maintain as more strings are added. They also tend to obfuscate strings in a uniform way that can be easily identified and reversed in an automated fashion.

In a similar manner, utilities are also commonly built to precompute string hashes for use in comparisons. One of the earliest examples of this is documented in "Win32 Assembly Components" [1]. These tools are also tedious to maintain as more strings are added, but they can now be completely eliminated by hashing strings at compile time, as described in a previous post.

Lastly, some development teams attempt to remove the use of strings entirely. Needless to say, this is an impossible standard to maintain for any large or long-lived project with any amount of developer turnover.

An Alternative Solution

Modern C++ features may be used to encrypt strings at compile time, which can greatly reduce the maintenance overhead for developers. There are several libraries that claim to support this use case. Unfortunately, they rarely work in practice. The few that do require the Boost libraries, which may not be an option due to development constraints [2]. So we will build our own!

We will first make a basic function for compile time string encryption which we can later improve upon. The below crypt function will convert a string literal into an encrypted blob and the make_string macro wraps crypt to ensure that it is used correctly to be evaluated at compile time.

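#include <cstddef>
#include <cstdint>
#include <string>

// Holds the encrypted bytes of a string literal, terminator included
template<typename T, size_t N>
struct encrypted {
    T data[N];
};

// XOR is its own inverse, so crypt both encrypts and decrypts
template<size_t N>
constexpr auto crypt(const char(&input)[N]) {
    encrypted<char, N> blob{};
    for (uint32_t index{ 0 }; index < N; index++) {
        blob.data[index] = input[index] ^ 'A';
    }
    return blob;
}

// The constexpr variable forces the first crypt call to run at compile time
#define make_string(STRING) ([&] {            \
    constexpr auto _{ crypt(STRING) };        \
    return std::string{ crypt(_.data).data }; \
}())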

The make_string macro also expands to a single, immediately invoked lambda expression, so it can be used anywhere an expression is allowed, such as variable initialization and argument passing.

// Placeholder for any function that accepts a std::string
void func(const std::string&) {}

int main() {
    std::string string1{ make_string("String 1") };
    std::string string2 = make_string("String 2");
    func(make_string("String 3"));
}
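One way to check that the literal is stored in encrypted form is to inspect the blob in a static_assert. The sketch below repeats crypt from above with the headers it needs; the names blob and decrypt_demo are hypothetical and are not part of the original header.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

template<typename T, std::size_t N>
struct encrypted {
    T data[N];
};

template<std::size_t N>
constexpr auto crypt(const char(&input)[N]) {
    encrypted<char, N> blob{};
    for (std::size_t index{ 0 }; index < N; index++) {
        blob.data[index] = static_cast<char>(input[index] ^ 'A');
    }
    return blob;
}

// Evaluated by the compiler: only the XORed bytes exist in this object.
constexpr auto blob{ crypt("String 1") };
static_assert(blob.data[0] == ('S' ^ 'A'), "first byte is stored encrypted");
static_assert(blob.data[8] == 'A', "the terminator is encrypted as well");

// Hypothetical helper: apply crypt a second time to recover the text.
inline std::string decrypt_demo() {
    const std::string text{ crypt(blob.data).data };
    assert(text.size() == 8);
    return text;
}
```

If either static_assert failed, the build would stop, which makes this a cheap regression check to keep next to the macro.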

Improving the Solution

The previous solution would be easy to integrate and use in projects, but it would also be easy for a reverse engineer to undo. It is essentially an XOR cipher with a static key. Once the key is identified, the entire program can be XORed with it and the original strings can then be recovered using the humble strings utility.
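To make the weakness concrete, here is a minimal sketch of the recovery step, assuming the key byte has already been identified. The helper xor_strings is hypothetical; it only imitates the idea behind the strings utility and is not how that tool is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: XOR every byte with `key` and collect runs of
// printable ASCII that are at least `min_length` characters long.
inline std::vector<std::string> xor_strings(const std::string& bytes,
                                            const unsigned char key,
                                            const std::size_t min_length = 4) {
    assert(min_length > 0);
    std::vector<std::string> found;
    std::string run;
    for (const char byte : bytes) {
        const auto plain = static_cast<unsigned char>(static_cast<unsigned char>(byte) ^ key);
        if (plain >= 0x20 && plain <= 0x7e) {
            run.push_back(static_cast<char>(plain));
        } else {
            // A non-printable byte ends the current run
            if (run.size() >= min_length) found.push_back(run);
            run.clear();
        }
    }
    if (run.size() >= min_length) found.push_back(run);
    return found;
}
```

Applied to the nine bytes that crypt stores for "String 1" (12 35 33 28 2f 26 61 70 41 in hexadecimal), a key of 'A' yields the original text. An analyst who does not know the key could simply try all 256 values.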

Replacing the static key with a random bit stream would prevent this issue. We will now make a set of functions for generating such a stream at compile time. We will use Park-Miller's "Multiplicative Linear Congruential Generator" because it is simple to implement [3].

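constexpr uint32_t modulus() {
    return 0x7fffffff;
}

constexpr uint32_t prng(const uint32_t input) {
    // Multiply in 64 bits so the product cannot wrap before the reduction
    return static_cast<uint32_t>((static_cast<uint64_t>(input) * 48271) % modulus());
}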
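As a sanity check, the generator can be compared against std::minstd_rand from the standard library, which uses the same multiplier (48271) and modulus (2^31 - 1). The sketch below performs the multiplication in 64 bits so that the product cannot wrap, which is what lets the two sequences agree; matches_minstd is a hypothetical helper and not part of the header.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

constexpr std::uint32_t modulus() {
    return 0x7fffffff;
}

// Widen to 64 bits before multiplying so the product cannot wrap.
constexpr std::uint32_t prng(const std::uint32_t input) {
    return static_cast<std::uint32_t>((static_cast<std::uint64_t>(input) * 48271) % modulus());
}

static_assert(prng(1) == 48271, "one step from a state of 1");

// Hypothetical check: compare the first `count` outputs with std::minstd_rand.
inline bool matches_minstd(const std::uint32_t start, const int count) {
    assert(start > 0 && start < modulus());
    std::minstd_rand reference(static_cast<std::minstd_rand::result_type>(start));
    std::uint32_t state{ start };
    for (int i{ 0 }; i < count; i++) {
        state = prng(state);
        if (state != reference()) return false;
    }
    return true;
}
```

A state of zero would stay at zero forever, so the starting value should always lie strictly between zero and the modulus.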

We will also need a pseudorandom value to use as the initial input to prng. Admittedly, it is not easy to generate such a value at compile time but it can be accomplished using standard predefined macros such as __FILE__ and __LINE__. The below seed function can take these macros as input and reduce them to a single pseudorandom value to use with prng.

Note: These macros are defined by the ANSI C standard, and by the C++ standard as well, so they are supported by all mainstream compilers. If you use a non-standard macro for entropy, your mileage may vary.

template<size_t N>
constexpr uint32_t seed(const char(&entropy)[N], const uint32_t iv = 0) {
    auto value{ iv };
    for (size_t i{ 0 }; i < N; i++) {
        // Xor 1st byte of seed with input byte
        value = (value & (~0u << 8)) | ((value & 0xFF) ^ static_cast<uint8_t>(entropy[i]));
        // Rotate left 1 byte
        value = value << 8 | value >> ((sizeof(value) * 8) - 8);
    }
    // The seed must be nonzero and less than the modulus, so shrink it
    // until doubling it and setting the low bit stays in range
    while (value >= (modulus() >> 1)) value = value >> 1;
    return value << 1 | 1;
}
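The properties the generator relies on can be checked at compile time. The sketch below is a compact restatement of seed, not the header's exact code: it assumes the intent is a nonzero result below the modulus and bounds the value accordingly. The file name "a.cpp", the line numbers, and check_seed_properties are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::uint32_t modulus() {
    return 0x7fffffff;
}

template<std::size_t N>
constexpr std::uint32_t seed(const char(&entropy)[N], const std::uint32_t iv = 0) {
    std::uint32_t value{ iv };
    for (std::size_t i{ 0 }; i < N; i++) {
        // XOR the low byte with the input byte, then rotate left one byte
        value ^= static_cast<std::uint8_t>(entropy[i]);
        value = value << 8 | value >> 24;
    }
    // Shrink until doubling and setting the low bit stays below the modulus
    while (value >= (modulus() >> 1)) value >>= 1;
    return value << 1 | 1;
}

// Hypothetical self-check using a made-up file name and line numbers.
inline void check_seed_properties() {
    constexpr auto a{ seed("a.cpp", 10) };
    constexpr auto b{ seed("a.cpp", 11) };
    static_assert(a != b, "these two call sites get different seeds");
    assert(a % 2 == 1 && a < modulus());
    assert(b % 2 == 1 && b < modulus());
}
```

Distinct call sites usually produce distinct seeds, but this is not guaranteed: the final shift discards low bits, so two inputs that differ only in those bits can collide.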

The last thing that is required is to update our original crypt function and make_string macro to use our random bit stream generator.

template<typename T, size_t N>
struct encrypted {
    uint32_t seed; // Same type as the value returned by seed()
    T data[N];
};

template<size_t N>
constexpr auto crypt(const char(&input)[N], const uint32_t seed = 0) {
    encrypted<char, N> blob{};
    blob.seed = seed;
    for (uint32_t index{ 0 }, stream{ seed }; index < N; index++) {
        // Only the low byte of the stream is mixed into each character
        blob.data[index] = static_cast<char>(input[index] ^ (stream & 0xFF));
        stream = prng(stream);
    }
    return blob;
}

#define make_string(STRING) ([&] {                               \
    constexpr auto _{ crypt(STRING, seed(__FILE__, __LINE__)) }; \
    return std::string{ crypt(_.data, _.seed).data };            \
}())

Note: If you are using Visual Studio, you will need to disable the "Edit and Continue" feature; otherwise, the __LINE__ macro will not be usable in a constant expression.
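For reference, the pieces above can be combined into one translation unit. This is a sketch under a few assumptions rather than the published header: prng multiplies in 64 bits, seed is written compactly and bounded below the modulus, and the names greeting and sample are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

constexpr std::uint32_t modulus() {
    return 0x7fffffff;
}

constexpr std::uint32_t prng(const std::uint32_t input) {
    return static_cast<std::uint32_t>((static_cast<std::uint64_t>(input) * 48271) % modulus());
}

template<std::size_t N>
constexpr std::uint32_t seed(const char(&entropy)[N], const std::uint32_t iv = 0) {
    std::uint32_t value{ iv };
    for (std::size_t i{ 0 }; i < N; i++) {
        value ^= static_cast<std::uint8_t>(entropy[i]);
        value = value << 8 | value >> 24;
    }
    while (value >= (modulus() >> 1)) value >>= 1;
    return value << 1 | 1;
}

template<typename T, std::size_t N>
struct encrypted {
    std::uint32_t seed;
    T data[N];
};

// Encrypts or decrypts: the same seed always produces the same key stream.
template<std::size_t N>
constexpr auto crypt(const char(&input)[N], const std::uint32_t seed = 0) {
    encrypted<char, N> blob{};
    blob.seed = seed;
    std::uint32_t stream{ seed };
    for (std::size_t index{ 0 }; index < N; index++) {
        blob.data[index] = static_cast<char>(input[index] ^ (stream & 0xFF));
        stream = prng(stream);
    }
    return blob;
}

#define make_string(STRING) ([&] {                               \
    constexpr auto _{ crypt(STRING, seed(__FILE__, __LINE__)) }; \
    return std::string{ crypt(_.data, _.seed).data };            \
}())

// Hypothetical demo function: the literal is stored encrypted.
inline std::string greeting() {
    return make_string("Hello, world!");
}

// Fixed seed chosen only for illustration.
constexpr auto sample{ crypt("Hello", 12345) };
static_assert(sample.data[0] != 'H', "stored byte differs from the plaintext");
static_assert(crypt(sample.data, sample.seed).data[0] == 'H', "decryption restores it");
```

Calling greeting() returns the original text, while the two static_assert lines confirm that the blob itself is produced by the compiler.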

Incident Response

If you are investigating a potentially malicious executable, it may also contain strings encrypted in such a manner. The provided code will protect strings against any cursory inspection, but they may all be recovered using FLARE's Obfuscated String Solver (FLOSS).

Additional small improvements may be made to prevent automated string recovery using FLOSS as well. One example would be to add exception-based control flow to the decryption routine. In the interest of incident responders, though, these improvements will not be presented and are left as an exercise for the reader.

Conclusion

We now have a solution for encrypting strings at compile time that meets all of our original goals and will work with any mainstream compiler. The full source can be found here. Enjoy! 😄

If you enjoyed reading this work, you may enjoy some of my older posts as well. The first covers compile time hashing functions and the second gives a more user friendly alternative to the programming idiom for declaring strings in position independent code.

References

  1. The Last Stage of Delirium Research Group. Win32 Assembly Components, 2002. http://www.lsd-pl.net/documents/winasm-1.0.1.pdf
  2. Sebastien Andrivet. C++11 Metaprogramming Applied to Software Obfuscation, 2014. https://www.blackhat.com/docs/eu-14/materials/eu-14-Andrivet-C-plus-plus11-Metaprogramming-Applied-To-software-Obfuscation-wp.pdf
  3. Stephen Park and Keith Miller. Random Number Generators, 1988. https://www.firstpr.com.au/dsp/rand31/p1192-park.pdf
@LAGonauta

Hi!
I wonder, what is the license of your header file?
Thank you :)

@EvanMcBroom
Author

Hey @LAGonauta! I updated the file to include the MIT license verbiage. I hope that helps! 🙂

@LAGonauta

It helps, thanks!

@KereKDereK

Can you please explain why the resulting macro is fully evaluated only at run time?

@EvanMcBroom
Author

EvanMcBroom commented Feb 29, 2024

Hey @KereKDereK, good question. Taking the code from the "An Alternative Solution" section as an example, the make_string macro resolves to roughly the following lines of code:

([&] {
    constexpr auto _{ crypt(/* input data*/) };
    return std::string{ crypt(_.data).data };
}())

Line 1 starts a lambda function. Line 2 defines a local variable in the lambda, named _, whose value must be computed at compile time because it is declared constexpr. Line 3 calls crypt again, but its argument is a reference to a member of the local variable _ (i.e., .data). That call does not appear in a context that requires a constant expression, so the compiler is free to emit it as an ordinary call that receives the stack address of _, and the decryption on line 3 then happens at run time.

Now take a snippet from the example main function from that section:

std::string string1{ make_string("String 1") };

Without optimizations enabled, that will compile to roughly the following assembly which matches the above description.

; Initialize local variable '_' with data that was computed at compile time
; The compile time computed data is "String 1\x00" XORed with the letter 'A'
mov         byte ptr [_],12h
mov         byte ptr [rbp+9],35h
mov         byte ptr [rbp+0Ah],33h
mov         byte ptr [rbp+0Bh],28h
mov         byte ptr [rbp+0Ch],2Fh
mov         byte ptr [rbp+0Dh],26h
mov         byte ptr [rbp+0Eh],61h
mov         byte ptr [rbp+0Fh],70h
mov         byte ptr [rbp+10h],41h

; call crypt with the stack address of local variable '_'
lea         rdx,[_]
lea         rcx,[rbp+128h]
call        crypt<9>

@KereKDereK

Thank You for shedding light on this matter. The explanation is very clear!
