@EvanMcBroom
Last active August 27, 2024 06:23
Position Independent Code and String Literals

A common programming idiom when writing position independent code (PIC) is to expand a string literal into its individual characters when instantiating a local variable.

void f() {
    // Example 1: A normal instantiation with a string literal
    char a[]{ "a long string" };

    // Example 2: The PIC idiom for instantiating a string
    char b[]{ 'a', 'b', 'c' };
}

With most compilers, the string literal in Example 1 is stored outside the function body in a read-only data section (for example, .rdata with MSVC or .rodata with GCC and Clang). This prevents you from directly writing the contents of the function into the memory space of another process and executing it there, because the code will still reference the original address of "a long string", which will not be valid in that process.

The PIC idiom avoids this issue because, when you instantiate a string using a list of characters, the compiler generates code that builds the string without referencing a string literal in another section. This keeps everything the code needs to work correctly contained within the body of the function itself.
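One detail to keep in mind when using the idiom: a string literal carries an implicit null terminator, but a list of characters does not, so the terminator has to be written out by hand if the array will be used as a C string. A minimal sketch of the difference follows; the function names are hypothetical and only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: a normal instantiation with a string literal.
std::size_t LiteralLength() {
    char a[]{ "abc" };  // The compiler appends '\0' automatically
    static_assert(sizeof(a) == 4, "three characters plus the terminator");
    return std::strlen(a);
}

// Hypothetical helper: the PIC idiom with an explicit terminator.
std::size_t PicIdiomLength() {
    char b[]{ 'a', 'b', 'c', '\0' };  // '\0' must be listed by hand
    static_assert(sizeof(b) == 4, "three characters plus the terminator");
    assert(b[3] == '\0');
    return std::strlen(b);
}
```

Without the trailing '\0', the array in the second function would hold only three bytes and could not safely be passed to functions that expect a null-terminated string.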

Idiom Improvements

The main downsides to the PIC idiom are that it is tedious to use and that it creates code that can be difficult to read. A macro function is a natural solution for issues like this, and one is now possible with the assistance of constant expressions (described here). Albeit made to solve a different problem, the majority of the code to do this has already been worked out in a post on Stack Overflow.

Here are the first two functions of the original code with small modifications for legibility.

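#include <cstddef>
#include <boost/preprocessor/punctuation/comma_if.hpp>
#include <boost/preprocessor/repetition/repeat.hpp>

// Returns the character at `index` in a string literal; usable in constant expressions.
template <std::size_t Size>
constexpr char Index(char const (&stringLiteral)[Size], std::ptrdiff_t index) {
    return stringLiteral[index];
}

// Expands to Index(data, index), preceded by a comma when index is nonzero.
#define PIC_APPEND_CHAR(_, index, data) \
    BOOST_PP_COMMA_IF(index) Index(data, index)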

✏️ The code uses Boost's header-only Preprocessor library.

The PIC_APPEND_CHAR macro, when given a string literal and an index, will expand to Index(data, index), prefixed by a comma if index is larger than 0. It can be used directly with Boost's BOOST_PP_REPEAT macro to generate a list of characters from a string.

// After the preprocessor runs, this line will be equivalent to:
// char localVariable[]{ Index("abc", 0) , Index("abc", 1) , Index("abc", 2) };
char localVariable[]{ BOOST_PP_REPEAT(3, PIC_APPEND_CHAR, "abc") };

This is almost a complete solution, but each call to Index still needs to be evaluated at compile time. That is possible because Index is a constexpr function and can be used in constant expressions. Using a constexpr function to initialize a constexpr variable is one way to force the compiler to evaluate the function at compile time. The last step is to make a macro that works like the previous example but instantiates a constexpr variable.

#define PIC_STRING(name, size, string) \
    constexpr char name[]{ BOOST_PP_REPEAT(size, PIC_APPEND_CHAR, string) }
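As a quick check of the claim that Index can be used in constant expressions, here is a small sketch that hand-expands what PIC_STRING(picString, 4, "abc") would produce, so it does not need Boost. The static_assert lines only compile if each call is evaluated at compile time. The function name ExpandedByHand is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

template <std::size_t Size>
constexpr char Index(char const (&stringLiteral)[Size], std::ptrdiff_t index) {
    return stringLiteral[index];
}

// static_assert requires a constant expression, so these lines compile only
// because Index can be evaluated at compile time.
static_assert(Index("abc", 0) == 'a', "first character");
static_assert(Index("abc", 3) == '\0', "null terminator");

// Hypothetical function showing the hand-expanded form of
// PIC_STRING(picString, 4, "abc").
std::size_t ExpandedByHand() {
    constexpr char picString[]{
        Index("abc", 0), Index("abc", 1), Index("abc", 2), Index("abc", 3)
    };
    static_assert(sizeof(picString) == 4, "three characters plus the terminator");
    assert(picString[3] == '\0');
    return std::strlen(picString);
}
```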

Full Example

You can use the above macro to define a local variable that the compiler will instantiate the same way it would have if you had used the PIC idiom. The inputs to the macro are the name of the local variable, the size of the string literal (including its null terminator), and the string literal you would like to use.

#include <iostream>

int main() {
    // During compilation this line will be evaluated to:
    // constexpr char picString[]{ 'a', 'b', 'c', '\n', '\0' }
    PIC_STRING(picString, 5, "abc\n");
    std::cout << picString;
}

✏️ In Visual Studio you can hover over the string literal and IntelliSense will display its size.
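Because the preprocessor needs the size as a plain number, the count cannot be computed with sizeof inside BOOST_PP_REPEAT. One way to catch a mistyped size, sketched here as an assumption and not part of the original code, is a static_assert that compares the given size with the sizeof of the literal. PIC_SIZE_CHECK and CheckedSize are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper macro (not from the original post): fails to compile
// when `size` does not match the literal's size, terminator included.
#define PIC_SIZE_CHECK(size, string) \
    static_assert(sizeof(string) == (size), \
                  "size must equal sizeof(string), including the null terminator")

// Hypothetical function showing the check next to a hand-written character list.
std::size_t CheckedSize() {
    PIC_SIZE_CHECK(5, "abc\n");  // 'a', 'b', 'c', '\n', '\0'
    char picString[]{ 'a', 'b', 'c', '\n', '\0' };
    assert(sizeof(picString) == 5);
    return sizeof(picString);
}
```

A check like this could sit on the line before a PIC_STRING use, so that passing 4 instead of 5 for "abc\n" becomes a compile error instead of a silently truncated string.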

Using Visual Studio 2019, this is the assembly the compiler generated to instantiate picString in the full example. It is the same assembly the PIC idiom would have generated, but the code is now easy to write and understand.

Opcodes          Mnemonic  Arguments
48 89 44 24 40   mov       qword ptr [rsp+40h],rax
C6 44 24 24 61   mov       byte ptr [picString],61h
C6 44 24 25 62   mov       byte ptr [rsp+25h],62h
C6 44 24 26 63   mov       byte ptr [rsp+26h],63h
C6 44 24 27 0A   mov       byte ptr [rsp+27h],0Ah
C6 44 24 28 00   mov       byte ptr [rsp+28h],0
48 8D 54 24 24   lea       rdx,[picString]

My friend Alex pointed out that if you would prefer not to use macros at all, there's another answer to the same Stack Overflow question that only uses templates.
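For reference, a template-only approach might look like the following sketch. It is not the code from that Stack Overflow answer; it assumes C++14 for std::index_sequence, and the names PicString, MakePicString, MakePicStringImpl, and TemplateOnlyLength are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <utility>

// Hypothetical wrapper so the character array can be returned by value.
template <std::size_t Size>
struct PicString {
    char data[Size];
};

// Expands the literal into one character per index: { s[0], s[1], ... }.
template <std::size_t Size, std::size_t... Indices>
constexpr PicString<Size> MakePicStringImpl(char const (&s)[Size],
                                            std::index_sequence<Indices...>) {
    return PicString<Size>{ { s[Indices]... } };
}

// The size is deduced from the literal, so no count is passed by hand.
template <std::size_t Size>
constexpr PicString<Size> MakePicString(char const (&s)[Size]) {
    return MakePicStringImpl(s, std::make_index_sequence<Size>{});
}

// Hypothetical usage, mirroring the "abc\n" example.
std::size_t TemplateOnlyLength() {
    constexpr auto picString = MakePicString("abc\n");
    static_assert(sizeof(picString.data) == 5, "four characters plus the terminator");
    static_assert(picString.data[0] == 'a', "evaluated at compile time");
    assert(picString.data[4] == '\0');
    return std::strlen(picString.data);
}
```

Whether the compiler emits the same per-byte stores as the macro version depends on the compiler and its optimization settings, so the generated assembly is worth inspecting before relying on it.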

@wulfgarpro
s/.data/.text/
