The Zygisk API is fairly easy to understand: two main functions, plus four functions set at runtime (pre and post, for both apps and system server). However, as you get deeper into developing a Zygisk module, it becomes significantly harder to know how to make the module more efficient and how to avoid detections. After all, there are no properly documented resources for either.
This guide, written by ReZygisk's developer, documents a number of genuinely useful techniques for building a better Zygisk module, improving both hiding and performance.
Many Zygisk modules have specific targets and aren't meant to execute in other processes. It is also common for them to run a considerable amount of code in preparation for later execution. However, the first real step in preAppSpecialize should be to check whether you actually need to run any code in that process.
By using both the app's name and the flags that tell you whether the process is on the denylist, you can reduce the amount of code executed in apps. This shrinks the attack surface for detection and also reduces the overhead your module adds to all other apps.
This idea is applied in ReLSPosed: when injection hardening is enabled, it no longer runs its many hooks just to check whether the process is in some module's scope, "fixing" one detection and avoiding many possible others.
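As a minimal sketch of that early check, the decision can be kept in one small pure function that preAppSpecialize calls before doing anything else. The function name should_run_in_process and the package names in k_targets are hypothetical. PROCESS_ON_DENYLIST is redefined locally and is assumed to mirror the flag from the Zygisk API header, so check it against the header you actually ship.

```c
#include <stdint.h>
#include <string.h>

/* Assumed to mirror the denylist state flag from the Zygisk API header. */
#define PROCESS_ON_DENYLIST (1u << 1)

/* Hypothetical target list, for illustration only. */
static const char *const k_targets[] = {
  "com.example.target",
  "com.example.other"
};

/* Returns 1 when the module should keep running in this process,
   0 when it should bail out as early as possible. */
static int should_run_in_process(const char *process_name, uint32_t flags) {
  if (!process_name) return 0;

  /* Denylisted processes get no module code at all. */
  if (flags & PROCESS_ON_DENYLIST) return 0;

  for (size_t i = 0; i < sizeof(k_targets) / sizeof(k_targets[0]); i++) {
    if (strcmp(process_name, k_targets[i]) == 0) return 1;
  }

  return 0;
}
```

In a real module, the name would come from the specialize arguments (nice_name) and the flags from the API's getFlags call. When the function returns 0, the module would typically set DLCLOSE_MODULE_LIBRARY and return immediately.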
When building any software, you'll inevitably use some system functions, be it from libdl or libc, as they are required to perform certain actions. While this is harmless in other types of applications, Zygisk modules are meant to be built differently: they should keep contact with external code to a minimum, so that possible traces are avoided.
This is hard to achieve with languages such as C++, whose abstractions call numerous libc functions behind the scenes. Because of this, it's suggested that Zygisk modules be written in more minimal languages, such as C.
An example of a detection created by those abstractions is the atexit handlers detection. C++ registers atexit handlers so that it can use that callback to free data/memory. However, those entries get permanently added to a global structure within libc. By iterating libc's g_array, you can detect zeroed entries within a range, and hence detect those C++ modules.
Modules written in C, such as Treat Wheel, weren't affected by this detection, while all (not literally) other modules were.
If you insist on using C++, which, again, is highly discouraged by me, ReZygisk's lead maintainer, be aware: your module will, with 99% confidence, contain those atexit entries, and, as said previously, they will lead to detections.
For that, consider two scenarios and pick the one you think your module will be most used in:
- My module will be used with ReZygisk together with Treat Wheel
- My module will be used with ReZygisk or other Zygisks WITHOUT Treat Wheel
In the scenario where it will only be used together with Treat Wheel, you do not need to care about this detection, as Treat Wheel automatically hides the atexit entries created by Zygisk modules. This leads to simpler code, as you don't need to add anything to bypass it.
As for the scenario where it won't be used with Treat Wheel, you must include your own definitions of both __cxa_atexit and __cxa_finalize, so that the handlers are not added to the system's array. One published example is 5ec1cff's local_cxa_atexit_finalize_impl; just including it in your module's code avoids this detection.
Regardless of the scenario, take this into consideration when developing your Zygisk module, and properly document how you're dealing with this detection.
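To illustrate the idea of keeping the handlers in module-private memory, here is a rough sketch. It is not 5ec1cff's implementation, and it deliberately uses the hypothetical names local_cxa_atexit and local_cxa_finalize so that it can be compiled and tried in an ordinary program without interposing libc. In a real module, the two functions would carry the real names (__cxa_atexit and __cxa_finalize) so that compiler-generated registrations bind to them. The fixed capacity and the bump_counter demo handler are assumptions made for brevity.

```c
#include <stddef.h>

#define LOCAL_ATEXIT_MAX 32

struct local_atexit_entry {
  void (*fn)(void *);
  void *arg;
  void *dso;
};

/* Handlers live in the module's own memory instead of libc's global array. */
static struct local_atexit_entry g_local_atexit[LOCAL_ATEXIT_MAX];
static size_t g_local_atexit_count;

/* Same shape as __cxa_atexit: returns 0 on success, -1 when the handler
   is NULL or the table is full. */
int local_cxa_atexit(void (*fn)(void *), void *arg, void *dso) {
  if (!fn || g_local_atexit_count == LOCAL_ATEXIT_MAX) return -1;

  g_local_atexit[g_local_atexit_count++] = (struct local_atexit_entry) { fn, arg, dso };

  return 0;
}

/* Same shape as __cxa_finalize: runs handlers in reverse registration order.
   A NULL dso runs every pending handler; otherwise only matching ones run. */
void local_cxa_finalize(void *dso) {
  for (size_t i = g_local_atexit_count; i-- > 0;) {
    struct local_atexit_entry *entry = &g_local_atexit[i];
    if (!entry->fn || (dso && entry->dso != dso)) continue;

    void (*fn)(void *) = entry->fn;
    entry->fn = NULL; /* Mark as done so it never runs twice. */
    fn(entry->arg);
  }
}

/* Demo handler: increments the int it is given. */
static void bump_counter(void *arg) {
  ++*(int *)arg;
}
```

Registering bump_counter twice with different dso values and then calling local_cxa_finalize with one of them runs only the matching handler. A later call with NULL runs whatever is still pending.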
When developing libraries, we rarely look at the actual ELF through tools like readelf. However, it is critical to do so for Zygisk modules, since it shows which external symbols are used and called.
It is important that you run your module's library through readelf -Ws to check which symbols are used. To reduce them, you will need to experiment with compiler and linker options. One worth mentioning is -nostartfiles, which avoids the inclusion of atexit symbols and .init_array entries. Other options might also positively affect your library, making it smaller and less complex.
This is a ReZygisk-specific improvement. ReZygisk preloads all modules in the original Zygote process, which opens up an immense opportunity for performance improvements.
Using GNU constructors (__attribute__((constructor)) static void some_silly_name(void)), you can execute code as soon as the library is fully linked and loaded. That code runs in the original (parent) Zygote process, which is later forked to create system server and the apps, so anything you do at this stage is carried down to them.
A good use of this is reading a big file and caching its content in a buffer, so that it isn't read again on every app specialization, which significantly improves execution speed. Another example, from Treat Wheel, which is running in production on thousands of devices, is reading maps, mountinfo, and other information at this stage to greatly improve its speed (by more than 5x in some old scenarios).
However, do not forget to free anything you allocate here, and avoid running code that might lead to detections, as it will impact ALL apps and system server.
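The caching idea can be sketched as follows. The file path, the global names, and the helpers read_whole_file and drop_cache are all hypothetical. The sketch assumes the module is preloaded in the parent Zygote as described above, so the buffer filled by the constructor is inherited by every forked process.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

static char *g_cached_data;
static size_t g_cached_len;

/* Reads a whole file into a malloc'd buffer. Returns NULL on failure;
   on success the caller owns the buffer and *out_len holds its length. */
static char *read_whole_file(const char *path, size_t *out_len) {
  int fd = open(path, O_RDONLY);
  if (fd == -1) return NULL;

  size_t cap = 4096, len = 0;
  char *buf = malloc(cap);
  if (!buf) {
    close(fd);
    return NULL;
  }

  for (;;) {
    ssize_t n = read(fd, buf + len, cap - len);
    if (n == 0) break; /* End of file. */
    if (n < 0) {
      free(buf);
      close(fd);
      return NULL;
    }

    len += (size_t)n;
    if (len == cap) {
      /* Buffer is full: double it before the next read. */
      char *grown = realloc(buf, cap * 2);
      if (!grown) {
        free(buf);
        close(fd);
        return NULL;
      }
      buf = grown;
      cap *= 2;
    }
  }

  close(fd);
  *out_len = len;
  return buf;
}

/* Runs once when the library is loaded; forked children inherit the buffer. */
__attribute__((constructor)) static void preload_cache(void) {
  /* Hypothetical path, for illustration only. */
  g_cached_data = read_whole_file("/data/adb/modules/my_module/big_config.bin", &g_cached_len);
}

/* Call this once the cached data is no longer needed in the current process. */
static void drop_cache(void) {
  free(g_cached_data);
  g_cached_data = NULL;
  g_cached_len = 0;
}
```

A process that turns out not to be a target can call drop_cache() right away, so the cached copy does not linger where it isn't needed.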
There are numerous ways to detect that a file from procfs has been opened before the app could start running. However, opening some file in /proc, such as maps or mountinfo, is sometimes inevitable. For those cases, you must handle it in a special way:
- Create a socketpair
- fork the process
- open the file in the child
- Send the content to the parent via the sockets
- close sockets
- kill/exit the child
- See PerformanC's LSPlt code as an example.
This lets you retrieve the content of the file without being detected. Moreover, fork can be used in other ways, so that the dirty work only happens in a disposable process. However, this approach has a great impact on performance, and should be avoided where possible.
Surely, there are other ways to bypass this. However, this was the idea I came up with.
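The socketpair/fork steps listed above can be sketched roughly as below. This is not LSPlt's code: the function name read_file_in_child and the fixed-size buffer interface are assumptions made to keep the example short. Note that a /proc/self path resolves to the child inside the child, whose memory layout is a copy of the parent's at the moment of the fork.

```c
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Reads `path` in a short-lived child and copies at most `cap` bytes into
   `buf`. Returns the number of bytes stored, or -1 on failure. */
static ssize_t read_file_in_child(const char *path, char *buf, size_t cap) {
  int sv[2];
  if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1) return -1;

  pid_t pid = fork();
  if (pid == -1) {
    close(sv[0]);
    close(sv[1]);
    return -1;
  }

  if (pid == 0) {
    /* Child: the only process that ever opens the file. */
    close(sv[0]);

    int fd = open(path, O_RDONLY);
    if (fd == -1) _exit(1);

    char chunk[4096];
    ssize_t n;
    while ((n = read(fd, chunk, sizeof(chunk))) > 0) {
      ssize_t off = 0;
      while (off < n) {
        ssize_t w = write(sv[1], chunk + off, (size_t)(n - off));
        if (w <= 0) _exit(1);
        off += w;
      }
    }

    _exit(n < 0 ? 1 : 0);
  }

  /* Parent: receive until the child closes its end of the socket. */
  close(sv[1]);

  size_t len = 0;
  for (;;) {
    char scratch[256];
    int full = (len == cap);

    /* Once buf is full, keep draining so the child can finish cleanly. */
    ssize_t n = full ? read(sv[0], scratch, sizeof(scratch))
                     : read(sv[0], buf + len, cap - len);
    if (n <= 0) break;
    if (!full) len += (size_t)n;
  }
  close(sv[0]);

  int status = 0;
  if (waitpid(pid, &status, 0) == -1) return -1;
  if (!WIFEXITED(status) || WEXITSTATUS(status) != 0) return -1;

  return (ssize_t)len;
}
```

A call such as read_file_in_child("/proc/self/maps", buf, sizeof(buf)) then never opens the file in the calling process. Inside Zygote, a SIGCHLD handler may reap the child before waitpid does, so a real module may need to handle a failing waitpid differently than this sketch does.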
A module's feature might need to run after all modules are dlclosed to achieve the desired effect. However, not setting DLCLOSE_MODULE_LIBRARY causes memory-related detections, because the library's memory mapping is left behind.
This is where PLT hooks come into play. PLT hooking munmap in the Zygisk library's PLT lets you execute code before the Zygisk implementation unloads itself.
However, Zygisk implementations such as ReZygisk also call munmap to clean up modules' traces. To distinguish between modules being unloaded and ReZygisk unloading itself, compare addr to the address of the first region of ReZygisk's memory (retrieved via soinfo or maps).
There you can execute your finalization code, then munmap ReZygisk, then call your library's destructors (if using C++), and only after that, as the very last function to be executed, munmap your own library.
However, discovering your own size and address might be complicated if you want to avoid reading maps, given the complications that may lead to. The system's soinfo structure can be used to retrieve them instead. Below is Treat Wheel's code applying this theory; note that this version computes the size from the library's ELF program headers and still takes the base address from maps.
/* INFO: Unlicense licensed code. */
static void *_page_start(uintptr_t addr) {
  return (void *)(addr & ~(getpagesize() - 1));
}

static void *_page_end(uintptr_t addr) {
  return (void *)((addr + getpagesize() - 1) & ~(getpagesize() - 1));
}

struct mod_mem_info get_mem_info(void) {
  /* INFO: Open the library that matches the ABI this code was compiled for */
#ifdef __aarch64__
  int fd = open("/data/adb/modules/treat_wheel/zygisk/arm64-v8a.so", O_RDONLY);
#elif defined(__arm__)
  int fd = open("/data/adb/modules/treat_wheel/zygisk/armeabi-v7a.so", O_RDONLY);
#elif defined(__x86_64__)
  int fd = open("/data/adb/modules/treat_wheel/zygisk/x86_64.so", O_RDONLY);
#elif defined(__i386__)
  int fd = open("/data/adb/modules/treat_wheel/zygisk/x86.so", O_RDONLY);
#else
  #error "Unsupported architecture"
#endif
  if (fd == -1) {
    PLOGE("open module library");

    return (struct mod_mem_info) { 0 };
  }

  ElfW(Ehdr) ehdr;
  if (pread(fd, &ehdr, sizeof(ehdr), 0) != (ssize_t)sizeof(ehdr)) {
    PLOGE("pread ehdr");

    close(fd);

    return (struct mod_mem_info) { 0 };
  }

  size_t phdrs_size = (size_t)ehdr.e_phentsize * ehdr.e_phnum;
  ElfW(Phdr) *phdrs = malloc(phdrs_size);
  if (!phdrs) {
    PLOGE("malloc phdrs");

    close(fd);

    return (struct mod_mem_info) { 0 };
  }

  if (pread(fd, phdrs, phdrs_size, ehdr.e_phoff) != (ssize_t)phdrs_size) {
    PLOGE("pread phdrs");

    free(phdrs);
    close(fd);

    return (struct mod_mem_info) { 0 };
  }

  /* INFO: The loaded size spans from the lowest to the highest PT_LOAD address */
  ElfW(Addr) lo = UINTPTR_MAX, hi = 0;
  for (size_t i = 0; i < ehdr.e_phnum; ++i) {
    if (phdrs[i].p_type != PT_LOAD) continue;

    if (phdrs[i].p_vaddr < lo) lo = phdrs[i].p_vaddr;
    if (phdrs[i].p_vaddr + phdrs[i].p_memsz > hi) hi = phdrs[i].p_vaddr + phdrs[i].p_memsz;
  }

  free(phdrs);
  close(fd);

  struct maps *maps = parse_maps("/proc/self/maps");
  if (!maps) {
    LOGE("Failed to parse maps");

    return (struct mod_mem_info) { 0 };
  }

  /* INFO: The first mapping backed by the module's library is its base address */
  void *mem_start = NULL;
  for (size_t i = 0; i < maps->size; i++) {
    struct map *map = &maps->maps[i];
    if (!map->path || !strstr(map->path, "treat_wheel/zygisk/")) continue;

    mem_start = (void *)map->addr_start;

    break;
  }

  free_maps(maps);

  if (!mem_start) {
    LOGE("Failed to find the module's library in maps");

    return (struct mod_mem_info) { 0 };
  }

  return (struct mod_mem_info) {
    .start = (uintptr_t)mem_start,
    .size = (size_t)((uintptr_t)_page_end(hi) - (uintptr_t)_page_start(lo))
  };
}

void *rz_base = NULL;
dev_t rz_dev;
ino_t rz_ino;
struct mod_mem_info mod_info;

int my_munmap(void *addr, size_t length) {
  if (addr == rz_base) {
    LOGD("Found ReZygisk's library");

    /* INFO: Be aware that you MUST NOT call any ReZygisk-related function after this munmap */
    munmap(addr, length);

    /* INFO: Tail call, so that control never comes back into this library after it is unmapped */
    [[clang::musttail]] return munmap((void *)mod_info.start, mod_info.size);
  }

  return munmap(addr, length);
}

__attribute__((constructor)) static void initialization(void) {
  struct maps *maps = parse_maps("/proc/self/maps");
  if (!maps) {
    LOGE("Failed to parse maps");

    return;
  }

  for (size_t i = 0; i < maps->size; i++) {
    struct map *map = &maps->maps[i];
    if (!map->path || !strstr(map->path, "rezygisk")) continue;

    /* INFO: Only the first matching region is the base address */
    if (!rz_base) rz_base = (void *)map->addr_start;
    rz_dev = map->dev;
    rz_ino = map->inode;

    LOGD("Found ReZygisk map at %s (dev: %d, ino: %d)", map->path, (int)rz_dev, (int)rz_ino);
  }

  free_maps(maps);

  if (rz_dev == 0 || rz_ino == 0) {
    LOGE("Failed to find ReZygisk's library in maps.");

    return;
  }

  mod_info = get_mem_info();
  LOGD("Module memory region: start=%p, size=%zu", (void *)mod_info.start, mod_info.size);
}

/* INFO: And do not forget to register the hook, in the callback where you want to delay the unload: */
api_table->pltHookRegister(rz_dev, rz_ino, "munmap", (void *)my_munmap, NULL);
api_table->pltHookCommit();
Be aware that this is specific to ReZygisk. Other Zygisk implementations might require different approaches to achieve the same result, as they may or may not use a custom linker.
The most important piece of a Zygisk module, aside from its own functionality, is the Zygisk API header, which defines the structures and API functions the module is built on. Just as important is its build system, which compiles the module and packs it into a flashable zip.
You can use TheSillyOk's Zygisk template as a base for creating a Zygisk module in C, as it contains all the basics (more than topjohnwu's own example) for creating a proper module with a nice build system. From there, you can build your Zygisk module without making any changes outside the code itself.
For now, that's all I have to suggest and document. This post will be updated over time with grammar improvements, better examples, and more suggestions.
Motherly Hand
Lies that flow through years like knives,
Ones that could hurt one too many lives.
Light the candle to guide you throughout.
Feel life by yourself in your own route.