AKA reference for citra CRO code.
wwylele: I am not a native English speaker, so there may be some strange words and sentences below. Suggestions for improvement are welcome.
Warning: CRO is still not completely understood, and there may be mistakes below. Please read with a skeptical eye.
Note: some terms were chosen by comparing the behavior with similar concepts. They may be inaccurate, or even incorrect.
- Module is a chunk of executable code and data. Modules can be linked to each other.
- Static module refers to the main application (i.e. ExeFS). Its executable code is loaded when the application boots, while its symbol information is loaded at runtime.
- Dynamic module is a module that can be loaded and unloaded at runtime.
- RO service (`ldr:ro`) is the system service for managing modules.
- CRO is a file with the extension `.cro`, which contains executable code, data and symbol information for a dynamic module.
- CRS is a file with the extension `.crs`, in the same format as CRO, which contains symbol information for the static module.
- CRR is a file with the extension `.crr`, which contains verification data for all modules. The RO service verifies every dynamic module against the CRR file when loading it.
- Symbol is a location of a function or a variable(?) in a module, which can be found by other modules. The location is described by a `SegmentTag` type. There are three types of symbol:
  - Named symbol is a symbol with a name, which is usually a (mangled) function name or a variable name(?). A module can refer to a named symbol in other modules by name.
  - Indexed symbol is a symbol without a name but with an index. A module can refer to an indexed symbol in other modules by index.
  - Anonymous symbol is a symbol without a name or index. A module must(?) keep a list of imported anonymous symbols to refer to them.
- Importing is declaring the usage of symbols in other modules.
- Exporting is making symbols available to other modules.
- Resolving is making an imported symbol available by looking it up in the module that exports it, and patching the code in the importing module.
- Linking is looking up the import / export information of multiple modules and resolving symbols between them.
- Auto-link module is a loaded module that will automatically link with newly loaded modules. The static module is always an auto-link module. A game can specify whether a module is auto-link when loading it. A module will automatically link with auto-link modules when loading, regardless of whether it is auto-link itself.
- Manual-link module is a module that is not auto-link.
- Rebasing is converting all offsets in a CRO file to actual virtual addresses (i.e. adding the address the CRO is loaded at to every offset). A CRO is rebased when loading, and unrebased when unloading.
A CRO file consists of a header, executable code, non-executable data and several tables. They are usually laid out in the following order:
- header
- .text segment
- .rodata segment
- tables
- .data segment
The header has the following format:
Offset | Size | Description |
---|---|---|
0x0000 | 0x20 | SHA-256 from 0x80 to Code offset |
0x0020 | 0x20 | SHA-256 from Code offset to Module name offset |
0x0040 | 0x20 | SHA-256 from Module name offset to Data offset |
0x0060 | 0x20 | SHA-256 from Data offset to the end of CRO |
0x0080 | 0x04 | Magic "CRO0" |
0x0084 | 0x04 | Name offset |
0x0088 | 0x04 | Next module ** |
0x008C | 0x04 | Previous module ** |
0x0090 | 0x04 | File size |
0x0094 | 0x04 | .bss segment size(?). * |
0x0098 | 0x04 | Fixed size ** |
0x009C | 0x04 | Zero? * |
0x00A0 | 0x04 | nnroControlObject_ function segment tag. * |
0x00A4 | 0x04 | "OnLoad" function segment tag. 0xFFFFFFFF if it does not exist * |
0x00A8 | 0x04 | "OnExit" function segment tag. 0xFFFFFFFF if it does not exist * |
0x00AC | 0x04 | "OnUnresolved" function segment tag. 0xFFFFFFFF if it does not exist |
0x00B0 | 0x04 | Code offset |
0x00B4 | 0x04 | Code size |
0x00B8 | 0x04 | Data offset |
0x00BC | 0x04 | Data size |
0x00C0 | 0x04 | Module name offset. Equal to name offset (?) |
0x00C4 | 0x04 | Module name size |
0x00C8 | 0x04 | Segment table offset |
0x00CC | 0x04 | Segment count |
0x00D0 | 0x04 | Exported named symbol table offset |
0x00D4 | 0x04 | Exported named symbol count |
0x00D8 | 0x04 | Exported indexed symbol table offset |
0x00DC | 0x04 | Exported indexed symbol count |
0x00E0 | 0x04 | Exported strings offset |
0x00E4 | 0x04 | Exported strings size |
0x00E8 | 0x04 | Exported name tree offset |
0x00EC | 0x04 | Exported name tree node count |
0x00F0 | 0x04 | Imported module table offset |
0x00F4 | 0x04 | Imported module count |
0x00F8 | 0x04 | External patch table offset |
0x00FC | 0x04 | External patch count |
0x0100 | 0x04 | Imported named symbol table offset |
0x0104 | 0x04 | Imported named symbol count |
0x0108 | 0x04 | Imported indexed symbol table offset |
0x010C | 0x04 | Imported indexed symbol count |
0x0110 | 0x04 | Imported anonymous symbol table offset |
0x0114 | 0x04 | Imported anonymous symbol count |
0x0118 | 0x04 | Imported strings offset |
0x011C | 0x04 | Imported strings size |
0x0120 | 0x04 | Static anonymous symbol(?) table offset |
0x0124 | 0x04 | Static anonymous symbol(?) count |
0x0128 | 0x04 | Internal patch table offset |
0x012C | 0x04 | Internal patch count |
0x0130 | 0x04 | Static anonymous patch(?) table offset |
0x0134 | 0x04 | Static anonymous patch(?) count |
* RO service doesn't touch these fields
** Zero in CRO file. RO service will write to them
See code: `HeaderField`, `GetField`, `SetField`
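As a rough illustration of the table above, the sketch below reads a few header fields from a raw CRO image and derives the four byte ranges covered by the SHA-256 fields. `ReadField`, `HasCroMagic` and `HashRanges` are hypothetical helpers written for this page, not citra functions, and treating the buffer size as "the end of CRO" is an assumption.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Byte offsets of a few header fields, copied from the table above.
constexpr std::size_t kMagic = 0x80;
constexpr std::size_t kCodeOffset = 0xB0;
constexpr std::size_t kDataOffset = 0xB8;
constexpr std::size_t kModuleNameOffset = 0xC0;

// Reads a little-endian 32-bit field; throws std::out_of_range on a short image.
inline std::uint32_t ReadField(const std::vector<std::uint8_t>& cro, std::size_t at) {
    return static_cast<std::uint32_t>(cro.at(at)) |
           static_cast<std::uint32_t>(cro.at(at + 1)) << 8 |
           static_cast<std::uint32_t>(cro.at(at + 2)) << 16 |
           static_cast<std::uint32_t>(cro.at(at + 3)) << 24;
}

// True if the image carries the unfixed "CRO0" magic at 0x80.
inline bool HasCroMagic(const std::vector<std::uint8_t>& cro) {
    return cro.size() >= kMagic + 4 && std::memcmp(cro.data() + kMagic, "CRO0", 4) == 0;
}

struct ByteRange {
    std::uint32_t begin;  // inclusive
    std::uint32_t end;    // exclusive
};

// The four regions hashed by the SHA-256 fields at 0x00, 0x20, 0x40 and 0x60.
// Assumption: "the end of CRO" is taken to be the size of the buffer.
inline std::array<ByteRange, 4> HashRanges(const std::vector<std::uint8_t>& cro) {
    const std::uint32_t code = ReadField(cro, kCodeOffset);
    const std::uint32_t name = ReadField(cro, kModuleNameOffset);
    const std::uint32_t data = ReadField(cro, kDataOffset);
    const std::uint32_t end = static_cast<std::uint32_t>(cro.size());
    return {{{0x80, code}, {code, name}, {name, data}, {data, end}}};
}
```

This only makes sense on a file that has not been rebased yet, since the offsets are still relative to the beginning of the file at that point.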
All the "offset" fields in the header are relative to the beginning of the file. However, they are converted to virtual addresses when loading (i.e. rebasing).
For the detailed structure of each table, refer to the related `struct` in the code.
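To make rebasing of the header concrete, here is a minimal sketch that adds the load address to every "offset" field of the header table and subtracts it again on unloading. It relies on the observation that, from 0xB0 to 0x134, the header alternates between an offset field and a size / count field. The helper names are hypothetical, and the sketch ignores the offsets stored inside the individual tables, which also need rebasing.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kNameOffsetField = 0x84;
constexpr std::size_t kFirstOffsetField = 0xB0;  // Code offset
constexpr std::size_t kLastOffsetField = 0x130;  // Static anonymous patch(?) table offset

// Little-endian 32-bit load from the image.
inline std::uint32_t Load32(const std::vector<std::uint8_t>& image, std::size_t at) {
    std::uint32_t value = 0;
    for (std::size_t i = 0; i < 4; ++i)
        value |= static_cast<std::uint32_t>(image.at(at + i)) << (8 * i);
    return value;
}

// Little-endian 32-bit store into the image.
inline void Store32(std::vector<std::uint8_t>& image, std::size_t at, std::uint32_t value) {
    for (std::size_t i = 0; i < 4; ++i)
        image.at(at + i) = static_cast<std::uint8_t>(value >> (8 * i));
}

// Adds `delta` to the name offset and to every second field from 0xB0 on,
// i.e. the "offset" fields; the size / count fields in between are untouched.
inline void ShiftHeaderOffsets(std::vector<std::uint8_t>& image, std::uint32_t delta) {
    Store32(image, kNameOffsetField, Load32(image, kNameOffsetField) + delta);
    for (std::size_t field = kFirstOffsetField; field <= kLastOffsetField; field += 8)
        Store32(image, field, Load32(image, field) + delta);
}

inline void RebaseHeader(std::vector<std::uint8_t>& image, std::uint32_t load_address) {
    ShiftHeaderOffsets(image, load_address);
}

inline void UnrebaseHeader(std::vector<std::uint8_t>& image, std::uint32_t load_address) {
    // Unsigned wrap-around makes this the exact inverse of RebaseHeader.
    ShiftHeaderOffsets(image, 0u - load_address);
}
```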
CRO and CRS files are loaded from RomFS into a memory buffer by the application. The application then specifies another address when calling the RO service loading functions (`Initialize`, i.e. "LoadCRS", `LoadCRO`, and `LoadCRO_New`), and the RO service maps the original buffer to the specified address. The RO service always reads and writes data at the mapped address, while the application can read data at both addresses and can write to the original buffer. This requires a proper memory aliasing implementation, which citra does not have yet. The current workaround is to map a new buffer at the mapped address, copy the data, and synchronize the two buffers at the beginning and the end of each service call (see `MemorySynchronizer`).
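The workaround can be pictured as a scope guard around each service call. The sketch below is not citra's `MemorySynchronizer`; `ScopedSync` and `ExampleServiceCall` are hypothetical, and the direction of each copy (original to mapped on entry, mapped to original on exit) is an assumption based on the description above.

```cpp
#include <cstdint>
#include <vector>

// Keeps a separate "mapped" copy in step with the application's original buffer.
class ScopedSync {
public:
    ScopedSync(std::vector<std::uint8_t>& original, std::vector<std::uint8_t>& mapped)
        : original_(original), mapped_(mapped) {
        mapped_ = original_;  // pick up anything the application wrote
    }
    ~ScopedSync() {
        original_ = mapped_;  // publish what the service wrote
    }
    ScopedSync(const ScopedSync&) = delete;
    ScopedSync& operator=(const ScopedSync&) = delete;

private:
    std::vector<std::uint8_t>& original_;
    std::vector<std::uint8_t>& mapped_;
};

// Hypothetical service call that, like the RO service, only touches the mapped copy.
inline void ExampleServiceCall(std::vector<std::uint8_t>& original,
                               std::vector<std::uint8_t>& mapped) {
    ScopedSync sync(original, mapped);
    if (!mapped.empty())
        mapped[0] = 0x46;
}
```

Copying whole buffers on every call is wasteful, which is why this is only a stopgap until real memory aliasing is available.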
See code: `Register`, `Unregister`
Modules form two doubly linked lists in RAM: each module has a `previous` and a `next` field in its header, which are set to point to other modules when loading. The `previous` and `next` fields of the static module point to the heads of the two lists: the manual-link list and the auto-link list, respectively. The `previous` field of the head of each list points to the tail of the list. The `next` field of the tail is set to 0.
A dynamic module is added to the tail of one of the lists when it is loaded ("registering"), depending on whether it is specified to be auto-link; it is removed from the list when it is unloaded ("unregistering"). The RO service (and probably the application as well) uses these two linked lists to iterate over modules when linking.
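The list layout described above can be modelled in a few lines. In this sketch, memory is a `std::map` from module address to its two link fields, and `RegisterModule` / `UnregisterModule` are hypothetical stand-ins written from the description rather than copies of citra's `Register` / `Unregister`.

```cpp
#include <cstdint>
#include <initializer_list>
#include <map>

struct ModuleLinks {
    std::uint32_t previous = 0;
    std::uint32_t next = 0;
};
using Memory = std::map<std::uint32_t, ModuleLinks>;  // module address -> link fields

// Appends `cro` to the auto-link list (head in crs.next) or to the
// manual-link list (head in crs.previous).
inline void RegisterModule(Memory& mem, std::uint32_t crs, std::uint32_t cro, bool auto_link) {
    ModuleLinks& root = mem[crs];
    std::uint32_t& head = auto_link ? root.next : root.previous;
    ModuleLinks& self = mem[cro];
    if (head == 0) {
        head = cro;
        self.previous = cro;  // a lone head is also the tail
    } else {
        const std::uint32_t tail = mem[head].previous;
        mem[tail].next = cro;
        self.previous = tail;
        mem[head].previous = cro;  // the head always points back to the tail
    }
    self.next = 0;
}

// Removes `cro` from whichever list contains it; returns false if it is in neither.
inline bool UnregisterModule(Memory& mem, std::uint32_t crs, std::uint32_t cro) {
    ModuleLinks& root = mem.at(crs);
    for (std::uint32_t* head : {&root.next, &root.previous}) {
        for (std::uint32_t cur = *head; cur != 0; cur = mem.at(cur).next) {
            if (cur != cro)
                continue;
            const ModuleLinks self = mem.at(cro);
            const std::uint32_t tail = mem.at(*head).previous;
            if (cro == *head) {
                *head = self.next;
                if (self.next != 0)
                    mem.at(self.next).previous = tail;  // new head inherits the tail link
            } else {
                mem.at(self.previous).next = self.next;
                if (self.next != 0)
                    mem.at(self.next).previous = self.previous;
                else
                    mem.at(*head).previous = self.previous;  // removed the tail
            }
            mem.at(cro) = ModuleLinks{};
            return true;
        }
    }
    return false;
}
```

Because the head's `previous` field points at the tail, appending never needs to walk the list.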
A module has several segments, with a segment entry table pointing to each segment. A segment can be of type 0 (.text), 1 (.rodata), 2 (.data), or 3 (.bss). For the static module, all of these segment entries are set in the CRS to point directly to the corresponding userland memory addresses (for example, .text begins at 0x00100000). For dynamic modules, .text, .rodata and .data are stored in the CRO, and the entries point to this data. During CRO loading and rebasing, the .text and .rodata entries are set to point to where they are mapped in memory, and the .data entry is set to point to an application-specified buffer (the RO service doesn't copy .data from the CRO to the buffer; this is done by the application). The .bss entry is also set to point to an application-specified buffer.
A location in a segment is always represented by a segment tag. A segment tag is a 32-bit value: the lower 4 bits hold the segment index, and the upper 28 bits hold the offset into the segment. Symbol locations and patch targets are all represented by segment tags.
See code: `DecodeSegmentTag`, `SegmentTagToAddress`
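The bit layout of a segment tag translates directly into a couple of shifts and masks. The following is a minimal sketch, not citra's implementation: `DecodeTag`, `EncodeTag` and `TagToAddress` are illustrative names, the `Segment` struct is a simplified stand-in for a segment table entry, and returning 0 for an invalid tag is an assumption.

```cpp
#include <cstdint>
#include <vector>

struct SegmentTag {
    std::uint32_t segment_index;        // lower 4 bits
    std::uint32_t offset_into_segment;  // upper 28 bits
};

constexpr SegmentTag DecodeTag(std::uint32_t raw) {
    return SegmentTag{raw & 0xFu, raw >> 4};
}

constexpr std::uint32_t EncodeTag(std::uint32_t segment_index, std::uint32_t offset) {
    return (offset << 4) | (segment_index & 0xFu);
}

// Simplified stand-in for a (rebased) segment table entry.
struct Segment {
    std::uint32_t address;
    std::uint32_t size;
};

// Resolves a tag to an address, or returns 0 when the tag points outside
// the segment table or beyond the end of its segment.
inline std::uint32_t TagToAddress(const std::vector<Segment>& segments, std::uint32_t raw) {
    const SegmentTag tag = DecodeTag(raw);
    if (tag.segment_index >= segments.size())
        return 0;
    const Segment& segment = segments[tag.segment_index];
    if (tag.offset_into_segment >= segment.size)
        return 0;
    return segment.address + tag.offset_into_segment;
}
```

For example, the raw tag 0x00000123 refers to offset 0x12 in segment 3.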
The word "fix" comes from Subv's branch, and CRO's header magic (see below).
See code: `GetFixEnd`, `Fix`
An application can specify that a dynamic module should be "fixed" after loading. Fixing crops away some data from the end of the CRO: the RO service unmaps that memory and returns it to the application for other use, which saves memory.
A fix level can be specified. A higher level crops away more data and loses more features.
- Level 0 does not crop at all. Also, if a module with fix level 0 is unloaded, the RO service restores all the data of the module as if it had never been loaded (see Patch - "clear patch"). Therefore, a module with fix level 0 can be loaded and unloaded multiple times without being reloaded from RomFS (?).
- Level 1 crops away
  - static anonymous symbol table(?),
  - internal patch table, and
  - static anonymous patch table(?).
  - (and also very likely the data of the .data segment)

  A module with fix level 1 can't be reloaded after unloading(?), since the internal patch information is lost and internal patches can't be reapplied for newly allocated .data and .bss buffers(?).
- Level 2 crops away
  - all data that level 1 crops,
  - imported module table,
  - external patch table,
  - imported named symbol table,
  - imported indexed symbol table,
  - imported anonymous symbol table, and
  - imported strings.

  Because the import information is lost, a module with fix level 2 can't resolve symbols imported from modules that are loaded after it.
- Level 3 crops away
  - all data that level 2 crops,
  - exported named symbol table,
  - exported indexed symbol table,
  - exported strings, and
  - exported name tree.

  Because the export information is lost, a module with fix level 3 can't provide its exported symbols to modules that are loaded after it.
For modules with a fix level other than 0, the magic field in their header is changed to "FIXD" when loading.
Note that only fix level 1 is known to be used by games (?), so the actual behavior of the other fix levels is unclear due to the lack of test cases.
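The cumulative nature of the fix levels can be summarized as a small lookup. This sketch only restates the lists above in code form; the grouping into three `TableGroup` values and the function names are made up for illustration and say nothing about where exactly the RO service cuts the file.

```cpp
#include <string>

// Groups of tables, named after the first fix level that crops them.
enum class TableGroup {
    InternalAndStaticAnonymous,  // cropped from level 1 on
    Import,                      // cropped from level 2 on
    Export,                      // cropped from level 3 on
};

constexpr int FirstCroppingLevel(TableGroup group) {
    return group == TableGroup::InternalAndStaticAnonymous ? 1
           : group == TableGroup::Import                   ? 2
                                                           : 3;
}

// Each level also crops everything the lower levels crop.
constexpr bool IsCropped(TableGroup group, int fix_level) {
    return fix_level >= FirstCroppingLevel(group);
}

// Header magic after loading: unchanged for level 0, "FIXD" otherwise.
inline std::string MagicAfterLoading(int fix_level) {
    return fix_level == 0 ? "CRO0" : "FIXD";
}
```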
See code: `PatchEntry`, `ApplyPatch`, `ApplyPatchBatch`
Note: these should probably be called "relocations" instead.
Patching is how symbol resolution is implemented. The module exporting symbols keeps the address (as a segment tag) of each exported symbol, and the module importing symbols keeps a list of patches indicating where (also as segment tags) and how to write the symbol address. One imported symbol corresponds to several patches, which together are called a patch batch. Patches have different types (`PatchType`), but only two of them, `AbsoluteAddress` and `RelativeAddress`, are known to be used by games. Other types are left unimplemented because of the lack of test cases. Patch types apparently match the relocation types in ELF for ARM.
Patches are "reset" when modules are being loaded, before linking, and when modules are being unlinked: the places to patch are written with the address of the "OnUnresolved" function instead of the imported symbol address. The "OnUnresolved" function is specified by the CRO header. See code: `ResetExternalPatches`, `ResetImportNamedSymbol`, `ResetImportIndexedSymbol`, `ResetImportAnonymousSymbol`, `ResetExportNamedSymbol`, `ResetModuleExport`.
Patches are "cleared" when modules with fix level 0 are being unloaded: the places to patch are written with a zero address. The purpose is to restore the CRO to its state before loading(?) (see Fixing - Level 0). See code: `ClearPatch`, `ClearExternalPatches`, `ClearInternalPatches`.
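Put side by side, the three states of an imported symbol's patches differ only in which address is fed into the patch. The enum and function below are purely illustrative names for this page, not part of citra.

```cpp
#include <cstdint>

enum class PatchState {
    Resolved,  // linked: the real address of the imported symbol
    Reset,     // not linked (yet): the "OnUnresolved" function address
    Cleared,   // fix level 0 module being unloaded: zero
};

// Address written into the patch targets in each state.
constexpr std::uint32_t AddressToPatchIn(PatchState state, std::uint32_t symbol_address,
                                         std::uint32_t on_unresolved_address) {
    return state == PatchState::Resolved ? symbol_address
           : state == PatchState::Reset  ? on_unresolved_address
                                         : 0u;
}
```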
Patches are also used in two other ways. One is internal patches, which can be treated as a module exporting symbols to itself. They are needed for segments to refer to each other, because segment addresses change on rebasing. Internal patches are slightly different from normal patches (see `InternalPatchEntry`): they are not organized in batches, and they store not only where to write the address but also what address to write (i.e. the exported symbol address). Also, note that internal patches are applied upon rebasing (not linking!). See code: `ApplyInternalPatches`.
Another use of patches is for static anonymous symbols, which are symbols exported from dynamic modules to the static module. They work much like normal symbols and patches, but are applied when rebasing dynamic modules (not linking, again), and are never reset, even when the module is unloaded. No games (?) are actually known to use this feature, so it is unclear what it is used for. See code: `ApplyStaticAnonymousSymbolToCRS`.
A module exporting symbols keeps a record of them in two tables: named symbols are recorded in the exported named symbol table, while indexed symbols are recorded in the exported indexed symbol table. The module does not keep a record of exported anonymous symbols.
A module importing symbols keeps a record of them in several tables: named symbols are recorded directly in the imported named symbol table, while indexed and anonymous symbols are grouped by the module exporting them, and the imported module table records the referenced modules and the indexed / anonymous symbols they contain.
The relationship of each table is illustrated below:
See code: `FindExportNamedSymbol`
A module contains a tree for fast symbol name lookups. When the RO service looks up a named symbol, it does not scan the exported named symbol table, but looks it up in this tree instead.
The tree itself is a trie-like structure. There is a reimplementation in C++. The tree structure in CRO corresponds to `bit_trie<symbol_name, symbol_index, string_tester>` in the reimplementation. There are also some differences: the structure in CRO doesn't store keys in nodes, and it uses absolute offsets in `Branch` instead of the relative offsets used in the reimplementation.