@loveemu
Last active April 16, 2024 03:36
Shin'en GAX Sound Engine (GBA) Specification / Research Note

Shin'en GAX Sound Engine (GBA) Specification

The research is based on GAX Engine V3. Information about V2 can be found at the end of the document.

FYI: GaXM is a tool that can analyze GAX data. You may find more information there.

FYI: You can find my IDA FLIRT signature for GAX V3 here. https://github.com/loveemu/ida-sig

FYI: Gaxtapper: Diagnostic tool / Automated GSF ripper for GAX Sound Engine.

Whole Layout in ROM

The data is normally stored somewhere on a GBA ROM in this order:

  1. Instrument definitions
    1. Data for each instrument
      1. Part A
      2. Part B
      3. Instrument header
    2. Instrument entries (address table)
  2. Sample definitions
    1. Sample pool
    2. Sample entries (address/size table)
      1. First entry indicates the start address of sample pool?
      2. Second entry indicates the end address of sample pool?
  3. Music data for each song
    1. Sequence data for each pattern
    2. Title / Copyright string
    3. Pattern table for each track
    4. Song header
  4. SFX definitions
    1. SFX instruments
    2. SFX sample pool & sample entries
    3. SFX top-level header (same structure as the song header)
  5. Parameters / read-only data for sound driver code

Instruments

Instrument Header

TBA

Instrument Entries

An entry is a pointer to an instrument header (uint32).

Samples

Sample Pool

Samples are signed 8-bit PCM.

Sample Entries

An entry is structured as follows.

| Name | Type |
| --- | --- |
| Address | uint32 |
| Size | uint32 |
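
As a rough illustration of the entry layout above, the following sketch decodes one entry from a raw table. The struct name, the helper names, and the little-endian byte order (the GBA's native order) are assumptions made for the example, not names taken from the engine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One sample entry: two 32-bit fields, in the order given in the table.
 * Names are illustrative only. */
typedef struct {
    uint32_t address; /* address of the sample data (presumably a ROM pointer) */
    uint32_t size;    /* size of the sample data */
} GaxSampleEntry;

static_assert(sizeof(GaxSampleEntry) == 8, "a sample entry is two uint32 fields");

/* Read a little-endian uint32 from a byte buffer. */
static uint32_t gax_read_u32le(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decode the entry at `index` from a raw sample entry table. */
static GaxSampleEntry gax_read_sample_entry(const uint8_t *table, size_t index) {
    const uint8_t *p = table + index * 8;
    GaxSampleEntry e = { gax_read_u32le(p), gax_read_u32le(p + 4) };
    return e;
}
```

Reading the fields byte by byte keeps the sketch independent of the host's endianness and alignment rules, which is convenient when inspecting a ROM dump on a PC.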

Music Data

Song Header

The whole header is about 200 bytes long (the exact size depends on the version).

| Name | Type | Count | Comment |
| --- | --- | --- | --- |
| Number of Channels | uint16 | 1 | |
| Number of Rows per Pattern | uint16 | 1 | |
| Number of Patterns per Channel | uint16 | 1 | |
| Loop Point | uint16 | 1 | Pattern index starting from 0 |
| Master Volume | uint16 | 1 | Standard volume is 0x100 |
| Reserved | uint16 | 1 | 0 |
| Pointer to Sequence Data | uint32 | 1 | Must be accessed via Pattern Table |
| Pointer to Instrument Set | uint32 | 1 | Link to instrument entries |
| Pointer to Sample Set | uint32 | 1 | Link to sample entries |
| Mixing Rate (Hz) | uint16 | 1 | Can be programmatically overridden |
| FX Mixing Rate (Hz) | uint16 | 1 | 0 for the same rate as music. Can be programmatically overridden |
| Number of FX Slots | uint8 | 1 | Up to 4? Can be programmatically overridden |
| Reserved | uint8 | 1 | 0 |
| Reserved | uint16 | 1 | 0 |
| Pointer to Pattern Table | uint32[] | N | Playlist for each channel |
| Padding (filled with zeros) | uint32[] | N | |
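
For reference, the table above can be written as a C struct roughly as follows. This is only a sketch: the type and field names are made up for illustration, pointers are stored as `uint32_t` so that the layout holds on a 64-bit host, and the trailing array is assumed to hold one pattern table pointer per channel, followed by the zero padding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fixed part of the V3 song header, field by field in table order.
 * All names are illustrative, not the engine's own. */
typedef struct {
    uint16_t num_channels;
    uint16_t rows_per_pattern;
    uint16_t patterns_per_channel;
    uint16_t loop_point;        /* pattern index, starting from 0 */
    uint16_t master_volume;     /* 0x100 is the standard volume */
    uint16_t reserved_0A;       /* 0 */
    uint32_t sequence_data;     /* accessed via the pattern tables */
    uint32_t instrument_set;    /* link to instrument entries */
    uint32_t sample_set;        /* link to sample entries */
    uint16_t mixing_rate;       /* Hz */
    uint16_t fx_mixing_rate;    /* Hz, 0 = same rate as music */
    uint8_t  num_fx_slots;
    uint8_t  reserved_1D;       /* 0 */
    uint16_t reserved_1E;       /* 0 */
    uint32_t pattern_tables[];  /* assumed: one pointer per channel, then padding */
} Gax3SongHeader;

/* The fixed fields add up to 0x20 bytes with no compiler padding. */
static_assert(offsetof(Gax3SongHeader, sequence_data) == 0x0C, "unexpected layout");
static_assert(offsetof(Gax3SongHeader, mixing_rate) == 0x18, "unexpected layout");
static_assert(offsetof(Gax3SongHeader, pattern_tables) == 0x20, "unexpected layout");
```

The static assertions document the offsets implied by the table, so a mismatch shows up at compile time rather than as silently misread fields.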

Available mixing rates: 5735, 9079, 10513, 11469, 13380, 15769, 18158, 21025, 26760, 31537, 36316, 40138, 42049

If the specified mixing rate is not one of the rates listed above, the lowest listed rate that exceeds it is selected.
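
The selection rule above could be sketched like this. The function name is hypothetical, and the behavior for a request above 42049 Hz is not covered by these notes; clamping to the highest rate is an assumption of the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The mixing rates listed above, in ascending order. */
static const uint16_t kGaxMixingRates[] = {
    5735, 9079, 10513, 11469, 13380, 15769, 18158,
    21025, 26760, 31537, 36316, 40138, 42049
};

static_assert(sizeof(kGaxMixingRates) / sizeof(kGaxMixingRates[0]) == 13,
              "the list above has 13 rates");

/* Returns the requested rate if it is listed, otherwise the lowest
 * listed rate above it.
 * ASSUMPTION: requests above the highest rate clamp to 42049. */
static uint16_t gax_select_mixing_rate(uint16_t requested) {
    const size_t count = sizeof(kGaxMixingRates) / sizeof(kGaxMixingRates[0]);
    for (size_t i = 0; i < count; i++) {
        if (kGaxMixingRates[i] >= requested) {
            return kGaxMixingRates[i];
        }
    }
    return kGaxMixingRates[count - 1];
}
```

For example, a request for 16000 Hz would resolve to 18158 Hz under this rule.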

FX / Shared Samples

Collections that contain shared samples, such as FX, are represented by song headers with a channel count of zero.

Sequence Data

TBA

Title / Copyright string

Meta information for the song. For instance:

"Title Theme" © Martin Schioeler

Note that the text is NOT null-terminated. The end of the string must be dword-aligned (padded with zeros).

You can scan for these strings with a regular expression such as the following. (Note: This pattern is not perfect, because the text sometimes contains ISO-8859-1 characters that are not in ASCII.)

"[ -~]+?" © [ -~]+

As far as I know, the string isn't referenced from anywhere. It's usually located just before the first pattern table.
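
For scanning a ROM image without a regex library, a hand-written matcher along these lines may be enough. It follows the regular expression above, with one assumption: the © sign is taken to be the single ISO-8859-1 byte 0xA9. The function and macro names are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define GAX_NOT_FOUND ((size_t)-1)

/* Matches the character class [ -~] from the regular expression. */
static int gax_is_printable_ascii(uint8_t c) {
    return c >= 0x20 && c <= 0x7E;
}

/* Returns the offset of the opening quote of the first title string at or
 * after `start`, or GAX_NOT_FOUND.
 * ASSUMPTION: the copyright sign is stored as the byte 0xA9. */
static size_t gax_find_title(const uint8_t *rom, size_t size, size_t start) {
    assert(rom != NULL || size == 0);
    for (size_t i = start; i < size; i++) {
        if (rom[i] != '"') {
            continue;
        }
        /* Walk the printable run after the opening quote and test every
         * later quote as a possible closing quote (lazy match). */
        for (size_t j = i + 1; j + 4 < size && gax_is_printable_ascii(rom[j]); j++) {
            if (j >= i + 2 && rom[j] == '"' && rom[j + 1] == ' ' &&
                rom[j + 2] == 0xA9 && rom[j + 3] == ' ' &&
                gax_is_printable_ascii(rom[j + 4])) {
                return i;
            }
        }
    }
    return GAX_NOT_FOUND;
}
```

Like the regular expression, this misses titles that contain non-ASCII characters; widening `gax_is_printable_ascii` to accept bytes 0xA0 to 0xFF would be one way to loosen it.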

Pattern Table

The pattern table is a list of pattern entries. Each entry is a 32-bit value laid out as follows.

| Name | Type | Comment |
| --- | --- | --- |
| Pattern Offset | uint16 | Offset from the beginning of Sequence Data |
| Transpose | int8 | In semitones |
| Reserved | uint8 | 0 |
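
A minimal decoding sketch for one pattern entry, assuming the fields are stored in the order shown above and that the 16-bit offset is little-endian (as is usual on the GBA). The names are illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* One pattern table entry. Names are illustrative only. */
typedef struct {
    uint16_t pattern_offset; /* offset from the beginning of the sequence data */
    int8_t   transpose;      /* in semitones */
    uint8_t  reserved;       /* 0 */
} GaxPatternEntry;

static_assert(sizeof(GaxPatternEntry) == 4, "a pattern entry is 32 bits");

/* Decode one entry from its four raw bytes. */
static GaxPatternEntry gax_decode_pattern_entry(const uint8_t raw[4]) {
    GaxPatternEntry e;
    e.pattern_offset = (uint16_t)(raw[0] | (raw[1] << 8));
    /* Sign-extend the transpose byte without relying on
     * implementation-defined narrowing conversions. */
    e.transpose = (int8_t)((int)raw[2] - ((raw[2] & 0x80) ? 0x100 : 0));
    e.reserved = raw[3];
    return e;
}
```

The address of a pattern would then be the header's sequence data pointer plus `pattern_offset`.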

API and Setup Example

Note that the symbol names are for convenience only and are not the actual names.

// The details of the structure vary from version to version.
// The following is for Maya The Bee: Sweet Gold.
struct Gax2Params {
  void *wram;
  uint32_t wram_size;
  int16_t mixing_rate;
  int16_t fx_mixing_rate;
  int16_t field_C;
  int16_t flags;
  int16_t num_fx_channels;
  int16_t volume;
  uint8_t unknown_14[0x18];
  const void *global_samples;
  const void *music;
  int32_t field_34;
  bool8_t debug_assert;
  uint8_t field_39;
  uint8_t field_3A;
  uint8_t field_3B;
};

void gax2_estimate(Gax2Params *params);
void gax2_new(Gax2Params *params);
bool gax2_init(Gax2Params *params);
bool gax2_jingle(const Gax2SongHeader *);
void gax_irq();
void gax_play();
int gax_fx(uint8_t fxid);
void gax2_fx(Gax2FxParams *fxparams);
void gax2_new_fx(Gax2FxParams *fxparams);
size_t gax_save_fx(int fxchannel, void *buf);
void gax_restore_fx(int fxchannel, const void *buf);

void AgbMain() {
    // Initialization omitted

    REG_IME = 0;

    Gax2Params params;
    gax2_new(&params);

    // params.mixing_rate = -1;
    // params.fx_mixing_rate = -1;
    // params.flags = 0;
    // params.num_fx_channels = -1;
    // params.volume = -1;
    params.music = song_address;
    params.global_samples = sfx_address;
    params.debug_assert = true;
    gax2_estimate(&params);

    params.wram = malloc(params.wram_size);
    gax2_init(&params);

    REG_DISPSTAT = DSTAT_VBL_IRQ | DSTAT_VCT_IRQ | DSTAT_VCT(10);
    REG_IE = IRQ_VBLANK | IRQ_VCOUNT;
    REG_IME = 1;

    while (1) {
        Halt();
    }
}

void VBlankIntr() {
    gax_irq();
}

void VCountIntr() {
    // It is not mandatory to use the VCOUNT interrupt.
    // gax_play can be called synchronously in the main function.
    gax_play();
}

Layout of GAX V2

While GAX V2 and GAX V3 share most of the same APIs, their data structures are different. Apparently, V2 has a more programmable and complex structure.

This research is based primarily on Bruce Lee: Return of the Legend, and the details may differ in other games.

Top-Level Structure

The top-level structure of a song has the following layout.

struct Gax2SongEntry {
    int num_handlers;
    const Gax2SoundHandler* handler_0;
    const Gax2SoundHandler* handler_1;
    const Gax2SongEntry* handler_2;
    const Gax2SoundHandler* handler_3;
    const Gax2SoundHandler* more_handlers[];
};

The number of handlers must be at least 4, or 0 if you want to represent empty data.
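
The rule above is easy to check when walking candidate structures in a ROM. The helpers below are hypothetical and only encode the stated constraint: a count is either 0, or at least 4 with the remainder stored after the four fixed slots.

```c
#include <assert.h>
#include <stdbool.h>

enum { GAX2_MIN_HANDLERS = 4 };

/* True if the count satisfies the rule above: 0 (empty) or at least 4. */
static bool gax2_handler_count_is_valid(int num_handlers) {
    return num_handlers == 0 || num_handlers >= GAX2_MIN_HANDLERS;
}

/* Number of handlers stored after the four fixed slots. */
static int gax2_extra_handler_count(int num_handlers) {
    assert(gax2_handler_count_is_valid(num_handlers));
    return num_handlers == 0 ? 0 : num_handlers - GAX2_MIN_HANDLERS;
}
```

With a count of 9, for instance, five handlers follow the four fixed slots.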

The handler defines how the sound should be processed, and holds several function pointers and parameters.

struct Gax2SoundHandler {
    Gax2InitHandlerFunc init_handler;
    Gax2UnknownHandlerFunc unknown_handler;
    Gax2PlayHandlerFunc play_handler;
    int num_related_handlers;
    const Gax2SoundHandler* related_handlers; // details of the relationship and usage are still unknown
    int field_14;
    const void* data; // can be song header, pattern table, etc.
};

In the actual data, the first three handlers appear to be fixed, followed by one handler per channel.

struct Gax2ExampleSongEntry {
    int num_handlers = 9;
    const Gax2SoundHandler* patterns_handler;
    const Gax2SoundHandler* song_header_handler;
    const Gax2SongEntry* unknown_handler;
    const Gax2SoundHandler* channel_1_handler;
    const Gax2SoundHandler* channel_2_handler;
    const Gax2SoundHandler* channel_3_handler;
    const Gax2SoundHandler* channel_4_handler;
    const Gax2SoundHandler* channel_5_handler;
    const Gax2SoundHandler* channel_6_handler;
};

Song Header

The structure of the song header is almost identical to V3, but V2 does not have references to each channel.

| Name | Type | Count | Comment |
| --- | --- | --- | --- |
| Number of Channels | uint16 | 1 | |
| Number of Rows per Pattern | uint16 | 1 | |
| Number of Patterns per Channel | uint16 | 1 | |
| Loop Point | uint16 | 1 | Pattern index starting from 0 |
| Master Volume | uint16 | 1 | Standard volume is 0x100 |
| Reserved | uint16 | 1 | 0 |
| Pointer to Sequence Data | uint32 | 1 | Must be accessed via Pattern Table |
| Pointer to Instrument Set | uint32 | 1 | Link to instrument entries |
| Pointer to Sample Set | uint32 | 1 | Link to sample entries |
| Mixing Rate (Hz) | uint16 | 1 | Can be programmatically overridden |
| Number of FX Slots | uint8 | 1 | Up to 4? Can be programmatically overridden |
| Reserved | uint8 | 1 | 0 |
| Unknown Pointer | uint32 | 1 | Not available before GAX 2.3 |
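
Written out as a C struct, the V2 header above comes to 0x20 bytes when the unknown pointer is present, which agrees with the size of the header in the assembly dump in the FX / Shared Samples section. The field names here are illustrative, and pointers are kept as `uint32_t` so the layout also holds on a 64-bit host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* V2 song header (GAX 2.3 or later, which includes the unknown pointer).
 * Field names are illustrative, not the engine's own. */
typedef struct {
    uint16_t num_channels;
    uint16_t rows_per_pattern;
    uint16_t patterns_per_channel;
    uint16_t loop_point;      /* pattern index, starting from 0 */
    uint16_t master_volume;   /* 0x100 is the standard volume */
    uint16_t reserved_0A;     /* 0 */
    uint32_t sequence_data;   /* accessed via the pattern tables */
    uint32_t instrument_set;  /* link to instrument entries */
    uint32_t sample_set;      /* link to sample entries */
    uint16_t mixing_rate;     /* Hz */
    uint8_t  num_fx_slots;
    uint8_t  reserved_1B;     /* 0 */
    uint32_t unknown_1C;      /* not available before GAX 2.3 */
} Gax2SongHeader;

static_assert(sizeof(Gax2SongHeader) == 0x20, "V2 header should be 0x20 bytes");
static_assert(offsetof(Gax2SongHeader, mixing_rate) == 0x18, "unexpected layout");
```

For versions before GAX 2.3, the last field would have to be dropped, giving a 0x1C-byte header.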

FX / Shared Samples

As in GAX V3, a song header with zero channels is used.

This entry is not only necessary for playing sound effects, but apparently it also affects the result of music playback. If you omit the pointer to this entry, some instruments may be missing.

@ The root of the whole structure is placed last

word_8263F3C:
    .2byte 0    @ number of channels is 0
    .2byte 0
    .2byte 0
    .2byte 0
    .2byte 0xD0
    .2byte 0
    .4byte 0
    .4byte off_820D820  @ instrument pointers
    .4byte stru_8263AD4 @ sample entries
    .2byte 0
    .byte 0
    .byte 0
    .4byte 0

stru_8263F5C:
    .4byte 0

stru_8263F60:
    .4byte sub_8137AB0+1
    .4byte nullsub_1+1
    .4byte sub_8137B38+1
    .4byte 1
    .4byte stru_8263F5C
    .4byte 0x4C
    .4byte word_8263F3C

gaxSampleSet: @ The length of the array is variable (or may vary from version to version)
    .4byte stru_8263F60
    .4byte stru_8263F60
    .4byte stru_8263F60
    .4byte stru_8263F60
    .4byte stru_8263F60
    .4byte stru_8263F60
    .4byte stru_8263F60
    .4byte stru_8263F60
    .4byte stru_8263F60

Pattern Table

The format of the 4-byte pattern entry is the same as in V3. The pointer to the pattern table is stored in the handler of each channel.

Title / Copyright string

As in V3, the title text is embedded in the song. In the case of V2, the text will be placed just before the pattern table of the earliest appearing channel (i.e. the channel with the smallest address).

@nikku4211

Hey LoveEmu, I've got some information on GAX's sequence data.

GAX v3 Sequence Data Format (7Aug2020)
By: Nikku4211 (1034co.neocities.org)
Determined with the help of data from: Loveemu

Each note is either 2 or 4 bytes long, depending on the sign bit of its first byte.

00: Note pitch in semitones, signed 8-bit. If the sign bit is set, byte 02 holds either FF or 80 instead of an effect.

01: Instrument number.

02: If the sign bit of byte 00 is clear, this byte selects which effect to use. If it is set, this byte is either FF, which is note off, or 80, which sticks this note to the next.

03: If the sign bit of byte 00 is clear, this byte is the parameter for the effect in byte 02. If the sign bit is set and byte 02 is FF, this byte is how long the note off lasts.

This data is incomplete. All suggestions are welcome!

@loveemu
Author

loveemu commented Apr 7, 2021

❤️

@beanieaxolotl

Sample data in GAX v1 is indeed signed 8-bit PCM, but from GAX 2 and onwards the sample data is unsigned 8-bit PCM.
