Pokémon Battle Revolution had an issue where certain effects (those that distort the camera) caused the entire screen to shift slightly.
This writeup mainly analyzes the Psycho Cut fifolog (from issue 12629, though the actual issue is described in issue 11875).
All of the effects work using indirect textures. First, the game draws the main environment (see ZPBR_Psy_base_new.png and ZPBR_Rain_base_new.png). Then, it makes an EFB copy and clears the screen (but not the depth buffer). It then draws a second effect, which will serve as an offset to the screen (see ZPBR_Psy_indirect.png and ZPBR_Rain_indirect.png). This might be drawn in 3D space (as with the energy pattern used by Psycho Cut) or in 2D space (as with the rain effect); that depends on whether the effect should remain still in 3D space while the camera moves (e.g. the energy pattern) or should move with the camera (e.g. the rain effect, which acts as water on the camera).
Since the depth buffer was not cleared, the effects will not show through solid objects. This is visible in the Psycho Cut case: the effect would actually continue underneath the floor, but it gets cut off, producing a visible edge at the bottom (note that the energy effect itself does not have depth updates enabled). All of this means that the game can adjust the camera and the effects still work properly with no additional CPU-side computation; it also means that Dolphin's free-look functionality works properly with them.
After that effect has been prepared, another EFB copy is made, and then both the original and new EFB copies are combined.
EFB copy 3 (0005aaf3)
BP register BPMEM_CLEAR_AR
Clear color alpha: 0x00
Clear color red: 0x00
BP register BPMEM_CLEAR_GB
Clear color green: 0x00
Clear color blue: 0x00
BP register BPMEM_CLEAR_Z
Clear Z value: 0xFFFFFF
BP register BPMEM_ZMODE
Z mode: Enable test: Yes
Compare function: Always (7)
Enable updates: No
BP register BPMEM_ZCOMPARE
EFB pixel format: RGB8_Z24 (0)
Depth format: linear (0)
Early depth test: No
BP register BPMEM_EFB_TL
EFB Left: 0
EFB Top: 0
BP register BPMEM_EFB_WH
EFB Width: 640
EFB Height: 480
BP register BPMEM_MIPMAP_STRIDE (0x140)
No description available
BP register BPMEM_EFB_ADDR
EFB Target address (32 byte aligned): 0xA44A20
BP register BPMEM_TRIGGER_EFB_COPY
Clamping: Top and Bottom
Converting from RGB to YUV: No
Target pixel format: RGBA8 (6)
Gamma correction: 1.0
Half scale: No
Vertical scaling: No
Clear: Yes
Frame to field: Progressive (0)
Copy to XFB: No
Intensity format: No
Automatic color conversion: Yes
EFB copy 4 (0005c726)
BP register BPMEM_CLEAR_AR
Clear color alpha: 0x00
Clear color red: 0x00
BP register BPMEM_CLEAR_GB
Clear color green: 0x00
Clear color blue: 0x00
BP register BPMEM_CLEAR_Z
Clear Z value: 0xFFFFFF
BP register BPMEM_EFB_TL
EFB Left: 0
EFB Top: 0
BP register BPMEM_EFB_WH
EFB Width: 640
EFB Height: 480
BP register BPMEM_MIPMAP_STRIDE (0x50)
No description available
BP register BPMEM_EFB_ADDR
EFB Target address (32 byte aligned): 0xB71520
BP register BPMEM_TRIGGER_EFB_COPY
Clamping: Top and Bottom
Converting from RGB to YUV: No
Target pixel format: RA8/IA8 (Z16 too?) (3)
Gamma correction: 1.0
Half scale: Yes
Vertical scaling: No
Clear: No
Frame to field: Progressive (0)
Copy to XFB: No
Intensity format: Yes
Automatic color conversion: Yes
Object 440
Indirect matrices (0005c778)
The indirect scale comes out as 17, which means that the effective scale is 2^(17-17) = 2^0 = 1. In other words, the matrix entries can be used directly.
BP register BPMEM_IND_MTXA Matrix 0
Matrix 0 column A
Row 0 (ma): -0.036132812 (-37)
Row 1 (mb): -0.036132812 (-37)
Scale bits: 1 (shifted: 1)
BP register BPMEM_IND_MTXB Matrix 0
Matrix 0 column B
Row 0 (mc): 0.58203125 (596)
Row 1 (md): 0.58203125 (596)
Scale bits: 0 (shifted: 0)
BP register BPMEM_IND_MTXC Matrix 0
Matrix 0 column C
Row 0 (me): 0 (0)
Row 1 (mf): 0 (0)
Scale bits: 1 (shifted: 16), given to SDK as 1 (16)
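As a rough check of the numbers in this dump, here is a minimal Python sketch of how the matrix could be decoded. It assumes (based only on the values shown above) that each raw entry is a signed fixed-point value with 10 fractional bits, and that the three per-column scale fields combine as s0 | (s1 << 2) | (s2 << 4). The function name decode_ind_matrix is made up for illustration.

```python
def decode_ind_matrix(raw_rows, scale_bits):
    """Decode raw matrix entries and per-column scale bits (assumed layout)."""
    s0, s1, s2 = scale_bits
    # Assumption: the three 2-bit fields form one scale value (1, 0, 16 in the dump).
    scale = s0 | (s1 << 2) | (s2 << 4)
    # Effective scale is 2^(scale - 17), as stated in the note above.
    factor = 2.0 ** (scale - 17)
    # Assumption: raw entries carry 10 fractional bits (-37 / 1024 = -0.0361328125).
    matrix = [[raw / 1024.0 * factor for raw in row] for row in raw_rows]
    return scale, matrix


# Rows are (ma, mc, me) and (mb, md, mf), taken from the dump above.
raw = [(-37, 596, 0), (-37, 596, 0)]
scale, matrix = decode_ind_matrix(raw, (1, 0, 1))
print("scale:", scale)
for row in matrix:
    print(row)
```

With these inputs the effective factor is 1, so the decoded entries match the floating-point values printed in the dump.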
TEV configuration (0005c7a9)
BP register BPMEM_TREF number 0
Stage 0 texmap: 0
Stage 0 tex coord: 0
Stage 0 enable texmap: Yes
Stage 0 color channel: Zero (7)
Stage 1 texmap: 0
Stage 1 tex coord: 0
Stage 1 enable texmap: Yes
Stage 1 color channel: Color chan 1 (1)
-Duplicate half-configured configuration skipped-
BP register BPMEM_TEV_COLOR_ENV Tev stage 0
dest.rgb = tex.rgb
a: ZERO (15)
b: ZERO (15)
c: ZERO (15)
d: tex.rgb (8)
Bias: 0 (0)
Op: Add (0) / Comparison: Greater than (0)
Clamp: Yes
Scale factor: 1 (0) / Compare mode: R8 (0)
Dest: prev (0)
BP register BPMEM_TEV_ALPHA_ENV Tev stage 0
dest.a = 0
a: ZERO (7)
b: ZERO (7)
c: ZERO (7)
d: ZERO (7)
Bias: 0 (0)
Op: Add (0) / Comparison: Greater than (0)
Clamp: Yes
Scale factor: 1 (0) / Compare mode: R8 (0)
Dest: prev (0)
Ras sel: 0
Tex sel: 0
BP register BPMEM_IND_CMD command 0
Indirect tex stage ID: 0
Format: ITF_8 (0)
Bias: None (0)
Bump alpha: Off (0)
Offset matrix index: Matrix 0 (1)
Offset matrix ID: Indirect (0)
Regular coord S wrapping factor: Off (0)
Regular coord T wrapping factor: Off (0)
Use modified texture coordinates for LOD computation: No
Add texture coordinates from previous TEV stage: No
BP register BPMEM_IREF
Stage 0 ntexmap: 1
Stage 0 ntexcoord: 0
Stage 1 ntexmap: 0
Stage 1 ntexcoord: 0
Stage 2 ntexmap: 0
Stage 2 ntexcoord: 0
Stage 3 ntexmap: 0
Stage 3 ntexcoord: 0
BP register BPMEM_RAS1_SS0
Indirect texture stages 0 and 1:
Even stage S scale: 1 (0.5)
Even stage T scale: 1 (0.5)
Odd stage S scale: 0 (1)
Odd stage T scale: 0 (1)
Texture 0 configuration (0005c7d6)
Texture 0 points to EFB copy 3.
BP register BPMEM_TX_SETMODE0 Texture Unit 0
Wrap S: Clamp (0)
Wrap T: Clamp (0)
Mag filter: Linear (1)
Mipmap filter: None (0)
Min filter: Linear (1)
LOD type: Diagonal LOD (1)
LOD bias: 0 (0)
Max anisotropic filtering: 1 (0)
LOD/bias clamp: No
BP register BPMEM_TX_SETMODE1 Texture Unit 0
Min LOD: 0 (0)
Max LOD: 0 (0)
BP register BPMEM_TX_SETIMAGE0 Texture Unit 0
Width: 640
Height: 480
Format: RGBA8 (6)
BP register BPMEM_TX_SETIMAGE1 Texture Unit 0
Even TMEM Offset: 0
Even TMEM Width: 3
Even TMEM Height: 3
Cache is manually managed: No
BP register BPMEM_TX_SETIMAGE2 Texture Unit 0
Odd TMEM Offset: 4000
Odd TMEM Width: 3
Odd TMEM Height: 3
BP register BPMEM_TX_SETIMAGE3 Texture Unit 0
Source address (32 byte aligned): 0xA44A20
Texture 1 configuration (0005c7f4)
Texture 1 points to EFB copy 4.
BP register BPMEM_TX_SETMODE0 Texture Unit 1
Wrap S: Clamp (0)
Wrap T: Clamp (0)
Mag filter: Linear (1)
Mipmap filter: None (0)
Min filter: Linear (1)
LOD type: Diagonal LOD (1)
LOD bias: 0 (0)
Max anisotropic filtering: 1 (0)
LOD/bias clamp: No
BP register BPMEM_TX_SETMODE1 Texture Unit 1
Min LOD: 0 (0)
Max LOD: 0 (0)
BP register BPMEM_TX_SETIMAGE0 Texture Unit 1
Width: 320
Height: 240
Format: IA8 (3)
BP register BPMEM_TX_SETIMAGE1 Texture Unit 1
Even TMEM Offset: 800
Even TMEM Width: 3
Even TMEM Height: 3
Cache is manually managed: No
BP register BPMEM_TX_SETIMAGE2 Texture Unit 1
Odd TMEM Offset: c00
Odd TMEM Width: 3
Odd TMEM Height: 3
BP register BPMEM_TX_SETIMAGE3 Texture Unit 1
Source address (32 byte aligned): 0xB71520
Texture coordinate scale configuration (0005c870)
This is configured twice; only the second one actually applies. (The configuration is based on a texture size, though actual hardware doesn't care about the texture corresponding to the coordinate.)
BP register BPMEM_SU_SSIZE number 0
S size info:
Scale: 320
Range bias: No
Cylindric wrap: No
Use line offset: No (s only)
Use point offset: No (s only)
BP register BPMEM_SU_TSIZE number 0
T size info:
Scale: 240
Range bias: No
Cylindric wrap: No
Use line offset: No (s only)
Use point offset: No (s only)
BP register BPMEM_SU_SSIZE number 0
S size info:
Scale: 640
Range bias: No
Cylindric wrap: No
Use line offset: No (s only)
Use point offset: No (s only)
BP register BPMEM_SU_TSIZE number 0
T size info:
Scale: 480
Range bias: No
Cylindric wrap: No
Use line offset: No (s only)
Use point offset: No (s only)
Genmode (0005c884)
BP register BPMEM_GENMODE
Num tex gens: 1
Num color channels: 0
Unused bit: 0
Flat shading (unconfirmed): No
Multisampling: No
Num TEV stages: 1
Cull mode: Back-facing primitives only (1)
Num indirect stages: 1
ZFreeze: No
Primitive data (0005c8fb)
This draws a rectangle covering the entire screen (from (0, 0) to (640, 480)) with a single texture coordinate ranging from (0, 0) to (1, 1).
Primitive GX_DRAW_TRIANGLE_STRIP (3) VAT 1
00000000 (0) 00000000 (0) 00000000 (0) 00000000 (0)
44200000 (640) 00000000 (0) 3f800000 (1) 00000000 (0)
00000000 (0) 43f00000 (480) 00000000 (0) 3f800000 (1)
44200000 (640) 43f00000 (480) 3f800000 (1) 3f800000 (1)
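The hex words above are big-endian IEEE 754 single-precision floats. A small Python sketch for decoding them follows; reading each row as (x, y, s, t) is my interpretation of the description above, and hex_to_float is just a helper name for this example.

```python
import struct


def hex_to_float(word):
    """Interpret an 8-digit hex string as a big-endian 32-bit float."""
    return struct.unpack(">f", bytes.fromhex(word))[0]


# The four vertices of the triangle strip, copied from the dump above.
rows = [
    "00000000 00000000 00000000 00000000",
    "44200000 00000000 3f800000 00000000",
    "00000000 43f00000 00000000 3f800000",
    "44200000 43f00000 3f800000 3f800000",
]

decoded = [[hex_to_float(w) for w in row.split()] for row in rows]
for x, y, s, t in decoded:
    print(f"position=({x}, {y}) texcoord=({s}, {t})")
```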
Pokémon Battle Revolution uses an RGB8 EFB (meaning each framebuffer pixel has 8 bits of red, green, and blue data, and no alpha data is stored at all). Object 68, offset 0002c85a (not shown below) sets the pixel format in this case, though the same value is used in several places before and after. The first relevant EFB copy is EFB copy 3, right before object 423; the second relevant one is EFB copy 4, right before object 440. Object 440 then combines the two, applying the indirect effect.
EFB copy 3 uses the RGBA8 format, and captures the whole screen as a 640 by 480 texture. RGBA8 indicates that there are 8 bits of data each for red, green, blue, and alpha color channels. Note that since this format includes an alpha channel, but the EFB does not, some default value needs to be used for it. This value is 1.0 (or 255).
EFB copy 4 uses the IA8 format, an intensity format. (The EFB copy's target pixel format is 3, which Dolphin labels as "RA8/IA8 (Z16 too?)"; it also has intensity format set to true.) This EFB copy captures the whole screen as a 320 by 240 texture, because it has half-scale enabled. The scaling down by 2 is presumably only to reduce memory usage; it doesn't seem to be necessary for rendering. IA8 indicates that the texture has an intensity channel and an alpha channel. This means that the system needs to map the EFB's red, green, and blue channels into a single channel, which is a somewhat complicated process I won't go into here. When rendering, the intensity channel is used as the value for the red, green, and blue channels (though that doesn't apply in this case, as EFB copy 4 isn't rendered directly to the screen). As before, there's also an alpha channel, which needs to have some default value.
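For a rough idea of what that mapping looks like, here is a Python sketch of the commonly cited BT.601 studio-range luma formula (BT.601 is the standard this writeup identifies for intensity formats further down). This is an assumption for illustration only: real hardware and Dolphin use their own integer arithmetic, so rounding may differ. The name bt601_luma is hypothetical.

```python
def bt601_luma(r, g, b):
    """Map 8-bit RGB (0..255) to BT.601 studio-range luma (16..235).

    Uses the textbook floating-point coefficients; this is not the exact
    integer math performed by the hardware.
    """
    return 16 + (65.481 * r + 128.553 * g + 24.966 * b) / 255.0


print("black:", bt601_luma(0, 0, 0))
print("white:", bt601_luma(255, 255, 255))
print("mid gray:", bt601_luma(128, 128, 128))
```

The point to notice is that black does not map to 0 and white does not map to 255, which matters for the offset calculation later on.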
The game then draws over the entire screen using a rectangle where the texture coordinate ranges from (0, 0) at the top-left to (1, 1) at the bottom-right. This texture coordinate is used for two textures at the same time: the one corresponding to EFB copy 3, and the one corresponding to EFB copy 4. The texture coordinate is multiplied by the corresponding size info, causing it to range from (0, 0) to (640, 480). It then is used to sample the texture for EFB copy 4, as part of the indirect stage; however, it is scaled down by a factor of 2 for this (as set in BPMEM_RAS1_SS0), which is needed because EFB copy 4 is half-size (thus, the scaled texture coordinate ranges from (0, 0) to (320, 240), which is what is needed for EFB copy 4).
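The two scaling steps can be sketched in a few lines of Python. The function name and parameter defaults are made up for this example; the 640 by 480 size comes from the second BPMEM_SU_SSIZE/BPMEM_SU_TSIZE pair, and the 0.5 factors come from the even-stage scale in BPMEM_RAS1_SS0.

```python
def scaled_coords(s, t, size=(640, 480), ind_scale=(0.5, 0.5)):
    """Return (base, indirect) texel coordinates for a normalized (s, t)."""
    # Step 1: multiply by the size info, giving (0, 0) .. (640, 480).
    base = (s * size[0], t * size[1])
    # Step 2: the indirect stage scales that by 0.5 to match the half-size copy.
    indirect = (base[0] * ind_scale[0], base[1] * ind_scale[1])
    return base, indirect


for s, t in [(0.0, 0.0), (0.5, 0.5), (1.0, 1.0)]:
    base, indirect = scaled_coords(s, t)
    print(f"({s}, {t}) -> base {base}, indirect {indirect}")
```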
EFB copy 4 is used as an indirect texture, meaning the color read from it is multiplied with the indirect matrix, and then that product is added back to the original texture coordinate (not the one that has been scaled by a factor of 2). The specific indirect matrix used has the same values for both rows, meaning that both the s and t texture coordinates are offset by the same amount (effectively, brighter values in EFB copy 4 cause pixels to use texture coordinates that are further diagonally down-right, so the overall image moves up-left). The modified texture coordinate is then used to sample the texture for EFB copy 3.
A somewhat trivial TEV configuration is used where the output color is simply set to the sampled color from the texture for EFB copy 3, and the output alpha value is set to 0. Since the EFB has no alpha channel (and the alpha test is configured to always pass), the value for alpha here does not matter.
A somewhat simplified version of what the game is doing (in pseudo-glsl-that-probably-has-compile-errors™) is this:
uniform sampler2D base_texture; // EFB copy 3
uniform sampler2D indirect_texture; // EFB copy 4
uniform mat3x2 ind_mtx;
varying vec2 texture_coordinate; // from (0, 0) to (1, 1)
void main(void) {
  // A varying is read-only in a fragment shader, so work on a scaled copy.
  // As in the rest of this sketch, texture() is treated as taking texel coordinates.
  vec2 coord = texture_coordinate * vec2(640.0, 480.0);
  // The indirect texture is half-size, so its coordinate is halved.
  vec2 indirect_sample_coord = coord / 2.0;
  vec4 color = texture(indirect_texture, indirect_sample_coord);
  // color.r == color.g == color.b; all are set to the intensity channel of the indirect texture
  coord.s += ind_mtx[0][0] * color.a + ind_mtx[1][0] * color.r + ind_mtx[2][0] * color.g;
  coord.t += ind_mtx[0][1] * color.a + ind_mtx[1][1] * color.r + ind_mtx[2][1] * color.g;
  // i.e. coord += ind_mtx * color.arg;
  gl_FragColor = vec4(texture(base_texture, coord).rgb, 0);
}
The indirect texture matrix is s = t = -0.036132812 * a + 0.58203125 * r + 0 * g, where a is the alpha channel and r/g/b are all the value of the intensity channel (b is not available to the indirect matrix, however). It may seem odd that alpha appears in this equation, given that the EFB does not have an alpha channel. Indeed, that's the source of the problem: Dolphin was using an alpha value of 0 in this case, when it should have been using an alpha value of 255 (it already used an alpha value of 255 for non-intensity formats, but was using 0 for intensity formats).
But why is the alpha channel needed? Well, what they're actually trying to do is add a constant offset, and the only way to do that is to use a constant value. Specifically, since alpha is 255, the offset is -9.21386706. But why would that offset be wanted? It seems rather arbitrary...
The answer is that intensity formats use the BT.601 standard, and that standard doesn't use a value of 0 for black and 255 for white. Instead, black is 16, and white is 235. (This can be seen in the indirect images.) So, if black is not supposed to perform any offsetting, that 16 needs to be converted back to zero. It just so happens that -0.036132812 * 255 + 0.58203125 * 16 = -9.21386706 + 9.3125 = 0.09863294, or about 1/10th of a pixel in offset, which is the best that is achievable with the values you can use in the indirect matrix. Similarly, (235 - 16)*0.58203125 is 127.46484375, the closest value to 127.5 = 255/2 that can be achieved. So, these strange-looking values are trying to convert to the range of [0, 127.5]. But if the alpha channel has the wrong value of 0, the constant term vanishes and the offset is just 0.58203125 * r, giving a range of approximately [9.3, 136.8] instead, meaning pixels that were not supposed to be shifted at all were instead shifted by about 9 pixels both horizontally and vertically.
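The arithmetic above is easy to reproduce. The Python sketch below uses the exact fractions -37/1024 and 596/1024 from the indirect matrix dump, so the trailing digits differ slightly from the rounded decimal values quoted in the text. The names MA, MC, and offset are only for this example.

```python
MA = -37 / 1024   # coefficient applied to the alpha channel
MC = 596 / 1024   # coefficient applied to the intensity channel

BLACK, WHITE = 16, 235  # BT.601 black and white levels


def offset(intensity, alpha):
    """Texture coordinate offset (in pixels) for one indirect texture sample."""
    return MA * alpha + MC * intensity


# Correct behavior: alpha reads back as 255.
print("black, alpha=255:", offset(BLACK, 255))
print("white, alpha=255:", offset(WHITE, 255))

# The case described as the bug: alpha reads back as 0 for intensity formats.
print("black, alpha=0:  ", offset(BLACK, 0))
```

With alpha at 255, black lands within about a tenth of a pixel of zero; with alpha at 0, black pixels alone already produce a shift of a little over 9 pixels.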
There is still a slight shift (of 0.09863294 pixels) left over, but that shift would happen on real hardware too, and there's no practical way to eliminate it. But, given that the camera usually moves by more than that each frame, I don't think that shift is noticeable in practice, even at higher IRs where 1/10th of a pixel becomes 1 pixel.
As a side note, the original shift issue results in a black border. Why is it black? Well, that's actually a separate issue, caused by a game bug. See the next document for more information...
This is an excellent writeup; Kudos for the effort you obviously put into this.
If I've understood your explanation correctly I think there's a small typo:
I believe the scissor goes up to (638, 478) instead of 479.