Skip to content

Instantly share code, notes, and snippets.

@Pokechu22
Last active February 21, 2024 12:14
Show Gist options
  • Save Pokechu22/e9fa9037fe21312ff32475638b78ba27 to your computer and use it in GitHub Desktop.
Save Pokechu22/e9fa9037fe21312ff32475638b78ba27 to your computer and use it in GitHub Desktop.
Pokémon Battle Revolution graphical glitch writeup

Pokémon Battle Revolution had an issue where certain effects (those that distort the camera) caused the entire screen to shift slightly.

This writeup mainly analyzes the psycho cut fifolog (from issue 12629, though the actual issue is described in issue 11875).

How the effects render

All of the effects work using indirect textures. First, the game draws the main environment (see ZPBR_Psy_base_new.png and ZPBR_Rain_base_new.png). Then, it makes an EFB copy, and clears the screen (but not the depth buffer). It then draws a second effect, which will serve as an offset to the screen (see ZPBR_Psy_indirect.png and ZPBR_Rain_indirect.png). This might be drawn in 3D space (as with the energy pattern used by Psycho Cut) or it might be drawn in 2D space (as with the rain effect); that depends on whether the effect remain still in 3D space while the camera moves (e.g. the energy pattern) or should move with the camera (e.g. the rain effect, which acts as water on the camera). Since the depth buffer was not cleared, the effects will not show through solid objects (this is visible in the Psycho Cut case, as the effect actually would continue underneath the floor, but it gets cut off, producing a visible edge at the bottom - note that the energy effect itself does not have depth updates enabled). All of this means that the game can adjust the camera and the effects still working properly with no additional CPU-side computation, and also means that Dolphin's free-look functionality can work properly with it. After that effect has been prepared, another EFB copy is made, and then both the original and new EFB copies are combined.

EFB copy 3 (0005aaf3)
BP register BPMEM_CLEAR_AR
Clear color alpha: 0x00
Clear color red: 0x00

BP register BPMEM_CLEAR_GB
Clear color green: 0x00
Clear color blue: 0x00

BP register BPMEM_CLEAR_Z
Clear Z value: 0xFFFFFF

BP register BPMEM_ZMODE
Z mode: Enable test: Yes
Compare function: Always (7)
Enable updates: No

BP register BPMEM_ZCOMPARE
EFB pixel format: RGB8_Z24 (0)
Depth format: linear (0)
Early depth test: No

BP register BPMEM_EFB_TL
EFB Left: 0
EFB Top: 0

BP register BPMEM_EFB_WH
EFB Width: 640
EFB Height: 480

BP register BPMEM_MIPMAP_STRIDE (0x140)
No description available

BP register BPMEM_EFB_ADDR
EFB Target address (32 byte aligned): 0xA44A20

BP register BPMEM_TRIGGER_EFB_COPY
Clamping: Top and Bottom
Converting from RGB to YUV: No
Target pixel format: RGBA8 (6)
Gamma correction: 1.0
Half scale: No
Vertical scaling: No
Clear: Yes
Frame to field: Progressive (0)
Copy to XFB: No
Intensity format: No
Automatic color conversion: Yes
EFB copy 4 (0005c726)
BP register BPMEM_CLEAR_AR
Clear color alpha: 0x00
Clear color red: 0x00

BP register BPMEM_CLEAR_GB
Clear color green: 0x00
Clear color blue: 0x00

BP register BPMEM_CLEAR_Z
Clear Z value: 0xFFFFFF

BP register BPMEM_EFB_TL
EFB Left: 0
EFB Top: 0

BP register BPMEM_EFB_WH
EFB Width: 640
EFB Height: 480

BP register BPMEM_MIPMAP_STRIDE (0x50)
No description available

BP register BPMEM_EFB_ADDR
EFB Target address (32 byte aligned): 0xB71520

BP register BPMEM_TRIGGER_EFB_COPY
Clamping: Top and Bottom
Converting from RGB to YUV: No
Target pixel format: RA8/IA8 (Z16 too?) (3)
Gamma correction: 1.0
Half scale: Yes
Vertical scaling: No
Clear: No
Frame to field: Progressive (0)
Copy to XFB: No
Intensity format: Yes
Automatic color conversion: Yes
Object 440
Indirect matrices (0005c778)

The indirect scale comes out as 17, which means that the effective scale is 2^(17-17) = 2^0 = 1. In other words, the matrix entries can be used directly.

BP register BPMEM_IND_MTXA Matrix 0
Matrix 0 column A
Row 0 (ma): -0.036132812 (-37)
Row 1 (mb): -0.036132812 (-37)
Scale bits: 1 (shifted: 1)

BP register BPMEM_IND_MTXB Matrix 0
Matrix 0 column B
Row 0 (mc): 0.58203125 (596)
Row 1 (md): 0.58203125 (596)
Scale bits: 0 (shifted: 0)

BP register BPMEM_IND_MTXC Matrix 0
Matrix 0 column C
Row 0 (me): 0 (0)
Row 1 (mf): 0 (0)
Scale bits: 1 (shifted: 16), given to SDK as 1 (16)
TEV configuration (0005c7a9)
BP register BPMEM_TREF number 0
Stage 0 texmap: 0
Stage 0 tex coord: 0
Stage 0 enable texmap: Yes
Stage 0 color channel: Zero (7)
Stage 1 texmap: 0
Stage 1 tex coord: 0
Stage 1 enable texmap: Yes
Stage 1 color channel: Color chan 1 (1)

-Duplicate half-configured configuration skipped-

BP register BPMEM_TEV_COLOR_ENV Tev stage 0
dest.rgb = tex.rgb

a: ZERO (15)
b: ZERO (15)
c: ZERO (15)
d: tex.rgb (8)
Bias: 0 (0)
Op: Add (0) / Comparison: Greater than (0)
Clamp: Yes
Scale factor: 1 (0) / Compare mode: R8 (0)
Dest: prev (0)

BP register BPMEM_TEV_ALPHA_ENV Tev stage 0
dest.a = 0

a: ZERO (7)
b: ZERO (7)
c: ZERO (7)
d: ZERO (7)
Bias: 0 (0)
Op: Add (0) / Comparison: Greater than (0)
Clamp: Yes
Scale factor: 1 (0) / Compare mode: R8 (0)
Dest: prev (0)
Ras sel: 0
Tex sel: 0

BP register BPMEM_IND_CMD command 0
Indirect tex stage ID: 0
Format: ITF_8 (0)
Bias: None (0)
Bump alpha: Off (0)
Offset matrix index: Matrix 0 (1)
Offset matrix ID: Indirect (0)
Regular coord S wrapping factor: Off (0)
Regular coord T wrapping factor: Off (0)
Use modified texture coordinates for LOD computation: No
Add texture coordinates from previous TEV stage: No

BP register BPMEM_IREF
Stage 0 ntexmap: 1
Stage 0 ntexcoord: 0
Stage 1 ntexmap: 0
Stage 1 ntexcoord: 0
Stage 2 ntexmap: 0
Stage 2 ntexcoord: 0
Stage 3 ntexmap: 0
Stage 3 ntexcoord: 0

BP register BPMEM_RAS1_SS0
Indirect texture stages 0 and 1:
Even stage S scale: 1 (0.5)
Even stage T scale: 1 (0.5)
Odd stage S scale: 0 (1)
Odd stage T scale: 0 (1)
Texture 0 configuration (0005c7d6)

Texture 0 points to EFB copy 3.

BP register BPMEM_TX_SETMODE0 Texture Unit 0
Wrap S: Clamp (0)
Wrap T: Clamp (0)
Mag filter: Linear (1)
Mipmap filter: None (0)
Min filter: Linear (1)
LOD type: Diagonal LOD (1)
LOD bias: 0 (0)
Max anisotropic filtering: 1 (0)
LOD/bias clamp: No

BP register BPMEM_TX_SETMODE1 Texture Unit 0
Min LOD: 0 (0)
Max LOD: 0 (0)

BP register BPMEM_TX_SETIMAGE0 Texture Unit 0
Width: 640
Height: 480
Format: RGBA8 (6)

BP register BPMEM_TX_SETIMAGE1 Texture Unit 0
Even TMEM Offset: 0
Even TMEM Width: 3
Even TMEM Height: 3
Cache is manually managed: No

BP register BPMEM_TX_SETIMAGE2 Texture Unit 0
Odd TMEM Offset: 4000
Odd TMEM Width: 3
Odd TMEM Height: 3

BP register BPMEM_TX_SETIMAGE3 Texture Unit 0
Source address (32 byte aligned): 0xA44A20
Texture 1 configuration (0005c7f4)

Texture 1 points to EFB copy 4.

BP register BPMEM_TX_SETMODE0 Texture Unit 1
Wrap S: Clamp (0)
Wrap T: Clamp (0)
Mag filter: Linear (1)
Mipmap filter: None (0)
Min filter: Linear (1)
LOD type: Diagonal LOD (1)
LOD bias: 0 (0)
Max anisotropic filtering: 1 (0)
LOD/bias clamp: No

BP register BPMEM_TX_SETMODE1 Texture Unit 1
Min LOD: 0 (0)
Max LOD: 0 (0)

BP register BPMEM_TX_SETIMAGE0 Texture Unit 1
Width: 320
Height: 240
Format: IA8 (3)

BP register BPMEM_TX_SETIMAGE1 Texture Unit 1
Even TMEM Offset: 800
Even TMEM Width: 3
Even TMEM Height: 3
Cache is manually managed: No

BP register BPMEM_TX_SETIMAGE2 Texture Unit 1
Odd TMEM Offset: c00
Odd TMEM Width: 3
Odd TMEM Height: 3

BP register BPMEM_TX_SETIMAGE3 Texture Unit 1
Source address (32 byte aligned): 0xB71520
Texture coordinate scale configuration (0005c870)

This is configured twice; only the second one actually applies. (The configuration is based on a texture size, though actual hardware doesn't care about the texture corresponding to the coordinate.)

BP register BPMEM_SU_SSIZE number 0
S size info:
Scale: 320
Range bias: No
Cylindric wrap: No
Use line offset: No (s only)
Use point offset: No (s only)

BP register BPMEM_SU_TSIZE number 0
T size info:
Scale: 240
Range bias: No
Cylindric wrap: No
Use line offset: No (s only)
Use point offset: No (s only)

BP register BPMEM_SU_SSIZE number 0
S size info:
Scale: 640
Range bias: No
Cylindric wrap: No
Use line offset: No (s only)
Use point offset: No (s only)

BP register BPMEM_SU_TSIZE number 0
T size info:
Scale: 480
Range bias: No
Cylindric wrap: No
Use line offset: No (s only)
Use point offset: No (s only)
Genmode (0005c884)
BP register BPMEM_GENMODE
Num tex gens: 1
Num color channels: 0
Unused bit: 0
Flat shading (unconfirmed): No
Multisampling: No
Num TEV stages: 1
Cull mode: Back-facing primitives only (1)
Num indirect stages: 1
ZFreeze: No
Primitive data (0005c8fb)

This draws a rectangle covering the entire screen (from (0, 0) to (640, 480)) with a single texture coordinate ranging from (0, 0) to (1, 1).

Primitive GX_DRAW_TRIANGLE_STRIP (3) VAT 1

00000000 (0) 00000000 (0)  00000000 (0) 00000000 (0)  
44200000 (640) 00000000 (0)  3f800000 (1) 00000000 (0)  
00000000 (0) 43f00000 (480)  00000000 (0) 3f800000 (1)  
44200000 (640) 43f00000 (480)  3f800000 (1) 3f800000 (1)  

Pokémon Battle Revolution uses an RGB8 EFB (meaning each framebuffer pixel has 8 bits of red, green, and blue data, and no alpha data is stored at all). Object 68, offset 0002c85a (not shown below) sets the pixel format in this case, though the same value is used in several places before and after. The first relevant EFB copy is EFB copy 3, right before object 423; the second relevant one is EFB copy 4, right before object 440. Object 440 then combines the two, applying the indirect effect.

EFB copy 3 uses the RGBA8 format, and captures the whole screen as a 640 by 480 texture. RGBA8 indicates that there are 8 bits of data each for red, green, blue, and alpha color channels. Note that since this format includes an alpha channel, but the EFB does not, some default value needs to be used for it. This value is 1.0 (or 255).

EFB copy 4 uses the IA8 format, an intensity format. (The EFB copy's target pixel format is 3, which Dolphin labels as "RA8/IA8 (Z16 too?)"; it also has intensity format set to true.) This EFB copy captures the whole screen as a 320 by 240 texture, because it has half-scale enabled. The scaling down by 2 is presumably only to reduce memory usage; it doesn't seem to be necessary for rendering. IA8 indicates that the texture has an intensity channel and an alpha channel. This means that the system needs to map the EFB's red, green, and blue channels into a single channel, which is a somewhat complicated process I won't go into here. When rendering, the intensity channel is used as the value for the red, green, and blue channels (though that doesn't apply in this case, as EFB copy 4 isn't rendered directly to the screen). As with before, there's also an alpha channel, which needs to have some default value.

The game then draws over the entire screen using a rectangle where the texture coordinate ranges from (0, 0) at the top-left and (1, 1) at the bottom-right. This texture coordinate is used for two textures at the same time: the one corresponding to EFB copy 3, and the one corresponding to EFB copy 4. The texture coordinate is multiplied by the corresponding size info, causing it to range from (0, 0) to (640, 480). It then is used to sample the texture for EFB copy 4, as part of the indirect stage; however, it is scaled down by a factor of 2 for this (as set in BPMEM_RAS1_SS0), which is needed because EFB copy 4 is half-size (thus, the scaled texture coordinate ranges from (0, 0) to (320, 240), which is what is needed for EFB copy 4).

EFB copy 4 is used as an indirect texture, meaning the color read from it is multiplied with the indirect matrix, and then that product is added back to the original texture coordinate (not the one that has been scaled by a factor of 2). The specific indirect matrix used has the same values for both rows, meaning that both the s and t texture coordinates are offset by the same amount (effectively resulting in brighter values in EFB copy 4 resulting in an pixels using texture coordinates that are further diagonally down-right, resulting in the overall image moving up-left). The modified texture coordinate is then used to sample the texture for EFB copy 3.

A somewhat trivial TEV configuration is used where the output color is simply set to the sampled color from the texture for EFB copy 3, and the output alpha value is set to 0. Since the EFB has no alpha channel (and the alpha test is configured to always pass), the value for alpha here does not matter.

A somewhat simplified version of what the game is doing (in pseudo-glsl-that-probably-has-compile-errors™) is this:

uniform sampler2D base_texture;  // EFB copy 3
uniform sampler2D indirect_texture;  // EFB copy 4
uniform mat3x2 ind_mtx;

varying vec2 texture_coordinate;  // from (0, 0) to (1, 1)

void main(void) {
  texture_coordinate.u *= 640;
  texture_coordinate.v *= 480;

  vec2 indirect_sample_coord = texture_coordinate;
  indirect_sample_coord.u /= 2;
  indirect_sample_coord.v /= 2;

  vec4 color = texture(indirect_texture, indirect_sample_coord);
  // color.r == color.g == color.b; all are set to the intensity channel of the indirect texture

  texture_coordinate.u += ind_mtx[0][0] * color.a + ind_mtx[1][0] * color.r + ind_mtx[2][0] * color.g;
  texture_coordinate.v += ind_mtx[0][1] * color.a + ind_mtx[1][1] * color.r + ind_mtx[2][1] * color.g;
  // i.e. texture_coordinate += ind_mtx * color.arg;

  gl_FragColor = vec4(texture(base_texture, texture_coordinate).rgb, 0);
}

The bug

The indirect texture matrix is s = t = -0.036132812 * a + 0.58203125 * r + 0 * g, where a is the alpha channel and r/g/b are all the value of the intensity channel (b is not available to the indirect matrix, however). In other It may seem odd that alpha appears in this equation, given that the EFB does not have an alpha channel. Indeed, that's the source of the problem: Dolphin was using an alpha value of 0 in this case, when it should have been using an alpha value of 255 (it already used an alpha value of 255 for non-intensity formats, but was using 0 for intensity formats).

But why is the alpha channel needed? Well, what they're actually trying to do is add a constant offset, and the only way to do that is to use a constant value. Specifically, since alpha is 255, the offset is -9.21386706. But why would that offset be wanted? It seems rather arbitrary...

The answer is that intensity formats use the BT.601 standard, and that standard doesn't use a value of 0 for black and 255 for white. Instead, black is 16, and white is 235. (This can be seen in the indirect images.) So, if black is not supposed to perform any offsetting, that 16 needs to be converted back to zero. It just so happens that -0.036132812 * 255 + 0.58203125 * 16 = -9.21386706 + 9.3125 = 0.09863294, or about 1/10th of a pixel in offset, which is the best that is achievable with the values you can use in the indirect matrix. Similarly, (235 - 16)*0.58203125 is 127.46484375, the closest value to 127.5 = 255/2 that can be achieved. So, these strange-looking values are trying to convert to the range of [0, 127.5]. But if the alpha channel has the wrong value, it instead ends up being a range of approximately [-9.2, 118.3], meaning pixels that were not supposed to be shifted at all were instead shifted by 9 pixels both horizontally and vertically.

There is still a slight shift (of 0.09863294 pixels) left over, but that shift would happen on real hardware too, and there's no practical way to eliminate it. But, given that the camera usually moves by more than that each frame, I don't think that shift is noticeable in practice, even at higher IRs where 1/10th of a pixel becomes 1 pixel.

As a side note, the original shift issue results in a black border. Why is it black? Well, that's actually a separate issue, caused by a game bug. See the next document for more information...

In addition to the shifting issue, Pokémon Battle Revolution has a black border problem. This happens to be the most obvious with effects that caused shifting, but it occurs in all situations (including the wrist strap screen when starting up the game).

This document looks at both the psycho cut fifolog and a rain fifolog, both of which are on issue 12629. However, specific object numbers aren't relevant for this issue.

The problem

Specifically, the rightmost column and bottommost row of pixels are all black due to a poorly-configured scissor rectangle. The game has the scissor top-left at (342, 342), bottom-right at (980, 820), scissor offset of (342, 342), and the viewport covers (342, 342) to (982, 822). Without going into details on the calculation, this means that the viewport covers pixels from (0, 0) to (639, 479) inclusive, while the scissor covers pixels from (0, 0) to (638, 478) inclusive. This means that pixels with x=639 or y=479 never get drawn, and remain black. (Note that this is actually the second scissor the game uses; when it draws shadows, it uses a smaller scissor and viewport which is not affected by the bug.)

When the texture is drawn, an indirect texture is used to adjust texture coordinates (the details of which vary depending on the effect the game is rendering, and I go into this more in the other PBR document). The base texture, which is an earlier EFB copy, has clamping enabled. Thus, if the indirect effect causes texture coordinates to go outside of the texture, they end up inside of the texture (instead of being wrapped): x < 0 becomes x = 0, x > 639 becomes 639, y < 0 becomes 0, y > 479 becomes 479. You may see the problem already: x = 639 and y = 479 are black due to the scissor, and thus everything beyond them is also black, meaning the indirect effects can bring in large black regions. (For instance, if x is already 630, and the indirect texture would result in adding 15 to x, then the result is blackness.)

The fix is to adjust the scissor rectangle so that it actually covers the whole screen (by adding 1 to the bottom and right coordinates in this case). Then, the clamping works properly: x = 639 and y = 479 both look similar to their neigboring pixels, instead of being completely black. Note that this doesn't magically add additional detail when the indirect texture results in going beyond the edge of the texture; this is easiest to see on the right-hand side of the rain output image where the checkerboard does not continue within the rain effect (but it's a good enough approximation that it doesn't look wrong at first glance either, which is way better than black). Note also that x = 0 and y = 0 already worked properly with clamping (however, the effects in Pokémon Battle Revolution mainly shift down/left and not up/right, so this doesn't mean much in practice).

The code

Here is the relevant code, as generated by Ghidra with some naming but no cleanup (this code is located at 80244a4c in the NTSC-U version of the game):

void GSrender_SetScissor(GSrender *render,uint x,uint y,uint width,uint height)
{
  short screenHeight;
  ushort screenWidth;
  uint screenSize;

  render->scissorX = (short)x;
  render->scissorY = (short)y;
  render->scissorWidth = (short)width;
  render->scissorHeight = (short)height;
  if ((undefined *)register0x00000004 != &DAT_00000006) {
    screenWidth = render->screenWidth;
  }
  if ((undefined *)register0x00000004 != &DAT_00000008) {
    screenHeight = render->screenHeight;
  }
  screenSize = (uint)(ushort)(screenWidth - 1);
  if (screenSize < x) {
    x = screenSize;
  }
  if ((int)screenSize < (int)((x & 0xffff) + width)) {
    width = screenSize - x & 0xffff;
  }
  screenSize = (uint)(ushort)(screenHeight - 1);
  if (screenSize < y) {
    y = screenSize;
  }
  if ((int)screenSize < (int)((y & 0xffff) + height)) {
    height = screenSize - y & 0xffff;
  }
  GXSetScissor(x & 0xffff,y & 0xffff,width,height);
  return;
}

The register0x00000004 stuff is related to an addic. instruction (add immediate with carry, and then update the status register) on the stack pointer, and that screenWidth/scissorHeight are not set if the result of the add is 0. However, there is no valid reason for the stack pointer to be near 0, so that check really shouldn't exist (and this same function will break in other places if the stack pointer is invalid). My best guess is that this is a case of the compiler being overly paranoid or just buggy. The name itself is a bit odd, but note that register0x00000004 could also be labeled as unaff_r1; see this Ghidra bug report.

In practice, I think the code is actually something like this:

// Calling code already masked x/y/width/height to a 16-bit value, but using ushort in
// Ghidra resulted in bad results because it doesn't know that the top 16 bits are all
// zero since that masking happened outside of this function.  Thus I used uint in Ghidra,
// but u16 here.
void GSrender_SetScissor(GSrender *render, u16 x, u16 y, u16 width, u16 height)
{
  render->scissorX = x;
  render->scissorY = y;
  render->scissorWidth = width;
  render->scissorHeight = height;
  u16 maxWidth = render->screenWidth - 1;
  if (x > maxWidth) {
    x = maxWidth;
  }
  // n.b. I believe the (int) casts appeared in the original because the compiler assumed
  // x + width will not overflow, and thus doesn't truncate x + width to a 16-bit value.
  // This would be legal because unsigned int overflow is undefined behavior in C/C++.
  // It's also preferable behavior because the code's intention isn't to allow extremely
  // large widths (e.g. x = 0x200, width = 0xff00 would result in x + width = 0x10100
  // which truncates to 0x100, which is less than maxWidth, but we'd want width to be
  // clamped to maxWidth - 0x200 in that case).  In any case, this isn't important,
  // as the game doesn't actually use values that big, and by adding this giant comment
  // I may have undone the clarity of removing those casts...
  if (x + width > maxWidth) {
    width = maxWidth - x;
  }
  u16 maxHeight = render->screenHeight - 1;
  if (y > maxHeight) {
    y = maxHeight;
  }
  if (y + height > maxHeight) {
    height = screenSize - y;
  }
  GXSetScissor(x, y, width, height);
}

For additional context, screenWidth and screenHeight are set at 8024439c. That function is called in the constructor for GSvideo at 80243b10, which is called by the constructor for GSrender (at 802359d0); note that the vtable used by this constructor (which only contains a virtual destructor, I think?) is what gives the names GSvideo and GSrender. However, this function is rather hard to understand at first glance. It iterates through an array of structs composed of an ID followed by 4 pointers, where that array is located at 80413bf0; when it finds a struct with a matching ID, it copies data from the appropriate pointer using a memcpy call at 80244560). The easiest thing to compare it to is libogc's VIDEO_GetPreferredMode but it's not an exact match. On my machine, it ends up copying data from 80424438, which results in a screenWidth of 640 and a screenHeight of 480.

The one other thing to note is that although the scissor registers are composed of a top-left coordinate and a bottom-right coordinate, and those coordinates are inclusive, GXSetScissor takes a top-left coordinate and an exclusive width and height. Libogc's GX_SetScissor (doc) is basically the same function. So, GXSetScissor(0, 0, 640, 480) would configure the scissor for the whole screen.

Now, GSrender_SetScissor(render, 0, 0, 640, 480) should do the exact same thing, shouldn't it? Unfortunately, that's not what happens. maxWidth = renderer->screenWidth - 1 = 639 and maxHeight = renderer->screenHeight - 1 = 479. x <= maxWidth and y <= maxHeight, so x and y remain 0. But x + width = 0 + width = 640 is greater than maxWidth = 639, so width becomes maxWidth - x = maxWidth - 0 = 639. The same thing happens to height; it becomes maxHeight - y = 479. Thus, it ends up calling GXSetScissor(0, 0, 639, 479), and one pixel is no longer included.

The easiest fix is to just stop subtracting 1 from screenWidth and screenHeight. Then, the GSrender_SetScissor(render, 0, 0, 640, 480) call results in x + width <= maxWidth (as width == maxWidth), and the same for height, so GXSetScissor(0, 0, 640, 480) is called.

But now, GSrender_SetScissor(render, 1000, 1000, 4, 4) would result in a call of GXSetScissor(640, 480, 0, 0) - is that a problem? No, as before that same call would have resulted in GXSetScissor(639, 479, 0, 0) which affects on-screen pixels even though the scissor is off-screen. An exclusive width of 0 also seems like it would cause issues, but in practice, if the top-left coordinate is past the bottom-right coordinate, nothing is rendered (this was checked on real hardware). Theoretically, none of this clamping is needed at all, as the scissor functionality works mostly fine with out-of-bounds ranges (as long as the ranges are close enough to in bounds that nothing overflows), but there's no harm in having it.

The patch

As determined above, the fix is simply to get rid of the code that subtracts 1 from screenWidth/screenHeight. This is pretty easy: we just need to replace two instructions:

80244a94: 3908ffff subi r8, r8, 1
80244a9c: 3803ffff subi r0, r3, 1

The first one could be replaced with any kind of NOP, but the second one also moves from r3 to r0 in addition to subtracting 1. I think the easiest fix is to change this from a subi instruction (which subtracts an immediate value (i.e. a value encoded in the instruction, not a value stored in a register) from the value in one register and stores that result in another register) with a addi instruction where the immediate value is set to 0. This is actually trivial, as subi is just a special case of addi (both of them simply take a 16-bit sign-extended immediate value; if that value has the top bit set, it's negative, and the instruction shows up as subi). So we can simply replace the ffff with 0000, to get the following:

80244a94: 39080000 addi r8, r8, 0
80244a9c: 38030000 addi r0, r3, 0

This gives the following Dolphin patch:

[OnFrame]
$Fix bottom/right pixels being black and black screen effects
0x80244A94:dword:0x39080000
0x80244A9C:dword:0x38030000

which can also be implemented by a Gecko code like this (see code type documentation):

[Gecko]
$Fix bottom/right pixels being black and black screen effects [Pokechu22]
04244A94 39080000
04244A9C 38030000
*See https://gist.github.com/Pokechu22/e9fa9037fe21312ff32475638b78ba27

It's worth noting that this code fixes the issue on real hardware — again, this isn't a Dolphin-specific problem, but a game bug.

Patching other versions of the game

I only own a USA copy of Pokémon Battle Revolution, so I only have a patch for it. Furthermore, the code in various versions of the game will almost certainly be at slightly different addresses, meaning the same patch won't work on all versions of the game. However, it should be possible to generate patches for other versions without as much trouble by following these steps:

  1. Enable Dolphin's debug UI.
  2. Select Options → Boot to Pause.
  3. Launch Pokémon Battle Revolution.
  4. Select Symbols → Generate Symbols From → Signature Database.
  5. Select View → Code.
  6. In "filter symbols", enter GXSetScissor.
  7. Select GXSetScissor from the symbols list.
  8. In the function callers list, there should be 2 entries; one will be __GXInitGX, and the other will be something like zz_80244a4c. Select the one that is not __GXInitGX.
  9. Scroll up to the start of the function (different functions are indicated by different background colors).
  10. Start scrolling down until you see subi r8, r8, 1. There should be a subi r0, r3, 1 a few instructions below too.
  11. For each instruction, right-click and copy the address, and then also copy the hex. The hex should end in ffff; changing it to 0000 will solve the issue. A 32-bit memory patch is fine for this.

Note about dolphin

Note that when generating images, I used a modified version of Dolphin so that I could use the same fifolog for both images. Here is the change I made:

diff --git a/Source/Core/VideoCommon/BPStructs.cpp b/Source/Core/VideoCommon/BPStructs.cpp
index 0fc4ca6785..b0b3940398 100644
--- a/Source/Core/VideoCommon/BPStructs.cpp
+++ b/Source/Core/VideoCommon/BPStructs.cpp
@@ -128,11 +128,13 @@ static void BPWritten(const BPCmd& bp, int cycles_into_future)
   // ----------------
   // Scissor Control
   // ----------------
   case BPMEM_SCISSORTL:      // Scissor Rectable Top, Left
   case BPMEM_SCISSORBR:      // Scissor Rectable Bottom, Right
   case BPMEM_SCISSOROFFSET:  // Scissor Offset
+    bpmem.scissorBR.x = bpmem.scissorBR.x | 1;
+    bpmem.scissorBR.y = bpmem.scissorBR.y | 1;
     SetScissor();
     SetViewport();
     VertexShaderManager::SetViewportChanged();
     GeometryShaderManager::SetViewportChanged();
     return;

This also fixes the issue, but it's inaccurate for general emulation. However, the software renderer currently works that way due to an implementation mistake (it applied the scissor to 2 by 2 blocks of pixels, rather than to individual pixels). My scissor pull request fixes that. When that issue with the software renderer is fixed, it too will produce the black pixels (unless the patch is applied). Note that there are games where this are needed (James Bond - Everything or Nothing can have shadows break, and Samurai Warriors 3 uses it to produce a vignette effect, for instance).

@Dentomologist
Copy link

This is an excellent writeup; Kudos for the effort you obviously put into this.

If I've understood your explanation correctly I think there's a small typo:

Without going into details on the calculation, this means that the viewport covers pixels from (0, 0) to (639, 479) inclusive, while the scissor covers pixels from (0, 0) to (638, 479) inclusive.

I believe the scissor goes up to (638, 478) instead of 479.

@Pokechu22
Copy link
Author

Fixed, thanks. If you find any other typos lurking about (there likely are more), do let me know.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment