@RealNeGate
Last active December 13, 2023 02:38
unfinished :p

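SSE instructions are a mandatory prefix (or none), the 0F escape byte, then the opcode and operands. the suffix picks the prefix (OP refers to the opcodes in the table below):

  ___ss 32 x 1  F3 0F OP ...   scalar float
  ___sd 64 x 1  F2 0F OP ...   scalar double
  ___ps 32 x 4     0F OP ...   packed floats
  ___pd 64 x 2  66 0F OP ...   packed doubles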

simple example:

    addss xmm0, xmm1     F3 0F 58 C1
    addsd xmm0, xmm1     F2 0F 58 C1
    addps xmm0, xmm1        0F 58 C1
    addpd xmm0, xmm1     66 0F 58 C1
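
a minimal sketch of how the prefix table and the example bytes above fit together, written in C. the names `SseType` and `sse_emit_rr` are made up for illustration, and it assumes both operands are xmm0..xmm7 so no REX prefix is needed:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical names for the four suffixes in the table above.
typedef enum { SSE_SS, SSE_SD, SSE_PS, SSE_PD } SseType;

// Encodes `op dst, src` with both operands in xmm0..xmm7 (register-direct).
// Writes 3 or 4 bytes to out and returns how many were written.
size_t sse_emit_rr(uint8_t *out, SseType type, uint8_t op, int dst, int src) {
    assert(dst >= 0 && dst < 8 && src >= 0 && src < 8);
    size_t n = 0;
    switch (type) {
        case SSE_SS: out[n++] = 0xF3; break;
        case SSE_SD: out[n++] = 0xF2; break;
        case SSE_PS: break; // packed single has no prefix
        case SSE_PD: out[n++] = 0x66; break;
    }
    out[n++] = 0x0F; // escape byte
    out[n++] = op;   // e.g. 0x58 for add
    // ModRM: mod=11 (register direct), reg=dst, rm=src
    out[n++] = (uint8_t)(0xC0 | (dst << 3) | src);
    return n;
}
```

calling `sse_emit_rr(buf, SSE_SS, 0x58, 0, 1)` should give the same four bytes as the addss line above. xmm8 and up would need a REX prefix placed between the mandatory prefix and the 0F, which this sketch leaves out.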

Opcode list:

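  mov   0x10    (0x11 to store, see Exceptions)
  add   0x58
  mul   0x59
  sub   0x5C
  div   0x5E
  cmp   0xC2    (followed by an imm8 comparison predicate)
  ucomi 0x2E    (ss/sd only, and the prefixes differ: ucomiss has none, ucomisd uses 66)
  cvt   0x5A    (float <-> double: cvtss2sd, cvtsd2ss, cvtps2pd, cvtpd2ps)
  sqrt  0x51
  rsqrt 0x52    (ss/ps only)
  and   0x54    (ps/pd only)
  or    0x56    (ps/pd only)
  xor   0x57    (ps/pd only)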

Exceptions:

mov here doesn't refer to the standard integer mov on x86. it means movss, movsd, movups and movupd (the packed ones being the unaligned variants), and the bottom bit of the opcode acts as a "direction flag" that flips which side has the memory operand: 0x10 loads into the xmm register, 0x11 stores from it:

direction bit:

            you'll notice that the 1s bit is set for storing
                                 VV
    movups [rsp + 8], xmm0    0F 11 44 24 08
    movups xmm0, [rsp + 8]    0F 10 44 24 08
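
here's a rough C sketch of the direction bit in an encoder. `movups_emit_rsp` is a hypothetical helper, and it only handles the `[rsp + disp8]` addressing form with xmm0..xmm7, which is an assumption made to keep it short:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

// Encodes `movups xmm, [rsp + disp]` (load) or `movups [rsp + disp], xmm` (store).
// Always writes 5 bytes to out and returns that count.
size_t movups_emit_rsp(uint8_t *out, bool is_store, int xmm, int8_t disp) {
    assert(xmm >= 0 && xmm < 8);
    size_t n = 0;
    out[n++] = 0x0F;                                 // escape byte, no prefix for ps
    out[n++] = (uint8_t)(0x10 | (is_store ? 1 : 0)); // low bit = direction
    // ModRM: mod=01 (8-bit displacement), reg=xmm, rm=100 (SIB byte follows)
    out[n++] = (uint8_t)(0x40 | (xmm << 3) | 0x04);
    out[n++] = 0x24;                                 // SIB: no index, base=rsp
    out[n++] = (uint8_t)disp;                        // displacement, 0x08 for rsp + 8
    return n;
}
```

the only byte that changes between the load and the store is the opcode. the ModRM, SIB and displacement bytes stay the same because the xmm register always sits in the reg field and the memory operand always sits in rm.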

Oddities:

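bitwise operations look redundant here because the type info doesn't change the result: andps and andpd compute exactly the same 128-bit AND (same goes for or and xor). they're still separate encodings though, andps is 0F 54 and andpd is 66 0F 54, so the ps form is one byte shorter. the type in the name at most acts as a hint about what kind of data the register holds, and whether a CPU does anything with that hint depends on the microarchitecture.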
