@nibrunie
Created February 8, 2025 16:48
Test program to list inputs for which vfrec7.v approximation is inexact in BFloat16
/** Determine when the RVV 1.0 vfrec7.v SEW=32 result cannot be converted
 * exactly to a BF16 value. */
#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>
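/* View a binary32 value as its raw bit pattern. */
typedef union {
    float f;
    uint32_t u;
} fu_t;

int main(void) {
    /* i sweeps bits 16..23 of the encoding: the 7 upper mantissa bits
     * across biased exponents 253 and 254, i.e. inputs in [2^126, 2^128). */
    for (int i = 0; i < 256; ++i) {
        /* Placeholder values; both are overwritten by the asm block. */
        uint32_t input = 42, fdst = 17;
        __asm__ volatile(
            "slli a0, %[i], 16\n"                 /* a0 = i << 16 */
            "li a1, 0x7e800000\n"                 /* a1 = bits of 2^126 */
            "add %[input], a0, a1\n"              /* input = 0x7e800000 + (i << 16) */
            "vsetivli zero, 1, e32, m1, ta, ma\n" /* VL=1, SEW=32 */
            "vmv.v.x v0, %[input]\n"
            "vfrec7.v v0, v0\n"                   /* 7-bit reciprocal estimate */
            "vmv.x.s %[fdst], v0"
            : [input]"=r"(input), [fdst]"=r"(fdst): [i]"r"(i)
            : "a0", "a1", "v0", "memory"
        );
        fu_t f;
        f.u = input;
        /* The last field is 1 when the low 16 bits of the result are non-zero,
         * i.e. when the result is not exactly representable in BF16. */
        printf("%d %a 1/%"PRIx32" = %"PRIx32" BF16 underflow %d\n", i, (double) f.f, input, fdst, (fdst & 0xffff) != 0);
    }
    return 0;
}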