@minoki
Last active February 20, 2022 05:27
IEEE 754 conformance of GHC primitives

Literal

-0.0## under NegativeLiterals+MagicHash yields positive 0.0##.
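A minimal sketch for checking this on a particular GHC version. The names `literalZero` and `negatedZero` are made up for illustration. The result for the literal is deliberately not stated here, since it depends on the compiler version in use.

```haskell
{-# LANGUAGE MagicHash #-}
{-# LANGUAGE NegativeLiterals #-}

import GHC.Exts (Double (D#), negateDouble#)

-- Negative zero written directly as an unboxed literal.
literalZero :: Double
literalZero = D# (-0.0##)

-- Negative zero obtained by negating positive zero with the primop.
negatedZero :: Double
negatedZero = D# (negateDouble# 0.0##)

main :: IO ()
main = do
  putStrLn ("literal -0.0## is negative zero:      " ++ show (isNegativeZero literalZero))
  putStrLn ("negateDouble# 0.0## is negative zero: " ++ show (isNegativeZero negatedZero))
```

If the first line reports `False` while the second reports `True`, the sign of the literal was lost.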

Optimization

Constant folding

Constant folding is fragile with respect to negative zero, infinity, and NaN, because LitFloat/LitDouble cannot represent them. Cf. #9811, #18897.

For example, 0.0 * (-1.0) optimizes to 0.0 rather than -0.0.

Similarly, 0x1p512 * 0x1p1023 / 0x1p512 optimizes to a finite value (0x1p1023 if the intermediate product is kept exact) rather than infinity.
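One way to observe the difference is to compare an expression written with literals against the same expression whose operands are hidden from the optimizer. The sketch below assumes that `NOINLINE` bindings are enough to keep GHC from seeing the literal values. All names are hypothetical, and the results of the literal versions are not stated because they depend on the GHC version and optimization flags.

```haskell
{-# LANGUAGE HexFloatLiterals #-}

-- Operands hidden behind NOINLINE, so the arithmetic on them
-- has to happen at run time.
{-# NOINLINE zero #-}
zero :: Double
zero = 0.0

{-# NOINLINE minusOne #-}
minusOne :: Double
minusOne = -1.0

{-# NOINLINE big #-}
big :: Double
big = 0x1p512

{-# NOINLINE huge #-}
huge :: Double
huge = 0x1p1023

-- Literal versions: candidates for compile-time folding under -O.
literalProduct :: Double
literalProduct = 0.0 * (-1.0)

literalQuotient :: Double
literalQuotient = 0x1p512 * 0x1p1023 / 0x1p512

-- Run-time versions: evaluated by the floating-point unit.
runtimeProduct :: Double
runtimeProduct = zero * minusOne

runtimeQuotient :: Double
runtimeQuotient = big * huge / big

main :: IO ()
main = do
  putStrLn ("literal  0.0 * (-1.0):           " ++ show literalProduct)
  putStrLn ("run-time 0.0 * (-1.0):           " ++ show runtimeProduct)
  putStrLn ("literal  2^512 * 2^1023 / 2^512: " ++ show literalQuotient)
  putStrLn ("run-time 2^512 * 2^1023 / 2^512: " ++ show runtimeQuotient)
```

Compile with `-O` and compare each pair of lines. On IEEE 754 hardware the run-time versions yield negative zero and infinity respectively.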

Other rules

  • 0.0 + x and x + 0.0 → x: problematic when x is -0.0 or a signaling NaN.
  • x - 0.0 → x: problematic only for a signaling NaN.
  • 1.0 * x and x * 1.0 → x: problematic only for a signaling NaN.
  • 2.0 * x and x * 2.0 → x + x: okay.
  • x / 1.0 → x: problematic only for a signaling NaN.
  • negate (negate x) → x: okay.

In my opinion, a change in behavior is acceptable if the only affected value is a signaling NaN. So the 0.0 + x rule should be fixed, but the other rules can be left as they are. In that case, it should be documented that GHC assumes operands are not signaling NaNs when it optimizes floating-point operations.
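The negative-zero problem with the addition rule can be made visible with a small test. In this sketch, `hardwareSum` adds two operands that are both hidden behind `NOINLINE`, so no rewrite rule can match, while `literalSum` keeps the literal `0.0` visible and may be rewritten under `-O`. The names are hypothetical, and the result of `literalSum` is intentionally not stated.

```haskell
-- Both zeros are opaque to the optimizer.
{-# NOINLINE posZero #-}
posZero :: Double
posZero = 0.0

{-# NOINLINE negZero #-}
negZero :: Double
negZero = -0.0

-- Neither operand is a visible literal, so this is a real addition.
hardwareSum :: Double
hardwareSum = posZero + negZero

-- The literal 0.0 is visible, so a "0.0 + x → x" rewrite may apply.
literalSum :: Double
literalSum = 0.0 + negZero

main :: IO ()
main = do
  putStrLn ("hardware 0.0 + (-0.0) is negative zero: " ++ show (isNegativeZero hardwareSum))
  putStrLn ("literal  0.0 + (-0.0) is negative zero: " ++ show (isNegativeZero literalSum))
```

Under IEEE 754 round-to-nearest, 0.0 + (-0.0) is positive zero, so a `True` on the second line indicates that the sum was replaced by its operand.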

Code generation

I think the native code generators (NCGs) should generate IEEE-compliant code for the basic operations (+, -, *, /), now that x87 support has been dropped.

abs does not behave well with NaNs under the via-C backend (#21043).

For negate, the LLVM backend generates -0.0-x, which LLVM optimizes to fneg (fneg is a relatively new instruction, available in LLVM 8 or later).
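The choice of -0.0 rather than 0.0 as the minuend matters for zeros. The following sketch compares the two spellings on a few inputs. The function names are hypothetical, and the zeros are hidden behind `NOINLINE` on the assumption that this keeps the subtraction from being simplified at compile time.

```haskell
{-# NOINLINE negZero #-}
negZero :: Double
negZero = -0.0

{-# NOINLINE posZero #-}
posZero :: Double
posZero = 0.0

-- Negation spelled as subtraction from negative zero.
negateViaNegZero :: Double -> Double
negateViaNegZero x = negZero - x

-- Negation spelled as subtraction from positive zero: loses the sign of zero.
negateViaPosZero :: Double -> Double
negateViaPosZero x = posZero - x

report :: Double -> IO ()
report x =
  putStrLn
    ( "x = " ++ show x
        ++ ": -0.0 - x = " ++ show (negateViaNegZero x)
        ++ ", 0.0 - x = " ++ show (negateViaPosZero x)
    )

main :: IO ()
main = mapM_ report [0.0, -0.0, 3.5]
```

For x = 0.0, only the -0.0 - x form produces negative zero. This sketch does not cover NaN inputs.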

{int,word}2{Double,Float} and double2Float use the current rounding mode (i.e. roundTiesToEven).

double2Int and float2Int truncate.
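The rounding and truncation behavior can be probed with values that sit exactly halfway between two representable numbers. This is a sketch assuming a 64-bit `Int` and the default rounding mode. The conversion functions come from `GHC.Float`; the other names are made up for illustration.

```haskell
import GHC.Float (double2Float, double2Int, int2Double)

-- 2^53 + 1 and 2^53 + 3 both lie halfway between two adjacent Doubles.
halfwayLow, halfwayHigh :: Int
halfwayLow = 2 ^ (53 :: Int) + 1
halfwayHigh = 2 ^ (53 :: Int) + 3

-- How far the Int -> Double conversion moved the value.
roundingOffset :: Int -> Int
roundingOffset n = double2Int (int2Double n) - n

-- 1 + 2^-24 lies halfway between two adjacent Floats (1 and 1 + 2^-23).
halfwayFloat :: Double
halfwayFloat = 1 + 2 ^^ (-24 :: Int)

main :: IO ()
main = do
  -- Ties go to the neighbour whose significand is even.
  putStrLn ("offset for 2^53 + 1: " ++ show (roundingOffset halfwayLow))
  putStrLn ("offset for 2^53 + 3: " ++ show (roundingOffset halfwayHigh))
  putStrLn ("double2Float (1 + 2^-24): " ++ show (double2Float halfwayFloat))
  -- Truncation rounds toward zero for both signs.
  putStrLn ("double2Int 2.7:    " ++ show (double2Int 2.7))
  putStrLn ("double2Int (-2.7): " ++ show (double2Int (-2.7)))
```

With ties-to-even rounding, the first offset is negative and the second is positive, because the even neighbour lies below 2^53 + 1 but above 2^53 + 3.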

float2Double on PPC does not seem to convert a signaling NaN to a quiet one, but that is a minor issue (GHC does not care about signaling NaNs anyway).

| Haskell | GHC prim (Haskell name) | GHC prim (internal name) | Cmm | x86 (SSE2) | PPC | AArch64 | LLVM | C |
|---|---|---|---|---|---|---|---|---|
| (+) | (+##), plusFloat# | DoubleAddOp, FloatAddOp | MO_F_Add | add{ss,sd} | fadd(s) | fadd | fadd | + |
| (-) | (-##), minusFloat# | DoubleSubOp, FloatSubOp | MO_F_Sub | sub{ss,sd} | fsub(s) | fsub | fsub | - |
| (*) | (*##), timesFloat# | DoubleMulOp, FloatMulOp | MO_F_Mul | mul{ss,sd} | fmul(s) | fmul | fmul | * |
| (/) | (/##), divideFloat# | DoubleDivOp, FloatDivOp | MO_F_Quot | div{ss,sd} | fdiv(s) | fdiv | fdiv | / |
| negate | negateDouble#, negateFloat# | DoubleNegOp, FloatNegOp | MO_F_Neg | bit xor | fneg | fneg | -0.0-x | - |
| abs | fabsDouble#, fabsFloat# | DoubleFabsOp, FloatFabsOp | MO_F64_Fabs, MO_F32_Fabs, or genericFabsOp | bit and | fabs | fabs | llvm.fabs.{f32,f64} | generic |
| sqrt | sqrtDouble#, sqrtFloat# | DoubleSqrtOp, FloatSqrtOp | MO_F64_Sqrt, MO_F32_Sqrt | sqrt{ss,sd} | libm sqrt(f) | libm sqrt(f) | llvm.sqrt.{f32,f64} | libm sqrt(f) |
| int2Double, int2Float | int2Double#, int2Float# | IntToDoubleOp, IntToFloatOp | MO_SF_Conv | cvtsi2{ss,sd} | fcfid (+ ... + frsp) on PPC64 | scvtf | sitofp | cast |
| word2Double, word2Float | word2Double#, word2Float# | WordToDoubleOp, WordToFloatOp | MO_UF_Conv | C hs_word2float{32,64} | C hs_word2float{32,64} | C hs_word2float{32,64} | uitofp | C hs_word2float{32,64} |
| double2Float | double2Float# | DoubleToFloatOp | MO_FF_Conv | cvtsd2ss | frsp | fcvt | fptrunc | cast |
| float2Double | float2Double# | FloatToDoubleOp | MO_FF_Conv | cvtss2sd | no-op | fcvt | fpext | cast |
| double2Int, float2Int | double2Int#, float2Int# | DoubleToIntOp, FloatToIntOp | MO_FS_Conv | cvtt{ss,sd}2si | fctiwz / fctidz | fcvtzs | fptosi | cast |
| (==) | (==##), eqFloat# | DoubleEqOp, FloatEqOp | MO_F_Eq | | | | | |
| (/=) | (/=##), neFloat# | DoubleNeOp, FloatNeOp | MO_F_Ne | | | | | |
| (<) | (<##), ltFloat# | DoubleLtOp, FloatLtOp | MO_F_Lt | | | | | |
| (<=) | (<=##), leFloat# | DoubleLeOp, FloatLeOp | MO_F_Le | | | | | |
| (>) | (>##), gtFloat# | DoubleGtOp, FloatGtOp | MO_F_Gt | | | | | |
| (>=) | (>=##), geFloat# | DoubleGeOp, FloatGeOp | MO_F_Ge | | | | | |
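For readers who want to exercise individual primitives from the second column, the following sketch wraps a few of them in ordinary boxed functions. The wrapper names (`addDouble`, `addFloat`, `eqDouble`) are hypothetical; the primops themselves are imported from `GHC.Exts`.

```haskell
{-# LANGUAGE MagicHash #-}

import GHC.Exts (Double (D#), Float (F#), isTrue#, plusFloat#, (+##), (==##))

-- (+) on Double, spelled with the (+##) primop.
addDouble :: Double -> Double -> Double
addDouble (D# x) (D# y) = D# (x +## y)

-- (+) on Float, spelled with the plusFloat# primop.
addFloat :: Float -> Float -> Float
addFloat (F# x) (F# y) = F# (plusFloat# x y)

-- (==) on Double; the primop returns an Int#, converted here with isTrue#.
eqDouble :: Double -> Double -> Bool
eqDouble (D# x) (D# y) = isTrue# (x ==## y)

main :: IO ()
main = do
  let nan = 0 / 0 :: Double
  print (addDouble 1.5 2.25)
  print (addFloat 1.5 2.25)
  print (eqDouble 1.0 1.0)
  -- IEEE 754 equality is false whenever an operand is NaN.
  print (eqDouble nan nan)
```

The same pattern applies to the other rows: unbox with `D#` or `F#`, apply the primop, and rebox the result.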