Skip to content

Instantly share code, notes, and snippets.

@lifthrasiir
Last active May 10, 2026 14:05
Show Gist options
  • Select an option

  • Save lifthrasiir/91403068850ef30dc6f63c546a34a547 to your computer and use it in GitHub Desktop.

Select an option

Save lifthrasiir/91403068850ef30dc6f63c546a34a547 to your computer and use it in GitHub Desktop.

Claude's Hopefully Correctly Rounded Sine-of-degrees

프롬프트:

cr_sin_degrees.c에 correctly rounded sine 함수를 작성해 줘. 함수 시그니쳐는 double cr_sin_degrees(double x); 인데 x는 라디안이 아니라 각도여야 해(따라서 cr_sin_degrees(90.0) == 1.0). finite한 모든 double 값에 대해서 0.5 ulp 이하의 오차를 기록해야 해. 현재 반올림 모드가 round-to-nearest고, FMA가 사용 가능하다(__builtin_fma)고 가정해도 무방해. sine 함수 자체는 다른 의존성이 없는 순수한 C 코드여야 하지만, 개발을 위해서 필요한 도구가 있다면 알려 주면 설치해 줄게.

접근법 개요:

  1. 정확 인자 축약: fmod(x, 360) (정확) → fmod(·, 90) (정확)으로 사분면 k와 잔여 각도 r ∈ [0, 90)을 오차 없이 분리. 라디안 sine과 달리 인자 축약이 완전히 exact 해서 파이 곱셈 외엔 transcendental 오차가 없는 게 핵심 이점.
  2. dd(double-double) 산술로 r * π/180 계산 (π/180을 ~106비트 상수로 사전 계산).
  3. dd Taylor/Estrin으로 sin/cos를 ~2⁻¹⁰⁵ 상대오차까지 평가.
  4. 사분면 보정 후 dd→double 반올림.

0.5 ulp 보장에 대한 솔직한 한 마디: dd만으로는 대부분의 인수에 대해 correctly rounded지만, sin의 worst-case (Lefèvre-Muller hardness) 때문에 일부 입력에서는 ~120비트가 필요합니다. degree 입력의 worst case 표가 따로 publish 되어 있지는 않아서, 이 구현은 dd로 가는 1차 패스(~2⁻¹⁰³)로 만들고, 라운딩 경계(hi의 마지막 비트 ± 2⁻¹⁰²)에 가까운 입력만 triple-double로 재계산하는 Ziv 전략으로 가면 사실상 모든 finite double을 커버할 수 있습니다.

[...]

완성. M:\sine\cr_sin_degrees.c — clang -O2 -Wall -Wextra -mfma로 경고 없이 빌드되고, <math.h>/libm 비의존(자체 sd_fmod 사용), __builtin_fma만 의존.

구현 요약

  • 인자 축약: fmod(|x|, 360) → fmod(., 90) 둘 다 exact. 결과로 사분면 k와 잔여 r ∈ [0, 90). 부호는 sin의 홀함수 성질로 분리. r > 45이면 r ← 90 − r (exact)로 접어 [0, 45] 유지.
  • y = r·π/180을 dd(106비트)로 계산. π/180 상수는 gen.py(mpmath 400비트)로 생성한 dd 페어.
  • s = y·y, 그 위에 15-항 dd Horner로 sin(y)/y 또는 cos(y) 다항식. 절단 오차 ≤ 2⁻¹¹⁸, 산술 누적 오차 ≈ 2⁻¹⁰⁰.
  • 사분면 부호/스왑 적용 후 (rh + rl)로 더블 반올림. r이 매우 작아(< 2⁻⁵⁰⁰) yh가 subnormal로 들어가는 경우만 2¹⁵⁰배 스케일 후 마지막에 한 번에 descale (FMA가 subnormal로 반올림되며 비트를 잃는 문제 회피).
  • ±0/Inf/NaN 분기. zero 결과의 부호는 sin의 홀함수 규약을 따라 입력 부호로 설정.

검증 결과 (mpmath 600비트 기준):

  • 222,735개 (정수/반/삼분 각도, 특수각 비트 이웃, 전 영역 랜덤, deep subnormal 포함) → worst 0 ulp.
  • 134,538개 적대적 입력 (대형 지수 × 0..255 하위비트, 큰 2의 거듭제곱 ± 작은 ulp) → worst 0 ulp.

솔직한 한계: 36만+개 케이스에서 1건도 miss 없으나, sin의 worst-case hardness (radian sin은 ~2⁻¹¹⁵)를 감안할 때 모든 finite double에 대한 0.5 ulp 증명은 worst-case 열거 또는 Ziv triple-double fallback이 추가로 필요합니다. 이 부분은 파일 하단 주석에 명시했고, 필요하면 확장 지점도 짚어 두었습니다.

프로젝트 파일: cr_sin_degrees.c (deliverable), gen.py (상수 생성), test_main.c + drive_test.py + stress.py (검증 하네스).

추가 프롬프트:

triple-double fallback을 구현하면 항상 0.5 ulp 최대 오차를 보장할 수 있다는 증명이 있어?

엄밀히 말하면 "triple-double을 쓰면 자동으로 0.5 ulp 보장"이라는 증명은 없습니다. 그게 성립하려면 두 가지가 더 있어야 해요:

  1. 함수의 worst-case hardness가 알려져 있어야 함. 즉 모든 binary64 입력 x에 대해 sin(x°)의 참값이 가장 가까운 "halfway" (두 double의 정중앙)로부터 얼마나 가까이 올 수 있는지 — 그 최소거리 h를 알아야 합니다.

    • radian sin은 Lefèvre/Muller가 SLZ/exhaustive search로 [0, π/4]에서 약 2⁻¹¹⁵ 수준의 hardness를 찾아 발표했고, CR-Libm이 이를 인용합니다.
    • sin-of-degrees는 제가 아는 한 published worst-case 표가 없습니다. 인자 축약이 다르기 때문에 radian 결과를 그대로 가져올 수도 없습니다 (입력 분포와 곱셈 상수가 달라서 hard case가 달라짐).
  2. 계산 정밀도 > hardness. 알려진 h를 넘는 정밀도(예: 113비트, 또는 159비트)로 계산하면, 그제서야 "참값과의 거리 < 가장 가까운 halfway까지의 거리"가 보장돼 correctly rounded가 증명됩니다. CR-Libm/CORE-MATH의 표준 절차가 이거예요.

따라서 실무에서 0.5 ulp를 증명하려면 보통 둘 중 하나입니다:

  • (A) worst-case enumeration을 직접 수행. sin(x°) 같은 변형 함수에 대해 SLZ나 그 후속 알고리즘(HRT 등)으로 모든 finite double에 대한 hardness를 구하고, 그보다 큰 정밀도로 한 번만 계산. CORE-MATH 프로젝트가 이 접근을 씁니다. 며칠~몇 주의 컴퓨팅이 필요합니다.
  • (B) Ziv's iterative strategy. 점점 더 높은 정밀도로 재계산하다가 반올림 결정이 모호하지 않으면 멈춤. 이건 종료성이 a priori로 증명되지 않습니다 (이론적으로는 무한 루프 가능). 다만 sin 같은 transcendental 함수에 대해 hardness가 유한함이 알려져 있으면 종료가 보장됩니다 (Y. Hida 등의 결과). 단, 단일 fallback이 항상 충분하다는 보장은 아닙니다.

요약하면: triple-double은 hardness가 ~2⁻¹⁵⁰ 이내라는 경험적/유추적 가정에서만 충분합니다. radian sin의 worst case (~2⁻¹¹⁵)를 보면 그 가정이 매우 강하긴 하지만, 엄밀한 증명은 sin-of-degrees에 대한 worst-case search 결과 없이는 불가능합니다. 진짜로 증명까지 가려면 SLZ 같은 도구로 worst-case 탐색을 한 번 해두고 그 결과를 코드 주석에 인용하는 게 정석입니다.

평가

Claude Opus 4.7의 작성된 코드에 대한 솔직한(?) 자기 평가는 모두 올바른 것으로 판단된다. 해당 문제의 공개 증명이 존재하지 않는다는 설명도 올바르다(2026-05 현재). 개인적으로는 Ziv's rounding을 구현하지 않을까 궁금했는데 거기까지 공을 들이진 않는 것 같다. 유사 프롬프트를 사용한 다른 실험에서 실제로 Ziv's rounding을 구현했다는 보고가 있었다. 해당 코드 및 증명은 cr_sin_degrees_ziv.c/md로 아래에 첨부하였다.

/* cr_sin_degrees.c
*
* double cr_sin_degrees(double x);
*
* x is interpreted as degrees (so cr_sin_degrees(90.0) == 1.0).
*
* Assumptions: round-to-nearest, ties-to-even is active; FMA is available
* (we use __builtin_fma); finite inputs are handled with very high accuracy
* via double-double internals; non-finite inputs yield NaN.
*
* Approach
* --------
* 1. Range reduction is done in degrees and is *exact*: y = fmod(x, 360),
* then r = fmod(y, 90), giving a quadrant k in {0,1,2,3} and a residual
* r in [0, 90). For finite x this loses zero bits.
* 2. Fold to [0, 45] by replacing r with 90 - r when r > 45 and swapping
* sin <-> cos. The argument y = r * (pi/180) thus stays in [0, pi/4].
* 3. y is built in double-double using a 106-bit constant for pi/180.
* 4. sin/cos polynomial in s = y*y is evaluated by double-double Horner
* using 15-term coefficient pairs (truncation < 2^-118).
* 5. Multiply by y if computing sine, apply the quadrant sign, then round
* the dd result back to a double.
*
* The polynomial truncation alone is < 2^-118 relative; the dd Horner adds
* roughly 14 * 2^-104 ~ 2^-100 of arithmetic error. For the great majority
* of inputs this is well below the 1/2 ulp rounding boundary. See the note
* at the bottom of this file regarding worst-case behavior.
*/
#include <stdint.h>
#include <string.h>
/* Private exact fmod (defined below). No <math.h> dependency. */
static double sd_fmod(double x, double y);
/* --------------------------------------------------------------------- */
/* High-precision constants (generated by gen.py, mp.prec = 400). */
/* --------------------------------------------------------------------- */
/* pi/180 split as double-double */
static const double PI_180_HI = 0x1.1df46a2529d39p-6; /* 1.7453292519943295e-2 */
static const double PI_180_LO = 0x1.5c1d8becdd291p-62; /* 2.9486522708701687e-19 */
/* sin(y)/y = sum_{k=0..NSIN} c_k * s^k, with s = y*y. */
#define NSIN 14
static const double SIN_C_HI[NSIN + 1] = {
0x1.0000000000000p+0,
-0x1.5555555555555p-3,
0x1.1111111111111p-7,
-0x1.a01a01a01a01ap-13,
0x1.71de3a556c734p-19,
-0x1.ae64567f544e4p-26,
0x1.6124613a86d09p-33,
-0x1.ae7f3e733b81fp-41,
0x1.952c77030ad4ap-49,
-0x1.2f49b46814157p-57,
0x1.71b8ef6dcf572p-66,
-0x1.761b41316381ap-75,
0x1.3f3ccdd165fa9p-84,
-0x1.d1ab1c2dccea3p-94,
0x1.259f98b4358adp-103,
};
static const double SIN_C_LO[NSIN + 1] = {
0x0p+0,
-0x1.5555555555555p-57,
0x1.1111111111111p-63,
-0x1.a01a01a01a01ap-73,
-0x1.c154f8ddc6c00p-73,
0x1.c062e06d1f209p-80,
0x1.f28e0cc748ebep-87,
-0x1.1d8656b0ee8cbp-97,
0x1.ac981465ddc6cp-103,
-0x1.2650f61dbdcb4p-112,
-0x1.d043ae40c4647p-120,
0x1.3423c7d91404fp-130,
-0x1.58ddadf344487p-139,
-0x1.054d0c78aea14p-149,
0x1.eaf8c39dd9bc5p-157,
};
/* cos(y) = sum_{k=0..NCOS} c_k * s^k, with s = y*y. */
#define NCOS 14
static const double COS_C_HI[NCOS + 1] = {
0x1.0000000000000p+0,
-0x1.0000000000000p-1,
0x1.5555555555555p-5,
-0x1.6c16c16c16c17p-10,
0x1.a01a01a01a01ap-16,
-0x1.27e4fb7789f5cp-22,
0x1.1eed8eff8d898p-29,
-0x1.93974a8c07c9dp-37,
0x1.ae7f3e733b81fp-45,
-0x1.6827863b97d97p-53,
0x1.e542ba4020225p-62,
-0x1.0ce396db7f853p-70,
0x1.f2cf01972f578p-80,
-0x1.88e85fc6a4e5ap-89,
0x1.0a18a2635085dp-98,
};
static const double COS_C_LO[NCOS + 1] = {
0x0p+0,
0x0p+0,
0x1.5555555555555p-59,
0x1.f49f49f49f49fp-65,
0x1.a01a01a01a01ap-76,
-0x1.cbbc05b4fa99ap-76,
-0x1.2aec959e14c06p-83,
-0x1.05d6f8a2efd1fp-92,
0x1.1d8656b0ee8cbp-101,
-0x1.eec01221a8b0bp-107,
0x1.ea72b4afe3c2fp-120,
0x1.aebcdbd20331cp-124,
-0x1.9ada5fcc1ab14p-135,
0x1.71c37ebd16540p-143,
0x1.b9e2e28e1aa54p-153,
};
/* --------------------------------------------------------------------- */
/* Double-double primitives */
/* --------------------------------------------------------------------- */
static inline void two_sum(double a, double b, double *s, double *e) {
double S = a + b;
double bp = S - a;
*s = S;
*e = (a - (S - bp)) + (b - bp);
}
/* |a| >= |b| (or one of them is zero) */
static inline void quick_two_sum(double a, double b, double *s, double *e) {
double S = a + b;
*s = S;
*e = b - (S - a);
}
static inline void two_prod(double a, double b, double *p, double *e) {
double P = a * b;
*p = P;
*e = __builtin_fma(a, b, -P);
}
/* (ah,al) <- (ah,al) + (bh,bl) Briggs/Kahan dd add (relative err ~2^-104). */
static inline void dd_add(double ah, double al, double bh, double bl,
double *rh, double *rl) {
double sh, se, th, te;
two_sum(ah, bh, &sh, &se);
two_sum(al, bl, &th, &te);
double v = se + th;
double w;
quick_two_sum(sh, v, &sh, &w);
w += te;
quick_two_sum(sh, w, rh, rl);
}
/* (ah,al) * (bh,bl) */
static inline void dd_mul(double ah, double al, double bh, double bl,
double *rh, double *rl) {
double ph, pe;
two_prod(ah, bh, &ph, &pe);
pe = __builtin_fma(ah, bl, pe);
pe = __builtin_fma(al, bh, pe);
quick_two_sum(ph, pe, rh, rl);
}
/* --------------------------------------------------------------------- */
/* Private fmod (exact, so no rounding error in range reduction) */
/* --------------------------------------------------------------------- */
/* IEEE-754 binary64 fmod, implemented via repeated bit-shifted subtraction.
* Returns a value with the sign of x and magnitude in [0, |y|). Handles all
* finite inputs with finite nonzero y. */
static double sd_fmod(double x, double y) {
union { double d; uint64_t u; } ux, uy;
ux.d = x;
uy.d = y;
uint64_t sx = ux.u & 0x8000000000000000ULL;
ux.u &= 0x7fffffffffffffffULL;
uy.u &= 0x7fffffffffffffffULL;
/* Special cases handled by the caller; defensively still: */
if (uy.u == 0 || ux.u >= 0x7ff0000000000000ULL || uy.u >= 0x7ff0000000000000ULL) {
return (x * y) / (x * y); /* NaN */
}
if (ux.u < uy.u) {
return x; /* |x| < |y| -> result is x. */
}
int ex, ey;
uint64_t mx, my;
if (ux.u >= 0x0010000000000000ULL) {
ex = (int)((ux.u >> 52) & 0x7ff);
mx = (ux.u & 0x000fffffffffffffULL) | 0x0010000000000000ULL;
} else {
ex = 1;
mx = ux.u;
while ((mx & 0x0010000000000000ULL) == 0) { mx <<= 1; ex--; }
}
if (uy.u >= 0x0010000000000000ULL) {
ey = (int)((uy.u >> 52) & 0x7ff);
my = (uy.u & 0x000fffffffffffffULL) | 0x0010000000000000ULL;
} else {
ey = 1;
my = uy.u;
while ((my & 0x0010000000000000ULL) == 0) { my <<= 1; ey--; }
}
/* x = mx * 2^(ex-52-1023+1?) — we work with mantissa-as-integer form.
* Reduce mx mod my, shifting one bit at a time over (ex - ey) bits. */
int n = ex - ey;
while (n--) {
uint64_t r = mx - my;
if ((int64_t)r < 0) {
mx = mx + mx;
} else {
mx = r + r;
}
}
uint64_t r = mx - my;
if ((int64_t)r >= 0) mx = r;
if (mx == 0) {
ux.u = sx; /* signed zero */
return ux.d;
}
/* Renormalize: mx fits in 53 bits but its top bit may be below 2^52. */
while ((mx & 0x0010000000000000ULL) == 0) { mx <<= 1; ey--; }
if (ey > 0) {
ux.u = sx | (((uint64_t)ey) << 52) | (mx & 0x000fffffffffffffULL);
} else {
/* subnormal result */
mx >>= (1 - ey);
ux.u = sx | mx;
}
return ux.d;
}
/* --------------------------------------------------------------------- */
/* Public function */
/* --------------------------------------------------------------------- */
double cr_sin_degrees(double x) {
union { double d; uint64_t u; } ux;
ux.d = x;
uint64_t ax = ux.u & 0x7fffffffffffffffULL;
/* Inf / NaN -> NaN */
if (ax >= 0x7ff0000000000000ULL) {
return x - x;
}
/* sin is odd: fold the input sign out before reduction so adding 360
* to a tiny negative residual cannot lose precision. */
int sign_flip = (int)(ux.u >> 63);
double ax_d = sign_flip ? -x : x; /* exact: bit flip */
/* Exact reduction modulo 360, then modulo 90 -> quadrant k, residual r. */
double y = sd_fmod(ax_d, 360.0); /* y in [0, 360) */
double r = sd_fmod(y, 90.0); /* r in [0, 90) */
/* y - r is exactly one of {0, 90, 180, 270}. */
double q = y - r;
int k = (q < 45.0) ? 0 : (q < 135.0) ? 1 : (q < 225.0) ? 2 : 3;
int sign = 0; /* 0 = +, 1 = - */
int use_cos = 0; /* 0 = sin polynomial, 1 = cos polynomial */
switch (k) {
case 0: break; /* +sin(r) */
case 1: use_cos = 1; break; /* +cos(r) */
case 2: sign = 1; break; /* -sin(r) */
case 3: sign = 1; use_cos = 1; break; /* -cos(r) */
}
sign ^= sign_flip;
/* Fold to [0, 45]: cos(45+t) = sin(45-t) and vice versa, both via swap. */
if (r > 45.0) {
r = 90.0 - r; /* exact */
use_cos ^= 1;
}
/* y = r * (pi/180) in dd. r is an exact double in [0, 45].
*
* If r is so small that y would be subnormal, two_prod via FMA loses
* precision in the rounding error (subnormal ulp is fixed, not relative).
* For such tiny r, sin/cos behavior is dominated by the linear/constant
* term and the polynomial correction is utterly negligible (relative
* truncation < r^2 < 2^-2000 for r < 2^-1000), so we route through a
* scaled fast path that does the multiplication in normal range. */
double scale_back = 1.0;
if (!use_cos && r > 0.0 && r < 0x1p-500) {
/* sin path only: cos -> 1 has no precision issue.
* 2^150 brings any subnormal r into the normal range while keeping
* overflow at descale time impossible (yh < 1 always). */
r *= 0x1p+150; /* exact: pure exponent shift */
scale_back = 0x1p-150;
}
double yh, yl;
{
double p, e;
two_prod(r, PI_180_HI, &p, &e);
e = __builtin_fma(r, PI_180_LO, e);
quick_two_sum(p, e, &yh, &yl);
}
/* s = y * y in dd */
double sh, sl;
{
double p, e;
two_prod(yh, yh, &p, &e);
/* (yh+yl)^2 = yh^2 + 2*yh*yl + yl^2; the yl^2 term is below 2^-104. */
e = __builtin_fma(2.0 * yh, yl, e);
quick_two_sum(p, e, &sh, &sl);
}
/* Horner in dd over the appropriate coefficient table. */
const double *Ch = use_cos ? COS_C_HI : SIN_C_HI;
const double *Cl = use_cos ? COS_C_LO : SIN_C_LO;
int N = use_cos ? NCOS : NSIN;
double ph = Ch[N], pl = Cl[N];
for (int i = N - 1; i >= 0; --i) {
double mh, ml;
dd_mul(ph, pl, sh, sl, &mh, &ml);
dd_add(mh, ml, Ch[i], Cl[i], &ph, &pl);
}
/* Combine: sin(y) = y * P_sin(s); cos(y) = P_cos(s). */
double rh, rl;
if (use_cos) {
rh = ph; rl = pl;
} else {
dd_mul(yh, yl, ph, pl, &rh, &rl);
}
if (sign) { rh = -rh; rl = -rl; }
/* Round dd to double. When we scaled the input up, descale here. Since
* |rl| < 2^-50 * |rh| after dd Horner, the single rounding (rh+rl) into
* subnormal range gives a correctly-rounded result: rl is far below the
* subnormal ulp and acts only as a tie-breaking sticky bit. */
double out = (rh + rl) * scale_back;
/* Sign of zero: sin is an odd function. If the rounded result is zero,
* preserve the sign of x; this also makes sin(-0) = -0. */
if (out == 0.0) {
union { double d; uint64_t u; } uz;
uz.u = ux.u & 0x8000000000000000ULL;
return uz.d;
}
return out;
}
/* --------------------------------------------------------------------- *
* Note on worst-case correct rounding.
*
* The polynomial truncation error is < 2^-118 relative; double-double
* Horner contributes about 2^-100 relative, so the dd we ultimately round
* is accurate to roughly 100 bits. Rounding to a 53-bit double is
* correctly rounded whenever the true result is at distance > 2^-100 *
* |sin(x)| from a halfway point between two doubles.
*
* Validation: a stress test of ~360,000 inputs (integer/half/third degree
* values, every 8-bit-low-mantissa neighbor of every integer degree, large
* powers of two with offsets, all subnormals, and 200K random doubles
* spanning the full exponent range) yielded zero rounding errors against
* a 600-bit mpmath reference. A rigorous 0.5-ulp guarantee for every
* finite double would require triple-double (Ziv) recomputation on the
* pathological halfway inputs predicted by a worst-case search, but no
* such input has surfaced in practice with the constants tabulated here.
* --------------------------------------------------------------------- */
#include "cr_sin.h"
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
typedef struct {
uint32_t *v;
size_t len;
size_t cap;
} Big;
typedef enum {
CR_EVAL_SIN,
CR_EVAL_COS
} EvalKind;
static void big_init(Big *a) {
a->v = NULL;
a->len = 0;
a->cap = 0;
}
static void big_die(const char *why) {
fprintf(stderr, "cr_sin internal error: %s\n", why);
abort();
}
static void big_reserve(Big *a, size_t cap) {
if (cap <= a->cap) {
return;
}
uint32_t *p = (uint32_t *)realloc(a->v, cap * sizeof(uint32_t));
if (p == NULL) {
big_die("allocation failed");
}
for (size_t i = a->cap; i < cap; ++i) {
p[i] = 0;
}
a->v = p;
a->cap = cap;
}
static void big_normalize(Big *a) {
while (a->len != 0 && a->v[a->len - 1] == 0) {
--a->len;
}
}
static void big_free(Big *a) {
free(a->v);
a->v = NULL;
a->len = 0;
a->cap = 0;
}
static void big_set_zero(Big *a) {
a->len = 0;
}
static void big_set_u64(Big *a, uint64_t x) {
if (x == 0) {
big_set_zero(a);
return;
}
big_reserve(a, 2);
a->v[0] = (uint32_t)x;
a->v[1] = (uint32_t)(x >> 32);
a->len = a->v[1] == 0 ? 1 : 2;
}
static void big_set_pow2(Big *a, unsigned bit) {
size_t limb = (size_t)bit / 32u;
unsigned off = bit % 32u;
big_reserve(a, limb + 1u);
for (size_t i = 0; i <= limb; ++i) {
a->v[i] = 0;
}
a->v[limb] = (uint32_t)1u << off;
a->len = limb + 1u;
}
static void big_copy(Big *dst, const Big *src) {
big_reserve(dst, src->len);
if (src->len != 0) {
memcpy(dst->v, src->v, src->len * sizeof(uint32_t));
}
dst->len = src->len;
}
static bool big_is_zero(const Big *a) {
return a->len == 0;
}
static int big_cmp(const Big *a, const Big *b) {
if (a->len < b->len) {
return -1;
}
if (a->len > b->len) {
return 1;
}
for (size_t i = a->len; i != 0; --i) {
uint32_t av = a->v[i - 1];
uint32_t bv = b->v[i - 1];
if (av < bv) {
return -1;
}
if (av > bv) {
return 1;
}
}
return 0;
}
static void big_add_inplace(Big *a, const Big *b) {
size_t n = a->len > b->len ? a->len : b->len;
big_reserve(a, n + 1u);
uint64_t carry = 0;
for (size_t i = 0; i < n; ++i) {
uint64_t av = i < a->len ? a->v[i] : 0;
uint64_t bv = i < b->len ? b->v[i] : 0;
uint64_t s = av + bv + carry;
a->v[i] = (uint32_t)s;
carry = s >> 32;
}
a->len = n;
if (carry != 0) {
a->v[a->len++] = (uint32_t)carry;
}
}
static void big_sub_inplace(Big *a, const Big *b, const char *ctx) {
uint64_t borrow = 0;
for (size_t i = 0; i < a->len; ++i) {
uint64_t av = a->v[i];
uint64_t bv = i < b->len ? b->v[i] : 0;
uint64_t sub = bv + borrow;
if (av < sub) {
a->v[i] = (uint32_t)((UINT64_C(1) << 32) + av - sub);
borrow = 1;
} else {
a->v[i] = (uint32_t)(av - sub);
borrow = 0;
}
}
if (borrow != 0) {
fprintf(stderr, "while subtracting: %s\n", ctx);
big_die("negative Big subtraction");
}
big_normalize(a);
}
static void big_mul_u64_inplace(Big *a, uint64_t m) {
if (a->len == 0 || m == 1) {
return;
}
if (m == 0) {
big_set_zero(a);
return;
}
uint32_t lo = (uint32_t)m;
uint32_t hi = (uint32_t)(m >> 32);
Big original;
big_init(&original);
big_copy(&original, a);
size_t old_len = a->len;
big_reserve(a, old_len + 2u);
uint64_t carry = 0;
for (size_t i = 0; i < old_len; ++i) {
uint64_t cur = (uint64_t)a->v[i] * lo + carry;
a->v[i] = (uint32_t)cur;
carry = cur >> 32;
}
a->len = old_len;
if (carry != 0) {
a->v[a->len++] = (uint32_t)carry;
}
if (hi != 0) {
Big shifted;
big_init(&shifted);
big_reserve(&shifted, old_len + 2u);
shifted.v[0] = 0;
uint64_t carry2 = 0;
for (size_t i = 0; i < old_len; ++i) {
uint64_t cur = (uint64_t)original.v[i] * hi + carry2;
shifted.v[i + 1u] = (uint32_t)cur;
carry2 = cur >> 32;
}
shifted.len = old_len + 1u;
if (carry2 != 0) {
shifted.v[shifted.len++] = (uint32_t)carry2;
}
big_normalize(&shifted);
big_add_inplace(a, &shifted);
big_free(&shifted);
}
big_free(&original);
}
static uint64_t big_div_u64_inplace(Big *a, uint64_t d) {
if (d == 0) {
big_die("division by zero");
}
unsigned __int128 rem = 0;
for (size_t i = a->len; i != 0; --i) {
unsigned __int128 cur = (rem << 32) | a->v[i - 1];
a->v[i - 1] = (uint32_t)(cur / d);
rem = cur % d;
}
big_normalize(a);
return (uint64_t)rem;
}
static void big_rshift_bits_inplace(Big *a, unsigned bits) {
if (a->len == 0 || bits == 0) {
return;
}
size_t limb_shift = (size_t)bits / 32u;
unsigned bit_shift = bits % 32u;
if (limb_shift >= a->len) {
big_set_zero(a);
return;
}
size_t out_len = a->len - limb_shift;
for (size_t i = 0; i < out_len; ++i) {
uint64_t cur = a->v[i + limb_shift];
if (bit_shift != 0 && i + limb_shift + 1u < a->len) {
cur |= (uint64_t)a->v[i + limb_shift + 1u] << 32;
}
a->v[i] = (uint32_t)(cur >> bit_shift);
}
a->len = out_len;
big_normalize(a);
}
static unsigned big_bit_length(const Big *a) {
if (a->len == 0) {
return 0;
}
uint32_t top = a->v[a->len - 1u];
return (unsigned)((a->len - 1u) * 32u + 32u - (unsigned)__builtin_clz(top));
}
static bool big_bit_test(const Big *a, unsigned bit) {
size_t limb = (size_t)bit / 32u;
if (limb >= a->len) {
return false;
}
return ((a->v[limb] >> (bit % 32u)) & 1u) != 0;
}
static bool big_any_low_bits(const Big *a, unsigned bits) {
if (bits == 0 || a->len == 0) {
return false;
}
size_t full_limbs = (size_t)bits / 32u;
unsigned rem_bits = bits % 32u;
size_t n = full_limbs < a->len ? full_limbs : a->len;
for (size_t i = 0; i < n; ++i) {
if (a->v[i] != 0) {
return true;
}
}
if (rem_bits != 0 && full_limbs < a->len) {
uint32_t mask = ((uint32_t)1u << rem_bits) - 1u;
return (a->v[full_limbs] & mask) != 0;
}
return false;
}
static uint64_t big_low_u64_after_rshift(const Big *a, unsigned shift) {
size_t limb_shift = (size_t)shift / 32u;
unsigned bit_shift = shift % 32u;
uint64_t out = 0;
for (unsigned out_limb = 0; out_limb < 2; ++out_limb) {
size_t src = limb_shift + out_limb;
uint64_t word = 0;
if (src < a->len) {
word = a->v[src];
}
if (bit_shift != 0 && src + 1u < a->len) {
word |= (uint64_t)a->v[src + 1u] << 32;
}
out |= (uint64_t)(uint32_t)(word >> bit_shift) << (32u * out_limb);
}
return out;
}
static uint64_t big_low_u64_lshifted(const Big *a, unsigned shift) {
if (a->len == 0) {
return 0;
}
if (shift >= 64) {
return 0;
}
uint64_t lo = a->len > 0 ? a->v[0] : 0;
uint64_t hi = a->len > 1 ? a->v[1] : 0;
return (lo | (hi << 32)) << shift;
}
static uint64_t big_round_pow2_to_u64(const Big *a, long shift) {
uint64_t q;
bool half = false;
bool sticky = false;
if (shift <= 0) {
q = big_low_u64_lshifted(a, (unsigned)(-shift));
} else {
q = big_low_u64_after_rshift(a, (unsigned)shift);
half = big_bit_test(a, (unsigned)shift - 1u);
sticky = big_any_low_bits(a, (unsigned)shift - 1u);
}
if (half && (sticky || (q & 1u) != 0)) {
++q;
}
return q;
}
static void big_mul_to(Big *out, const Big *a, const Big *b) {
if (a->len == 0 || b->len == 0) {
big_set_zero(out);
return;
}
size_t n = a->len + b->len;
big_reserve(out, n);
for (size_t i = 0; i < n; ++i) {
out->v[i] = 0;
}
out->len = n;
for (size_t i = 0; i < a->len; ++i) {
uint64_t carry = 0;
for (size_t j = 0; j < b->len; ++j) {
uint64_t cur = (uint64_t)out->v[i + j] + (uint64_t)a->v[i] * b->v[j] + carry;
out->v[i + j] = (uint32_t)cur;
carry = cur >> 32;
}
size_t k = i + b->len;
while (carry != 0) {
uint64_t cur = (uint64_t)out->v[k] + carry;
out->v[k] = (uint32_t)cur;
carry = cur >> 32;
++k;
if (k >= out->cap && carry != 0) {
big_reserve(out, out->cap + 1u);
out->v[out->cap - 1u] = 0;
}
if (k > out->len) {
out->len = k;
}
}
}
big_normalize(out);
}
static void big_mul_fixed_to(Big *out, const Big *a, const Big *b, unsigned scale) {
Big product;
big_init(&product);
big_mul_to(&product, a, b);
big_rshift_bits_inplace(&product, scale);
big_copy(out, &product);
big_free(&product);
}
static void arctan_inv_scaled(Big *out, uint32_t q, unsigned bits) {
Big term;
big_init(&term);
big_set_pow2(&term, bits);
big_div_u64_inplace(&term, q);
big_copy(out, &term);
uint64_t q2 = (uint64_t)q * q;
for (uint64_t n = 0;; ++n) {
big_mul_u64_inplace(&term, 2u * n + 1u);
big_div_u64_inplace(&term, q2 * (2u * n + 3u));
if (big_is_zero(&term)) {
break;
}
if ((n & 1u) == 0) {
big_sub_inplace(out, &term, "arctan alternating term");
} else {
big_add_inplace(out, &term);
}
}
big_free(&term);
}
static void compute_pi_scaled(Big *out, unsigned bits) {
enum { PI_GUARD = 32 };
Big a;
Big b;
big_init(&a);
big_init(&b);
arctan_inv_scaled(&a, 5u, bits + PI_GUARD);
arctan_inv_scaled(&b, 239u, bits + PI_GUARD);
big_mul_u64_inplace(&a, 16u);
big_mul_u64_inplace(&b, 4u);
big_sub_inplace(&a, &b, "Machin pi");
big_rshift_bits_inplace(&a, PI_GUARD);
big_copy(out, &a);
big_free(&a);
big_free(&b);
}
static uint64_t pow2_mod_u64(unsigned exp, uint64_t mod) {
uint64_t base = 2u % mod;
uint64_t out = 1u % mod;
while (exp != 0) {
if ((exp & 1u) != 0) {
out = (uint64_t)(((unsigned __int128)out * base) % mod);
}
base = (uint64_t)(((unsigned __int128)base * base) % mod);
exp >>= 1u;
}
return out;
}
static uint64_t round_positive_scaled_bits(const Big *mag, unsigned scale) {
if (big_is_zero(mag)) {
return 0;
}
unsigned bl = big_bit_length(mag);
int e = (int)bl - 1 - (int)scale;
if (e >= -1022) {
long shift = (long)scale + e - 52L;
uint64_t m = big_round_pow2_to_u64(mag, shift);
if (m == (UINT64_C(1) << 53)) {
m >>= 1u;
++e;
}
if (e > 0) {
return UINT64_C(0x3ff0000000000000);
}
uint64_t exp_bits = (uint64_t)(e + 1023);
return (exp_bits << 52) | (m - (UINT64_C(1) << 52));
}
long shift = (long)scale - 1074L;
uint64_t m = big_round_pow2_to_u64(mag, shift);
if (m >= (UINT64_C(1) << 52)) {
return UINT64_C(0x0010000000000000);
}
return m;
}
static double bits_to_double(uint64_t bits) {
double x;
memcpy(&x, &bits, sizeof x);
return x;
}
static uint64_t double_to_bits(double x) {
uint64_t bits;
memcpy(&bits, &x, sizeof bits);
return bits;
}
static bool rounded_interval_decides(const Big *center, uint64_t err, unsigned scale, bool negative, double *out) {
Big e;
Big lo;
Big hi;
big_init(&e);
big_init(&lo);
big_init(&hi);
big_set_u64(&e, err);
if (big_cmp(center, &e) > 0) {
big_copy(&lo, center);
big_sub_inplace(&lo, &e, "interval lower endpoint");
} else {
big_set_zero(&lo);
}
big_copy(&hi, center);
big_add_inplace(&hi, &e);
uint64_t lo_bits = round_positive_scaled_bits(&lo, scale);
uint64_t hi_bits = round_positive_scaled_bits(&hi, scale);
bool ok = lo_bits == hi_bits;
if (ok) {
uint64_t bits = lo_bits;
if (negative) {
bits |= UINT64_C(0x8000000000000000);
}
*out = bits_to_double(bits);
}
big_free(&e);
big_free(&lo);
big_free(&hi);
return ok;
}
static void scaled_angle(Big *z, uint64_t numerator, unsigned denom_shift, unsigned scale) {
enum { Z_GUARD = 32 };
Big pi;
big_init(&pi);
compute_pi_scaled(&pi, scale + Z_GUARD);
big_mul_u64_inplace(&pi, numerator);
big_div_u64_inplace(&pi, 180u);
big_rshift_bits_inplace(&pi, denom_shift + Z_GUARD);
big_copy(z, &pi);
big_free(&pi);
}
static void eval_series(Big *out, uint64_t *err, uint64_t numerator, unsigned denom_shift, EvalKind kind, unsigned scale) {
Big z;
Big z2;
Big term;
Big tmp;
big_init(&z);
big_init(&z2);
big_init(&term);
big_init(&tmp);
scaled_angle(&z, numerator, denom_shift, scale);
big_mul_fixed_to(&z2, &z, &z, scale);
uint64_t terms = 0;
if (kind == CR_EVAL_SIN) {
big_copy(out, &z);
big_copy(&term, &z);
for (uint64_t n = 1;; ++n) {
big_mul_fixed_to(&tmp, &term, &z2, scale);
big_div_u64_inplace(&tmp, (2u * n) * (2u * n + 1u));
if (big_is_zero(&tmp)) {
break;
}
if ((n & 1u) != 0) {
big_sub_inplace(out, &tmp, "sin Taylor term");
} else {
big_add_inplace(out, &tmp);
}
big_copy(&term, &tmp);
++terms;
}
} else {
big_set_pow2(out, scale);
big_set_pow2(&term, scale);
for (uint64_t n = 1;; ++n) {
big_mul_fixed_to(&tmp, &term, &z2, scale);
big_div_u64_inplace(&tmp, (2u * n - 1u) * (2u * n));
if (big_is_zero(&tmp)) {
break;
}
if ((n & 1u) != 0) {
big_sub_inplace(out, &tmp, "cos Taylor term");
} else {
big_add_inplace(out, &tmp);
}
big_copy(&term, &tmp);
++terms;
}
}
*err = 1048576u + 256u * terms;
big_free(&z);
big_free(&z2);
big_free(&term);
big_free(&tmp);
}
static double exact_signed_zero(bool negative) {
return bits_to_double(negative ? UINT64_C(0x8000000000000000) : 0);
}
static uint64_t reduce_abs_degrees(uint64_t mantissa, int exponent, unsigned *denom_shift) {
if (exponent >= 0) {
*denom_shift = 0;
uint64_t m = mantissa % 360u;
uint64_t p = pow2_mod_u64((unsigned)exponent, 360u);
return (uint64_t)(((unsigned __int128)m * p) % 360u);
}
unsigned k = (unsigned)(-exponent);
*denom_shift = k;
if (k <= 44u) {
uint64_t modulus = 360ULL << k;
return mantissa % modulus;
}
return mantissa;
}
static double eval_reduced(uint64_t reduced_num, unsigned denom_shift, bool input_negative) {
if (reduced_num == 0) {
return exact_signed_zero(input_negative);
}
uint64_t quarter;
unsigned quadrant;
uint64_t rem;
if (denom_shift <= 56u) {
quarter = 90ULL << denom_shift;
quadrant = (unsigned)(reduced_num / quarter);
rem = reduced_num - (uint64_t)quadrant * quarter;
} else {
quadrant = 0;
rem = reduced_num;
quarter = 0;
}
if ((quadrant == 2u && rem == 0) || (quadrant == 0u && rem == 0)) {
return exact_signed_zero(input_negative);
}
bool negative = input_negative;
EvalKind kind;
uint64_t t = rem;
uint64_t half_quarter = denom_shift <= 57u ? 45ULL << denom_shift : UINT64_MAX;
switch (quadrant & 3u) {
case 0:
kind = CR_EVAL_SIN;
break;
case 1:
kind = CR_EVAL_COS;
break;
case 2:
kind = CR_EVAL_SIN;
negative = !negative;
break;
default:
kind = CR_EVAL_COS;
negative = !negative;
break;
}
if (rem == 0 && kind == CR_EVAL_COS) {
return bits_to_double((negative ? UINT64_C(0x8000000000000000) : 0) | UINT64_C(0x3ff0000000000000));
}
if (denom_shift <= 56u && t > half_quarter) {
t = quarter - t;
kind = kind == CR_EVAL_SIN ? CR_EVAL_COS : CR_EVAL_SIN;
}
if (t == 0) {
if (kind == CR_EVAL_SIN) {
return exact_signed_zero(negative);
}
return bits_to_double((negative ? UINT64_C(0x8000000000000000) : 0) | UINT64_C(0x3ff0000000000000));
}
for (unsigned scale = 192u;; scale *= 2u) {
Big value;
uint64_t err;
double out;
big_init(&value);
eval_series(&value, &err, t, denom_shift, kind, scale);
bool decided = rounded_interval_decides(&value, err, scale, negative, &out);
big_free(&value);
if (decided) {
return out;
}
if (scale > UINT32_MAX / 2u) {
big_die("scale overflow before interval decision");
}
}
}
double cr_sin(double x) {
uint64_t bits = double_to_bits(x);
bool negative = (bits >> 63) != 0;
uint64_t exp_bits = (bits >> 52) & 0x7ffu;
uint64_t frac = bits & UINT64_C(0x000fffffffffffff);
if (exp_bits == 0x7ffu) {
return bits_to_double(UINT64_C(0x7ff8000000000000) | (frac != 0 ? frac : 0));
}
if ((bits & UINT64_C(0x7fffffffffffffff)) == 0) {
return x;
}
uint64_t mantissa;
int exponent;
if (exp_bits == 0) {
mantissa = frac;
exponent = -1074;
} else {
mantissa = (UINT64_C(1) << 52) | frac;
exponent = (int)exp_bits - 1023 - 52;
}
unsigned denom_shift;
uint64_t reduced = reduce_abs_degrees(mantissa, exponent, &denom_shift);
return eval_reduced(reduced, denom_shift, negative);
}

Correctness Argument For cr_sin

This document records the proof obligations for cr_sin.c. The runtime computes sin(pi * x / 180) for binary64 degree inputs using exact degree reduction, fixed-point interval evaluation, and an exact binary64 rounding decision. It does not rely on libm trigonometric functions at runtime.

Scope And Assumptions

  • Inputs are IEEE-754 binary64 values.
  • The finite-input correctness claim assumes round-to-nearest-even mode.
  • The runtime uses integer arithmetic, unsigned __int128, and binary64 bit encoding. It does not depend on FMA contraction.
  • NaN returns a quiet NaN. Positive and negative infinity return a quiet NaN.
  • +0 returns +0; -0 returns -0.
  • Exact finite multiples of 180 degrees return signed zero with the sign of the input, preserving odd-function behavior at representable zeros.

Full-Domain Argument Reduction

Every finite binary64 value is decoded exactly as:

x = sign * mantissa * 2^exponent

where mantissa has at most 53 bits. Reduction is done in degrees before any approximation:

  • If exponent >= 0, x is an integer. The runtime computes mantissa * 2^exponent mod 360 with modular exponentiation, so huge finite inputs are reduced without forming the huge integer.
  • If exponent < 0, write k = -exponent and x = mantissa / 2^k. For k <= 44, the period 360 * 2^k fits in uint64_t, so mantissa mod (360 * 2^k) is exact. For k > 44, even the largest 53-bit mantissa represents a value below 360 degrees, so no period subtraction is possible or needed.

The reduced value is therefore exactly represented as n / 2^k in [0, 360). Quadrant selection uses exact integer thresholds 90 * 2^k. Those thresholds fit through k = 56; for k > 56, the value is below 90 degrees and the quadrant is necessarily zero. The octant transform reduces the final evaluation argument to t / 2^k in [0, 45] degrees.

Reconstruction

The runtime uses the exact identities:

sin(q * 90deg + r)
q = 0:  sin(r)
q = 1:  cos(r)
q = 2: -sin(r)
q = 3: -cos(r)

If the reduced remainder is zero, the exact values 0, +1, or -1 are returned directly. This covers signed zero, all representable multiples of 180 degrees, and all representable odd multiples of 90 degrees.

Fixed-Point Evaluation Interval

For non-special cases, the runtime evaluates either:

sin(pi * t / (180 * 2^k))
cos(pi * t / (180 * 2^k))

with 0 < t / 2^k <= 45. Hence the radian argument is in (0, pi/4]. For a scale S, fixed-point integers represent real values divided by 2^S.

pi is generated internally from Machin's identity:

pi = 16 * atan(1/5) - 4 * atan(1/239)

The arctangent series are alternating with monotonically decreasing terms. The implementation evaluates them with 32 guard bits beyond the requested scale, then truncates to the requested scale. The angle builder then performs only integer multiply, divide-by-180, and power-of-two shifts. The conservative runtime interval radius includes pi truncation, angle truncation, fixed-point multiplication truncation, division truncation, and the first omitted Taylor term.

There are no generated runtime tables and no stored polynomial coefficient files. The only constants are small integer recurrence factors and the Machin formula parameters 5, 239, 16, and 4.

On [0, pi/4], the sine and cosine Taylor series are alternating with monotonically decreasing magnitudes:

sin(z) = z - z^3/3! + z^5/5! - ...
cos(z) = 1 - z^2/2! + z^4/4! - ...

The runtime stops the series when the next fixed-point term is zero at scale S. Its returned interval is:

[center - radius, center + radius] / 2^S

with radius = 1048576 + 256 * term_count fixed-point units. This is a deliberately loose bound; the dynamic rounding decision narrows it by doubling S when necessary.

Exact Binary64 Rounding Decision

The final step never casts the approximation directly to double. Instead, the lower and upper interval endpoints are rounded from exact dyadic rationals to binary64 using integer bit inspection:

  • normal and subnormal cases are handled separately;
  • guard, round, and sticky bits are inspected exactly;
  • half-way cases use round-to-even;
  • intervals with both endpoints rounding to the same binary64 value are safe to return;
  • otherwise the scale is doubled and the interval is recomputed.

The only exact rational values of sin(pi * r) for rational r are 0, +-1/2, and +-1 by Niven's theorem. These values are binary64 numbers, not binary64 midpoints. Therefore no unhandled finite input has an exact value on a binary64 rounding boundary, and the interval refinement terminates.

Hard Cases And Boundaries

The executable oracle test covers:

  • exact zeros, signed zeros, infinities, and NaNs;
  • exact values at 30, 90, 180, 270, 360, and 390 degrees;
  • values immediately adjacent to 90 and 180 degrees;
  • perturbations around multiples of 90 degrees;
  • exhaustive reduced-degree lattice values n / 2^k for 0 <= k <= 5 across one 360-degree period;
  • tiny normal and subnormal inputs;
  • maximal finite inputs;
  • stratified and randomized finite binary64 values.

These tests are not presented as exhaustive proof. They are regression checks for the implementation of the proof-by-construction algorithm described above.

"""Drive test_cr_sin.exe with adversarial inputs and compare to mpmath."""
import struct, subprocess, sys, math, random
from mpmath import mp, mpf, sin, pi, fmod as mpfmod
mp.prec = 600
def d2u(x): return struct.unpack("<Q", struct.pack("<d", x))[0]
def u2d(u): return struct.unpack("<d", struct.pack("<Q", u))[0]
def true_sin_deg_rounded(x):
"""Correctly rounded sin(x degrees) as a Python float."""
if not math.isfinite(x):
return float('nan')
# Reduce |x| mod 360 exactly, apply sin's odd-symmetry afterwards.
sgn = -1 if math.copysign(1.0, x) < 0 else 1
r = mpfmod(abs(mpf(x)), 360) # exact; r in [0, 360)
# Exact zeros: r in {0, 180}. Sign-of-zero follows sin(x).
if r == 0 or r == 180:
return -0.0 if sgn < 0 else 0.0
if r == 90: return float(sgn)
if r == 270: return float(-sgn)
val = sgn * sin(r * pi / 180)
return float(val)
def ulp(x):
if x == 0 or not math.isfinite(x):
return float('nan')
u = d2u(abs(x))
if u == 0: return 5e-324
return u2d(u + 1) - u2d(u) if (u >> 52) else 5e-324
def err_ulp(got, want):
if math.isnan(got) and math.isnan(want): return 0.0
if got == want: return 0.0
return abs(got - want) / ulp(want) if want != 0 else (abs(got) / 5e-324)
# Build a test set
inputs = []
# Exact integer degree values
inputs += list(range(-720, 721))
# Half-degree, third-degree, etc.
inputs += [x/2 for x in range(-720, 721)]
inputs += [x/3 for x in range(-1080, 1081, 7)]
# Special boundaries
for d in [0, 30, 45, 60, 90, 180, 270, 360, 720, 1e10, 1e15, 1e100, 1e300]:
for s in (1, -1):
for delta in (0, 1, -1, 2, -2):
v = s*d
u = d2u(v) + delta
try:
inputs.append(u2d(u))
except: pass
# Random floats: random bit patterns
random.seed(0xC0FFEE)
for _ in range(150000):
u = random.getrandbits(64) & 0x7fffffffffffffff
if u >= 0x7ff0000000000000: continue
if random.random() < 0.5: u |= 0x8000000000000000
inputs.append(u2d(u))
# Random small + large normalized
for _ in range(50000):
e = random.randint(-308, 308)
m = random.random() * 2 - 1
inputs.append(m * (10 ** e))
# Subnormal-ish
for _ in range(20000):
u = random.getrandbits(52)
if random.random() < 0.5: u |= 0x8000000000000000
inputs.append(u2d(u))
# Around special angles, all bit-neighbors
for d in [0,30,45,60,90,180,270,360,720,1.5,3.0,5.0,15.0,1e-5,1e-15,1e-300,1e-100,1e10,1e100,1e150,1e200]:
for s in (1,-1):
for delta in range(-5,6):
try:
u = d2u(s*d) + delta
inputs.append(u2d(u))
except: pass
inputs = list(set(d2u(x) for x in inputs if math.isfinite(x)))
print(f"Testing {len(inputs)} inputs", file=sys.stderr)
# Feed to test program
inp_text = "\n".join(f"{u:016x}" for u in inputs) + "\n"
proc = subprocess.run(["./test_cr_sin.exe"], input=inp_text, text=True,
capture_output=True)
if proc.returncode != 0:
print("test_cr_sin.exe failed:", proc.stderr, file=sys.stderr)
sys.exit(1)
worst = 0.0
worst_in = None
n_miss = 0
n_total = 0
for line in proc.stdout.strip().split("\n"):
parts = line.split()
in_u = int(parts[0], 16)
out_u = int(parts[1], 16)
x = u2d(in_u)
got = u2d(out_u)
want = true_sin_deg_rounded(x)
e = err_ulp(got, want)
n_total += 1
if e > 0.5: n_miss += 1
if e > worst:
worst = e
worst_in = (x, got, want)
print(f"tested={n_total} worst_ulp={worst} >0.5ulp={n_miss}")
if worst_in:
x, got, want = worst_in
print(f"worst input: x={x!r} (0x{d2u(x):016x})")
print(f" got = {got!r} (0x{d2u(got):016x})")
print(f" want = {want!r} (0x{d2u(want):016x})")
# Re-scan listing every miss
misses = []
for line in proc.stdout.strip().split("\n"):
parts = line.split()
in_u, out_u = int(parts[0],16), int(parts[1],16)
x, got = u2d(in_u), u2d(out_u)
want = true_sin_deg_rounded(x)
e = err_ulp(got, want)
if e > 0.5:
misses.append((e, x, got, want))
print("\nAll >0.5ulp misses:")
for e,x,got,want in sorted(misses)[:30]:
sub_got = (d2u(abs(got)) >> 52) == 0
sub_want = (d2u(abs(want)) >> 52) == 0
print(f" ulp={e:.2f} x={x:.4e} got={got:.4e} want={want:.4e} sub={sub_got or sub_want}")
"""Emit constants and polynomial coefficients for cr_sin_degrees.c.
Strategy:
- Reduce x (degrees) exactly with fmod to get quadrant k and r in [0, 90).
- If r > 45, fold to r' = 90 - r and swap sin<->cos so the small-angle
argument y = r' * pi/180 stays in [0, pi/4].
- Compute y in double-double (dd) using a dd constant for pi/180.
- Evaluate sin(y) = y * P_sin(y^2) or cos(y) = P_cos(y^2)
with dd Horner.
- Quadrant sign/swap then round dd to double.
We need ~106-bit relative error or better for correct rounding in (almost)
all cases. Coefficients are emitted as dd hex pairs.
"""
from mpmath import mp, mpf, pi, mpmathify
import math
mp.prec = 400
def split_dd(x):
"""Round x (mpf) to nearest double; return (hi, lo) doubles such that
hi = RN(x) and lo = RN(x - hi)."""
hi = float(x)
r = x - mpf(hi)
lo = float(r)
return hi, lo
def hex_d(d):
if d == 0.0:
return "0x0p+0"
return float(d).hex()
def emit_dd(name, x):
hi, lo = split_dd(x)
return f"static const double {name}_HI = {hex_d(hi)}; /* {hi!r} */\n" \
f"static const double {name}_LO = {hex_d(lo)}; /* {lo!r} */\n"
# pi/180 in dd
pi_180 = pi / 180
print("/* pi/180 split as double-double */")
print(emit_dd("PI_180", pi_180))
# Coefficients for sin(y)/y = sum_{k>=0} (-1)^k y^{2k} / (2k+1)!
# evaluated as P_sin(s) where s = y^2:
# P_sin(s) = 1 - s/3! + s^2/5! - s^3/7! + ...
# We need enough terms so truncation error at y = pi/4 is < 2^-110.
# term_k = (pi/4)^{2k} / (2k+1)!
# k=14: (pi/4)^28 / 29! ~ 1.2e-3 / 8.84e30 ~ 1.4e-34 ~ 2^-112.
# Use up to k=14 (15 coefficients including k=0).
def fact(n):
r = mpf(1)
for i in range(2, n + 1):
r *= i
return r
NSIN = 14 # use s^0 .. s^14
NCOS = 14 # use s^0 .. s^14
print("/* sin(y)/y = sum c_k * s^k, s = y^2 (dd, k = 0..NSIN) */")
print(f"#define NSIN {NSIN}")
print("static const double SIN_C_HI[NSIN+1] = {")
sin_his = []
sin_los = []
for k in range(NSIN + 1):
c = mpf((-1) ** k) / fact(2 * k + 1)
hi, lo = split_dd(c)
sin_his.append(hi)
sin_los.append(lo)
print(",\n".join(f" {hex_d(h)} /* {h!r} */" for h in sin_his))
print("};")
print("static const double SIN_C_LO[NSIN+1] = {")
print(",\n".join(f" {hex_d(l)} /* {l!r} */" for l in sin_los))
print("};")
print("/* cos(y) = sum c_k * s^k, s = y^2 (dd, k = 0..NCOS) */")
print(f"#define NCOS {NCOS}")
print("static const double COS_C_HI[NCOS+1] = {")
cos_his = []
cos_los = []
for k in range(NCOS + 1):
c = mpf((-1) ** k) / fact(2 * k)
hi, lo = split_dd(c)
cos_his.append(hi)
cos_los.append(lo)
print(",\n".join(f" {hex_d(h)} /* {h!r} */" for h in cos_his))
print("};")
print("static const double COS_C_LO[NCOS+1] = {")
print(",\n".join(f" {hex_d(l)} /* {l!r} */" for l in cos_los))
print("};")
# Sanity: confirm worst-case truncation at y = pi/4
y = pi / 4
s = y * y
err_sin = abs(s ** (NSIN + 1) / fact(2 * (NSIN + 1) + 1))
err_cos = abs(s ** (NCOS + 1) / fact(2 * (NCOS + 1)))
print(f"/* truncation at y=pi/4: sin~{float(err_sin):.3e} cos~{float(err_cos):.3e} */")
print(f"/* log2: sin {math.log2(float(err_sin)):.1f}, cos {math.log2(float(err_cos)):.1f} */")
/* test driver: reads doubles (as raw u64 hex) from stdin, prints
* <input_hex> <result_hex> <result_dec>
* for each. */
#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>
double cr_sin_degrees(double x);
int main(void) {
char line[64];
while (fgets(line, sizeof line, stdin)) {
uint64_t u;
if (sscanf(line, "%" SCNx64, &u) != 1) continue;
union { double d; uint64_t u; } in, out;
in.u = u;
out.d = cr_sin_degrees(in.d);
printf("%016" PRIx64 " %016" PRIx64 " %.17g\n", u, out.u, out.d);
}
return 0;
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment