Created
February 16, 2015 01:27
Very fast EXP(x) for AVX2+FMA instructions
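#include <immintrin.h>  // AVX2 + FMA intrinsics (e.g. -mavx2 -mfma with GCC/Clang)

// Approximation for EXP(x) -- very fast, but not super accurate.
// Builds the IEEE-754 bit pattern of the result directly: bits = C2 * x + C1.
static inline __m256 _mm256_expfaster_ps(const __m256 &q) {
    const __m256 C1 = _mm256_set1_ps(1064872507.1541044f);  // 127 * 2^23 minus a small correction term
    const __m256 C2 = _mm256_set1_ps(12102203.161561485f);  // 2^23 / ln(2)
    return _mm256_castsi256_ps(_mm256_cvttps_epi32(_mm256_fmadd_ps(C2, q, C1)));
}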
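
To get a feel for how rough "not super accurate" is, the same bit trick can be sketched in scalar C++ and compared against `std::exp`. This is only an illustration: `exp_fast_scalar` and `exp_fast_rel_error` are hypothetical helper names, not part of the gist, and the scalar version uses a separate multiply and add rather than a fused one, so its low bits may differ slightly from the AVX2 result.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical scalar counterpart of _mm256_expfaster_ps (not part of the gist).
static inline float exp_fast_scalar(float x) {
    // Same constants as the vector version.
    const float C1 = 1064872507.1541044f;
    const float C2 = 12102203.161561485f;
    // Convert with truncation toward zero, as _mm256_cvttps_epi32 does.
    const std::int32_t bits = static_cast<std::int32_t>(C2 * x + C1);
    float result;
    std::memcpy(&result, &bits, sizeof result);  // reinterpret the integer bits as a float
    return result;
}

// Relative error of the approximation against std::exp for one input.
static inline float exp_fast_rel_error(float x) {
    const float exact = std::exp(x);
    return std::fabs(exp_fast_scalar(x) - exact) / exact;
}
```

For moderate inputs the relative error should stay within a few percent. Inputs outside roughly -87 to 88 push `C2 * x + C1` out of the range of a positive 32-bit integer, so the result there is meaningless.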