-#606: Removed Sparse Dynamic Matrix, representing an API break.
-#649: Move Eigen::all, last, lastp1 back to Eigen::placeholders:: to reduce name collisions.
-#740: Remove DenseBase::nonZeros() as it duplicates DenseBase::size() functionality.
-#658: Update SVD Module to allow specifying computation options with a template parameter, replacing the previous QRPreconditioner parameter.
-#749: Reverted SVD module update to restore compatibility with third-party libraries.
-#744: Require recent GCC and MSVC, removed EIGEN_HAS_CXX14 and other feature test macros.
-#771: Renamed Eigen::internal::size to ssize to prevent ADL conflicts, aligning with C++ standards.
-#826: Introduced Options template parameter in SVD module with API breaking changes for improved flexibility.
-#857: Re-add svd::compute(Matrix, options) method to avoid breaking external projects.
-#1240: Revert comparison overloads to return bool array and add cwiseTypedLesser for typed comparisons.
-#1260: Require C++14 standard for detecting Inf and NaN.
-#1280: Disable raw array indexed view access for 1D arrays to improve stability.
-#1301: Ensure Euler angles are returned with canonical ranges, impacting existing computations.
-#1475: Remove MoreVectorization feature due to redundancy and potential ODR violations.
-#1474: Remove Skyline module due to non-functionality and lack of tests.
-#1498: Removed r_cnjg due to conflicts with f2c, inlining functions to resolve duplicate symbol errors.
-#1497: Removed non-standard int return types from BLAS/LAPACK functions to improve interoperability and reduce symbol conflicts.
-#1730: Revert change to make fixed-size objects trivially move assignable due to issues with setZero().
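Entry #1240 above distinguishes comparisons that return a bool array from "typed" comparisons. As a rough, library-independent sketch of that distinction (the helper names `less_bool` and `less_typed` are hypothetical and are not Eigen's API), a typed comparison is assumed here to keep the operand's scalar type and encode true/false as 1/0:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Boolean-valued comparison: one bool per coefficient.
// (Hypothetical helper for illustration; not part of Eigen.)
std::vector<bool> less_bool(const std::vector<float>& a,
                            const std::vector<float>& b) {
  assert(a.size() == b.size());
  std::vector<bool> out(a.size());
  for (std::size_t i = 0; i < a.size(); ++i) out[i] = a[i] < b[i];
  return out;
}

// Typed comparison: the result stays in the operand's scalar type,
// with 1 meaning "true" and 0 meaning "false" (assumed convention).
std::vector<float> less_typed(const std::vector<float>& a,
                              const std::vector<float>& b) {
  assert(a.size() == b.size());
  std::vector<float> out(a.size());
  for (std::size_t i = 0; i < a.size(); ++i) out[i] = (a[i] < b[i]) ? 1.0f : 0.0f;
  return out;
}
```

Keeping the result in the scalar type lets it feed directly into further arithmetic on the same type, which is presumably why the typed variants are offered separately from the bool-returning operators.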
-#515: Add random matrix generation via SVD for enhanced testing capabilities.
-#612: Add support for EIGEN_TENSOR_PLUGIN and related functionalities for enhanced tensor manipulation.
-#447: Introduces BiCGSTAB(L) algorithm for solving linear systems, enhancing capabilities for non-symmetric systems.
-#577: Introduces the IDR(s)STAB(l) method for solving sparse square problems, enhancing convergence and computational efficiency.
-#791: Add support for Cray, Fujitsu, and Intel ICX compilers.
-#856: Add support for Apple's Accelerate sparse matrix solvers to enhance performance of sparse matrix computations.
-#798: Introduced a NNLS solver using the active-set algorithm, enhancing the library's capabilities for non-negative least squares problems.
-#965: Add fused multiply functions for PowerPC - pmsub, pnmadd and pnmsub.
-#356: Added PocketFFT support in FFT module to improve accuracy and performance over KissFFT.
-#973: Introduces the .arg() method to enhance Tensor functionality.
-#981: Introduced an MKL adapter in FFT module with fixes and new library implementations.
-#978: Add Sparse Subset of Matrix Inverse using the Takahashi algorithm and Kahan Summation for efficiency and stability.
-#1004: Implemented true determinant calculation for QR decomposition classes.
-#1029: Add fixed power unary operation for coefficientwise real-valued power operations on arrays.
-#1017: Add support for AVX512-FP16 for vectorizing half precision math.
-#1047: Introduces a skew symmetric matrix class for 3D vectors using Rodrigues' rotation formula.
-#1082: Add a vectorized implementation of atan2 for enhanced performance in Eigen.
-#1097: Introduced a signbit function to efficiently determine the sign of floating point values.
-#1098: Implemented cross product for 2D vectors, enriching calculations.
-#1103: Introduced a utility to sort inner vectors of sparse matrices using a customizable comparison function, enhancing sparse algorithm performance.
-#1133: Introduced setEqualSpaced for vectorized creation of equally spaced vectors.
-#1152: Add template for QR permutation index type and fix ColPivHouseholderQR Lapacke bindings.
-#1211: Introduced CArg for vectorized complex argument calculations in Eigen.
-#1203: Introduces typed logical operators, enabling full vectorization and improved handling of logical operations across scalar types.
-#1244: Introduce permutation index specification for PartialPivLU and FullPivLU, enhancing compatibility with Lapacke ILP64.
-#1281: Introduced insertFromTriplets and insertFromSortedTriplets for efficient batch insertions in sparse matrices.
-#1285: Introduces USM support for SYCL, simplifying usage and improving performance.
-#1314: Introduce canonicalEulerAngles method to provide standardized angle ranges.
-#1330: Enabling half precision support for SYCL using Eigen::half abstraction.
-#1403: Implemented component-wise cubic root (cbrt) calculations for arrays and matrices.
-#1462: Introduces feature to specify a temporary directory for file I/O outputs, enhancing compatibility with systems that restrict writing to the current directory.
-#1414: Implemented plog_complex to handle vectorized complex functions.
-#1512: Add method signDeterminant() to QR and related decompositions.
-#1554: Introduced SimplicialNonHermitianLLT and SimplicialNonHermitianLDLT solvers for complex symmetric matrix handling.
-#1395: Incorporate Threadpool in Eigen Core for enhanced computation performance.
-#1696: Make fixed size matrices and arrays trivially_default_constructible.
-#1627: Implement Tensor roll function for circular shifts.
-#1777: Add support for LoongArch64 LSX architecture to enhance Eigen's capabilities.
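Entry #978 above mentions Kahan summation as the stabilizing ingredient of the sparse inverse-subset computation. A minimal standalone sketch of the textbook algorithm follows; it is not Eigen's code, and the function names `naive_sum` and `kahan_sum` are illustrative assumptions:

```cpp
#include <vector>

// Plain left-to-right accumulation, for comparison.
double naive_sum(const std::vector<double>& xs) {
  double sum = 0.0;
  for (double x : xs) sum += x;
  return sum;
}

// Kahan (compensated) summation: `comp` carries the low-order bits
// that were rounded away when the previous term was added to `sum`.
double kahan_sum(const std::vector<double>& xs) {
  double sum = 0.0;
  double comp = 0.0;
  for (double x : xs) {
    const double y = x - comp;  // re-inject the previously lost part
    const double t = sum + y;   // low-order bits of y may be lost here
    comp = (t - sum) - y;       // recover what was lost (negated)
    sum = t;
  }
  return sum;
}
```

The compensation only works when the compiler preserves the written order of floating-point operations, so this sketch assumes it is built without options such as -ffast-math that permit reassociation.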
-#544: Added support for Eigen::Block types to GDB pretty printer.
-#607: Added a flowchart to aid in selecting unsupported sparse iterative solvers.
-#610: Updated CMake configuration to require at least C++11, centralizing standard settings for clarity and maintainability.
-#605: Updated RandomSetter in SparseExtra to use unordered_map for improved performance.
-#543: Fix PEP8 and formatting issues in GDB pretty printer.
-#611: Included unordered_map header to enhance functionality related to unordered maps.
-#613: Fixes Eigen::fix<N> and symbolic_index test for environments without variable template support.
-#614: Enhances LAPACK test compatibility with newer Fortran compilers, addressing argument mismatches.
-#615: Include intrin header for better compatibility on Windows ARM.
-#616: Disable CUDA Eigen::half vectorization on host for versions before 10.0.
-#621: Improved compatibility and reduced warnings for GCC 4.8 on ARM.
-#619: Fixed documentation for unsupported linear solvers.
-#629: Fix EIGEN_OPTIMIZATION_BARRIER for arm-clang to enhance compatibility.
-#628: Renamed 'vec_all_nan' to resolve ppc64le build failures.
-#485: Remove deprecated package config variables from CMake configuration.
-#622: Rename Tuple to Pair and introduce a Tuple class enhancing GPU compatibility.
-#632: Simplified CMake configuration by removing unused EIGEN_DEFINITIONS.
-#633: Use ARCH_INDEPENDENT versioning in CMake for improved package configuration management.
-#617: Enhanced matrixmarket reader/writer to support various dense matrices.
-#634: CMake now populates package registry by default for improved package management.
-#635: Fixed tridiagonalization_inplace_selector to align hCoeffs vector with MatrixType, enhancing stability.
-#636: Remove stray DynamicSparseMatrix references to clean and maintain codebase.
-#637: Removed extra DynamicSparseMatrix references and fixed typos for improved code clarity.
-#638: Added missing packet types in the pset1 call to improve robustness.
-#482: Add LLDB Pretty Printer for enhanced debugging of Eigen matrices and vectors.
-#624: Add Serializer<T> for binary serialization to enhance GPU testing.
-#641: Removed unnecessary std::tuple reference to simplify the code.
-#631: Issue an error when internal headers are included directly to enhance code safety.
-#625: Introduce new GPU test utilities enhancing execution flexibility across CPU and GPU.
-#643: Minor fix for compilation error on HIP to enhance compatibility.
-#645: Introduced a default constructor for eigen_packet_wrapper, simplifying memory operations with memcpy.
-#647: Clean up static assertions to use standard C++11 static_assert for better error messages and performance.
-#651: Removed -fabi-version=6 flag from AVX512 builds for improved compatibility and performance.
-#646: Add buildtests_gpu and check_gpu to simplify GPU testing.
-#653: Disabled subtests on HIP to maintain test stability due to device side malloc/free limitations.
-#652: Added a macro to pass arguments to ctest for running tests in parallel.
-#654: Silence string overflow warning for GCC in initializer_list_construction test.
-#656: Fix strict aliasing bug causing product_small failure, enhancing reliability of small matrix operations.
-#655: Implemented parallel execution of CI tests on all CPU cores, enhancing test speed and efficiency.
-#657: Fix implicit conversion warnings in tuple_test to enhance type safety and code clarity.
-#572: Removed unnecessary const when returning by value to enhance code readability.
-#660: Fix various typos to enhance clarity and professionalism in the documentation.
-#661: Fixes typographical errors to improve readability and professionalism.
-#664: Disable testing of complex compound assignment operators for MSVC to prevent compilation issues.
-#671: Improved GPU special function test accuracy by aligning with scipy.
-#669: Optimized tensor_contract_gpu test by reducing contraction count to prevent timeouts.
-#667: Speed up tensor reduction using loop strip mining and unrolling techniques for enhanced performance.
-#678: Moved CUDA/Complex.h to GPU/Complex.h and removed deprecated TensorReductionCuda.h.
-#665: Fix tuple compilation issues for Visual Studio 2017 by replacing tuple alias with TupleImpl.
-#666: Fix MSVC+NVCC EIGEN_INHERIT_ASSIGNMENT_EQUAL_OPERATOR compilation issues.
-#676: Improve accuracy of full tensor reduction for half and bfloat16.
-#686: Revert bit_cast to use memcpy for CUDA to avoid undefined behavior.
-#687: Add nan-propagation options to matrix and array plugins for improved handling of NaN in min/max operations.
-#691: Fix -Wbitwise-instead-of-logical clang warning to enhance code clarity and correctness.
-#693: Clarified documentation on inner stride for compile-time vectors in Stride class.
-#692: Extend EIGEN_QT_SUPPORT to Qt6 for better compatibility.
-#688: Introduced nan-propagation options to enhance NaN value handling in matrix and array plugins.
-#696: Removed const from visitor return type to fix compatibility issues on ARM and PPC.
-#689: Fixed index-out-of-bounds error in broadcasting for vectorized 1-dimensional inputs.
-#698: Enhance CommaInitializer to reuse fixed dimensions, improving performance and consistency.
-#695: Fix boostmultiprec test to compile with older Boost versions, enhancing compatibility.
-#681: Addressed integer overflows in EigenMetaKernel indexing to enhance robustness.
-#701: Updated alignment qualifier placement for consistency and addressed compiler warnings.
-#700: Vectorize fp16 tanh and logistic functions on Neon for enhanced performance.
-#694: Fix ZVector build issues on s390x and enhance test reliability.
-#697: Streamline CMake scripts to enhance subproject integration, reducing unnecessary test builds.
-#703: Fix NaN propagation in min/max functions for scalar inputs.
-#702: Introduces AVX vectorized implementation for float2half and half2float, boosting performance 3x.
-#705: Fix TensorReduction warnings and error bound for sum accuracy test.
-#707: Fix total deflation issue in BDCSVD when M is diagonal and add unit tests.
-#709: Fixed BDCSVD's total deflation logic to enhance performance with diagonal matrices.
-#714: Fix uninitialized matrix in nestbyvalue test to enhance reliability.
-#712: Enhance documentation for Quaternion constructor from MatrixBase, clarifying element order.
-#711: Fixes bug in macro definition EIGEN_HAS_FP16_C for non-Clang compilers.
-#715: Fix failing test for tensor reduction by comparing results against forward error bounds.
-#713: Avoid integer overflow in EigenMetaKernel indexing to enhance reliability and prevent CUDA errors.
-#121: Introduced a make format command for automatic code formatting.
-#720: Corrected a typo in the documentation to enhance clarity.
-#717: Moved prune function to SparseVector.h to improve sparse matrix storage organization.
-#718: Use consistent StorageIndex across SparseMatrix implementations.
-#327: Reimplemented Tensor stream output for enhanced flexibility and consistency.
-#723: Fix tensor broadcast off-by-one error to enhance robustness.
-#722: Enhanced efficiency in Umeyama algorithm by optimizing computation when scaling is not required.
-#724: Enhanced TensorIO to support TensorMap with const elements.
-#719: Fixed Sparse-Sparse Product in case of mixed StorageIndex types enhancing robustness and reliability.
-#728: Fix errors for Windows build and enhance compatibility.
-#726: Added basic iterator support for Eigen::array to ease transition to std::array.
-#725: Removed deprecated MappedSparseMatrix to enhance maintainability.
-#727: Make numeric_limits members constexpr as per the newer C++ standards.
-#729: Implemented Eigen::array<...>::reverse_iterator for enhanced iteration capabilities.
-#733: Fix warnings about shadowing definitions to improve code clarity and maintainability.
-#732: Remove EIGEN_HAS_CXX11 to streamline codebase and enhance maintainability.
-#737: Refactored LLT macro binding to Lapacke into smaller parts for better clarity and maintainability.
-#741: Fix for HIP compilation failure in DenseBase by adding EIGEN_DEVICE_FUNC modifiers.
-#735: Removed EIGEN_HAS_CXX11_* and redundant EIGEN_COMP_CXXVER checks to streamline code and enhance maintainability.
-#742: Updated minimum CMake version to 3.10, set GCC version to 5, and removed disabling of C++11 tests.
-#730: Fixes issue #2375 related to indexed views for non-Eigen types, improving stability.
-#736: Improved handling of non-const overloads in self-adjoint and triangular views when not referring to an lvalue.
-#746: Fixed Cholesky to handle 0-sized matrices, ensuring LAPACKE-based LLT aligns with Eigen's expectations.
-#739: Disabled GCC-4.8 tests to streamline transition to C++14.
-#752: Deprecate macro EIGEN_GPU_TEST_C99_MATH to reduce code clutter and simplify maintenance.
-#748: Improved lapacke binding code for HouseholderQR and PartialPivLU.
-#755: Removed unused else branch after #ifdef removal, enhancing code clarity.
-#757: Refactored IDRS code for stability and performance enhancements using StableNorm().
-#756: Improve compatibility with toolchains lacking atomic support by conditional inclusion of <atomic>.
-#759: Fix typo StableNorm to stableNorm to ensure naming consistency.
-#762: Improved readability and reliability of documentation snippets in the Eigen library.
-#761: Removed outdated compiler checks and flags to streamline the codebase.
-#765: Disambiguate overloads for empty index list to reduce compiler warnings and improve code clarity.
-#760: Removed using namespace Eigen in sample code to promote better coding practices.
-#767: Ensure exp(-Inf) returns zero for vectorized expressions and improve AVX2 and SSE performance.
-#758: Introduced GPU unit tests for HIP using C++14.
-#763: Removed use of deprecated CMake COMPILE_FLAGS in favor of modern options.
-#768: Removed custom Find*.cmake scripts in favor of CMake's built-in support for better compatibility.
-#770: Fixed customIndices2Array to include the first index, enhancing tensor module functionality.
-#769: Added guards to enforce proper header inclusion practices for CholmodSupport.
-#753: Converted computational macros to constexpr functions for enhanced type safety and code maintainability.
-#776: Enables EIGEN_TEST_CUSTOM_CXX_FLAGS to be used as a CMake list by converting spaces to semicolons.
-#782: Fix bug introduced in !751 affecting EIGEN_IMPLIES macro handling side-effects.
-#783: Simplify logical_xor() for bool types by using a != b.
-#785: Fixed Clang warnings about alignment change and floating point precision.
-#786: Small cleanup of GDB pretty printer code to improve readability and maintainability.
-#788: Minor fixes to documentation and code warnings for improved clarity and quality.
-#790: Added missing internal namespace qualifiers to improve clarity in vectorization logic tests.
-#779: Optimize exp() for denormal results and 4% speedup.
-#793: Removed unused macro EIGEN_HAS_STATIC_ARRAY_TEMPLATE to enhance maintainability.
-#797: Add bounds checking to the Eigen serializer to enhance data integrity and reliability.
-#800: Corrected functionality of GPU unit tests for HIP after serialization API changes.
-#792: Allow specifying inner & outer stride for CWiseUnaryView, enhancing functionality and control.
-#799: Improve plog with 20% speedup for float and handle denormals.
-#802: Fixes truncation from unsigned int to bool, improving reliability of type conversions.
-#801: Fixes and cleanups for numeric_limits and psqrt bug.
-#796: Make fixed-size Matrix and Array trivially copyable using C++20 features.
-#803: Fix GCC 8.5 warning about missing base class initialization.
-#805: Ensure consistent values from scalar and vectorized paths in array.exp().
-#780: Improved accuracy and performance of logistic sigmoid function.
-#806: Fix IterativeSolverBase assertion messages to reference the correct class name.
-#795: Refactor to reduce usage of reserved names, enhancing compliance with C++ standards.
-#808: Addressed type compatibility in pmadd with explicit casting for enhanced type safety.
-#810: Fix two corner cases in logistic sigmoid to enhance accuracy and robustness.
-#809: Corrects broken assertions to enhance reliability and robustness.
-#811: Fix compilation issue with GCC < 10 and C++2a standard.
-#812: Fix implicit conversion warning in vectorwise_reverse_inplace.
-#814: Updated comment to replace reference to removed macro EIGEN_SIZE_MIN_PREFER_DYNAMIC with constexpr function.
-#813: Minor correction/clarification to LSCG solver documentation.
-#815: Fix implicit conversion warning in GEBP kernel's packing by changing data types to improve type safety.
-#819: Enhance clang warning suppressions by verifying supported warnings.
-#818: Silence MSVC warnings for cleaner builds and easier debugging.
-#772: Removed obsolete macros and implementation for better code maintainability.
-#817: Add support for packets of int64 on x86 to enhance processing efficiency.
-#821: Prevented heap allocation in diagonal matrix product by using reference types, optimizing performance.
-#822: Make casts explicit and fix type to prevent overflow issues in random test implementation.
-#825: Enhance handling of float warnings by refining comparisons and conversions.
-#827: Optimized preciprocal function for IEEE compliance, improving performance with division by zero and infinity.
-#830: Removed outdated documentation referencing C++98/03 standards.
-#835: Fixed ODR violations by removing unnamed namespaces from headers.
-#833: Fixes type discrepancy issues on 32-bit ARM by using int32_t consistently.
-#840: Correct use of EIGEN_CUDACC to respect EIGEN_NO_CUDA, preventing unwanted compilation of CUDA code when disabled.
-#838: Defined EIGEN_HAS_AVX512_MATH correctly in PacketMath to enhance AVX512 support.
-#836: Limit GCC<6.3 maxpd workaround to GCC, improving compatibility with Clang.
-#841: Consolidated and enhanced implementations of fast psqrt and prsqrt for correct handling of edge cases.
-#844: Updated MPL2 license link to use HTTPS for enhanced security.
-#843: Fix naming collisions with resolve.h to enhance code clarity and stability.
-#845: Provide a definition for numeric_limits static data members to enhance compliance with C++ standards.
-#846: Return alphas() and betas() by const reference to enhance performance and memory efficiency.
-#849: Enhance documentation with MatrixXNt and MatrixNXt details and fix namespace issues.
-#852: Add constexpr size() method to Eigen::IndexList for compile-time size evaluation.
-#853: Fix ODR failures in TensorRandom to enhance code stability and reliability.
-#855: Remove unused macros related to obsolete prsqrt implementation.
-#859: Fix MSVC+NVCC 9.2 pragma error by replacing _Pragma with __pragma.
-#850: Enhance Doxygen documentation by adding descriptions to Matrix typedefs.
-#861: Made FixedInt constexpr and fixed potential ODR issues by removing static from the variable template.
-#862: Restores fixed sizes for U/V matrices for fixed-sized inputs, reverting unnecessary dynamic sizing.
-#866: Initialize pointers to nullptr in SPQRSupport to prevent crash from invalid free() calls.
-#870: Fix test macro conflicts with STL headers in C++20.
-#865: Add assert for edge case if Thin U Requested at runtime.
-#863: Modified test expressions to ensure numerical consistency across optimization levels.
-#868: Optimizations to fast SQRT/RSQRT for enhanced performance on modern x86 processors.
-#873: Disabled deprecated warnings in SVD tests to enhance log readability.
-#874: Fix gcc-5 packetmath_12 bug by initializing memory to zeroes.
-#875: Fix compilation error in packetmath by introducing a wrapper around psqrt.
-#877: Disable deprecated warnings for SVD tests on MSVC for cleaner build logs.
-#878: Fix frexp packetmath tests for MSVC to handle non-finite inputs correctly.
-#876: Fix mixingtypes for g++-11, improving stability and performance with AVX512 operations.
-#880: Fix SVD for MSVC by addressing a critical bug with Options template parameter handling.
-#879: Improved any/all reduction operations for row-major layout.
-#882: Fixes compatibility issues with SVD implementation for MSVC+CUDA addressing Index type discrepancies and function return warnings.
-#883: Adjust tolerance of matrix_power test for MSVC to reduce test failures.
-#884: Removed overly strict non-convergence checks in NonLinearOptimization tests to improve flexibility and reliability.
-#886: Skip denormal test if Cond is false, enhancing test suite efficiency.
-#885: Fix enum conversion warnings in BooleanRedux.
-#888: Speed lscg by using .noalias to enhance computation speed in least squares conjugate gradient function.
-#851: Fixes inconsistency in JacobiSVD_LAPACKE bindings to enhance SVD module.
-#887: Enhanced vectorization_logic tests for platform compatibility and reliability.
-#864: Removed unnecessary EIGEN_UNUSED decorations to improve code clarity.
-#890: Removed duplicate IsRowMajor declaration to reduce compilation warnings.
-#891: Split and reduce SVD test sizes to optimize memory usage and improve compile times.
-#893: Introduces new CMake options for controlling build components.
-#894: Fixed tensor executor test and supported tensor packets of size 1 for better platform compatibility.
-#897: Removed obsolete copy_bool workaround for gcc 4.3, enhancing code maintainability.
-#895: Introduce move constructors for SparseSolverBase and IterativeSolverBase for enhanced flexibility.
-#889: Introduce construct_at, destroy_at wrappers to replace placement new and explicit destructor calls, improving code clarity and safety.
-#898: Fix edge-case in zeta for large inputs to prevent NaNs and ensure scipy compatibility.
-#896: Removed ComputeCpp-specific code from SYCL Vptr to enhance compatibility and performance.
-#901: Fix construct_at compilation breakage on ROCm, improving compatibility with HIP environments.
-#900: Fix swap test for size 1 inputs to enhance test reliability.
-#903: Convert bit calculation to constexpr, avoiding casts for readability and maintainability.
-#907: Enhanced PowerPC MMA flags with dynamic dispatch, improving compatibility and performance.
-#909: Removed outdated GCC-4 warning workarounds for a cleaner codebase.
-#829: Streamline codebase by replacing Eigen type metaprogramming with std types and alias templates.
-#914: Disabled Schur non-convergence test to improve reliability.
-#913: Enhanced PowerPC MMA flag handling for default builds.
-#915: Fix missing pound directive to prevent compilation errors.
-#917: Implement workaround for g++-10 docker optimization issue in geo_orthomethods_4.
-#911: Fix RowMajorBit <-> RowMajor mixup to enhance code robustness.
-#916: Updated EIGEN_ALTIVEC flags for compatibility with TensorFlow, allowing binary values and enhancing documentation.
-#923: Fix AVX512 builds with MSVC; improved compatibility and added comprehensive testing.
-#925: Fix ODR violation in trsm by marking functions as inline.
-#926: Fixed namespace usage to resolve compilation errors and enhance code stability.
-#927: Update warning suppression methods to enhance compatibility with newer compilers.
-#921: Optimize visitor traversal for RowMajor matrices to enhance performance by adjusting traversal methods to data layout.
-#918: Added missing explicit reinterprets to resolve g++ build errors.
-#930: Added a missing typename and fixed an unused typedef warning for better GCC 9 compatibility.
-#931: Re-enabled Aarch64 CI pipelines to enhance testing and validation on Aarch64 architecture.
-#892: Added is_constant_evaluated support and improved alignment checks.
-#937: Eliminated warnings related to unused trace statements for cleaner compilation.
-#924: Disable f16c scalar conversions for MSVC to enhance compatibility.
-#939: Removed .cpp file inclusions in LAPACK for better code clarity.
-#934: Fixed order of arguments in BLAS SYRK to resolve compilation errors and enhance code correctness.
-#941: Modified test_isApprox function to handle inf/nan comparisons correctly.
-#940: Reintroduced std::remove* aliases to restore compatibility for third-party libraries.
-#854: Added Scaling function overload for vector rvalue reference to improve usability and correctness.
-#904: Transformed static const members into constexpr for increased performance and optimization.
-#943: Enhanced compile-time evaluations by converting helper functions to constexpr in XprHelper.h.
-#944: Transitioned a metaprogramming utility to a constexpr function for improved compile-time evaluation and code usability.
-#942: Fixed navbar scroll issues by overriding Doxygen's initResizable() and adjusting TOC positioning.
-#945: Fixes max size expressions to ensure correct calculations and intended behavior.
-#949: Fix ODR issues in lapacke_helpers to enhance reliability and stability.
-#946: Removed legacy macro EIGEN_EMPTY_STRUCT_CTOR for improved code simplicity and maintainability.
-#953: Fixed ambiguous DiagonalMatrix constructors by clarifying initializer list usage.
-#951: Fix Power GEMV order of operations in predux for MMA, optimizing performance and fixing GCC assembly issues.
-#952: Allow all tests to pass with EIGEN_TEST_NO_EXPLICIT_VECTORIZATION to improve test stability.
-#962: Improved memory handling and performance in HouseholderSequence by eliminating unnecessary heap allocations and streamlining block logic.
-#963: Fix cwise NaN propagation for scalar input.
-#964: Fix compilation issue in HouseholderSequence.h related to InnerPanel template.
-#958: Fix compiler bugs for GCC 10 & 11 for Power GEMM to enhance compatibility and stability.
-#966: Removed need to supply the Symmetric flag to UpLo argument for Accelerate LLT and LDLT, simplifying solver usage.
-#967: Added load vector_pairs for GEMM MMA RHS and improved predux GEMV.
-#968: Made diagonal matrix cols() and rows() methods constexpr for enhanced compile-time evaluation.
-#969: Add uninstall target only if not already defined to enhance compatibility.
-#908: Fixed incorrect reference code in STL_interface.hh for ata_product to enhance reliability.
-#974: Prevent BDCSVD crash caused by index out of bounds.
-#977: Fixes BDCSVD numerical instability, enhancing robustness and reliability.
-#984: Unset the executable flag on specified files to improve file permission management.
-#985: Improve plogical_shift_* implementations and fix typo in SVE/PacketMath.h.
-#980: Avoid signed integer overflow in adjoint test to enhance reliability.
-#975: Introduces subMappers for Power GEMM packing, enhancing performance by 10%.
-#982: Avoid ambiguous Tensor comparison operators for C++20 compatibility.
-#986: Updated SYCL-2020 range handling to ensure compliance and improved parallel operation reliability.
-#976: Fixes incorrect LDLT results when using AutoDiffScalar with value 0, ensuring proper derivative handling.
-#971: Introduces R-Bidiagonalization in BDCSVD for improved performance on large matrices.
-#987: Fix integer shortening warnings in visitor tests.
-#989: Fix C++20 ambiguity of comparisons, enhancing clarity and compliance.
-#990: Introduces diagonal matrix multiplication and static initializers for zero and identity matrices.
-#991: Addressed ambiguous comparison warnings in C++20, improving TensorBase comparison operations.
-#993: Fix row vs column vector typo in Matrix class tutorial for improved clarity.
-#994: Marked index_remap as EIGEN_DEVICE_FUNC to enhance GPU utilization in Eigen's reshaping functionality.
-#995: Added documentation for DiagonalBase to enhance clarity and usability.
-#996: SYCL-Spec compliance for kernel names to enhance compatibility with SYCL-2020 specifications.
-#999: Use numext::sqrt in Householder.h to simplify custom type integration.
-#1003: Eliminate undef warnings when not compiling for AVX512 to enhance code stability.
-#1001: Skip f16/bf16 bessel specializations on AVX512 if unavailable, enhancing portability and reducing build errors.
-#1002: Fix clang-tidy warnings and reformat code for improved readability.
-#947: Introduced partial loading and storing operations to enhance memory access and performance.
-#1005: Re-enable unit tests for device side malloc after ROCm 5.2 fix.
-#1007: Fix ODR violations by replacing unnamed type with named type, enhancing stability and clarity.
-#1006: Ensure AutoDiff module includes Core dependency for better consistency.
-#1009: Corrected Doxygen group usage to enhance documentation clarity.
-#1013: Add option to disable AVX512 GEBP kernels.
-#1014: Fix aligned_realloc to call check_that_malloc_is_allowed() if ptr == 0, enhancing memory management integrity.
-#1015: Disable AVX512 GEMM kernels by default to enhance stability and prevent segmentation faults.
-#1016: Improved Emscripten compatibility by including immintrin.h header.
-#1021: Updated documentation for AccelerateSupport after PR 966.
-#1019: Avoid including with EIGEN_NO_IO to enhance compatibility with embedded systems.
-#1020: Enhanced ConjugateGradient to use numext::sqrt for compatibility with custom numeric types.
-#1023: Fix flaky packetmath_1 test for increased reliability.
-#1010: Fix inner iterator for sparse block to enhance reliability of sparse matrix operations.
-#1025: Fix use of Packet2d type for non-VSX to enhance portability and usability.
-#1028: Fix non-VSX PowerPC build to enhance compatibility and usability.
-#1027: Fix code and unit test for corner cases in vectorized pow() function.
-#1012: Fix vectorized Jacobi Rotation to utilize packet math vectorized version and ensure 'fixed-size' code path passes tests.
-#1026: Vectorize the sign operator to enhance performance for real types.
-#1030: Resolve compilation errors by avoiding double definitions of Half functions on aarch64 during GPU compilation.
-#1031: Eliminated bool bitwise warnings to improve code clarity and maintainability.
-#1032: Disable invalid deprecation warnings in BDCSVD class.
-#1033: Fix and enhance accuracy of SYCL tests and tensor operations.
-#1035: Removed unnecessary FP16C checks for AVX512 to enhance performance.
-#1034: Improved pow<double> performance by 11-15% with a new division algorithm.
-#1037: Protect new pblend implementation with EIGEN_VECTORIZE_AVX2 to enhance robustness and compatibility.
-#1039: Fixes psign for unsigned integer types, enhancing robustness and correctness.
-#1044: Added missing pointer in realloc call to improve memory management.
-#1045: Enhanced GeneralizedEigenSolver::info() reliability and error clarity.
-#1042: Addressed undefined behavior in array_cwise test caused by signed integer overflow.
-#1046: Re-enable pow function for complex types, enhancing mathematical operations in the Eigen library.
-#1043: Vectorize pow for integer base / exponent types to improve performance and robustness.
-#1038: Vectorize acos, asin, and atan for float with significant accuracy and performance enhancements.
-#1048: Fix test build errors related to new unary power functionality, improving compatibility and flexibility.
-#1049: Fixes two typos in the slicing tutorial documentation.
-#1051: Updated mixingtypes tests to accommodate changes in unary pow operation.
-#1052: Fixes CMake issues by adjusting benchmark builds and handling test dependencies.
-#1050: Add asserts for index-out-of-bounds in IndexedView to enhance error checking and prevent runtime errors.
-#1053: Fixed MSVC compilation error in GeneralizedEigenSolver.h by adding missing semi-colon.
-#1056: Reduce compiler warnings for tests, leading to cleaner build output.
-#1057: Adjusted overflow threshold bounds for pow function tests to enhance CI pipeline reliability.
-#899: Introduced C++14 constexpr support for Maps and basic operations, enhancing compile-time capabilities.
-#1061: Tweak bound for pow to account for floating-point types, improving reliability and fixing specific failures.
-#1060: Fix realloc for non-trivial types to enhance stability in memory handling.
-#1064: Fix g++-6 constexpr and C++20 build errors for better compliance.
-#1063: Address issues with unary pow() to enhance type safety and correctness.
-#1069: Removed faulty skew_symmetric_matrix3 test to improve test robustness and mitigate potential msan errors.
-#1066: Allow mixed types for pow() if exponent is exactly representable in base type.
-#1070: Fix test for pow with mixed integer types to prevent unintended conversions.
-#1077: Addressed unused-result warning in ROCm integration related to gpuGetDevice.
-#1078: Add macro to optimize GEBP kernel for NEON architecture.
-#1080: Remove unused typedef to enhance code clarity and maintainability.
-#1079: Reduce compilation time/memory for GEBP kernel using EIGEN_IF_CONSTEXPR.
-#1083: Reduces memory footprint of GEBP kernel for non-ARM targets to improve MSVC build performance.
-#1084: Vectorize atan() for double to enhance performance and accuracy.
-#1085: Fix 4x4 inverse issues when using -Ofast compilation.
-#1088: Replaced assert with eigen_assert for consistency and configurability.
-#1089: Unconditionally enable CXX11 math across the Eigen library to enhance compatibility and consistency.
-#1087: Introduced a refined range reduction strategy for atan<float>(), improving performance by 20-40% on x86 architectures.
-#1091: Enhance AttributeMacros with new macros for better clang-format compatibility.
-#1092: Removed references to M_PI_2 and M_PI_4 to improve code clarity and portability.
-#1093: Enhanced handling of NaN inputs in atan2 function to improve reliability.
-#1094: Fix warnings -Wunused-but-set-variable in Eigen/Sparse for improved code cleanliness.
-#1095: Refactor special values test for pow and add similar test for atan2 to enhance coverage.
-#1096: Fix bug in atan2 function for better cross-platform compatibility.
-#1099: Clarified that indices must be sorted in documentation.
-#1101: Modify memory functions to use 1-byte offset for improved alignment handling.
-#1105: Fix pragma check for disabling fastmath to enhance reliability and numerical stability.
-#1102: Added assert to validate outer index array size in SparseMapBase.
-#1106: Fixes offset computation in handmade_aligned_malloc to reduce compiler warnings and improve memory safety.
-#1107: Disabled patan for double precision on PPC to fix build issues.
-#1100: Enabled resizing of dynamic empty matrices to enhance flexibility and accuracy.
-#1110: Remove unused parameter name to enhance code readability and maintainability.
-#1109: Removed an unnecessary assert in SparseMapBase to allow flexibility in sparse matrix population.
-#1112: Fixes a typo in CholmodSupport for improved code readability.
-#1118: Fix ambiguity in PPC for vec_splats call by clarifying type usage.
-#1119: Implement bracket notation for unsigned type names to enhance code clarity and consistency.
-#1116: Corrected handling of floating-point zero in pnegate function by directly flipping the sign bit.
-#1113: Fix duplicate execution code for Power 8 Altivec in pstore_partial.
-#1117: Minor improvements to IDRS.h for code cleanliness and readability.
-#1120: Fix bug in handmade_aligned_realloc to enhance memory management and prevent undefined behavior.
-#1121: Add serialization for sparse matrix and sparse vector.
-#1122: Fix compiler warnings in test files to improve code quality and maintainability.
-#1125: Introduce synchronize method to all devices to enhance flexibility and testing.
-#1124: Fix sparseLU solver to handle destinations with non-unit stride.
-#1114: Modified BiCGSTAB parameters initialization to support custom types.
-#1123: Fix reshape strides when input has non-zero inner stride.
-#1127: Fix serialization and enhance robustness for non-compressed matrices.
-#1130: Fix index type for sparse index sorting.
-#1128: Enable direct access for NestByValue to enhance performance.
-#1134: Optimize equalspace packet operation for improved performance and efficiency.
-#1090: Allow std::initializer_list constructors in constexpr expressions for enhanced usability and compatibility with modern C++.
-#1135: Enhanced compatibility by removing std::raise() dependency for handling divide by zero.
-#1137: Replaced std::signbit with numext::signbit for bfloat16 compatibility.
-#1138: Update test framework for numext::signbit to enhance reliability and accuracy.
-#1139: Added operators to CompressedStorageIterator to enhance functionality and usability.
-#1144: Fix up C++ version detection macros and cmake tests to enhance compatibility and stabilize CI.
-#1145: Adjust thresholds for bfloat16 product tests to enhance reliability.
-#1140: Updated SparseLU for enhanced compatibility and fixed initialization bug.
-#1149: Fix git add . to include scripts/buildtests.in by modifying .gitignore.
-#1151: Fix EIGEN_HAS_CXX17_OVERALIGN for icc to enhance compatibility with Intel C++ Compiler.
-#1155: Fix overalign check to enhance compiler compatibility.
-#1156: Fix minor build and test issues for better reliability and performance.
-#1158: Clarified spbenchsolver help message to improve naming conventions for SPD matrices and rhs files.
-#1147: Overhauled Sparse Core to enhance performance and maintainability of sparse matrix operations.
-#1159: Re-introduced missing header for GPU tests to restore functionality.
-#1161: Fix compilation error due to unused parameter 'tmp' on clang/32-bit ARM.
-#1160: Improved insert strategy for compressed sparse matrices to enhance performance and reduce reallocations.
-#1162: Roll back QR changes to fix build error related to StorageIndex conflicts.
-#1167: Avoid move assignment in ColPivHouseholderQR to enhance stability and compatibility with compilers.
-#1165: Added missing EIGEN_DEVICE_FUNC in assertions and improved code robustness.
-#1164: Enhance performance of sparse permutations by reducing memory allocations and optimizing data handling.
-#1168: Introduce thread-local storage for is_malloc_allowed() to enhance safety in multi-threaded applications.
-#1136: Review and cleanup of compiler version checks to enhance readability and maintainability.
-#1169: Replace deprecated $<CONFIGURATION> with $<CONFIG> for CMake compliance.
-#1166: Introduces custom ODR-safe assert to enhance C++20 module compatibility.
-#1170: Enhanced sparse matrix insertion and memory management, improving performance and efficiency.
-#1172: Refactored SparseMatrix for improved code consistency and readability.
-#1175: Improved corner case handling and efficiency of atan2 function, added to numext namespace, and fixed a bug in tests.
-#1179: Disabled vectorized rsqrt to ensure consistency with generic version.
-#1181: Fix bugs exposed by enabling GPU asserts and enhance GPU computation reliability.
-#1178: Fix sparse warnings to enhance code stability and reliability.
-#1176: Optimize mathematical packet operations for better accuracy and performance.
-#1180: Fixed critical sparse bugs with outerSize == 0 to enhance stability and prevent segmentation faults.
-#1183: Fix undefined behavior in Block access, eliminating UBSan errors.
-#1185: Enhanced robustness of the atan2 function for compatibility with TensorFlow using Clang.
-#1186: Updated ForwardDeclarations.h for improved clarity and maintainability.
-#1188: Reverted StlIterators edit to address concerns about undefined behavior.
-#1190: Use VERIFY_IS_EQUAL for zero comparisons to enhance code consistency.
-#1191: Improved LAPACKE configuration for better compatibility and complex type management.
-#1192: Enhanced EIGEN_DEVICE_FUNC compatibility for CUDA 10/11/12 and cleaned up warnings.
-#1189: Improved compatibility of SkewSymmetric<?> with CUDA by adding EIGEN_DEVICE_FUNC qualifiers.
-#1198: Optimized Power module by replacing eigen_asserts with eigen_internal_asserts to reduce runtime overhead in release builds.
-#1197: Remove LGPL code to ensure MPL2 compatibility and simplify licensing.
-#1200: Removed custom implementations of equal_to and not_equal_to, leveraging C++14 capabilities.
-#1199: Add IWYU export pragmas to top-level headers to enhance compatibility with tooling like clang-tidy.
-#1206: Update ColPivHouseholderQR_LAPACKE.h to enhance type handling for LAPACK operations involving complex numbers.
-#1201: Fix ODR violation with gemm_extra_cols on PPC to prevent crashes and enhance stability.
-#1208: Revert ODR changes and inline gemm functions to improve efficiency.
-#1209: Introduced functionality to print diagonal matrix expressions directly, enhancing debugging and efficiency.
-#1212: Disable array BF16 to F32 conversions in Power architecture to enhance stability and efficiency.
-#1213: Fix compiler warnings to enhance code quality and maintainability.
-#1215: Fix compiler warnings in tests to enhance code stability and maintainability.
-#1216: Fix typo in NEON make_packet2f return value to enhance correctness.
-#1218: Implements a correction in MSVC's atan2 for consistency with the POSIX spec.
-#1220: Resolved GCC compile issues and fixed preinterpret stack overflow in NEON packetmath.
-#1219: Optimizations for pasin_float and error handling fixes for psqrt_complex.
-#1221: Guard complex sqrt on old MSVC compilers to enhance compatibility.
-#1222: Fix epsilon value in long double for double doubles to improve algorithm convergence on PPC.
-#1223: Vectorize atanh, add atan definition, and unit tests for atan.
-#1226: Use pmsub in twoprod to improve pow() performance by ~1% on Skylake.
-#1229: Fix MSAN failures in SVD tests by initializing matrix entries to improve test robustness.
-#1230: Removed EIGEN_HAS_AVX512_MATH workaround to simplify code and improve compatibility.
-#1228: Enhanced compatibility on Power architecture by fixing vec_div issues across compiler versions.
-#1196: Enhance vectorized comparisons with typed comparison support for performance boost.
-#1239: Improve test reliability for NEON integer shift operations by handling zero argument cases.
-#1243: Fixes tensor comparison test to ensure accurate results.
-#1242: Improves memory allocation efficiency during tridiagonalization in eigenvector computation.
-#1233: Vectorize any() and all() in DenseBase, enhancing performance and flexibility for large matrix operations.
-#1241: Ensure CMAKE_* cache variables are set only for top-level projects to prevent build setting modifications.
-#1245: Modify failing cwise test to ensure it passes by using .abs() to avoid overflow issues.
-#1248: Fix typo in LinAlgSVD example code to ensure successful compilation and correct output.
-#1250: Replaced 'Lesser' with 'Less' for consistency and clarity.
-#1251: Added a newline to end of file to align with coding standards.
-#1252: Work around compiler bug in Tridiagonalization.h to enhance robustness.
-#1254: Make Select implementation backwards compatible to ensure stability with older versions.
-#1256: Fix bug in minmax_coeff_visitor for matrix of all NaNs to enhance robustness.
-#1257: Align handling of PropagateFast with PropagateNaN in minmax visitor.
-#1259: Reintroduced and added deadcode checks to enhance code quality.
-#1262: Limit build and link jobs for PowerPC to reduce OOM issues.
-#1263: Fix recent PowerPC warnings and clang warning.
-#1264: Introduced EIGEN_NOT_A_MACRO to enhance compatibility with TensorFlow and avoid build issues.
-#1265: Vectorize tensor.isnan() using typed predicates for improved AVX512 performance.
-#1266: Removed pool functionalities for CMake versions less than 3.11 to streamline build processes.
-#1268: Enhanced compatibility with CMake list handling for command-line argument parsing.
-#1267: Fixed various typographical errors to enhance code readability and professionalism.
-#1269: Reverted CMake pools changes to stabilize the build process by eliminating related errors.
-#1271: Enhanced SparseMatrix with updated Map typedef and improved overflow checks in setFromTriplets.
-#1273: Replaced internal::(U)IntPtr with std::(u)intptr_t and removed ICC workaround to enhance compatibility and code clarity.
-#1234: Removed unused BLAS/LAPACK declarations to enhance maintainability and reduce signature conflicts.
-#1276: Optimized generic_rsqrt_newton_step for enhanced accuracy and performance.
-#1279: Refactor indexed view methods to enable non-const reference access with symbolic indices, improving usability and maintainability.
-#1283: Use correct truncating intrinsic for double-to-int casting to improve accuracy.
-#1148: Guarded malloc, realloc, and free() with check_that_malloc_is_allowed() and improved error handling by replacing abort with exceptions.
-#1284: Cleanup and enhance packet math by removing unused components and adding missing specializations.
-#1286: Improve handling of non-const symbolic indexed views by checking for l-value-ness.
-#1287: Prevent crash on empty tensor contraction by omitting assert and returning nullptr for size 0 allocations.
-#1288: Updated documentation for Eigen 3.4.x to resolve build errors and enhance clarity.
-#1291: Ensure Eigen/Core and Eigen/src/Core are not ignored due to the core rule on Windows.
-#1294: Improve accuracy of erf() with refined rational approximation and better clamping methods.
-#1295: Refactor IndexedView to enhance readability and maintainability by reducing SFINAE verbosity.
-#1299: Introduces BF16 pcast functions and reorganizes type casting in TypeCasting.h.
-#1298: Improved tensor select evaluator performance using select ternary op for enhanced efficiency.
-#1303: Ensure Erf() returns +/-1 above clamping and enhance performance, particularly for AVX2 on Skylake.
-#1306: Removed last occurrences of the unused enum HasHalfPacket for codebase cleanliness.
-#1308: Fix pow for uint32_t, disable pmul to enhance robustness and prevent compilation issues.
-#1309: Introduces the Abs2 function for Packet4ul, enhancing functionality.
-#1312: Fixed boolean bitwise and warning in test code.
-#1311: Fixed sparse iterator compatibility issues and deprecated function warnings on macOS using Clang.
-#1304: Specializes evaluator for scalar_cast_op to optimize handling of different packet types.
-#1316: Implemented pcmp, pmin, and pmax for Packet4ui, improving compliance and test stability.
-#1305: Enhance StridedLinearBufferCopy with half-Packet operations for improved performance.
-#1318: Set m_nonzeroSingularValues to zero when input is not finite to enhance stability.
-#1319: Fix ColMajor BF16 GEMV for mixed RowMajor vector compatibility.
-#1321: Cleaned up array_cwise test by suppressing warnings, resolving ambiguities, and removing redundant tests.
-#1322: Corrected loadColData implementation to fix BF16 GEMV compatibility with LLVM.
-#1323: Fix modulo by zero compiler warning to enhance code robustness.
-#1325: Renamed array_cwise test and suppressed compiler warnings to enhance clarity and reduce message noise.
-#1289: Moved thread pool code from Tensor to Core to enhance accessibility for future developments.
-#1324: Update ndtri to return NaN for out-of-range inputs, ensuring consistency with SciPy and MATLAB.
-#1329: Introduce macros to override synchronization primitives in Eigen ThreadPool for customization.
-#1333: Fix compiler warnings and failures in JacobiSVD and BDCSVD by initializing matrix members.
-#1334: Fix unrolled assignment evaluator to enhance access patterns for small fixed-size arrays and matrices.
-#1335: Introduced functions for adding/removing outer vectors in SparseMatrix for enhanced structure management.
-#1336: Introduces linear redux evaluators to enhance expression evaluation performance.
-#1337: Clean up Redux.h and fix vectorization_logic test after traversal order changes.
-#1339: Adjusted EIGEN_HAS_ARM64_FP16_SCALAR_ARITHMETIC for CUDA to resolve compilation issues.
-#1338: Optimized error handling in scalar_unary_pow_op for better performance and robustness.
-#1342: Reduce max relative error of prsqrt from 3 to 2 ulps.
-#1343: Improved error handling and testing for unary power functions in Eigen.
-#1344: Enhanced numerical stability in prsqrt function to prevent underflow.
-#1328: Partially vectorizes cast for enhanced performance and safety in vectorized operations.
-#1347: Introduced compile- and run-time assertions for Ref<const> construction to enhance memory layout safety.
-#1346: Introduce a move constructor for Ref<const...> to enhance performance by reducing unnecessary copying.
-#1351: Streamlined SVD testing to improve CI stability and reduce resource consumption.
-#1350: Improved safe_abs function in int_pow for better Clang compatibility.
-#1353: Removed deprecated function calls in SVD test to improve maintainability.
-#1352: Enhanced precision and performance of rint, round, floor, and ceil functions.
-#1354: Add optional offset parameter to ploadu_partial and pstoreu_partial for API consistency.
-#1355: Disabled FP16 arithmetic for arm32 to enhance stability and compatibility with Clang compiler limitations.
-#1345: Add Quaternion constructor from real scalar and imaginary vector to simplify common expressions.
-#1360: Fix ivcSize return type for enhanced type consistency and reliability.
-#1362: Fix argument for _mm256_cvtps_ph imm parameter to eliminate MSVC warning C4556.
-#1361: Fixed Altivec compilation with C++20 and higher by addressing simple-template-id issues.
-#1363: Fix use of arg function in CUDA for improved compatibility with MSVC and C++20.
-#1358: Addressed various compiler warnings to improve code stability and readability.
-#1364: Optimize check_rows_cols_for_overflow for better compile-time performance with matrix size checks.
-#1367: Fix gcc warnings by addressing subtle bugs and improving code clarity.
-#1369: Fix ARM build warnings by improving type casting and variable shadowing.
-#1370: Fixes -Waggressive-loop-optimizations warning for better compilation with gcc 10+.
-#1371: Fix -Wmaybe-uninitialized warning in SVD by initializing dimensions correctly.
-#1373: Added max_digits10 function to enhance decimal digit representation and improve floating-point serialization.
-#1372: Fix compatibility issues with TensorFlow on Power architecture, enhancing performance and reliability.
-#1376: Fix nullptr dereference issue in triangular product for zero-sized matrices.
-#1331: Add test to validate SYCL in Eigen core enhancing compatibility.
-#1377: Fix to prevent undefined behavior in triangular solves with empty systems.
-#1378: Fix clang-tidy warning by replacing std::move() with std::forward() for better handling of lvalues and rvalues.
-#1379: Fix nullptr dereference in SVD to enhance robustness and prevent runtime errors.
-#1380: Fixes undefined behavior by ensuring proper memory alignment for scalars, enhancing library stability.
-#1381: Updates boost MP test suite to reference new SVD tests for improved reliability.
-#1382: Fix tensor stridedlinearbuffercopy by preventing negative indices and enhancing robustness.
-#1383: Introduces a temporary macro for handling unaligned scalar UB to address TFLite-related issues.
-#1384: Add IWYU private pragmas to internal headers to enhance tooling capabilities.
-#1385: Rename plugin headers to .inc to improve management and usability.
-#1387: Introduced method for handling block expressions, improving block unwinding and compatibility.
-#1388: Ensure stage is not 'ok' if Pardiso returns an error, improving error handling.
-#1389: Introduces new panel modes for GEMM MMA enhancing performance for real and complex matrices.
-#1391: Exported ThreadPool symbols to silence Clang include-cleaner warnings.
-#1394: Fix extra semicolon in XprHelper to resolve compilation error with -Wextra-semi flag.
-#1396: Fixes longstanding bug in sparse triangular view iterator by restoring row() and col() functions.
-#1397: Consolidated multiple implementations of divup/div_up/div_ceil to streamline code maintenance and clarity.
-#1398: Resolved compile errors by eliminating use of _res.
-#1400: Modify div_ceil to pass arguments by value, reducing odr-usage errors.
-#1402: Work around MSVC issue in Block XprType, enhancing compatibility.
-#1404: Avoid building docs if cross-compiling or not top level to streamline the build process.
-#1399: Disable denorm deprecation warnings in MSVC C++23 for cleaner build output.
-#1406: Replaced divup with div_ceil in TensorReduction to remove deprecation warnings.
-#1407: Fix Wshorten-64-to-32 warnings in div_ceil to enhance code robustness.
-#1410: Fix int overflow in cxx11_tensor_gpu_1 test using DenseIndex.
-#1411: Fix typo to allow nomalloc test to pass on AVX512.
-#1412: Backport fix for disambiguating overloads with empty index lists to address compilation errors.
-#1408: Generalize parallel GEMM to work with ThreadPool in addition to OpenMP.
-#1413: Optimized traits<Ref>::match to use correct strides for performance enhancement and memory management.
-#1415: Linked pthread for product_threaded test to ensure successful execution of multi-threaded tests.
-#1416: Fix Wshorten-64-to-32 warning in gemm parallelizer for improved code quality.
-#1417: Fixed bug in getNbThreads() to return 1 when not parallelized.
-#1421: Gemv microoptimization improves loop performance and reduces compile-time warnings.
-#1422: Fix conversion of (u)int64_t to float on ARM to prevent data loss.
-#1424: Optimized matrix-vector operations in GeneralMatrixVector.h for improved performance.
-#1423: Introduce static assertions in Tensor constructors to ensure matching dimensions.
-#1425: Fix typecasting for arm32 to restore functionality and compatibility.
-#1419: Ensure that mc is not smaller than Traits::nr to prevent potential errors in calculations.
-#1429: Applied clang-format for consistent coding style across the codebase.
-#1430: Added .git-blame-ignore-revs file to improve git blame clarity.
-#1431: Fix scalar_logistic_function overflow for complex inputs to enhance robustness and accuracy.
-#1432: Applied clang-format-17 across the library to improve code consistency and readability.
-#1433: Improved formatting for .git-blame-ignore-revs to enhance clarity and usability.
-#1434: Fix CUDA syntax error introduced by clang-format to enhance code quality.
-#1435: Protect kernel launch syntax from clang-format to prevent syntax errors.
-#1436: Add internal ctz/clz implementation for enhanced random number generation and pointer alignment checking.
-#1439: Fix MSVC clz to correct leading zero count functionality.
-#1428: Set up clang-format in CI to ensure consistent code formatting.
-#1441: Fixed clang-format CI to run in non-interactive mode and ensured proper installation.
-#1446: Remove C++11 features from ctz/clz to restore compatibility with earlier C++ standards.
-#1409: Addressed compiler warnings and fixed significant bugs in Memory.h.
-#1448: Addressed MSAN failures by ensuring matrices are initialized, improving memory operation reliability.
-#1449: Enhanced memory safety by replacing function pointers with lambdas in GPU code with Clang and asan.
-#1447: Fix various asan errors to enhance stability and reliability by addressing memory management issues.
-#1450: Simplified stableNorm to suppress GCC warnings and improve efficiency.
-#1445: Add factor getters to Cholmod LLT/LDLT for enhanced solver functionality.
-#1438: Improve documentation of SparseLU module for enhanced user comprehension.
-#1453: Fixes TensorForcedEval copying issues to prevent memory management errors.
-#1456: Ensure pointers are checked before being freed to enhance memory safety.
-#1458: Fixes stableNorm to handle zero-sized inputs, enhancing robustness.
-#1443: Update CI with testing framework from eigen_ci_cross_testing to enhance testing processes.
-#1459: Add missing constexpr qualifier to enhance compile-time evaluation capabilities.
-#1457: Added assertions for .chip to enhance robustness and error handling.
-#1460: Reverted cleanup of stableNorm to restore performance for large vectors.
-#1461: Fix unused warnings in failtest to enhance code quality and developer experience.
-#1444: Use smaller index types to enhance robustness during resize operations.
-#1451: Fix build error due to Index/StorageIndex mismatch in SPQR::compute().
-#1466: Implements and refines assertions for dimension indices in chipping operations.
-#1454: Add half and quarter vector support to HVX architecture for improved performance.
-#1467: Fix compile-time error and enhance error detection with static asserts.
-#1469: Enhanced C++ standards compliance by removing explicit specialization, improving compatibility with gcc and MSVC.
-#1470: Various formatting improvements for better code readability and consistency.
-#1471: Updated LAPACK CPU time functions for consistency with standard naming conventions.
-#1477: Removed an obsolete relicense script to streamline the codebase.
-#1473: Update documentation for LAPACK's second and dsecnd functions to improve clarity and usability.
-#1478: Fix bug in checking subnormals to enhance accuracy in numerical operations.
-#1481: Fix CI for clang-6 cross-compilation ensuring consistent GLIBC versions.
-#1479: Fix broken formatting in Eigen::Tensor README.md.
-#1483: Use stableNorm in ComplexEigenSolver for improved result stability.
-#1482: Fix preshear transformation to restore proper functionality and add test.
-#1476: Fix several ODR violations to enhance code clarity and consistency.
-#1437: Enhance random number generation to improve entropy for 64-bit scalars.
-#1486: Fix gcc-6 bug in the rand test by adding noinline attribute to ensure correct behavior.
-#1487: Improve skew-symmetric test reliability by excluding problematic dimensions.
-#1485: Enhanced robustness of PPC testing by removing constraints on random integer generation and fixing overflow issues.
-#1488: Fix tests for bfloat16 and half scalar types, improving test reliability.
-#1489: Fix undefined behavior in getRandomBits to improve code safety and reliability.
-#1490: Fix UB in bool packetmath test by ensuring valid boolean standards and enhancing reliability.
-#1494: Fix segfault in CholmodBase::factorize() for zero matrix to enhance stability.
-#1492: Fix C++20 error related to arithmetic between different enumeration types.
-#1491: Applied clang-format to lapack/blas directories for code consistency.
-#1496: Fix division by zero UB in packet size logic.
-#1499: Eliminated warning about writing bytes directly to non-trivial type, enhancing code clarity and reducing compiler warnings.
-#1500: Implements explicit scalar conversion in ternary expressions, enhancing type safety and correctness.
-#1504: Fixes undefined behavior in pabsdiff for ARM with latest compilers, enhancing stability.
-#1507: Fix deflation in BDCSVD, enhancing stability and correctness for large matrices.
-#1506: Use traits::Options for consistency across Eigen objects.
-#1503: Fix random for custom scalars without constexpr digits() to enhance compatibility.
-#1509: Renamed generic_fast_tanh_float to ptanh_float and improved code readability.
-#1501: Introduced SIMD complex function pexp_complex for float, enhancing performance and compatibility for complex number operations.
-#1513: Fix pexp_complex_test for compliance with C++ standard.
-#1514: Fix exp complex test by using int instead of index to improve correctness and clarity.
-#1510: Enhance real Schur decomposition robustness and improve polynomial solver error checking.
-#1516: Fix GPU build for ptanh_float.
-#1517: Fix use of uninitialized memory in kronecker_product test.
-#1511: Enabled direct access for IndexedView with data method and strides for improved performance.
-#1518: Standardized header guards in key files to fix a build error.
-#1521: Fix crash in IncompleteCholesky when the input has zeros on the diagonal.
-#1520: Removed 'using namespace Eigen' from blas/common.h to prevent symbol collisions.
-#1519: Change array_size result from enum to constexpr for enhanced type safety and reduced compiler warnings.
-#1523: Speed up SparseQR, reducing execution time from 256 to 200 seconds.
-#1524: Fixed signed integer overflows in random number generation for improved stability.
-#1525: Speed up sparse x dense dot product, reducing computation time significantly.
-#1527: Delete shadowed typedefs to improve code clarity and maintainability.
-#1528: Fix QR colpivoting warnings and test failure by using numext::abs for floating-point types.
-#1531: Add degenerate checks before calling BLAS routines to handle zero-sized matrices or vectors safely.
-#1532: Updated error message to clarify the C++14 requirement, enhancing clarity for users.
-#1530: Eliminated FindCUDA CMake warning to enhance the build process.
-#1529: Fix triangular matrix-vector multiply uninitialized warning by removing const_cast and clarifying logic.
-#932: Replaced make_coherent with CoherentPadOp for improved performance in handling derivative sizes.
-#1536: Fix unaligned access in trmv to enhance memory alignment handling.
-#1537: Fix static_assert for better C++14 compatibility.
-#1538: Return 0 volume for empty AlignedBox, fixing erroneous negative volume calculation.
-#1533: Enhanced test coverage for the pexp function, increasing reliability.
-#1535: Fix deprecated anonymous enum-enum conversion warnings to enhance code compliance with C++ standards.
-#1541: Fix packetmath plog test on Windows by switching to numext::log for improved accuracy.
-#1539: Allow aligned assignment in TRMV to simplify edge-case handling and enhance code stability.
-#1540: Fix pexp test for ARM to handle flushed subnormal values in 32-bit comparisons.
-#1542: Split up cxx11_tensor_gpu tests to reduce timeouts and improve test reliability on Windows.
-#1543: Fix and enhance incomplete Cholesky decomposition handling.
-#1545: Enhancements to CwiseUnaryView for improved access and modification of complex array components.
-#1547: Fix const input and C++20 compatibility in unary view.
-#1549: Improved CwiseUnaryView const access by adding matrix mutability checks.
-#1550: Improved error messaging for unsupported rbegin/rend on GPU.
-#1553: Restore C++03 compatibility by manually constructing 2x2 matrices.
-#1551: Work around a compile issue in VS2015 by using static_cast.
-#1552: Enhanced MSVC compatibility for CwiseUnaryView by reorganizing code.
-#1557: Improved documentation in the Jacobi module by adjusting tag placement for applyOnTheRight.
-#1558: Removed slow index check in Tensor::resize for performance improvements and modernized codebase.
-#1555: Enhanced Matrix functions with constexpr for compile-time evaluation.
-#1559: Fix compatibility of SIMD intrinsics for 32-bit builds.
-#1562: Enhanced compatibility and stability by protecting the use of alloca on 32-bit ARM systems.
-#1561: Removed unnecessary 'extern C' in CholmodSupport for code simplification.
-#1564: Introduced vectorization for cross3_product, improving performance and MSVC compatibility.
-#1563: Introduced custom formatting for complex numbers, enhancing Numpy/Native compatibility.
-#1566: Fix for Packet2l on Win32 enhances compatibility and reliability.
-#1568: Fix ScalarPrinter redefinition for gcc to enhance CI reliability.
-#1570: Use truncation rather than rounding when casting Packet2d to Packet2l to enhance accuracy.
-#1560: Implemented cwiseSquare function and fixed typo in cwiseCbrt, enhancing test coverage for matrix operations.
-#1565: Enhance compile-time expressions with symbols for efficient computations.
-#1571: Fix usages of Eigen::array for std::array compatibility.
-#1515: Fix random number generation for custom float types to enhance accuracy and minimize rounding bias.
-#1575: Improved handling of long double random number generation by refining fallback to double for unsupported configurations.
-#1577: Fix preverse for PowerPC to enhance compatibility and performance.
-#1576: Fixed preprocessor condition to restore fast float logistic implementation, enhancing performance and compatibility.
-#1578: Enhancements to Geometry_SIMD.h improve SIMD operations and compatibility.
-#1581: Add constexpr to accessors in DenseBase, Quaternions, and Translations to enhance compile-time functionality.
-#1583: Optimized pexp function for up to 6% faster performance.
-#1582: Refactor indexed view to fix warnings and errors in MSVC 14.16 build.
-#1573: Fix compiler warnings for MSVC and enhance code quality and cross-platform compatibility.
-#1584: Implements optimizations for Intel pblend, improving mask creation and comparison operations.
-#1522: Introduces SIMD implementation for double precision sincos, enhancing performance using Veltkamp method and Padé approximant.
-#1591: Fix compilation problems with PacketI on PowerPC for enhanced compatibility.
-#1590: Optimized pblend functionality with blend_mask_helper and enhanced auto vectorization for improved performance.
-#1593: Specialized evaluation for ternary operations to optimize the (a < b).select(c, d) expression.
-#1594: Fix tridiagonalization_inplace_selector::run() for CUDA compatibility by adding EIGEN_DEVICE_FUNC.
-#1597: Fix autodiff enum comparison warnings to enhance code quality by reducing compilation warnings.
-#1596: Fix unused variable warnings in TensorIO to enhance code cleanliness and maintainability.
-#1598: Fix transposed matrix product bug to reduce unnecessary memory allocations.
-#1599: Prevent PPC runner from cross-compiling non-PPC targets to reduce build failures.
-#1601: Fixes sine and cosine functions on PPC by implementing a missing comparison function.
-#1602: Adjust error bound for nonlinear tests to account for AVX usage without FMA.
-#1604: Unbork AVX512 predux_mul on MSVC for improved reliability and correctness.
-#1606: Fix undefined behavior for generating inputs to the predux_mul test to ensure reliable results.
-#1607: Relaxed hard-coded error bounds in nonlinear tests to enhance cross-platform reliability.
-#1605: Removed unnecessary semicolons to enhance code readability.
-#1493: Introduce trunc operation and improve code structure for SIMD operations.
-#1600: Optimizes transposed matrix products to reduce memory usage and improve performance.
-#1610: Fix new generic nearest integer ops on GPU for enhanced compatibility and performance.
-#1609: Enhanced reliability of orthonormality tests for eigenvectors by adjusting tolerance for matrix scaling.
-#1556: Reorganize CMake for better efficiency and usability in non-top-level builds.
-#1612: Introduces new bit shifting functionalities such as logical and arithmetic shift operators for integer types.
-#1613: Improved MSVC support by utilizing MSVC functions for 128-bit integer operations.
-#1614: Fix FFT when destination does not have unit stride by using a temporary buffer.
-#1611: Fix CMake package to correctly set include path for eigen target.
-#1615: Change predux on PowerPC for Packet4i to NOT saturate the sum of the elements for consistency.
-#1616: Fixed GCC 6 compile error by removing namespace prefixes from struct specializations.
-#1618: Fixed a clerical error in Matrix class documentation.
-#1621: Added checks for valid indices in SparseMatrix::insert to enhance robustness.
-#1622: Fix ubsan failure in array_for_matrix to enhance robustness and reduce undefined behavior.
-#1624: Improved performance by eliminating int to ptr casting in aligned_alloca.
-#1619: Suppress C++23 deprecation warnings for std::has_denorm and std::has_denorm_loss to enhance compatibility.
-#1625: Utilize __builtin_alloca_with_align for improved memory allocation efficiency.
-#1623: Reformatted EIGEN_STATIC_ASSERT() as a statement macro for consistency and maintainability.
-#1620: Fix compilation failures on constexpr matrices with GCC 14.
-#1628: Improve threading test reliability by adjusting header file inclusion and resolving C++20 warnings.
-#1629: Vectorize isfinite and isinf functions for improved performance.
-#1630: Fix warnings about repeated definitions of macros, enhancing code reliability.
-#1631: Suppressed GCC warnings on enum comparisons to enhance code quality.
-#1632: Vectorized allFinite() function achieving significant performance enhancement.
-#1633: Resolved warnings stemming from previous fixes, enhancing code quality and maintainability.
-#1635: Fixed warning C5054 by ensuring type-safe enum comparisons.
-#1636: Allow pointer_based_stl_iterator to conform to the contiguous_iterator concept in C++20, enhancing compatibility with modern C++ ranges.
-#1637: Fixed scalar pselect to handle NaN values consistently in MSVC's fast-math mode.
-#1644: Add async support for 'chip' and 'extract_volume_patches' with extensive testing.
-#1653: Corrected numerous typographical errors to enhance code clarity.
-#1649: Fix compiler warnings in BDCSVD by using placement new for object initialization.
-#1659: Updated .clang-format to support JavaScript files, improving formatting process.
-#1660: Updated eigen_navtree_hacks.js to enhance performance and usability.
-#1654: Introduced alignment macro to reduce atomic false sharing in RunQueue, enhancing multithreaded performance.
-#1656: Fixes multiple typos to enhance code readability and professionalism.
-#1645: Removed implicit 'this' capture in lambdas to prevent Clang warnings and enhance code clarity.
-#1658: Fix pi in kissfft to enhance accuracy in computations.
-#1651: Fixes and enhances conversion of Eigen::half to _Float16 in AVX512 to resolve compilation issues and improve code robustness.
-#1648: Fix Woverflow warnings in PacketMathFP16 with explicit short casts.
-#1661: Modified hlog to allow symbol lookup in local namespaces, improving function flexibility and consistency.
-#1650: Remove C++23 check around has_denorm deprecation suppression to prevent MSVC warnings.
-#1665: Cleanups to threaded product code and test for improved clarity and maintainability.
-#1666: Add a yield instruction in the two spinloops of the threaded matmul implementation to optimize CPU resource usage.
-#1668: Include the header that declares std::this_thread::yield() for improved thread management.
-#1667: Speed up StableNorm for non-trivial sizes and improve consistency between aligned and unaligned inputs.
-#1670: Speed up and improve accuracy of tanh with new rational approximation.
-#1675: Add vectorized implementation of tanh to enhance performance across various ISAs.
-#1672: Vectorize squaredNorm() for complex types to enhance performance and efficiency.
-#1677: Consolidated float and double implementations of patan() for improved performance and accuracy.
-#1678: Suppress Wmaybe-uninitialized warning in TensorVolumePatchOp by handling unreachable code.
-#1676: Fixed documentation visibility for GeneralizedEigenSolver::eigenvectors() method.
-#1681: Enhanced complex number trait handling and added tests for pnmsub.
-#1679: Suppressed Wmaybe-uninitialized warning in BDCSVD for better memory safety and code reliability.
-#1680: Enhanced TensorChipping with detection of 'effectively inner/outer' chipping for better data loading optimization.
-#1682: Added nvc++ support by fixing ARM NEON compilation errors and improving test flag handling.
-#1684: Vectorize atanh, improve standard compliance and performance for |x| >= 1.
-#1685: Fix out-of-range arguments to _mm_permute_pd to enhance stability.
-#1688: Fix bug for atanh(-1) improving stability and accuracy.
-#1690: Fixes bug in previous atanh implementation, enhancing accuracy and reliability.
-#1691: Updated NonBlockingThreadPool.h to use eigen_plain_assert for enhanced compatibility with non-C++26 projects.
-#1671: Optimized dot products with new evaluator and explicit unrolling for improved performance.
-#1692: Optimize dot product for enhanced performance on smaller vector sizes.
-#1693: Fix generic ceil for SSE2 to handle negative numbers near zero correctly.
-#1694: Make fixed size matrices and arrays trivially copy and move constructible to enhance performance and compatibility.
-#1626: Refactor code to use constexpr for data() functions, enhancing compile-time evaluation and optimization.
-#1697: Removed unneeded call to _mm_setzero_si128 to address issue #2858.
-#1698: Fix implicit conversion in TensorChipping to enhance code reliability and prevent unexpected behavior.
-#1699: Fix warning in EigenSolver::pseudoEigenvalueMatrix() for improved robustness and compatibility.
-#1700: Added debugging info to float_pow_test_impl and cleaned up array_cwise tests.
-#1701: Add missing EIGEN_DEVICE_FUNC annotations to enhance CUDA compatibility.
-#1702: Added max_digits10 to NumTraits for mpreal types to enhance precision handling.
-#1703: Fix inverse evaluator for CUDA devices by marking as host+device function.
-#1706: Enhanced speed and accuracy of erf() function with reduced error and significant performance gains.
-#1707: Fixes bug to avoid NaN in erf(x) for large |x| with maintained speedup.
-#1709: Use ppolevl for polynomial evaluation to enhance maintainability and set the stage for future optimizations.
-#1708: Enhanced robustness of the atan test for 32-bit ARM platforms, resolving test failures.
-#1710: Vectorize erfc() for float to enhance accuracy and performance.
-#1711: Fix DenseBase::tail for dynamic template arguments, enhancing flexibility and usability.
-#1712: Suppressed ARM array out-of-bounds warnings for reverseInPlace function on fixed-size matrices.
-#1704: Introduced a free-function swap for dense and sparse matrices to enhance compatibility with C++ standard algorithms.
-#1715: Introduced exp2(x) as a packet op and array method, enhancing precision with reduced error rates.
-#1716: Fix stack allocation assert by relocating the static assert, reducing overhead during evaluator instantiation.
-#1714: Add nextafter for bfloat16 to enhance precision and correctness in calculations.
-#1719: Add tests for sizeof() with one dynamic dimension to enhance coverage.
-#1722: Modified handling of matrix parameters to improve internal data alignment by avoiding passing by value.
-#1720: Fix NVCC builds for CUDA 10+, enhancing compatibility and reducing warnings.
-#1718: Fix OOB access in triangular matrix multiplication to enhance robustness.
-#1723: Fix clang6 compiler issues with optimization flags.
-#1725: Enhanced ARM compatibility by fixing clang6 failures and removing SSE reliance.
-#1724: Fix macro redefinition warning in FFTW test by removing default FFT macros from CMake test declarations.
-#1721: Ensure compatibility of EIGEN_ALIGNED_ALLOCA with nvc++ by replacing __builtin_alloca_with_align.
-#1726: Fixes GPU build issues by initializing constexpr global variables for CUDA compatibility.
-#1727: Make fixed-size objects trivially move assignable for improved performance.
-#1729: Add nvc++ compiler support in Eigen v3.4 for better compatibility.
-#1731: Use EIGEN_CPLUSPLUS instead of __cplusplus for better MSVC compatibility.
-#1736: Add missing EIGEN_DEVICE_FUNCTION decorations, enhancing compatibility and performance.
-#1737: Ensure fixed-size matrices conform to std::is_standard_layout, enhancing type safety and reducing compiler warnings.
-#1739: Use numeric limits for overflow checks instead of C99 macro, enhancing type safety and compatibility.
-#1740: Use old syntax for CMake's separate_arguments() to restore compatibility with old CMake versions.
-#1735: Make element accessors constexpr to enhance compile-time usability.
-#1741: Ensure destructors needed by lldb are non-inlined for proper debugging.
-#1742: Cast enum to int in Assign_MKL.h to resolve C++20 compatibility issues.
-#1743: Vectorize erf(x) for double precision with significant speed improvements using SIMD instructions.
-#1745: Fix C++20 constexpr test compilation failures, enhancing test suite compatibility.
-#1747: Optimized erf(x) by removing redundant computations for large arguments.
-#1748: Removed unnecessary check for HasBlend trait to enhance code efficiency and readability.
-#1750: Speed up exp(x) function by 30-35%, leveraging input characteristics for performance gains.
-#1751: Reverted a commit to restore stability in debug mode builds.
-#1752: Prevent premature overflow in exp(x) and improve performance by 3-4%.
-#1754: Simplify and speed up pow() by 5-6%.
-#1755: Optimize setConstant and setZero for better performance in Eigen library.
-#1756: Improve pow(x,y) with 25% speedup and increased accuracy for integer exponents.
-#1758: Add test for using pcast on scalars to enhance testing coverage.
-#1759: Refactor special case handling in pow(x,y) and revert to repeated squaring for <float,int> to enhance accuracy and efficiency.
-#1760: Fix UB in setZero to prevent undefined behavior with null destination arrays.
-#1761: Improved map fill logic to enhance flexibility and memory access patterns.
-#1762: Fixes IOFormat alignment by correcting the rowSpacer computation.
-#1764: Updated CI configuration to use ubuntu:latest, improving build reliability.
-#1763: Documentation improvements for move constructors and move assignments.
-#1765: Introduce deploy phase to CI for tagging successful nightly builds.
-#1769: Fix special packetmath erfc flushing for ARM32 to handle subnormals.
-#1771: Update deploy job to enhance efficiency and streamline steps.
-#1772: Updated git clone strategy to improve branch update reliability.
-#1775: Remove branch name from nightly tag job to simplify the tagging process.
-#1774: Implemented equality comparison operator for matrices with different sizes.
-#1776: Use alpine for deploying nightly tag, improving deployment efficiency.
-#1779: Enable fill_n and memset optimizations for construction and assignment.
-#1785: Add missing #include <new> to resolve build issue.
-#1786: Use omp_get_max_threads if setNbThreads is not set to improve threading behavior.
-#1790: Fix read of uninitialized threshold in SparseQR, enhancing code clarity and safety.
-#1792: Fixed std::fill_n reference to resolve namespace conflicts and improve code reliability.
-#1793: Zero-initialize test arrays to avoid uninitialized reads, enhancing test reliability and memory safety.
-#1794: Clarified documentation for complex number cross product.
-#1795: Eigen::aligned_allocator modified to not inherit from std::allocator, preventing incorrect method calls.
-#1791: Introduces ForkJoin-based ParallelFor algorithm to enhance ThreadPool with improved parallel execution.
-#1797: Improved compatibility and performance for loongarch architecture.
-#1799: Fix typo in NonBlockingThreadPool to improve task management functionality.
-#1796: Updated documentation to clarify non-square dimensions for block objects.
-#1802: Fixed initialization order and removed unused variables in NonBlockingThreadPool.h.
-#1803: Enhanced compatibility of threadpool with C++14 and fixed minor warnings.
-#1804: Fix potential data race on the spin_count_ member variable of NonBlockingThreadPool to enhance thread safety.
-#1801: Enhanced Simplicial Cholesky analyzePattern with advanced algorithms for improved performance.
-#1806: Fix UTF-8 encoding errors impacting compilation on MSVC and Apple Clang.
-#1810: Changed midpoint selection in Eigen::ForkJoinScheduler to enhance reliability and prevent out-of-bounds errors.
-#1805: Introduced matrixL() and matrixU() functions for accessing L and U Factors in IncompleteLU decomposition.
-#1811: Enhanced configuration for loongarch64 emulated tests in Eigen to improve flexibility and reliability.
-#1807: Fix all the doxygen warnings, enhancing documentation clarity and accuracy.
-#1813: Increased max alignment to 256 bytes for improved performance on modern ARM architectures.
-#1812: Build and deploy nightly Doxygen docs for enhanced accessibility and up-to-date resources.
-#1814: Add missing return statements for PPC to enhance code reliability and correctness.
-#1809: Fix issues in tensor documentation by correcting class name references.
-#1815: Update check for std::hardware_destructive_interference_size to improve compatibility on Android.
-#1816: Fix Android hardware_destructive_interference_size issue to ensure compatibility with Android NDK versions 25 and lower.
-#1817: Added EIGEN_CI_CTEST_ARGS for custom test timeouts and standardized argument naming.
-#1818: Enhanced documentation generation with nightly builds and improved Doxygen configuration.
-#1823: Added graphviz to doc build to fix broken graphs.
-#1821: Fix numerical issues with BiCGSTAB to enhance performance and robustness.
-#1824: Ensures condition number is zero for non-invertible matrices, enhancing rcond estimate reliability.
-#1826: Added missing MathJax/LaTeX configuration for proper formula rendering.
-#1825: Eliminate type-punning UB in Eigen::half with a safer bit-cast approach.
-#1827: Remove assumption of std::complex for complex scalar types, enhancing flexibility for user-defined complex types.
-#1828: Enhances TensorRef with flexible type assignments and consistent immutability.
-#1829: Refactored AssignEvaluator.h for enhanced readability and maintainability.
-#1831: Enhanced compatibility for Power builds without VSX and POWER8.
-#1820: Fix Warray-bounds warning for fixed-size assignments by optimizing vectorized traversal strategies.
-#1833: Fixes Warray-bounds in inner product to enhance stability and reliability.
-#1830: Make assignment operations constexpr for compile-time evaluation.
-#1834: Initialize matrix elements in bicgstab test to enhance reliability.
-#1836: Fix implicit copy-constructor warning in TensorRef.
-#1835: Fix bitwise operation error for C++26 compatibility.
-#1837: Implemented a system to retain nightly documentation, ensuring continuous availability despite pipeline failures.
-#1838: Simplified ForkJoin code and ensured test execution, enhancing ParallelFor API and performance.
-#1839: Specify constructor template arguments for ConstexprTest struct to suppress warnings.
-#1840: Fix boolean scatter and random generation for tensors, enhancing reliability and expanding test coverage.
-#1841: Fix docs job for nightlies to ensure consistency and reliability.
-#1842: Fix CMake BOOST warning by updating configuration to resolve deprecated behavior.
-#1843: Fixes STL feature detection to support C++20, enhancing compatibility with various compilers and STL versions.
-#1844: Optimize division operations in TensorVolumePatch.h to reduce computational overhead.
-#1846: Refactor AssignmentFunctors.h to reduce redundancy and unify assignment operations.
-#1847: Fixes potential compilation errors by removing an unnecessary semicolon in DeviceWrapper.
-#1848: Improved TensorDeviceThreadPool.h by removing unused methods and enhancing functionality.
-#1849: Formatted TensorDeviceThreadPool.h and used C++20's if constexpr for enhanced readability and performance.
-#1850: Fix x86 complex vectorized FMA bugs, improving performance and accuracy.
-#1778: Added an install-doc target in CMake to improve documentation installation.
-#1851: Implemented a fix for the Givens rotation algorithm, enhancing accuracy and reliability.
-#609: Optimize predux operations on AArch64 architecture for performance enhancement.
-#489: AVX512 and AVX2 support for Packet16i and Packet8i added, enhancing vectorization capabilities.
-#630: Fixed AVX integer packet issues by adding AVX2 protection and correcting AVX512 implementation.
-#618: Added EIGEN_DEVICE_FUNC labels to resolve CUDA 9 gpu_basic compilation issues.
-#639: Fixes in AVX2 PacketMath.h improve performance and stability by correcting typos and addressing unaligned load issues.
-#623: Introduces device-compatible tuple for GPU testing, addressing compatibility issues with std::tuple in Eigen.
-#659: Fix alias violation in BFloat16 enhancing reliability on PPC platforms.
-#663: Disable more CUDA warnings to reduce compilation output clutter.
-#668: Fix Windows CMake compiler/OS detection for improved build system reliability.
-#673: Vectorized Visitor.h with AVX2, enhancing coeffMax performance and matrix decompositions.
-#679: Disabled Tree reduction for GPU to eliminate memory errors and improve stability.
-#677: Use reinterpret_cast on GPU for bit_cast to improve performance by avoiding memcpy overhead.
-#534: Preliminary support for HIP bfloat16 GPU on AMD, setting the foundation for future optimizations.
-#680: Improved PowerPC packing performance and accuracy in non-vectorized operations.
-#704: Removed problematic implementation causing g++-11 crashes.
-#716: Converted diag pragmas to nv_diag for improved code consistency and maintenance.
-#734: Select AVX2 even if the data size is not a multiple of 8 to improve vectorization.
-#745: Fix for HIP compilation breakage in selfAdjoint and triangular view classes.
-#774: Introduced fixes to enable HIP unit tests and updated CMake configuration for compatibility.
-#773: Small speed-up in row-major sparse dense product by optimizing sparse_time_dense_product_impl for better parallelism.
-#789: Include immintrin.h for F16C intrinsics when vectorization is disabled.
-#764: Add MMA and performance improvements for VSX in GEMV for PowerPC.
-#816: Port EIGEN_OPTIMIZATION_BARRIER to support soft float ARM architectures.
-#820: Add reciprocal packet op and fast specializations for float with SSE, AVX, and AVX512.
-#824: Removed inline assembly for FMA (AVX) and added packet ops: pmsub, pnmadd, pnmsub.
-#828: Fix number of block columns to prevent cache overflow on PowerPC in GEMV.
-#832: Fix AVX512 math function consistency and enable for ICC.
-#847: Cleaned up compiler warnings in GEMM & GEMV for PowerPC.
-#858: Fix and enhance sqrt/rsqrt functions for NEON with improved testing and accuracy.
-#869: Fix CMake for SYCL support, enhancing configuration and compatibility.
-#872: Enhanced sqrt/rsqrt for denormal handling improving performance on AVX512.
-#834: AVX512 optimizations for triangular solve improve performance for fp32/fp64 operations without matrix packing.
-#922: Work around MSVC compiler bug dropping const in transpose() and diagonal().
-#929: Split general_matrix_vector_product interface for Power into ColMajor and RowMajor macros to resolve TensorFlow compilation issues.
-#936: Performance improvements in GEMM for Power architecture enhancing efficiency and speed.
-#948: Fix compatibility issues with MSVC+CUDA, enhancing compilation and reducing warnings.
-#959: Restrict AVX512 trsm to AVX512VL and rename files for consistency.
-#960: Removed AVX512VL dependency in trsm to enhance compatibility and maintain performance.
-#860: Added AVX512 optimizations for matrix multiply enhancing large problem size performance.
-#983: Enhanced SYCL backend by extending QueueInterface for better integration with existing SYCL queues.
-#972: Add AVX512 s/dgemm optimizations for compute kernel to improve stability and performance.
-#988: Fix build issues with MSVC for AVX512 by disabling certain optimizations.
-#992: Improved AVX512 TRSM Kernels to respect EIGEN_NO_MALLOC configuration.
-#998: Fix tanh and erf to use vectorized version for EIGEN_FAST_MATH in VSX.
-#997: AVX512 TRSM kernels use alloca for memory allocation when EIGEN_NO_MALLOC is requested, enhancing performance under memory constraints.
-#1000: Enhanced GEMV performance for Power10 by optimizing load/store vector pairs.
-#1011: Improved pblend AVX implementation by optimizing blendv operations for better performance.
-#1024: Added Partial Packet support for GEMM real-only on PowerPC, fixed compilation warnings and reduced binary size.
-#1036: Replace malloc/free with aligned memory management in sparse classes for improved consistency.
-#1040: Specialize psign for AVX2 and avoid vectorizing psign for better performance.
-#1055: Call check_that_malloc_is_allowed() in aligned_realloc() to enhance memory management robustness.
-#1058: Add missing comparison operators for GPU packets to resolve build issues with CUDA.
-#1065: Fix for sparse matrix related breakage on ROCm.
-#1073: Add AVX int32_t pdiv for enhanced performance in integer division.
-#1076: Add vectorized integer division for int32 with AVX512, AVX or SSE, enhancing performance.
-#1018: Optimized gebp_kernel for arm64-neon using 3px8/2px8/1px8, improving performance through better register usage.
-#1086: Conditional vectorization of atan for Altivec only if VSX is available.
-#1075: Optimized sign function for complex numbers by using generic implementation only when vectorizable.
-#1111: Fixed Neon vectorization issues to enhance ARM performance and compatibility.
-#1115: Fixed a bug in the AVX2 implementation of psignbit, improving reliability and correctness.
-#1008: Add support for Power10 (AltiVec) MMA instructions for bfloat16 to enhance performance.
-#1104: Fix NEON instruction bug for half data type in 'fmla' function.
-#1131: Increased L2 and L3 cache sizes for Power10 to boost performance.
-#1129: Add BDCSVD_LAPACKE binding for improved SVD computations using LAPACKE.
-#1141: Enable NEON pabs for unsigned int types to enhance performance of absolute value operations.
-#1142: Fix incorrect NEON native fp16 multiplication.
-#1146: Enable NEON pcmp, plset, and complex psqrt for enhanced NEON support and performance.
-#1153: Fix guard macros for emulated FP16 operators on GPU to improve compatibility with CUDA.
-#1154: Improve performance for Power10 MMA bfloat16 GEMM with significant speed enhancements.
-#1150: Altivec fixes for Darwin to avoid unsupported VSX instructions on older PowerPC CPUs.
-#1126: Enabled Intel DPCPP Compiler support for Eigen's SYCL backend to enhance compatibility with SYCL-2020 features.
-#1174: Improve performance of bfloat16 MMA when dimensions are not multiples of 8 or 4.
-#1184: Fix bugs in pcmp_lt and pnegate, reactivate psqrt for pre-POWER8_VECTOR.
-#1202: Fix MSVC ARM build by resolving macro complications and improving vector type handling.
-#1207: Optimize psign for better performance in floating point operations.
-#1210: Optimized bfloat16 MMA GEMM for improved performance with an additional MMA accumulator.
-#1214: Optimize BF16 to F32 array conversions on Power architectures by reducing vector instructions.
-#1224: Add and enable Packet int divide for Power10 to enhance performance.
-#1227: Fixed null placeholder accessor issue in Reduction SYCL test for DPC++ compliance.
-#1232: Guard use of long double on GPU device to reduce warnings and prevent duplicate symbols.
-#1235: Fix ODR issues with Intel's AVX512 TRSM kernels to enhance linkage and performance.
-#1237: Fix gpu conv3d out-of-resources failure by enhancing internal variable handling.
-#1236: Added partial linear access for LHS & Output, achieving 30% faster bfloat16 GEMM MMA (Power).
-#1253: Streamline packetmath specializations for various backends using a macro to enhance readability and maintainability.
-#1249: Fix failing MSVC tests by replacing *_set1_* intrinsics to ensure consistency.
-#1255: Added MMA to BF16 GEMV for Power, achieving 5.0-6.3X speedup.
-#1258: Revert changes causing register spillage in BF16 GEMM for LLVM (Power), improving performance.
-#1270: Fix issues affecting ARM builds including missing cast, conversion issue for MSVC packets, and macro definitions for 32-bit ARM.
-#1272: Optimize casting operations for x86_64 architecture, enhancing performance, especially for bool casting.
-#1274: Optimize float->bool cast for AVX2, resulting in significant performance improvements.
-#1275: Added vectorized integer casts for x86 and removed redundant tests for performance enhancement.
-#1277: Fix incorrect casting in AVX512DQ path to enhance code reliability and performance.
-#1282: ASAN fixes for AVX512 GEMM/TRSM to address memory-related issues and enhance safety.
-#1293: Enable new AVX512 GEMM kernel by default, incorporating ASAN fixes.
-#1296: Add dynamic dispatch to BF16 GEMM (Power) and new VSX version for significant performance boost.
-#1297: Add Packet4ui, Packet8ui, and Packet4ul to the SSE/AVX PacketMath.h headers for improved SIMD operations with unsigned integers.
-#1307: New VSX version of BF16 GEMV for Power architecture, achieving up to 6.7X performance improvement.
-#1313: Added pmul and abs2 operations for Packet4ul type under AVX2, enhancing computational efficiency and compatibility.
-#1317: Unroll F32 to BF16 loop for 1.8X faster conversions on LLVM and improved GCC handling.
-#1320: Use std::shared_ptr for FFTW/IMKL FFT plan implementation to enhance memory management.
-#1327: Fixes CUDA compilation issues by rearranging header inclusions and adding necessary includes.
-#1341: Replaced CudaStreamDevice with GpuStreamDevice in tensor benchmarks for improved accuracy and reliability.
-#1349: Fixed AVX pstore implementation for correct aligned store with integer types.
-#1356: Unconditionally define EIGEN_HAS_ARM64_FP16_VECTOR_ARITHMETIC for ARM to enhance compilation stability.
-#1357: Fix supportsMMA to obey EIGEN_ALTIVEC_MMA_DYNAMIC_DISPATCH compilation flag and compiler support.
-#1359: Fix AVX512 nomalloc issues in trsm by disabling inappropriate memory allocations.
-#1365: Added missing pcasts for x86 architectures to enhance type conversion capabilities and cleaned up code.
-#1375: Added architecture definition files for Qualcomm Hexagon Vector Extension (HVX) to enhance compatibility and performance.
-#1386: Improved ARM32 float division accuracy and reliability by refining methods and adjusting tests.
-#1392: Fix call to static functions from device by adding EIGEN_DEVICE_FUNC attribute to run methods.
-#1393: Update to use ROCM_PATH instead of HIP_PATH for ROCm 6.0 compatibility.
-#1455: Introduced MI300 related test support for ROCm platforms.
-#1468: Fixes ARM32 issues by enhancing floating-point computations and accuracy.
-#1505: Disable float16 packet casting when native AVX512 f16 is available to enhance stability and correctness.
-#1495: Optimized JacobiSVD by removing unnecessary member variables for better performance and memory efficiency.
-#1526: Fix MSVC GPU build by resolving allocate() function conflict for better MSVC and NVCC compatibility.
-#1544: Introduced Packet2l for efficient int64_t operations with SSE, enhancing integer computation capabilities.
-#1546: Add support for casting between double and int64_t for SSE and AVX2.
-#1567: Improved 32-bit support by addressing double to int64 conversion issues and adding smoketests for Windows.
-#1569: Optimized SparseMatrix move operations for improved performance.
-#1572: Implemented AVX2 vectorized casting from double to int64_t, with performance and code clean-ups.
-#1574: Guard Packet4l definition in AVX to avoid conflicts and enhance stability.
-#1580: Add support for Packet8l to AVX512, optimizing performance and compatibility.
-#1585: Handle missing AVX512 intrinsic, improving stability and reliability with GCC.
-#1588: Fix build for pblend and psin_double, pcos_double when AVX but not AVX2 is supported.
-#1592: Enhanced psincos for PPC and fixed ARM32 test failures.
-#1595: Enhance CI scripts with Windows fixes and performance testing additions.
-#1641: Implemented AVX512F-based casting from double to int64_t for performance enhancement.
-#1639: Fix AVX512FP16 build failure by implementing vectorized cast specializations.
-#1655: Optimize ThreadPool spinning for enhanced performance and reduced latency.
-#1662: Enhanced performance of complex matrix multiplication with dynamic block panel size adjustment, improving speed by 8-33%.
-#1663: Optimized complex multiplication using vfmaddsub for SSE/AVX, enhancing performance.
-#1669: Introduces ARM NEON complex intrinsics for enhanced performance in complex number computations.
-#1673: Improved SVE intrinsic performance by using "_x" suffix instead of "_z", reducing instruction overhead.
-#1683: Introduces SSE and AVX implementations for complex FMA to enhance performance and accuracy.
-#1689: Fix a bug for pcmp_lt_or_nan and add sqrt support for ARM SVE.
-#1733: Add missing AVX predux_any functions for enhanced vectorized performance.
-#1734: Enhanced predux_any function using AVX for performance improvements.
-#1732: Vectorize erfc(x) for double and improve accuracy and performance for float.
-#1749: Disable fill_n optimization for MSVC to improve performance.
-#1753: Re-enable vectorized erf(x) for SSE and AVX for optimized performance.
-#1767: Update ROCm Docker image to Ubuntu 22.04 for improved stability.
-#1768: Transition to Ubuntu 24.04 in ROCm Docker for improved stability.
-#1770: Experiment with Alpine for slimmer Docker builds in CI.
-#1773: CI pipeline now uses commit tags for improved traceability and reliability.
-#1787: Fix the missing CUDA device qualifier to enhance CUDA compatibility and performance.
-#1788: Removed unnecessary ToolChain PPA from CI configuration.
-#1832: Removed the fno-check-new flag for Clang to reduce warnings.
-#608: Removed c++11-off CI jobs to streamline process and focus on modern standards.
-#648: Fix typos in copyright dates.
-#662: Reorganize test main file to enhance maintainability and clarity.
-#794: Fixed duplicated header guards in AltiVec and ZVector packages to prevent conflicts.
-#842: Corrected typo in COD documentation from matrixR() to matrixT().
-#902: Temporarily disable aarch64 CI due to unavailability of Windows on Arm machines.
-#910: Reverted changes to PowerPC MMA flags due to premature merge.
-#919: Completed a missing parenthesis in tutorial to enhance code clarity.
-#1054: Fix typo in doc/TutorialSparse.dox to improve documentation clarity.
-#1074: Revert addition of C++14 constexpr support to restore stability and compatibility.
-#1143: Revert changes to type handling in CompressedStorage.h to restore previous functionality.
-#1173: Revert changes to QR tests to restore original functionality and maintain compatibility.
-#1302: Fix typo in SSE packetmath for improved code clarity.
-#1401: Fixed a typo in the comments for improved documentation clarity.
-#1452: Fix minor issues in basic slicing examples documentation.
-#1463: Revert addition of asserts for .chip due to test failures.
-#1642: Reverted a previous change to fix scalar pselect to maintain library integrity.
-#1640: Fix markdown formatting in README.md.
-#1766: Update ROCm docker image in CI to enhance reliability.
-#1800: Clean up and fix the documentation of ForkJoin.h, focusing on typos and formatting.
-#1808: Fixed minor typos in ForkJoin.h to enhance documentation clarity.