Created
March 19, 2025 13:47
MR ID/Link,Title/Subject,Description/Summary,Author,Merge Date,Category/Labels,Impacted Areas/Components,Summary | |
515 (https://gitlab.com/libeigen/eigen/-/merge_requests/515),Add random matrix generation via SVD,"Add random matrix generation via singular value decomposition as proposed by C. C. Paige and M. A. Saunders. | |
Fixes #2250. | |
### Reference issue | |
See #2250 for details. | |
### What does this implement/fix? | |
Adds a generator for random matrices with prescribed singular values. This allows for fine-tuning the difficulty of test problems with respect to the l2-norm. Two strategies for generating the singular values and unit tests for verification are included.",Kolja Brix,2021-08-23T16:00:05.986Z,NA,NA,"## Title: | |
Add random matrix generation via SVD | |
## Authors: | |
Kolja Brix | |
## Summary: | |
This merge request implements a new feature in the Eigen C++ library that enables the generation of random matrices with specified singular values using singular value decomposition (SVD). This enhancement is designed to help developers create test problems with controlled difficulty related to the l2-norm. | |
### Key Changes: | |
- Introduced a generator for random matrices based on prescribed singular values. | |
- Included two strategies for generating singular values. | |
- Added unit tests to verify the functionality. | |
### Improvements: | |
- The random matrix generation via SVD allows for more precise control over the properties of generated matrices, beneficial for testing and algorithm development. | |
### Impact: | |
This feature will significantly enhance the testing capabilities of the Eigen library, allowing users to create tailored test scenarios that better reflect the performance of algorithms under specific conditions." | |
544 (https://gitlab.com/libeigen/eigen/-/merge_requests/544),Add support for Eigen::Block types to GDB pretty printer,"### Reference issue | |
#1539 | |
### What does this implement/fix? | |
Add support for Eigen::Block types to GDB pretty printer. | |
### Additional information | |
* Thanks a lot to Allan Leal who provided the patch, see also #1539. | |
* See also !543.",Kolja Brix,2021-08-23T16:11:49.626Z,NA,NA,"## Title: | |
Add support for Eigen::Block types to GDB pretty printer | |
## Authors: | |
Kolja Brix | |
## Summary: | |
This merge request implements support for Eigen::Block types in the GDB pretty printer, enhancing the debugging experience for users of the Eigen C++ library. | |
### Key Changes: | |
- Added functionality to the GDB pretty printer to support Eigen::Block types. | |
### Improvements: | |
- Enhanced the visualization of Eigen::Block types during debugging, making it easier for developers to inspect their data structures. | |
### Impact: | |
- This improvement significantly aids developers using Eigen in debugging scenarios, as it provides clearer insights into Eigen::Block types, ultimately streamlining the development workflow." | |
606 (https://gitlab.com/libeigen/eigen/-/merge_requests/606),removed sparse dynamic matrix,"It was deprecated already. | |
It is an API break in unsupported.",Jens Wehner,2021-08-24T15:53:34.943Z,NA,NA,"## Title: | |
Removed Sparse Dynamic Matrix | |
## Authors: | |
Jens Wehner | |
## Summary: | |
This merge request removes the Sparse Dynamic Matrix from the Eigen C++ library, following its prior deprecation. | |
### Key Changes: | |
- Removal of the Sparse Dynamic Matrix class from the API. | |
### Improvements: | |
- Streamlines the library by eliminating outdated and unsupported features. | |
### Impact: | |
- Represents an API break, affecting users who relied on the Sparse Dynamic Matrix. Users will need to adapt to the absence of this feature in future versions." | |
607 (https://gitlab.com/libeigen/eigen/-/merge_requests/607),Add flowchart to unsupported sparse iterative solvers,"Add a flowchart to help people choose the best solver for their problem. | |
It is a dot graph and it is not really pretty but it gets the job done. | |
I am a dot beginner so feedback and assistance is very welcome.",Jens Wehner,2021-08-24T17:12:05.554Z,NA,NA,"## Title: | |
Add flowchart to unsupported sparse iterative solvers | |
## Authors: | |
Jens Wehner | |
## Summary: | |
This merge request introduces a flowchart designed to assist users in selecting the most appropriate sparse iterative solver for their specific problems. The flowchart is presented as a dot graph, serving as a practical tool despite its rudimentary aesthetic. | |
### Key Changes: | |
- Introduction of a flowchart for choosing sparse iterative solvers. | |
- The flowchart is implemented in a dot graph format. | |
### Improvements: | |
- Provides a visual aid for users, which enhances the decision-making process when selecting a solver. | |
### Impact: | |
This addition aids users, especially those unfamiliar with the available solvers, by guiding them towards optimal choices for their needs, potentially improving their experience and efficiency when working with the Eigen library." | |
608 (https://gitlab.com/libeigen/eigen/-/merge_requests/608),Remove c++11-off CI jobs.,This is step 1 in transitioning beyond c++03.,Antonio Sánchez,2021-08-24T17:59:45.188Z,NA,NA,"## Title: | |
Remove c++11-off CI jobs. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request removes continuous integration (CI) jobs that are designed to ensure compatibility with C++03, marking a significant step in transitioning beyond this older standard. | |
### Key Changes: | |
- Elimination of c++11-off CI jobs. | |
### Improvements: | |
- Streamlines the CI process by focusing on more modern standards. | |
### Impact: | |
- Paves the way for the Eigen library to adopt more advanced features and improvements that are available in C++11 and beyond, enhancing future development and code quality." | |
609 (https://gitlab.com/libeigen/eigen/-/merge_requests/609),optimize predux if architecture is aarch64,"### What does this implement/fix? | |
This PR is going to optimize predux, predux_min and predux_max. | |
On aarch64 with NEON, we can use the `v(add|min|max)v` intrinsics to do the reduction, because using many `vp(add|min|max)` operations slows performance.",Han-Kuan Chen,2021-08-25T19:18:55.109Z,NA,NA,"## Title: | |
Optimize predux if architecture is aarch64 | |
## Authors: | |
Han-Kuan Chen | |
## Summary: | |
This merge request focuses on optimizing the reduction operations `predux`, `predux_min`, and `predux_max` specifically for the ARM AArch64 architecture. | |
### Key Changes: | |
- Implementation of optimized versions of `predux`, `predux_min`, and `predux_max` using `v(add|min|max)v` intrinsics for AArch64. | |
- Reduction in the usage of multiple `vp(add|min|max)` operations to enhance performance. | |
### Improvements: | |
- Significant performance enhancement in reduction operations on AArch64 architecture by leveraging appropriate intrinsics. | |
### Impact: | |
- The optimization improves the efficiency of the Eigen library's reduction functionalities, particularly for applications running on AArch64 architecture, leading to faster execution times." | |
489 (https://gitlab.com/libeigen/eigen/-/merge_requests/489),AVX512 and AVX2 support for Packet16i and Packet8i added,"<!-- | |
Thanks for contributing a merge request! Please name and fully describe your MR as you would for a commit message. | |
If the MR fixes an issue, please include ""Fixes #issue"" in the commit message and the MR description. | |
In addition, we recommend that first-time contributors read our [contribution guidelines](https://eigen.tuxfamily.org/index.php?title=Contributing_to_Eigen) and [git page](https://eigen.tuxfamily.org/index.php?title=Git), which will help you submit a more standardized MR. | |
Before submitting the MR, you also need to complete the following checks: | |
- Make one PR per feature/bugfix (don't mix multiple changes into one PR). Avoid committing unrelated changes. | |
- Rebase before committing | |
- For code changes, run the test suite (at least the tests that are likely affected by the change). | |
See our [test guidelines](https://eigen.tuxfamily.org/index.php?title=Tests). | |
- If possible, add a test (both for bug-fixes as well as new features) | |
- Make sure new features are documented | |
Note that we are a team of volunteers; we appreciate your patience during the review process. | |
Again, thanks for contributing! --> | |
### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
#2244 | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
Adds support for Packet16i and Packet8i for AVX512 and AVX2 backends respectively. | |
### Additional information | |
<!--Any additional information you think is important.-->",Jakub Lichman,2021-08-25T19:38:24.142Z,NA,NA,"## Title: | |
AVX512 and AVX2 support for Packet16i and Packet8i added | |
## Authors: | |
Jakub Lichman | |
## Summary: | |
This merge request implements support for `Packet16i` using the AVX512 backend and `Packet8i` using the AVX2 backend, enhancing Eigen's capabilities for vectorization on supported hardware architectures. | |
### Key Changes: | |
- Introduction of AVX512 support for `Packet16i`. | |
- Introduction of AVX2 support for `Packet8i`. | |
### Improvements: | |
- Enhanced vectorization capabilities for integer types, potentially increasing performance on compatible CPUs. | |
- Improved compatibility with modern hardware, allowing for better optimization in computational tasks. | |
### Impact: | |
- Users leveraging the Eigen library on systems with AVX512 or AVX2 support will experience improved performance for operations involving `Packet16i` and `Packet8i`. | |
- This enhancement positions Eigen as a more efficient option for applications that demand high-performance linear algebra computations on modern hardware." | |
610 (https://gitlab.com/libeigen/eigen/-/merge_requests/610),Bump CMake files to at least c++11.,"Removed all configurations that explicitly test or set the c++ standard | |
flags. The only place the standard is now configured is at the top of | |
the main `CMakeLists.txt` file, which can easily be updated (e.g. if | |
we decide to move to c++14+). This can also be set via command-line using | |
``` | |
> cmake -DCMAKE_CXX_STANDARD=14 | |
``` | |
Kept the `EIGEN_TEST_CXX11` flag for now - that still controls whether to | |
build/run the `cxx11_*` tests. We will likely end up renaming these | |
tests and removing the `CXX11` subfolder.",Antonio Sánchez,2021-08-25T20:24:09.725Z,NA,NA,"## Title: | |
Bump CMake files to at least c++11. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request updates the Eigen C++ library's CMake configuration to set a minimum standard of C++11. It streamlines the CMake files by removing individual configurations that test or set C++ standard flags, centralizing the configuration in the main `CMakeLists.txt` file. | |
### Key Changes: | |
- Removed explicit tests and settings for C++ standard flags across the CMake files. | |
- Established C++ standard configuration solely at the top of the `CMakeLists.txt` file. | |
- Maintained the `EIGEN_TEST_CXX11` flag to control the building and running of `cxx11_*` tests. | |
### Improvements: | |
- Simplification of CMake configuration enhances maintainability and clarity. | |
- Provides flexibility to easily update the C++ standard in the future (e.g., to C++14) through a command-line option. | |
### Impact: | |
- Facilitates smoother transitions to newer C++ standards by centralizing configuration. | |
- Reduces complexity in the build configuration, making it easier for developers to understand and modify as needed." | |
605 (https://gitlab.com/libeigen/eigen/-/merge_requests/605),SparseExtra: updated RandomSetter,"As the master branch has C++11 now | |
- updated the `RandomSetter` to use unordered_map",Jens Wehner,2021-08-25T20:47:41.570Z,NA,NA,"## Title: | |
SparseExtra: updated RandomSetter | |
## Authors: | |
Jens Wehner | |
## Summary: | |
This merge request updates the `RandomSetter` component within the SparseExtra module of the Eigen C++ library. The changes leverage C++11 features, specifically using `unordered_map` for improved performance and efficiency. | |
### Key Changes: | |
- Updated `RandomSetter` to utilize `unordered_map`. | |
### Improvements: | |
- Enhanced performance of the `RandomSetter` implementation by adopting `unordered_map`, which provides average constant-time complexity for lookups and insertions. | |
### Impact: | |
- The transition to `unordered_map` is expected to speed up operations involving the `RandomSetter`, benefiting applications that fill sparse matrices in random (non-sequential) order." | |
543 (https://gitlab.com/libeigen/eigen/-/merge_requests/543),Fix PEP8 and formatting issues in GDB pretty printer.,"### What does this implement/fix? | |
Several PEP8 and formatting issues were found in the GDB pretty printer, which uses the Python interface of GDB to visualize the values of Eigen data structures. These get fixed in this MR. | |
### Additional information | |
Details on which PEP8 errors or warnings were found are listed in the individual commit messages.",Kolja Brix,2021-08-26T15:22:28.836Z,NA,NA,"## Title: | |
Fix PEP8 and formatting issues in GDB pretty printer. | |
## Authors: | |
Kolja Brix | |
## Summary: | |
This merge request addresses several PEP8 and formatting issues identified in the GDB pretty printer for the Eigen C++ library, which facilitates the visualization of Eigen data structures using GDB. | |
### Key Changes: | |
- Resolved multiple PEP8 violations in the code. | |
- Improved formatting of the GDB pretty printer. | |
### Improvements: | |
- Enhanced code readability and consistency with Python standards. | |
### Impact: | |
- The improvements ensure better integration and usability of the GDB pretty printer when visualizing Eigen data structures, contributing to a more efficient debugging process for users." | |
611 (https://gitlab.com/libeigen/eigen/-/merge_requests/611),included unordered_map header,fixes https://gitlab.com/libeigen/eigen/-/issues/2311,Jens Wehner,2021-08-27T16:53:32.855Z,NA,NA,"## Title: | |
Included unordered_map header | |
## Authors: | |
Jens Wehner | |
## Summary: | |
This merge request addresses a specific issue within the Eigen C++ library by including the unordered_map header file. | |
### Key Changes: | |
- Inclusion of the `<unordered_map>` header in the library code. | |
### Improvements: | |
- Resolves issue #2311 by including the header that `std::unordered_map` usage requires. | |
### Impact: | |
- The change affects compilation only; it ensures that code using `std::unordered_map` sees its declaration." | |
613 (https://gitlab.com/libeigen/eigen/-/merge_requests/613),Fix fix<N> when variable templates are not supported.,"There were some typos that checked `EIGEN_HAS_CXX14` that should have | |
checked `EIGEN_HAS_CXX14_VARIABLE_TEMPLATES`, causing a mismatch | |
in some of the `Eigen::fix<N>` assumptions. | |
Also fixed the `symbolic_index` test when | |
`EIGEN_HAS_CXX14_VARIABLE_TEMPLATES` is 0. | |
Fixes #2308",Antonio Sánchez,2021-08-30T16:06:51.591Z,NA,NA,"## Title: | |
Fix fix<N> when variable templates are not supported. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses issues related to the `Eigen::fix<N>` implementation when variable templates are not supported in the compilation environment. | |
### Key Changes: | |
- Corrected typos in conditional checks from `EIGEN_HAS_CXX14` to `EIGEN_HAS_CXX14_VARIABLE_TEMPLATES`. | |
- Fixed the `symbolic_index` test for scenarios where `EIGEN_HAS_CXX14_VARIABLE_TEMPLATES` is set to 0. | |
### Improvements: | |
- Ensures proper functionality of `Eigen::fix<N>` in environments lacking support for variable templates. | |
- Enhances reliability of the `symbolic_index` test case under specific conditions. | |
### Impact: | |
This fix resolves issue #2308, improving the library's robustness for users who may not have C++14 variable templates enabled, thereby broadening compatibility across different compilers and environments." | |
612 (https://gitlab.com/libeigen/eigen/-/merge_requests/612),Add EIGEN_TENSOR_PLUGIN support per issue #2052.,"### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
Fixes #2052 | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
Adds support for using EIGEN_TENSOR_PLUGIN, EIGEN_TENSORBASE_PLUGIN, and EIGEN_READONLY_TENSORBASE_PLUGIN to expand the functionality of those classes. | |
### Additional information | |
<!--Any additional information you think is important.-->",Turing Eret,2021-08-30T19:36:56.697Z,NA,NA,"## Title: | |
Add EIGEN_TENSOR_PLUGIN support per issue #2052. | |
## Authors: | |
Turing Eret | |
## Summary: | |
This merge request enhances the Eigen C++ library by adding support for EIGEN_TENSOR_PLUGIN, EIGEN_TENSORBASE_PLUGIN, and EIGEN_READONLY_TENSORBASE_PLUGIN. This expansion aims to improve the functionality of the tensor-related classes within the library. | |
### Key Changes: | |
- Implementation of EIGEN_TENSOR_PLUGIN support. | |
- Integration of EIGEN_TENSORBASE_PLUGIN and EIGEN_READONLY_TENSORBASE_PLUGIN functionalities. | |
### Improvements: | |
- Enhances the capabilities of tensor classes in Eigen, allowing for more flexible and powerful tensor manipulations. | |
### Impact: | |
- Users of the Eigen library will benefit from increased tensor functionality, potentially improving performance and ease of use in tensor operations." | |
614 (https://gitlab.com/libeigen/eigen/-/merge_requests/614),Lapack flags,Allow old Fortran code for LAPACK tests to compile despite argument mismatch errors (REAL passed to COMPLEX workspace argument) with GNU Fortran 10.,Rasmus Munk Larsen,2021-08-30T20:24:00.026Z,NA,NA,"## Title: | |
Lapack flags | |
## Authors: | |
Rasmus Munk Larsen | |
## Summary: | |
This merge request addresses compilation issues in older Fortran code for the LAPACK tests. As the title indicates, it adjusts build flags so that these tests compile despite argument mismatch errors with GNU Fortran 10. | |
### Key Changes: | |
- Adjusted build flags for the LAPACK tests so that argument mismatch errors (specifically, REAL passed to a COMPLEX workspace argument) no longer stop compilation. | |
### Improvements: | |
- Enhances compatibility of the LAPACK tests with newer Fortran compilers, particularly GNU Fortran 10. | |
### Impact: | |
- Ensures that the LAPACK test suite remains functional and accessible, minimizing barriers for users working with legacy code in LAPACK, and improving overall testing consistency." | |
615 (https://gitlab.com/libeigen/eigen/-/merge_requests/615),win: include intrin header for Windows on ARM,"It is necessary to include the intrin header for BitScanReverse and BitScanReverse64. | |
Fixes #2314",Ádám Kallai,2021-08-31T15:14:03.743Z,NA,NA,"## Title: | |
win: include intrin header for Windows on ARM | |
## Authors: | |
Ádám Kallai | |
## Summary: | |
This merge request addresses the need to include the intrin header for the BitScanReverse and BitScanReverse64 functions specifically on Windows ARM architecture. | |
### Key Changes: | |
- Inclusion of the intrin header file for Windows on ARM. | |
### Improvements: | |
- Enhanced compatibility and functionality of the Eigen library on Windows ARM by ensuring necessary intrinsic functions are accessible. | |
### Impact: | |
- This change resolves issue #2314 by making BitScanReverse and BitScanReverse64 available on Windows on ARM." | |
616 (https://gitlab.com/libeigen/eigen/-/merge_requests/616),Disable cuda Eigen::half vectorization on host.,"All cuda `__half` functions are device-only in CUDA 9, including | |
conversions. Host-side conversions were added in CUDA 10. | |
The existing code doesn't build prior to 10.0. | |
All arithmetic functions are always device-only, so there's | |
therefore no reason to use vectorization on the host at all. | |
Modified the code to disable vectorization for `__half` on host, | |
which required also updating the `TensorReductionGpu` implementation | |
which previously made assumptions about available packets.",Antonio Sánchez,2021-08-31T19:29:19.377Z,NA,NA,"## Title: | |
Disable cuda Eigen::half vectorization on host. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses issues related to the vectorization of `Eigen::half` types on the host side for CUDA versions prior to 10.0. Given that all `__half` functions in CUDA 9 are device-only, the code has been modified to disable host-side vectorization for `__half`. | |
### Key Changes: | |
- Disabled vectorization of `__half` types on the host. | |
- Updated the `TensorReductionGpu` implementation to align with the new approach regarding available packets. | |
### Improvements: | |
- Ensured compatibility with CUDA versions prior to 10.0 by preventing the use of host-side conversions that were not supported until CUDA 10. | |
- Simplified the handling of `__half` types by eliminating unnecessary host vectorization. | |
### Impact: | |
This change enhances the build reliability of the Eigen C++ library when using CUDA versions prior to 10.0 and streamlines the code by removing host-side vectorization, which is not applicable for `__half` types." | |
621 (https://gitlab.com/libeigen/eigen/-/merge_requests/621),GCC 4.8 arm EIGEN_OPTIMIZATION_BARRIER fix (#2315).,"GCC 4.8 doesn't seem to like the `g` register constraint, failing to | |
compile with ""error: 'asm' operand requires impossible reload"". | |
Tested `r` instead, and that seems to work, even with latest compilers. | |
Also fixed some minor macro issues to eliminate warnings on armv7. | |
Fixes #2315.",Antonio Sánchez,2021-08-31T20:37:12.522Z,NA,NA,"## Title: | |
GCC 4.8 arm EIGEN_OPTIMIZATION_BARRIER fix (#2315). | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses compilation issues encountered with the Eigen library when using GCC 4.8 on ARM architecture. It modifies the assembly constraints to improve compatibility and resolve warnings. | |
### Key Changes: | |
- Replaced the `g` register constraint with the `r` constraint for better compatibility with GCC 4.8. | |
- Fixed minor macro issues, leading to the elimination of warnings on armv7. | |
### Improvements: | |
- Enhanced code compatibility with older GCC versions, facilitating smoother compilation on ARM platforms. | |
- Reduced warnings, leading to cleaner build outputs. | |
### Impact: | |
These changes ensure that the Eigen library compiles successfully with GCC 4.8 on ARM, improving usability for developers using this toolchain and enhancing overall compatibility." | |
619 (https://gitlab.com/libeigen/eigen/-/merge_requests/619),fixed unsupported linear solvers documentation,Fixed the header of the unsupported sparse iterative solvers and deleted a commented out include which no longer exists.,Jens Wehner,2021-08-31T23:15:13.631Z,NA,NA,"## Title: | |
Fixed unsupported linear solvers documentation | |
## Authors: | |
Jens Wehner | |
## Summary: | |
This merge request addresses documentation issues within the Eigen C++ library's unsupported sparse iterative solvers. It fixes the header section and removes outdated code. | |
### Key Changes: | |
- Updated the header of the unsupported sparse iterative solvers. | |
- Deleted a commented-out include that is no longer relevant. | |
### Improvements: | |
- Enhanced clarity and accuracy of the documentation for users working with unsupported linear solvers. | |
### Impact: | |
- The changes improve the overall documentation quality, helping users better understand the sparse iterative solvers and reducing confusion regarding obsolete code references." | |
629 (https://gitlab.com/libeigen/eigen/-/merge_requests/629),Fix EIGEN_OPTIMIZATION_BARRIER for arm-clang.,"Clang doesn't like !621, needs the ""g"" constraint back. | |
The ""g"" constraint also works for GCC >= 5. | |
This fixes our gitlab CI.",Antonio Sánchez,2021-09-01T16:39:50.615Z,NA,NA,"## Title: | |
Fix EIGEN_OPTIMIZATION_BARRIER for arm-clang. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses an issue with the EIGEN_OPTIMIZATION_BARRIER for the arm-clang compiler, reinstating the ""g"" constraint required for compatibility. | |
### Key Changes: | |
- Reintroduced the ""g"" constraint to the EIGEN_OPTIMIZATION_BARRIER for arm-clang compatibility. | |
### Improvements: | |
- Enhanced compatibility with GCC versions 5 and above by retaining the ""g"" constraint. | |
### Impact: | |
- Resolves CI failures in the GitLab environment, ensuring stability and reliability in the build process for users of arm-clang." | |
628 (https://gitlab.com/libeigen/eigen/-/merge_requests/628),Rename 'vec_all_nan' of cxx11_tensor_expr test because this symbol is used by altivec.h,"Rename 'vec_all_nan' of cxx11_tensor_expr test because this symbol is used by altivec.h | |
This patch fixes the build failures of ppc64le tests.",Maxiwell S. Garcia,2021-09-01T17:00:06.170Z,NA,NA,"## Title: | |
Rename 'vec_all_nan' of cxx11_tensor_expr test | |
## Authors: | |
Maxiwell S. Garcia | |
## Summary: | |
This merge request addresses the build failures encountered in the ppc64le tests by renaming the symbol 'vec_all_nan' in the cxx11_tensor_expr test. The previous name conflicts with a symbol in altivec.h, leading to compilation issues. | |
### Key Changes: | |
- Renamed the symbol 'vec_all_nan' in the cxx11_tensor_expr test. | |
### Improvements: | |
- Resolved build failures specific to ppc64le architecture. | |
### Impact: | |
This change ensures successful compilation and testing on the ppc64le platform, enhancing the reliability of the Eigen C++ library in diverse environments." | |
485 (https://gitlab.com/libeigen/eigen/-/merge_requests/485),cmake: remove deprecated package config variables,"### What does this implement/fix? | |
This MR removes deprecated `EIGEN3_*` variables and related CMake files that were provided for compatibility purposes with previously provided find module. Their use, however, was discouraged and therefore declared deprecated almost five years ago with the introduction of the relocatable package config in 5c516e4e0a1290b9a233c8f3c379fd6bde5ef9c2. | |
<!--Please explain your changes.--> | |
### Additional information | |
<!--Any additional information you think is important.--> | |
Removing deprecated CMake variables exposed by the package config can help avoid subtle errors that can be caused by intermixing `FindEigen3.cmake` and the `Eigen3Config.cmake` package config. The documentation does not mention the deprecated CMake variables anyway. The changes also effectively eliminate the need for workarounds required to solve #1386. | |
We should also likely remove `FindEigen3.cmake` (and maybe `FindEigen2.cmake`?) which mostly duplicate `Eigen3Config.cmake` while preventing best CMake practices by caching all the paths and increasing the maintenance burden at no additional value.",Sergiu Deitsch,2021-09-01T17:22:28.249Z,NA,NA,"## Title: | |
CMake: Remove Deprecated Package Config Variables | |
## Authors: | |
Sergiu Deitsch | |
## Summary: | |
This merge request removes the deprecated `EIGEN3_*` variables and related CMake files, which were declared deprecated almost five years earlier with the introduction of the relocatable package config. | |
### Key Changes: | |
- Eliminated deprecated `EIGEN3_*` variables from the CMake configuration. | |
- Removed related CMake files that were used for compatibility with obsolete find modules. | |
### Improvements: | |
- Reduces potential errors by preventing the intermixing of `FindEigen3.cmake` and `Eigen3Config.cmake`. | |
- Clears out unnecessary components, simplifying the build system. | |
### Impact: | |
- Enhances CMake best practices and maintenance by removing outdated configurations. | |
- Potentially addresses workarounds needed for previously identified issues (e.g., #1386). | |
- Suggests that further deprecated modules like `FindEigen3.cmake` and `FindEigen2.cmake` may also be removed in the future to streamline the library." | |
630 (https://gitlab.com/libeigen/eigen/-/merge_requests/630),Fix AVX integer packet issues.,"Most are instances of AVX2 functions not protected by | |
`EIGEN_VECTORIZE_AVX2`. There was also a missing semi-colon | |
for AVX512.",Antonio Sánchez,2021-09-01T21:32:51.342Z,NA,NA,"## Title: | |
Fix AVX integer packet issues. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses issues related to AVX integer packets in the Eigen C++ library. It ensures that AVX2 functions are guarded by the `EIGEN_VECTORIZE_AVX2` macro and also adds a missing semi-colon in the AVX512 implementation. | |
### Key Changes: | |
- Added protection for AVX2 functions using `EIGEN_VECTORIZE_AVX2`. | |
- Fixed a missing semi-colon in the AVX512 code. | |
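
The guarding pattern can be sketched as follows. This is a minimal illustration, not Eigen's code: the compiler-defined `__AVX2__` macro stands in for `EIGEN_VECTORIZE_AVX2`, and `add8_sketch` is a hypothetical name. The AVX2-only intrinsic `_mm256_add_epi32` is compiled only when AVX2 is enabled; otherwise a scalar loop is used.

```cpp
#include <cstdint>
#if defined(__AVX2__)
#include <immintrin.h>
#endif

// Adds eight 32-bit integers element-wise (hypothetical helper).
inline void add8_sketch(const std::int32_t* a, const std::int32_t* b, std::int32_t* out) {
#if defined(__AVX2__)
  // 256-bit integer arithmetic requires AVX2, so it sits behind the guard.
  __m256i va = _mm256_loadu_si256(reinterpret_cast<const __m256i*>(a));
  __m256i vb = _mm256_loadu_si256(reinterpret_cast<const __m256i*>(b));
  _mm256_storeu_si256(reinterpret_cast<__m256i*>(out), _mm256_add_epi32(va, vb));
#else
  // Fallback for targets without AVX2.
  for (int i = 0; i < 8; ++i) out[i] = a[i] + b[i];
#endif
}
```

Both branches use unaligned loads and stores, so the caller does not need to align the arrays.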
### Improvements: | |
- Integer packet code that needs AVX2 is now compiled only when AVX2 is enabled. | |
- Fixed a syntax error in the AVX512 code. | |
### Impact: | |
These changes address build problems in the AVX integer packet code, both for AVX targets without AVX2 and for AVX512 targets." | |
622 (https://gitlab.com/libeigen/eigen/-/merge_requests/622),(GPU Testing Part 1) Rename Tuple -> Pair.,"This is to make way for a new `Tuple` class that mimics `std::tuple`, | |
but can be reliably used on device and with aligned Eigen types. | |
The existing Tuple has very few references, and is actually an | |
analogue of `std::pair`. | |
This is part 1 of a set of changes to simplify creating generic GPU tests.",Antonio Sánchez,2021-09-02T02:36:06.654Z,NA,NA,"## Title: | |
(GPU Testing Part 1) Rename Tuple -> Pair. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request focuses on renaming the existing `Tuple` class to `Pair` and sets the groundwork for the introduction of a new `Tuple` class designed to function reliably on devices and with aligned Eigen types. | |
### Key Changes: | |
- Renamed the existing `Tuple` class, which is an analogue of `std::pair` and has very few references, to `Pair`. | |
- Freed the `Tuple` name for a new class that mimics `std::tuple`, to be added later in this series. | |
### Improvements: | |
- The planned `Tuple` class is meant to work reliably on device and with aligned Eigen types. | |
### Impact: | |
This change simplifies the process of creating generic GPU tests and prepares the framework for future enhancements related to GPU functionality in Eigen." | |
618 (https://gitlab.com/libeigen/eigen/-/merge_requests/618),Missing EIGEN_DEVICE_FUNCs to get `gpu_basic` passing with CUDA 9.,"CUDA 9 seems to require labelling defaulted constructors as | |
`EIGEN_DEVICE_FUNC`, despite giving warnings that such labels are | |
ignored. Without these labels, the `gpu_basic` test fails to | |
compile, with errors about calling `__host__` functions from | |
`__host__ __device__` functions. | |
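
A minimal sketch of the labelling described above, under stated assumptions: `SKETCH_DEVICE_FUNC` is a hypothetical stand-in for `EIGEN_DEVICE_FUNC`, and `ScaleSketch` is an illustrative type that does not exist in Eigen. Under a GPU compiler the macro expands to the host/device qualifiers; otherwise it expands to nothing, so the same code builds on the host.

```cpp
// Hypothetical stand-in for EIGEN_DEVICE_FUNC.
#if defined(__CUDACC__) || defined(__HIPCC__)
#define SKETCH_DEVICE_FUNC __host__ __device__
#else
#define SKETCH_DEVICE_FUNC
#endif

struct ScaleSketch {
  // The defaulted constructor carries the qualifier explicitly.
  SKETCH_DEVICE_FUNC ScaleSketch() = default;
  SKETCH_DEVICE_FUNC explicit ScaleSketch(float f) : factor(f) {}
  // A host/device member function that may be called from device code.
  SKETCH_DEVICE_FUNC float apply(float x) const { return factor * x; }
  float factor = 1.0f;
};
```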
With this and !616, the `gpu_basic` test now passes for CUDA 9.1.",Antonio Sánchez,2021-09-02T03:21:08.665Z,NA,NA,"## Title: | |
Missing EIGEN_DEVICE_FUNCs to get `gpu_basic` passing with CUDA 9. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses compilation issues with the `gpu_basic` test in CUDA 9. It introduces the necessary `EIGEN_DEVICE_FUNC` labels for defaulted constructors, which are required despite CUDA's warnings that these labels may be ignored. | |
### Key Changes: | |
- Added `EIGEN_DEVICE_FUNC` labels to defaulted constructors in the Eigen C++ library. | |
### Improvements: | |
- Resolved compilation errors when calling `__host__` functions from `__host__ __device__` functions. | |
- Enabled successful execution of the `gpu_basic` test under CUDA 9.1. | |
### Impact: | |
This update ensures compatibility and successful compilation of the `gpu_basic` tests in the Eigen library when using CUDA 9, thus improving the library's support for GPU functionalities." | |
632 (https://gitlab.com/libeigen/eigen/-/merge_requests/632),cmake: remove unused interface definitions,"### What does this implement/fix? | |
This MR removes `EIGEN_DEFINITIONS` from interface definitions as `EIGEN_DEFINITIONS` is not defined anywhere.",Sergiu Deitsch,2021-09-02T16:07:39.760Z,NA,NA,"## Title: | |
cmake: remove unused interface definitions | |
## Authors: | |
Sergiu Deitsch | |
## Summary: | |
This merge request simplifies the CMake configuration by removing the `EIGEN_DEFINITIONS` from interface definitions, as it was not defined anywhere in the project. | |
### Key Changes: | |
- Removed `EIGEN_DEFINITIONS` from the interface definitions in the CMake scripts. | |
### Improvements: | |
- Cleans up the CMake configuration by eliminating redundant and unused definitions. | |
### Impact: | |
- Enhances maintainability and clarity of the CMake setup by ensuring that only relevant definitions are included, which may help prevent confusion and potential errors in the build process." | |
633 (https://gitlab.com/libeigen/eigen/-/merge_requests/633),cmake: use ARCH_INDEPENDENT versioning if available,"### What does this implement/fix? | |
CMake 3.14 added the `ARCH_INDEPENDENT` option to `write_basic_package_version_file` from the `CMakePackageConfigHelpers` module greatly simplifying versioning of architecture independent package configs as used by Eigen. | |
This MR adds an alternative code path which makes use of the option. The legacy code path can be removed once Eigen increases the minimum required CMake version to at least 3.14.",Sergiu Deitsch,2021-09-02T16:09:12.711Z,NA,NA,"## Title: | |
cmake: use ARCH_INDEPENDENT versioning if available | |
## Authors: | |
Sergiu Deitsch | |
## Summary: | |
This merge request introduces a new mechanism for versioning package configurations in the Eigen C++ library, utilizing the `ARCH_INDEPENDENT` option from CMake 3.14. This enhances the management of architecture-independent packages by simplifying the versioning process. | |
### Key Changes: | |
- Implementation of an alternative code path that leverages the `ARCH_INDEPENDENT` option in `write_basic_package_version_file`. | |
- A plan to remove the legacy code path upon raising Eigen's minimum required CMake version to 3.14. | |
### Improvements: | |
- Simplified versioning for architecture-independent package configurations. | |
- Increased compatibility and adherence to modern CMake practices. | |
### Impact: | |
This change is expected to streamline the versioning process for users of the Eigen library, particularly enhancing the experience for those building architecture-independent packages. It paves the way for cleaner code by phasing out older methods in the future." | |
617 (https://gitlab.com/libeigen/eigen/-/merge_requests/617),Matrixmarket extension,"### What does this implement/fix? | |
Currently the matrixmarket reader/writer can only parse sparse matrices and dense dynamic vectors. This extends the reader/writer to read and write any kind of dense matrix. Reading self-adjoint and triangular matrices is not supported yet. | |
Also added the documentation for these features.",Jens Wehner,2021-09-02T17:23:33.992Z,NA,NA,"## Title: | |
Matrixmarket extension | |
## Authors: | |
Jens Wehner | |
## Summary: | |
This merge request introduces enhancements to the matrixmarket reader/writer in the Eigen C++ library. It enables the support for reading and writing various types of dense matrices, moving beyond the current limitations of only handling sparse matrices and dense dynamic vectors. | |
### Key Changes: | |
- Extended the matrixmarket reader/writer to handle any kind of dense matrix. | |
- Self-adjoint and triangular matrix reading capabilities are not yet supported. | |
- Added documentation for the new features. | |
### Improvements: | |
- Increased versatility in handling different matrix types enhances usability for users dealing with dense matrices. | |
### Impact: | |
This extension improves the matrixmarket functionality, allowing for greater flexibility and usability when working with dense matrices, thereby broadening the scope of applications for users of the Eigen library." | |
634 (https://gitlab.com/libeigen/eigen/-/merge_requests/634),cmake: populate package registry by default,"### What does this implement/fix? | |
With CMake 3.15 and above the `export` command does not populate the package registry by default unless the `CMAKE_EXPORT_PACKAGE_REGISTRY` variable is set. For backwards compatibility, prefer the old behavior (even though the old one occasionally causes some [confusion](https://gitlab.com/libeigen/eigen/-/issues/1386#note_254719335) and is considered deprecated.)",Sergiu Deitsch,2021-09-02T17:52:54.570Z,NA,NA,"## Title: | |
cmake: populate package registry by default | |
## Authors: | |
Sergiu Deitsch | |
## Summary: | |
This merge request implements a change in the CMake configuration for the Eigen C++ library. It ensures that the package registry is populated by default when using CMake versions 3.15 and above, addressing a backward compatibility concern. | |
### Key Changes: | |
- Eigen's CMake configuration keeps populating the package registry by default, even with CMake 3.15 and above, where the `export` command no longer does so unless `CMAKE_EXPORT_PACKAGE_REGISTRY` is set. | |
### Improvements: | |
- Keeps the pre-3.15 behavior for backwards compatibility, even though that behavior occasionally causes confusion (see issue #1386) and is considered deprecated. | |
### Impact: | |
- Projects that rely on finding Eigen through the CMake package registry keep working with newer CMake versions." | |
635 (https://gitlab.com/libeigen/eigen/-/merge_requests/635),Fix tridiagonalization_inplace_selector.,"The `Options` of the new `hCoeffs` vector do not necessarily match | |
those of the `MatrixType`, leading to build errors if they differ. Having the | |
`CoeffVectorType` be a template parameter relieves this restriction.",Antonio Sánchez,2021-09-02T19:45:01.334Z,NA,NA,"## Title: | |
Fix tridiagonalization_inplace_selector. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses an issue with the `Options` of the new `hCoeffs` vector not aligning with those of the `MatrixType`, which could lead to build errors. The solution involves making `CoeffVectorType` a template parameter to eliminate this mismatch. | |
### Key Changes: | |
- Modified `hCoeffs` vector handling to utilize a template parameter for `CoeffVectorType`. | |
### Improvements: | |
- Ensured compatibility between `hCoeffs` and `MatrixType`, reducing the likelihood of build errors. | |
### Impact: | |
- Enhances the stability and flexibility of the tridiagonalization functionality in the Eigen C++ library, improving overall usability for developers." | |
636 (https://gitlab.com/libeigen/eigen/-/merge_requests/636),Remove stray DynamicSparseMatrix references.,DynamicSparseMatrix has been removed. These shouldn't be here anymore.,Antonio Sánchez,2021-09-02T20:03:43.632Z,NA,NA,"## Title: | |
Remove stray DynamicSparseMatrix references. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request focuses on the removal of leftover references to the DynamicSparseMatrix class, which has been eliminated from the Eigen C++ library. | |
### Key Changes: | |
- All references to the DynamicSparseMatrix have been removed from the codebase. | |
### Improvements: | |
- The removal of outdated references contributes to a cleaner, more maintainable codebase, eliminating potential confusion for developers. | |
### Impact: | |
- This change ensures that no references to the removed class linger, which keeps the code clearer." | |
637 (https://gitlab.com/libeigen/eigen/-/merge_requests/637),Remove more DynamicSparseMatrix references.,Missed some in unsupported/. Also fixed some typos clang-tidy was complaining about.,Antonio Sánchez,2021-09-02T22:53:22.916Z,NA,NA,"## Title: | |
Remove more DynamicSparseMatrix references. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request focuses on eliminating additional instances of the DynamicSparseMatrix references that were overlooked in the unsupported directory. Additionally, it addresses some typographical errors highlighted by clang-tidy. | |
### Key Changes: | |
- Removal of extra DynamicSparseMatrix references in the unsupported/ folder. | |
- Corrections of various typographical errors as identified by clang-tidy. | |
### Improvements: | |
- Enhanced code clarity and readability by fixing typos. | |
- Ensured consistency in the codebase by removing unnecessary references. | |
### Impact: | |
These changes contribute to a cleaner and more maintainable codebase, reducing confusion around the usage of DynamicSparseMatrix and improving overall coding standards." | |
638 (https://gitlab.com/libeigen/eigen/-/merge_requests/638),Add missing packet types in pset1 call.,"Oops, introduced this when ""fixing"" integer packets.",Antonio Sánchez,2021-09-02T23:39:03.436Z,NA,NA,"## Title: | |
Add missing packet types in pset1 call. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses an oversight related to the inclusion of missing packet types in the `pset1` call following a recent fix for integer packets. | |
### Key Changes: | |
- Added previously missing packet types in the `pset1` function. | |
### Improvements: | |
- Ensures that all relevant packet types are handled correctly during calls to `pset1`, enhancing function robustness. | |
### Impact: | |
- This change resolves issues that may arise from incomplete packet type handling, thereby improving the overall reliability of the Eigen C++ library when processing packet data." | |
639 (https://gitlab.com/libeigen/eigen/-/merge_requests/639),Fix AVX2 PacketMath.h.,"There were a couple typos ps -> epi32, and an unaligned load issue.",Antonio Sánchez,2021-09-03T20:03:50.162Z,NA,NA,"## Title: | |
Fix AVX2 PacketMath.h. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses issues in the AVX2 implementation of PacketMath.h, correcting typos and an unaligned load problem. | |
### Key Changes: | |
- Corrected typos from ""ps"" to ""epi32"". | |
- Fixed an unaligned load issue in the code. | |
### Improvements: | |
These corrections enhance the reliability and correctness of the AVX2 optimizations within the Eigen library. | |
### Impact: | |
The affected AVX2 code paths now use the intended `epi32` intrinsics instead of `ps` ones and no longer have the unaligned load issue." | |
482 (https://gitlab.com/libeigen/eigen/-/merge_requests/482),Add LLDB Pretty Printer,"### What does this implement/fix? | |
An LLDB synthetic child provider was implemented, with which the LLDB debugger can show the items in a structured view. | |
This script supports dense matrices with fixed or dynamic storage, and sparse matrices in compressed or uncompressed form. Both row-major and column-major matrices are supported. | |
#### Usage | |
Import the script into LLDB using the following command: | |
``` | |
command script import eigenlldb.py | |
``` | |
#### Effects | |
##### Before | |
Previous screenshots during debugging in CLion on Windows: | |
 | |
 | |
##### After | |
The effects in CLion on Windows: | |
 | |
 | |
The effects in command line on Ubuntu 20.04 with LLDB 10: | |
``` | |
(lldb) frame variable vector3 | |
(Eigen::Vector3d) vector3 = ([0,0] = 1, [1,0] = 1, [2,0] = 1) | |
(lldb) frame variable matrix3 | |
(Eigen::Matrix3d) matrix3 = ([0,0] = 1, [1,0] = 1, [2,0] = 1, [0,1] = 1, [1,1] = 1, [2,1] = 1, [0,2] = 1, [1,2] = 1, [2,2] = 1) | |
(lldb) frame variable sparse_mat | |
(Eigen::SparseMatrix<double, 0, int>) sparse_mat = ([0,0] = 1, [0,1] = 2, [1,1] = 3) | |
``` | |
","Huang, Zhaoquan",2021-09-07T17:28:25.975Z,NA,NA,"## Title: | |
Add LLDB Pretty Printer | |
## Authors: | |
Huang, Zhaoquan | |
## Summary: | |
This merge request introduces a synthetic child provider for the LLDB debugger, allowing for structured display of Eigen library matrices and vectors during debugging sessions. It supports various matrix types, including both dense (fixed or dynamic) and sparse (compressed or uncompressed) in both row-major and column-major formats. | |
### Key Changes: | |
- Implementation of an LLDB synthetic child provider. | |
- Support for displaying fixed or dynamic storage dense matrices and compressed or uncompressed sparse matrices. | |
- Compatibility with both row-major and column-major storage formats. | |
### Improvements: | |
- Enhanced visual representation of Eigen data structures in LLDB, making it easier for developers to debug and inspect matrix and vector contents. | |
- Simplified debugging process in both graphical (CLion) and command-line (Ubuntu with LLDB) environments. | |
### Impact: | |
The addition of the LLDB pretty printer significantly improves the debugging experience for developers using the Eigen library by providing a clear and structured view of matrix and vector data. This facilitates quicker identification of issues, thereby enhancing productivity during development." | |
624 (https://gitlab.com/libeigen/eigen/-/merge_requests/624),(GPU Testing Part 3) Add a simple serialization mechanism.,"The `Serializer<T>` class implements a binary serialization that | |
can write to (`serialize`) and read from (`deserialize`) a byte | |
buffer. Also added convenience routines for serializing | |
a list of arguments. | |
This will mainly be for testing, specifically to transfer data to | |
and from the GPU. | |
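
The idea of writing to and reading from a byte buffer can be sketched as below. This is a simplified illustration and not Eigen's `Serializer<T>`: the name `SerializerSketch` and its exact signatures are assumptions, and it only handles types that can be copied byte-wise with `memcpy`.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical byte-wise serializer for trivially copyable types.
template <typename T>
struct SerializerSketch {
  // Copies value into [dest, end). Returns the position after the written
  // bytes, or nullptr if the value does not fit.
  static std::uint8_t* serialize(std::uint8_t* dest, std::uint8_t* end, const T& value) {
    if (dest == nullptr || static_cast<std::size_t>(end - dest) < sizeof(T)) return nullptr;
    std::memcpy(dest, &value, sizeof(T));
    return dest + sizeof(T);
  }

  // Reads a value from [src, end). Returns the position after the read
  // bytes, or nullptr if the buffer is too short.
  static const std::uint8_t* deserialize(const std::uint8_t* src, const std::uint8_t* end, T& value) {
    if (src == nullptr || static_cast<std::size_t>(end - src) < sizeof(T)) return nullptr;
    std::memcpy(&value, src, sizeof(T));
    return src + sizeof(T);
  }
};
```

Because each call returns the next buffer position, several values can be chained into one buffer, which is the role the convenience routines for argument lists play in the description above.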
This is part 3 of a set of changes to simplify creating generic GPU tests.",Antonio Sánchez,2021-09-08T20:05:19.942Z,NA,NA,"## Title: | |
(GPU Testing Part 3) Add a simple serialization mechanism. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request introduces a `Serializer<T>` class designed for binary serialization, enabling the writing and reading of data from a byte buffer. It includes additional routines for serializing multiple arguments, primarily intended for GPU testing purposes. | |
### Key Changes: | |
- Implementation of the `Serializer<T>` class for binary serialization. | |
- Introduction of methods for data serialization and deserialization. | |
- Addition of convenience routines for handling lists of arguments during serialization. | |
### Improvements: | |
- Enhances the ability to transfer data to and from the GPU effectively. | |
- Streamlines the process of creating generic GPU tests. | |
### Impact: | |
This addition significantly simplifies testing workflows in GPU environments, allowing for more efficient data handling and testing capabilities." | |
623 (https://gitlab.com/libeigen/eigen/-/merge_requests/623),(GPU Testing Part 2) Device-compatible Tuple implementation.,"An analogue of `std::tuple` that works on device. | |
Context: I've tried `std::tuple` in various versions of NVCC and clang, | |
and although code seems to compile, it often fails to run - generating | |
""illegal memory access"" errors, or ""illegal instruction"" errors. | |
This replacement does work on device. | |
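
One way to build such a replacement is a recursive struct that holds plain members and does not depend on the standard library. The sketch below shows that general technique only; `TupleSketch` and `GetSketch` are hypothetical names, and this is not the implementation added by the merge request. For device use, the accessor functions would additionally need the host/device qualifiers.

```cpp
#include <cstddef>

// Hypothetical recursive tuple: each level stores one element and the rest.
template <typename... Types>
struct TupleSketch;

template <>
struct TupleSketch<> {};

template <typename Head, typename... Tail>
struct TupleSketch<Head, Tail...> {
  Head head;
  TupleSketch<Tail...> tail;
};

// Element access: step into the tail N times, then return the head.
template <std::size_t N>
struct GetSketch {
  template <typename Head, typename... Tail>
  static auto run(const TupleSketch<Head, Tail...>& t)
      -> decltype(GetSketch<N - 1>::run(t.tail)) {
    return GetSketch<N - 1>::run(t.tail);
  }
};

template <>
struct GetSketch<0> {
  template <typename Head, typename... Tail>
  static const Head& run(const TupleSketch<Head, Tail...>& t) {
    return t.head;
  }
};
```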
This is part 2 of a set of changes to simplify creating generic GPU tests.",Antonio Sánchez,2021-09-08T22:34:02.622Z,NA,NA,"## Title: | |
(GPU Testing Part 2) Device-compatible Tuple implementation. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request introduces a device-compatible implementation of a tuple, serving as an alternative to `std::tuple`, specifically designed for use in GPU testing scenarios within the Eigen C++ library. | |
### Key Changes: | |
- Implementation of a device-compatible tuple that functions similarly to `std::tuple`. | |
- Focus on addressing compatibility issues encountered with `std::tuple` in various NVCC and clang versions, which led to runtime errors. | |
### Improvements: | |
- Enhanced reliability for GPU testing by providing a robust alternative that avoids ""illegal memory access"" and ""illegal instruction"" errors. | |
### Impact: | |
- Facilitates simpler creation of generic GPU tests, thereby improving the overall testing framework and stability within the Eigen library's GPU contexts." | |
641 (https://gitlab.com/libeigen/eigen/-/merge_requests/641),Remove unnecessary std::tuple reference.,"Doesn't seem to be used anyways, since we never actually include `<tuple>` anywhere.",Antonio Sánchez,2021-09-09T16:06:12.783Z,NA,NA,"## Title: | |
Remove unnecessary std::tuple reference. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request focuses on the removal of an unused reference to `std::tuple` within the Eigen C++ library, streamlining the codebase by eliminating unnecessary dependencies. | |
### Key Changes: | |
- Removed the reference to `std::tuple`. | |
### Improvements: | |
- Simplified the code by removing an unused component, improving readability and maintainability. | |
### Impact: | |
- Reduces code complexity and eliminates potential confusion regarding unused libraries, which can help developers navigate the codebase more effectively." | |
631 (https://gitlab.com/libeigen/eigen/-/merge_requests/631),Issue an error in case of direct inclusion of internal headers.,This change was mostly autogenerated.,Rasmus Munk Larsen,2021-09-10T19:12:27.443Z,NA,NA,"## Title: | |
Issue an error in case of direct inclusion of internal headers. | |
## Authors: | |
Rasmus Munk Larsen | |
## Summary: | |
This merge request implements a mechanism to issue an error when internal headers of the Eigen C++ library are included directly, helping to enforce best practices in the library's usage. | |
### Key Changes: | |
- Introduced error handling for direct inclusion of internal headers. | |
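
A common way to implement such a check is a preprocessor guard, sketched below in a single file. The macro name `SKETCH_CORE_MODULE_H` and the helper function are hypothetical; the source only states that an error is issued, so the mechanism shown here is an assumption.

```cpp
// Part 1: what a public module header would do before including internals.
#define SKETCH_CORE_MODULE_H

// Part 2: what each internal header would start with. If the public header
// was not included first, the macro is undefined and compilation stops.
#ifndef SKETCH_CORE_MODULE_H
#error Please include the public module header instead of this internal header.
#endif

// Internal functionality follows the check.
inline int internal_helper_sketch() { return 1; }
```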
### Improvements: | |
- Enhances code safety by preventing misuse of internal headers. | |
### Impact: | |
- Promotes proper header usage, likely reducing confusion and errors among users while maintaining the integrity of the library's modular design." | |
625 (https://gitlab.com/libeigen/eigen/-/merge_requests/625),(GPU Testing Part 4) New GPU test utilities and example.,"This introduces three new functions: | |
``` | |
// returns kernel(args...) running on the CPU. | |
Eigen::run_on_cpu(Kernel kernel, Args&&... args); | |
// returns kernel(args...) running on the GPU. | |
Eigen::run_on_gpu(Kernel kernel, Args&&... args); | |
Eigen::run_on_gpu_with_hint(size_t buffer_capacity_hint, Kernel kernel, Args&&... args); | |
// returns kernel(args...) running on the GPU if using | |
// a GPU compiler, or CPU otherwise. | |
Eigen::run(Kernel kernel, Args&&... args); | |
Eigen::run_with_hint(size_t buffer_capacity_hint, Kernel kernel, Args&&... args); | |
``` | |
Running on the GPU is accomplished by: | |
- Serializing the kernel inputs on the CPU | |
- Transferring the inputs to the GPU | |
- Passing the kernel and serialized inputs to a GPU kernel | |
- Deserializing the inputs on the GPU | |
- Running `kernel(inputs...)` on the GPU | |
- Serializing all output parameters and the return value | |
- Transferring the serialized outputs back to the CPU | |
- Deserializing the outputs and return value on the CPU | |
- Returning the deserialized return value | |
All inputs must be serializable (currently POD types, `Eigen::Matrix` | |
and `Eigen::Array`). The kernel must also be POD (though usually | |
contains no actual data). | |
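
The round trip listed above can be emulated on the host to show the data flow. In this sketch, copying one byte buffer to another stands in for the CPU-to-GPU transfers, and the names `AddKernelSketch` and `run_round_trip_sketch` are hypothetical. It is not the implementation of `Eigen::run_on_gpu`.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// A kernel is a plain functor with no data members.
struct AddKernelSketch {
  int operator()(int a, int b) const { return a + b; }
};

template <typename Kernel>
int run_round_trip_sketch(Kernel kernel, int a, int b) {
  // Serialize the inputs on the CPU side.
  std::vector<std::uint8_t> host_in(2 * sizeof(int));
  std::memcpy(host_in.data(), &a, sizeof(int));
  std::memcpy(host_in.data() + sizeof(int), &b, sizeof(int));

  // Stand-in for transferring the inputs to the GPU.
  std::vector<std::uint8_t> device_in = host_in;

  // Deserialize the inputs on the far side and run the kernel.
  int da = 0;
  int db = 0;
  std::memcpy(&da, device_in.data(), sizeof(int));
  std::memcpy(&db, device_in.data() + sizeof(int), sizeof(int));
  int result = kernel(da, db);

  // Serialize the return value and transfer it back.
  std::vector<std::uint8_t> device_out(sizeof(int));
  std::memcpy(device_out.data(), &result, sizeof(int));
  std::vector<std::uint8_t> host_out = device_out;

  // Deserialize the return value on the CPU side.
  int ret = 0;
  std::memcpy(&ret, host_out.data(), sizeof(int));
  return ret;
}
```

Passing the kernel by value works here because it holds no data, which matches the requirement that the kernel be POD.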
Tested on CUDA 9.1, 10.2, 11.3, with g++-6, g++-8, g++-10 respectively. | |
This MR depends on !622, !623, !624.",Antonio Sánchez,2021-09-10T22:33:06.246Z,NA,NA,"## Title: | |
(GPU Testing Part 4) New GPU test utilities and example. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request introduces a set of new GPU test utilities that enhance the Eigen C++ library's capabilities for executing kernels on both CPU and GPU. It includes functions for running kernels in a versatile manner, providing flexibility for developers working with GPU computations. | |
### Key Changes: | |
- Added functions for executing kernels: | |
  - `Eigen::run_on_cpu(Kernel kernel, Args&&... args)` – Executes on the CPU. | |
  - `Eigen::run_on_gpu(Kernel kernel, Args&&... args)` – Executes on the GPU. | |
  - `Eigen::run_on_gpu_with_hint(size_t buffer_capacity_hint, Kernel kernel, Args&&... args)` – Executes on the GPU with a buffer capacity hint. | |
  - `Eigen::run(Kernel kernel, Args&&... args)` – Executes on the GPU when built with a GPU compiler, otherwise on the CPU. | |
  - `Eigen::run_with_hint(size_t buffer_capacity_hint, Kernel kernel, Args&&... args)` – Same as `Eigen::run`, but with a buffer capacity hint. | |
### Improvements: | |
- Improved flexibility in executing kernels across CPU and GPU environments. | |
- Facilitated serialization and deserialization processes for input and output data, enabling the use of POD types, `Eigen::Matrix`, and `Eigen::Array`. | |
### Impact: | |
- Enhances the testing capabilities for GPU support within the Eigen library. | |
- Lets a test run the same kernel on the CPU or on the GPU through one interface. | |
- Tested with CUDA 9.1, 10.2, and 11.3, using g++-6, g++-8, and g++-10 respectively." | |
643 (https://gitlab.com/libeigen/eigen/-/merge_requests/643),Minor fix for compilation error on HIP.,Minor fix to enable successful compilation on HIP.,Rohit Santhanam,2021-09-12T17:57:38.971Z,NA,NA,"## Title: | |
Minor fix for compilation error on HIP. | |
## Authors: | |
Rohit Santhanam | |
## Summary: | |
This merge request addresses a minor compilation error encountered during the use of the Eigen C++ library on HIP (Heterogeneous-compute Interface for Portability). | |
### Key Changes: | |
- Implemented a small code adjustment to ensure compatibility with HIP. | |
### Improvements: | |
- Enhances the library's usability for developers utilizing HIP for GPU programming. | |
### Impact: | |
- Ensures successful compilation of the Eigen library on HIP, facilitating broader application and functionality for users working within the HIP environment." | |
645 (https://gitlab.com/libeigen/eigen/-/merge_requests/645),Default eigen_packet_wrapper constructor.,"This makes it trivial, allowing use of `memcpy`. | |
Fixes #2326.",Antonio Sánchez,2021-09-14T19:47:50.618Z,NA,NA,"## Title: | |
Default eigen_packet_wrapper constructor. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request defaults the `eigen_packet_wrapper` constructor, which makes the type trivial so that it can be used with `memcpy`. It fixes issue #2326. | |
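
The effect of defaulting a constructor can be sketched with two small illustrative types. Both struct names are hypothetical and only mirror the general C++ rule; they are not Eigen's `eigen_packet_wrapper`.

```cpp
#include <cstring>
#include <type_traits>

// User-provided empty constructor: std::is_trivial reports false.
struct WrapperUserCtorSketch {
  WrapperUserCtorSketch() {}
  int value;
};

// Defaulted constructor: std::is_trivial reports true.
struct WrapperDefaultedSketch {
  WrapperDefaultedSketch() = default;
  int value;
};

// Copies the object representation byte by byte.
inline WrapperDefaultedSketch copy_with_memcpy_sketch(const WrapperDefaultedSketch& src) {
  WrapperDefaultedSketch dst;
  std::memcpy(&dst, &src, sizeof(dst));
  return dst;
}
```

The difference can be checked at compile time with `std::is_trivial<T>::value`.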
### Key Changes: | |
- Made the `eigen_packet_wrapper` default constructor trivial by defaulting it. | |
### Improvements: | |
- Allows `eigen_packet_wrapper` objects to be used with `memcpy`. | |
### Impact: | |
- Resolves issue #2326." | |
648 (https://gitlab.com/libeigen/eigen/-/merge_requests/648),Fix typos in copyright dates,"### What does this implement/fix? | |
This just fixes some typos in copyright dates noticed when doing some extraction of copyright statements from the project. | |
It's on the 3.4 branch, but I can rebase to master if you prefer.",Rylie Pavlik,2021-09-15T20:46:24.854Z,NA,NA,"## Title: | |
Fix typos in copyright dates | |
## Authors: | |
Rylie Pavlik | |
## Summary: | |
This merge request addresses typographical errors in the copyright dates found in the Eigen C++ library. | |
### Key Changes: | |
- Corrected inaccurate copyright dates in the project files. | |
### Improvements: | |
- Enhances accuracy and clarity of copyright statements. | |
### Impact: | |
- Maintains the integrity of the project's legal documentation and ensures that copyright information is correctly represented." | |
647 (https://gitlab.com/libeigen/eigen/-/merge_requests/647),Clean up EIGEN_STATIC_ASSERT to only use standard c++11 static_assert.,"* Move static assertions out of constructors. | |
* Remove mechanism to turn static assertions into runtime checks and update affected tests. | |
* Break the large static assert in PlainObjectBase.h into individual checks to improve error messages.",Rasmus Munk Larsen,2021-09-16T20:43:54.846Z,NA,NA,"## Title: | |
Clean up EIGEN_STATIC_ASSERT to only use standard c++11 static_assert. | |
## Authors: | |
Rasmus Munk Larsen | |
## Summary: | |
This merge request focuses on enhancing the Eigen C++ library by refining the use of static assertions within the codebase. It consolidates the static assertions to employ the standard C++11 `static_assert`, improving clarity and consistency. | |
### Key Changes: | |
- Moved static assertions out of constructors. | |
- Removed the mechanism to convert static assertions into runtime checks, updating the associated tests accordingly. | |
- Divided the large static assertion in `PlainObjectBase.h` into individual checks for better error reporting. | |
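
A minimal sketch of splitting one combined assertion into individual checks at class scope. `SKETCH_STATIC_ASSERT` and `FixedMatrixSketch` are hypothetical names, and the stringified message token is an assumption about how such a macro can forward to C++11 `static_assert`.

```cpp
#include <type_traits>

// Forwards to static_assert and turns the message token into a string.
#define SKETCH_STATIC_ASSERT(COND, MSG) static_assert((COND), #MSG)

template <typename Scalar, int Rows, int Cols>
struct FixedMatrixSketch {
  // One assertion per requirement, outside any constructor, so a failure
  // names the specific condition that was violated.
  SKETCH_STATIC_ASSERT(Rows > 0, ROWS_MUST_BE_POSITIVE);
  SKETCH_STATIC_ASSERT(Cols > 0, COLS_MUST_BE_POSITIVE);
  SKETCH_STATIC_ASSERT(std::is_arithmetic<Scalar>::value, SCALAR_MUST_BE_ARITHMETIC);

  Scalar data[Rows * Cols];
  static constexpr int size() { return Rows * Cols; }
};
```

Instantiating `FixedMatrixSketch<double, 0, 3>` would fail to compile and report only `ROWS_MUST_BE_POSITIVE`, rather than one message covering every condition.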
### Improvements: | |
- Enhanced error messages through more granular static assertions. | |
- Improved code organization by separating static assertions from constructor definitions. | |
### Impact: | |
These changes lead to clearer compile-time error messages, making it easier for developers to identify which requirement was violated. With the mechanism for turning static assertions into runtime checks removed, these conditions are always checked by the compiler through `static_assert`." | |
651 (https://gitlab.com/libeigen/eigen/-/merge_requests/651),Remove -fabi-version=6 flag from AVX512 builds.,Fixes #2328,Rasmus Munk Larsen,2021-09-16T23:44:36.157Z,NA,NA,"## Title: | |
Remove -fabi-version=6 flag from AVX512 builds. | |
## Authors: | |
Rasmus Munk Larsen | |
## Summary: | |
This merge request addresses an issue related to the use of the `-fabi-version=6` flag in AVX512 builds of the Eigen C++ library, ultimately contributing to resolving issue #2328. | |
### Key Changes: | |
- The `-fabi-version=6` flag has been removed from the AVX512 build configurations. | |
### Improvements: | |
- AVX512 builds no longer pass `-fabi-version=6` to the compiler. | |
### Impact: | |
This change resolves issue #2328 for builds that use AVX512." | |
646 (https://gitlab.com/libeigen/eigen/-/merge_requests/646),Add buildtests_gpu and check_gpu to simplify GPU testing.,"This is in preparation of adding GPU tests to the CI, allowing | |
us to limit building/testing of GPU-specific tests for a given | |
GPU-capable runner. | |
GPU tests are tagged with the label ""gpu"". The new targets | |
``` | |
make buildtests_gpu | |
make check_gpu | |
``` | |
allow building and running only the gpu tests.",Antonio Sánchez,2021-09-17T01:06:15.209Z,NA,NA,"## Title: | |
Add buildtests_gpu and check_gpu to simplify GPU testing. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request introduces new build targets for GPU testing in the Eigen C++ library, facilitating a more efficient testing process specifically for GPU capabilities. | |
### Key Changes: | |
- Added targets `make buildtests_gpu` and `make check_gpu`. | |
- Implemented a tagging system for GPU tests labeled as ""gpu"". | |
### Improvements: | |
- Simplifies the process of building and running GPU-specific tests. | |
- Streamlines integration of GPU tests into the Continuous Integration (CI) environment. | |
### Impact: | |
Enhances the ability to test GPU functionality in Eigen, allowing for more focused and efficient testing processes on GPU-capable runners." | |
653 (https://gitlab.com/libeigen/eigen/-/merge_requests/653),Disable specific subtests that fail on HIP due to non-functional device side malloc/free (on HIP).,"Disable specific subtests that use dynamic data structures since device side malloc/free is not currently available on HIP. | |
This functionality is forthcoming and once it is publicly available, these subtests will be reenabled. | |
/cc @cantonios",Rohit Santhanam,2021-09-17T16:37:35.636Z,NA,NA,"## Title: | |
Disable specific subtests that fail on HIP due to non-functional device side malloc/free (on HIP). | |
## Authors: | |
Rohit Santhanam | |
## Summary: | |
This merge request addresses the issue of specific subtests that fail when running on HIP due to the absence of functional device side malloc/free capabilities. To maintain stability and ensure the reliability of the testing process, these subtests using dynamic data structures have been disabled. | |
### Key Changes: | |
- Disabled specific subtests that rely on dynamic data structures in HIP. | |
### Improvements: | |
- Enhanced stability of test results on HIP by removing failing subtests. | |
### Impact: | |
- Ensures that testing on HIP remains reliable until device side malloc/free functionality is implemented and can be utilized. Subtests will be reenabled once this feature becomes available." | |
649 (https://gitlab.com/libeigen/eigen/-/merge_requests/649),"Move Eigen::all,last,lastp1 back to Eigen::placeholders::.","These names are so common, IMO they should not exist directly in the | |
`Eigen::` namespace. This prevents us from using the `last` or `all` | |
names for any parameters or local variables, otherwise the compiler spews | |
warnings about shadowing or hiding the global values. Many external | |
projects (and our own examples) also heavily use | |
``` | |
using namespace Eigen; | |
``` | |
which means these conflict with external libraries as well, e.g. | |
`std::fill(first,last,value)`. | |
It seems originally these were placed in a separate namespace | |
`Eigen::placeholders`, which has since been deprecated. I propose | |
to un-deprecate this, and restore the original locations. | |
These symbols are also imported into `Eigen::indexing`, which | |
additionally imports `fix` and `seq`. An alternative is to remove the | |
`placeholders` namespace and stick with `indexing`. | |
NOTE: this is an API-breaking change. | |
Fixes #2321.",Antonio Sánchez,2021-09-17T17:44:08.696Z,API change,NA,"## Title: | |
Move Eigen::all,last,lastp1 back to Eigen::placeholders::. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request proposes to move the `Eigen::all`, `Eigen::last`, and `Eigen::lastp1` names back into the `Eigen::placeholders` namespace. The current presence of these names in the `Eigen::` namespace leads to conflicts and warnings in user code, particularly when using `using namespace Eigen;`. | |
### Key Changes: | |
- Un-deprecates the `Eigen::placeholders` namespace and moves `all`, `last`, and `lastp1` back into it. | |
- Keeps these symbols imported into `Eigen::indexing`. | |
### Improvements: | |
- Reduces name collision risks for users declaring local variables or parameters with common names. | |
- Enhances compatibility with external libraries by minimizing overlapping identifiers. | |
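The effect of the nested namespace can be illustrated without Eigen. The following is a minimal sketch under stated assumptions: `mylib`, `all_t`, `last_t`, `count_before`, and `is_last` are hypothetical names invented for illustration and are not Eigen declarations.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for a library; these are NOT Eigen's real declarations.
namespace mylib {
namespace placeholders {
struct all_t {};
struct last_t {};
static const all_t all{};
static const last_t last{};
}  // namespace placeholders
}  // namespace mylib

using namespace mylib;  // common in user code and examples

// The placeholders live one level down, so `last` can be an ordinary
// parameter name here without shadowing a library symbol.
inline int count_before(const std::vector<int>& v, int last) {
  return static_cast<int>(
      std::count_if(v.begin(), v.end(), [last](int x) { return x < last; }));
}

// Using a placeholder is an explicit opt-in at the call site.
inline bool is_last(placeholders::last_t) { return true; }
inline bool is_last(placeholders::all_t) { return false; }
```

With this layout, `using namespace mylib;` brings only the name `placeholders` into scope, so a call such as `is_last(placeholders::last)` must spell out the namespace.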
### Impact: | |
- This change is API-breaking: existing user code that refers to `Eigen::all`, `Eigen::last`, or `Eigen::lastp1` directly must be updated to use `Eigen::placeholders` (or `Eigen::indexing`)." | |
652 (https://gitlab.com/libeigen/eigen/-/merge_requests/652),"Added a macro to pass arguments to ctest, e.g. to run tests in parallel.","Example: To build and run 32 tests in parallel, build Eigen with: | |
``` | |
cmake -DEIGEN_CTEST_ARGS=-j32 ../eigen | |
make -j32 check | |
```",Rasmus Munk Larsen,2021-09-17T18:33:13.440Z,NA,NA,"## Title: | |
Added a macro to pass arguments to ctest, e.g. to run tests in parallel. | |
## Authors: | |
Rasmus Munk Larsen | |
## Summary: | |
This merge request adds a CMake variable, `EIGEN_CTEST_ARGS`, whose value is passed to `ctest` when running the `check` target, for example to run tests in parallel. | |
### Key Changes: | |
- Added `EIGEN_CTEST_ARGS` for passing arguments to `ctest`. | |
- Example: configure with `cmake -DEIGEN_CTEST_ARGS=-j32 ../eigen`, then run `make -j32 check` to build and run 32 tests in parallel. | |
### Improvements: | |
- Enhances test efficiency by allowing parallel execution, leading to faster feedback on code changes. | |
### Impact: | |
- Developers can now run tests concurrently, significantly speeding up the testing process and improving productivity during development." | |
654 (https://gitlab.com/libeigen/eigen/-/merge_requests/654),Silence string overflow warning for GCC in initializer_list_construction test.,"This looks to be a GCC bug. It doesn't seem to reproduce in a smaller example, | |
making it hard to isolate. | |
Warning: | |
``` | |
In file included from ../Eigen/Core:296, | |
from ../Eigen/QR:11, | |
from ../test/main.h:340, | |
from ../test/initializer_list_construction.cpp:10: | |
In constructor ‘Eigen::PlainObjectBase<Derived>::PlainObjectBase(const std::initializer_list<std::initializer_list<typename Eigen::internal::traits<T>::Scalar> >&) [with Derived = Eigen::Matrix<unsigned char, 5, 4, 0, 5, 4>]’, | |
inlined from ‘Eigen::Matrix<Scalar_, Rows_, Cols_, Options_, MaxRows_, MaxCols_>::Matrix(const std::initializer_list<std::initializer_list<typename Eigen::internal::traits<Eigen::Matrix<Scalar_, Rows_, Cols_, Options_, MaxRows_, MaxCols_> >::Scalar> >&) [with Scalar_ = unsigned char; int Rows_ = 5; int Cols_ = 4; int Options_ = 0; int MaxRows_ = 5; int MaxCols_ = 4]’ at ../Eigen/src/Core/Matrix.h:319:118, | |
inlined from ‘void initializerListMatrixConstruction() [with Scalar = unsigned char]’ at ../test/initializer_list_construction.cpp:176:26: | |
../Eigen/src/Core/PlainObjectBase.h:583:44: warning: writing 1 byte into a region of size 0 [-Wstringop-overflow=] | |
583 | coeffRef(row_index, col_index) = e; | |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~ | |
In file included from ../Eigen/Core:289, | |
from ../Eigen/QR:11, | |
from ../test/main.h:340, | |
from ../test/initializer_list_construction.cpp:10: | |
../Eigen/src/Core/DenseStorage.h: In function ‘void initializerListMatrixConstruction() [with Scalar = unsigned char]’: | |
../Eigen/src/Core/DenseStorage.h:48:5: note: at offset 30 into destination object ‘Eigen::internal::plain_array<unsigned char, 20, 0, 0>::array’ of size 20 | |
48 | T array[Size]; | |
| ^~~~~ | |
``` | |
I've verified we never actually access offset `30` in the destination.",Antonio Sánchez,2021-09-17T18:57:10.262Z,NA,NA,"## Title: | |
Silence string overflow warning for GCC in initializer_list_construction test. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses a warning generated by GCC related to string overflow in the `initializer_list_construction` test. The warning seems to be a false positive attributed to a GCC bug, as it does not reproduce in a simplified example. The changes involve silencing the warning without compromising functionality. | |
### Key Changes: | |
- Suppressed the string overflow warning in the `initializer_list_construction` test related to the `Eigen::PlainObjectBase` constructor. | |
### Improvements: | |
- Enhanced compatibility with GCC by eliminating a misleading warning that could distract developers during testing. | |
### Impact: | |
- Improved the developer experience by reducing noise in the warning output, allowing for more focus on relevant issues during build and test processes." | |
656 (https://gitlab.com/libeigen/eigen/-/merge_requests/656),Fix strict aliasing bug causing product_small failure.,"Packet loading is skipped due to aliasing violation, leading to nullopt matrix | |
multiplication. | |
Fixes #2327.",Antonio Sánchez,2021-09-17T21:24:32.775Z,NA,NA,"## Title: | |
Fix strict aliasing bug causing product_small failure. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request fixes a strict aliasing bug in the Eigen C++ library that caused the `product_small` test to fail. Because of the aliasing violation, packet loading was skipped, leading to a nullopt matrix multiplication. Fixes issue #2327. | |
### Key Changes: | |
- Resolved the strict aliasing violation that impacted matrix multiplication. | |
- Ensured proper packet loading for small matrix products. | |
### Improvements: | |
- Enhanced the reliability of small matrix operations by preventing failures related to aliasing issues. | |
### Impact: | |
This fix eliminates the risk of nullopt results during matrix multiplication, thus improving the correctness and stability of the library's matrix operations, especially for small matrices." | |
655 (https://gitlab.com/libeigen/eigen/-/merge_requests/655),Run CI tests in parallel on all cores.,NA,Rasmus Munk Larsen,2021-09-17T22:35:23.466Z,NA,NA,"## Title: | |
Run CI tests in parallel on all cores. | |
## Authors: | |
Rasmus Munk Larsen | |
## Summary: | |
This merge request implements parallel execution of continuous integration (CI) tests across all available CPU cores. | |
### Key Changes: | |
- Adjusted CI configuration to enable parallel test execution. | |
### Improvements: | |
- Enhanced test execution speed by utilizing multiple cores, reducing overall testing time. | |
### Impact: | |
- Increases efficiency of the CI process, allowing for faster feedback and quicker iterations in development." | |
657 (https://gitlab.com/libeigen/eigen/-/merge_requests/657),Fix implicit conversion warnings in tuple_test.,Fixes #2329.,Antonio Sánchez,2021-09-18T02:58:52.873Z,NA,NA,"## Title: | |
Fix implicit conversion warnings in tuple_test. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses implicit conversion warnings in the tuple_test file, resolving issue #2329. | |
### Key Changes: | |
- Updated the code in the tuple_test file to eliminate implicit conversion warnings. | |
### Improvements: | |
- Ensures better type safety and code clarity by removing warnings related to implicit conversions. | |
### Impact: | |
- Enhances the overall robustness of the codebase by preventing potential issues that could arise from incorrect type conversions." | |
572 (https://gitlab.com/libeigen/eigen/-/merge_requests/572),[AutodiffScalar] Remove const when returning by value,"clang-tidy: Return type 'const T' is 'const'-qualified at the top level, | |
which may reduce code readability without improving const correctness | |
The types are somewhat long, but the affected return types are of the form: | |
``` | |
const T my_func() { /**/ } | |
``` | |
Change to: | |
``` | |
T my_func() { /**/ } | |
```",Alexander Karatarakis,2021-09-18T21:38:57.382Z,NA,NA,"## Title: | |
[AutodiffScalar] Remove const when returning by value | |
## Authors: | |
Alexander Karatarakis | |
## Summary: | |
This merge request addresses a code quality issue identified by clang-tidy regarding the use of `const` in return types. Specifically, it involves modifying functions that return values by value to eliminate unnecessary `const` qualifiers. | |
### Key Changes: | |
- Removed `const` from function return types in the form `const T my_func()`, changing them to `T my_func()`. | |
### Improvements: | |
- Enhanced code readability by eliminating the `const` qualifier at the top level of return types, which previously did not contribute to const correctness. | |
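Beyond readability, a top-level `const` on a class-type return value also prevents the result from binding to move operations. The following is a minimal sketch of that effect, assuming a hypothetical `Tracker` type and helper functions that are not part of Eigen or of this merge request.

```cpp
#include <cassert>

// Hypothetical type that records which assignment operator was chosen.
struct Tracker {
  bool last_assign_was_move = false;
  Tracker() = default;
  Tracker(const Tracker&) = default;
  Tracker(Tracker&&) = default;
  Tracker& operator=(const Tracker&) { last_assign_was_move = false; return *this; }
  Tracker& operator=(Tracker&&) noexcept { last_assign_was_move = true; return *this; }
};

// Old style: the const-qualified temporary cannot bind to Tracker&&.
inline const Tracker make_const() { return Tracker(); }

// New style: the temporary is a plain rvalue and can be moved from.
inline Tracker make_plain() { return Tracker(); }

inline bool assign_from_const_moves() {
  Tracker t;
  t = make_const();  // overload resolution picks the copy assignment
  return t.last_assign_was_move;
}

inline bool assign_from_plain_moves() {
  Tracker t;
  t = make_plain();  // overload resolution picks the move assignment
  return t.last_assign_was_move;
}
```

Whether this matters in practice depends on how expensive the returned type is to copy; the merge request itself cites only the clang-tidy readability diagnostic.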
### Impact: | |
- The changes streamline the function return types, making the code cleaner and easier to read, while maintaining the intended functionality." | |
659 (https://gitlab.com/libeigen/eigen/-/merge_requests/659),Fix alias violation in BFloat16,"### What does this implement/fix? | |
Using a reinterpret_cast to access the bits of a float value is undefined behavior. With GCC 10 on PPC platforms we have seen actual failures (wrong values) due to that which are fixed by (the equivalent) of this change. See https://github.com/easybuilders/easybuild-easyconfigs/pull/14025 | |
An easy testcase for that with TF 2.2.3 is: | |
``` | |
import numpy as np | |
from tensorflow.python import _pywrap_bfloat16 | |
bfloat16 = _pywrap_bfloat16.TF_bfloat16_type() | |
print(np.arange(-10.5, 7.8, 0.5, dtype=bfloat16)) | |
``` | |
Which prints `[bfloat16(-10.5) bfloat16(-10) bfloat16(-20) bfloat16(-30) bfloat16(-40)...` | |
printf-debugging into the TF bfloat16 shows that during conversion from bfloat16->float the step value gets calculated wrong. | |
### Additional information | |
Not only is the proposed solution correct, it is even (potentially) faster. See the generated ASM: https://godbolt.org/z/4dT4a9d1b and https://github.com/tensorflow/tensorflow/commit/6b853c8f2020a446d7c04e75deff7866a35a7658#diff-17ca5d26579d2089aa9c41eacf8570b066e5c83dc957dc9bf1647a266de990f1 (see commit message)",Alexander Grund,2021-09-20T14:25:12.004Z,NA,NA,"## Title: | |
Fix alias violation in BFloat16 | |
## Authors: | |
Alexander Grund | |
## Summary: | |
This merge request addresses an issue related to undefined behavior due to the use of `reinterpret_cast` for float value access in the BFloat16 implementation. It highlights failures encountered with GCC 10 on PPC platforms, specifically resulting in incorrect float conversions. | |
### Key Changes: | |
- Replaced `reinterpret_cast` with a safer alternative to access float values, which resolves the undefined behavior issue. | |
### Improvements: | |
- The new implementation not only corrects the existing problem but may also improve performance, as evidenced by more efficient assembly output. | |
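A common well-defined way to read the bits of a `float` is to copy its object representation with `std::memcpy`. The following is a minimal sketch of that general technique, not the code from the merge request: the function names are hypothetical, a 32-bit IEEE-754 `float` is assumed, and the float-to-bfloat16 step truncates for simplicity where a real implementation would typically round.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helpers; assumes float is a 32-bit IEEE-754 type.

// Copy the bytes instead of reading the float through a casted pointer.
inline std::uint32_t float_to_bits(float f) {
  std::uint32_t bits;
  std::memcpy(&bits, &f, sizeof(bits));
  return bits;
}

inline float bits_to_float(std::uint32_t bits) {
  float f;
  std::memcpy(&f, &bits, sizeof(f));
  return f;
}

// Simplified float -> bfloat16: keep the upper 16 bits
// (sign, 8 exponent bits, top 7 mantissa bits). No rounding is applied.
inline std::uint16_t float_to_bfloat16_truncate(float f) {
  return static_cast<std::uint16_t>(float_to_bits(f) >> 16);
}

// bfloat16 -> float: the 16 stored bits become the upper half of the float.
inline float bfloat16_to_float(std::uint16_t h) {
  return bits_to_float(static_cast<std::uint32_t>(h) << 16);
}
```

Compilers generally lower a fixed-size `memcpy` like this to a plain register move, which is consistent with the description's remark that the fix may even be faster.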
### Impact: | |
This change enhances the reliability and correctness of float conversion in BFloat16, particularly on PPC platforms, thus preventing potential errors in calculations and ensuring better performance in related operations." | |
660 (https://gitlab.com/libeigen/eigen/-/merge_requests/660),fix various typos,NA,sciencewhiz,2021-09-22T16:15:06.955Z,NA,NA,"## Title: | |
fix various typos | |
## Authors: | |
sciencewhiz | |
## Summary: | |
This merge request addresses and corrects various typographical errors found throughout the Eigen C++ library codebase. | |
### Key Changes: | |
- Corrected multiple typos in the documentation and comments within the code. | |
### Improvements: | |
- Enhanced clarity and professionalism in the documentation, contributing to better readability and understanding for users and contributors. | |
### Impact: | |
- Improves the overall quality of the codebase by ensuring accurate terminology, which may assist in reducing misunderstandings among users and developers." | |
661 (https://gitlab.com/libeigen/eigen/-/merge_requests/661),Fix some typos found,"### What does this implement/fix? | |
This MR fixes some typos in English text that I found using a spell checker. | |
@cantonios, @sciencewhiz: Could you please review my changes? | |
### Reference issue | |
See also (discussion in) !660. | |
### Additional information | |
There are also some occurrences of ""assignement"" which should be changed to ""assignment"", e.g. in lines 52, 56, 58, 87, 140, and 144 of `unsupported/test/cxx11_tensor_builtins_sycl.cpp`.",Kolja Brix,2021-09-23T15:22:01.068Z,NA,NA,"## Title: | |
Fix some typos found | |
## Authors: | |
Kolja Brix | |
## Summary: | |
This merge request addresses and corrects several typographical errors detected in the English text of the Eigen C++ library. | |
### Key Changes: | |
- Fixed various typos identified through a spell checker. | |
- Noted remaining occurrences of ""assignement"" that should be changed to ""assignment"", e.g. in lines 52, 56, 58, 87, 140, and 144 of `unsupported/test/cxx11_tensor_builtins_sycl.cpp`. | |
### Improvements: | |
- Enhances the overall readability and professionalism of the documentation and code comments by eliminating spelling errors. | |
### Impact: | |
- Improves the clarity of the documentation, contributing to a better understanding for users and developers engaging with the Eigen library." | |
663 (https://gitlab.com/libeigen/eigen/-/merge_requests/663),Disable more CUDA warnings.,"For cuda 9.2 and 11.4, they changed the numbers again. | |
Fixes #2331.",Antonio Sánchez,2021-09-25T04:51:45.289Z,NA,NA,"## Title: | |
Disable more CUDA warnings. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request disables additional CUDA compiler warnings. According to the description, the warning numbers changed again in CUDA 9.2 and 11.4, so further suppressions were needed. Fixes issue #2331. | |
### Key Changes: | |
- Disabled additional CUDA warnings whose numbers changed in CUDA 9.2 and 11.4. | |
### Improvements: | |
- Reduces clutter in the compilation output, making it easier for developers to focus on relevant warnings and errors. | |
### Impact: | |
- A smoother development experience for users of the Eigen C++ library who utilize CUDA, particularly with the specified versions, leading to less distraction from non-critical warnings." | |
662 (https://gitlab.com/libeigen/eigen/-/merge_requests/662),Reorganize test main file,"### What does this implement/fix? | |
Reorganize test main file `main.h` as discussed with @rmlarsen1 in !515. | |
* Move random matrix generators to separate file `random_matrix_helper.h` | |
* Protect forward declarations with EIGEN_COMP_ICC. | |
### Reference issue | |
See also !515.",Kolja Brix,2021-09-27T18:30:48.634Z,NA,NA,"## Title: | |
Reorganize test main file | |
## Authors: | |
Kolja Brix | |
## Summary: | |
This merge request reorganizes the test main file `main.h` by separating concerns and improving organization as per discussions with @rmlarsen1. | |
### Key Changes: | |
- Random matrix generators have been moved to a new file, `random_matrix_helper.h`. | |
- Forward declarations are now protected with `EIGEN_COMP_ICC`. | |
### Improvements: | |
The changes enhance the maintainability and clarity of the test code by isolating specific functionalities into dedicated files and ensuring compatibility across different compiler setups. | |
### Impact: | |
This reorganization is expected to simplify future modifications and improve the overall structure of the test suite, making it easier for contributors to understand and use." | |
664 (https://gitlab.com/libeigen/eigen/-/merge_requests/664),Disable testing of complex compound assignment operators for MSVC.,"MSVC does not support specializing compound assignments for | |
`std::complex`, since it already specializes them (contrary to the | |
standard). | |
Trying to use one of these on device will currently lead to a | |
duplicate definition error. This is still probably preferable | |
to no error though. If we remove the definitions for MSVC, then | |
it will compile, but the kernel will fail silently. | |
The only proper solution would be to define our own custom `Complex` | |
type.",Antonio Sánchez,2021-09-28T15:39:59.670Z,NA,NA,"## Title: | |
Disable testing of complex compound assignment operators for MSVC. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request disables the tests of compound assignment operators for `std::complex` when compiling with MSVC. MSVC already specializes these operators (contrary to the standard), so Eigen cannot provide its own specializations there. | |
### Key Changes: | |
- Disabled testing of complex compound assignment operators specifically for MSVC. | |
### Improvements: | |
- Avoids exercising operators that MSVC does not allow Eigen to specialize. | |
- Leaves Eigen's definitions in place, so using one of these operators on device with MSVC still produces a duplicate definition error instead of compiling and then failing silently in the kernel. | |
### Impact: | |
- The tests no longer cover these operators on MSVC. According to the description, the only proper solution would be for Eigen to define its own custom `Complex` type." | |
671 (https://gitlab.com/libeigen/eigen/-/merge_requests/671),Fix gpu special function tests.,"Some checks used incorrect values, partly from copy-paste errors, | |
partly from the change in behaviour introduced in !398. | |
Modified results to match scipy, simplified tests by updating | |
`VERIFY_IS_CWISE_APPROX` to work for scalars.",Antonio Sánchez,2021-10-02T04:33:13.347Z,NA,NA,"## Title: | |
Fix gpu special function tests. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request fixes GPU special function tests in which some checks used incorrect values, partly because of copy-paste errors and partly because of the change in behavior introduced in !398. It updates the expected results to match scipy and simplifies the tests. | |
### Key Changes: | |
- Corrected checks that had incorrect values. | |
- Updated `VERIFY_IS_CWISE_APPROX` to handle scalars effectively. | |
### Improvements: | |
- Improved accuracy of GPU special function tests by aligning results with scipy. | |
- Enhanced test simplification for better maintainability. | |
### Impact: | |
The changes lead to more reliable and consistent special function testing within the Eigen library, ensuring that the GPU implementations conform to expected behavior as verified by scipy." | |
669 (https://gitlab.com/libeigen/eigen/-/merge_requests/669),Reduce tensor_contract_gpu test.,"The original test times out after 60 minutes on Windows, even when | |
setting flags to optimize for speed. Reducing the number of | |
contractions performed from 3600->27 for subtests 8,9 allow the | |
two to run in just over a minute each.",Antonio Sánchez,2021-10-02T04:51:14.721Z,NA,NA,"## Title: | |
Reduce tensor_contract_gpu test. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
The merge request aims to optimize the tensor_contract_gpu test by significantly reducing the number of contractions performed, addressing performance issues that lead to timeouts on Windows. | |
### Key Changes: | |
- Reduced the number of contractions from 3600 to 27 for subtests 8 and 9. | |
### Improvements: | |
- The optimization allows the tests to complete in just over a minute each rather than timing out after 60 minutes. | |
### Impact: | |
- Enhances the efficiency of testing on Windows, reducing waiting times and improving overall development workflow." | |
667 (https://gitlab.com/libeigen/eigen/-/merge_requests/667),Speed up tensor reduction,"Speed up tensor reduction by strip mining & unrolling loops in `InnerMostDimReducer` and `InnerMostDimPreserved`. | |
This change also cleans up a few redundant pieces of code, where deferring to an existing specialization was possible. | |
Below are measurements of full-, row-, and column- sum reductions of square 2D float tensors with sizes ranging from 3 x 3 to 10k x 10k. These were measured single-threaded on a Skylake core, and compiled with clang approximately at head. | |
AVX2: | |
``` | |
name old cpu/op new cpu/op delta | |
BM_fullReduction_1T/3 [using 1 threads] 13.4ns ± 8% 15.1ns ± 4% +12.18% (p=0.000 n=49+60) | |
BM_fullReduction_1T/4 [using 1 threads] 13.0ns ± 4% 15.4ns ±18% +18.06% (p=0.000 n=48+60) | |
BM_fullReduction_1T/7 [using 1 threads] 15.0ns ±13% 16.4ns ± 3% +9.29% (p=0.000 n=48+47) | |
BM_fullReduction_1T/8 [using 1 threads] 15.8ns ±19% 16.7ns ±13% +5.84% (p=0.000 n=60+60) | |
BM_fullReduction_1T/10 [using 1 threads] 18.7ns ±11% 18.5ns ± 9% ~ (p=0.292 n=48+60) | |
BM_fullReduction_1T/15 [using 1 threads] 31.1ns ±12% 22.4ns ±17% -27.80% (p=0.000 n=52+57) | |
BM_fullReduction_1T/16 [using 1 threads] 34.3ns ±10% 23.0ns ±13% -32.75% (p=0.000 n=50+58) | |
BM_fullReduction_1T/31 [using 1 threads] 125ns ± 5% 48ns ± 9% -61.81% (p=0.000 n=60+60) | |
BM_fullReduction_1T/32 [using 1 threads] 134ns ± 6% 50ns ± 8% -62.70% (p=0.000 n=60+57) | |
BM_fullReduction_1T/64 [using 1 threads] 535ns ± 4% 160ns ± 5% -70.06% (p=0.000 n=60+60) | |
BM_fullReduction_1T/128 [using 1 threads] 2.15µs ± 4% 0.69µs ± 8% -67.65% (p=0.000 n=60+60) | |
BM_fullReduction_1T/256 [using 1 threads] 8.55µs ± 4% 2.77µs ± 5% -67.65% (p=0.000 n=60+55) | |
BM_fullReduction_1T/512 [using 1 threads] 34.5µs ± 3% 11.6µs ± 6% -66.52% (p=0.000 n=50+60) | |
BM_fullReduction_1T/1k [using 1 threads] 155µs ± 4% 158µs ± 4% +1.73% (p=0.000 n=60+60) | |
BM_fullReduction_1T/2k [using 1 threads] 682µs ±20% 684µs ±17% ~ (p=0.475 n=40+45) | |
BM_fullReduction_1T/4k [using 1 threads] 6.34ms ±12% 5.71ms ±11% -9.98% (p=0.000 n=39+35) | |
BM_fullReduction_1T/10k [using 1 threads] 37.4ms ± 7% 37.4ms ±32% ~ (p=0.481 n=10+10) | |
name old cpu/op new cpu/op delta | |
BM_rowReduction_1T/3 [using 1 threads] 29.0ns ± 7% 30.5ns ± 4% +5.10% (p=0.000 n=54+50) | |
BM_rowReduction_1T/4 [using 1 threads] 33.5ns ± 3% 38.5ns ± 4% +15.07% (p=0.000 n=50+50) | |
BM_rowReduction_1T/7 [using 1 threads] 54.6ns ± 4% 60.8ns ± 8% +11.40% (p=0.000 n=59+60) | |
BM_rowReduction_1T/8 [using 1 threads] 55.1ns ± 8% 52.1ns ± 9% -5.40% (p=0.000 n=60+60) | |
BM_rowReduction_1T/10 [using 1 threads] 75.8ns ± 7% 72.2ns ± 7% -4.66% (p=0.000 n=60+60) | |
BM_rowReduction_1T/15 [using 1 threads] 114ns ± 5% 123ns ± 6% +7.98% (p=0.000 n=60+60) | |
BM_rowReduction_1T/16 [using 1 threads] 102ns ± 5% 95ns ± 7% -6.74% (p=0.000 n=60+60) | |
BM_rowReduction_1T/31 [using 1 threads] 250ns ± 5% 264ns ± 4% +5.56% (p=0.000 n=55+55) | |
BM_rowReduction_1T/32 [using 1 threads] 232ns ± 4% 203ns ± 9% -12.47% (p=0.000 n=55+60) | |
BM_rowReduction_1T/64 [using 1 threads] 651ns ± 4% 482ns ± 6% -25.95% (p=0.000 n=60+60) | |
BM_rowReduction_1T/128 [using 1 threads] 1.90µs ± 3% 1.30µs ± 7% -31.67% (p=0.000 n=60+60) | |
BM_rowReduction_1T/256 [using 1 threads] 7.03µs ± 5% 3.69µs ± 5% -47.44% (p=0.000 n=60+49) | |
BM_rowReduction_1T/512 [using 1 threads] 28.6µs ± 4% 13.3µs ± 6% -53.36% (p=0.000 n=54+60) | |
BM_rowReduction_1T/1k [using 1 threads] 158µs ± 9% 157µs ± 4% ~ (p=0.948 n=60+60) | |
BM_rowReduction_1T/2k [using 1 threads] 733µs ±37% 657µs ±13% -10.36% (p=0.000 n=45+40) | |
BM_rowReduction_1T/4k [using 1 threads] 6.65ms ±11% 6.19ms ± 9% -6.89% (p=0.032 n=30+38) | |
BM_rowReduction_1T/10k [using 1 threads] 41.4ms ±11% 37.8ms ± 1% ~ (p=0.080 n=12+10) | |
name old cpu/op new cpu/op delta | |
BM_colReduction_1T/3 [using 1 threads] 21.8ns ± 5% 22.4ns ± 4% +2.34% (p=0.000 n=58+55) | |
BM_colReduction_1T/4 [using 1 threads] 20.8ns ± 6% 27.7ns ± 6% +33.27% (p=0.000 n=60+55) | |
BM_colReduction_1T/7 [using 1 threads] 32.0ns ± 4% 43.9ns ± 6% +37.53% (p=0.000 n=48+60) | |
BM_colReduction_1T/8 [using 1 threads] 28.7ns ±11% 24.8ns ± 3% -13.81% (p=0.000 n=53+55) | |
BM_colReduction_1T/10 [using 1 threads] 39.9ns ± 7% 37.8ns ± 4% -5.12% (p=0.000 n=53+50) | |
BM_colReduction_1T/15 [using 1 threads] 65.0ns ±10% 77.2ns ± 6% +18.79% (p=0.000 n=58+57) | |
BM_colReduction_1T/16 [using 1 threads] 56.5ns ± 7% 43.0ns ±21% -23.92% (p=0.000 n=48+60) | |
BM_colReduction_1T/31 [using 1 threads] 203ns ± 5% 210ns ± 6% +3.46% (p=0.000 n=60+59) | |
BM_colReduction_1T/32 [using 1 threads] 170ns ± 8% 95ns ± 7% -44.18% (p=0.000 n=60+60) | |
BM_colReduction_1T/64 [using 1 threads] 677ns ± 7% 261ns ± 4% -61.43% (p=0.000 n=60+55) | |
BM_colReduction_1T/128 [using 1 threads] 3.14µs ± 4% 1.40µs ± 5% -55.45% (p=0.000 n=50+60) | |
BM_colReduction_1T/256 [using 1 threads] 14.8µs ± 4% 5.4µs ± 6% -63.24% (p=0.000 n=60+60) | |
BM_colReduction_1T/512 [using 1 threads] 65.2µs ± 5% 25.2µs ± 5% -61.31% (p=0.000 n=60+55) | |
BM_colReduction_1T/1k [using 1 threads] 754µs ± 6% 393µs ± 5% -47.92% (p=0.000 n=60+45) | |
BM_colReduction_1T/2k [using 1 threads] 3.24ms ±18% 1.66ms ±17% -48.61% (p=0.000 n=35+42) | |
BM_colReduction_1T/4k [using 1 threads] 70.3ms ± 3% 34.5ms ± 3% -50.93% (p=0.000 n=44+25) | |
BM_colReduction_1T/10k [using 1 threads] 69.5ms ± 0% 69.6ms ± 2% ~ (p=0.605 n=10+15) | |
``` | |
SSE4.3: | |
``` | |
name old cpu/op new cpu/op delta | |
BM_fullReduction_1T/3 [using 1 threads] 13.5ns ± 6% 13.1ns ± 4% -2.72% (p=0.000 n=59+60) | |
BM_fullReduction_1T/4 [using 1 threads] 13.2ns ± 8% 12.8ns ± 4% -2.60% (p=0.000 n=60+60) | |
BM_fullReduction_1T/7 [using 1 threads] 14.7ns ± 4% 14.5ns ± 5% -1.16% (p=0.014 n=48+60) | |
BM_fullReduction_1T/8 [using 1 threads] 14.8ns ± 4% 14.6ns ± 4% -1.59% (p=0.001 n=48+60) | |
BM_fullReduction_1T/10 [using 1 threads] 17.8ns ± 4% 16.5ns ± 5% -7.15% (p=0.000 n=48+60) | |
BM_fullReduction_1T/15 [using 1 threads] 29.9ns ± 7% 24.7ns ± 3% -17.59% (p=0.000 n=54+55) | |
BM_fullReduction_1T/16 [using 1 threads] 33.1ns ± 7% 27.1ns ± 4% -18.35% (p=0.000 n=47+54) | |
BM_fullReduction_1T/31 [using 1 threads] 123ns ± 4% 70ns ± 7% -43.38% (p=0.000 n=60+57) | |
BM_fullReduction_1T/32 [using 1 threads] 131ns ± 4% 78ns ± 7% -40.77% (p=0.000 n=60+60) | |
BM_fullReduction_1T/64 [using 1 threads] 534ns ± 4% 281ns ± 4% -47.40% (p=0.000 n=60+55) | |
BM_fullReduction_1T/128 [using 1 threads] 2.13µs ± 4% 1.23µs ± 4% -42.17% (p=0.000 n=60+60) | |
BM_fullReduction_1T/256 [using 1 threads] 8.54µs ± 4% 4.95µs ± 5% -42.10% (p=0.000 n=60+60) | |
BM_fullReduction_1T/512 [using 1 threads] 34.5µs ± 4% 20.2µs ± 4% -41.43% (p=0.000 n=50+60) | |
BM_fullReduction_1T/1k [using 1 threads] 158µs ± 6% 154µs ± 5% -2.46% (p=0.000 n=60+60) | |
BM_fullReduction_1T/2k [using 1 threads] 687µs ±25% 668µs ±23% ~ (p=0.093 n=47+46) | |
BM_fullReduction_1T/4k [using 1 threads] 5.86ms ± 6% 5.82ms ±10% ~ (p=0.736 n=28+35) | |
BM_fullReduction_1T/10k [using 1 threads] 36.0ms ± 3% 35.5ms ± 3% ~ (p=0.095 n=10+9) | |
name old cpu/op new cpu/op delta | |
BM_rowReduction_1T/3 [using 1 threads] 28.8ns ± 4% 27.8ns ± 4% -3.64% (p=0.000 n=53+54) | |
BM_rowReduction_1T/4 [using 1 threads] 33.6ns ± 4% 33.7ns ± 6% ~ (p=0.465 n=50+50) | |
BM_rowReduction_1T/7 [using 1 threads] 54.4ns ± 4% 52.9ns ± 4% -2.81% (p=0.000 n=60+60) | |
BM_rowReduction_1T/8 [using 1 threads] 53.8ns ± 4% 51.6ns ± 4% -4.05% (p=0.000 n=60+60) | |
BM_rowReduction_1T/10 [using 1 threads] 74.4ns ± 4% 71.2ns ± 4% -4.39% (p=0.000 n=60+58) | |
BM_rowReduction_1T/15 [using 1 threads] 113ns ± 4% 109ns ± 4% -3.49% (p=0.000 n=60+60) | |
BM_rowReduction_1T/16 [using 1 threads] 101ns ± 6% 97ns ± 6% -3.91% (p=0.000 n=60+60) | |
BM_rowReduction_1T/31 [using 1 threads] 250ns ± 4% 271ns ± 4% +8.24% (p=0.000 n=55+55) | |
BM_rowReduction_1T/32 [using 1 threads] 232ns ± 3% 222ns ± 4% -4.31% (p=0.000 n=55+59) | |
BM_rowReduction_1T/64 [using 1 threads] 654ns ± 4% 501ns ± 5% -23.43% (p=0.000 n=60+60) | |
BM_rowReduction_1T/128 [using 1 threads] 1.90µs ± 4% 1.62µs ± 5% -14.84% (p=0.000 n=60+59) | |
BM_rowReduction_1T/256 [using 1 threads] 7.07µs ± 4% 5.51µs ± 4% -21.99% (p=0.000 n=60+59) | |
BM_rowReduction_1T/512 [using 1 threads] 28.7µs ± 6% 21.1µs ± 4% -26.28% (p=0.000 n=55+60) | |
BM_rowReduction_1T/1k [using 1 threads] 156µs ±10% 153µs ± 4% -2.07% (p=0.007 n=60+60) | |
BM_rowReduction_1T/2k [using 1 threads] 705µs ±26% 678µs ±33% -3.86% (p=0.035 n=41+39) | |
BM_rowReduction_1T/4k [using 1 threads] 7.04ms ±10% 6.31ms ± 8% -10.45% (p=0.000 n=41+36) | |
BM_rowReduction_1T/10k [using 1 threads] 42.6ms ± 6% 38.8ms ± 4% -8.82% (p=0.000 n=12+9) | |
name old cpu/op new cpu/op delta | |
BM_colReduction_1T/3 [using 1 threads] 22.0ns ± 7% 22.1ns ± 7% ~ (p=0.614 n=54+46) | |
BM_colReduction_1T/4 [using 1 threads] 20.6ns ± 5% 20.6ns ± 5% ~ (p=0.771 n=60+48) | |
BM_colReduction_1T/7 [using 1 threads] 31.6ns ± 4% 31.6ns ± 3% ~ (p=0.935 n=50+40) | |
BM_colReduction_1T/8 [using 1 threads] 27.8ns ± 9% 27.5ns ± 4% ~ (p=0.113 n=45+44) | |
BM_colReduction_1T/10 [using 1 threads] 39.0ns ± 4% 38.6ns ± 5% -0.86% (p=0.048 n=50+40) | |
BM_colReduction_1T/15 [using 1 threads] 63.9ns ± 4% 63.1ns ± 4% -1.20% (p=0.005 n=60+48) | |
BM_colReduction_1T/16 [using 1 threads] 56.5ns ± 8% 47.2ns ± 9% -16.50% (p=0.000 n=59+49) | |
BM_colReduction_1T/31 [using 1 threads] 200ns ± 5% 145ns ± 8% -27.33% (p=0.000 n=60+60) | |
BM_colReduction_1T/32 [using 1 threads] 170ns ± 5% 100ns ± 6% -40.78% (p=0.000 n=60+55) | |
BM_colReduction_1T/64 [using 1 threads] 673ns ± 4% 291ns ± 5% -56.83% (p=0.000 n=60+55) | |
BM_colReduction_1T/128 [using 1 threads] 3.14µs ± 4% 2.43µs ± 6% -22.70% (p=0.000 n=50+55) | |
BM_colReduction_1T/256 [using 1 threads] 14.7µs ± 4% 9.6µs ± 5% -35.06% (p=0.000 n=60+60) | |
BM_colReduction_1T/512 [using 1 threads] 65.4µs ± 4% 44.2µs ± 5% -32.42% (p=0.000 n=59+59) | |
BM_colReduction_1T/1k [using 1 threads] 761µs ± 8% 756µs ± 8% ~ (p=0.274 n=60+60) | |
BM_colReduction_1T/2k [using 1 threads] 3.22ms ±13% 3.27ms ±23% ~ (p=0.629 n=37+37) | |
BM_colReduction_1T/4k [using 1 threads] 70.9ms ±10% 69.8ms ± 8% -1.47% (p=0.028 n=40+40) | |
BM_colReduction_1T/10k [using 1 threads] 69.7ms ± 3% 79.6ms ± 2% +14.22% (p=0.000 n=13+14) | |
```",Rasmus Munk Larsen,2021-10-02T14:58:24.275Z,NA,NA,"## Title: | |
Speed up tensor reduction | |
## Authors: | |
Rasmus Munk Larsen | |
## Summary: | |
This merge request enhances the performance of tensor reduction in the Eigen library by employing loop strip mining and unrolling techniques in the `InnerMostDimReducer` and `InnerMostDimPreserved` components. Additionally, it cleans up redundant code by leveraging existing specializations. | |
### Key Changes: | |
- Implemented optimizations through loop strip mining and unrolling in tensor reduction methods. | |
- Refactored code to eliminate redundancies by utilizing existing specializations. | |
### Improvements: | |
- Significant performance improvements in full, row, and column reductions for square 2D float tensors, with reductions in execution time noted across various tensor sizes. | |
- Full reductions show reductions in CPU time per operation of up to 70.06% for larger tensor sizes. | |
- Row and column reductions similarly achieve marked optimizations, with up to 61.43% and 48.61% reductions in CPU time respectively for larger tensors. | |
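The two loop transformations named above can be illustrated with a scalar sketch. This is not Eigen's implementation, which works on SIMD packets inside the reducers; the function name `unrolled_sum` and the strip width of 4 are assumptions chosen for illustration only.

```cpp
#include <cstddef>

// Hypothetical helper: sums n floats with four independent accumulators.
// The main loop walks the input in strips of 4 elements (strip mining) and
// its body is unrolled by hand; a scalar loop handles the leftover tail.
inline float unrolled_sum(const float* data, std::size_t n) {
  const std::size_t strip = 4;
  const std::size_t strips_end = n - (n % strip);
  float acc0 = 0.0f, acc1 = 0.0f, acc2 = 0.0f, acc3 = 0.0f;
  std::size_t i = 0;
  for (; i < strips_end; i += strip) {
    acc0 += data[i];
    acc1 += data[i + 1];
    acc2 += data[i + 2];
    acc3 += data[i + 3];
  }
  float tail = 0.0f;
  for (; i < n; ++i) tail += data[i];
  // Combine the partial sums pairwise.
  return (acc0 + acc1) + (acc2 + acc3) + tail;
}
```

Independent accumulators break the dependency chain between consecutive additions, which is the usual reason unrolling helps a reduction. Because the additions are reassociated, the result may differ from a strictly sequential sum in the last bits.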
### Impact: | |
- Enhanced efficiency of tensor reduction operations, potentially leading to faster computations in applications relying on the Eigen library for linear algebra and tensor operations. | |
- The overall user experience is expected to improve due to the more responsive performance, especially for large datasets." | |
668 (https://gitlab.com/libeigen/eigen/-/merge_requests/668),Fix Windows CMake compiler/OS detection.,"Replaced deprecated `DetermineVSServicePack` macro with recommended | |
`CMAKE_CXX_COMPILER_VERSION`. | |
Deleted custom `OSVersion` detection. The windows-specific code is | |
highly outdated, and on other systems simply returns `CMAKE_SYSTEM`. | |
We will get values like `windows-10.0.17763`, but this is preferable | |
to `unknownwin`, and saves us needing to maintain a separate cmake file.",Antonio Sánchez,2021-10-02T16:47:06.748Z,NA,NA,"## Title: | |
Fix Windows CMake compiler/OS detection. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request updates the Eigen C++ library's CMake configuration by replacing outdated Windows compiler and OS detection methods. It enhances the build system's reliability and maintenance. | |
### Key Changes: | |
- Replaced the deprecated `DetermineVSServicePack` macro with `CMAKE_CXX_COMPILER_VERSION`. | |
- Deleted outdated custom `OSVersion` detection, defaulting to `CMAKE_SYSTEM` for consistency across platforms. | |
### Improvements: | |
- Improved accuracy of OS version reporting on Windows, providing a more precise output like `windows-10.0.17763` instead of `unknownwin`. | |
- Reduced maintenance burden by eliminating the need for a separate CMake file for Windows-specific code. | |
### Impact: | |
The changes enhance the overall portability and maintainability of the build configuration for the Eigen library, ensuring better compatibility and reducing technical debt related to outdated detection mechanisms." | |
673 (https://gitlab.com/libeigen/eigen/-/merge_requests/673),Vectorize Visitor.h.,"This change adds a vectorized codepath in `Visitor.h`, which speeds up `coeffMax(&row, &col)` etc. by about 5x on machines with AVX2. | |
Benchmark of `coeffMax(&row, &col)` on a random square matrix of `float`: | |
``` | |
name old cpu/op new cpu/op delta | |
BM_EigenCoeffMax/16 317ns ± 0% 73ns ± 0% -77.16% (p=0.000 n=44+47) | |
BM_EigenCoeffMax/64 5.30µs ± 0% 0.92µs ± 5% -82.56% (p=0.000 n=42+60) | |
BM_EigenCoeffMax/128 21.3µs ± 0% 3.6µs ± 1% -83.21% (p=0.000 n=45+48) | |
BM_EigenCoeffMax/512 341µs ± 0% 56µs ± 0% -83.65% (p=0.000 n=38+60) | |
BM_EigenCoeffMax/1k 1.42ms ± 0% 0.24ms ± 1% -83.31% (p=0.000 n=36+33) | |
``` | |
This also speeds up various matrix decompositions that perform pivot search using `coeffMax`: | |
``` | |
name old cpu/op new cpu/op delta | |
BM_EigenPartialPivLU/16 1.99µs ± 1% 1.96µs ± 1% -1.32% (p=0.000 n=60+59) | |
BM_EigenPartialPivLU/64 23.2µs ± 1% 21.7µs ± 2% -6.63% (p=0.000 n=56+58) | |
BM_EigenPartialPivLU/128 116µs ± 2% 108µs ± 2% -6.56% (p=0.000 n=60+60) | |
BM_EigenPartialPivLU/512 3.53ms ± 1% 3.40ms ± 2% -3.83% (p=0.000 n=38+38) | |
BM_EigenPartialPivLU/1k 17.0ms ± 1% 16.4ms ± 1% -3.98% (p=0.000 n=29+27) | |
BM_EigenFullPivLU/16 3.17µs ± 1% 2.76µs ± 1% -12.99% (p=0.000 n=49+50) | |
BM_EigenFullPivLU/64 79.2µs ± 2% 53.3µs ± 3% -32.75% (p=0.000 n=58+56) | |
BM_EigenFullPivLU/128 560µs ± 2% 361µs ± 3% -35.61% (p=0.000 n=60+50) | |
BM_EigenFullPivLU/512 26.7ms ± 3% 16.5ms ± 2% -38.26% (p=0.000 n=47+47) | |
BM_EigenFullPivLU/1k 234ms ± 3% 165ms ± 4% -29.52% (p=0.000 n=15+21) | |
BM_EigencolPivQR/16 4.61µs ± 3% 4.61µs ± 4% ~ (p=0.881 n=58+59) | |
BM_EigencolPivQR/64 51.7µs ± 2% 51.0µs ± 2% -1.44% (p=0.000 n=58+57) | |
BM_EigencolPivQR/128 277µs ± 3% 272µs ± 3% -1.97% (p=0.000 n=55+54) | |
BM_EigencolPivQR/512 9.05ms ± 3% 9.00ms ± 2% ~ (p=0.197 n=45+44) | |
BM_EigencolPivQR/1k 127ms ± 4% 127ms ± 5% ~ (p=0.421 n=27+26) | |
BM_EigenfullPivQR/16 5.45µs ± 3% 5.02µs ± 4% -7.78% (p=0.000 n=59+60) | |
BM_EigenfullPivQR/64 108µs ± 3% 76µs ± 4% -29.07% (p=0.000 n=59+59) | |
BM_EigenfullPivQR/128 682µs ± 3% 452µs ± 2% -33.78% (p=0.000 n=59+57) | |
BM_EigenfullPivQR/512 33.1ms ± 4% 20.0ms ± 3% -39.57% (p=0.000 n=44+40) | |
BM_EigenfullPivQR/1k 323ms ± 1% 225ms ± 3% -30.20% (p=0.000 n=8+15) | |
``` | |
Closes #2345",Rasmus Munk Larsen,2021-10-20T16:58:02.155Z,NA,NA,"## Title: | |
Vectorize Visitor.h | |
## Authors: | |
Rasmus Munk Larsen | |
## Summary: | |
This merge request adds a vectorized code path to `Visitor.h`, which speeds up `coeffMax(&row, &col)` and related visitor-based operations. In the benchmarks from the description, run on a machine with AVX2, CPU time drops by roughly 5x across matrix sizes. | |
### Key Changes: | |
- Implemented a vectorized codepath in `Visitor.h`. | |
- Improved performance for `coeffMax(&row, &col)` function. | |
### Improvements: | |
- Approximately 5x speedup for `coeffMax` on AVX2-capable machines. | |
- Lower CPU time for matrix decompositions that perform pivot search with `coeffMax`, such as: | |
- `BM_EigenPartialPivLU`: Up to 6.63% improvement. | |
- `BM_EigenFullPivLU`: Up to 38.26% improvement. | |
- `BM_EigenfullPivQR`: Up to 39.57% improvement. | |
### Impact: | |
The enhancements lead to considerably faster operations for matrix computations in applications using the Eigen library, likely improving the performance of numerical algorithms and applications that rely on efficient matrix processing." | |
678 (https://gitlab.com/libeigen/eigen/-/merge_requests/678),"Move CUDA/Complex.h to GPU/Complex.h, remove TensorReductionCuda.h","Move CUDA/Complex.h to GPU/Complex.h, remove TensorReductionCuda.h | |
The `Complex.h` file applies equally to HIP/CUDA, so placing under the | |
generic `GPU` folder. | |
The `TensorReductionCuda.h` has already been deprecated, now removing | |
for the next Eigen version.",Antonio Sánchez,2021-10-20T19:17:53.230Z,NA,NA,"## Title: | |
Move CUDA/Complex.h to GPU/Complex.h, remove TensorReductionCuda.h | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request focuses on reorganizing header files in the Eigen C++ library by moving the `Complex.h` file to a more appropriate location and removing a deprecated header. | |
### Key Changes: | |
- Moved `CUDA/Complex.h` to `GPU/Complex.h`. | |
- Removed `TensorReductionCuda.h`, which has been deprecated. | |
### Improvements: | |
- The new location for `Complex.h` under the `GPU` folder is more suited for its applicability to both HIP and CUDA interfaces. | |
### Impact: | |
- This adjustment streamlines the header file structure by consolidating related functionalities, enhancing maintainability, and ensuring that deprecated files are removed for cleaner code management." | |
665 (https://gitlab.com/libeigen/eigen/-/merge_requests/665),Fix tuple compilation for VS2017.,"VS2017 doesn't like deducing alias types, leading to a bunch of compile | |
errors for functions involving the `tuple` alias. Replacing with | |
`TupleImpl` seems to solve this, allowing the test to compile/pass.",Antonio Sánchez,2021-10-20T19:37:50.457Z,NA,NA,"## Title: | |
Fix tuple compilation for VS2017 | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request fixes compile errors in Visual Studio 2017, which has trouble deducing alias types in functions that involve the `tuple` alias. Replacing the alias with `TupleImpl` appears to resolve the errors and lets the test compile and pass. | |
### Key Changes: | |
- Replaced `tuple` alias with `TupleImpl` to resolve compilation issues in VS2017. | |
### Improvements: | |
- Enables successful compilation and passing of tests in Visual Studio 2017. | |
### Impact: | |
- Ensures compatibility of the Eigen library with Visual Studio 2017, improving the development experience for users on this platform." | |
666 (https://gitlab.com/libeigen/eigen/-/merge_requests/666),Fix MSVC+NVCC EIGEN_INHERIT_ASSIGNMENT_EQUAL_OPERATOR compilation.,"Looks like we need to update the | |
`EIGEN_INHERIT_ASSIGNMENT_EQUAL_OPERATOR` for newer versions of MSVC as | |
well when compiling with NVCC. Fixes build issues for VS 2017.",Antonio Sánchez,2021-10-20T19:53:31.809Z,NA,NA,"## Title: | |
Fix MSVC+NVCC EIGEN_INHERIT_ASSIGNMENT_EQUAL_OPERATOR compilation. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses compilation issues related to the `EIGEN_INHERIT_ASSIGNMENT_EQUAL_OPERATOR` when using newer versions of MSVC in conjunction with NVCC. It specifically resolves build problems encountered with Visual Studio 2017. | |
### Key Changes: | |
- Updated the `EIGEN_INHERIT_ASSIGNMENT_EQUAL_OPERATOR` macro to ensure compatibility with recent MSVC versions. | |
### Improvements: | |
- Enhanced the robustness of the Eigen library's cross-compiler support, particularly for users working with MSVC and NVCC together. | |
### Impact: | |
- Resolves critical build issues for users of Visual Studio 2017, facilitating smoother compilation and improving overall developer experience when using the Eigen library in mixed compiler environments." | |
676 (https://gitlab.com/libeigen/eigen/-/merge_requests/676),Improve accuracy of full tensor reduction for half and bfloat16,"We use a tree summation algorithm for full tensor reduction. The relative error in summing n (positive) elements this way is bounded by `~2*eps*(log(n/B) + B)`, where `B` is the size of the leaves in the tree, where we sum sequentially in the interest of speed. For less accurate types (i.e. types with larger eps), we reduce B to keep the relative error significantly below 1.",Rasmus Munk Larsen,2021-10-20T20:11:32.498Z,NA,NA,"## Title: | |
Improve accuracy of full tensor reduction for half and bfloat16 | |
## Authors: | |
Rasmus Munk Larsen | |
## Summary: | |
This merge request improves the accuracy of full tensor reduction for the `half` and `bfloat16` data types. Full reduction already uses a tree summation algorithm; the change shrinks the leaf size `B` (the number of elements summed sequentially at each leaf) for lower-precision types. | |
### Key Changes: | |
- Reduced the leaf size `B` of the tree summation for types with a larger machine epsilon, such as `half` and `bfloat16`. | |
- The relative error of summing `n` positive elements this way is bounded by roughly `2*eps*(log(n/B) + B)`, so a smaller `B` keeps the bound well below 1 for these types. | |
### Improvements: | |
- Reduced relative error in summing positive elements, particularly for less accurate types, ensuring it remains significantly below 1. | |
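The role of the leaf size can be sketched as follows. This is a minimal recursive version written for illustration, not Eigen's reducer; the name `tree_sum` and the `leaf_size` parameter are assumptions standing in for `B` in the bound above.

```cpp
#include <cstddef>

// Hypothetical helper: pairwise (tree) summation. Ranges of at most
// leaf_size elements are summed sequentially; longer ranges are split in
// half and the two partial sums are added.
template <typename T>
T tree_sum(const T* data, std::size_t n, std::size_t leaf_size) {
  if (leaf_size == 0) leaf_size = 1;  // guard against a degenerate leaf size
  if (n <= leaf_size) {
    T acc = T(0);
    for (std::size_t i = 0; i < n; ++i) acc += data[i];
    return acc;
  }
  const std::size_t half = n / 2;
  return tree_sum(data, half, leaf_size) +
         tree_sum(data + half, n - half, leaf_size);
}
```

The sequential leaf contributes the `B` term of the bound and the tree depth contributes the `log(n/B)` term. A large leaf is faster, while a small leaf keeps the error term in check for types with a large `eps`.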
### Impact: | |
- More accurate full reductions for reduced-precision types, which benefits applications that sum many `half` or `bfloat16` values." | |
679 (https://gitlab.com/libeigen/eigen/-/merge_requests/679),Disable Tree reduction for GPU.,"For moderately sized inputs, running the Tree reduction quickly | |
overflows the GPU thread stack space, leading to memory errors. | |
This was happening in the `cxx11_tensor_complex_gpu` test, for example. | |
Disabling tree reduction on GPU fixes this.",Antonio Sánchez,2021-10-20T21:34:18.676Z,NA,NA,"## Title: | |
Disable Tree reduction for GPU. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses an issue with the Tree reduction method on GPUs: for moderately sized inputs it quickly overflowed the GPU thread stack space, leading to memory errors. | |
### Key Changes: | |
- Disabled Tree reduction functionality specifically for GPU implementations. | |
### Improvements: | |
- Eliminated memory errors experienced in the `cxx11_tensor_complex_gpu` test case, enhancing stability. | |
### Impact: | |
- The fix improves the reliability of the GPU operations within the Eigen library, particularly for users working with moderate-sized inputs." | |
677 (https://gitlab.com/libeigen/eigen/-/merge_requests/677),Use reinterpret_cast on GPU for bit_cast.,"This seems to be the recommended approach for doing type punning in | |
CUDA. See for example | |
- https://stackoverflow.com/questions/47037104/cuda-type-punning-memcpy-vs-ub-union | |
- https://developer.nvidia.com/blog/faster-parallel-reductions-kepler/ | |
(the latter puns a double to an `int2`). The issue is that for CUDA, the `memcpy` is not elided, and ends up | |
being an expensive operation. We already have similar `reinterpret_cast`s across | |
the Eigen codebase for GPU (as does TensorFlow).",Antonio Sánchez,2021-10-20T21:50:30.064Z,NA,NA,"## Title: | |
Use reinterpret_cast on GPU for bit_cast. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request introduces the use of `reinterpret_cast` for type punning in the Eigen C++ library's GPU code. The change aims to improve performance by avoiding `memcpy`, which is not elided under CUDA and therefore ends up being an expensive operation. | |
### Key Changes: | |
- Replaced `memcpy` with `reinterpret_cast` for type punning in CUDA-related code. | |
### Improvements: | |
- Reduced performance costs associated with type casting operations on the GPU by eliminating the need for `memcpy`. | |
### Impact: | |
- Enhances the efficiency of type conversions in GPU calculations, potentially leading to faster execution of CUDA code in the Eigen library." | |
686 (https://gitlab.com/libeigen/eigen/-/merge_requests/686),Revert bit_cast to use memcpy for CUDA.,"To elide the memcpy, we need to first load the `src` value into | |
registers by making a local copy. This avoids the need to resort | |
to potential UB by using `reinterpret_cast`. | |
This change doesn't seem to affect CPU (at least not with gcc/clang). | |
With optimizations on, the copy is also elided.",Antonio Sánchez,2021-10-21T19:18:17.611Z,NA,NA,"## Title: | |
Revert bit_cast to use memcpy for CUDA. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request reverts `bit_cast` to use `memcpy` in CUDA code. Making a local copy of the `src` value first loads it into registers, which allows the `memcpy` to be elided, so there is no need to resort to `reinterpret_cast` and its potential undefined behavior. | |
### Key Changes: | |
- Reverted `bit_cast` to use `memcpy` for CUDA instead of `reinterpret_cast`. | |
### Improvements: | |
- Avoids the potential undefined behavior of `reinterpret_cast`; the local copy of the source value is what allows the `memcpy` to be elided. | |
- Under optimizations, the copy operation can be elided, maintaining performance. | |
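A minimal sketch of the pattern described above, assuming a free function named `bit_cast_sketch` (a hypothetical name, not Eigen's `bit_cast`). It only illustrates the local-copy-then-`memcpy` idea; whether the copy is actually elided depends on the compiler and optimization level.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical helper: reinterprets the bits of src as Tgt without
// reinterpret_cast. Only enabled for same-size, trivially copyable types.
template <typename Tgt, typename Src,
          typename = typename std::enable_if<
              sizeof(Tgt) == sizeof(Src) &&
              std::is_trivially_copyable<Tgt>::value &&
              std::is_trivially_copyable<Src>::value>::type>
inline Tgt bit_cast_sketch(const Src& src) {
  const Src staged = src;  // local copy, so the value can live in registers
  Tgt tgt;
  std::memcpy(&tgt, &staged, sizeof(Tgt));
  return tgt;
}
```

For example, on a platform with IEEE-754 single precision, `bit_cast_sketch<std::uint32_t>(1.0f)` yields `0x3f800000`.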
### Impact: | |
- The change is expected to have no adverse effects on CPU performance, at least with gcc/clang, ensuring safe operation in CUDA contexts." | |
687 (https://gitlab.com/libeigen/eigen/-/merge_requests/687),Add nan-propagation options to matrix and array plugins.,"The ability to control nan-propagation in elementwise min/max and min/max reduction was added in Eigen 3.4, but we missed adding them to the corresponding array and matrix plugins. Should we consider backporting this change to 3.4 (would have to make it c++03 compliant)?",Rasmus Munk Larsen,2021-10-21T20:06:54.199Z,NA,NA,"## Title: | |
Add nan-propagation options to matrix and array plugins. | |
## Authors: | |
Rasmus Munk Larsen | |
## Summary: | |
This merge request introduces the ability to control nan-propagation for elementwise min/max operations and min/max reductions in the Eigen C++ library's matrix and array plugins, enhancing functionality and consistency with previous updates. | |
### Key Changes: | |
- Added nan-propagation options to elementwise min/max and min/max reduction functions in both matrix and array plugins. | |
### Improvements: | |
- Aligns the behavior of matrix and array plugins with the previously introduced nan-propagation features in Eigen 3.4. | |
- Increases usability and consistency for users relying on these operations in computations involving NaN values. | |
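The three behaviors can be sketched for a scalar `min` as follows. The enum `NanPolicy` and the function `min_with_policy` are hypothetical names used for illustration; Eigen's own options are `PropagateFast`, `PropagateNaN` and `PropagateNumbers`, and its implementation is vectorized rather than branch-based.

```cpp
#include <cmath>
#include <limits>

// Hypothetical policy enum mirroring the three NaN-propagation modes.
enum class NanPolicy { Fast, PropagateNaN, PropagateNumbers };

// Scalar min under the given policy.
inline float min_with_policy(float a, float b, NanPolicy policy) {
  if (policy == NanPolicy::PropagateNaN) {
    // A NaN in either operand wins.
    if (std::isnan(a)) return a;
    if (std::isnan(b)) return b;
  } else if (policy == NanPolicy::PropagateNumbers) {
    // A NaN is ignored when the other operand is a number.
    if (std::isnan(a)) return b;
    if (std::isnan(b)) return a;
  }
  // Fast: plain comparison; the result for NaN inputs depends on operand order.
  return b < a ? b : a;
}
```

With `PropagateNaN` a single NaN poisons the result, while with `PropagateNumbers` the result is NaN only when both operands are NaN.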
### Impact: | |
- Enhances the flexibility and robustness of the Eigen library in handling special cases involving NaN, potentially improving the accuracy of computations in scientific applications. Discussions are underway regarding the possibility of backporting this functionality to version 3.4 to ensure broader compatibility." | |
691 (https://gitlab.com/libeigen/eigen/-/merge_requests/691),Fix -Wbitwise-instead-of-logical clang warning,Fixes #2353.,Nico,2021-10-22T05:50:17.301Z,NA,NA,"## Title: | |
Fix -Wbitwise-instead-of-logical clang warning | |
## Authors: | |
Nico | |
## Summary: | |
This merge request addresses and fixes a specific clang warning regarding the misuse of bitwise operators instead of logical operators within the Eigen C++ library. | |
### Key Changes: | |
- Resolved instances of bitwise operations being used in places where logical operations were intended. | |
### Improvements: | |
- Enhanced code clarity and correctness, ensuring that the intended logical operations are now explicitly defined. | |
### Impact: | |
- Reduces the potential for bugs related to operator misuse, improving overall code quality and maintainability." | |
693 (https://gitlab.com/libeigen/eigen/-/merge_requests/693),Included note on inner stride for compile-time vectors. Fixes #2355,"### Reference issue | |
#2355 | |
### What does this implement/fix? | |
Added note in the documentation on the `Stride` class for compile-time vectors, which always use the inner stride.",Lennart Steffen,2021-10-22T15:14:14.956Z,NA,NA,"## Title: | |
Included note on inner stride for compile-time vectors. Fixes #2355 | |
## Authors: | |
Lennart Steffen | |
## Summary: | |
This merge request enhances the documentation for the `Stride` class in the Eigen C++ library by clarifying that compile-time vectors always utilize the inner stride. | |
### Key Changes: | |
- Added a note in the documentation of the `Stride` class regarding inner stride behavior for compile-time vectors. | |
### Improvements: | |
- Improved clarity in documentation related to the handling of inner stride in compile-time vectors. | |
### Impact: | |
- Enhances user understanding and reduces potential confusion regarding the `Stride` class, leading to better utilization of the functionality provided by the Eigen library." | |
692 (https://gitlab.com/libeigen/eigen/-/merge_requests/692),Extend EIGEN_QT_SUPPORT to Qt6,"When building with Qt6, excludes the functions `inline Transform(const QMatrix& other);`, `inline Transform& operator=(const QMatrix& other);` and `inline QMatrix toQMatrix(void) const;` from `Transform.h`. | |
Fixes #2350",benardp,2021-10-23T23:43:07.383Z,NA,NA,"## Title: | |
Extend EIGEN_QT_SUPPORT to Qt6 | |
## Authors: | |
benardp | |
## Summary: | |
This merge request extends `EIGEN_QT_SUPPORT` in the Eigen C++ library to Qt6 by excluding the `QMatrix`-based functions in `Transform.h` when building with Qt6. | |
### Key Changes: | |
- Excluded the following functions when building with Qt6: | |
- `inline Transform(const QMatrix& other);` | |
- `inline Transform& operator=(const QMatrix& other);` | |
- `inline QMatrix toQMatrix(void) const;` | |
### Improvements: | |
- Enhances compatibility of the Eigen library with the latest version of Qt, ensuring users can leverage new features and improvements in Qt6. | |
### Impact: | |
- Allows Eigen users to build against Qt6 by excluding the `QMatrix`-based constructor, assignment operator, and `toQMatrix()` conversion, since Qt6 no longer provides `QMatrix`. This helps projects move from Qt5 to Qt6 while using Eigen." | |
688 (https://gitlab.com/libeigen/eigen/-/merge_requests/688),Add nan-propagation options to matrix and array plugins.,NA,Rasmus Munk Larsen,2021-10-25T19:11:00.256Z,NA,NA,"## Title: | |
Add nan-propagation options to matrix and array plugins. | |
## Authors: | |
Rasmus Munk Larsen | |
## Summary: | |
This merge request introduces options for handling NaN (Not a Number) values in the Eigen C++ library's matrix and array plugins, enhancing the way these data structures manage and propagate NaN values in computations. | |
### Key Changes: | |
- Implemented nan-propagation options in matrix and array plugins. | |
- Enhanced functions to control the behavior of NaN values during operations. | |
### Improvements: | |
- Allows for more robust handling of NaN cases in matrix and array computations. | |
- Provides users with customizable control over NaN propagation behavior, improving flexibility. | |
### Impact: | |
- Enhances the reliability and usability of the Eigen library when dealing with potentially invalid data. | |
- Facilitates better error handling and data integrity in numerical computations, reducing unexpected results from NaN values." | |
696 (https://gitlab.com/libeigen/eigen/-/merge_requests/696),Remove const from visitor return type.,"This seems to interfere with `pload`/`ploadu`, since `pload<const Packet**>` are not defined. | |
This should unbreak the arm/ppc builds.",Antonio Sánchez,2021-10-25T19:26:04.904Z,NA,NA,"## Title: | |
Remove const from visitor return type. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request removes the `const` qualifier from the visitor return type. The qualifier interfered with `pload`/`ploadu`, because `pload<const Packet**>` is not defined, and this broke the ARM and PPC builds. | |
### Key Changes: | |
- Removed `const` from the visitor return type. | |
### Improvements: | |
- Fixes compatibility issues with `pload` and `ploadu` functions. | |
### Impact: | |
- Resolves build failures on ARM and PPC platforms, enhancing cross-platform build reliability." | |
689 (https://gitlab.com/libeigen/eigen/-/merge_requests/689),Fix broadcasting oob error.,"For vectorized 1-dimensional inputs that do not take the special | |
blocking path (e.g. `std::complex<...>`), there was an | |
index-out-of-bounds error causing the broadcast size to be | |
computed incorrectly. Here we fix this, and make other minor | |
cleanup changes. | |
Fixes #2351.",Antonio Sánchez,2021-10-25T19:48:13.063Z,NA,NA,"## Title: | |
Fix broadcasting oob error. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses an index-out-of-bounds error in the Eigen C++ library related to the broadcasting of vectorized 1-dimensional inputs, specifically for types like `std::complex<...>` that do not utilize special blocking paths. The change corrects the computation of the broadcast size and includes minor cleanup updates. | |
### Key Changes: | |
- Fixed an index-out-of-bounds error for vectorized 1-dimensional inputs. | |
- Adjustments made for `std::complex<...>` types that avoid the special blocking path. | |
- Minor cleanup changes implemented. | |
### Improvements: | |
- Ensures accurate computation of broadcast sizes, enhancing the stability and reliability of the library when handling various input types. | |
### Impact: | |
This fix improves the robustness of the Eigen library by preventing errors during broadcasting operations, thereby enhancing user experience and reducing potential runtime failures in applications that rely on complex vectorized inputs." | |
698 (https://gitlab.com/libeigen/eigen/-/merge_requests/698),Ensure comma initializer reuses fixed dimensions,"### Reference issue | |
This relates to #2346. | |
### What does this implement/fix? | |
Ensures that the block in `CommaInitializer` is fixed-size if the input is, by replacing | |
``` | |
m_xpr.block(0, 0, other.rows(), other.cols()) = other; | |
``` | |
with | |
``` | |
m_xpr.template block<OtherDerived::RowsAtCompileTime, OtherDerived::ColsAtCompileTime>(0, 0, other.rows(), other.cols()) = other; | |
``` | |
### Additional information | |
No additional information :)",Stresspresso,2021-10-25T20:10:15.448Z,NA,NA,"## Title: | |
Ensure comma initializer reuses fixed dimensions | |
## Authors: | |
Stresspresso | |
## Summary: | |
This merge request changes `CommaInitializer` in the Eigen C++ library so that the block it assigns into is fixed-size whenever the input expression has fixed dimensions. Fixed-size inputs thus keep their compile-time dimensions instead of going through a dynamically sized block. | |
### Key Changes: | |
- Modified the block assignment in `CommaInitializer` so that it is fixed-size when the input is. | |
- Replaced the line: | |
```cpp | |
m_xpr.block(0, 0, other.rows(), other.cols()) = other; | |
``` | |
with: | |
```cpp | |
m_xpr.template block<OtherDerived::RowsAtCompileTime, OtherDerived::ColsAtCompileTime>(0, 0, other.rows(), other.cols()) = other; | |
``` | |
### Improvements: | |
- Improved initialization efficiency for fixed-size matrices by explicitly using template parameters. | |
- Enhanced code readability and clarity regarding matrix dimensions during initialization. | |
### Impact: | |
This change is expected to improve performance in scenarios where fixed dimensions are used, resulting in faster computations and reduced overhead associated with dynamic size resolution." | |
695 (https://gitlab.com/libeigen/eigen/-/merge_requests/695),test: fix boostmutiprec test to compile with older Boost versions,"test: fix boostmutiprec test to compile with older Boost versions | |
The Eigen boostmultiprec test redefines a symbol that is already defined inside Boost Math [1]. Boost has fixed it recently [2], but this patch avoids errors if the Boost version is older than 1.77. | |
[1] https://github.com/boostorg/math/blob/boost-1.76.0/include/boost/math/policies/policy.hpp#L18 | |
[2] https://github.com/boostorg/math/commit/68307123029676ba5cb316f8dd1d1c98d1fc7b23#diff-c7a8e5911c2e6be4138e1a966d762200f147792ac16ad96fdcc724313d11f839",Maxiwell S. Garcia,2021-10-25T20:48:18.344Z,NA,NA,"## Title: | |
Fix boostmultiprec test to compile with older Boost versions | |
## Authors: | |
Maxiwell S. Garcia | |
## Summary: | |
This merge request addresses a compilation issue in the Eigen boostmultiprec test when using older versions of the Boost library, particularly versions prior to 1.77. The patch ensures compatibility by avoiding symbol redefinition conflicts that arise due to changes in Boost Math. | |
### Key Changes: | |
- Modified the Eigen boostmultiprec test so that it no longer redefines a symbol that Boost Math already defines in versions before 1.77. | |
### Improvements: | |
- Increased compatibility of the Eigen library with older Boost versions, improving ease of integration for users who have not updated to the latest Boost. | |
### Impact: | |
- This change allows users relying on older versions of Boost to build and run the Eigen boostmultiprec test without encountering compilation errors, enhancing the library's usability across various environments." | |
681 (https://gitlab.com/libeigen/eigen/-/merge_requests/681),Avoid integer overflows in EigenMetaKernel indexing,"- The current implementation computes `size + total_threads`, which can overflow and cause `CUDA_ERROR_ILLEGAL_ADDRESS` when size is close to the maximum representable value. | |
- The `num_blocks` calculation can also overflow due to the implementation of `divup()`. | |
- This patch prevents these overflows and allows the kernel to work correctly for the full representable range of tensor sizes. | |
- Also adds relevant tests. | |
cc @nluehr",Ben Barsdell,2021-10-26T00:20:35.402Z,NA,NA,"## Title: | |
Avoid integer overflows in EigenMetaKernel indexing | |
## Authors: | |
Ben Barsdell | |
## Summary: | |
This merge request addresses potential integer overflows in the EigenMetaKernel indexing calculations. It focuses on ensuring that the kernel functions correctly across the full range of tensor sizes by modifying how size-related calculations are performed. | |
### Key Changes: | |
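- Reworked the kernel indexing so that it no longer relies on `size + total_threads`, which can overflow and cause `CUDA_ERROR_ILLEGAL_ADDRESS` when `size` is close to the maximum representable value. | |
- Prevented the overflow in the `num_blocks` calculation, which stemmed from the implementation of `divup()`. | |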
- Added relevant tests to validate the new implementation. | |
### Improvements: | |
- Enhanced the robustness of the kernel by preventing CUDA errors associated with illegal addresses resulting from integer overflows. | |
- Increased the reliability of tensor size handling, especially near the maximum representable values. | |
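The kind of overflow involved in a ceiling division can be sketched as below. Both functions are hypothetical illustrations, not Eigen's `divup()`; the sketch assumes unsigned 32-bit operands and a nonzero divisor.

```cpp
#include <cstdint>

// Naive ceiling division: x + y - 1 wraps around when x is close to the
// maximum representable value, so the quotient comes out far too small.
inline std::uint32_t divup_naive(std::uint32_t x, std::uint32_t y) {
  return (x + y - 1) / y;
}

// Ceiling division for y > 0 that never forms a value larger than x.
inline std::uint32_t divup_safe(std::uint32_t x, std::uint32_t y) {
  return x == 0 ? 0 : (x - 1) / y + 1;
}
```

For `x = 4294967295` and `y = 1024`, the naive form wraps around and returns 0, whereas the safe form returns 4194304.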
### Impact: | |
This patch ensures the kernel operates correctly with larger tensor sizes, enhancing the overall stability and functionality of the Eigen C++ library in CUDA environments." | |
701 (https://gitlab.com/libeigen/eigen/-/merge_requests/701),ZVector: Move alignas qualifier to come first,"We currently have plenty of type definitions with the alignment | |
qualifier coming after the type. The compiler warns about ignoring | |
them: | |
int EIGEN_ALIGN16 ai[4]; | |
Turn this into: | |
EIGEN_ALIGN16 int ai[4];",Andreas Krebbel,2021-10-26T16:54:18.888Z,NA,NA,"## Title: | |
ZVector: Move alignas qualifier to come first | |
## Authors: | |
Andreas Krebbel | |
## Summary: | |
This merge request involves updating the alignment qualifier in type definitions within the Eigen C++ library. The change is aimed at enhancing consistency and addressing compiler warnings related to ignored alignment qualifiers. | |
### Key Changes: | |
- The alignment qualifier (EIGEN_ALIGN16) is now placed before the type declaration (e.g., `EIGEN_ALIGN16 int ai[4];` instead of `int EIGEN_ALIGN16 ai[4];`). | |
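A minimal sketch of the placement rule, assuming a stand-in macro `DEMO_ALIGN16` that expands to the standard `alignas(16)`. The real `EIGEN_ALIGN16` expansion depends on the compiler, so this is an illustration rather than Eigen's definition; `is_aligned16` is a hypothetical helper.

```cpp
#include <cstdint>

// Stand-in for EIGEN_ALIGN16 (assumption: expands to standard alignas).
#define DEMO_ALIGN16 alignas(16)

// Qualifier first: it applies to the declared array.
DEMO_ALIGN16 int ai[4] = {0, 1, 2, 3};

// The qualifier-after-type form (int DEMO_ALIGN16 ai[4];) is the one the
// compiler warned about, so it is deliberately not compiled here.

// Hypothetical helper: checks that an address is a multiple of 16 bytes.
inline bool is_aligned16(const void* p) {
  return reinterpret_cast<std::uintptr_t>(p) % 16 == 0;
}
```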
### Improvements: | |
- Resolves compiler warnings by ensuring the alignment qualifier is recognized properly. | |
- Promotes consistent coding practices regarding type definitions across the library. | |
### Impact: | |
- Enhances code readability and maintainability. | |
- Ensures the requested alignment is applied rather than silently ignored by the compiler." | |
700 (https://gitlab.com/libeigen/eigen/-/merge_requests/700),Vectorize fp16 tanh and logistic functions on Neon,Adds vectorization to the current implementation of the tanh and logistic functions when they run on Neon.,Alex Druinsky,2021-10-27T16:24:58.672Z,NA,NA,"## Title: | |
Vectorize fp16 tanh and logistic functions on Neon | |
## Authors: | |
Alex Druinsky | |
## Summary: | |
This merge request introduces vectorization for the tanh and logistic functions specifically optimized for fp16 precision on Neon architectures, enhancing computational efficiency. | |
### Key Changes: | |
- Implementation of vectorized versions of the tanh and logistic functions for fp16 data types. | |
- Optimized code for better performance on Neon platforms. | |
### Improvements: | |
- Enhanced performance of tanh and logistic function calculations, leading to faster execution times. | |
- Improved support for fp16 computations, which is increasingly relevant for machine learning applications. | |
### Impact: | |
The vectorized paths are expected to speed up fp16 tanh and logistic evaluation on Neon; the merge request description does not report benchmark figures." | |
694 (https://gitlab.com/libeigen/eigen/-/merge_requests/694),Fix ZVector build.,"Cross-compiled via `s390x-linux-gnu-g++`, run via qemu. This allows the packetmath tests to pass.",Antonio Sánchez,2021-10-27T16:50:52.175Z,NA,NA,"## Title: | |
Fix ZVector build. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses issues with the ZVector build when cross-compiling using `s390x-linux-gnu-g++`. The changes enable the successful execution of packetmath tests under QEMU. | |
### Key Changes: | |
- Resolved build issues related to ZVector for the `s390x` architecture. | |
- Enabled packetmath tests to pass in a QEMU environment. | |
### Improvements: | |
- Improved compatibility for cross-compilation on the `s390x` architecture. | |
- Enhanced test reliability by ensuring packetmath tests can run successfully. | |
### Impact: | |
This fix allows developers to compile and test the Eigen library on the `s390x` platform, facilitating broader usage and ensuring robustness in diverse environments." | |
534 (https://gitlab.com/libeigen/eigen/-/merge_requests/534),Preliminary HIP bfloat16 GPU support.,"The purpose of this MR is to deliver bfloat16 data type support for AMD GPUs and the HIP software stack. | |
The changes captured herein are a work in progress in the sense that basic functionality is provided. | |
Performance optimizations and similar changes will be forthcoming. | |
The existing bfloat16 header has been enhanced with support for HIP. | |
Also, a GPU specific bfloat16 unit test suite has been introduced.",Rohit Santhanam,2021-10-27T18:36:46.828Z,NA,NA,"## Title: | |
Preliminary HIP bfloat16 GPU support. | |
## Authors: | |
Rohit Santhanam | |
## Summary: | |
This merge request introduces preliminary support for the bfloat16 data type specifically tailored for AMD GPUs using the HIP software stack. It lays the groundwork for future enhancements, including performance optimizations. | |
### Key Changes: | |
- Enhanced the existing bfloat16 header to include support for HIP. | |
- Introduced a GPU-specific bfloat16 unit test suite. | |
### Improvements: | |
- Provides basic functionality for the bfloat16 data type on AMD GPUs. | |
- Establishes a foundation for future performance improvements. | |
### Impact: | |
This implementation opens up pathways for using bfloat16 on AMD GPUs, potentially improving computing performance in applications utilizing this data type, with further optimizations expected in subsequent updates." | |
697 (https://gitlab.com/libeigen/eigen/-/merge_requests/697),optimize cmake scripts for subproject use,"<!-- | |
Thanks for contributing a merge request! Please name and fully describe your MR as you would for a commit message. | |
If the MR fixes an issue, please include ""Fixes #issue"" in the commit message and the MR description. | |
In addition, we recommend that first-time contributors read our [contribution guidelines](https://eigen.tuxfamily.org/index.php?title=Contributing_to_Eigen) and [git page](https://eigen.tuxfamily.org/index.php?title=Git), which will help you submit a more standardized MR. | |
Note that we are a team of volunteers; we appreciate your patience during the review process. | |
Again, thanks for contributing! --> | |
### Reference issue | |
Fixes #2347 | |
### What does this implement/fix? | |
With that, the support for subprojects is improved. E.g. no tests will be built by default. | |
### Additional information | |
The use of CMAKE_CXX_FLAGS for tests should be replaced by a test_interface target",Fabian Keßler,2021-10-28T15:04:48.377Z,NA,NA,"## Title: | |
Optimize CMake scripts for subproject use | |
## Authors: | |
Fabian Keßler | |
## Summary: | |
This merge request enhances the CMake scripts for the Eigen C++ library, specifically focusing on improving support for subprojects. | |
### Key Changes: | |
- Removal of default test builds when integrating Eigen as a subproject. | |
- Plan to replace the use of CMAKE_CXX_FLAGS for tests with a dedicated test_interface target. | |
### Improvements: | |
- Streamlined integration of Eigen as a subproject, making it more user-friendly for developers working with composite projects. | |
### Impact: | |
- Reduces unnecessary build overhead by not compiling tests by default, improving efficiency for users integrating Eigen into larger systems." | |
703 (https://gitlab.com/libeigen/eigen/-/merge_requests/703),"Fix min/max nan-propagation for scalar ""other"".","Copied input type from `EIGEN_MAKE_CWISE_BINARY_OP`. | |
Fixes #2362.",Antonio Sánchez,2021-10-28T16:47:50.777Z,NA,NA,"## Title: | |
Fix min/max nan-propagation for scalar ""other"". | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses the issue of NaN propagation in the min and max functions when a scalar ""other"" is involved. By copying the input type from `EIGEN_MAKE_CWISE_BINARY_OP`, the functionality is improved to handle NaN values correctly. | |
### Key Changes: | |
- Adjusted the implementation of min/max functions to ensure proper NaN propagation when dealing with scalar inputs. | |
### Improvements: | |
- Enhanced robustness of the min/max functions with scalar ""other"" inputs, enabling more predictable behavior in mathematical operations. | |
### Impact: | |
This change fixes issue #2362, leading to improved correctness in mathematical operations involving potential NaN values, therefore enhancing the reliability of the Eigen library for users." | |
702 (https://gitlab.com/libeigen/eigen/-/merge_requests/702),Add AVX vector path to float2half/half2float,"Add AVX vector path to float2half/half2float | |
Makes e. g. matrix multiplication 3x faster: | |
name old cpu/op new cpu/op delta | |
BM_convers 181ms ± 1% 62ms ± 9% -65.82% (p=0.016 n=4+5) | |
Direct translation of the scalar code from half_to_float and float_to_half_rtne (Eigen/src/Core/arch/Default/Half.h). | |
Tested on all possible input values (not adding those, since they take a long time, especially in debug build).",Ilya Tokar,2021-10-28T21:04:41.543Z,NA,NA,"## Title: | |
Add AVX vector path to float2half/half2float | |
## Authors: | |
Ilya Tokar | |
## Summary: | |
This merge request introduces an AVX vectorized implementation for the float2half and half2float conversion functions in the Eigen C++ library. The new implementation offers significant performance improvements for matrix operations involving half-precision floats. | |
### Key Changes: | |
- Added an AVX vector path for the `float2half` and `half2float` functions, which leverages SIMD instructions for enhanced performance. | |
- Directly translated code from existing scalar versions, ensuring consistency with previous implementations. | |
### Improvements: | |
- The description reports that workloads such as matrix multiplication become about 3x faster; the quoted benchmark (`BM_convers`) drops from 181ms to 62ms of CPU time per operation (-65.82%). | |
### Impact: | |
This enhancement will lead to better performance in applications that utilize half-precision floating-point numbers, particularly in matrix operations, making computations faster and more efficient in the Eigen library." | |
680 (https://gitlab.com/libeigen/eigen/-/merge_requests/680),Invert rows and depth in non-vectorized portion of packing (PowerPC).,"Invert rows and depth in non-vectorized portion of packing for RHS (PowerPC). | |
This shows up as bad results in the following: | |
``` | |
export EIGEN_SEED=1629216664 | |
test/product_syrk_3 | |
test/product_mmtr_3 | |
``` | |
The previous packing did NOT allow us to know the correct end of a row in some cases and it would pickup incorrect values from the wrong locations. | |
In the process of fixing this, I simplified the code and added performance improvements (extra rows are now 5X faster and overall 10% gains).",Chip Kerchner,2021-10-28T21:59:41.561Z,NA,NA,"## Title: | |
Invert rows and depth in non-vectorized portion of packing (PowerPC). | |
## Authors: | |
Chip Kerchner | |
## Summary: | |
This merge request addresses issues with incorrect results in the non-vectorized packing portion for PowerPC, specifically for the RHS. | |
### Key Changes: | |
- Inverted rows and depth in the non-vectorized portion of RHS packing; the previous packing could not always determine the correct end of a row and picked up values from the wrong locations. | |
- Simplified the code structure during the fix. | |
### Improvements: | |
- Enhanced the performance of packing: processing extra rows is now 5X faster, contributing to an overall performance gain of 10%. | |
### Impact: | |
This fix addresses the bad results seen in `test/product_syrk_3` and `test/product_mmtr_3` (reproducible with `EIGEN_SEED=1629216664`) and makes RHS packing faster on PowerPC." | |
705 (https://gitlab.com/libeigen/eigen/-/merge_requests/705),Fix TensorReduction warnings and error bound for sum accuracy test.,"The sum accuracy test currently uses the default test precision for | |
the given scalar type. However, scalars are generated via a normal | |
distribution, and given a large enough count and strong enough random | |
generator, the expected sum is zero. This causes the test to | |
periodically fail. | |
Here we estimate an upper-bound for the error as `N * prec` for | |
summing N values, with each having an approximate epsilon of `prec`. | |
In practice, this is much larger than it probably needs to be, since errors | |
are generally both positive and negative. | |
Also fixed a few warnings generated by MSVC when compiling the | |
reduction test.",Antonio Sánchez,2021-11-01T17:03:50.849Z,NA,NA,"## Title: | |
Fix TensorReduction warnings and error bound for sum accuracy test. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses issues in the sum accuracy test within the Eigen C++ library's Tensor module. It adjusts the error bound calculation to improve test reliability and resolves compilation warnings. | |
### Key Changes: | |
- Redefined the error bound for the sum accuracy test to be `N * prec`, where `N` is the number of values summed and `prec` is the test precision. | |
- Fixed warnings produced by MSVC during the compilation of the reduction test. | |
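The `N * prec` bound described above can be sketched as follows. This is a simplified illustration, not the Eigen test itself: the helper name `sum_within_bound` and the use of a double-precision reference sum are assumptions. An absolute bound is used because a relative comparison is ill-suited when the expected sum is close to zero.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: sums the values in single precision and checks the
// result against a double-precision reference, allowing an absolute error
// of N * prec, where N is the number of summed values.
inline bool sum_within_bound(const std::vector<float>& values, float prec) {
  float sum = 0.0f;
  double reference = 0.0;
  for (float x : values) {
    sum += x;
    reference += static_cast<double>(x);
  }
  const double bound = static_cast<double>(values.size()) * prec;
  return std::abs(static_cast<double>(sum) - reference) <= bound;
}
```

As the description notes, this bound is larger than it usually needs to be, since rounding errors tend to have both signs and partially cancel.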
### Improvements: | |
- Enhanced the reliability of the sum accuracy test, reducing the likelihood of false failures when summing values generated from a normal distribution. | |
- Cleaned up the code by resolving MSVC warnings, contributing to better maintainability. | |
### Impact: | |
The changes improve the consistency and accuracy of the sum accuracy test, thereby increasing confidence in the Tensor module's functionality. They also enhance code quality by addressing compiler warnings, which can prevent future compilation issues." | |
704 (https://gitlab.com/libeigen/eigen/-/merge_requests/704),"Remove bad ""take"" impl that causes g++-11 crash.","For some reason, having `take<n, numeric_list<T>>` for `n > 0` causes | |
g++-11 to ICE with | |
``` | |
sorry, unimplemented: unexpected AST of kind nontype_argument_pack | |
``` | |
It does work with other versions of gcc, and with clang. | |
I filed a GCC bug | |
[here](https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102999). | |
Technically we should never actually run into this case, since you | |
can't take n > 0 elements from an empty list. Commenting it out | |
allows our Eigen tests to pass",Antonio Sánchez,2021-11-01T17:20:39.950Z,NA,NA,"## Title: | |
Remove bad ""take"" impl that causes g++-11 crash. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses a critical issue in the Eigen C++ library related to the `take<n, numeric_list<T>>` implementation when `n > 0`. The specific problem leads to a crash in g++-11 due to an internal compiler error. | |
### Key Changes: | |
- Commented out the `take<n, numeric_list<T>>` specialization for `n > 0` (taking elements from an empty list, a case that should never occur in practice) so that g++-11 no longer hits the internal compiler error. | |
### Improvements: | |
- The removal of this implementation allows Eigen's tests to pass successfully when compiled with g++-11, ensuring improved compatibility for users relying on this compiler version. | |
### Impact: | |
- This change enhances the stability of Eigen when used with g++-11, mitigating the risk of compiler crashes and contributing to better overall reliability of the library in various development environments." | |
707 (https://gitlab.com/libeigen/eigen/-/merge_requests/707),"Fix total deflation issue in BDCSVD, when & only when M is already diagonal.","### Reference issue | |
- #1980 | |
- #2174 | |
### What does this implement/fix? | |
1. Add more unit tests to BDCSVD to capture the total deflation issue. | |
2. Fix total deflation being triggered when it is not supposed to be.",Xinle Liu,2021-11-02T16:53:55.746Z,NA,NA,"## Title: | |
Fix total deflation issue in BDCSVD, when & only when M is already diagonal. | |
## Authors: | |
Xinle Liu | |
## Summary: | |
This merge request fixes the total deflation logic in the BDCSVD implementation of the Eigen C++ library so that total deflation is triggered when, and only when, the matrix M is already diagonal. | |
### Key Changes: | |
1. Added more unit tests to the BDCSVD framework to specifically target the total deflation issue. | |
2. Implemented a fix to prevent total deflation from being triggered incorrectly. | |
### Improvements: | |
The added unit tests cover the cases in which total deflation was previously triggered incorrectly, guarding against regressions. | |
### Impact: | |
The fix prevents total deflation from being applied to matrices M that are not diagonal, which could lead to incorrect decomposition results (see issues #1980 and #2174)." | |
709 (https://gitlab.com/libeigen/eigen/-/merge_requests/709),"Fix BDCSVD's total deflation in branch 3.4, similar to that of master in MR 707.","### Reference issue | |
* #1980 | |
* #2174 | |
* Cherry pick for !707 into branch `3.4` | |
### What does this implement/fix? | |
Fix BDCSVD's total deflation logic, to be triggered when and only when M is diagonal.",Xinle Liu,2021-11-03T18:16:04.555Z,NA,NA,"## Title: | |
Fix BDCSVD's total deflation in branch 3.4, similar to that of master in MR 707. | |
## Authors: | |
Xinle Liu | |
## Summary: | |
This merge request addresses an issue in the BDCSVD implementation by correcting the logic for total deflation, ensuring that it activates only when the matrix \( M \) is diagonal. | |
### Key Changes: | |
- Adjusted the total deflation logic within the BDCSVD algorithm. | |
- The change aligns the behavior in branch 3.4 with that of the master branch as detailed in merge request !707. | |
### Improvements: | |
- Total deflation is no longer triggered for matrices that are not diagonal. | |
- Reduces unexpected results in computations where total deflation was applied incorrectly. | |
### Impact: | |
- Users of BDCSVD on the 3.4 branch get the same corrected total deflation behavior as on master." | |
714 (https://gitlab.com/libeigen/eigen/-/merge_requests/714),nestbyvalue test: fix uninitialized matrix,"- Doing computation with uninitialized (zero-ed ? but thanks Linux) matrix, or | |
worse NaN on other non-linux systems. | |
- This commit fixes it by initializing to Random(). | |
Note: can we have this cherry-picked into 3.4 too, please ?",Minh Quan Ho,2021-11-04T16:19:21.437Z,NA,NA,"## Title: | |
nestbyvalue test: fix uninitialized matrix | |
## Authors: | |
Minh Quan Ho | |
## Summary: | |
This merge request fixes the nestbyvalue test, which ran its computation on an uninitialized matrix. On Linux the memory happened to be zeroed, but on other systems it could contain NaN. | |
### Key Changes: | |
- The matrix used by the test is now initialized with `Random()`. | |
### Improvements: | |
- The test no longer depends on what the operating system leaves in uninitialized memory. | |
### Impact: | |
- Gives the nestbyvalue test well-defined inputs on every platform; the author also asked for the fix to be cherry-picked into the 3.4 branch." | |
712 (https://gitlab.com/libeigen/eigen/-/merge_requests/712),Documentation of Quaternion constructor from MatrixBase (fixes #2368),"Added documentation to clarify that the Quaternion constructor from MatrixBase assumes the matrix is in the order qx, qy, qz, qw. | |
Fixes #2368.",Gilad Barach,2021-11-04T16:38:08.734Z,NA,NA,"## Title: | |
Documentation of Quaternion constructor from MatrixBase (fixes #2368) | |
## Authors: | |
Gilad Barach | |
## Summary: | |
This merge request enhances the documentation related to the Quaternion constructor that accepts a MatrixBase. It clarifies the assumption regarding the order of elements in the matrix. | |
### Key Changes: | |
- Added documentation specifying that the Quaternion constructor expects the matrix elements to be ordered as qx, qy, qz, qw. | |
### Improvements: | |
- Improved clarity on the usage of the Quaternion constructor, helping users understand the requirements for matrix input. | |
### Impact: | |
- Reduces potential confusion for users implementing the Quaternion constructor, leading to fewer errors and improved usage of the library." | |
711 (https://gitlab.com/libeigen/eigen/-/merge_requests/711),Bug Fix: correct the bug that won't define EIGEN_HAS_FP16_C,"### Reference issue | |
### What does this implement/fix? | |
This patch has fixed a bug of Eigen: | |
If we use a compiler which is not clang, `EIGEN_COMP_CLANG` is defined as 0, then `(!defined(EIGEN_COMP_CLANG) || EIGEN_COMP_CLANG>=380))` is always false. `EIGEN_HAS_FP16_C` will be never defined. | |
### Additional information | |
<!--Any additional information you think is important.-->",Gengxin Xie,2021-11-04T22:29:42.448Z,NA,NA,"## Title: | |
Bug Fix: correct the bug that won't define EIGEN_HAS_FP16_C | |
## Authors: | |
Gengxin Xie | |
## Summary: | |
This merge request addresses a bug in the Eigen C++ library related to the incorrect definition of the macro `EIGEN_HAS_FP16_C`. The issue occurs when a non-Clang compiler is used, leading to persistent incorrect evaluations in macro conditionals. | |
### Key Changes: | |
- Fixed the logic related to the definition of `EIGEN_HAS_FP16_C` to ensure it is correctly defined when using non-Clang compilers. | |
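The preprocessor pitfall can be reproduced in isolation. In the sketch below, `DEMO_COMP_CLANG` is a hypothetical stand-in for `EIGEN_COMP_CLANG` on a non-Clang compiler, where it is defined as 0. The value-based condition is only one possible way to repair the check and is not necessarily the change made in this merge request.

```cpp
// Hypothetical stand-in: always defined, with value 0 on non-Clang compilers.
#define DEMO_COMP_CLANG 0

// Buggy pattern: the macro is always defined, so !defined(...) is false,
// and 0 >= 380 is false as well. The branch is never taken.
#if (!defined(DEMO_COMP_CLANG) || DEMO_COMP_CLANG >= 380)
#define DEMO_OLD_CHECK 1
#else
#define DEMO_OLD_CHECK 0
#endif

// One possible repair (assumption): test the value instead of definedness.
#if (!DEMO_COMP_CLANG || DEMO_COMP_CLANG >= 380)
#define DEMO_NEW_CHECK 1
#else
#define DEMO_NEW_CHECK 0
#endif

constexpr bool old_check_passes = (DEMO_OLD_CHECK == 1);
constexpr bool new_check_passes = (DEMO_NEW_CHECK == 1);
```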
### Improvements: | |
- The bug fix enhances compatibility with various compilers, ensuring that `EIGEN_HAS_FP16_C` can be defined appropriately. | |
### Impact: | |
- This improvement allows users of Eigen to have accurate macro definitions in their projects, potentially leading to better compiler behavior and greater overall reliability when using features that depend on `EIGEN_HAS_FP16_C`." | |
715 (https://gitlab.com/libeigen/eigen/-/merge_requests/715),Fix failing test for tensor reduction.,Compare summation results against forward error bound.,Rasmus Munk Larsen,2021-11-05T01:25:46.968Z,NA,NA,"## Title: | |
Fix failing test for tensor reduction. | |
## Authors: | |
Rasmus Munk Larsen | |
## Summary: | |
This merge request addresses an issue with a failing test related to tensor reduction functionality in the Eigen C++ library. The fix involves comparing the summation results against a forward error bound to ensure accuracy. | |
### Key Changes: | |
- Correction made to the tensor reduction test to include comparison with forward error bounds. | |
### Improvements: | |
- Makes the tensor reduction test more reliable by checking the computed sums against a forward error bound. | |
### Impact: | |
- Removes a source of spurious failures in the tensor reduction test." | |
713 (https://gitlab.com/libeigen/eigen/-/merge_requests/713),Avoid integer overflow in EigenMetaKernel indexing (v2),"This is a re-submission of https://gitlab.com/libeigen/eigen/-/merge_requests/681, which was reverted due to build issues on Windows. | |
This version has two changes compared to the previous version: | |
- It doesn't use inline PTX, so there shouldn't be any build issues on Windows. | |
- It only uses saturated addition in each loop iteration when overflow is possible (i.e., when the size is within total_threads of the max representable index). When overflow is not possible, regular addition is used. | |
Summary of changes: | |
- The current implementation computes `size + total_threads`, which can | |
overflow and cause CUDA_ERROR_ILLEGAL_ADDRESS when size is close to | |
the maximum representable value. | |
- The num_blocks calculation can also overflow due to the implementation | |
of divup(). | |
- This patch prevents these overflows and allows the kernel to work | |
correctly for the full representable range of tensor sizes. | |
- Also adds relevant tests. | |
cc @nluehr",Ben Barsdell,2021-11-05T18:28:54.257Z,NA,NA,"## Title: | |
Avoid integer overflow in EigenMetaKernel indexing (v2) | |
## Authors: | |
Ben Barsdell | |
## Summary: | |
This merge request is a re-submission aimed at preventing integer overflow issues in the EigenMetaKernel indexing implementation, specifically addressing challenges encountered on Windows builds. | |
### Key Changes: | |
- Removal of inline PTX usage to eliminate prior build issues on Windows. | |
- Introduction of saturated addition in loop iterations only when overflow is possible (that is, when the size is within `total_threads` of the maximum representable index); regular addition is used otherwise. | |
### Improvements: | |
- The implementation now ensures safe computation regarding the sum of `size + total_threads`, preventing potential overflow that could lead to `CUDA_ERROR_ILLEGAL_ADDRESS`. | |
- Modifications also address the risk of overflow in the `num_blocks` calculation function `divup()`. | |
- Inclusion of relevant tests to validate the updated implementation. | |
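The saturated-addition idea can be sketched on the host as follows. This is an illustration under assumptions, not the kernel code: `saturated_add` and `count_visited` are hypothetical names, operands are assumed non-negative, and the loop mimics a single thread of a strided loop.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: adds two non-negative values, clamping at the
// maximum representable value instead of wrapping around.
template <typename Index>
constexpr Index saturated_add(Index a, Index b) {
  return (a > std::numeric_limits<Index>::max() - b)
             ? std::numeric_limits<Index>::max()
             : static_cast<Index>(a + b);
}

// Counts the indices one thread would visit when starting at first and
// advancing by step until reaching size. With plain addition the index
// could wrap near the maximum; with saturation the loop terminates.
template <typename Index>
Index count_visited(Index first, Index step, Index size) {
  Index visited = 0;
  for (Index i = first; i < size; i = saturated_add(i, step)) {
    ++visited;
  }
  return visited;
}
```

The merge request applies saturation only when the size is within `total_threads` of the maximum index; the sketch saturates unconditionally for brevity.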
### Impact: | |
These changes enhance the reliability and correctness of the kernel's functionality across the full range of tensor sizes, ensuring it operates correctly without triggering overflow errors." | |
121 (https://gitlab.com/libeigen/eigen/-/merge_requests/121),Added a make format command,"Using `make format` formats the whole source code according to a .clang-format file, which specifies the exact layout. | |
This makes it possible to check later whether new merge requests follow these guidelines. | |
The copyright on the FindCLANG_FORMAT can still be changed. : Done",Jens Wehner,2021-11-10T17:35:21.520Z,infrastructure::build system,NA,"## Title: | |
Added a make format command | |
## Authors: | |
Jens Wehner | |
## Summary: | |
This merge request introduces a new `make format` command to the Eigen C++ library, enabling automatic formatting of the source code according to specified guidelines in a `.clang-format` file. | |
### Key Changes: | |
- Implemented the `make format` command to format the codebase. | |
### Improvements: | |
- Enforces consistent code formatting across the project, ensuring all contributors adhere to the same styling guidelines. | |
### Impact: | |
- Facilitates the review process for new Merge Requests by allowing checks against formatting standards, thereby improving code quality and maintainability." | |
720 (https://gitlab.com/libeigen/eigen/-/merge_requests/720),fix a typo,what the title says,Erik Schultheis,2021-11-15T03:42:11.454Z,NA,NA,"## Title: | |
fix a typo | |
## Authors: | |
Erik Schultheis | |
## Summary: | |
This merge request fixes a typo. The description gives no detail beyond the title. | |
### Key Changes: | |
- Corrected a typo. | |
### Improvements: | |
- Not stated in the merge request. | |
### Impact: | |
- Not stated in the merge request." | |
716 (https://gitlab.com/libeigen/eigen/-/merge_requests/716),Convert diag pragmas to nv_diag take 2.,This is a re-submission of MR https://gitlab.com/libeigen/eigen/-/merge_requests/670.,Nathan Luehr,2021-11-15T19:01:01.372Z,NA,NA,"## Title: | |
Convert diag pragmas to nv_diag take 2. | |
## Authors: | |
Nathan Luehr | |
## Summary: | |
This merge request is a re-submission of MR 670. It converts diagnostic pragmas used within the Eigen C++ library to their `nv_diag` form. | |
### Key Changes: | |
- Converted diag pragmas to the `nv_diag` form. | |
### Improvements: | |
- Not stated in the merge request; the description only identifies it as a re-submission. | |
### Impact: | |
- Not stated in the merge request." | |
717 (https://gitlab.com/libeigen/eigen/-/merge_requests/717),moved pruning code to SparseVector.h,"I'm planning to start working on the storage code for sparse matrices. There are a few feature requests for this, see below. | |
This MR does not yet address any of those, but instead moves the prune function, which is only used (and useful) for sparse vectors from the storage implementation to the sparse vector implementation. As this is not related to how the data is stored, this shouldn't (imo) be part of the CompressedStorage class anyway. | |
So this can be seen as a first cleanup step. | |
### Reference issue | |
https://gitlab.com/libeigen/eigen/-/issues/2371 | |
https://gitlab.com/libeigen/eigen/-/issues/2238 | |
https://gitlab.com/libeigen/eigen/-/issues/2207 | |
https://gitlab.com/libeigen/eigen/-/issues/1729",Erik Schultheis,2021-11-15T22:16:03.142Z,NA,NA,"## Title: | |
Moved Pruning Code to SparseVector.h | |
## Authors: | |
Erik Schultheis | |
## Summary: | |
This merge request facilitates the initial steps towards improving the sparse matrix storage code by relocating the prune function from the CompressedStorage class to the SparseVector implementation. | |
### Key Changes: | |
- The prune function has been moved specifically to the SparseVector.h file, enhancing the organization of code related to sparse vectors. | |
### Improvements: | |
- The relocation simplifies the code structure by removing functionality from the CompressedStorage class that is not directly related to data storage, preparing for future enhancements in sparse matrix functionality. | |
### Impact: | |
- This cleanup step lays the groundwork for addressing existing feature requests concerning sparse matrices, potentially improving the efficiency and clarity of the library's sparse vector handling in future updates." | |
718 (https://gitlab.com/libeigen/eigen/-/merge_requests/718),use consistent `StorageIndex`,"`SparseMatrix::Map` and `SparseMatrix::TransposedSparseMatrix` are defined in `SparseMatrix`, but they always use the default `StorageIndex`. | |
Now they use the same `StorageIndex` as the `SparseMatrix` object.",Erik Schultheis,2021-11-15T22:35:41.734Z,NA,NA,"## Title: | |
Use consistent `StorageIndex` | |
## Authors: | |
Erik Schultheis | |
## Summary: | |
This merge request updates the `SparseMatrix::Map` and `SparseMatrix::TransposedSparseMatrix` classes to utilize the same `StorageIndex` as the `SparseMatrix` object, replacing the previous default setting. | |
### Key Changes: | |
- Adjusted `SparseMatrix::Map` and `SparseMatrix::TransposedSparseMatrix` to inherit the `StorageIndex` from the `SparseMatrix` object. | |
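A minimal sketch of the idea, using a hypothetical `MiniSparse` template rather than Eigen's actual classes: a nested alias that omits the index parameter silently falls back to the default, while one that forwards it stays consistent with the enclosing type. All names below are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical stand-in for a sparse matrix template; not Eigen's real class.
template <typename Scalar, typename StorageIndex_ = int>
struct MiniSparse {
  using StorageIndex = StorageIndex_;

  // Before: the nested alias drops the index type and gets the default (int).
  using TransposedDefault = MiniSparse<Scalar>;

  // After: the nested alias forwards the enclosing StorageIndex.
  using TransposedConsistent = MiniSparse<Scalar, StorageIndex_>;
};

using Wide = MiniSparse<double, std::int64_t>;

// True when the nested type uses the same index type as Wide itself.
constexpr bool default_alias_matches =
    std::is_same<Wide::TransposedDefault::StorageIndex, Wide::StorageIndex>::value;
constexpr bool consistent_alias_matches =
    std::is_same<Wide::TransposedConsistent::StorageIndex, Wide::StorageIndex>::value;

// Runtime check of both properties; aborts if the sketch is wrong.
inline void check_aliases() {
  assert(!default_alias_matches);
  assert(consistent_alias_matches);
}
```

With a 64-bit index on the outer type, only the forwarding alias keeps the 64-bit index; the defaulted alias narrows it to `int`.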
### Improvements: | |
- Enhanced consistency across the `SparseMatrix` implementations by ensuring that all related classes share the same `StorageIndex`. | |
### Impact: | |
- This change improves the overall cohesion and usability of the `SparseMatrix` functionality, potentially reducing errors related to `StorageIndex` discrepancies." | |
327 (https://gitlab.com/libeigen/eigen/-/merge_requests/327),Reimplemented the Tensor stream output.,"The implementation structure is based on the IO for Eigen::Matrix. | |
Predefined formats in struct Eigen::TensorIOFormat are defined for numpy-like output, native output, plain output and legacy output for backwards compatibility. | |
Documentation and tests are added.",cpp977,2021-11-16T17:36:59.423Z,MR to reopen,NA,"## Title: | |
Reimplemented the Tensor stream output. | |
## Authors: | |
cpp977 | |
## Summary: | |
This merge request reimplements the Tensor stream output functionality in the Eigen C++ library. The new implementation is structured similarly to Eigen::Matrix IO, enhancing consistency across the library. | |
### Key Changes: | |
- The introduction of predefined formats in the struct Eigen::TensorIOFormat for various output styles: numpy-like output, native output, plain output, and legacy output for backward compatibility. | |
- Addition of comprehensive documentation and test cases to ensure reliability. | |
### Improvements: | |
- Enhances the output flexibility of the Tensor data type by providing multiple predefined output formats. | |
- Improves code consistency and maintainability by aligning Tensor IO with Matrix IO functionality. | |
### Impact: | |
This update enhances user experience by accommodating different formatting needs while maintaining compatibility, ensuring smoother transitions for users upgrading from older versions." | |
723 (https://gitlab.com/libeigen/eigen/-/merge_requests/723),Fix tensor broadcast off-by-one error.,Caught by JAX unit tests. Triggered if broadcast size is smaller than packet size.,Antonio Sánchez,2021-11-16T17:53:58.713Z,NA,NA,"## Title: | |
Fix tensor broadcast off-by-one error. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses an off-by-one error in tensor broadcasting within the Eigen C++ library. The issue was identified through JAX unit tests and occurs when the broadcast size is smaller than the packet size. | |
### Key Changes: | |
- Fixed an off-by-one error in tensor broadcasting logic when the broadcast size is smaller than the packet size. | |
### Improvements: | |
- Enhances the robustness of tensor operations in the Eigen library by preventing incorrect broadcasting behavior. | |
### Impact: | |
- Resolves a critical issue that could lead to incorrect results in tensor operations, thereby improving the reliability of the library in applications relying on tensor broadcasting." | |
722 (https://gitlab.com/libeigen/eigen/-/merge_requests/722),Update Umeyama.h,"Update Umeyama.h: `src_var` is only used when `with_scaling == true`. Therefore, the actual computation can be avoided when `with_scaling == false`.",Pablo Speciale,2021-11-16T18:14:11.908Z,NA,NA,"## Title: | |
Update Umeyama.h | |
## Authors: | |
Pablo Speciale | |
## Summary: | |
This merge request enhances the `Umeyama.h` file by optimizing the computation process related to the scaling option in the Umeyama algorithm. The update ensures that unnecessary calculations are avoided, improving efficiency when scaling is not needed. | |
### Key Changes: | |
- Conditional logic introduced to skip computation related to `src_var` when `with_scaling` is set to false. | |
### Improvements: | |
- Enhanced efficiency by reducing computational overhead when scaling is not required. | |
### Impact: | |
- The optimization improves performance, particularly in scenarios where scaling is disabled, making the library more efficient for users who do not require this feature." | |
724 (https://gitlab.com/libeigen/eigen/-/merge_requests/724),Make the new TensorIO implementation work with TensorMap with const elements.,Fix issue with `TensorMap<Tensor<const T...>>` in the new TensorIO implementation from !327.,Rasmus Munk Larsen,2021-11-18T17:45:30.916Z,NA,NA,"## Title: | |
Make the new TensorIO implementation work with TensorMap with const elements. | |
## Authors: | |
Rasmus Munk Larsen | |
## Summary: | |
This merge request addresses an issue with the new TensorIO implementation, specifically enabling compatibility with `TensorMap<Tensor<const T...>>`. | |
### Key Changes: | |
- Fixed compatibility of the TensorIO implementation with const elements in `TensorMap`. | |
### Improvements: | |
- Enhanced the functionality of TensorIO by ensuring it can now handle `TensorMap` instances that contain const elements. | |
### Impact: | |
- This change improves the usability of TensorIO in scenarios where const tensors are utilized, making the library more flexible and robust for developers working with tensor data that should remain immutable." | |
719 (https://gitlab.com/libeigen/eigen/-/merge_requests/719),Fixed Sparse-Sparse Product in case of mixed StorageIndex types,"The Sparse-Sparse product implementation converts its arguments to different storage order, but always uses the storage index of the result matrix. This can cause problems if the input indices are not representable as such. | |
The two cases are | |
1) Result is small, but the inputs are large, e.g. (8 x 512) x (512 x 8) -> (8 x 8). This is the case that was broken. | |
2) Result is large, but inputs are small, e.g. (127 x 8) x ( 8 x 127) -> (127 x 127). That worked, but I've added a test-case nonetheless. | |
Note that case 1) can still lead to problems, because the `StorageIndex` imposes 2 limits: | |
a) The maximum coordinate any nonzero can have | |
b) The maximum number of nonzeros. | |
While a) is clearly not violated in case 1), we might create too many nonzeros and the product would still fail. In this case the result is truly not representable with the given types. This MR only fixes the case when the operation itself is valid.",Erik Schultheis,2021-11-18T18:33:33.309Z,NA,NA,"## Title: | |
Fixed Sparse-Sparse Product in case of mixed StorageIndex types | |
## Authors: | |
Erik Schultheis | |
## Summary: | |
This merge request addresses a flaw in the Sparse-Sparse product implementation within the Eigen C++ library, specifically when dealing with mixed StorageIndex types. It ensures that the computation correctly handles the storage indices of matrices to avoid issues that arise when input indices are not representable as the result's storage index. | |
### Key Changes: | |
- Adjusted the Sparse-Sparse product implementation to properly manage storage indices when the result matrix is small, but the input matrices are large. | |
- Added test cases to cover scenarios where the result matrix is large, but the input matrices are small. | |
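To make the two `StorageIndex` limits from the MR description concrete, here is a small sketch with hypothetical helper names (not Eigen API) that checks whether a coordinate range or a nonzero count fits in a given index type. It assumes, purely for illustration, that the small result matrix uses an 8-bit signed index; the source does not state which index types the tests use.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper: can every coordinate 0..dim-1 be stored in Index?
template <typename Index>
constexpr bool coords_fit(std::int64_t dim) {
  return dim - 1 <= static_cast<std::int64_t>(std::numeric_limits<Index>::max());
}

// Hypothetical helper: can a count of nnz nonzeros be stored in Index?
template <typename Index>
constexpr bool count_fits(std::int64_t nnz) {
  return nnz <= static_cast<std::int64_t>(std::numeric_limits<Index>::max());
}

inline void check_limits() {
  // Case 1: (8 x 512) * (512 x 8) -> (8 x 8), with an assumed 8-bit result index.
  assert(coords_fit<std::int8_t>(8));     // result coordinates fit
  assert(!coords_fit<std::int8_t>(512));  // input coordinates do not
  // Limit b): the number of nonzeros must also be representable.
  assert(count_fits<std::int8_t>(127));
  assert(!count_fits<std::int8_t>(128));
}
```

The first pair of checks mirrors the broken case: intermediate copies of the inputs cannot reuse the result's index type when the inner dimension exceeds its range.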
### Improvements: | |
- Enhanced the robustness of the Sparse-Sparse product by preventing potential failures related to the representation limits imposed by StorageIndex types. | |
- Improved code reliability by ensuring proper handling in both small and large matrix scenarios. | |
### Impact: | |
These changes make Sparse-Sparse products reliable when the inputs and the result use different StorageIndex types, provided the result itself is representable. Products that generate more nonzeros than the result's StorageIndex can count still fail, since such results cannot be stored with the given types." | |
728 (https://gitlab.com/libeigen/eigen/-/merge_requests/728),Fix errors for Windows build.,NA,Antonio Sánchez,2021-11-19T04:41:33.660Z,NA,NA,"## Title: | |
Fix errors for Windows build. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses and resolves various errors associated with building the Eigen C++ library on Windows. | |
### Key Changes: | |
- Corrected compilation errors specific to the Windows environment. | |
### Improvements: | |
- Enhanced compatibility of the Eigen library with Windows systems. | |
### Impact: | |
- Ensures smoother builds and usability of the Eigen library on Windows platforms, potentially increasing user accessibility and satisfaction." | |
726 (https://gitlab.com/libeigen/eigen/-/merge_requests/726),Add basic iterator support for Eigen::array to ease transition to std::array,"In particular, for code built with EIGEN_AVOID_STL_ARRAY, the inconsistency in syntax makes it cumbersome to remove this compilation option.",Rasmus Munk Larsen,2021-11-19T05:31:16.026Z,NA,NA,"## Title: | |
Add basic iterator support for Eigen::array to ease transition to std::array | |
## Authors: | |
Rasmus Munk Larsen | |
## Summary: | |
This merge request introduces basic iterator support for `Eigen::array`, aimed at facilitating the transition from `Eigen::array` to `std::array`. This update addresses the syntax inconsistencies that arise when using the compilation option `EIGEN_AVOID_STL_ARRAY`. | |
### Key Changes: | |
- Added basic iterator functionality to `Eigen::array`. | |
### Improvements: | |
- Simplifies the process of transitioning code from `Eigen::array` to `std::array`. | |
- Reduces syntax inconsistencies that were cumbersome for developers using `EIGEN_AVOID_STL_ARRAY`. | |
### Impact: | |
- Enhances usability for developers, making it easier to switch or coexist between `Eigen::array` and `std::array`, ultimately improving code flexibility and maintainability." | |
725 (https://gitlab.com/libeigen/eigen/-/merge_requests/725),don't use deprecated MappedSparseMatrix,"This MR removes Eigen-internal references to the deprecated MappedSparseMatrix type. | |
Q: should the MappedSparseMatrix type be removed entirely, now that the C++14 jump is happening?",Erik Schultheis,2021-11-19T15:58:05.090Z,NA,NA,"## Title: | |
Don't use deprecated MappedSparseMatrix | |
## Authors: | |
Erik Schultheis | |
## Summary: | |
This merge request focuses on the removal of all internal references to the deprecated MappedSparseMatrix type within the Eigen C++ library. | |
### Key Changes: | |
- Removed instances of the MappedSparseMatrix type from the codebase. | |
### Improvements: | |
- Simplifies the library by eliminating outdated and deprecated components. | |
### Impact: | |
- Enhances maintainability and prepares the library for the transition to C++14, ensuring better compatibility and addressing any potential issues related to deprecated features." | |
727 (https://gitlab.com/libeigen/eigen/-/merge_requests/727),Make numeric_limits members constexpr as per the newer C++ standards.,Author: [email protected].,Rasmus Munk Larsen,2021-11-19T16:14:06.379Z,NA,NA,"## Title: | |
Make numeric_limits members constexpr as per the newer C++ standards. | |
## Authors: | |
[email protected], Rasmus Munk Larsen | |
## Summary: | |
This merge request updates the Eigen C++ library by making the members of `numeric_limits` constexpr, aligning the code with newer C++ standards. This change enhances compile-time evaluation capabilities within the library. | |
### Key Changes: | |
- Converted `numeric_limits` members to constexpr. | |
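As an illustration of what constexpr members enable, the sketch below specializes `std::numeric_limits` for a hypothetical type `Fixed8` (an assumption for this example, not one of the types touched by the MR) and uses the result in a constant expression.

```cpp
#include <cassert>
#include <limits>

// Hypothetical user-defined scalar type, used only for illustration.
struct Fixed8 {
  int raw;
  constexpr explicit Fixed8(int r) : raw(r) {}
};

namespace std {
template <>
struct numeric_limits<Fixed8> {
  static constexpr bool is_specialized = true;
  static constexpr int digits = 7;
  // constexpr member functions can be evaluated at compile time.
  static constexpr Fixed8 max() noexcept { return Fixed8(127); }
  static constexpr Fixed8 min() noexcept { return Fixed8(-128); }
};
}  // namespace std

// Usable in a constant expression only because max() is constexpr.
constexpr int kMaxRaw = std::numeric_limits<Fixed8>::max().raw;

// Array bound computed from the limits at compile time.
inline int table_size() {
  int table[kMaxRaw + 1] = {0};
  return static_cast<int>(sizeof(table) / sizeof(table[0]));
}
```

Without constexpr on `max()`, the definition of `kMaxRaw` and the array bound would not compile.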
### Improvements: | |
- Enhanced performance by enabling compile-time constant expressions. | |
### Impact: | |
- Allows `numeric_limits` values to be used in constant expressions, such as `static_assert` conditions and template arguments." | |
729 (https://gitlab.com/libeigen/eigen/-/merge_requests/729),Implement Eigen::array<...>::reverse_iterator if std::reverse_iterator exists.,"This is needed by the new TensorIO implementation, and it is handy to have available. TODO: Implement for CUDA.",Rasmus Munk Larsen,2021-11-20T00:22:46.929Z,NA,NA,"## Title: | |
Implement Eigen::array<...>::reverse_iterator if std::reverse_iterator exists. | |
## Authors: | |
Rasmus Munk Larsen | |
## Summary: | |
This merge request introduces the `reverse_iterator` for `Eigen::array<...>` to enhance functionality, particularly in light of the new TensorIO implementation. | |
### Key Changes: | |
- Implemented `Eigen::array<...>::reverse_iterator` when `std::reverse_iterator` is available. | |
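The sketch below shows one way such iterator support can be layered on a fixed-size array using `std::reverse_iterator`. `MiniArray` is a hypothetical type written for this example; it is not the code from the MR and makes no claim about how `Eigen::array` is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>

// Hypothetical fixed-size array, loosely modelled on the idea of Eigen::array.
template <typename T, std::size_t N>
struct MiniArray {
  T values[N];

  using iterator = T*;
  using const_iterator = const T*;
  using reverse_iterator = std::reverse_iterator<iterator>;
  using const_reverse_iterator = std::reverse_iterator<const_iterator>;

  iterator begin() { return values; }
  iterator end() { return values + N; }
  const_iterator begin() const { return values; }
  const_iterator end() const { return values + N; }

  // Reverse iteration starts at the last element and walks backwards.
  reverse_iterator rbegin() { return reverse_iterator(end()); }
  reverse_iterator rend() { return reverse_iterator(begin()); }
  const_reverse_iterator rbegin() const { return const_reverse_iterator(end()); }
  const_reverse_iterator rend() const { return const_reverse_iterator(begin()); }
};

// Reads the digits back to front, so the digits 1, 2, 3 produce 321.
inline int reversed_digits(const MiniArray<int, 3>& a) {
  int out = 0;
  for (auto it = a.rbegin(); it != a.rend(); ++it) out = out * 10 + *it;
  return out;
}
```

Because the iterators are plain pointers, the same member names work unchanged if the type is later swapped for `std::array`.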
### Improvements: | |
- The addition of `reverse_iterator` provides improved iteration capabilities for `Eigen::array<...>`, making it more versatile for users. | |
### Impact: | |
This change supports the new TensorIO implementation, which needs reverse iteration over `Eigen::array<...>`. A CUDA implementation is not included and remains a TODO." | |
733 (https://gitlab.com/libeigen/eigen/-/merge_requests/733),Fix warnings about shadowing definitions.,NA,Rasmus Munk Larsen,2021-11-23T22:52:25.977Z,NA,NA,"## Title: | |
Fix warnings about shadowing definitions. | |
## Authors: | |
Rasmus Munk Larsen | |
## Summary: | |
This merge request addresses and resolves warnings related to shadowing definitions in the Eigen C++ library, ensuring cleaner and more maintainable code. | |
### Key Changes: | |
- Modified variable names in the code to eliminate instances of shadowing. | |
### Improvements: | |
- Improved code clarity by ensuring that variable names do not conflict with those in outer scopes. | |
### Impact: | |
- Reduces potential confusion for developers reading the code, enhances maintainability, and helps prevent bugs associated with variable shadowing." | |
732 (https://gitlab.com/libeigen/eigen/-/merge_requests/732),remove EIGEN_HAS_CXX11,This MR removes the EIGEN_HAS_CXX11 macro (see #2372) and all corresponding ifs. It also removes the test cases that explicitly set the C++ version to less than 11. I've also removed two conditional compilations for GCC versions less than 4.,Erik Schultheis,2021-11-24T20:08:50.121Z,NA,NA,"## Title: | |
Remove EIGEN_HAS_CXX11 | |
## Authors: | |
Erik Schultheis | |
## Summary: | |
This merge request simplifies the Eigen C++ library by removing the EIGEN_HAS_CXX11 macro and related conditional compilation directives. It also eliminates test cases set up for C++ versions below 11 and removes specific conditional compilations for older GCC versions. | |
### Key Changes: | |
- Removed the EIGEN_HAS_CXX11 macro. | |
- Eliminated all conditional statements related to this macro. | |
- Removed test cases targeting C++ versions lower than 11. | |
- Deleted conditional compilations for GCC versions earlier than 4. | |
### Improvements: | |
- Streamlines the codebase by reducing complexity associated with version checks. | |
- Enhances maintainability by focusing support on C++11 and newer standards. | |
### Impact: | |
This change reduces conditional compilation, which simplifies future development, and drops support for compilers and test configurations that predate C++11." | |
737 (https://gitlab.com/libeigen/eigen/-/merge_requests/737),split up large Lapacke LLT macro,"### Reference issue | |
Came across this when working on !731 | |
### What does this implement/fix? | |
Currently, the binding of LLT to Lapacke is done using a large macro. This MR factors out a large part of the macro's functionality and implements it explicitly. This results in an increase in line count, though if you take away the new comment and blank lines I think the change is marginal. | |
### Additional information | |
On my system, with both `liblapacke` and `liblapacke64` some tests fail. They also fail without the changes here, though.",Erik Schultheis,2021-11-25T16:11:25.952Z,NA,NA,"## Title: | |
Split up large Lapacke LLT macro | |
## Authors: | |
Erik Schultheis | |
## Summary: | |
This merge request refactors the existing macro binding for LLT to Lapacke by breaking it down into smaller, explicit implementations. It aims to improve code clarity and maintainability. | |
### Key Changes: | |
- The large macro for LLT binding to Lapacke has been split into smaller, more manageable components. | |
### Improvements: | |
- The refactoring increases the clarity of the code, making it easier to understand and maintain, despite a marginal increase in line count. | |
### Impact: | |
- While some tests fail on the author's system with both `liblapacke` and `liblapacke64`, these issues existed prior to the changes, indicating that the refactor does not introduce new failures." | |
740 (https://gitlab.com/libeigen/eigen/-/merge_requests/740),Remove DenseBase::nonZeros() which just calls DenseBase::size(),NA,David Tellenbach,2021-11-27T14:31:02.742Z,NA,NA,"## Title: | |
Remove DenseBase::nonZeros() which just calls DenseBase::size() | |
## Authors: | |
David Tellenbach | |
## Summary: | |
This merge request involves the removal of the `nonZeros()` function from the `DenseBase` class in the Eigen C++ library. This function was redundant as it merely called the existing `size()` function. | |
### Key Changes: | |
- Deleted the `nonZeros()` method from the `DenseBase` class. | |
### Improvements: | |
- Simplification of the codebase by removing redundant functions, improving readability and maintainability. | |
### Impact: | |
- This change helps streamline the interface of the `DenseBase` class, reducing confusion for users regarding the existence of multiple methods that produce the same output." | |
741 (https://gitlab.com/libeigen/eigen/-/merge_requests/741),Fix for HIP compilation failure in DenseBase.,"Commit 96e537d6fd1a7187feb853c1bdbdea69ee7b99ec introduced EIGEN_DEVICE_FUNC modifiers to some DenseBase functions. | |
The corresponding functions in DenseBase.h were missing the modifiers and this caused a compilation failure in HIP. | |
/cc @cantonios",Rohit Santhanam,2021-11-28T15:59:31.010Z,NA,NA,"## Title: | |
Fix for HIP compilation failure in DenseBase | |
## Authors: | |
Rohit Santhanam | |
## Summary: | |
This merge request addresses a compilation issue in the Eigen C++ library related to HIP (Heterogeneous-computing Interface for Portability). It ensures that specific functions in the DenseBase class are properly marked with EIGEN_DEVICE_FUNC modifiers. | |
### Key Changes: | |
- Added EIGEN_DEVICE_FUNC modifiers to the necessary functions in DenseBase.h that were previously missing. | |
### Improvements: | |
- Resolves the compilation failure when using HIP, enhancing the compatibility of the Eigen library with HIP environments. | |
### Impact: | |
This fix improves the robustness of the Eigen library by ensuring that all relevant functions can be compiled successfully under HIP, thus broadening its usability in heterogeneous computing scenarios." | |
735 (https://gitlab.com/libeigen/eigen/-/merge_requests/735),removed EIGEN_HAS_CXX11_* and redundant EIGEN_COMP_CXXVER checks,"This MR removes conditional compilation for C++11 features `CONTAINERS`, `RVALUE_REFERENCES`, `NOEXCEPT`, `ATOMIC` and `OVERRIDE`. Essentially the feature macros which had a simple condition when they are enabled. I have merged the minimum compiler versions from these checks into the `this compiler is too old` error path. | |
I have also removed explicit checks for `EIGEN_COMP_CXXVER` and `EIGEN_MAX_CPP_VER` (which seem to have rather inconsistent spelling) that are always true (false) when we have at least C++11. | |
Q: What is the `#define EIGEN_INCLUDE_TYPE_TRAITS` doing in the noexcept part? It seems that it is possible to have `EIGEN_HAS_TYPE_TRAITS` 0 but still get `EIGEN_INCLUDE_TYPE_TRAITS`? | |
#2372",Erik Schultheis,2021-11-29T19:18:58.362Z,NA,NA,"## Title: | |
Removed EIGEN_HAS_CXX11_* and Redundant EIGEN_COMP_CXXVER Checks | |
## Authors: | |
Erik Schultheis | |
## Summary: | |
This merge request streamlines the Eigen C++ library by eliminating unnecessary conditional compilation for various C++11 features. It simplifies the handling of compiler version checks and enhances code clarity. | |
### Key Changes: | |
- Removed feature macros related to C++11: `CONTAINERS`, `RVALUE_REFERENCES`, `NOEXCEPT`, `ATOMIC`, and `OVERRIDE`. | |
- Merged minimum compiler version checks into a single error message for outdated compilers. | |
- Eliminated redundant checks for `EIGEN_COMP_CXXVER` and `EIGEN_MAX_CPP_VER`. | |
### Improvements: | |
- Reduced code complexity by removing redundant feature checks. | |
- Enhanced readability and maintainability of the codebase. | |
### Impact: | |
This update ensures that code reliant on C++11 features can be compiled more straightforwardly, thereby improving compatibility and reducing compile-time configurations for developers using modern compilers. It also prevents potential inconsistencies related to compiler version checks, facilitating a smoother development experience with Eigen." | |
742 (https://gitlab.com/libeigen/eigen/-/merge_requests/742),Updated CMake,"This MR updates the minimum CMake version required to 3.10, which is supported by both Ubuntu 18 (3.10 supported till April 23) and Debian Buster (3.13 EOL August 22, LTS?). It is not included in Debian stretch, which is EOL but still receives LTS until June 22. | |
I have also removed the option to disable C++11 tests from the CMake file, and cleaned up the corresponding names in the CI. | |
I have also included a second commit which changes the minimum version of gcc to GCC 5. This might conflict or be redundant with !739, so if you like I can also submit a PR without the second change.",Erik Schultheis,2021-11-29T20:24:21.383Z,NA,NA,"## Title: | |
Updated CMake | |
## Authors: | |
Erik Schultheis | |
## Summary: | |
This merge request updates the minimum CMake version requirement to 3.10 and adjusts the minimum GCC version to 5. It also removes the option to disable C++11 tests from the CMake file. | |
### Key Changes: | |
- Minimum CMake version updated to 3.10. | |
- Removed option to disable C++11 tests from CMake. | |
- Minimum GCC version set to 5. | |
### Improvements: | |
- Ensures compatibility with supported versions of major Linux distributions. | |
- Cleans up CI configuration related to C++11 tests. | |
### Impact: | |
- Aligns the library's build system with more modern standards, enhancing overall robustness and support for newer compilers." | |
658 (https://gitlab.com/libeigen/eigen/-/merge_requests/658),Update SVD Module to allow specifying computation options with a template parameter. Resolves #2051,"### Reference issue | |
This is related to issue #2051 | |
### What does this implement/fix? | |
This refactoring is an API improvement that changes the ``QRPreconditioner`` template parameter in JacobiSVD to a generic ``Options`` template parameter. For consistency, it also adds a similar optional ``Options`` parameter to BDCSVD. | |
This allows users to request thin unitaries for fixed-size matrices, which currently does not work. This change seems particularly beneficial when using ``.solve(y)``. This does not make any algorithmic changes, **but it is backwards incompatible change to the existing API**. | |
[Example of Ceres having to workaround this assertion](https://github.com/ceres-solver/ceres-solver/blob/ec4f2995bbde911d6861fb5c9bb7353ad796e02b/internal/ceres/invert_psd_matrix.h#L71) | |
Here is an example of how the new API gets used: | |
~~~~ | |
#include <Eigen/Dense> | |
#include <iostream> | |
using namespace Eigen; | |
using FixedSizeMatrix = Matrix<double, 4, 5>; | |
int main() { | |
  FixedSizeMatrix m = FixedSizeMatrix::Random(); | |
  // The following line triggers an assertion because of the fixed-size matrix type. | |
  // JacobiSVD<FixedSizeMatrix> svd1(m, ComputeThinU | ComputeThinV); | |
  // With the new API, the options are passed as a template parameter instead. | |
  JacobiSVD<FixedSizeMatrix, ColPivHouseholderQRPreconditioner | ComputeThinU | ComputeThinV> svd2(m); | |
  std::cout << svd2.singularValues().transpose() << '\n'; | |
  return 0; | |
} | |
~~~~ | |
### Additional information | |
For testing I mostly wanted to try all combinations of options and make sure that sizes and storage orders are set properly. It does run computation checks, but since it's not an algorithmic change I thought it would be redundant to redo everything. | |
I'm also a little unsure about the BDCSVD tests. It seems like the existing tests occasionally fail on the master branch if they're run a lot. Is this actually the case? They still all pass most of the time, but I think it's just the additional computation checks are making failures a little bit more common. | |
All feedback to improve this is greatly appreciated!",Arthur,2021-11-29T20:50:47.484Z,"API change, MR to reopen",NA,"## Title: | |
Update SVD Module to allow specifying computation options with a template parameter. Resolves #2051 | |
## Authors: | |
Arthur | |
## Summary: | |
This merge request enhances the SVD module by introducing a new template parameter for computation options in the JacobiSVD and BDCSVD classes, replacing the previous QRPreconditioner parameter. This change aims to improve code consistency and allows users greater flexibility when handling fixed-size matrices. | |
### Key Changes: | |
- Refactored `QRPreconditioner` template parameter in `JacobiSVD` to a generic `Options` parameter. | |
- Added a similar optional `Options` parameter to `BDCSVD`. | |
- Introduced functionality for requesting thin unitaries in fixed-size matrix computations. | |
### Improvements: | |
- Enhanced API usability and consistency across SVD modules. | |
- Users can now request thin unitaries for fixed-size matrices, which is particularly useful when calling `.solve(y)`. | |
### Impact: | |
- The change is backwards incompatible, which may affect existing users relying on the previous API. | |
- Computation options are now part of the SVD type, so they can be selected at compile time." | |
734 (https://gitlab.com/libeigen/eigen/-/merge_requests/734),Select AVX2 even if the data size is not a multiple of 8,"This is a second version of https://gitlab.com/libeigen/eigen/-/merge_requests/46 . That PR contained very useful comments that now seem to be lost. | |
In any case, it was first merged, and then reverted by @rmlarsen1 because some tests were failing. I fixed the tests and added some more tests on top, but it was never merged back. | |
Moreover, the branch was reverted wrongly, in the sense that the revert still left some changes in: | |
``` | |
% git diff 52a2fbbb008a47c5e3fb8ac1c65c2feecb0c511c..5ca10480b0756e40b0723d90adeba8506291fc7c Eigen/src/Core/util/XprHelper.h | |
diff --git a/Eigen/src/Core/util/XprHelper.h b/Eigen/src/Core/util/XprHelper.h | |
index fd2db56a4..26aa609fe 100644 | |
--- a/Eigen/src/Core/util/XprHelper.h | |
+++ b/Eigen/src/Core/util/XprHelper.h | |
@@ -195,7 +195,7 @@ template<typename T> struct unpacket_traits | |
}; | |
template<int Size, typename PacketType, | |
- bool Stop = Size==Dynamic || (Size%unpacket_traits<PacketType>::size)==0 || is_same<PacketType,typename unpacket_traits<PacketType>::half>::value> | |
+ bool Stop = Size==Dynamic || Size >= unpacket_traits<PacketType>::size || is_same<PacketType,typename unpacket_traits<PacketType>::half>::value> | |
struct find_best_packet_helper; | |
template< int Size, typename PacketType> | |
``` | |
So the first fix of my MR (commit 5ca10480b0756e40b0723d90adeba8506291fc7c) actually mistakenly remained in master, but without the further improvements later. | |
It's quite unfortunate that the very useful discussion that happened in https://gitlab.com/libeigen/eigen/-/merge_requests/46 is now gone. Is there a way to recover it?",Francesco Mazzoli,2021-11-29T21:13:26.011Z,NA,NA,"## Title: | |
Select AVX2 even if the data size is not a multiple of 8 | |
## Authors: | |
Francesco Mazzoli | |
## Summary: | |
This merge request aims to restore and improve functionality for selecting AVX2 optimizations in the Eigen library, even when data sizes are not multiples of 8. The changes address issues that arose from a previous merge request, which was reverted due to failing tests. | |
### Key Changes: | |
- Adjusted the logic in `XprHelper.h` to allow the selection of AVX2 for data sizes that may not be multiples of 8. | |
- Resolved previous test failures and added additional tests to ensure robustness. | |
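The effect of the two `Stop` conditions shown in the diff in the MR description can be sketched with plain integers. The code below is a simplified model, not Eigen's `find_best_packet_helper`: it assumes hypothetical packet widths of 8 and 4 lanes, treats the 4-lane packet as its own half, and ignores the `Dynamic` case.

```cpp
#include <cassert>

// Hypothetical packet widths: 8 lanes halves to 4 lanes, and the 4-lane
// packet is treated as its own half, which ends the search.
constexpr int half_width(int width) { return width == 8 ? 4 : width; }

// Rule on the removed diff line: keep a packet only if the size is a multiple of its width.
constexpr bool stop_old(int size, int width) {
  return size % width == 0 || half_width(width) == width;
}

// Rule on the added diff line: keep a packet once the size is at least its width.
constexpr bool stop_new(int size, int width) {
  return size >= width || half_width(width) == width;
}

// Walk from the widest packet towards narrower ones until the rule says stop.
constexpr int best_width_old(int size, int width = 8) {
  return stop_old(size, width) ? width : best_width_old(size, half_width(width));
}
constexpr int best_width_new(int size, int width = 8) {
  return stop_new(size, width) ? width : best_width_new(size, half_width(width));
}

inline void check_selection() {
  assert(best_width_old(12) == 4);  // 12 is not a multiple of 8
  assert(best_width_new(12) == 8);  // but 12 >= 8, so the wide packet is kept
}
```

Under this model a size of 12 falls back to the 4-lane packet with the modulo rule and keeps the 8-lane packet with the size-based rule, which matches the MR title.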
### Improvements: | |
- Enhanced the selection criteria for packet processing, leading to potentially better optimization for various data sizes. | |
- Addressed an incomplete revert of the original MR, which had left its first fix in master without the later improvements. | |
### Impact: | |
These changes improve the performance and flexibility of the Eigen library by enabling more efficient vectorization for a broader range of data sizes, which can lead to better performance in applications that utilize the library." | |
730 (https://gitlab.com/libeigen/eigen/-/merge_requests/730),bugfix: issue #2375,"### Reference issue | |
Fixes #2375 | |
### What does this implement/fix? | |
Fixed indexed views for indices of non Eigen, non built-in types (e.g. `std::array`) | |
### Additional information | |
The underlying problem was with the way strides were being computed in the `traits` class for indexed views. When the increment value for the supplied index was undefined (as with e.g. `std::array`), the stride value was computed as the product of the undefined increment value (i.e. `0xfffffe`) and the expression stride. Since nothing in the test suite depends on this stride value, this went unnoticed (perhaps this value is altogether irrelevant for indexed views?). However, for sufficiently large (i.e. 129 or more) values of the expression stride, the aforementioned product resulted in a signed integer overflow, which gcc detected at compile time. The fix consisted of explicitly checking for an undefined increment value and setting the stride to `Dynamic` when it was detected.",Jakub Gałecki,2021-11-29T22:26:16.173Z,NA,NA,"## Title: | |
bugfix: issue #2375 | |
## Authors: | |
Jakub Gałecki | |
## Summary: | |
This merge request addresses and fixes an issue related to indexed views for indices of non-Eigen, non-built-in types, such as `std::array`. | |
### Key Changes: | |
- Fixed the computation of strides in the `traits` class for indexed views, particularly when dealing with indices that have undefined increment values. | |
- Implemented a safeguard that checks for undefined increment values and sets the stride to `Dynamic` when necessary. | |
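A minimal sketch of that safeguard, using hypothetical stand-in names (`kDynamic`, `kUndefinedIncr`, `safe_stride`) rather than Eigen's actual `traits` code. The sentinel value `0xfffffe` is the one quoted in the description, and the sketch assumes a 32-bit `int`:

```cpp
#include <cassert>
#include <climits>

// Hypothetical stand-ins for illustration; these are not Eigen's declarations.
constexpr int kDynamic = -1;              // marks a value unknown at compile time
constexpr int kUndefinedIncr = 0xfffffe;  // sentinel quoted in the MR description

// True when incr * stride would exceed INT_MAX (assumes non-negative inputs).
constexpr bool product_overflows(int incr, int stride) {
  return static_cast<long long>(incr) * stride > INT_MAX;
}

// Stride of an indexed view: fall back to kDynamic instead of multiplying
// by the sentinel when the increment is undefined.
constexpr int safe_stride(int incr, int expr_stride) {
  return (incr == kUndefinedIncr || incr == kDynamic || expr_stride == kDynamic)
             ? kDynamic
             : incr * expr_stride;
}

// Small self-check showing the threshold mentioned in the description.
inline void check_stride_sketch() {
  assert(!product_overflows(kUndefinedIncr, 128));  // still fits in int
  assert(product_overflows(kUndefinedIncr, 129));   // would overflow
  assert(safe_stride(kUndefinedIncr, 129) == kDynamic);
  assert(safe_stride(2, 3) == 6);
}
```

The threshold in the self-check matches the description: the product with 128 still fits in a 32-bit signed integer, while the product with 129 does not.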
### Improvements: | |
- Resolves a signed integer overflow that occurred for expression strides of 129 or more and that gcc reported at compile time; it had gone unnoticed because nothing in the test suite depends on this stride value. | |
- Enhances the stability and reliability of the library when interacting with custom index types. | |
### Impact: | |
This fix improves the correctness of the Eigen library when indexed views are built from index containers such as `std::array`, so that developers can use them without hitting compile-time overflow errors." | |
736 (https://gitlab.com/libeigen/eigen/-/merge_requests/736),SFINAE'ing away non-const overloads if selfAdjoint/triangular view is not referring to an lvalue,"### Reference issue | |
None, I noticed this while working on !731 | |
### What does this implement/fix? | |
Currently, the non-const transpose methods in selfadjoint and triangular views static_assert that the view represents an lvalue. With this change, this overload is automatically disabled for non-lvalues and the const version is considered. | |
### Additional information | |
Currently clang-tidy produces some ""return type is const-qualified at the top level which may reduce readability without improving const correctness"" warnings. The const-qualification there is in fact necessary, because otherwise the wrong `.transpose` overload would be selected. I have only removed that overload from consideration, but left the return types const for now. | |
There is a second method which has the same static_assert (coeffRef). In this case, though, I have not changed anything, because there is no valid alternative overload, and I think the static assert provides a more informative error message than just getting the compiler error about a missing method, for which you manually have to decipher the SFINAE that disabled it.",Erik Schultheis,2021-11-29T22:51:28.571Z,NA,NA,"## Title: | |
SFINAE'ing away non-const overloads if selfAdjoint/triangular view is not referring to an lvalue | |
## Authors: | |
Erik Schultheis | |
## Summary: | |
This merge request addresses the handling of non-const overloads for transpose methods in self-adjoint and triangular views within the Eigen C++ library. It adjusts the behavior to disable specific overloads automatically when the associated view does not refer to an lvalue, thus routing to the const version instead. | |
### Key Changes: | |
- Non-const transpose methods in self-adjoint and triangular views are modified to automatically disable overloads when not referring to an lvalue. | |
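The overload selection described above can be sketched with `std::enable_if`. The `View` template and its integer return tags are hypothetical and only mimic the shape of the technique; Eigen's real views return expression objects:

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical miniature view; IsLvalue says whether it refers to writable data.
template <bool IsLvalue>
struct View {
  // Non-const overload: removed from overload resolution unless IsLvalue is true.
  template <bool L = IsLvalue, typename std::enable_if<L, int>::type = 0>
  int transpose() { return 1; }  // tag for the mutable path

  // Const overload: always available and used as the fallback.
  int transpose() const { return 2; }  // tag for the read-only path
};

inline void check_view_sketch() {
  View<true> writable;
  View<false> read_only;
  assert(writable.transpose() == 1);   // non-const overload is the better match
  assert(read_only.transpose() == 2);  // non-const overload was disabled
}
```

With a `static_assert` in the non-const overload instead, the second call would fail to compile rather than fall back to the const overload.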
### Improvements: | |
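- Calls to `.transpose()` on a view that does not refer to an lvalue now resolve to the const overload instead of triggering a `static_assert`. | |
- Keeps the const-qualified return types, which are needed so that the correct `.transpose` overload is selected; clang-tidy's ""return type is const-qualified at the top level"" warnings therefore remain for now. | |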
### Impact: | |
- Enhances code readability and correctness by preventing incorrect overload selection while simplifying the interface for users dealing with non-lvalue expressions." | |
745 (https://gitlab.com/libeigen/eigen/-/merge_requests/745),Fix for HIP compilation breakage in selfAdjoint and triangular view classes.,"HIP related compilation fix for selfAdjoint and triangular view class changes. | |
/cc @cantonios",Rohit Santhanam,2021-11-30T14:01:00.453Z,NA,NA,"## Title: | |
Fix for HIP compilation breakage in selfAdjoint and triangular view classes. | |
## Authors: | |
Rohit Santhanam | |
## Summary: | |
This merge request addresses a compilation issue related to HIP (Heterogeneous-compute Interface for Portability) specifically affecting the selfAdjoint and triangular view classes in the Eigen C++ library. | |
### Key Changes: | |
- Implemented fixes to resolve HIP-related compilation errors in the selfAdjoint and triangular view classes. | |
### Improvements: | |
- Enhanced compatibility of the Eigen library with HIP, ensuring smoother compilation for users leveraging this technology. | |
### Impact: | |
- This fix improves the usability of the Eigen library in environments that utilize HIP, thereby expanding its applicability in heterogeneous computing scenarios." | |
746 (https://gitlab.com/libeigen/eigen/-/merge_requests/746),fixed cholesky with 0 sized matrix (cf. #785),"Lapacke considers 0-sized matrices in LLT to be an error, whereas the Eigen test suite expects them to be a success. | |
This change turns lapacke-based LLT into a no-op that returns success if the input has zero size, thus making the corresponding Eigen test cases pass.",Erik Schultheis,2021-11-30T17:17:42.252Z,NA,NA,"## Title: | |
Fixed Cholesky with 0 Sized Matrix (cf. #785) | |
## Authors: | |
Erik Schultheis | |
## Summary: | |
This merge request addresses the inconsistency between LAPACKE's handling of 0-sized matrices in LLT and the expectations of the Eigen test suite. The modification ensures that LAPACKE-based LLT treats 0-sized matrices as valid inputs, returning success instead of an error. | |
### Key Changes: | |
- Modified LAPACKE-based LLT to return success for 0-sized matrices. | |
### Improvements: | |
- Aligns the behavior of LAPACKE with Eigen's expectations for handling 0-sized matrices. | |
### Impact: | |
- The change allows relevant Eigen test cases to pass, thus enhancing the reliability and consistency of the library’s functionality with edge cases." | |
749 (https://gitlab.com/libeigen/eigen/-/merge_requests/749),"Revert ""Update SVD Module to allow specifying computation options with a...","This change broke a lot of third party libraries, and without an associated version change, this change is too disruptive. We will revert until we come up with a better solution.",Rasmus Munk Larsen,2021-11-30T18:45:55.438Z,NA,NA,"## Title: | |
Revert ""Update SVD Module to allow specifying computation options with a..."" | |
## Authors: | |
Rasmus Munk Larsen | |
## Summary: | |
This merge request aims to revert a previous change made to the SVD module of the Eigen C++ library. The original update introduced computation options that proved to be disruptive for many third-party libraries. The decision is to revert the change until a more suitable solution can be developed. | |
### Key Changes: | |
- Reversion of the update to the SVD module that allowed the specification of computation options. | |
### Improvements: | |
- Addresses compatibility issues with third-party libraries that were caused by the earlier update. | |
### Impact: | |
- Restores stability for users relying on third-party libraries, preventing further disruptions until a better solution is proposed." | |
744 (https://gitlab.com/libeigen/eigen/-/merge_requests/744),Require recent GCC and MSCV and removed `EIGEN_HAS_CXX14` and some other feature test macros,"This MR removes the `EIGEN_HAS_VARIADIC_TEMPLATES` and `EIGEN_HAS_STATIC_ARRAY_TEMPLATE`, | |
`EIGEN_HAS_ALIGNAS` (which seemed to be unused) as well as `EIGEN_HAS_CXX14` and the corresponding version checks. | |
It also removed the checks for GCC older than 5.1 and MSVC older than 1900. | |
What is the new minimum version for ICC? I think 1600 would make all the current checks pass. | |
Also, there are some checks that checked `EIGEN_MAX_CPP` for 14 and then used standard feature-test macros directly. So far I've only removed the MAX_CPP part of the check. | |
I've updated the list in #2372",Erik Schultheis,2021-12-01T00:48:35.494Z,NA,NA,"## Title: | |
Require recent GCC and MSCV and removed `EIGEN_HAS_CXX14` and some other feature test macros | |
## Authors: | |
Erik Schultheis | |
## Summary: | |
This merge request streamlines feature test macros within the Eigen C++ library by removing outdated checks and requiring more recent versions of GCC and MSVC. Specifically, it removes feature-test macros that no longer need to be checked (one of which, `EIGEN_HAS_ALIGNAS`, appeared to be unused) and drops version checks for older compilers. | |
### Key Changes: | |
- Removed `EIGEN_HAS_VARIADIC_TEMPLATES`, `EIGEN_HAS_STATIC_ARRAY_TEMPLATE`, `EIGEN_HAS_ALIGNAS`, and `EIGEN_HAS_CXX14`. | |
- Eliminated checks for GCC versions older than 5.1 and MSVC versions older than 1900. | |
- Removed the `EIGEN_MAX_CPP` part of checks that tested for C++14 before using standard feature-test macros directly. | |
### Improvements: | |
- Simplifies the codebase by removing unnecessary feature test macros, enhancing clarity and maintainability. | |
- Removes the corresponding compiler version checks, so fewer conditional code paths remain. | |
### Impact: | |
- Users must now utilize more recent compilers (GCC 5.1 or later, MSVC 1900 or later) for compatibility with the Eigen library, which may affect backward compatibility for projects using older compilers. | |
- The minimum ICC version is left open in the description; the author suggests that 1600 would make all current checks pass." | |
739 (https://gitlab.com/libeigen/eigen/-/merge_requests/739),Disable GCC-4.8 tests.,This is to unblock moving the minimum requirement to C++14.,Antonio Sánchez,2021-12-01T02:12:52.637Z,NA,NA,"## Title: | |
Disable GCC-4.8 tests. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request proposes the disabling of tests for GCC-4.8 to facilitate the transition of the Eigen C++ library's minimum requirement to C++14. | |
### Key Changes: | |
- Disabled the tests that run with GCC-4.8. | |
### Improvements: | |
- Streamlines testing requirements, allowing for a smoother transition to C++14. | |
### Impact: | |
- Helps in advancing the library's compatibility and features by dropping support for an older compiler version, thereby enhancing overall code quality and maintainability." | |
752 (https://gitlab.com/libeigen/eigen/-/merge_requests/752),Deprecate macro EIGEN_GPU_TEST_C99_MATH as it's only used in one file and always true.,"<!-- | |
Thanks for contributing a merge request! Please name and fully describe your MR as you would for a commit message. | |
If the MR fixes an issue, please include ""Fixes #issue"" in the commit message and the MR description. | |
In addition, we recommend that first-time contributors read our [contribution guidelines](https://eigen.tuxfamily.org/index.php?title=Contributing_to_Eigen) and [git page](https://eigen.tuxfamily.org/index.php?title=Git), which will help you submit a more standardized MR. | |
Before submitting the MR, you also need to complete the following checks: | |
- Make one PR per feature/bugfix (don't mix multiple changes into one PR). Avoid committing unrelated changes. | |
- Rebase before committing | |
- For code changes, run the test suite (at least the tests that are likely affected by the change). | |
See our [test guidelines](https://eigen.tuxfamily.org/index.php?title=Tests). | |
- If possible, add a test (both for bug-fixes as well as new features) | |
- Make sure new features are documented | |
Note that we are a team of volunteers; we appreciate your patience during the review process. | |
Again, thanks for contributing! --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> Minor fix of deprecated macro, dependent on cxx11 macro before.",Xinle Liu,2021-12-01T14:48:57.304Z,NA,NA,"## Title: | |
Deprecate macro EIGEN_GPU_TEST_C99_MATH as it's only used in one file and always true. | |
## Authors: | |
Xinle Liu | |
## Summary: | |
This merge request deprecates the macro `EIGEN_GPU_TEST_C99_MATH`, which is used in only one file and always evaluates to true. According to the description, the macro previously depended on a C++11 macro. | |
### Key Changes: | |
- Deprecated the `EIGEN_GPU_TEST_C99_MATH` macro. | |
### Improvements: | |
- Reduces clutter by retiring a macro whose value never changes. | |
- Simplifies maintenance of the codebase. | |
### Impact: | |
Behavior is unchanged because the macro was always true; the only effect is simpler code in the one file that used it." | |
748 (https://gitlab.com/libeigen/eigen/-/merge_requests/748),Improved lapacke binding code for HouseholderQR and PartialPivLU,"This MR replaces the binding macros with C++ code for HouseholderQR and PartialPivLU and factors out common binding code into a new file. | |
For the remaining Lapacke bindings, I'm not so sure what the best way to handle them is, since they do explicitly specialize functions of an existing template class instead of the class itself. Maybe refactor the class so that the computation is separated from the interface. | |
Currently, `ColPivHouseholderQR` allocates working arrays `m_colsTranspositions`, `m_temp`, `m_colNormsUpdated`, `m_colNormsDirect` which are unused in the Lapacke code path. Encapsulating this in a separate subobject that handles the computations would mean we could get rid of this wasted space in the Lapacke binding, and potentially instead allocate a buffer that is used by the lapacke functions as the work space.",Erik Schultheis,2021-12-02T00:10:58.956Z,NA,NA,"## Title: | |
Improved lapacke binding code for HouseholderQR and PartialPivLU | |
## Authors: | |
Erik Schultheis | |
## Summary: | |
This merge request enhances the Eigen C++ library by replacing macros with C++ code for the Lapacke bindings of HouseholderQR and PartialPivLU. Additionally, it extracts common binding code into a new file for better organization and efficiency. | |
### Key Changes: | |
- Replacement of binding macros with C++ code for HouseholderQR and PartialPivLU. | |
- Common binding code factored into a new dedicated file. | |
### Improvements: | |
- The changes aim to streamline the binding process for Lapacke, which could lead to cleaner and more maintainable code. | |
- Proposes the potential refactor of the class structure to separate computation from the interface, improving memory usage by eliminating unused working arrays. | |
### Impact: | |
The implemented changes make the Lapacke bindings for HouseholderQR and PartialPivLU easier to maintain. The memory savings for `ColPivHouseholderQR` are only proposed in the description as a possible follow-up refactoring." | |
755 (https://gitlab.com/libeigen/eigen/-/merge_requests/755),fixed leftover else branch,"This fixes (removes) the leftover else branch that was part of an `#ifdef` that got deleted with the recent changes. I guess the smoketests don't even try to parse the unsupported module. Maybe it would be good to have some tests there that at least try to include the unsupported parts, i.e. they are unsupported, but we still might want to guarantee that at least including their headers into an empty file works.",Erik Schultheis,2021-12-02T18:13:20.537Z,NA,NA,"## Title: | |
Fixed leftover else branch | |
## Authors: | |
Erik Schultheis | |
## Summary: | |
This merge request addresses a leftover else branch that remained after the removal of an `#ifdef` in previous changes. | |
### Key Changes: | |
- Removed the unused else branch associated with a deleted `#ifdef`. | |
### Improvements: | |
- Suggests the addition of tests to ensure that unsupported module headers can still be included without issues, promoting better code robustness. | |
### Impact: | |
- Enhances code clarity by eliminating unnecessary branches, potentially reducing confusion for future code maintenance. The proposed tests could improve reliability in including unsupported modules." | |
447 (https://gitlab.com/libeigen/eigen/-/merge_requests/447),Bicgstabl,"Adds the BiCGSTAB(L) algorithm for solving linear systems. | |
Although often IDR(s) is more efficient, for strongly non-symmetric systems with large imaginary eigenvalues, BiCGSTAB(L) typically converges faster. | |
This implementation sometimes does not pass the test and fails with a norm of 1e-15. Advice on how to improve the numerical stability is welcome. To some extent, this seems to be a limitation of this specific algorithm.",Jens Wehner,2021-12-02T22:48:23.453Z,MR to reopen,NA,"## Title: | |
Bicgstabl | |
## Authors: | |
Jens Wehner | |
## Summary: | |
The merge request introduces the BiCGSTAB(L) algorithm for solving linear systems, providing an alternative approach for specific cases compared to the existing IDR(s). | |
### Key Changes: | |
- Addition of the BiCGSTAB(L) algorithm implementation. | |
### Improvements: | |
- Typically converges faster than IDR(s) for strongly non-symmetric systems with large imaginary eigenvalues, although IDR(s) is often more efficient otherwise. | |
### Impact: | |
- Enhances the Eigen library's capabilities for solving challenging linear systems, although the implementation sometimes fails its test with a norm of 1e-15 and its numerical stability still needs work." | |
757 (https://gitlab.com/libeigen/eigen/-/merge_requests/757),Idrs refactoring,Basically reformats IDRS code and replaces calls to norm() by calls to StableNorm(),Jens Wehner,2021-12-02T23:32:08.066Z,NA,NA,"## Title: | |
Idrs refactoring | |
## Authors: | |
Jens Wehner | |
## Summary: | |
This merge request refactors the IDRS code within the Eigen C++ library. It reformats the code and replaces calls to `norm()` with calls to `stableNorm()` (spelled `StableNorm()` in the description; the spelling in the code was corrected in !759). | |
### Key Changes: | |
- Reformatted the IDRS code for improved readability. | |
- Replaced calls to `norm()` with `stableNorm()` for better numerical robustness rather than speed. | |
### Improvements: | |
- The refactoring enhances code clarity and maintainability. | |
- Switching to `stableNorm()` may improve numerical robustness, for example by avoiding overflow or underflow in intermediate results. | |
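To illustrate why a stable norm can matter, here is a sketch that compares a naive Euclidean norm with one that rescales by the largest magnitude first. This is a generic textbook technique written with hypothetical helper names (`naive_norm`, `scaled_norm`); it is not Eigen's `stableNorm()` implementation:

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Naive norm: squaring large entries can overflow to infinity.
inline double naive_norm(const std::vector<double>& v) {
  double sum = 0.0;
  for (double x : v) sum += x * x;
  return std::sqrt(sum);
}

// Scaled norm: divide by the largest magnitude so the squares stay in range.
inline double scaled_norm(const std::vector<double>& v) {
  double scale = 0.0;
  for (double x : v) scale = std::max(scale, std::fabs(x));
  if (scale == 0.0) return 0.0;  // all-zero (or empty) input
  double sum = 0.0;
  for (double x : v) {
    const double r = x / scale;
    sum += r * r;
  }
  return scale * std::sqrt(sum);
}

inline void check_norm_sketch() {
  const std::vector<double> big = {1e200, 1e200};
  assert(std::isinf(naive_norm(big)));      // 1e400 is not representable
  assert(std::isfinite(scaled_norm(big)));  // roughly 1.41e200
}
```

The extra pass over the data is the price of the rescaling, which is why a stable norm is a robustness measure and not a speed optimization.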
### Impact: | |
These updates are expected to improve the robustness of the IDRS algorithm; the description reports no performance measurements." | |
756 (https://gitlab.com/libeigen/eigen/-/merge_requests/756),Only include <atomic> if needed.,"### What does this implement/fix? | |
<!--Please explain your changes.--> | |
This will allow (e.g. embedded) toolchains without atomic support to continue compiling Eigen Core by setting EIGEN_DONT_PARALLELIZE.",Rasmus Munk Larsen,2021-12-02T23:55:25.835Z,NA,NA,"## Title: | |
Only include <atomic> if needed. | |
## Authors: | |
Rasmus Munk Larsen | |
## Summary: | |
This merge request implements a conditional inclusion of the `<atomic>` header in the Eigen C++ library, aimed at improving compatibility with toolchains that do not support atomic operations. | |
### Key Changes: | |
- The inclusion of `<atomic>` is now conditional, allowing for compilation of Eigen Core in environments where atomic support is not available by defining `EIGEN_DONT_PARALLELIZE`. | |
### Improvements: | |
- Enhances compatibility with embedded toolchains by removing the dependency on atomic operations, thus broadening the usability of the library. | |
### Impact: | |
- Users of Eigen in restricted environments can now compile the library without atomic support, facilitating its use in various applications and improving overall accessibility." | |
759 (https://gitlab.com/libeigen/eigen/-/merge_requests/759),fix typo `StableNorm` -> `stableNorm`,"the new code in `IDRS.h` misspells `stableNorm`. | |
This seems to have not been caught by the smoketests, maybe they should be extended?",Erik Schultheis,2021-12-04T14:52:10.315Z,NA,NA,"## Title: | |
Fix typo `StableNorm` -> `stableNorm` | |
## Authors: | |
Erik Schultheis | |
## Summary: | |
This merge request addresses a typo in the `IDRS.h` file where `stableNorm` was misspelled as `StableNorm`. | |
### Key Changes: | |
- Corrected the misspelled `StableNorm` to `stableNorm` in the code. | |
### Improvements: | |
- Makes the new code in `IDRS.h` call the correctly named `stableNorm` method. | |
### Impact: | |
- Enhances code clarity, potentially reducing confusion and improving maintainability. Additionally, it raises the question of enhancing smoketests to catch such errors in the future." | |
762 (https://gitlab.com/libeigen/eigen/-/merge_requests/762),fixed snippets,see !760,Erik Schultheis,2021-12-05T17:31:12.703Z,NA,NA,"## Title: | |
fixed snippets | |
## Authors: | |
Erik Schultheis | |
## Summary: | |
This merge request fixes code snippets in the Eigen C++ library. Its description only points to !760, which removes `using namespace Eigen` from the sample code. | |
### Key Changes: | |
- Fixed various code snippets in the documentation to ensure they are accurate and functional. | |
### Improvements: | |
- Improved the readability and reliability of the documentation snippets, making it easier for users to understand and implement the Eigen library. | |
### Impact: | |
- Enhancements contribute to a better user experience, aiding developers in effectively utilizing the Eigen library's features." | |
761 (https://gitlab.com/libeigen/eigen/-/merge_requests/761),Some further cleanup,"This removes further compiler versions checks for obsolete versions, and the `EIGEN_HAS_CXX14_VARIABLES_TEMPLATES`, `EIGEN_HAS_TYPE_TRAITS`, `EIGEN_HAS_SFINAE` flags.",Erik Schultheis,2021-12-06T18:01:15.774Z,NA,NA,"## Title: | |
Some further cleanup | |
## Authors: | |
Erik Schultheis | |
## Summary: | |
This merge request focuses on removing outdated checks and flags related to compiler versions and features that are no longer necessary for the Eigen C++ library. | |
### Key Changes: | |
- Removal of checks for obsolete compiler versions. | |
- Deletion of the `EIGEN_HAS_CXX14_VARIABLES_TEMPLATES`, `EIGEN_HAS_TYPE_TRAITS`, and `EIGEN_HAS_SFINAE` flags. | |
### Improvements: | |
- Streamlined codebase by eliminating unnecessary complexity related to outdated compiler configurations. | |
### Impact: | |
- Reduces maintenance overhead by dropping checks and flags that only mattered for obsolete compiler versions." | |
577 (https://gitlab.com/libeigen/eigen/-/merge_requests/577),Idrsstabl,"This implements IDR(s)STAB(l) | |
The IDR(s)STAB(l) is a combination of IDR(s) and BiCGSTAB(l). It is a short-recurrences Krylov method for sparse square problems. | |
It can outperform both IDR(s) and BiCGSTAB(l). IDR(s)STAB(l) generally closely follows the optimal GMRES convergence in | |
terms of the number of matrix-vector products, but without the increasing cost per iteration of GMRES. IDR(s)STAB(l) | |
is suitable for both indefinite systems and systems with complex eigenvalues.",Jens Wehner,2021-12-06T20:00:01.016Z,MR to reopen,NA,"## Title: | |
Idrsstabl | |
## Authors: | |
Jens Wehner | |
## Summary: | |
This merge request introduces the IDR(s)STAB(l) method, which is a new implementation combining features of IDR(s) and BiCGSTAB(l) for solving sparse square problems. This method aims to enhance performance in terms of convergence and computational efficiency. | |
### Key Changes: | |
- Implementation of the IDR(s)STAB(l) method. | |
- Combines attributes of IDR(s) and BiCGSTAB(l) for improved performance. | |
### Improvements: | |
- Provides a Krylov method with short recurrences that can outperform both IDR(s) and BiCGSTAB(l). | |
- Generally follows the optimal GMRES convergence closely in terms of matrix-vector products, without the increasing cost per iteration of GMRES. | |
- Suitable for indefinite systems and those with complex eigenvalues. | |
### Impact: | |
The introduction of the IDR(s)STAB(l) method may lead to more efficient solving of sparse matrix problems within the Eigen library, enhancing overall performance for users who work with complex or indefinite systems." | |
765 (https://gitlab.com/libeigen/eigen/-/merge_requests/765),disambiguate overloads for empty index list,"Clang complains about an ambiguous overload for creating compile time indices when the index list is empty. | |
(e.g. https://gitlab.com/libeigen/eigen/-/jobs/1856367113#L8459) | |
``` | |
../unsupported/test/../../unsupported/Eigen/CXX11/src/Tensor/TensorMeta.h:284:12: error: call to 'customIndices2Array' is ambiguous | |
return customIndices2Array(idx, typename gen_numeric_list<Index, NumIndices>::type{}); | |
``` | |
In principle, the generic version should be enough I think, but the git blame said that the second overload was introduced to prevent some warnings, so I've kept the two overloads. | |
Instead, the first overload now explicitly mentions at least one single entry, thus not being viable for the empty case and preventing the ambiguous call.",Erik Schultheis,2021-12-07T19:40:10.444Z,NA,NA,"## Title: | |
Disambiguate overloads for empty index list | |
## Authors: | |
Erik Schultheis | |
## Summary: | |
This merge request addresses a Clang compiler issue regarding ambiguous overloads that arise when creating compile-time indices with an empty index list in the Eigen C++ library. The proposed solution modifies the function overloads to eliminate ambiguity and enhance clarity in index list handling. | |
### Key Changes: | |
- Adjusted the first overload of the `customIndices2Array` function to explicitly require at least one index entry. This makes that overload non-viable for an empty index list, which removes the ambiguity. | |
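The disambiguation idea can be sketched as follows. The names `index_list` and `to_array` are hypothetical stand-ins for the Tensor module's helpers, and the sketch shows only the shape of the fix (one overload that names a first entry, one for the empty list), not the original ambiguity:

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a compile-time index list.
template <std::size_t... Is>
struct index_list {};

// Non-empty case: naming the first entry explicitly makes this overload
// non-viable for index_list<>.
template <typename Source, std::size_t First, std::size_t... Rest>
std::array<int, 1 + sizeof...(Rest)> to_array(const Source& src,
                                              index_list<First, Rest...>) {
  return {{src[First], src[Rest]...}};
}

// Empty case: the only viable overload for index_list<>.
template <typename Source>
std::array<int, 0> to_array(const Source&, index_list<>) {
  return {};
}

inline void check_index_sketch() {
  const std::array<int, 3> src = {{7, 8, 9}};
  const auto picked = to_array(src, index_list<0, 2>());
  assert(picked.size() == 2 && picked[0] == 7 && picked[1] == 9);
  const auto empty = to_array(src, index_list<>());
  assert(empty.size() == 0);
}
```

Each call has exactly one viable candidate, so the compiler never has to rank the two overloads against each other.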
### Improvements: | |
- Reduced ambiguity in function calls related to index creation. | |
- Maintained two overloads to preserve functionality while making the interface clearer. | |
### Impact: | |
- The change fixes the Clang build error for empty index lists while keeping both overloads, so the warnings that the second overload was introduced to prevent stay suppressed." | |
760 (https://gitlab.com/libeigen/eigen/-/merge_requests/760),get rid of `using namespace Eigen` in sample code,"Even if we cannot get rid of the bad examples elsewhere, a first step of showing that `using namespace Eigen` is not good practice is to not do it in the examples that eigen comes with.",Erik Schultheis,2021-12-07T19:57:39.184Z,NA,NA,"## Title: | |
Get Rid of `using namespace Eigen` in Sample Code | |
## Authors: | |
Erik Schultheis | |
## Summary: | |
This merge request focuses on eliminating the use of `using namespace Eigen` in the sample code provided with the Eigen C++ library. The author's reasoning is that the examples shipped with Eigen should not model this practice, even if bad examples elsewhere cannot be removed. | |
### Key Changes: | |
- Removed instances of `using namespace Eigen` from the sample code. | |
### Improvements: | |
- Enhances code quality in examples by adhering to better namespace practices. | |
### Impact: | |
- Promotes better coding habits among users of the Eigen library, helping prevent potential naming conflicts and improving code clarity." | |
767 (https://gitlab.com/libeigen/eigen/-/merge_requests/767),Make sure exp(-Inf) is zero for vectorized expressions.,"### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
This fixes #2385 | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
Before this fix exp() would return a non-zero value for -Inf arguments if the expression was vectorized (i.e. the array being at least as long as the packet size of the corresponding scalar type). | |
### Additional information | |
<!--Any additional information you think is important.--> | |
For AVX2 this change gives a small speedup for float and is neutral for double. | |
AVX2 on Skylake: | |
``` | |
name old cpu/op new cpu/op delta | |
BM_eigen_exp_double/1 3.54ns ± 0% 3.54ns ± 0% -0.09% (p=0.005 n=50+49) | |
BM_eigen_exp_double/8 58.8ns ± 1% 59.0ns ± 3% ~ (p=0.385 n=43+56) | |
BM_eigen_exp_double/64 201ns ± 4% 200ns ± 4% ~ (p=0.299 n=59+60) | |
BM_eigen_exp_double/512 1.29µs ± 2% 1.28µs ± 3% -0.73% (p=0.001 n=59+59) | |
BM_eigen_exp_double/4k 9.92µs ± 2% 9.90µs ± 3% ~ (p=0.435 n=59+59) | |
BM_eigen_exp_double/32k 78.8µs ± 2% 78.9µs ± 3% ~ (p=0.584 n=58+59) | |
BM_eigen_exp_double/256k 634µs ± 2% 628µs ± 3% -0.96% (p=0.000 n=59+58) | |
BM_eigen_exp_double/1M 2.54ms ± 2% 2.51ms ± 2% -1.24% (p=0.000 n=34+33) | |
BM_eigen_exp_float/1 3.27ns ± 0% 3.27ns ± 0% -0.10% (p=0.000 n=50+47) | |
BM_eigen_exp_float/8 30.3ns ± 5% 29.6ns ± 0% -2.34% (p=0.001 n=54+50) | |
BM_eigen_exp_float/64 81.3ns ± 2% 79.6ns ± 2% -2.11% (p=0.000 n=58+58) | |
BM_eigen_exp_float/512 471ns ± 4% 455ns ± 3% -3.40% (p=0.000 n=60+58) | |
BM_eigen_exp_float/4k 3.58µs ± 3% 3.45µs ± 3% -3.53% (p=0.000 n=50+49) | |
BM_eigen_exp_float/32k 28.5µs ± 3% 27.5µs ± 3% -3.52% (p=0.000 n=54+52) | |
BM_eigen_exp_float/256k 227µs ± 4% 220µs ± 3% -3.27% (p=0.000 n=49+49) | |
BM_eigen_exp_float/1M 908µs ± 4% 884µs ± 2% -2.65% (p=0.000 n=42+43) | |
``` | |
For SSE, the change nets a 4-6% speedup: | |
``` | |
name old cpu/op new cpu/op delta | |
BM_eigen_exp_double/1 1.90ns ± 0% 1.90ns ± 1% ~ (p=0.567 n=48+60) | |
BM_eigen_exp_double/8 48.2ns ± 0% 45.9ns ± 0% -4.76% (p=0.000 n=49+51) | |
BM_eigen_exp_double/64 348ns ± 2% 328ns ± 2% -5.94% (p=0.000 n=50+49) | |
BM_eigen_exp_double/512 2.74µs ± 0% 2.56µs ± 0% -6.61% (p=0.000 n=44+53) | |
BM_eigen_exp_double/4k 21.9µs ± 0% 20.5µs ± 0% -6.41% (p=0.000 n=58+50) | |
BM_eigen_exp_double/32k 175µs ± 0% 163µs ± 0% -6.52% (p=0.000 n=52+50) | |
BM_eigen_exp_double/256k 1.40ms ± 0% 1.31ms ± 0% -6.45% (p=0.000 n=54+51) | |
BM_eigen_exp_double/1M 5.59ms ± 0% 5.23ms ± 0% -6.41% (p=0.000 n=43+43) | |
BM_eigen_exp_float/1 1.87ns ± 2% 1.89ns ± 0% +1.06% (p=0.000 n=60+53) | |
BM_eigen_exp_float/8 22.5ns ± 0% 25.3ns ± 0% +12.65% (p=0.000 n=54+48) | |
BM_eigen_exp_float/64 149ns ± 0% 142ns ± 0% -4.84% (p=0.000 n=59+50) | |
BM_eigen_exp_float/512 1.17µs ± 0% 1.11µs ± 0% -5.07% (p=0.000 n=54+52) | |
BM_eigen_exp_float/4k 9.36µs ± 0% 8.87µs ± 0% -5.21% (p=0.000 n=52+55) | |
BM_eigen_exp_float/32k 74.9µs ± 0% 70.9µs ± 0% -5.41% (p=0.000 n=54+53) | |
BM_eigen_exp_float/256k 599µs ± 0% 569µs ± 0% -5.11% (p=0.000 n=58+53) | |
BM_eigen_exp_float/1M 2.39ms ± 0% 2.27ms ± 0% -4.92% (p=0.000 n=33+32) | |
```",Rasmus Munk Larsen,2021-12-08T17:57:24.619Z,NA,NA,"## Title: | |
Make sure exp(-Inf) is zero for vectorized expressions. | |
## Authors: | |
Rasmus Munk Larsen | |
## Summary: | |
This merge request addresses an issue (#2385) within the Eigen C++ library concerning the behavior of the `exp()` function when applied to `-Inf` arguments in vectorized expressions. It ensures that `exp(-Inf)` returns zero, aligning the functionality with mathematical expectations. | |
### Key Changes: | |
- Modified the implementation of the `exp()` function to return zero for `-Inf` arguments in vectorized expressions. Before the fix, the vectorized path returned a non-zero value for `-Inf`. | |
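One plausible way the wrong result can arise, sketched in scalar code: kernels that clamp their argument to a finite range before evaluating map `-Inf` to the lower bound and return a tiny positive number. Whether Eigen's packet code followed exactly this pattern is an assumption; the helper names and the bounds `-87.0f` and `88.0f` are hypothetical:

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Clamps the argument first, as a range-limited kernel might do.
inline float exp_clamped(float x) {
  const float lo = -87.0f;  // hypothetical lower bound
  const float hi = 88.0f;   // hypothetical upper bound
  return std::exp(std::fmin(std::fmax(x, lo), hi));
}

// Same kernel, but -Inf is mapped to exactly zero.
inline float exp_with_inf_fix(float x) {
  const float neg_inf = -std::numeric_limits<float>::infinity();
  return x == neg_inf ? 0.0f : exp_clamped(x);
}

inline void check_exp_sketch() {
  const float neg_inf = -std::numeric_limits<float>::infinity();
  assert(exp_clamped(neg_inf) > 0.0f);        // tiny but non-zero
  assert(exp_with_inf_fix(neg_inf) == 0.0f);  // matches std::exp(-Inf)
  assert(std::exp(neg_inf) == 0.0f);
}
```

In packet code the comparison would typically become a mask and a select rather than a branch; that detail is omitted here.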
### Improvements: | |
- For AVX2, the changes yield a small speedup in performance for float types, while the impact on doubles is neutral. | |
- For SSE, most sizes run approximately 4-6% faster, although `BM_eigen_exp_float/8` regresses by about 12.7% in the reported benchmark. | |
### Impact: | |
- Enhancing the behavior of the `exp()` function improves mathematical correctness when using vectorized expressions. | |
- Performance optimizations for both AVX2 and SSE result in faster computations, particularly for float types, improving overall library efficiency in relevant scenarios." | |
758 (https://gitlab.com/libeigen/eigen/-/merge_requests/758),Build unit tests for HIP using C++14.,"Build GPU unit tests for HIP using C++14. | |
/cc @cantonios",Rohit Santhanam,2021-12-09T08:04:20.163Z,NA,NA,"## Title: | |
Build unit tests for HIP using C++14. | |
## Authors: | |
Rohit Santhanam | |
## Summary: | |
This merge request builds Eigen's GPU unit tests for HIP (Heterogeneous-Compute Interface for Portability) with the C++14 standard. | |
### Key Changes: | |
- Switched the build of the HIP GPU unit tests to C++14. | |
### Improvements: | |
- Keeps the HIP test build in line with the C++14 language level that other merge requests in this list rely on. | |
### Impact: | |
- The GPU unit tests for HIP are compiled under C++14, so test code that needs C++14 can also be built for HIP." | |
763 (https://gitlab.com/libeigen/eigen/-/merge_requests/763),removed helper cmake macro and don't use deprecated COMPILE_FLAGS anymore.,"This is a first attempt at starting to clean up the cmake scripts. | |
It replaces manual manipulation of (deprecated) `COMPILE_FLAGS` with `target_compile_options` and | |
`target_compile_definitions` instead. | |
I don't have a SYCL/HIP version available here, so I haven't tested that part. | |
Q: Should we (in a separate MR) define an interface target for the common test flags and then just `target_link_libraries` with those instead of adding all these options inside `ei_add_test_internal`?",Erik Schultheis,2021-12-09T23:09:56.963Z,NA,NA,"## Title: | |
Removed helper cmake macro and don't use deprecated COMPILE_FLAGS anymore. | |
## Authors: | |
Erik Schultheis | |
## Summary: | |
This merge request initiates the cleanup of the CMake scripts for the Eigen C++ library by removing the usage of deprecated `COMPILE_FLAGS`. It replaces the manual handling of these flags with the more modern `target_compile_options` and `target_compile_definitions`. | |
### Key Changes: | |
- Eliminated the usage of the deprecated `COMPILE_FLAGS`. | |
- Introduced `target_compile_options` and `target_compile_definitions` for better flag management in CMake. | |
### Improvements: | |
- Enhances the maintainability and readability of the CMake scripts. | |
- Aligns the project with current best practices in CMake usage. | |
### Impact: | |
- Sets the foundation for improved future enhancements to the build system. | |
- Reduces technical debt by phasing out deprecated practices, which could lead to fewer issues in future development." | |
768 (https://gitlab.com/libeigen/eigen/-/merge_requests/768),removed Find*.cmake scripts for which these are available in cmake itself,"As per #2387, this removes find scripts for packages which have these supplied by cmake. | |
For BLAS, Lapack, GLEW, and GSL cmake provides find scripts: | |
* https://cmake.org/cmake/help/v3.10/module/FindBLAS.html | |
* https://cmake.org/cmake/help/v3.10/module/FindLAPACK.html | |
* https://cmake.org/cmake/help/v3.10/module/FindGLEW.html | |
* https://cmake.org/cmake/help/v3.10/module/FindGSL.html | |
For GSL and GLEW, the cmake versions provide imported targets (for BLAS and LAPACK only after 3.18), so this change should enable further downstream improvements of the cmake scripts that I haven't looked at yet.",Erik Schultheis,2021-12-10T02:02:35.546Z,NA,NA,"## Title: | |
Removed Find*.cmake scripts for which these are available in cmake itself | |
## Authors: | |
Erik Schultheis | |
## Summary: | |
This merge request removes specific Find*.cmake scripts related to BLAS, LAPACK, GLEW, and GSL, as these packages are now adequately supported by CMake's own find scripts, improving compatibility and maintainability. | |
### Key Changes: | |
- Eliminated custom Find*.cmake scripts for BLAS, LAPACK, GLEW, and GSL. | |
- CMake now provides built-in support for finding these packages. | |
### Improvements: | |
- Simplifies the CMake configuration process by relying on standard CMake modules. | |
- The imported targets that CMake provides for GSL and GLEW simplify integration with the rest of the build scripts. | |
### Impact: | |
- Enhances the maintainability of the Eigen library's CMake scripts. | |
- Facilitates potential downstream enhancements in CMake scripts for better package management and usability." | |
770 (https://gitlab.com/libeigen/eigen/-/merge_requests/770),fixed customIndices2Array forgetting first index,"In !765 I made sure the overload for `customIndices2Array` was picking up the first element separately to disambiguate the empty case. But I forgot to put this first element into the newly created array. Since this only affects the tensor module, running the smoke tests did not reveal the problem. | |
I'm currently running `./check.sh "".*tensor.*""` to check that this fix really helps.",Erik Schultheis,2021-12-10T16:42:00.358Z,NA,NA,"## Title: | |
fixed customIndices2Array forgetting first index | |
## Authors: | |
Erik Schultheis | |
## Summary: | |
This merge request addresses an oversight in the `customIndices2Array` function within the Eigen C++ library's tensor module, where the first index was not being included in the newly created array. | |
### Key Changes: | |
- Fixed the `customIndices2Array` function to ensure the first element is included when creating the array from custom indices. | |
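The pattern behind the bug can be sketched with a simplified analog. This is not Eigen's code: `indices_to_array` is a hypothetical variadic helper that, like the overload described in the MR, takes the first index as its own parameter so that the empty case cannot match.

```cpp
#include <array>

// Hypothetical stand-in for the pattern described above. The first index is
// a separate parameter, which keeps this overload from matching an empty list.
template <typename Index, typename First, typename... Rest>
std::array<Index, 1 + sizeof...(Rest)> indices_to_array(First first, Rest... rest) {
  // The bug described in the MR corresponds to leaving `first` out of this
  // initializer, so that only the remaining indices reach the array.
  return {{static_cast<Index>(first), static_cast<Index>(rest)...}};
}
```

For example, `indices_to_array<long>(3, 4, 5)` yields an array of three elements that starts with 3.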
### Improvements: | |
- Restores correct behavior after !765, which split off the first element to disambiguate the empty case but then left it out of the resulting array. | |
### Impact: | |
- This fix improves the robustness of the tensor operations in the Eigen library, ensuring accurate results when using custom indices, particularly in edge cases." | |
769 (https://gitlab.com/libeigen/eigen/-/merge_requests/769),Fix,"Fixes | |
``` | |
Eigen/src/CholmodSupport/CholmodSupport.h:13, | |
from Eigen/SPQRSupport:31, | |
from test/spqr_support.cpp:11: | |
Eigen/src/CholmodSupport/./InternalHeaderCheck.h:2:2: error: #error ""Please include Eigen/CholmodSupport instead of including headers inside the src directory directly."" | |
```",Erik Schultheis,2021-12-10T16:59:49.166Z,NA,NA,"## Title: | |
Fix | |
## Authors: | |
Erik Schultheis | |
## Summary: | |
This merge request fixes a build error in `test/spqr_support.cpp`: including `Eigen/SPQRSupport` reached `Eigen/src/CholmodSupport/CholmodSupport.h` directly, which tripped the `InternalHeaderCheck.h` guard of `CholmodSupport`. | |
### Key Changes: | |
- Resolved the `#error` raised by `InternalHeaderCheck.h` when `CholmodSupport.h` is reached through `Eigen/SPQRSupport`. | |
### Improvements: | |
- `Eigen/SPQRSupport` can be included again without triggering the internal-header check. | |
### Impact: | |
- Restores the build of the SPQR support test while keeping the rule that headers inside the `src` directory are not included directly." | |
753 (https://gitlab.com/libeigen/eigen/-/merge_requests/753),turn some macros into constexpr functions,"Now that we have C++14, some actual cleanup. | |
This turns some ""computational"" macros into actual constexpr functions. | |
The added benefit is that there is a bit more checking involved, e.g. you cannot pass floats anymore. | |
Should the type here be `int` or rather `Eigen::Index`? | |
Also, both the old macro and the new code are susceptible to problems stemming from narrowing conversions.",Erik Schultheis,2021-12-10T19:27:02.761Z,NA,NA,"## Title: | |
Turn some macros into constexpr functions | |
## Authors: | |
Erik Schultheis | |
## Summary: | |
This merge request focuses on improving the Eigen C++ library by converting certain computational macros into `constexpr` functions now that C++14 is available. The change brings enhanced type safety and improves code maintainability. | |
### Key Changes: | |
- Transitioned computational macros to `constexpr` functions. | |
- Increased type checking by disallowing the passing of floats. | |
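A minimal sketch of the macro-to-function change, under stated assumptions: `SKETCH_MIN_MACRO` and `sketch_min` are hypothetical names, and the `enable_if` constraint is just one way to reject floating-point arguments; the MR may enforce this differently.

```cpp
#include <type_traits>

// Old style: a computational macro accepts any expression, including floats.
#define SKETCH_MIN_MACRO(a, b) (((a) < (b)) ? (a) : (b))

// New style: a constexpr function template that only participates in overload
// resolution for integral arguments, so passing a float fails to compile.
template <typename A, typename B,
          typename = typename std::enable_if<std::is_integral<A>::value &&
                                             std::is_integral<B>::value>::type>
constexpr int sketch_min(A a, B b) {
  // The casts to int can still narrow wide integer types; as the MR notes,
  // the function form does not remove that risk.
  return static_cast<int>(a) < static_cast<int>(b) ? static_cast<int>(a)
                                                   : static_cast<int>(b);
}
```

With these definitions, `sketch_min(3, 7)` is usable in a constant expression, while `sketch_min(1.5, 2)` is rejected at compile time; the macro accepts both.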
### Improvements: | |
- Enhanced safety with stricter type checks. | |
- Improved readability and maintainability of the code through the elimination of macros. | |
### Impact: | |
This improvement reduces the likelihood of errors stemming from improper type usage. As the author notes, both the old macros and the new functions remain susceptible to narrowing conversions." | |
774 (https://gitlab.com/libeigen/eigen/-/merge_requests/774),Fixes for enabling HIP unit tests. Includes a fix to make this work with the latest cmake.,"Fixes for enabling unit tests on HIP. | |
/cc @cantonios",Rohit Santhanam,2021-12-12T21:03:31.341Z,NA,NA,"## Title: | |
Fixes for enabling HIP unit tests | |
## Authors: | |
Rohit Santhanam | |
## Summary: | |
This merge request introduces fixes that enable unit tests for the HIP (Heterogeneous-compute Interface for Portability) framework, ensuring compatibility with the latest version of CMake. | |
### Key Changes: | |
- Implemented fixes to allow the execution of unit tests on HIP. | |
- Updated CMake configuration to work seamlessly with recent changes. | |
### Improvements: | |
- Enhanced compatibility of the Eigen C++ library with the HIP framework. | |
- Streamlined the testing process to ensure robust unit testing capabilities. | |
### Impact: | |
The changes enhance the library's support for HIP, improving its usability for developers working in heterogeneous computing environments and ensuring updated CMake compatibility for smoother build processes." | |
776 (https://gitlab.com/libeigen/eigen/-/merge_requests/776),space separated EIGEN_TEST_CUSTOM_CXX_FLAGS,"Convert spaces in `EIGEN_TEST_CUSTOM_CXX_FLAGS` into `;` to make this a CMake List. | |
This is using the `MODE` version of `separate_arguments` because this respects escapes and quotes. | |
Interestingly, at this point I think `EIGEN_TEST_CUSTOM_LINKER_FLAGS` does not need this processing. I think this is because we are still using the legacy way of specifying `target_link_libraries` without a `PUBLIC`/`PRIVATE` specifier.",Erik Schultheis,2021-12-13T15:27:34.232Z,NA,NA,"## Title: | |
space separated EIGEN_TEST_CUSTOM_CXX_FLAGS | |
## Authors: | |
Erik Schultheis | |
## Summary: | |
This merge request introduces a modification to the `EIGEN_TEST_CUSTOM_CXX_FLAGS` variable, converting spaces into semicolons to enable its use as a CMake List. This change employs the `MODE` version of `separate_arguments`, which properly handles escape sequences and quotation marks. | |
### Key Changes: | |
- Conversion of spaces in `EIGEN_TEST_CUSTOM_CXX_FLAGS` to semicolons. | |
- Utilization of the `MODE` variant of `separate_arguments` for improved handling of special characters. | |
### Improvements: | |
- Enhances the ability to specify multiple compiler flags in `EIGEN_TEST_CUSTOM_CXX_FLAGS`. | |
- Streamlines the integration with CMake by ensuring proper syntax. | |
### Impact: | |
- Improves the build configuration process for projects utilizing the Eigen library by facilitating more robust flag handling. | |
- The existing approach for `EIGEN_TEST_CUSTOM_LINKER_FLAGS` remains unchanged, indicating that no immediate action is required for that variable, thereby maintaining compatibility with the current build system." | |
773 (https://gitlab.com/libeigen/eigen/-/merge_requests/773),Small speed-up in row-major sparse dense product,"This MR changes the implementation of sparse_time_dense_product_impl for `RowMajor` matrices to use two accumulation variables instead of one. This breaks up the dependency chain of adding the values, and opens up options for the CPU to employ instruction-level parallelism. | |
I have run the following benchmark on my system (AMD Threadripper 1950, single threaded) | |
``` | |
#define ANKERL_NANOBENCH_IMPLEMENT | |
#include <nanobench.h> | |
#include <Eigen/Sparse> | |
#include <chrono> | |
#include <random> | |
#include <vector> | |
void benchmark_sparse_dense_vector(ankerl::nanobench::Bench& bench) { | |
const Eigen::Index SIZE = 100'000; | |
const Eigen::Index NNZ = 10'000'000; | |
Eigen::SparseMatrix<float, Eigen::RowMajor> matrix(SIZE, SIZE); | |
std::default_random_engine rng; | |
std::uniform_int_distribution<Eigen::Index> dist(0, SIZE - 1); | |
std::vector<Eigen::Triplet<float>> triplets; | |
triplets.reserve(NNZ); | |
for(Eigen::Index i = 0; i < NNZ; ++i) { | |
triplets.emplace_back(dist(rng), dist(rng), 1.0f); | |
} | |
matrix.setFromTriplets(begin(triplets), end(triplets)); | |
Eigen::VectorXf rhs = Eigen::VectorXf::Random(SIZE); | |
Eigen::VectorXf dst(SIZE); | |
bench.run(""Sparse * Dense -> Dense"", [&] {dst = matrix * rhs; }); | |
} | |
int main() { | |
ankerl::nanobench::Bench bench{}; | |
bench.warmup(10).epochs(100); | |
bench.minEpochTime(std::chrono::nanoseconds{1'000'000}); | |
bench.minEpochIterations(10); | |
benchmark_sparse_dense_vector(bench); | |
} | |
``` | |
with the following results | |
# SIZE = 1M | |
## New | |
| ns/op | op/s | err% | ins/op | bra/op | miss% | total | benchmark | |
|--------------------:|--------------------:|--------:|----------------:|---------------:|--------:|----------:|:---------- | |
| 15,099,473.67 | 66.23 | 0.4% | 89,126,601.18 | 14,625,689.82 | 7.2% | 16.55 | `Sparse * Dense -> Dense` | |
## Old | |
| ns/op | op/s | err% | ins/op | bra/op | miss% | total | benchmark | |
|--------------------:|--------------------:|--------:|----------------:|---------------:|--------:|----------:|:---------- | |
| 16,052,703.27 | 62.29 | 1.4% | 90,625,231.30 | 14,125,126.00 | 7.3% | 18.99 | `Sparse * Dense -> Dense` | |
# SIZE = 100k | |
## New | |
| ns/op | op/s | err% | ins/op | bra/op | miss% | total | benchmark | |
|--------------------:|--------------------:|--------:|----------------:|---------------:|--------:|----------:|:---------- | |
| 8,723,373.15 | 114.63 | 1.1% | 62,853,098.64 | 10,452,488.20 | 1.0% | 9.73 | `Sparse * Dense -> Dense` | |
## Old | |
| ns/op | op/s | err% | ins/op | bra/op | miss% | total | benchmark | |
|--------------------:|--------------------:|--------:|----------------:|---------------:|--------:|----------:|:---------- | |
| 9,254,136.85 | 108.06 | 0.4% | 71,994,036.60 | 10,402,745.36 | 1.0% | 10.45 | `Sparse * Dense -> Dense` | |
On the other hand, in my real application (about 60% time spent sparse matrix times dense vector), I don't see any change. I suspect that is because I'm running many of these in parallel, and the computation is severely memory bound anyway.",Erik Schultheis,2021-12-15T18:46:26.435Z,NA,NA,"## Title: | |
Small speed-up in row-major sparse dense product | |
## Authors: | |
Erik Schultheis | |
## Summary: | |
This merge request optimizes the implementation of the `sparse_time_dense_product_impl` for `RowMajor` matrices in the Eigen C++ library. By utilizing two accumulation variables instead of one, the dependency chain for value addition is broken, enabling better instruction-level parallelism by the CPU. | |
### Key Changes: | |
- Changed the implementation of the `sparse_time_dense_product_impl` for `RowMajor` matrices. | |
- Introduced two accumulation variables to improve computational efficiency. | |
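The two-accumulator idea can be sketched on a single sparse row. This is an illustration only: `sparse_row_dot` is a hypothetical free function over plain vectors, not the code inside `sparse_time_dense_product_impl`.

```cpp
#include <cstddef>
#include <vector>

// Dot product of one compressed sparse row with a dense vector, using two
// independent accumulators. `values` holds the nonzeros of the row and `cols`
// their column indices.
inline float sparse_row_dot(const std::vector<float>& values,
                            const std::vector<std::size_t>& cols,
                            const std::vector<float>& rhs) {
  float acc0 = 0.0f;
  float acc1 = 0.0f;
  std::size_t k = 0;
  // Two nonzeros per iteration: the two additions do not depend on each
  // other, so the CPU is free to overlap them.
  for (; k + 1 < values.size(); k += 2) {
    acc0 += values[k] * rhs[cols[k]];
    acc1 += values[k + 1] * rhs[cols[k + 1]];
  }
  // Trailing nonzero when the row holds an odd number of entries.
  if (k < values.size()) acc0 += values[k] * rhs[cols[k]];
  return acc0 + acc1;
}
```

Because floating-point addition is not associative, the result can differ in the last bits from a single-accumulator loop.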
### Improvements: | |
- Enhanced performance in terms of operation speed demonstrated through benchmarking. | |
- Reduced time per operation (ns/op) for matrix sizes of 100k and 1M. | |
### Impact: | |
- Benchmarks show a reduction in computation time of roughly 6%: | |
- For SIZE = 1M, the time decreased from 16,052,703.27 ns/op to 15,099,473.67 ns/op. | |
- For SIZE = 100k, the time improved from 9,254,136.85 ns/op to 8,723,373.15 ns/op. | |
- While the improvements are observed in benchmarks, real applications with parallel computations may not benefit significantly due to memory bandwidth limitations." | |
782 (https://gitlab.com/libeigen/eigen/-/merge_requests/782),Fix a bug introduced in !751.,A few uses of the EIGEN_IMPLIES macro had side-effects that were conditionally short-circuited in the old but not the new implementation.,Rasmus Munk Larsen,2021-12-15T22:00:40.737Z,NA,NA,"## Title: | |
Fix a bug introduced in !751. | |
## Authors: | |
Rasmus Munk Larsen | |
## Summary: | |
This merge request addresses a bug resulting from the changes made in merge request !751. Specifically, a few uses of the EIGEN_IMPLIES macro had side-effects that were conditionally short-circuited in the old implementation but not in the new one. | |
### Key Changes: | |
- Revised the usage of the EIGEN_IMPLIES macro to ensure that side-effects are handled as they were in the previous version. | |
### Improvements: | |
- Restores the conditional short-circuiting behavior of the EIGEN_IMPLIES macro, preventing unintended side-effects when the macro is evaluated. | |
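The difference in evaluation can be reproduced in a few lines. This is a sketch under the assumption that the non-short-circuiting form behaves like a function call; `SKETCH_IMPLIES`, `sketch_implies`, and `bump` are hypothetical names, not Eigen's definitions.

```cpp
// Macro form: `b` is only evaluated when `a` is true, because `||`
// short-circuits on the textual expression.
#define SKETCH_IMPLIES(a, b) (!(a) || (b))

// Function form: both arguments are evaluated before the call is made.
constexpr bool sketch_implies(bool a, bool b) { return !a || b; }

// Helper with a visible side effect, used to observe whether the second
// operand was evaluated.
inline bool bump(int& counter) {
  ++counter;
  return true;
}
```

With a counter `n` starting at zero, `SKETCH_IMPLIES(false, bump(n))` leaves `n` unchanged, whereas `sketch_implies(false, bump(n))` increments it.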
### Impact: | |
- Ensures that the macro behaves consistently with its prior implementation, which helps avoid potential bugs and ensures reliability in existing code that relies on EIGEN_IMPLIES." | |
783 (https://gitlab.com/libeigen/eigen/-/merge_requests/783),Simplify logical_xor(),"### What does this implement/fix? | |
For `a` and `b` of type `bool` we can simplify `(a || b) && !(a && b)` to `a != b`. | |
Maybe we could even consider using `!=` right away in the code for clarity instead of invoking the function `logical_xor()`. @rmlarsen1 @ngc92 What do you think? | |
### Reference issue | |
See also my comment on commit c20e908e.",Kolja Brix,2021-12-16T20:20:47.898Z,NA,NA,"## Title: | |
Simplify logical_xor() | |
## Authors: | |
Kolja Brix | |
## Summary: | |
This merge request proposes simplifications to the `logical_xor()` function in the Eigen C++ library, specifically for `bool` types. | |
### Key Changes: | |
- The expression `(a || b) && !(a && b)` has been simplified to `a != b` for boolean variables `a` and `b`. | |
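The equivalence is easy to check exhaustively, since there are only four input combinations. In this small sketch, `xor_long`, `xor_short`, and `xor_forms_agree` are hypothetical names chosen for illustration.

```cpp
// Original formulation of exclusive-or on bools.
constexpr bool xor_long(bool a, bool b) { return (a || b) && !(a && b); }

// Simplified formulation.
constexpr bool xor_short(bool a, bool b) { return a != b; }

// Exhaustive check over all inputs; returns true when both forms agree.
inline bool xor_forms_agree() {
  const bool values[] = {false, true};
  for (bool a : values)
    for (bool b : values)
      if (xor_long(a, b) != xor_short(a, b)) return false;
  return true;
}
```

The equivalence only holds for operands that are already `bool`; other types would first need a conversion to `bool`.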
### Improvements: | |
- Enhanced readability and clarity in the code by potentially replacing calls to `logical_xor()` with the simpler expression `a != b`. | |
### Impact: | |
- The simplification makes the logical operation easier to read. The MR also raises the option of using `!=` directly instead of calling `logical_xor()`." | |
785 (https://gitlab.com/libeigen/eigen/-/merge_requests/785),fixed clang warnings about alignment change and floating point precision,"This fixes two warnings that come up in the CI with Clang. | |
The first is related to a conversion of a pointer to a `std::complex` value on the stack to a pointer to `__m64`. We can just align the `res` variable with the same alignment as required by `__m64`; things should be safe then. | |
The other fix is just dropping a `.f` suffix for a double constant. | |
The remaining Clang warnings are about | |
> Eigen/src/Core/Reverse.h:194:51: warning: implicit conversion loses integer precision: 'Eigen::Index' (aka 'long') to 'int' [-Wshorten-64-to-32] | |
Which is because `Eigen::fix<>` has `int` as its template parameter, even though it fulfills the function of an `Eigen::Index` I think. But addressing that seems to be a more complicated change, so I've left that as is for now.",Erik Schultheis,2021-12-18T17:18:16.892Z,NA,NA,"## Title: | |
Fixed clang warnings about alignment change and floating point precision | |
## Authors: | |
Erik Schultheis | |
## Summary: | |
This merge request addresses two specific warnings from Clang that appeared during continuous integration (CI) testing. It makes adjustments to improve code alignment and precision handling. | |
### Key Changes: | |
- Aligned the `res` variable to the alignment required by `__m64` to resolve a warning about pointer conversion. | |
- Removed the unnecessary `.f` suffix from a double constant. | |
### Improvements: | |
- Enhanced code alignment for better safety when working with complex values and SIMD types. | |
### Impact: | |
- Reduction of compile-time warnings in Clang, thereby improving code quality and maintainability." | |
786 (https://gitlab.com/libeigen/eigen/-/merge_requests/786),Small cleanup of GDB pretty printer code,"### What does this implement/fix? | |
Small cleanup of pretty printer code for the GNU Debugger (GDB): | |
* Rename variable `type` to avoid conflict with Python function `type()`. | |
* Remove import of module that is no longer needed. | |
* Use `+=` for better readability. | |
* Improve formatting.",Kolja Brix,2021-12-18T17:34:38.717Z,NA,NA,"## Title: | |
Small cleanup of GDB pretty printer code | |
## Authors: | |
Kolja Brix | |
## Summary: | |
This merge request focuses on cleaning up the GDB pretty printer code, enhancing its readability and maintainability. | |
### Key Changes: | |
- Renamed the variable `type` to prevent conflicts with the Python built-in function `type()`. | |
- Removed an unnecessary module import. | |
- Changed concatenation to `+=` for improved readability. | |
- Enhanced overall formatting of the code. | |
### Improvements: | |
The changes lead to clearer and more maintainable code, reducing potential for confusion and errors. | |
### Impact: | |
This cleanup contributes to better code quality, making future modifications easier and enhancing the debugging experience with GDB." | |
788 (https://gitlab.com/libeigen/eigen/-/merge_requests/788),Small fixes,"This MR fixes a bunch of smaller issues, making the following changes: | |
* Template parameters in the documentation are documented with `\tparam` instead of `\param` | |
* superfluous semicolon warnings [not enabled by default] fixed | |
* Fixed the type of literals used to initialize float variables",Erik Schultheis,2021-12-21T16:46:10.725Z,NA,NA,"## Title: | |
Small fixes | |
## Authors: | |
Erik Schultheis | |
## Summary: | |
This merge request addresses various minor issues within the Eigen C++ library, enhancing documentation accuracy and resolving warnings. | |
### Key Changes: | |
- Updated the documentation to use `\tparam` for template parameters instead of `\param`. | |
- Fixed warnings related to superfluous semicolons. | |
- Corrected the type of literals used for initializing float variables. | |
### Improvements: | |
- Improved the clarity and correctness of the library's documentation. | |
- Enhanced code quality by eliminating unnecessary warnings. | |
- Ensured proper initialization of float variables, possibly preventing future bugs. | |
### Impact: | |
These changes lead to a more robust and user-friendly library, improving both developer experience and code reliability." | |
789 (https://gitlab.com/libeigen/eigen/-/merge_requests/789),Include immintrin.h if F16C is available and vectorization is disabled,"If EIGEN_DONT_VECTORIZE is defined, immintrin.h is not included even if | |
F16C is available. Trying to use F16C intrinsics thus fails. | |
This fixes issue #2395.",David Tellenbach,2021-12-25T19:51:48.138Z,cherry-pick-to-stable::done,NA,"## Title: | |
Include immintrin.h if F16C is available and vectorization is disabled | |
## Authors: | |
David Tellenbach | |
## Summary: | |
This merge request addresses an issue where the header file `immintrin.h` was not included when the macro `EIGEN_DONT_VECTORIZE` was defined, despite the availability of F16C intrinsics. As a result, attempts to use F16C intrinsics would lead to failures. | |
### Key Changes: | |
- Included `immintrin.h` whenever F16C is available, even when `EIGEN_DONT_VECTORIZE` is defined, so that F16C intrinsics can be used. | |
### Improvements: | |
- Resolves issue #2395, making F16C intrinsics usable when vectorization is disabled. | |
### Impact: | |
- This change facilitates the use of F16C operations even when vectorization is turned off, leading to improved functionality and flexibility in code that relies on these intrinsics." | |
790 (https://gitlab.com/libeigen/eigen/-/merge_requests/790),Add missing internal namespace,The vectorization logic tests miss some namespace internal qualifiers.,David Tellenbach,2021-12-27T23:50:32.626Z,NA,NA,"## Title: | |
Add missing internal namespace | |
## Authors: | |
David Tellenbach | |
## Summary: | |
This merge request addresses the omission of internal namespace qualifiers in the vectorization logic tests within the Eigen C++ library. | |
### Key Changes: | |
- Added missing internal qualifiers to the vectorization logic tests. | |
### Improvements: | |
- Enhances the clarity and correctness of the namespace usage in the tests. | |
### Impact: | |
- Improves the organization of the codebase, ensuring better adherence to the library's internal structure and reducing potential for namespace-related bugs in future developments." | |
779 (https://gitlab.com/libeigen/eigen/-/merge_requests/779),Improve exp<float>(): Don't flush denormal results + 4% speedup.,"1. Speed up `exp(x)` by reducing the polynomial approximant from degree 7 to degree 6. With exactly representable coefficients computed by the Sollya tool, this still gives a maximum relative error of 1 ulp, i.e. faithfully rounded, for arguments where exp(x) is a normalized float. This change results in a speedup of about 4% for AVX2. | |
2. Extend the range where `exp(x)` returns a non-zero result from ~[-88;88] to ~[-104;88], i.e. return denormalized values for large negative arguments instead of zero. Compared to `exp<double>(x)`, the denormalized results gradually decrease in accuracy down to 0.033 relative error for arguments around `x = -104`, where `exp(x)` is approximately `std::numeric_limits<float>::denorm_min()`. This is expected and acceptable. | |
Benchmark numbers for AVX2. | |
``` | |
name old cpu/op new cpu/op delta | |
BM_eigen_exp_float/1 3.27ns ± 0% 3.27ns ± 0% ~ (p=0.218 n=46+48) | |
BM_eigen_exp_float/8 29.6ns ± 0% 30.1ns ± 6% +1.56% (p=0.000 n=41+54) | |
BM_eigen_exp_float/64 80.4ns ± 5% 79.7ns ± 5% -0.85% (p=0.007 n=47+60) | |
BM_eigen_exp_float/512 460ns ± 2% 441ns ± 2% -4.31% (p=0.000 n=60+57) | |
BM_eigen_exp_float/4k 3.48µs ± 2% 3.35µs ± 2% -3.52% (p=0.000 n=49+49) | |
BM_eigen_exp_float/32k 27.6µs ± 3% 26.6µs ± 3% -3.75% (p=0.000 n=54+54) | |
BM_eigen_exp_float/256k 221µs ± 2% 212µs ± 2% -3.81% (p=0.000 n=48+56) | |
BM_eigen_exp_float/1M 887µs ± 3% 848µs ± 2% -4.33% (p=0.000 n=39+54) | |
name old time/op new time/op delta | |
BM_eigen_exp_float/1 3.27ns ± 0% 3.27ns ± 0% ~ (p=0.475 n=49+48) | |
BM_eigen_exp_float/8 29.6ns ± 0% 30.1ns ± 6% +1.54% (p=0.000 n=41+54) | |
BM_eigen_exp_float/64 80.4ns ± 5% 79.7ns ± 5% -0.89% (p=0.006 n=48+60) | |
BM_eigen_exp_float/512 460ns ± 2% 441ns ± 2% -4.31% (p=0.000 n=60+57) | |
BM_eigen_exp_float/4k 3.48µs ± 2% 3.35µs ± 2% -3.52% (p=0.000 n=49+49) | |
BM_eigen_exp_float/32k 27.6µs ± 3% 26.6µs ± 3% -3.73% (p=0.000 n=54+54) | |
BM_eigen_exp_float/256k 221µs ± 2% 212µs ± 2% -3.83% (p=0.000 n=48+56) | |
BM_eigen_exp_float/1M 887µs ± 3% 848µs ± 2% -4.33% (p=0.000 n=39+54) | |
name old INSTRUCTIONS/op new INSTRUCTIONS/op delta | |
BM_eigen_exp_float/1 41.0 ± 0% 41.0 ± 0% ~ (all samples are equal) | |
BM_eigen_exp_float/8 308 ± 0% 308 ± 0% ~ (all samples are equal) | |
BM_eigen_exp_float/64 660 ± 0% 632 ± 0% -4.24% (p=0.000 n=60+60) | |
BM_eigen_exp_float/512 3.29k ± 0% 3.04k ± 0% -7.65% (p=0.000 n=53+55) | |
BM_eigen_exp_float/4k 24.3k ± 0% 22.3k ± 0% -8.39% (p=0.000 n=45+45) | |
BM_eigen_exp_float/32k 193k ± 0% 176k ± 0% -8.50% (p=0.000 n=49+48) | |
BM_eigen_exp_float/256k 1.54M ± 0% 1.41M ± 0% -8.51% (p=0.000 n=44+54) | |
BM_eigen_exp_float/1M 6.16M ± 0% 5.64M ± 0% -8.51% (p=0.000 n=37+52) | |
name old CYCLES/op new CYCLES/op delta | |
BM_eigen_exp_float/1 12.0 ± 0% 12.0 ± 0% ~ (p=0.830 n=49+49) | |
BM_eigen_exp_float/8 109 ± 0% 111 ± 6% +1.52% (p=0.000 n=40+54) | |
BM_eigen_exp_float/64 270 ± 2% 269 ± 5% ~ (p=0.051 n=47+60) | |
BM_eigen_exp_float/512 1.55k ± 2% 1.49k ± 2% -4.10% (p=0.000 n=57+60) | |
BM_eigen_exp_float/4k 11.7k ± 1% 11.3k ± 1% -3.78% (p=0.000 n=50+40) | |
BM_eigen_exp_float/32k 93.0k ± 1% 89.5k ± 1% -3.76% (p=0.000 n=52+48) | |
BM_eigen_exp_float/256k 744k ± 1% 715k ± 1% -3.93% (p=0.000 n=49+46) | |
BM_eigen_exp_float/1M 2.99M ± 1% 2.87M ± 1% -4.02% (p=0.000 n=40+58) | |
```",Rasmus Munk Larsen,2021-12-28T15:00:19.706Z,NA,NA,"## Title: | |
Improve exp<float>(): Don't flush denormal results + 4% speedup. | |
## Authors: | |
Rasmus Munk Larsen | |
## Summary: | |
This merge request enhances the `exp<float>()` function in the Eigen library by optimizing the polynomial approximation and extending the range of non-zero results for denormalized values. | |
### Key Changes: | |
1. Reduced the degree of the polynomial approximant for `exp(x)` from 7 to 6, using exactly representable coefficients computed with the Sollya tool. | |
2. Expanded the non-zero result range for `exp(x)` from approximately [-88, 88] to [-104, 88], allowing for denormalized outputs. | |
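To see what the extended range means in practice, the following sketch uses the standard library as a reference instead of Eigen's packet code. The helpers `is_positive_subnormal` and `reference_exp` are hypothetical. The value of `exp(-100)` is about 3.7e-44, which lies between `denorm_min()` (about 1.4e-45) and `min()` (about 1.2e-38), so an implementation that flushes denormal results would return zero there.

```cpp
#include <cmath>
#include <limits>

// True when v is a positive subnormal (denormalized) float, i.e. strictly
// between zero and the smallest normalized float.
inline bool is_positive_subnormal(float v) {
  return v > 0.0f && v < std::numeric_limits<float>::min();
}

// Reference value from the standard library, not from Eigen.
inline float reference_exp(float x) { return std::exp(x); }
```

For instance, `is_positive_subnormal(reference_exp(-100.0f))` holds on platforms that do not flush denormals to zero, while `reference_exp(-80.0f)` is still a normalized float.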
### Improvements: | |
- Achieved about a 4% speedup for `exp(float)` operations utilizing AVX2. | |
- Maintained a maximum relative error of 1 ulp for normalized floats. | |
- Improved efficiency across various input sizes in benchmark tests, with reduced execution times and cycles for larger inputs. | |
### Impact: | |
- The changes lead to better performance and accuracy in floating-point operations, making the library more efficient for computational tasks involving exponential calculations. | |
- Users can now expect valid outputs for a broader range of inputs, particularly at the lower extremes where denormal numbers are involved." | |
793 (https://gitlab.com/libeigen/eigen/-/merge_requests/793),Remove unused EIGEN_HAS_STATIC_ARRAY_TEMPLATE,NA,David Tellenbach,2021-12-30T15:26:57.137Z,NA,NA,"## Title: | |
Remove unused EIGEN_HAS_STATIC_ARRAY_TEMPLATE | |
## Authors: | |
David Tellenbach | |
## Summary: | |
This merge request involves the removal of the unused macro `EIGEN_HAS_STATIC_ARRAY_TEMPLATE` from the Eigen C++ library. | |
### Key Changes: | |
- Elimination of the `EIGEN_HAS_STATIC_ARRAY_TEMPLATE` macro. | |
### Improvements: | |
- Clean-up of the codebase by removing unnecessary definitions, thereby enhancing maintainability. | |
### Impact: | |
- Reduces clutter in the code, which may simplify future development and contribute to improved readability." | |
794 (https://gitlab.com/libeigen/eigen/-/merge_requests/794),bugfix: *ALTIVEC_H -> *ZVECTOR_H,"Hello, I noticed that some header guards were repeated between the [`AltiVec`](https://gitlab.com/shivaghose/eigen/-/tree/e4d8299a417203980af37f4e544226884b8cc031/Eigen/src/Core/arch/AltiVec) and [`ZVector`](https://gitlab.com/shivaghose/eigen/-/tree/e4d8299a417203980af37f4e544226884b8cc031/Eigen/src/Core/arch/ZVector) packages. This could cause a problem if (for whatever reason) someone attempts to include headers for both architectures. | |
I'm not sure how to write a test for this, but I'm open to suggestions. | |
### Reference issue | |
I couldn't find any existing issues (happy to file an issue if the bug tracking makes your lives easier). | |
### What does this implement/fix? | |
This MR replaces the duplicated header guards found in [`ZVector/Complex.h`](https://gitlab.com/libeigen/eigen/-/blob/22a347b9d2ee8321543e3b15673e1dd1d5456d4e/Eigen/src/Core/arch/ZVector/Complex.h#L11-12) and [`MathFunctions.h`](https://gitlab.com/libeigen/eigen/-/blob/22a347b9d2ee8321543e3b15673e1dd1d5456d4e/Eigen/src/Core/arch/ZVector/MathFunctions.h#L16-17) with unique header guards. | |
### Additional information | |
Ideally we would use something like `pragma once` but I understand that not all compilers and operating systems could easily support it.",Shiva Ghose,2021-12-31T08:43:25.244Z,NA,NA,"## Title: | |
bugfix: *ALTIVEC_H -> *ZVECTOR_H | |
## Authors: | |
Shiva Ghose | |
## Summary: | |
This merge request addresses an issue with duplicated header guards in the Eigen C++ library's `AltiVec` and `ZVector` packages. By replacing repeated header guards with unique identifiers, the potential for compilation errors when including headers from both architectures is mitigated. | |
### Key Changes: | |
- Replaced duplicated header guards in `ZVector/Complex.h` and `MathFunctions.h` with unique header guards to prevent conflicts. | |
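A minimal sketch of why distinct guards matter, assuming the contents of two headers end up in one translation unit. The guard macros and function names below are hypothetical and are not the identifiers used in Eigen:

```cpp
// Contents of a hypothetical AltiVec header.
#ifndef EXAMPLE_COMPLEX_ALTIVEC_H
#define EXAMPLE_COMPLEX_ALTIVEC_H
inline int altivec_marker() { return 1; }
#endif  // EXAMPLE_COMPLEX_ALTIVEC_H

// Contents of a hypothetical ZVector header. Its guard differs from the one
// above, so the preprocessor keeps this body. If it reused
// EXAMPLE_COMPLEX_ALTIVEC_H, zvector_marker() would be silently dropped.
#ifndef EXAMPLE_COMPLEX_ZVECTOR_H
#define EXAMPLE_COMPLEX_ZVECTOR_H
inline int zvector_marker() { return 2; }
#endif  // EXAMPLE_COMPLEX_ZVECTOR_H
```

With a shared guard, the second body would be skipped without any diagnostic, and the failure would only surface later as a missing declaration.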
### Improvements: | |
- Enhanced code reliability by ensuring that header guards do not clash between different architectural implementations. | |
### Impact: | |
- Reduces the risk of compilation issues for users working with both AltiVec and ZVector architectures, improving the overall robustness of the library." | |
797 (https://gitlab.com/libeigen/eigen/-/merge_requests/797),Add bounds checking to Eigen serializer,"### What does this implement/fix? | |
#2405",Lingzhu Xiang,2022-01-04T18:42:08.354Z,NA,NA,"## Title: | |
Add bounds checking to Eigen serializer | |
## Authors: | |
Lingzhu Xiang | |
## Summary: | |
This merge request introduces bounds checking to the Eigen serializer, addressing issue #2405. The enhancement aims to improve the safety and reliability of serialization operations within the Eigen library. | |
### Key Changes: | |
- Implementation of bounds checking in the Eigen serializer to prevent out-of-bounds access during serialization tasks. | |
### Improvements: | |
- Enhanced data integrity during serialization by ensuring that only valid data is processed, reducing the risk of errors and potential crashes. | |
### Impact: | |
- Reduces the risk of out-of-bounds buffer access when serializing or deserializing Eigen objects." | |
771 (https://gitlab.com/libeigen/eigen/-/merge_requests/771),"ensure that eigen::internal::size is not found by ADL, rename to ssize and...","This is an attempt at solving #2391 | |
The naive idea would be to use `std::size` when available (C++17), and otherwise define `Eigen::internal::size`. | |
However, there is a subtle difference between the std:: version and the Eigen version, in that the std version is generally expected to return an unsigned value. C++20 defines `ssize`, which is specifically for signed size values. | |
Technically, the `size` function in Eigen is compatible with the std version for Eigen types, as that is defined to return whatever the underlying `object.size()` returns. But for std types and arrays the return types would differ. | |
Therefore, I have changed the function name to `ssize` and put an implementation compatible with the standard (I hope, it is based on https://en.cppreference.com/w/cpp/iterator/size). Since this is an internal function, the name change should be OK I think. | |
There is still a slight subtlety, in that the std ssize function returns `std::common_type_t<std::ptrdiff_t, std::make_signed_t<decltype(c.size())>>`, which means that if the user redefined `Eigen::Index` to be smaller than `std::ptrdiff_t`, the old version would return a value of type `Eigen::Index` but the new one `std::ptrdiff_t`.",Erik Schultheis,2022-01-05T00:46:10.359Z,NA,NA,"## Title: | |
Ensure that eigen::internal::size is not found by ADL, rename to ssize | |
## Authors: | |
Erik Schultheis | |
## Summary: | |
This merge request addresses issue #2391 by renaming the internal `size` function in the Eigen C++ library to `ssize`. It aims to prevent potential conflicts with ADL (Argument Dependent Lookup) and aligns Eigen's behavior more closely with C++ standards. | |
### Key Changes: | |
- Renamed `Eigen::internal::size` to `ssize`. | |
- Implemented `ssize` to be compatible with the C++ standard, following C++20 definitions. | |
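A minimal sketch of a C++14-compatible signed size helper, modeled on the return type quoted in the description above. The `example` namespace is hypothetical, and this is not claimed to be the code merged into Eigen:

```cpp
#include <cstddef>
#include <type_traits>
#include <vector>

namespace example {

// Containers: return c.size() converted to a signed type at least as wide
// as std::ptrdiff_t.
template <typename C>
constexpr auto ssize(const C& c)
    -> std::common_type_t<std::ptrdiff_t,
                          std::make_signed_t<decltype(c.size())>> {
  using R = std::common_type_t<std::ptrdiff_t,
                               std::make_signed_t<decltype(c.size())>>;
  return static_cast<R>(c.size());
}

// Built-in arrays: the extent is known at compile time.
template <typename T, std::ptrdiff_t N>
constexpr std::ptrdiff_t ssize(const T (&)[N]) noexcept {
  return N;
}

}  // namespace example
```

Calling the helper with a qualified name, as in `example::ssize(v)`, keeps argument-dependent lookup out of the picture, which is the concern raised in the MR title.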
### Improvements: | |
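- Matches the semantics of C++20 `std::ssize`, which returns a signed size, rather than C++17 `std::size`, which generally returns an unsigned value for standard containers and arrays. | |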
- Improved clarity by differentiating the signed behavior of `ssize` from previous implementations, reducing potential type-related issues. | |
### Impact: | |
- Reduces the likelihood of naming conflicts in user code due to ADL. | |
- Positions Eigen to be more in line with standard practices, which could lead to better integration with codebases utilizing newer C++ standards." | |
800 (https://gitlab.com/libeigen/eigen/-/merge_requests/800),Some serialization API changes were made in commit...,"Serialization API changes were made which broke the GPU unit tests for HIP. | |
This commit fixes things. | |
/cc @cantonios",Rohit Santhanam,2022-01-05T16:18:46.073Z,NA,NA,"## Title: | |
Some serialization API changes were made in commit... | |
## Authors: | |
Rohit Santhanam | |
## Summary: | |
This merge request addresses issues caused by recent serialization API changes that disrupted GPU unit tests for HIP. The adjustments rectify these problems, ensuring the tests function correctly again. | |
### Key Changes: | |
- Repaired the HIP GPU unit tests that were broken by the earlier serialization API changes. | |
### Improvements: | |
- Corrected functionality of GPU unit tests for HIP, enhancing stability post-serialization API changes. | |
### Impact: | |
- Restores the GPU-related tests for HIP to a working state after the serialization API changes." | |
792 (https://gitlab.com/libeigen/eigen/-/merge_requests/792),Allow specifying inner & outer stride for CWiseUnaryView - fixes #2398,"### Reference issue | |
#2398 | |
### What does this implement/fix? | |
This MR adds the ability to manually specify the inner and/or outer stride for `CwiseUnaryView`, to avoid issues caused by the incorrect automatic derivation of strides. The strides are set in an identical manner to `Map`, via: | |
```cpp | |
CwiseUnaryView<view_op, VectorType, Stride<OuterStride,InnerStride>> vec_view(vec); | |
``` | |
### Additional information | |
Tests are added to check that strides are set correctly",Andrew Johnson,2022-01-05T19:24:47.225Z,NA,NA,"## Title: | |
Allow specifying inner & outer stride for CWiseUnaryView - fixes #2398 | |
## Authors: | |
Andrew Johnson | |
## Summary: | |
This merge request extends `CwiseUnaryView` so that users can manually specify the inner and outer strides, addressing issues caused by incorrect automatic stride derivation. | |
### Key Changes: | |
- Introduced the capability to set inner and outer strides explicitly in `CwiseUnaryView`. | |
- Strides are defined similarly to how they are set for `Map`. | |
### Improvements: | |
- Provides more control over stride configuration, leading to potential performance and correctness benefits in vector operations. | |
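A minimal sketch of what an explicit inner stride means, assuming strides are counted in units of the viewed scalar. `RealPartView` and `real_view` are hypothetical stand-ins that do not use Eigen and are not its implementation:

```cpp
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical read-only view over the real parts of a complex array.
struct RealPartView {
  const double* base;    // address of the first real part
  std::ptrdiff_t inner;  // distance between consecutive elements, in doubles
  double operator()(std::ptrdiff_t i) const { return base[i * inner]; }
};

inline RealPartView real_view(const std::vector<std::complex<double>>& v) {
  // std::complex<double> may be accessed as an array of two doubles, so
  // consecutive real parts are two doubles apart.
  return RealPartView{reinterpret_cast<const double*>(v.data()), 2};
}
```

If the stride were derived as 1 instead of 2, the view would alternate between real and imaginary parts, which is the kind of error an explicit stride parameter lets the caller rule out.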
### Impact: | |
- Makes `CwiseUnaryView` easier to use correctly by avoiding errors from incorrectly derived strides." | |
799 (https://gitlab.com/libeigen/eigen/-/merge_requests/799),Improve plog: 20% speedup for float + handle denormals,"This replaces !784 | |
1. For `float`, replace the degree 10 polynomial approximation of `log(1+x)` on `[sqrt(0.5)-1;sqrt(2)-1]` | |
by a (3,3) rational approximation. This speeds up the function by ~20% for AVX2. | |
The max relative error increases slightly from 2 ulp to 2.2 ulp for arguments > 1e-15. | |
For tiny arguments the error in both the old and new implementation rises to 64 ulp | |
as x approaches `std::numeric_limits<float>::denorm_min()`. This is likely related to | |
the range reduction and remains to be investigated. | |
2. Change argument clamping such that `log(x)` does not incorrectly saturate at `~-88` for | |
denormalized `float` arguments, but continues down to `~-104` for positive denormal arguments. | |
A similar fix is done for `double`. | |
3. Re-enable a test for computing `log(denorm_min)`. | |
Thanks to my colleague James Lottes for suggesting this change and deriving the (3,3) approximant. | |
Benchmark numbers for AVX2: | |
``` | |
name old cpu/op new cpu/op delta | |
BM_eigen_log_float/1 3.55ns ± 0% 3.27ns ± 0% -7.78% (p=0.000 n=48+49) | |
BM_eigen_log_float/8 34.4ns ± 5% 32.7ns ± 0% -4.97% (p=0.000 n=50+38) | |
BM_eigen_log_float/64 107ns ± 5% 86ns ± 3% -19.69% (p=0.000 n=60+60) | |
BM_eigen_log_float/512 640ns ± 5% 502ns ± 5% -21.56% (p=0.000 n=60+60) | |
BM_eigen_log_float/4k 4.94µs ± 5% 3.84µs ± 3% -22.22% (p=0.000 n=60+51) | |
BM_eigen_log_float/32k 39.1µs ± 4% 30.5µs ± 3% -22.07% (p=0.000 n=46+50) | |
BM_eigen_log_float/256k 313µs ± 4% 244µs ± 4% -21.93% (p=0.000 n=45+50) | |
BM_eigen_log_float/1M 1.26ms ± 4% 0.97ms ± 2% -23.06% (p=0.000 n=39+30) | |
name old time/op new time/op delta | |
BM_eigen_log_float/1 3.55ns ± 0% 3.27ns ± 0% -7.79% (p=0.000 n=41+49) | |
BM_eigen_log_float/8 34.4ns ± 5% 32.7ns ± 0% -4.98% (p=0.000 n=50+38) | |
BM_eigen_log_float/64 107ns ± 5% 86ns ± 3% -19.68% (p=0.000 n=60+60) | |
BM_eigen_log_float/512 640ns ± 5% 502ns ± 5% -21.56% (p=0.000 n=60+60) | |
BM_eigen_log_float/4k 4.93µs ± 5% 3.84µs ± 3% -22.19% (p=0.000 n=60+52) | |
BM_eigen_log_float/32k 39.1µs ± 4% 30.5µs ± 3% -22.06% (p=0.000 n=46+50) | |
BM_eigen_log_float/256k 313µs ± 4% 244µs ± 4% -21.94% (p=0.000 n=45+50) | |
BM_eigen_log_float/1M 1.26ms ± 4% 0.97ms ± 2% -23.07% (p=0.000 n=39+30) | |
name old INSTRUCTIONS/op new INSTRUCTIONS/op delta | |
BM_eigen_log_float/1 41.0 ± 0% 41.0 ± 0% ~ (all samples are equal) | |
BM_eigen_log_float/8 328 ± 0% 329 ± 0% +0.30% (p=0.000 n=48+48) | |
BM_eigen_log_float/64 778 ± 0% 684 ± 0% -12.08% (p=0.000 n=56+60) | |
BM_eigen_log_float/512 4.03k ± 0% 3.26k ± 0% -19.03% (p=0.000 n=53+56) | |
BM_eigen_log_float/4k 30.0k ± 0% 23.9k ± 0% -20.47% (p=0.000 n=56+46) | |
BM_eigen_log_float/32k 238k ± 0% 189k ± 0% -20.66% (p=0.000 n=37+44) | |
BM_eigen_log_float/256k 1.90M ± 0% 1.51M ± 0% -20.69% (p=0.000 n=38+45) | |
BM_eigen_log_float/1M 7.60M ± 0% 6.03M ± 0% -20.69% (p=0.000 n=36+35) | |
name old CYCLES/op new CYCLES/op delta | |
BM_eigen_log_float/1 13.1 ± 0% 12.1 ± 0% -7.81% (p=0.000 n=40+50) | |
BM_eigen_log_float/8 127 ± 5% 121 ± 0% -4.98% (p=0.000 n=50+37) | |
BM_eigen_log_float/64 362 ± 2% 293 ± 0% -18.99% (p=0.000 n=56+60) | |
BM_eigen_log_float/512 2.17k ± 2% 1.71k ± 1% -21.00% (p=0.000 n=60+60) | |
BM_eigen_log_float/4k 16.7k ± 2% 13.1k ± 1% -21.65% (p=0.000 n=59+52) | |
BM_eigen_log_float/32k 133k ± 3% 104k ± 1% -21.58% (p=0.000 n=46+45) | |
BM_eigen_log_float/256k 1.06M ± 2% 0.83M ± 1% -21.41% (p=0.000 n=45+50) | |
BM_eigen_log_float/1M 4.26M ± 3% 3.33M ± 1% -21.77% (p=0.000 n=39+38) | |
```",Rasmus Munk Larsen,2022-01-05T23:40:32.471Z,NA,NA,"## Title: | |
Improve plog: 20% speedup for float + handle denormals | |
## Authors: | |
Rasmus Munk Larsen | |
## Summary: | |
This merge request introduces significant enhancements to the logarithm function for float data types within the Eigen C++ library, focusing on performance improvements and better handling of denormal numbers. | |
### Key Changes: | |
1. Replaced the degree 10 polynomial approximation for `log(1+x)` with a (3,3) rational approximation specifically for float arguments in the range `[sqrt(0.5)-1; sqrt(2)-1]`, resulting in approximately a 20% speedup for AVX2. | |
2. Modified the argument clamping to prevent saturation at `~-88` for denormalized float values, allowing it to extend down to `~-104`. | |
3. Re-enabled tests for computing `log(denorm_min)`. | |
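The two limits quoted above can be checked against the standard library. This sketch uses `std::log`, not Eigen's `plog`, and the function names are hypothetical:

```cpp
#include <cmath>
#include <limits>

// Natural log of the smallest positive normalized float, roughly -87.3.
inline float log_min_normal() {
  return std::log(std::numeric_limits<float>::min());
}

// Natural log of the smallest positive denormal float, roughly -103.3.
inline float log_min_denormal() {
  return std::log(std::numeric_limits<float>::denorm_min());
}
```

A vectorized implementation that clamps its argument to the smallest normalized float cannot return anything below the first value, which matches the saturation described in item 2.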
### Improvements: | |
- The new implementation demonstrates a marked reduction in computation time across various benchmarks, achieving speed improvements of up to 23% depending on the argument size. | |
- Trade-off: the maximum relative error for arguments greater than `1e-15` rises slightly, from 2 ulp to 2.2 ulp. For tiny arguments, the error of both the old and new implementations rises to 64 ulp as x approaches `denorm_min`. | |
### Impact: | |
These changes enhance the efficiency of logarithmic calculations for float types, making the Eigen library faster and more reliable when handling denormal numbers, ultimately improving performance in numerical applications that rely on these computations." | |
802 (https://gitlab.com/libeigen/eigen/-/merge_requests/802),Fixes #i2411,"Fixes #2411 | |
### What does this implement/fix? | |
Commit c20e908ebc42f7174b1d85ce82583a66d185520c introduced a truncation from unsigned int to bool; | |
this MR fixes it. Also, the truncation itself might be a bug (0110 => 0, but it should be 1).",Fabian Keßler,2022-01-06T20:02:38.500Z,NA,NA,"## Title: | |
Fixes #i2411 | |
## Authors: | |
Fabian Keßler | |
## Summary: | |
This merge request addresses an issue identified in commit c20e908ebc42f7174b1d85ce82583a66d185520c, which improperly truncated an `unsigned int` to a `bool`. The fix corrects the truncation behavior, ensuring that values are interpreted correctly. | |
### Key Changes: | |
- Corrected improper truncation from `unsigned int` to `bool`, ensuring accurate interpretation of integer values. | |
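The `0110` example can be illustrated with two hypothetical helpers. The code from the referenced commit is not shown in this record, so this is only a sketch of the two possible readings of the value:

```cpp
// Keeps only the lowest bit, which is what narrowing to a 1-bit value does.
inline bool low_bit_only(unsigned int flags) { return (flags & 1u) != 0u; }

// Standard C++ conversion to bool: any nonzero value becomes true.
inline bool any_bit_set(unsigned int flags) { return flags != 0u; }
```

For the bit pattern `0110` (decimal 6), `low_bit_only` yields false while `any_bit_set` yields true, which is the discrepancy the description points out.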
### Improvements: | |
- The fix enhances the reliability of type conversions within the library, preventing potential misinterpretation of values. | |
### Impact: | |
- This change prevents a bug that could lead to incorrect boolean outcomes (e.g., interpreting `0110` as `0` instead of the expected `1`), thereby improving overall correctness in computations." | |
801 (https://gitlab.com/libeigen/eigen/-/merge_requests/801),Some fixes/cleanups for numeric_limits & fix for related bug in psqrt,"From [email protected]: | |
Some fixes/cleanups for numeric_limits | |
BFloat16: | |
- Set the highest payload bit instead of the lowest for signaling_NaN to match Half | |
- Set has_denorm to denorm_present as this type supports denormals. Otherwise, we should set denorm_min to min as per the standard. | |
Half: | |
- epsilon defined incorrectly | |
- tinyness_before should be identical to the other C++ floating point types | |
- is_bounded defined as false instead of true; is_bounded == false would be true for types with arbitrary precision | |
- traps should be set to what float uses, it is likely false | |
- is_iec559 should be set to true for both types (long double has this true and it has a much weirder encoding) | |
From [email protected]: | |
Add a workaround to the AVX implementation of `psqrt` since `_mm256_rsqrt_ps` appears to flush negative denormal values to zero. | |
Closes #2409",Rasmus Munk Larsen,2022-01-07T01:10:18.838Z,NA,NA,"## Title: | |
Some fixes/cleanups for numeric_limits & fix for related bug in psqrt | |
## Authors: | |
Rasmus Munk Larsen (MR author; the description credits two contributors whose addresses are redacted) | |
## Summary: | |
This merge request introduces important fixes and cleanups to the numeric_limits implementations for the BFloat16 and Half types, and addresses a bug in the AVX implementation of the `psqrt` function. | |
### Key Changes: | |
- **BFloat16 Adjustments**: | |
- Highest payload bit is set for signaling_NaN. | |
- Denormal handling corrected by setting `has_denorm` to `denorm_present`. | |
- **Half Type Corrections**: | |
- Corrected definitions for `epsilon` and `tinyness_before`. | |
- Fixed `is_bounded` to return true. | |
- Updated `traps` settings to align with float expectations. | |
- Set `is_iec559` to true for consistency with other floating point types. | |
- **AVX `psqrt` Workaround**: Added a workaround because `_mm256_rsqrt_ps` appears to flush negative denormal values to zero. | |
### Improvements: | |
The changes enhance the accuracy and adherence of the BFloat16 and Half types to the C++ standard for floating-point types, ensuring better compatibility and correctness in numerical computations. | |
### Impact: | |
These updates improve the reliability of floating-point operations in the Eigen C++ library, reducing bugs and increasing predictability when dealing with special cases such as denormals and NaNs. The fix to the `psqrt` function will help prevent erroneous behavior due to negative denormal values." | |
791 (https://gitlab.com/libeigen/eigen/-/merge_requests/791),"Add support for Cray, Fujitsu, and Intel ICX compilers","1. This MR adds support for the Cray (CPE), Fujitsu (FCC), and Intel ICX compilers | |
The following preprocessor macros are added: | |
- `EIGEN_COMP_CPE` and `EIGEN_COMP_CLANGCPE`: version number of the Cray compiler if Eigen is compiled with the Cray C++ compiler, `0` otherwise | |
- `EIGEN_COMP_FCC` and `EIGEN_COMP_CLANGFCC`: version number of the FCC compiler if Eigen is compiled with the Fujitsu C++ compiler, `0` otherwise | |
- `EIGEN_COMP_CLANGICC`: version number of the ICX compiler if Eigen is compiled with the Intel oneAPI C++ compiler, `0` otherwise | |
All three compilers (Cray, Fujitsu, Intel) offer a traditional and a Clang-based frontend. The Clang-based frontend is distinguished by `CLANG` in the macro name. | |
2. This MR extends the detection of the IBM XL compiler to V13.1 and V16.1 which use other predefined macros",Matthias Möller,2022-01-07T18:46:16.971Z,NA,NA,"## Title: | |
Add support for Cray, Fujitsu, and Intel ICX compilers | |
## Authors: | |
Matthias Möller | |
## Summary: | |
This merge request introduces support for the Cray, Fujitsu, and Intel ICX compilers in the Eigen C++ library, enhancing its compatibility with various compiler environments. | |
### Key Changes: | |
- Added preprocessor macros for: | |
- Cray compiler: `EIGEN_COMP_CPE` and `EIGEN_COMP_CLANGCPE` | |
- Fujitsu compiler: `EIGEN_COMP_FCC` and `EIGEN_COMP_CLANGFCC` | |
- Intel ICX compiler: `EIGEN_COMP_CLANGICC` | |
- Extended detection capabilities for the IBM XL compiler to versions 13.1 and 16.1. | |
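A minimal sketch of the version-or-zero macro pattern described above. `EXAMPLE_COMP_CLANGICC` and the helper function are hypothetical names, and the use of `__INTEL_LLVM_COMPILER` as the detection macro is an assumption rather than a quote from the Eigen sources:

```cpp
// Expands to the compiler version under the Clang-based Intel compiler,
// and to 0 under every other compiler.
#if defined(__clang__) && defined(__INTEL_LLVM_COMPILER)
#define EXAMPLE_COMP_CLANGICC __INTEL_LLVM_COMPILER
#else
#define EXAMPLE_COMP_CLANGICC 0
#endif

// Because the macro is always defined, it can be used in ordinary
// expressions as well as in #if directives.
inline long example_clangicc_version() { return EXAMPLE_COMP_CLANGICC; }
```

Defining the macro as `0` rather than leaving it undefined lets client code write `#if EXAMPLE_COMP_CLANGICC` without a separate `defined` check.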
### Improvements: | |
- Improved support for a broader range of compilers increases the usability of the Eigen library for developers working with different systems. | |
- Clear differentiation between traditional and Clang-based frontends for Cray, Fujitsu, and Intel compilers. | |
### Impact: | |
- Strengthened cross-compatibility of the Eigen library, allowing more developers to utilize Eigen regardless of their preferred compiler, which could lead to broader adoption and enhance the library's user base." | |
796 (https://gitlab.com/libeigen/eigen/-/merge_requests/796),Make fixed-size Matrix and Array trivially copyable after C++20,"Making them trivially copyable allows using std::memcpy() without undefined behaviors. | |
Only Matrix and Array with trivially copyable DenseStorage are marked as trivially copyable with an additional type trait. | |
As described in http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p0848r3.html it requires extremely verbose SFINAE to make the special member functions of fixed-size Matrix and Array trivial, unless C++20 concepts are available to simplify the selection of trivial special member functions given template parameters. Therefore only make this feature available to compilers that support C++20 P0848R3. | |
Fix #1855. | |
I have run the official and unsupported tests locally with passing results using the same commands in ci/test.gitlab-ci.yml and `cmake -DCMAKE_CXX_STANDARD=20`, but it's not clear to me how the CI script itself should be updated. | |
Please also let me know if any documentation is needed.",Lingzhu Xiang,2022-01-07T19:04:36.308Z,NA,NA,"## Title: | |
Make fixed-size Matrix and Array trivially copyable after C++20 | |
## Authors: | |
Lingzhu Xiang | |
## Summary: | |
This merge request introduces changes to make the fixed-size Matrix and Array types in the Eigen library trivially copyable, allowing the use of `std::memcpy()` without undefined behaviors. | |
### Key Changes: | |
- Fixed-size Matrix and Array types are now marked as trivially copyable when they have trivially copyable DenseStorage. | |
- Implemented an additional type trait to facilitate this marking. | |
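A minimal sketch of what trivial copyability buys, using a hypothetical `FixedVec3` struct as a stand-in for a fixed-size Eigen type. It does not use Eigen or C++20:

```cpp
#include <cstring>
#include <type_traits>

// Hypothetical stand-in for a small fixed-size vector.
struct FixedVec3 {
  double data[3];
};

// True because FixedVec3 declares no special member functions of its own.
constexpr bool kFixedVec3TriviallyCopyable =
    std::is_trivially_copyable<FixedVec3>::value;

// Bytewise copy. This is well-defined only for trivially copyable types.
inline FixedVec3 copy_bytes(const FixedVec3& src) {
  FixedVec3 dst;
  std::memcpy(&dst, &src, sizeof(FixedVec3));
  return dst;
}
```

A class template with user-provided copy operations fails the trait even when its storage is a plain array, which is why the MR needs a way to keep those operations trivial for fixed-size types.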
### Improvements: | |
- Relies on C++20 conditionally trivial special member functions (P0848R3), which avoid the extremely verbose SFINAE that would otherwise be needed to make the special member functions trivial. | |
- Allows fixed-size Matrix and Array objects to be copied with `std::memcpy()` without undefined behavior. | |
### Impact: | |
- Code that copies fixed-size Matrix and Array objects bytewise becomes well-defined. | |
- Only available with compilers that implement C++20 P0848R3." | |
803 (https://gitlab.com/libeigen/eigen/-/merge_requests/803),Fix Gcc8.5 warning about missing base class initialisation (#2404),"### Reference issue | |
Gcc8.5 warning about missing base class initialisation (#2404) | |
### What does this implement/fix? | |
This MR initialises the base class explicitly.",Matthias Möller,2022-01-07T19:34:28.784Z,NA,NA,"## Title: | |
Fix Gcc8.5 warning about missing base class initialisation (#2404) | |
## Authors: | |
Matthias Möller | |
## Summary: | |
This merge request addresses a specific warning generated by GCC 8.5 regarding the omission of base class initialization. | |
### Key Changes: | |
- Explicitly initializes the base class within the code. | |
### Improvements: | |
- Resolves compiler warnings, enhancing code compliance and robustness. | |
### Impact: | |
- Reduces potential issues related to uninitialized base classes, promoting safer and more reliable code." | |
805 (https://gitlab.com/libeigen/eigen/-/merge_requests/805),Make sure the scalar and vectorized paths for array.exp() return consistent values.,This fixes #2413.,Rasmus Munk Larsen,2022-01-07T23:31:36.354Z,NA,NA,"## Title: | |
Make sure the scalar and vectorized paths for array.exp() return consistent values. | |
## Authors: | |
Rasmus Munk Larsen | |
## Summary: | |
This merge request addresses an inconsistency in the `array.exp()` function within the Eigen C++ library, ensuring that both scalar and vectorized executions produce the same results. This fix resolves issue #2413. | |
### Key Changes: | |
- Aligned the output of the scalar and vectorized paths in the `array.exp()` function. | |
### Improvements: | |
- Enhanced reliability of the `array.exp()` functionality by ensuring consistent behavior across different execution paths. | |
### Impact: | |
- Users can expect uniform results from the `array.exp()` function, improving the robustness of calculations that rely on this method within the Eigen library." | |
780 (https://gitlab.com/libeigen/eigen/-/merge_requests/780),Fix accuracy of logistic sigmoid,"Fix accuracy of the specialized float32 implementation of the logistic (sigmoid) function `S(x) = exp(x)/(1+exp(x))` in Eigen. The reason to have this specialization in the first place is that logistic sigmoid is a frequently used function in machine learning and statistics, so even a modestly (~30%) faster version can have an impact in applications. | |
The old implementation was very inaccurate around x=-9 where we would switch from a fast rational approximant to returning `exp(x)`, which has the right asymptotic behavior for large negative x. This approach would have errors >8000 ulps around x=-9 and also be slow for SIMD packets containing elements less than -9, since we would compute both the rational approximant and `exp(x)`. | |
The new algorithm uses a hybrid range reduction method: First the standard range reduction used in `pexp(x)` is applied, and secondly we use the identity `exp(r) = exp(r/2)^2`, to avoid generating denormalized intermediate values for large negative arguments. This enables us to use a fast version of `pldexp` that does not properly handle denormals, but is significantly faster. | |
The final result is an implementation that has maximum relative error of 4.5 ulps for normalized results. In addition, the old algorithm would return zero for `x < ~-88`, while the new algorithm extends the range to `x~=-104` where `S(x) ~= std::numeric_limits<float>::denorm_min()`. Relative accuracy degrades gradually down to 0.033 for arguments around `x=-104` where `S(x) ~= std::numeric_limits<float>::denorm_min()`. This is expected and acceptable. | |
The new algorithm is about 30% faster than computing `e = pexp(x); s = e / (1 + e)`, while the old inaccurate algorithm was about 40% faster when no arguments `x < -9` are present and about 2x slower when they are. | |
This change also extends the range of the non-specialized version to properly compute denormalized values of S(x) for large negative x, e.g. the range `~[-104;-88]` for float, below which S(x) is too small to be represented even in a denormalized float. This is accomplished by evaluating `S(x)` as `exp(x) / (1 + exp(x))` instead of `1 / (1 + exp(-x))`. | |
Thanks to my Google colleagues James Lottes and Sameer Agarwal for the discussions and experimentation that led to this.",Rasmus Munk Larsen,2022-01-08T00:15:15.017Z,NA,NA,"## Title: | |
Fix accuracy of logistic sigmoid | |
## Authors: | |
Rasmus Munk Larsen | |
## Summary: | |
This merge request improves the accuracy and performance of the logistic (sigmoid) function implementation in the Eigen C++ library. The update addresses significant inaccuracies in the previous float32 implementation, particularly for values around x = -9. | |
### Key Changes: | |
- Corrected the implementation of the logistic sigmoid function `S(x) = exp(x)/(1+exp(x))` to enhance accuracy. | |
- Introduced a hybrid range reduction method to improve performance and accuracy for large negative arguments. | |
- Extended the computation range for the logistic sigmoid to handle inputs as low as x ≈ -104, enabling proper calculation of denormalized values. | |
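The effect of the two algebraically equivalent formulas from the description can be sketched with the standard library. The function names are hypothetical, and this is scalar `std::exp`, not Eigen's packet code:

```cpp
#include <cmath>

// S(x) = exp(x) / (1 + exp(x)). For large negative x, exp(x) underflows
// gradually into the denormal range, and so does the result. For large
// positive x this form overflows and needs separate handling.
inline float sigmoid_via_exp_x(float x) {
  const float e = std::exp(x);
  return e / (1.0f + e);
}

// S(x) = 1 / (1 + exp(-x)). For large negative x, exp(-x) overflows to
// infinity and the result collapses to zero.
inline float sigmoid_via_exp_neg_x(float x) {
  return 1.0f / (1.0f + std::exp(-x));
}
```

At `x = -100`, the first form returns a small positive denormal while the second returns exactly zero, which illustrates why the generic implementation switched formulas.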
### Improvements: | |
- The new implementation achieves a maximum relative error of 4.5 ulps for normalized results, compared to the previous implementation which had errors exceeding 8000 ulps for certain inputs. | |
- The revised algorithm is approximately 30% faster than computing `e = pexp(x); s = e / (1 + e)`. The old, inaccurate algorithm was about 40% faster when no arguments `x < -9` were present and about 2x slower when they were. | |
### Impact: | |
The improvements lead to both increased accuracy and performance in machine learning and statistical applications where the logistic sigmoid function is frequently employed. The updated algorithm allows for better handling of edge cases and extends the operational range of the function, thereby enhancing its reliability for users of the Eigen library." | |
806 (https://gitlab.com/libeigen/eigen/-/merge_requests/806),Fix IterativeSolverBase referring to itself as ConjugateGradient,"### What does this implement/fix? | |
Two assertion messages in `IterativeSolverBase` refer to it as `ConjugateGradient`. | |
This is incorrect, as the derived class could be several different solvers, not just CG. | |
This MR changes those assertion messages to say `IterativeSolverBase` instead. | |
This is consistent with another identical assertion further down in the file | |
that is already saying `""IterativeSolverBase is not initialized.""` | |
Based on this git history, it looks like IterativeSolverBase was made by extracting | |
code out of `ConjugateGradient` and that these messages were copy-pasted at that | |
time without being modified. A simple copy-paste error.",Essex Edwards,2022-01-08T08:25:16.514Z,NA,NA,"## Title: | |
Fix IterativeSolverBase referring to itself as ConjugateGradient | |
## Authors: | |
Essex Edwards | |
## Summary: | |
This merge request addresses an issue where assertion messages in the `IterativeSolverBase` class incorrectly refer to it as `ConjugateGradient`. The changes ensure that these messages accurately reference the class name, improving clarity and consistency in the codebase. | |
### Key Changes: | |
- Updated assertion messages in `IterativeSolverBase` to correctly reflect it as `IterativeSolverBase` instead of `ConjugateGradient`. | |
### Improvements: | |
- Enhances code readability and correctness by eliminating confusing references to `ConjugateGradient`. | |
- Aligns with the existing assertion message that already references `IterativeSolverBase`. | |
### Impact: | |
- Reduces potential confusion for developers working with the `IterativeSolverBase`, ensuring they understand that it represents a base class for multiple solver types, not solely for Conjugate Gradient." | |
795 (https://gitlab.com/libeigen/eigen/-/merge_requests/795),Reduce usage of reserved names,"In Eigen there are quite a few usages of reserved names. For example, we use leading underscores in identifiers, which [are reserved for the implementation](https://timsong-cpp.github.io/cppwp/n3337/reserved.names#global.names-1.2); see also the discussion in #2205 and #361. | |
### Reference issue | |
#2205 | |
See also #361 and !575. | |
### What does this implement/fix? | |
This MR fixes several usages of reserved names. | |
Note: The MR is still work in progress. | |
### Additional information | |
* As in !575 / commit 4ba872bd I am moving the underscore to the end of the name. Please let me know if you prefer another replacement rule. | |
* I would be very happy if somebody could review my MR.",Kolja Brix,2022-01-10T20:53:29.599Z,NA,NA,"## Title: | |
Reduce usage of reserved names | |
## Authors: | |
Kolja Brix | |
## Summary: | |
This merge request addresses the usage of reserved names in the Eigen C++ library, specifically focusing on the use of leading underscores in identifiers, which are reserved for implementation according to the C++ standard. The changes aim to enhance code compliance and maintainability. | |
### Key Changes: | |
- Refactoring of several identifiers to eliminate the use of leading underscores. | |
- Moving the underscore to the end of the identifier name as a proposed solution. | |
### Improvements: | |
- Increased adherence to the C++ standard guidelines regarding reserved names. | |
- Improved code clarity and maintainability by avoiding potential conflicts with reserved identifiers. | |
### Impact: | |
- This change promotes better coding practices and reduces the likelihood of issues arising from reserved name conflicts, which could affect the library's usability and compatibility." | |
808 (https://gitlab.com/libeigen/eigen/-/merge_requests/808),Explicit type casting,"### What does this implement/fix? | |
The function signature of `pmadd` assumes identical types of all three arguments, i.e. `Eigen/src/Core/GenericPacketMath.h` lines 958ff | |
```cpp | |
/** \internal \returns a * b + c (coeff-wise) */ | |
template<typename Packet> EIGEN_DEVICE_FUNC inline Packet | |
pmadd(const Packet& a, | |
const Packet& b, | |
const Packet& c) | |
{ return padd(pmul(a, b),c); } | |
``` | |
In `Eigen/src/LU/Determinant.h` the function `pmadd` is used with mathematical expressions, i.e. | |
```cpp | |
return internal::pmadd((Scalar)(-m(0,3)),d3_0, (Scalar)(m(1,3)*d3_1)) + | |
internal::pmadd((Scalar)(-m(2,3)),d3_2, (Scalar)(m(3,3)*d3_3)); | |
``` | |
and | |
```cpp | |
return internal::pmadd(m(i0,2), d0, internal::pmadd((Scalar)(-m(i1,2)), d1, (Scalar)(m(i2,2)*d2))); | |
``` | |
which must be explicitly cast to `Scalar`. Otherwise, custom scalar types whose overloads of `operator*` and `operator-` return expression templates rather than the original types will lead to compiler errors. | |
### Additional information | |
This bug has been observed while differentiating Eigen using the AD library CoDiPack.",Matthias Möller,2022-01-10T22:06:44.478Z,NA,NA,"## Title: | |
Explicit type casting | |
## Authors: | |
Matthias Möller | |
## Summary: | |
This merge request addresses a type compatibility issue in the `pmadd` function of the Eigen C++ library, which previously operated under the assumption that all three arguments were of identical types. The need for explicit type casting in mathematical expressions involving `pmadd` was highlighted, particularly when using custom scalar types. | |
### Key Changes: | |
- The `pmadd` signature expects all three arguments to have the same type, so arguments written as expressions need an explicit cast. | |
- The `pmadd` calls in `Eigen/src/LU/Determinant.h` now cast such arguments to `Scalar` to prevent compilation errors. | |
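
The deduction problem can be reproduced without Eigen. The following is a minimal sketch, not Eigen or CoDiPack code: `Real`, `MulExpr`, `NegExpr`, `pmadd_sketch` and `det_term` are hypothetical names, and the scalar type is only assumed to behave like an AD type whose operators return expression templates.

```cpp
#include <cassert>

// Hypothetical expression-template nodes (assumption, not CoDiPack's API).
struct MulExpr { double lhs, rhs; double eval() const { return lhs * rhs; } };
struct NegExpr { double v;        double eval() const { return -v; } };

// Hypothetical scalar type that can be built from either expression node.
struct Real {
  double v;
  Real(double x = 0.0) : v(x) {}
  Real(const MulExpr& e) : v(e.eval()) {}
  Real(const NegExpr& e) : v(e.eval()) {}
};

// The operators return expression nodes instead of Real.
inline MulExpr operator*(const Real& a, const Real& b) { return {a.v, b.v}; }
inline NegExpr operator-(const Real& a) { return {a.v}; }
inline Real operator+(const Real& a, const Real& b) { return Real(a.v + b.v); }

// Same shape as the generic pmadd: one template parameter for all three arguments.
template <typename Packet>
Packet pmadd_sketch(const Packet& a, const Packet& b, const Packet& c) {
  return Packet(a * b) + c;
}

inline Real det_term(const Real& m03, const Real& d0, const Real& m13, const Real& d1) {
  // pmadd_sketch(-m03, d0, m13 * d1) would not compile: Packet would have to be
  // NegExpr, Real and MulExpr at the same time. The casts make all three Real.
  return pmadd_sketch((Real)(-m03), d0, (Real)(m13 * d1));
}

// Small self-check: (-2 * 3) + (4 * 5) = 14.
inline void det_term_demo() {
  assert(det_term(Real(2.0), Real(3.0), Real(4.0), Real(5.0)).v == 14.0);
}
```

Casting at the call site keeps the single-parameter signature intact, which is the approach the merge request description shows for `Determinant.h`.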
### Improvements: | |
- Enhances type safety by ensuring scalar types used with overloaded operators are explicitly cast, reducing the risk of compiler errors. | |
- Provides clearer documentation on the necessity of casting, aiding future library users and maintainers. | |
### Impact: | |
The changes improve compatibility with custom scalar types and facilitate the use of the Eigen library with automatic differentiation libraries such as CoDiPack, leading to better integration and usability in mathematical computations." | |
810 (https://gitlab.com/libeigen/eigen/-/merge_requests/810),Fix two corner cases in the new implementation of logistic sigmoid.,"1. Truncate at the first point where the interpolant is exactly 1, | |
such that 1 is returned for all arguments greater than or equal | |
to it. | |
2. Make sure that Sigmoid(+Inf) = 1 in the generic implementation.",Rasmus Munk Larsen,2022-01-12T00:41:30.681Z,NA,NA,"## Title: | |
Fix two corner cases in the new implementation of logistic sigmoid. | |
## Authors: | |
Rasmus Munk Larsen | |
## Summary: | |
This merge request addresses two specific corner cases in the logistic sigmoid function's implementation to enhance its accuracy and robustness. | |
### Key Changes: | |
1. Introduced truncation at the point where the interpolant reaches exactly 1, ensuring that the function returns 1 for all inputs greater than or equal to this point. | |
2. Ensured that Sigmoid(+Inf) returns 1 in the generic implementation. | |
### Improvements: | |
These changes ensure better handling of edge cases within the logistic sigmoid function, enhancing its reliability in mathematical computations. | |
### Impact: | |
By implementing these fixes, the accuracy of the logistic sigmoid function is improved, leading to more reliable outputs for applications relying on this mathematical function." | |
809 (https://gitlab.com/libeigen/eigen/-/merge_requests/809),fix broken asserts,"The conditions in these asserts contained the variable name as a string, so there was no actual checking.",Erik Schultheis,2022-01-12T19:46:45.680Z,NA,NA,"## Title: | |
fix broken asserts | |
## Authors: | |
Erik Schultheis | |
## Summary: | |
This merge request addresses an issue with broken assertions in the Eigen C++ library. The original asserts contained variable names as strings, which resulted in no effective checking. | |
### Key Changes: | |
- Corrected the conditions in asserts to ensure valid checks are performed. | |
### Improvements: | |
- Enhanced the reliability of assertion checks within the codebase. | |
### Impact: | |
- Improves the robustness of the library by ensuring that assertions function correctly, potentially preventing unnoticed errors during runtime." | |
811 (https://gitlab.com/libeigen/eigen/-/merge_requests/811),fix compilation issue with gcc < 10 and -std=c++2a,"Fixes #2415 | |
This MR requires GCC <= 9.4 to use the old code for signed size as `std::ssize` is not available.",Joerg Buchwald,2022-01-13T01:43:05.391Z,NA,NA,"## Title: | |
Fix compilation issue with gcc < 10 and -std=c++2a | |
## Authors: | |
Joerg Buchwald | |
## Summary: | |
This merge request addresses a compilation issue encountered when using GCC versions earlier than 10 with the C++2a standard. It specifically resolves the problem documented in issue #2415 by requiring the use of legacy code for signed size, as `std::ssize` is unavailable in these compiler versions. | |
### Key Changes: | |
- Fixed the compilation issue for GCC versions <= 9.4 by using old code for signed size. | |
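
A fallback of this kind can be sketched as follows. This is an illustration under assumptions, not the code from the merge request: `signed_size` is a hypothetical helper, and the sketch selects the branch with the standard feature-test macro `__cpp_lib_ssize` rather than with the GCC version check the description mentions.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

#if defined(__cpp_lib_ssize) && __cpp_lib_ssize >= 201902L
// The standard library provides std::ssize, so forward to it.
template <typename Container>
constexpr auto signed_size(const Container& c) { return std::ssize(c); }
#else
// Older standard libraries: cast the unsigned size to a signed type by hand.
template <typename Container>
constexpr std::ptrdiff_t signed_size(const Container& c) {
  return static_cast<std::ptrdiff_t>(c.size());
}
#endif

// Small self-check: the result is signed, so it can be compared with negative values.
inline void signed_size_demo() {
  std::vector<int> v(3);
  assert(signed_size(v) == 3);
  assert(signed_size(v) > -1);
}
```

Either branch returns a signed value, so callers do not need to know which one was compiled.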
### Improvements: | |
- Enhanced compatibility with older GCC versions when compiling with C++2a. | |
### Impact: | |
- This change allows users of the Eigen C++ library to successfully compile code with older GCC versions, thereby improving accessibility and usability for a wider range of development environments." | |
764 (https://gitlab.com/libeigen/eigen/-/merge_requests/764),Add MMA and performance improvements for VSX in GEMV for PowerPC.,"Add MMA and performance improvements for VSX in GEMV for PowerPC. | |
Changes include improved complex operations, full packet operations, and addition of VSX and MMA acceleration. | |
Up to 2.5X faster for VSX and 4X for MMA.",Chip Kerchner,2022-01-13T13:23:19.153Z,NA,NA,"## Title: | |
Add MMA and performance improvements for VSX in GEMV for PowerPC. | |
## Authors: | |
Chip Kerchner | |
## Summary: | |
This merge request introduces enhancements in the Eigen C++ library, specifically targeting GEMV operations for PowerPC architecture. The improvements focus on leveraging VSX and MMA technologies to significantly boost performance. | |
### Key Changes: | |
- Improved handling of complex operations. | |
- Enhanced full packet operations. | |
- Added VSX and MMA acceleration features. | |
### Improvements: | |
- Performance gains of up to 2.5X with VSX optimizations. | |
- Performance improvements of up to 4X with MMA enhancements. | |
### Impact: | |
These changes are expected to substantially increase the efficiency of GEMV operations on PowerPC systems, providing better computational performance for applications relying on these operations." | |
812 (https://gitlab.com/libeigen/eigen/-/merge_requests/812),fix implicit conversion warning in vectorwise_reverse_inplace,"<!-- | |
Thanks for contributing a merge request! Please name and fully describe your MR as you would for a commit message. | |
If the MR fixes an issue, please include ""Fixes #issue"" in the commit message and the MR description. | |
In addition, we recommend that first-time contributors read our [contribution guidelines](https://eigen.tuxfamily.org/index.php?title=Contributing_to_Eigen) and [git page](https://eigen.tuxfamily.org/index.php?title=Git), which will help you submit a more standardized MR. | |
Before submitting the MR, you also need to complete the following checks: | |
- Make one PR per feature/bugfix (don't mix multiple changes into one PR). Avoid committing unrelated changes. | |
- Rebase before committing | |
- For code changes, run the test suite (at least the tests that are likely affected by the change). | |
See our [test guidelines](https://eigen.tuxfamily.org/index.php?title=Tests). | |
- If possible, add a test (both for bug-fixes as well as new features) | |
- Make sure new features are documented | |
Note that we are a team of volunteers; we appreciate your patience during the review process. | |
Again, thanks for contributing! --> | |
### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
This fixes warning 2 in Issue #2400; the implicit conversion from ``Eigen::Index`` to ``int`` in ``vectorwise_reverse_inplace_impl`` | |
I also noticed this warning when building the BDCSVD tests. | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
There's a new warning when building the tests about an implicit conversion from ``Index`` to ``int`` in Reverse, when it uses ``fix<N>(n)``. | |
This update just explicitly casts ``half`` from an Index to an int to remove the conversion warning. Since as far as I can see, ``fix<N>(n)`` only works with ``int``, this casting should be equivalent to how it was before the recent changes. | |
Before the recent cleanup, the pre-c++14 version of ``fix`` took the runtime value generically, and then manually cast it down to an int. So I think there was generally no implicit conversion.",Arthur,2022-01-13T20:30:55.569Z,NA,NA,"## Title: | |
Fix implicit conversion warning in vectorwise_reverse_inplace | |
## Authors: | |
Arthur | |
## Summary: | |
This merge request addresses an implicit conversion warning encountered in the Eigen C++ library, specifically pertaining to the `vectorwise_reverse_inplace_impl` function. The update involves explicitly casting a variable to prevent warnings related to type conversion during the build process. | |
### Key Changes: | |
- Added explicit casting of the `half` variable from `Eigen::Index` to `int` in the `vectorwise_reverse_inplace_impl` function. | |
- The cast is expected to match the earlier behavior: as far as the author can see, `fix<N>(n)` only works with `int`, and the pre-C++14 version of `fix` cast its runtime value down to `int` manually. | |
### Improvements: | |
- The modification removes the implicit conversion warning, thereby enhancing code clarity and safety. | |
- It ensures that the function adheres to type requirements without altering its original functionality. | |
### Impact: | |
- This change improves the build process by eliminating warnings, contributing to cleaner compilation and potentially reducing debugging time in related tests. | |
- It resolves warning 2 reported in Issue #2400." | |
814 (https://gitlab.com/libeigen/eigen/-/merge_requests/814),update comment referencing removed macro EIGEN_SIZE_MIN_PREFER_DYNAMIC,"### What does this implement/fix? | |
super minor :sweat_smile: | |
Just stumbled on this comment referencing ``EIGEN_SIZE_MIN_PREFER_DYNAMIC``, which no longer exists. | |
Updated it to reference the new constexpr function instead.",Arthur,2022-01-14T19:29:48.485Z,NA,NA,"## Title: | |
Update comment referencing removed macro EIGEN_SIZE_MIN_PREFER_DYNAMIC | |
## Authors: | |
Arthur | |
## Summary: | |
This merge request addresses a minor comment update in the Eigen C++ library, specifically changing a reference from the removed macro `EIGEN_SIZE_MIN_PREFER_DYNAMIC` to the new constexpr function. | |
### Key Changes: | |
- Updated comments to remove references to the outdated macro. | |
- Added a reference to the new constexpr function. | |
### Improvements: | |
- Enhanced code clarity by ensuring comments accurately reflect the current implementation and available functions. | |
### Impact: | |
- The update does not introduce functional changes but improves maintainability and reduces confusion for future contributors exploring the code." | |
813 (https://gitlab.com/libeigen/eigen/-/merge_requests/813),Minor correction/clarification to LSCG solver documentation,"This applies some minor corrections/clarifications to the docs of LeastSquaresConjugateGradient (LSCG). | |
LSCG solves the least squares problem ""min |Ax-b|"" or, equivalently, the normal equations ""A'Ax=A'b"". The documentation seems a little confused about this. In one place, it is described as solving ""min |A'Ax-b|"", which appears to be a mashup of the two formulas and is not correct. In another place, it is described as solving Ax=b, which is misleading without mentioning least-squares, as it (probably) will not be able to find an x satisfying that equation. I would guess both of these issues are simple copy-paste/editing errors.",Essex Edwards,2022-01-14T19:48:55.119Z,NA,NA,"## Title: Minor correction/clarification to LSCG solver documentation | |
## Authors: Essex Edwards | |
## Summary: | |
This merge request focuses on correcting and clarifying the documentation for the LeastSquaresConjugateGradient (LSCG) solver in the Eigen C++ library. It addresses inaccuracies in the descriptions of how LSCG solves least squares problems. | |
### Key Changes: | |
- Corrected misrepresentation of the least squares problem in the documentation, particularly the equations presented. | |
- Clarified that LSCG solves the least squares problem ""min |Ax-b|"" or, equivalently, the normal equations ""A'Ax=A'b"". | |
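
The equivalence of the two formulations follows from setting the gradient of the squared residual to zero. The short derivation below assumes a real matrix A and writes the transpose A' as A^T; minimizing the norm and minimizing its square give the same x.

```latex
f(x) = \|Ax - b\|_2^2 = (Ax - b)^{T}(Ax - b)
\nabla f(x) = 2A^{T}(Ax - b)
\nabla f(x) = 0 \iff A^{T}Ax = A^{T}b
```

Since f is convex, every solution of the normal equations is a minimizer of the least squares problem.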
### Improvements: | |
- Enhances the clarity of the documentation, ensuring that users understand the proper application of the LSCG solver. | |
- Reduces confusion by explicitly stating the nature of the least squares problems being solved. | |
### Impact: | |
- Improves user comprehension and reduces the risk of misapplication of the LSCG solver, which may lead to better outcomes in users' implementations." | |
815 (https://gitlab.com/libeigen/eigen/-/merge_requests/815),Fix implicit conversion warning in GEBP kernel's packing,"### Reference issue | |
Warning 1 in #2400 | |
### What does this implement/fix? | |
very small change to remove a shortening conversion warning in GeneralBlockPanelKernel.h. Not too important to fix, but I've seen this warning pop up a number of times in unrelated compilation errors and it's a little annoying to scroll past. | |
(Since it doesn't really show up in the diff,) ``gemm_pack_lhs`` in the gebp kernel includes some code like this | |
~~~~ | |
int pack = Pack1; | |
int psize = PacketSize; | |
... | |
Index left = rows - i; // rows remaining to pack | |
... | |
psize = pack = left & ~1; // triggers -Wshorten-64-to-32 | |
~~~~ | |
This MR just changes ``pack``, ``psize``, (and a loop variable) to ``Index`` to remove the warning. | |
### Additional Notes | |
maybe it's preferable to just do ``int left = internal::convert_index<int>(rows - i)``? | |
I think ``left = rows - i`` is the few remaining rows that didn't fit with the current value of ``pack``. So ``left`` should definitely be small enough to fit in an int. (and just printing the value of ``left`` in the product_large tests confirms it's always smaller than ``Pack1``)",Arthur,2022-01-18T12:55:05.320Z,NA,NA,"## Title: | |
Fix implicit conversion warning in GEBP kernel's packing | |
## Authors: | |
Arthur | |
## Summary: | |
This merge request addresses an implicit conversion warning in the GeneralBlockPanelKernel of the Eigen C++ library. The change switches a few variables to a wider type so that the warning no longer clutters the output when developers are tracking down unrelated compilation errors. | |
### Key Changes: | |
- Changed the data types of `pack`, `psize`, and a loop variable from `int` to `Index` in `GeneralBlockPanelKernel.h` to remove the `-Wshorten-64-to-32` warning. | |
### Improvements: | |
- By modifying the variable types, the merge request improves type safety and reduces unnecessary warnings, leading to cleaner compilation outputs. | |
### Impact: | |
- While the change does not introduce new functionality, it addresses a recurring compilation warning that can distract developers, thereby enhancing the development experience and maintaining code quality." | |
816 (https://gitlab.com/libeigen/eigen/-/merge_requests/816),Port EIGEN_OPTIMIZATION_BARRIER to soft float arm,"### Reference issue | |
Currently Eigen does not build if the -march CFLAG is set to ""armv6j+nofp"". This is happening because a ""w"" inline asm constraint is used. | |
### What does this implement/fix? | |
This change detects no-floating-point-hw targets using the __ARM_FP macro, and avoids the ""w"" constraint accordingly. | |
### Additional information | |
<!--Any additional information you think is important.-->",David Gao,2022-01-20T00:44:17.678Z,NA,NA,"## Title: | |
Port EIGEN_OPTIMIZATION_BARRIER to soft float arm | |
## Authors: | |
David Gao | |
## Summary: | |
This merge request updates the Eigen C++ library to support compilation on ARM architectures that do not have hardware floating point support. It modifies the inline assembly constraints to ensure compatibility with specific ARM flag settings. | |
### Key Changes: | |
- Detection of no-floating-point hardware targets using the `__ARM_FP` macro. | |
- Avoidance of the ""w"" inline asm constraint on targets without floating-point hardware, for example when the -march CFLAG is set to ""armv6j+nofp"". | |
### Improvements: | |
- Enhanced compatibility for building Eigen on soft float ARM platforms, broadening the library's usability in various ARM environments. | |
### Impact: | |
The change allows Eigen to successfully compile on ARM systems without hardware floating point support, thus expanding its applicability and availability for developers working on such platforms." | |
819 (https://gitlab.com/libeigen/eigen/-/merge_requests/819),Improve clang warning suppressions by checking if warning is supported,NA,Sean McBride,2022-01-21T00:27:43.962Z,NA,NA,"## Title: | |
Improve clang warning suppressions by checking if warning is supported | |
## Authors: | |
Sean McBride | |
## Summary: | |
This merge request enhances the handling of clang warning suppressions in the Eigen C++ library by ensuring that suppressions are only applied to warnings that are actually supported by the compiler. | |
### Key Changes: | |
- Introduced checks to verify if specific clang warnings are supported before applying suppressions. | |
### Improvements: | |
- Increased robustness of warning suppressions, preventing unnecessary or ineffective suppressions from cluttering the codebase. | |
### Impact: | |
- Improves code quality by ensuring that only relevant warning suppressions are used, making the build process cleaner and potentially aiding in the identification of real issues." | |
818 (https://gitlab.com/libeigen/eigen/-/merge_requests/818),Silence some MSVC warnings,"### What does this implement/fix? | |
Silence two warnings in construct_elements_of_array() | |
- C4701: potentially uninitialized local variable 'i' used (in catch handler) | |
- C4702: unreachable code (return NULL) | |
I have been running these mods locally for a while on vs2017, 2019 and now 2022. Has not caused me issues. | |
However, I have not run tests or compiled on other platforms.",Stephen Pierce,2022-01-21T00:47:15.290Z,NA,NA,"## Title: | |
Silence some MSVC warnings | |
## Authors: | |
Stephen Pierce | |
## Summary: | |
This merge request addresses and silences two specific warnings generated by the Microsoft Visual C++ (MSVC) compiler within the `construct_elements_of_array()` function. | |
### Key Changes: | |
- Resolved warning C4701 regarding a potentially uninitialized local variable 'i' used in a catch handler. | |
- Resolved warning C4702 related to unreachable code that returned NULL. | |
### Improvements: | |
The code modifications enhance the clarity and reliability of the function by eliminating compiler warnings, which may lead to a cleaner build process and more straightforward debugging. | |
### Impact: | |
While the author has tested the changes on MSVC versions 2017, 2019, and 2022 without issues, no testing has been conducted on other platforms, leaving potential impacts on cross-platform compatibility unverified." | |
772 (https://gitlab.com/libeigen/eigen/-/merge_requests/772),Cleanup,"some more cleanup, removing EIGEN_HAS_CONSTEXPR, EIGEN_HAS_INDEX_LIST, EIGEN_HAS_STD_RESULT_OF and Eigens own implementation of index/integer_sequence. | |
These are the macros that were currently on the list here #2372, though there are some more where I'm not sure yet whether they should be removed.",Erik Schultheis,2022-01-21T01:48:59.990Z,NA,NA,"## Title: | |
Cleanup | |
## Authors: | |
Erik Schultheis | |
## Summary: | |
This merge request focuses on additional cleanup in the Eigen C++ library by removing several macros and an internal implementation related to sequences. | |
### Key Changes: | |
- Removed `EIGEN_HAS_CONSTEXPR` | |
- Removed `EIGEN_HAS_INDEX_LIST` | |
- Removed `EIGEN_HAS_STD_RESULT_OF` | |
- Removed Eigen's implementation of `index/integer_sequence` | |
### Improvements: | |
The removal of these macros and implementations simplifies the codebase, potentially improving readability and maintainability. | |
### Impact: | |
This cleanup may enhance the overall structure of the code, reducing complexity and making it easier for contributors to navigate and work within the library." | |
817 (https://gitlab.com/libeigen/eigen/-/merge_requests/817),Add support for packets of int64 on x86,Currently only AVX is supported.,Ilya Tokar,2022-01-21T19:55:25.184Z,NA,NA,"## Title: | |
Add support for packets of int64 on x86 | |
## Authors: | |
Ilya Tokar | |
## Summary: | |
This merge request enhances the Eigen C++ library by adding support for processing int64 data types using vectorized operations on x86 architectures. | |
### Key Changes: | |
- Introduced support for int64 packets on x86; currently only AVX is supported. | |
### Improvements: | |
- This addition allows for more efficient computation with int64 data, leveraging SIMD (Single Instruction, Multiple Data) capabilities on supported x86 processors. | |
### Impact: | |
- Users of the Eigen library will benefit from improved performance and functionality when working with int64 data types, particularly in applications requiring high-performance numerical computations." | |
821 (https://gitlab.com/libeigen/eigen/-/merge_requests/821),Prevent heap allocation in diagonal product,"### Reference issue | |
fixes #2408 - product with a diagonal matrix had a heap allocation. | |
### What does this implement/fix? | |
Just sets the ``NestByRefBit`` in the ``DiagonalMatrix`` traits to prevent the heap allocation. | |
Since this bit wasn't set, the ``Product`` class was not using a reference type! Instead, it was copying the diagonal matrix.",Arthur,2022-01-21T21:36:01.524Z,NA,NA,"## Title: | |
Prevent heap allocation in diagonal product | |
## Authors: | |
Arthur | |
## Summary: | |
This merge request addresses an issue with the Eigen C++ library related to unnecessary heap allocation during the product operation with diagonal matrices. By setting the `NestByRefBit` in the `DiagonalMatrix` traits, it ensures that the `Product` class uses a reference type instead of copying the diagonal matrix, thus optimizing performance. | |
### Key Changes: | |
- Set `NestByRefBit` in `DiagonalMatrix` traits. | |
- Resolved the issue where the `Product` class was copying matrices instead of referencing them. | |
### Improvements: | |
- Avoids the heap allocation, and the copy of the diagonal matrix behind it, in products with a diagonal matrix. | |
### Impact: | |
- Enhanced performance for matrix products involving diagonal matrices, leading to more efficient memory management and potentially faster computations in applications using this feature." | |
820 (https://gitlab.com/libeigen/eigen/-/merge_requests/820),"Add reciprocal packet op and fast specializations for float with SSE, AVX, and AVX512.","Add reciprocal packet op and fast specializations for float with | |
SSE and AVX, which have builtin instructions for approximate reciprocal. | |
The approximation is refined by one step of Newton-Raphson iteration. | |
The result is accurate to 2 ulps for SSE/AVX and within 1 ulp for AVX512, | |
where the `_mm512_rcp14_ps` instruction provides a better starting guess. | |
TODO: Add specializations for more ISAs with fast approximate reciprocal instructions. | |
Benchmark numbers measured on Intel Xeon Gold 6154 (Skylake): | |
``` | |
AVX512 (packet size 16) | |
name old cpu/op new cpu/op delta | |
BM_eigen_inverse_float/1 2.72ns ± 0% 0.61ns ± 2% -77.39% (p=0.000 n=53+59) | |
BM_eigen_inverse_float/8 5.73ns ± 1% 6.25ns ± 2% +9.10% (p=0.000 n=58+60) | |
BM_eigen_inverse_float/64 31.2ns ± 2% 13.4ns ± 2% -56.96% (p=0.000 n=51+60) | |
BM_eigen_inverse_float/512 115ns ± 2% 43ns ± 3% -62.38% (p=0.000 n=59+57) | |
BM_eigen_inverse_float/4k 781ns ± 2% 290ns ± 2% -62.88% (p=0.000 n=60+53) | |
BM_eigen_inverse_float/32k 6.12µs ± 2% 2.94µs ± 3% -51.99% (p=0.000 n=59+48) | |
BM_eigen_inverse_float/256k 80.1µs ± 2% 81.2µs ± 2% +1.28% (p=0.000 n=60+56) | |
BM_eigen_inverse_float/1M 321µs ± 2% 324µs ± 1% +0.91% (p=0.000 n=33+29) | |
AVX (packet size 8): | |
name old cpu/op new cpu/op delta | |
BM_eigen_inverse_float/1 2.72ns ± 0% 3.28ns ± 1% +20.53% (p=0.000 n=54+45) | |
BM_eigen_inverse_float/8 5.72ns ± 0% 6.65ns ± 0% +16.21% (p=0.000 n=56+56) | |
BM_eigen_inverse_float/64 19.0ns ± 0% 12.6ns ± 2% -33.75% (p=0.000 n=58+48) | |
BM_eigen_inverse_float/512 95.1ns ± 0% 50.7ns ± 4% -46.65% (p=0.000 n=52+55) | |
BM_eigen_inverse_float/4k 704ns ± 0% 368ns ± 2% -47.65% (p=0.000 n=56+50) | |
BM_eigen_inverse_float/32k 5.57µs ± 0% 3.47µs ± 3% -37.75% (p=0.000 n=57+50) | |
BM_eigen_inverse_float/256k 78.2µs ± 1% 80.7µs ± 2% +3.29% (p=0.000 n=59+58) | |
BM_eigen_inverse_float/1M 313µs ± 1% 323µs ± 1% +3.42% (p=0.000 n=33+33) | |
SSE (packet size 4): | |
name old cpu/op new cpu/op delta | |
BM_eigen_inverse_float/1 0.83ns ± 1% 0.83ns ± 0% -0.10% (p=0.006 n=56+50) | |
BM_eigen_inverse_float/8 3.26ns ± 0% 2.45ns ± 0% -24.68% (p=0.000 n=48+50) | |
BM_eigen_inverse_float/64 14.7ns ± 0% 11.1ns ± 1% -24.48% (p=0.000 n=53+54) | |
BM_eigen_inverse_float/512 106ns ± 0% 86ns ± 0% -18.47% (p=0.000 n=55+55) | |
BM_eigen_inverse_float/4k 835ns ± 0% 640ns ± 0% -23.44% (p=0.000 n=55+54) | |
BM_eigen_inverse_float/32k 6.67µs ± 0% 5.81µs ± 0% -12.83% (p=0.000 n=51+56) | |
BM_eigen_inverse_float/256k 78.4µs ± 2% 79.0µs ± 1% +0.71% (p=0.000 n=55+54) | |
BM_eigen_inverse_float/1M 313µs ± 1% 316µs ± 1% +0.88% (p=0.000 n=29+30) | |
``` | |
Thanks to @sandwichmaker for reviewing a preliminary version of this at Google.",Rasmus Munk Larsen,2022-01-21T23:49:19.363Z,NA,NA,"## Title: | |
Add reciprocal packet op and fast specializations for float with SSE, AVX, and AVX512. | |
## Authors: | |
Rasmus Munk Larsen | |
## Summary: | |
This merge request introduces a reciprocal packet operation and fast specializations for float using SSE, AVX, and AVX512. It uses built-in instructions for approximate reciprocals and refines the result with one step of Newton-Raphson iteration. Benchmarks on an Intel Xeon Gold 6154 (Skylake) show large reductions in CPU time for most array sizes, with slowdowns at a few sizes. | |
### Key Changes: | |
- Added reciprocal packet operation for float. | |
- Implemented optimizations using SSE, AVX, and AVX512 with built-in instructions. | |
- The result is accurate to 2 ulps for SSE/AVX and within 1 ulp for AVX512, where `_mm512_rcp14_ps` provides a better starting guess. | |
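
The refinement step can be illustrated in scalar code. This is a sketch under assumptions, not the packet implementation from the merge request: `rough_reciprocal` is a hypothetical stand-in for a hardware estimate, modelled here as the exact reciprocal with a relative error of about 2^-12, and all function names are made up for illustration.

```cpp
#include <cassert>
#include <cmath>

// One Newton-Raphson step for f(x) = 1/x - a, whose root is x = 1/a:
//   x1 = x0 * (2 - a * x0)
// If x0 has relative error e, then x1 has relative error of about e * e.
inline float refine_reciprocal(float a, float x0) {
  return x0 * (2.0f - a * x0);
}

// Hypothetical stand-in for an approximate reciprocal instruction (assumption):
// the exact reciprocal scaled by (1 + 2^-12).
inline float rough_reciprocal(float a) {
  return (1.0f / a) * (1.0f + 1.0f / 4096.0f);
}

// Estimate followed by a single refinement step.
inline float reciprocal_sketch(float a) {
  return refine_reciprocal(a, rough_reciprocal(a));
}

// Small self-check: the refined value is much closer to 1/a than the estimate.
inline void reciprocal_demo() {
  const float a = 3.0f;
  const float err_rough = std::fabs(rough_reciprocal(a) * a - 1.0f);
  const float err_refined = std::fabs(reciprocal_sketch(a) * a - 1.0f);
  assert(err_refined < err_rough);
}
```

Because the error is roughly squared by each step, a better starting guess leaves less error after a single step, which is consistent with the tighter bound reported for AVX512.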
### Improvements: | |
- Benchmark results on an Intel Xeon Gold 6154 (Skylake), by array size: | |
- AVX512: up to 77.39% less CPU time (size 1); sizes 8, 256k, and 1M were 0.91% to 9.10% slower. | |
- AVX: up to 47.65% less CPU time (size 4k); sizes 1 and 8 were 20.53% and 16.21% slower, and sizes 256k and 1M were about 3% slower. | |
- SSE: up to 24.68% less CPU time (size 8); sizes 256k and 1M were under 1% slower. | |
### Impact: | |
These changes lead to faster execution of float reciprocal operations across different architectures, enhancing the performance of applications that rely on these computations, while maintaining high accuracy levels. Further optimizations for additional instruction set architectures are planned, which may extend these benefits." | |
822 (https://gitlab.com/libeigen/eigen/-/merge_requests/822),make casts explicit and fixed the type,"I think there is a bug in the implementation of the rand test. | |
The variable is called short offset, and assigned a maximum value of 16'000, yet is stored in a signed char. | |
I admittedly don't quite understand how this is used exactly, so it would be good if someone with more knowledge about this test could take a look. | |
I've also changed the maximum number that can be assigned to short offset, because `24345 + 16000 > 2^15` would result in an overflow in line 97.",Erik Schultheis,2022-01-24T18:19:22.495Z,NA,NA,"## Title: | |
Make casts explicit and fix the type | |
## Authors: | |
Erik Schultheis | |
## Summary: | |
This merge request addresses a potential bug in the implementation of the random test in the Eigen C++ library. The issue centers around the variable `short offset`, which was originally stored as a signed char, despite being assigned a maximum value of 16,000. The submission corrects this by making necessary type adjustments to prevent overflow. | |
### Key Changes: | |
- Changed the type of `short offset` to accommodate larger values. | |
- Updated the maximum allowable value for `short offset` to prevent overflow. | |
### Improvements: | |
- Ensures that `short offset` can safely handle maximum assignments without leading to unexpected behavior due to overflow. | |
- Introduced explicit casts to clarify type conversions within the code. | |
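The overflow described in this MR can be checked with a small compile-time sketch. The helper `fits_in` is a hypothetical name used only for illustration and is not part of the Eigen test; the values 16000 and 24345 are taken from the MR description.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper (not part of Eigen): true if v is representable in T.
template <typename T>
constexpr bool fits_in(long long v) {
  return v >= static_cast<long long>(std::numeric_limits<T>::min()) &&
         v <= static_cast<long long>(std::numeric_limits<T>::max());
}

// 16000 does not fit in a signed char, but it does fit in a short.
constexpr bool offset_fits_char = fits_in<signed char>(16000);
constexpr bool offset_fits_short = fits_in<short>(16000);

// 24345 + 16000 = 40345, which exceeds the 16-bit signed maximum of 32767.
constexpr bool sum_fits_int16 = fits_in<std::int16_t>(24345 + 16000);
```

Here `offset_fits_char` and `sum_fits_int16` evaluate to false, which is the situation the MR guards against by widening the type and lowering the maximum offset.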
### Impact: | |
The changes improve the robustness of the random test implementation by preventing potential overflow issues, leading to more reliable test outcomes. This enhances the overall stability of the Eigen library's testing framework." | |
824 (https://gitlab.com/libeigen/eigen/-/merge_requests/824),"Remove inline assembly for FMA (AVX) and add remaining extensions as packet ops: pmsub, pnmadd, and pnmsub.","Adding the additional variation can save explicit negations in various low-level implementations. In a followup to this change, they will be used to make `preciprocal` IEEE compliant with minimal overhead. | |
This change also removes the old workaround for register spilling in `Eigen/src/Core/arch/AVX/PacketMath.h`, which appears very counterproductive on modern compiler/CPU combos. For example, compiling a matrix multiplication benchmark with clang 11 without the workaround yields the following speedups on a Skylake core (in addition to the improved readability). | |
| flags | speedup | | |
| ------ | ------ | | |
| -march=skylake | 25% (!) | | |
| -mavx -mfma | 12% (!) | | |
| -mavx | unchanged | | |
Closes #2231",Rasmus Munk Larsen,2022-01-26T04:25:41.636Z,NA,NA,"## Title: | |
Remove inline assembly for FMA (AVX) and add remaining extensions as packet ops: pmsub, pnmadd, and pnmsub. | |
## Authors: | |
Rasmus Munk Larsen | |
## Summary: | |
This merge request introduces significant modifications to the Eigen C++ library by removing inline assembly for Fused Multiply-Add (FMA) and adding new packet operations (pmsub, pnmadd, pnmsub). These changes aim to enhance performance and improve code readability. | |
### Key Changes: | |
- Removal of inline assembly for FMA (AVX). | |
- Addition of packet operations: pmsub, pnmadd, and pnmsub. | |
- Elimination of the old workaround for register spilling in `Eigen/src/Core/arch/AVX/PacketMath.h`. | |
### Improvements: | |
- The new packet operations can reduce the need for explicit negations in low-level implementations. | |
- The removal of the register spilling workaround improves the overall performance, leading to notable speedups in matrix multiplication benchmarks compiled with modern compilers. | |
### Impact: | |
- Speedups of up to 25% were observed on Skylake cores when compiling without the previous workaround. | |
- Improved readability of the codebase and preparation for making `preciprocal` IEEE compliant with minimal overhead in subsequent changes." | |
825 (https://gitlab.com/libeigen/eigen/-/merge_requests/825),reduce float warnings (comparisons and implicit conversions),"This MR consists of three changes to reduce the number of warnings that would be generated due to floating point concerns: | |
1) It adds new utility functions for performing exact floating point comparisons to 0 and 1. Why the extra function? Because for one, these are the most common types of exact floating point comparisons, and the ones where we usually clearly want exact comparison to happen (e.g. do I need to add this scalar/do I need to multiply with this scalar/can I ignore this term). Furthermore, having this as an extra function means we can hide generating the correctly typed 0 and 1 in their code, and enforce that the correct type is used. | |
2) Replace `==` with `equal_strict` in regular code, and `VERIFY(... == ...)` with `VERIFY_IS_EQUAL` in the tests. | |
3) wrote out some implicit conversions into explicit casts. This is only in the test suite. | |
Q: What should be the name for the comparison to 0/1 numbers? Right now I'm using `is_zero_strict` in analogy to `equal_strict`, but a more descriptive name (maybe) could also be `is_exactly_zero` (this is more how I would read this out aloud in an if, for example).",Erik Schultheis,2022-01-26T18:16:19.578Z,NA,NA,"## Title: | |
Reduce Float Warnings (Comparisons and Implicit Conversions) | |
## Authors: | |
Erik Schultheis | |
## Summary: | |
This merge request introduces enhancements to minimize floating-point warnings in the Eigen C++ library. It focuses on exact comparisons and the elimination of implicit conversions, thereby improving the integrity and clarity of numerical operations. | |
### Key Changes: | |
1. Added utility functions for exact floating-point comparisons to 0 and 1. | |
2. Replaced `==` with `equal_strict` in the main code base; changed `VERIFY(... == ...)` to `VERIFY_IS_EQUAL` in tests. | |
3. Converted some implicit conversions into explicit casts within the test suite. | |
### Improvements: | |
- The new utility functions streamline exact comparisons to 0 and 1 and enforce that the correctly typed constant is used. | |
- Enhanced code readability and clarity through consistent use of dedicated comparison functions. | |
- Made implicit conversions in the test suite explicit, removing the associated conversion warnings. | |
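A minimal sketch of what such comparison helpers might look like follows. The function names are hypothetical and the bodies are illustrative only; the actual Eigen helpers may differ, for example by suppressing float-equality warnings internally.

```cpp
// Hypothetical sketch of exact-comparison helpers; not Eigen's actual code.
// The typed constants Scalar(0) and Scalar(1) are generated inside the helper,
// so callers cannot accidentally compare against a constant of another type.
template <typename Scalar>
bool is_exactly_zero_sketch(const Scalar& x) {
  return x == Scalar(0);
}

template <typename Scalar>
bool is_exactly_one_sketch(const Scalar& x) {
  return x == Scalar(1);
}
```

A call site such as `if (is_exactly_zero_sketch(alpha)) return;` states the intent of an exact comparison explicitly, which is the readability argument made in the MR.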
### Impact: | |
These changes are expected to significantly decrease floating-point related warnings, leading to cleaner compilation output and increased reliability in numerical computations. The modifications also facilitate easier code maintenance and clearer intent in comparisons, which is critical for users relying on precise numerical calculations in their applications." | |
827 (https://gitlab.com/libeigen/eigen/-/merge_requests/827),Make preciprocal IEEE compliant w.r.t. 1/0 and 1/inf.,"The new implementation takes advantage of the precondition that the starting approximation is 0 for infinite arguments and vice versa. Since one term in the Newton-Raphson step is the product of the argument and the approximation, we can detect zeros and infinities by checking if this term is NaN. This is faster than explicitly testing whether the argument is inf or 0. | |
Here are benchmark results comparing this implementation with `pdiv` (i.e. the state before commit ea2c02060cc329cc83f6a77e8247b195b5defcd9.) The change also appears to speed up the scalar path significantly (not sure why, but I'll take it). | |
``` | |
SSE+FMA (-mfma) | |
name old cpu/op new cpu/op delta | |
BM_eigen_inverse_float/1 2.72ns ± 0% 1.09ns ± 0% -59.73% (p=0.000 n=53+58) | |
BM_eigen_inverse_float/8 5.71ns ± 1% 5.70ns ± 0% -0.20% (p=0.000 n=49+54) | |
BM_eigen_inverse_float/64 19.0ns ± 0% 10.4ns ± 3% -45.25% (p=0.000 n=52+60) | |
BM_eigen_inverse_float/512 95.0ns ± 0% 63.8ns ± 3% -32.85% (p=0.000 n=55+59) | |
BM_eigen_inverse_float/4k 703ns ± 0% 508ns ± 3% -27.80% (p=0.000 n=55+60) | |
BM_eigen_inverse_float/32k 5.57µs ± 0% 4.89µs ± 3% -12.30% (p=0.000 n=56+60) | |
BM_eigen_inverse_float/256k 78.3µs ± 1% 80.4µs ± 2% +2.62% (p=0.000 n=60+60) | |
BM_eigen_inverse_float/1M 313µs ± 2% 322µs ± 3% +2.82% (p=0.000 n=35+35) | |
AVX+FMA (-march=skylake) | |
name old cpu/op new cpu/op delta | |
BM_eigen_inverse_float/1 2.72ns ± 0% 1.10ns ± 0% -59.68% (p=0.000 n=47+60) | |
BM_eigen_inverse_float/8 5.72ns ± 0% 5.70ns ± 0% -0.39% (p=0.000 n=52+55) | |
BM_eigen_inverse_float/64 19.0ns ± 0% 11.0ns ± 2% -41.86% (p=0.000 n=52+60) | |
BM_eigen_inverse_float/512 95.1ns ± 0% 64.2ns ± 3% -32.44% (p=0.000 n=54+60) | |
BM_eigen_inverse_float/4k 704ns ± 0% 510ns ± 3% -27.53% (p=0.000 n=54+60) | |
BM_eigen_inverse_float/32k 5.57µs ± 0% 4.89µs ± 3% -12.32% (p=0.000 n=56+60) | |
BM_eigen_inverse_float/256k 78.4µs ± 1% 80.6µs ± 1% +2.88% (p=0.000 n=60+53) | |
BM_eigen_inverse_float/1M 314µs ± 1% 322µs ± 1% +2.79% (p=0.000 n=33+29) | |
AVX512+FMA (-march=skylake-avx512) | |
name old cpu/op new cpu/op delta | |
BM_eigen_inverse_float/1 2.72ns ± 0% 0.62ns ± 2% -77.23% (p=0.000 n=50+60) | |
BM_eigen_inverse_float/8 5.73ns ± 1% 6.28ns ± 2% +9.65% (p=0.000 n=55+59) | |
BM_eigen_inverse_float/64 31.4ns ± 2% 13.5ns ± 2% -57.17% (p=0.000 n=50+60) | |
BM_eigen_inverse_float/512 115ns ± 2% 46ns ± 2% -59.80% (p=0.000 n=60+59) | |
BM_eigen_inverse_float/4k 787ns ± 2% 324ns ± 3% -58.87% (p=0.000 n=60+49) | |
BM_eigen_inverse_float/32k 6.17µs ± 2% 3.20µs ± 3% -48.16% (p=0.000 n=60+50) | |
BM_eigen_inverse_float/256k 80.5µs ± 2% 80.6µs ± 1% ~ (p=0.262 n=57+60) | |
BM_eigen_inverse_float/1M 322µs ± 1% 322µs ± 1% ~ (p=0.429 n=30+29) | |
```",Rasmus Munk Larsen,2022-01-26T20:38:07.115Z,NA,NA,"## Title: | |
Make preciprocal IEEE compliant w.r.t. 1/0 and 1/inf. | |
## Authors: | |
Rasmus Munk Larsen | |
## Summary: | |
This merge request reworks `preciprocal` so that it returns IEEE-compliant results for 1/0 and 1/inf. It relies on the precondition that the starting approximation is 0 for infinite arguments and infinite for zero arguments, so special cases can be detected by checking whether one term of the Newton-Raphson step is NaN. | |
### Key Changes: | |
- Reworked the implementation to take advantage of inherent properties of zero and infinity in the Newton-Raphson step. | |
- Removed explicit checks for zero or infinity, relying instead on detecting NaN values, which accelerates the process. | |
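The NaN-based detection can be illustrated with a scalar sketch. This is an assumption-laden illustration, not Eigen's packet code: the function name is hypothetical, and the starting approximation `y0` is passed in by the caller instead of coming from a hardware reciprocal instruction.

```cpp
#include <cmath>

// Hypothetical scalar sketch of one Newton-Raphson refinement of y0 ~ 1/x.
// Precondition (from the MR): y0 is 0 when x is infinite and infinite when x is 0.
inline float refine_reciprocal_sketch(float x, float y0) {
  const float xy = x * y0;   // 0 * inf or inf * 0 yields NaN
  if (std::isnan(xy)) {
    return y0;               // the approximation is already the IEEE result
  }
  return y0 + y0 * (1.0f - xy);  // standard Newton-Raphson step for 1/x
}
```

The single NaN test covers both special cases at once, which is why it can be cheaper than testing the argument for zero and infinity separately.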
### Improvements: | |
- Benchmark results show substantial reductions in CPU time per operation for smaller input sizes. | |
- For instance, with SSE+FMA, BM_eigen_inverse_float/1 shows a reduction of 59.73% and BM_eigen_inverse_float/64 a reduction of 45.25% relative to `pdiv`. | |
### Impact: | |
- The changes significantly speed up the scalar path of the preciprocal function, contributing to overall improved performance in calculations involving these operations. | |
- Slowdowns of roughly 3% appear for the largest inputs (256k and 1M) with SSE and AVX, and the AVX512 size-8 case is about 10% slower; most other sizes show substantial gains." | |
828 (https://gitlab.com/libeigen/eigen/-/merge_requests/828),Fix number of block columns to NOT overflow the cache (PowerPC) abnormally in GEMV,Fix number of block columns to NOT overflow the cache (PowerPC) abnormally in GEMV. Timings were unusual when number of columns was between 2500-3200-ish.,Chip Kerchner,2022-01-27T20:35:53.586Z,NA,NA,"## Title: | |
Fix number of block columns to NOT overflow the cache (PowerPC) abnormally in GEMV | |
## Authors: | |
Chip Kerchner | |
## Summary: | |
This merge request addresses an issue in the Eigen C++ library related to the GEMV (General Matrix-Vector multiplication) function, specifically for PowerPC architecture. It fixes the number of block columns so that the cache is not overflowed, which had caused unusual timings when the number of columns was roughly between 2500 and 3200. | |
### Key Changes: | |
- Fixed the number of block columns in the GEMV implementation to prevent caching issues on PowerPC. | |
### Improvements: | |
- Enhanced performance stability for GEMV across specific column sizes by avoiding unusual timings. | |
### Impact: | |
- This change is expected to improve the efficiency and reliability of matrix operations on PowerPC, ensuring consistent performance in scenarios with a higher number of columns." | |
830 (https://gitlab.com/libeigen/eigen/-/merge_requests/830),removed some documentation referencing c++98 behaviour,This MR removes some comments/documentation that reference C++98/03 behaviour.,Erik Schultheis,2022-01-30T20:22:51.206Z,NA,NA,"## Title: | |
Removed some documentation referencing C++98 behaviour | |
## Authors: | |
Erik Schultheis | |
## Summary: | |
This merge request focuses on the removal of comments and documentation that reference behaviors specific to C++98 and C++03 standards. | |
### Key Changes: | |
- Removal of outdated documentation and comments related to C++98/03. | |
### Improvements: | |
- Streamlined documentation, making it more relevant to current standards. | |
### Impact: | |
- Enhances clarity and accuracy of the documentation by eliminating references to obsolete C++ standards." | |
826 (https://gitlab.com/libeigen/eigen/-/merge_requests/826),Update SVD Module with Options template parameter,"<!-- | |
Thanks for contributing a merge request! Please name and fully describe your MR as you would for a commit message. | |
If the MR fixes an issue, please include ""Fixes #issue"" in the commit message and the MR description. | |
In addition, we recommend that first-time contributors read our [contribution guidelines](https://eigen.tuxfamily.org/index.php?title=Contributing_to_Eigen) and [git page](https://eigen.tuxfamily.org/index.php?title=Git), which will help you submit a more standardized MR. | |
Before submitting the MR, you also need to complete the following checks: | |
- Make one PR per feature/bugfix (don't mix multiple changes into one PR). Avoid committing unrelated changes. | |
- Rebase before committing | |
- For code changes, run the test suite (at least the tests that are likely affected by the change). | |
See our [test guidelines](https://eigen.tuxfamily.org/index.php?title=Tests). | |
- If possible, add a test (both for bug-fixes as well as new features) | |
- Make sure new features are documented | |
Note that we are a team of volunteers; we appreciate your patience during the review process. | |
Again, thanks for contributing! --> | |
### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
Take 2 of !658. This MR is a less devastating API break :sweat_smile: | |
Updated following discussion in !750. | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
This makes several API breaking changes to the SVD module. However, it stays compatible with the most common use of the old API. | |
1. Adds the ``Options`` template parameter to the SVD module: ``JacobiSVD<MatrixType, ComputeThinU | ComputeThinV> svd(m);`` | |
- This ""improved"" API allows computing thin unitaries of fixed-size matrices, which is not possible at the moment. | |
2. ""Deprecates"" the constructor taking computation options: ``SvdType svd(m, ComputeThinU);`` to stay compatible with the old version. | |
3. Disallows using both the ``Options`` template parameter and deprecated constructor at the same time. | |
- This is so it can use a static assert to fail early, rather than failing at runtime or silently preferring one setting over the other. | |
- I.e., doing ``JacobiSVD<MatrixType, Options> svd(m, options);`` raises a static assert. | |
4. Removes the overload of ``compute`` that could change the computation options: ``svd.compute(m2, newOptions);`` | |
The plugin methods are essentially in the same situation as the constructor. Can use either | |
- ``m.template jacobiSvd<Options>();`` | |
- ``m.jacobiSvd(options);`` | |
### Additional information | |
<!--Any additional information you think is important.--> | |
Worked on this very sporadically, but the majority of this should be essentially the same as the original MR.",Arthur,2022-02-02T00:15:44.550Z,NA,NA,"## Title: | |
Update SVD Module with Options template parameter | |
## Authors: | |
Arthur | |
## Summary: | |
This merge request introduces significant updates to the SVD module of the Eigen C++ library. It implements multiple API breaking changes aimed at improving flexibility and usability while maintaining compatibility with existing common use cases. | |
### Key Changes: | |
1. Introduction of the `Options` template parameter in the SVD module, which makes it possible to compute thin unitaries for fixed-size matrices. | |
2. The constructor that accepts computation options is kept but deprecated, so the most common use of the old API continues to work. | |
3. Enforcement of a static assert to prevent simultaneous use of the new `Options` template parameter and the deprecated constructor. | |
4. Removal of the `compute` method overload that allowed changing computation options. | |
### Improvements: | |
- The updated API allows thin unitaries to be computed for fixed-size matrices, which was not possible before. | |
- It introduces clearer error handling through static assertions, which can prevent runtime failures. | |
### Impact: | |
These changes impact the SVD module's API, potentially requiring updates to existing code that relies on the old constructor or method signatures. However, they offer a more robust and forward-compatible system for users implementing SVD in their applications." | |
835 (https://gitlab.com/libeigen/eigen/-/merge_requests/835),Fix ODR violations.,"We can't use unnamed namespaces (or any internal linkage) in headers, since this will | |
lead to ODR violations and undefined behavior. | |
Fixes #2392",Antonio Sánchez,2022-02-04T19:01:08.324Z,NA,NA,"## Title: | |
Fix ODR violations. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses issues related to One Definition Rule (ODR) violations in the Eigen C++ library by removing unnamed namespaces and internal linkage from header files. | |
### Key Changes: | |
- Eliminated unnamed namespaces in headers to prevent ODR violations. | |
### Improvements: | |
- Enhanced code consistency and reliability by ensuring compliance with ODR. | |
### Impact: | |
- Reduced the risk of undefined behavior, leading to more stable and predictable library usage across different compilation units." | |
832 (https://gitlab.com/libeigen/eigen/-/merge_requests/832),"Fix AVX512 math function consistency, enable for ICC.",Fixes #2419.,Antonio Sánchez,2022-02-04T19:35:19.174Z,NA,NA,"## Title: | |
Fix AVX512 math function consistency, enable for ICC. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses issues related to the consistency of AVX512 mathematical functions and enables their support for the Intel C++ Compiler (ICC). The changes resolve a specific problem documented in issue #2419. | |
### Key Changes: | |
- Correction of inconsistencies in AVX512 math functions. | |
- Added compatibility for using AVX512 functions with ICC. | |
### Improvements: | |
- Enhanced functionality and performance of mathematical operations for users compiling with ICC. | |
- Improved reliability of numerical computations using AVX512 capabilities. | |
### Impact: | |
This update provides a more consistent and reliable usage of AVX512 math functions for users, particularly those utilizing the ICC compiler, thereby enhancing the overall performance of applications that rely on the Eigen library for mathematical computations." | |
833 (https://gitlab.com/libeigen/eigen/-/merge_requests/833),Fix 32-bit arm int issue.,"For (some?) 32-bit arm platforms, `int32_t` is actually of type `long int`, | |
not `int`. We have a couple places where we assume extracting a bit | |
pattern from a `float` is type `int`, when we should use `int32_t` | |
instead. The discrepancy in types causes some packet functions to | |
fail. | |
I'm not sure how prevalent this issue is, but these changes at least fix | |
it for #2412. | |
Fixes #2412.",Antonio Sánchez,2022-02-04T21:59:34.504Z,NA,NA,"## Title: | |
Fix 32-bit arm int issue. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses a type discrepancy in the Eigen C++ library on 32-bit ARM platforms, where `int32_t` can be `long int` rather than `int`. It corrects places that assumed the bit pattern extracted from a `float` has type `int` when `int32_t` should be used, which caused some packet functions to fail. | |
### Key Changes: | |
- Replaced instances of `int` with `int32_t` for type consistency when extracting bit patterns from `float`. | |
### Improvements: | |
- Ensures that the code properly handles types on 32-bit ARM platforms, potentially reducing error occurrences related to type mismatches. | |
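As a sketch of the pattern involved, the bit pattern of a `float` can be extracted into a fixed-width `int32_t` rather than an `int`. The function name below is hypothetical, and the code assumes `float` is a 32-bit IEEE 754 type; it is not taken from Eigen.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper (not Eigen's code): copy the bits of a float into an
// int32_t. Assumes float is a 32-bit IEEE 754 type.
inline std::int32_t float_bits_sketch(float f) {
  std::int32_t bits;
  std::memcpy(&bits, &f, sizeof(bits));  // well-defined, unlike pointer casts
  return bits;
}
```

Using `std::int32_t` keeps the width fixed regardless of which underlying type the platform picks for it, so overloads and templates keyed on that type resolve consistently.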
### Impact: | |
- Fixes the specific issue reported in #2412 and improves robustness for 32-bit ARM architecture compatibility." | |
840 (https://gitlab.com/libeigen/eigen/-/merge_requests/840),Correct use of EIGEN_CUDACC to respect EIGEN_NO_CUDA.,"Previously the mix of EIGEN_CUDACC and __CUDACC__ led to discrepancies | |
when `EIGEN_NO_CUDA` was defined. | |
Fixes #2290",Antonio Sánchez,2022-02-04T22:24:32.003Z,NA,NA,"## Title: | |
Correct use of EIGEN_CUDACC to respect EIGEN_NO_CUDA. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses inconsistencies in the Eigen library caused by the interplay between the `EIGEN_CUDACC` and `__CUDACC__` definitions when `EIGEN_NO_CUDA` is set, ensuring that CUDA-related code is properly excluded as intended. | |
### Key Changes: | |
- Adjusted the usage of `EIGEN_CUDACC` to align with the `EIGEN_NO_CUDA` flag, preventing unwanted compilation of CUDA code when it is disabled. | |
### Improvements: | |
- Enhances the reliability of the library by ensuring that CUDA features are only included when explicitly permitted, leading to clearer behavior during compilation. | |
### Impact: | |
- Prevents potential compilation errors and issues for users who do not wish to use CUDA, thereby improving the portability and flexibility of the Eigen library in diverse development environments." | |
838 (https://gitlab.com/libeigen/eigen/-/merge_requests/838),Define EIGEN_HAS_AVX512_MATH in PacketMath.,"Previously was used before it was defined, so defaulted to 0. This | |
fixes the order.",Antonio Sánchez,2022-02-04T22:25:52.813Z,NA,NA,"## Title: Define EIGEN_HAS_AVX512_MATH in PacketMath. | |
## Authors: Antonio Sánchez | |
## Summary: | |
This merge request fixes where `EIGEN_HAS_AVX512_MATH` is defined in the PacketMath component of the Eigen C++ library. The macro was previously used before it was defined, so it defaulted to 0. | |
### Key Changes: | |
- Defined `EIGEN_HAS_AVX512_MATH` in the correct order within PacketMath. | |
### Improvements: | |
- Ensures that the AVX512 math capabilities are properly recognized and utilized within the library. | |
### Impact: | |
- This change enhances the performance optimizations associated with AVX512 support, leading to potentially faster mathematical operations in the library when AVX512 features are available." | |
836 (https://gitlab.com/libeigen/eigen/-/merge_requests/836),Restrict GCC<6.3 maxpd workaround to only gcc.,"Previously this was applied to any gnuc-compatible compiler (e.g. clang). | |
Fixes #2332.",Antonio Sánchez,2022-02-04T22:47:35.454Z,NA,NA,"## Title: | |
Restrict GCC<6.3 maxpd workaround to only gcc. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request updates the Eigen C++ library by limiting the workaround for the `maxpd` issue to only GCC versions earlier than 6.3. The previous implementation applied the workaround to all GNU-compatible compilers, including Clang. | |
### Key Changes: | |
- The `maxpd` workaround is now restricted solely to GCC versions below 6.3. | |
### Improvements: | |
- More precise handling of compiler compatibility, reducing unintended side effects for Clang users. | |
### Impact: | |
- Keeps the workaround active for GCC versions below 6.3 while no longer applying it to other GNU-compatible compilers such as Clang." | |
841 (https://gitlab.com/libeigen/eigen/-/merge_requests/841),"Add generic fast psqrt and prsqrt impls and make them correct for 0, +Inf, NaN, and negative arguments.","1. Consolidate fast psqrt and prsqrt into generic implementations and avoid duplicating this code for SSE, AVX, and AVX512. TODO: Use these generic implementations for more architectures. | |
2. Make both fast psqrt and prsqrt correct for 0, Inf, NaN and negative arguments. These functions are now fully standard compliant, except that they treat positive subnormal input arguments as zeros. | |
The performance regressions associated with these changes are less than 5% measured for SSE+FMA, AVX, and AVX512 on Skylake.",Rasmus Munk Larsen,2022-02-05T00:20:14.109Z,NA,NA,"## Title: | |
Add generic fast psqrt and prsqrt impls and make them correct for 0, +Inf, NaN, and negative arguments. | |
## Authors: | |
Rasmus Munk Larsen | |
## Summary: | |
This merge request consolidates the implementations of fast psqrt and prsqrt into a generic form, ensuring correct behavior across various input values, including 0, +Inf, NaN, and negative numbers. The goal is to maintain standard compliance while enhancing the existing functionality. | |
### Key Changes: | |
- Merged fast psqrt and prsqrt into generic implementations to eliminate code duplication across architectures (SSE, AVX, AVX512). | |
- Enhanced both functions to handle special cases correctly: 0, Inf, NaN, and negative arguments. The functions now treat positive subnormal inputs as zeros. | |
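To see why special cases need attention at all, consider a scalar sketch that computes a square root as x times the reciprocal square root. This is an illustration under assumptions, not Eigen's packet implementation: the function names are hypothetical, and an exact `1 / sqrt(x)` stands in for the approximate hardware rsqrt.

```cpp
#include <cmath>
#include <limits>

// Naive scalar form: sqrt(x) computed as x * rsqrt(x).
// For x = 0 this is 0 * inf, and for x = +inf it is inf * 0; both are NaN.
inline float sqrt_via_rsqrt_naive(float x) {
  const float r = 1.0f / std::sqrt(x);  // stand-in for an approximate rsqrt
  return x * r;
}

// Hypothetical fix-up: pass 0 and +inf through unchanged. Negative and NaN
// inputs already produce NaN through the naive path.
inline float sqrt_via_rsqrt_fixed(float x) {
  const float inf = std::numeric_limits<float>::infinity();
  if (x == 0.0f || x == inf) {
    return x;
  }
  return sqrt_via_rsqrt_naive(x);
}
```

The sketch does not model the subnormal behaviour mentioned in the MR, where positive subnormal inputs are treated as zeros.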
### Improvements: | |
- The new implementations ensure standard compliance for edge cases. | |
- Future potential to use these generic implementations across more architectures. | |
### Impact: | |
- Performance regressions associated with the changes are minimal, measured at less than 5% for SSE+FMA, AVX, and AVX512 on Skylake, indicating that the improvements in correctness come with negligible performance cost." | |
844 (https://gitlab.com/libeigen/eigen/-/merge_requests/844),Update MPL2 with https.,Fixes #2433.,Antonio Sánchez,2022-02-07T17:51:16.891Z,NA,NA,"## Title: | |
Update MPL2 with https. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request updates the MPL2 license link to use HTTPS, addressing issue #2433. | |
### Key Changes: | |
- The MPL2 license URL has been changed from HTTP to HTTPS. | |
### Improvements: | |
- Enhances security by using a secure protocol for the license link. | |
### Impact: | |
- Ensures that users accessing the license are directed to a secure site, thereby improving trust and reliability in the documentation." | |
843 (https://gitlab.com/libeigen/eigen/-/merge_requests/843),Fix collision with resolve.h.,"`resolve.h` pollutes the global namespace with `_res`, so this clashes | |
with our local parameter name. | |
Rename locals `_*` to `*_` to address. | |
Fixes #2435",Antonio Sánchez,2022-02-07T18:17:43.329Z,NA,NA,"## Title: | |
Fix collision with resolve.h. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses a naming collision with `resolve.h`, which pollutes the global namespace with `_res` and thereby clashes with a local parameter name in the Eigen C++ library. | |
### Key Changes: | |
- Renamed local variables from `_*` to `*_` to avoid conflicts. | |
### Improvements: | |
- Eliminates global namespace pollution issues that could lead to ambiguous variable references. | |
### Impact: | |
- Enhances code clarity and stability by resolving potential conflicts in variable naming, thus preventing unexpected behavior in the library." | |
842 (https://gitlab.com/libeigen/eigen/-/merge_requests/842),Typo in COD's doc: matrixR() -> matrixT(),"### What does this implement/fix? | |
Small typo in method documentation of matrixT() | |
### Additional information | |
I think this is just a typo, or am I utterly confused? There's no matrixR() method on CompleteOrthogonalDecomposition, or should the user access m_cpqr.matrixR()?",Björn Dahlgren,2022-02-07T18:49:16.388Z,NA,NA,"## Title: | |
Typo in COD's doc: matrixR() -> matrixT() | |
## Authors: | |
Björn Dahlgren | |
## Summary: | |
This merge request addresses a small typo in the documentation of the `matrixT()` method within the Complete Orthogonal Decomposition (COD) class of the Eigen C++ library. | |
### Key Changes: | |
- Corrected the documentation reference from `matrixR()` to `matrixT()`. | |
### Improvements: | |
- Enhances clarity and accuracy in the documentation, ensuring users have the correct information regarding method references. | |
### Impact: | |
- Users referencing the documentation will find the correct method name, reducing confusion and potential errors in their implementation." | |
845 (https://gitlab.com/libeigen/eigen/-/merge_requests/845),Provide a definition for numeric_limits static data members,"Authored by [email protected] | |
Provide a definition for numeric_limits static data members | |
Eigen provides specializations which have static data member initializers. However, const non-inline static data members which are ODR used must have a definition at namespace scope. | |
We cannot use C++17 inline variables to solve this so we must come up with another, wordier, solution. | |
Move the implementation into a class template. | |
Provide a definition for the data members. Because they are templated, we are safe from ODR violations. | |
Make the std::numeric_limits specializations inherit from the helper class template. | |
Note that the class template’s template parameter is not meaningful, we only need it because we want to be able to have the static data member emitted into a COMDAT linker section.",Rasmus Munk Larsen,2022-02-08T20:34:53.804Z,NA,NA,"## Title: | |
Provide a definition for numeric_limits static data members | |
## Authors: | |
Rasmus Munk Larsen | |
## Summary: | |
This merge request addresses the issue of defining static data members for specialized numeric_limits in Eigen, ensuring compliance with the One Definition Rule (ODR) in C++. It involves a reorganization of the implementation into a class template that facilitates safe definition of these members. | |
### Key Changes: | |
- Implementation of numeric_limits static data members moved into a class template. | |
- Definitions provided for the static data members to avoid ODR violations. | |
- std::numeric_limits specializations are made to inherit from the helper class template. | |
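The technique can be sketched as follows for a made-up type. `HalfSketch` and `numeric_limits_half_sketch_impl` are hypothetical names, and the member values are illustrative; only the overall shape (helper class template, out-of-class definitions, inheriting specialization) follows the MR description.

```cpp
#include <limits>

struct HalfSketch { unsigned short bits; };  // hypothetical 16-bit type

// Helper class template. T carries no meaning; it exists only so that the
// static data members can be defined in a header without ODR violations.
template <typename T = void>
struct numeric_limits_half_sketch_impl {
  static const bool is_specialized = true;
  static const int digits = 11;
};

// Definitions at namespace scope, required once the members are ODR-used.
template <typename T>
const bool numeric_limits_half_sketch_impl<T>::is_specialized;
template <typename T>
const int numeric_limits_half_sketch_impl<T>::digits;

namespace std {
template <>
struct numeric_limits<HalfSketch> : numeric_limits_half_sketch_impl<> {};
}  // namespace std
```

Binding a reference, as in `const int& d = std::numeric_limits<HalfSketch>::digits;`, ODR-uses the member and would fail to link without the namespace-scope definition on compilers that do not treat the member as inline.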
### Improvements: | |
- Enhanced compliance with C++ standards, particularly regarding ODR. | |
- The new structure allows for better organization and future expansions of numeric_limits specializations. | |
### Impact: | |
This change ensures that static data members are correctly defined, preventing potential linkage errors and improving the stability and reliability of the Eigen library's numeric_limits functionality." | |
846 (https://gitlab.com/libeigen/eigen/-/merge_requests/846),Return alphas() and betas() by const reference,"### Reference issue | |
#2436 | |
### What does this implement/fix? | |
Returns a reference instead of copying vectors, to match the docs and reduce allocations.",Matt Keeter,2022-02-08T23:16:10.921Z,NA,NA,"## Title: | |
Return alphas() and betas() by const reference | |
## Authors: | |
Matt Keeter | |
## Summary: | |
This merge request introduces a change that allows the functions alphas() and betas() to return their vectors by const reference instead of creating copies. This adjustment aims to enhance performance and memory efficiency. | |
### Key Changes: | |
- Modified alphas() and betas() functions to return vectors by const reference. | |
### Improvements: | |
- Reduces unnecessary copying of vectors, aligning the implementation with the existing documentation. | |
- Minimizes memory allocations, potentially improving performance during vector access. | |
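The difference between the two return styles can be sketched as follows. `SolverSketch` and its members are hypothetical and only illustrate the general C++ idiom; they are not Eigen's solver classes.

```cpp
#include <cassert>
#include <vector>

// Hypothetical solver-like class; not Eigen's actual implementation.
class SolverSketch {
 public:
  SolverSketch() : m_alphas{1.0, 2.0, 3.0} {}

  // Returning by value copies the vector (and allocates) on every call.
  std::vector<double> alphas_by_value() const { return m_alphas; }

  // Returning by const reference exposes the stored vector without copying.
  const std::vector<double>& alphas() const { return m_alphas; }

 private:
  std::vector<double> m_alphas;
};

// True when two calls to alphas() refer to the same underlying storage.
inline bool alphas_shares_storage(const SolverSketch& s) {
  return s.alphas().data() == s.alphas().data();
}
```

The trade-off is lifetime: a caller that keeps the returned reference must not let it outlive the object it came from.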
### Impact: | |
The changes enhance the efficiency of the Eigen library, particularly in scenarios where alphas() and betas() are frequently accessed, leading to better performance and reduced resource usage." | |
847 (https://gitlab.com/libeigen/eigen/-/merge_requests/847),"Cleanup compiler warnings, etc from recent changes in GEMM & GEMV for PowerPC","Cleanup compiler warnings, etc from recent changes in GEMM & GEMV for PowerPC.",Chip Kerchner,2022-02-09T18:47:09.227Z,NA,NA,"## Title: | |
Cleanup compiler warnings, etc from recent changes in GEMM & GEMV for PowerPC | |
## Authors: | |
Chip Kerchner | |
## Summary: | |
This merge request focuses on addressing and cleaning up compiler warnings resulting from recent modifications in the GEMM (General Matrix Multiply) and GEMV (General Matrix-Vector Multiply) implementations for PowerPC architecture. | |
### Key Changes: | |
- Resolved various compiler warnings associated with recent changes in GEMM and GEMV functions. | |
### Improvements: | |
- Enhanced code quality by eliminating unnecessary warnings, making the codebase cleaner and easier to maintain. | |
### Impact: | |
- Improved compilation output by reducing warnings, which can lead to a more stable and reliable build process for users working with PowerPC." | |
849 (https://gitlab.com/libeigen/eigen/-/merge_requests/849),Complete doc with MatrixXNt and MatrixNXt,"The doc in TutorialMatrixClass.dox was missing two matrix patterns `MatrixXNt` and `MatrixNXt`. | |
Also adds some missing namespaces to TutorialLinAlgSVDSolve.cpp, whose absence blocked doc compilation.
Partially fix https://gitlab.com/libeigen/eigen/-/issues/2437",Florian Maurin,2022-02-11T21:55:54.927Z,NA,NA,"## Title: | |
Complete doc with MatrixXNt and MatrixNXt | |
## Authors: | |
Florian Maurin | |
## Summary: | |
This merge request enhances the documentation for the Eigen C++ library by adding details regarding the `MatrixXNt` and `MatrixNXt` matrix patterns. Additionally, it resolves issues in the documentation compilation related to missing namespaces in `TutorialLinAlgSVDSolve.cpp`. | |
### Key Changes: | |
- Added documentation for the `MatrixXNt` and `MatrixNXt` matrix patterns in `TutorialMatrixClass.dox`. | |
- Included missing namespaces in `TutorialLinAlgSVDSolve.cpp` to ensure successful documentation compilation. | |
### Improvements: | |
- Improved the completeness and accuracy of the Eigen library documentation, providing users with better guidance on these matrix patterns. | |
- Unblocked documentation compilation by adding the missing namespaces to the example code.
### Impact: | |
These changes facilitate a smoother user experience by providing clearer information on specific matrix types, thus aiding in the understanding and usage of the Eigen library. Additionally, fixing the documentation compilation issues helps maintain the integrity of the project’s documentation." | |
852 (https://gitlab.com/libeigen/eigen/-/merge_requests/852),Add convenience method `constexpr std::size_t size() const` to `Eigen::IndexList`,NA,Rasmus Munk Larsen,2022-02-12T04:23:03.915Z,NA,NA,"## Title: | |
Add convenience method `constexpr std::size_t size() const` to `Eigen::IndexList` | |
## Authors: | |
Rasmus Munk Larsen | |
## Summary: | |
This merge request introduces a new convenience method, `constexpr std::size_t size() const`, to the `Eigen::IndexList` class in the Eigen C++ library. | |
### Key Changes: | |
- Addition of a `size()` method to `Eigen::IndexList` that returns the size as a `constexpr` value. | |
### Improvements: | |
- Simplifies the process of retrieving the size of `IndexList`, enhancing usability. | |
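A compile-time `size()` of this kind can be sketched with a variadic template. `IndexListSketch`, `kList`, and `kListSize` are hypothetical names; Eigen's `IndexList` has considerably more machinery than shown here.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical variadic index list.
template <typename... Indices>
struct IndexListSketch {
  // The number of entries is known from the template arguments alone.
  constexpr std::size_t size() const { return sizeof...(Indices); }
};

// A constexpr object whose size() can be used in constant expressions.
constexpr IndexListSketch<int, int, long> kList{};

// Initializing a constexpr variable requires a constant expression, so this
// line only compiles because size() is constexpr.
constexpr std::size_t kListSize = kList.size();
```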
### Impact: | |
- This change improves the efficiency of working with `IndexList`, allowing compile-time size evaluation and potentially reducing runtime overhead in template metaprogramming scenarios." | |
853 (https://gitlab.com/libeigen/eigen/-/merge_requests/853),Fix ODR failures in TensorRandom.,NA,Antonio Sánchez,2022-02-12T15:15:11.523Z,NA,NA,"## Title: | |
Fix ODR failures in TensorRandom. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses and resolves issues related to One Definition Rule (ODR) failures in the TensorRandom component of the Eigen C++ library. | |
### Key Changes: | |
- Modifications were made to the TensorRandom code to ensure compliance with the One Definition Rule. | |
### Improvements: | |
- Enhanced code stability and reliability by eliminating potential ODR conflicts. | |
### Impact: | |
- The fix improves the overall robustness of the TensorRandom component, reducing the likelihood of errors during code linkage and ensuring more consistent behavior across different translation units." | |
855 (https://gitlab.com/libeigen/eigen/-/merge_requests/855),remove unused macros,"with the generic implementation of `prsqrt`, these macros are not used anywhere anymore.",Erik Schultheis,2022-02-14T10:34:26.315Z,NA,NA,"## Title: | |
Remove Unused Macros | |
## Authors: | |
Erik Schultheis | |
## Summary: | |
This merge request focuses on the removal of obsolete macros related to the `prsqrt` implementation in the Eigen C++ library. With the introduction of a more generic version of the `prsqrt` function, these macros are no longer necessary. | |
### Key Changes: | |
- Deleted unused macros associated with the prior implementation of `prsqrt`. | |
### Improvements: | |
- Streamlined the codebase by eliminating unused components, contributing to improved clarity and maintainability. | |
### Impact: | |
- Reduces clutter in the code, fostering better readability and potentially easing future development efforts within the Eigen library." | |
859 (https://gitlab.com/libeigen/eigen/-/merge_requests/859),Fix MSVC+NVCC 9.2 pragma error.,"For NVCC 9.2, we require MSVC 14.16 or earlier. Unfortunately, this does not seem to properly support `_Pragma`, despite claims from the [documentation](https://docs.microsoft.com/en-us/cpp/preprocessor/pragma-directives-and-the-pragma-keyword?view=msvc-140) that it does (or perhaps it's the interaction between MSVC and NVCC?). This does work with the microsoft-specific extension `__pragma` though.",Antonio Sánchez,2022-02-15T19:14:55.239Z,NA,NA,"## Title: | |
Fix MSVC+NVCC 9.2 pragma error. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request works around a compilation error with NVCC 9.2, which requires MSVC 14.16 or earlier. That compiler combination does not appear to handle the standard `_Pragma` operator correctly.
### Key Changes: | |
- Replaced the use of `_Pragma` with the Microsoft-specific extension `__pragma` to ensure proper functionality in NVCC 9.2. | |
### Improvements: | |
- Enhances compatibility between MSVC and NVCC, preventing compilation errors that arise from the previous pragma usage. | |
### Impact: | |
- This fix ensures that users of NVCC 9.2 and MSVC 14.16 can compile their code without encountering pragma-related errors, improving the usability of the Eigen library with these specific versions of the compilers." | |
858 (https://gitlab.com/libeigen/eigen/-/merge_requests/858),Fix sqrt/rsqrt for NEON.,"Newly added tests must check `HasSqrt`/`HasRsqrt` via `CHECK_CWISE1_IF` | |
to avoid compile errors. | |
Replaced NEON hand-written versions with new generic version that | |
correctly handles 0, `inf`.",Antonio Sánchez,2022-02-15T21:31:52.203Z,NA,NA,"## Title: | |
Fix sqrt/rsqrt for NEON. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses issues with the square root (`sqrt`) and reciprocal square root (`rsqrt`) functions for NEON in the Eigen C++ library. It introduces enhanced testing and improved implementations to ensure accuracy and compatibility. | |
### Key Changes: | |
- Newly added tests are now guarded by `HasSqrt`/`HasRsqrt` via `CHECK_CWISE1_IF`, which avoids compile errors for packet types that lack these operations.
- Replaced existing hand-written NEON implementations with a new generic version that correctly handles special cases, including 0 and infinity. | |
### Improvements: | |
- The `sqrt`/`rsqrt` tests now compile on targets whose packet types do not provide these operations.
- Enhanced the accuracy and reliability of the `sqrt` and `rsqrt` functions for NEON architecture. | |
### Impact: | |
These changes enhance the functionality and robustness of the Eigen library's mathematical operations on NEON, ensuring that edge cases are correctly handled and providing a more reliable experience for developers utilizing these functions." | |
850 (https://gitlab.com/libeigen/eigen/-/merge_requests/850),Add descriptions to Matrix typedefs.,"Otherwise, they don't show up in doxygen documentation. Fixes #2437.",Antonio Sánchez,2022-02-15T21:53:28.275Z,NA,NA,"## Title: | |
Add descriptions to Matrix typedefs. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request enhances the Doxygen documentation for the Eigen C++ library by adding descriptions to the Matrix typedefs. This improvement addresses the issue identified in the ticket #2437, where the lack of descriptions prevented proper visibility in the generated documentation. | |
### Key Changes: | |
- Added descriptive comments to Matrix typedefs to facilitate proper documentation generation. | |
### Improvements: | |
- Enhances clarity and usability of the Doxygen documentation for users and developers by providing additional context about the Matrix typedefs. | |
### Impact: | |
- Users of the Eigen library will benefit from improved understanding of the Matrix typedefs, leading to easier navigation of the documentation and better implementation of the library features." | |
857 (https://gitlab.com/libeigen/eigen/-/merge_requests/857),"Re-add `svd::compute(Matrix, options)` method to avoid breaking external projects.","Re-add `svd::compute(Matrix, options)` method to avoid breaking external projects. | |
There are too many other projects (open-source, and at Google) that rely | |
on the existing mechanism. For the open-source projects, we at least need a | |
version number increase to check for which API to use. We also likely | |
need some transition time to allow projects to adapt. | |
Adding the method back in seemed trivial. We may want to reconsider whether | |
we truly want to deprecate this behavior. If we do, we can probably | |
remove it after the next major release. | |
/cc @rmlarsen1 @arthurfeeney",Antonio Sánchez,2022-02-16T00:54:03.048Z,NA,NA,"## Title: | |
Re-add `svd::compute(Matrix, options)` method to avoid breaking external projects. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request reintroduces the `svd::compute(Matrix, options)` method in the Eigen C++ library to ensure compatibility with numerous existing external projects that depend on it. | |
### Key Changes: | |
- The `svd::compute(Matrix, options)` method has been reinstated. | |
### Improvements: | |
- Restoring this method alleviates potential disruptions for users, particularly open-source projects and internal Google projects relying on the existing API. | |
### Impact: | |
- The change helps maintain compatibility for current users and minimizes the risk of breaking dependent projects. There is a suggestion for a transition period to allow users to adapt if the method is to be deprecated in the future." | |
861 (https://gitlab.com/libeigen/eigen/-/merge_requests/861),"Make FixedInt constexpr, fix ODR of fix<N>","Related to #2392, the use of a static variable in a header file | |
leads to potential ODR violations. Removing `static` should resolve | |
this (without `inline`) since it is a variable template. Also made | |
this `constexpr`, which required making the `FixedInt` class | |
`constexpr`-compatible.",Antonio Sánchez,2022-02-16T17:47:52.616Z,NA,NA,"## Title: | |
Make FixedInt constexpr, fix ODR of fix<N> | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses the potential ODR (One Definition Rule) violations caused by the use of a static variable in a header file related to issue #2392. It removes the `static` keyword from the variable template, ensuring compliance without adding `inline`. Additionally, the `FixedInt` class has been made `constexpr`-compatible. | |
### Key Changes: | |
- Removed `static` from the variable template in the header file. | |
- Made the `fix<N>` variable template `constexpr`, which required making the `FixedInt` class `constexpr`-compatible.
### Improvements: | |
- Resolves potential ODR violations associated with static variables. | |
- Enhances `FixedInt` class functionality by allowing compile-time evaluation. | |
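The shape of this change can be sketched as a `constexpr` variable template declared without `static`. `FixedIntSketch`, `fix_sketch`, `kFour`, and `array_length` are hypothetical names used for illustration only and do not reproduce Eigen's `FixedInt`.

```cpp
#include <cassert>

// Hypothetical constexpr-compatible integer wrapper.
template <int N>
struct FixedIntSketch {
  constexpr FixedIntSketch() = default;
  constexpr operator int() const { return N; }
};

// Variable template declared constexpr and without `static` (requires C++14).
template <int N>
constexpr FixedIntSketch<N> fix_sketch{};

// The wrapped value is usable in constant expressions.
constexpr int kFour = fix_sketch<4>;

// Uses the compile-time value as an array bound.
inline int array_length() {
  int buffer[kFour] = {0, 0, 0, 0};
  return static_cast<int>(sizeof(buffer) / sizeof(buffer[0]));
}
```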
### Impact: | |
These changes remove a potential ODR violation and allow `fix<N>` and `FixedInt` to be used in `constexpr` contexts."
862 (https://gitlab.com/libeigen/eigen/-/merge_requests/862),Use fixed-sized U/V for fixed-sized inputs.,"The change in !826 was breaking (it made U/V dynamic by default). | |
Since we don't allow thin U/V for fixed-sized matrices and default
options anyway, there is no need for dynamic sizes.
This restores the original U/V sizes.",Antonio Sánchez,2022-02-16T18:31:48.531Z,NA,NA,"## Title: | |
Use fixed-sized U/V for fixed-sized inputs. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request restores the original fixed sizes for U and V matrices in the Eigen library, addressing an issue introduced in a previous change that made them dynamic by default. | |
### Key Changes: | |
- Restores U/V sizes to fixed for fixed-sized matrices. | |
- Reverts the dynamic sizing introduced in a prior merge request. | |
### Improvements: | |
- Eliminates unnecessary complexity by ensuring U/V matrices maintain fixed sizes, which aligns with the library's design principles for fixed-sized matrices. | |
### Impact: | |
- Enhances stability and predictability in the behavior of U/V matrices for fixed-sized inputs, ensuring consistent performance in related operations." | |
866 (https://gitlab.com/libeigen/eigen/-/merge_requests/866),Fix for crash bug in SPQRSupport: Initialize pointers to nullptr to avoid free() calls of invalid pointers.,"### Reference issue
No issue filed, should I add one? | |
### What does this implement/fix? | |
If SuiteSparseQR fails, `m_isInitialized` is set to false, but some pointers stay uninitialized. Then in the destructor, they are still free'd, causing a crash. | |
Initializing the pointers to nullptr fixes the issue, as both `std::free` and `cholmod_l_free_*` are no-op for null pointers. | |
### Additional information | |
N/A",Martin Heistermann,2022-02-18T16:13:29.299Z,NA,NA,"## Title: | |
Fix for crash bug in SPQRSupport: Initialize pointers to nullptr to avoid free() calls of invalid pointers. | |
## Authors: | |
Martin Heistermann | |
## Summary: | |
This merge request fixes a crash in the SPQRSupport component of the Eigen C++ library. When SuiteSparseQR fails, `m_isInitialized` is set to false but some pointers are left uninitialized, and the destructor still frees them, which causes the crash.
### Key Changes: | |
- Pointers that can be uninitialized after a failure in SuiteSparseQR are now initialized to `nullptr`. | |
### Improvements: | |
- By setting pointers to `nullptr`, the code prevents invalid memory access during the destructor, as both `std::free` and `cholmod_l_free_*` functions safely handle null pointers without causing crashes. | |
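The fix relies on the guarantee that `std::free(nullptr)` does nothing. A minimal sketch of the pattern, using a hypothetical `QrWorkspaceSketch` class with plain `malloc`/`free` instead of the real SuiteSparse allocation routines:

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical wrapper that owns two C-style buffers.
struct QrWorkspaceSketch {
  // Default member initializers give a well-defined state even if
  // factorize() fails before allocating anything.
  double* m_factor = nullptr;
  long* m_permutation = nullptr;
  bool m_isInitialized = false;

  QrWorkspaceSketch() = default;
  QrWorkspaceSketch(const QrWorkspaceSketch&) = delete;
  QrWorkspaceSketch& operator=(const QrWorkspaceSketch&) = delete;

  // Frees both buffers; safe to call in any state.
  void release() {
    std::free(m_factor);       // no-op when m_factor is nullptr
    std::free(m_permutation);  // no-op when m_permutation is nullptr
    m_factor = nullptr;
    m_permutation = nullptr;
    m_isInitialized = false;
  }

  // Simulates a factorization that may fail before allocating.
  void factorize(bool succeed) {
    release();
    if (!succeed) return;  // pointers remain nullptr
    m_factor = static_cast<double*>(std::malloc(4 * sizeof(double)));
    m_permutation = static_cast<long*>(std::malloc(4 * sizeof(long)));
    m_isInitialized = (m_factor != nullptr) && (m_permutation != nullptr);
  }

  ~QrWorkspaceSketch() { release(); }
};
```

Without the `= nullptr` initializers, destroying an object whose `factorize` call failed would pass indeterminate pointers to `std::free`, which is undefined behavior.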
### Impact: | |
This fix enhances the stability and reliability of the SPQRSupport component, ensuring that it will not crash due to uninitialized pointer handling, thereby improving the overall robustness of the Eigen library." | |
869 (https://gitlab.com/libeigen/eigen/-/merge_requests/869),[SYCL] Fix CMake for SYCL support,"### What does this implement/fix? | |
These fixes are needed to compile and run Eigen tests with SYCL enabled. This only needs a simple CMake command `cmake -Bbuild -DEIGEN_TEST_SYCL=ON`. The SYCL tests can be run with `./check.sh sycl` from the build folder. | |
* There is no need to force a C++ version when compiling Eigen with SYCL | |
anymore. Forcing C++11 would disable some Eigen features. | |
* Remove a CMake workaround that is not needed anymore since the | |
compiler definitions are now defined in the `COMPILE_DEFINITIONS` property instead of | |
`COMPILE_FLAGS`. | |
* Sigmoid tests are temporarily disabled. SYCL needs to support more | |
functions to handle the edge cases of Sigmoid.",Romain Biessy,2022-02-22T16:53:28.242Z,NA,NA,"## Title: | |
[SYCL] Fix CMake for SYCL support | |
## Authors: | |
Romain Biessy | |
## Summary: | |
This merge request implements crucial fixes for compiling and running Eigen tests with SYCL support. It streamlines the configuration process through an updated CMake command and makes adjustments that enhance compatibility with various C++ versions. | |
### Key Changes: | |
- Simplified CMake command to enable SYCL tests: `cmake -Bbuild -DEIGEN_TEST_SYCL=ON`. | |
- SYCL tests can now be executed with `./check.sh sycl`. | |
- Removed the requirement to force a specific C++ version when compiling, allowing for broader feature access in Eigen. | |
- Eliminated an outdated CMake workaround as compiler definitions are now correctly assigned in the `COMPILE_DEFINITIONS` property. | |
### Improvements: | |
- Enhanced user experience by simplifying the build setup for SYCL. | |
- Increased compatibility with different C++ versions, which allows developers to utilize more Eigen features without version limitations. | |
### Impact: | |
These changes improve the flexibility and usability of Eigen when working with SYCL, making it easier for developers to integrate and test SYCL functionalities in their applications. The temporary disabling of Sigmoid tests highlights ongoing efforts to enhance SYCL support further." | |
870 (https://gitlab.com/libeigen/eigen/-/merge_requests/870),Fix test macro conflicts with STL headers in C++20,"Fix test build with GCC 9, 10, 11 in C++20.",Lingzhu Xiang,2022-02-23T05:14:33.171Z,NA,NA,"## Title: | |
Fix test macro conflicts with STL headers in C++20 | |
## Authors: | |
Lingzhu Xiang | |
## Summary: | |
This merge request addresses issues with test builds of the Eigen C++ library when using GCC versions 9, 10, and 11 under the C++20 standard. The primary focus is on resolving macro conflicts that arise from interactions with STL headers. | |
### Key Changes: | |
- Adjustments to test macros to prevent conflicts with standard library headers in C++20. | |
### Improvements: | |
- Enhanced compatibility of the Eigen library's test suite with current GCC versions and the C++20 standard. | |
### Impact: | |
- This fix ensures that users can successfully build and run tests of the Eigen library when using the specified GCC versions with C++20, improving reliability and developer experience." | |
865 (https://gitlab.com/libeigen/eigen/-/merge_requests/865),Add assert for edge case if Thin U Requested at runtime,"### Reference issue
Discussion in !862 | |
(Ignore the stuff about changing workspace sizes. That's not necessary since this case doesn't work anyway.) | |
### What does this implement/fix? | |
There's an old issue when requesting thin unitaries at runtime: | |
``` | |
#include <Eigen/Dense>

int main() {
  using namespace Eigen;
  using MT = Matrix<double, 5, Dynamic>;
  MT m = MT::Random(5, 4);  // fixed rows > dynamic cols
  // Fails to resize because U is 5x5 at compile time but needs to be 5x4.
  BDCSVD<MT> svd1(m, ComputeThinU | ComputeThinV);
}
``` | |
I don't think this can be ""fixed,"" so this adds an assert for this case and updates tests to look for the assert. | |
This shouldn't change any behavior, basically just a fix for the tests and to improve the failure message.",Arthur,2022-02-23T05:35:20.399Z,NA,NA,"## Title: | |
Add assert for edge case if Thin U Requested at runtime | |
## Authors: | |
Arthur | |
## Summary: | |
This merge request adds an assertion for an edge case when thin unitaries are requested at runtime. The problem occurs for matrix types with a fixed number of rows and a dynamic number of columns when the fixed dimension exceeds the runtime one (for example, 5 rows and 4 columns): U is 5x5 at compile time but would need to be 5x4.
### Key Changes: | |
- Added an assertion for cases where thin unitaries are requested with incompatible matrix dimensions. | |
- Updated related tests to ensure they now check for the newly introduced assert. | |
### Improvements: | |
- Enhances error messaging, providing clearer feedback when the edge case occurs. | |
- Improves test coverage related to the handling of this specific computational scenario. | |
### Impact: | |
This change does not alter existing behavior but improves the failure diagnostics and ensures that users are made aware when they attempt to compute incompatible thin unitaries. The overall robustness of the code is enhanced through better validation of input parameters." | |
863 (https://gitlab.com/libeigen/eigen/-/merge_requests/863),Modify test expression to avoid numerical differences (#2402).,"It looks like when comparing slice to block evaluation and aggressive | |
optimizations (`-O3`), we can sometimes get slightly different numerical | |
results due to operation fusing (e.g. fma). | |
Here we simply modify the reference expression to avoid this. | |
Fixes #2402.",Antonio Sánchez,2022-02-23T16:37:04.053Z,NA,NA,"## Title: | |
Modify test expression to avoid numerical differences (#2402) | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses small numerical discrepancies that can appear in the tests when slice evaluation is compared to block evaluation under aggressive optimization (`-O3`).
### Key Changes: | |
- Modified the reference expression in tests to prevent slight numerical differences caused by operation fusing, such as fused multiply-add (fma). | |
### Improvements: | |
- Ensures more consistent and reliable numerical results across different optimization settings. | |
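How operation fusing changes a result can be shown with a small sketch. It assumes IEEE 754 double precision with round-to-nearest-even; the function names and the input values are chosen for illustration and are not taken from the Eigen test.

```cpp
#include <cassert>
#include <cmath>

// Computes a * b + c with the product rounded before the addition.
// The volatile temporary keeps the compiler from fusing the two operations.
inline double unfused_mul_add(double a, double b, double c) {
  volatile double product = a * b;
  return product + c;
}

// Computes a * b + c with a single rounding at the end.
inline double fused_mul_add(double a, double b, double c) {
  return std::fma(a, b, c);
}

// Inputs chosen so that the exact product, 1 - 2^-54, is not representable
// as a double and rounds to 1.0 when computed on its own.
inline double example_a() { return 1.0 + std::ldexp(1.0, -27); }
inline double example_b() { return 1.0 - std::ldexp(1.0, -27); }
```

With `c = -1.0`, the unfused version returns exactly zero while the fused version returns `-2^-54`, so two mathematically identical expressions can disagree depending on whether the compiler emits an fma.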
### Impact: | |
- Enhances the accuracy of tests, leading to improved validation of library functionality under various optimization flags." | |
868 (https://gitlab.com/libeigen/eigen/-/merge_requests/868),Changes to fast SQRT/RSQRT,"1. x86 processors from Skylake and Zen2 onwards have significantly higher throughput square root units. Therefore, as determined by our benchmarking, it is counter-productive to use Newton-Raphson iteration for SQRT if only SSE or AVX is available and proper handling of corner cases is required. Therefore, this change removes the corresponding specializations of internal::psqrt. Newton-Raphson is still a win for AVX512 for SQRT and for SSE/AVX/AVX512 for RSQRT. | |
2. Add a function for testing packet math functions on IEEE special values {+,-} x {denorm_min, min, 0, inf, NaN}, and fix the generic SQRT/RSQRT implementations to pass this test. If EIGEN_FAST_MATH is 1 we relax the test in subnormal inputs by allowing the function to return the same as the reference with the inputs flushed to zero with the same sign.",Rasmus Munk Larsen,2022-02-23T17:32:23.229Z,NA,NA,"## Title: | |
Changes to fast SQRT/RSQRT | |
## Authors: | |
Rasmus Munk Larsen | |
## Summary: | |
This merge request introduces optimizations to the SQRT and RSQRT functions within the Eigen C++ library, particularly focused on enhancing performance on x86 processors from Skylake and Zen2 onwards. | |
### Key Changes: | |
- Removal of Newton-Raphson iteration for SQRT on SSE and AVX, optimizing performance by leveraging improved throughput of square root units in modern processors. | |
- Introduction of a testing function for packet math on IEEE special values, ensuring robust handling of corner cases in SQRT and RSQRT implementations. | |
### Improvements: | |
- The generic SQRT/RSQRT implementations have been updated to pass the newly introduced test for special values. | |
- When `EIGEN_FAST_MATH` is 1, the test is relaxed for subnormal inputs: a function may return what the reference returns for the input flushed to zero with the same sign.
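For context, a scalar sketch of the Newton-Raphson refinement for the reciprocal square root is given below. It is a textbook formulation with hypothetical function names, not Eigen's vectorized implementation, which starts from a hardware estimate and handles special values explicitly.

```cpp
#include <cassert>
#include <cmath>

// One Newton-Raphson step for y ~ 1/sqrt(x): y <- y * (1.5 - 0.5 * x * y * y).
inline double rsqrt_newton_step(double x, double y) {
  return y * (1.5 - 0.5 * x * y * y);
}

// Refines an initial estimate with a fixed number of steps.
inline double rsqrt_refine(double x, double estimate, int steps) {
  double y = estimate;
  for (int i = 0; i < steps; ++i) y = rsqrt_newton_step(x, y);
  return y;
}
```

Applied naively to `x = 0` with an infinite estimate, the step evaluates `0 * inf` and yields NaN, which is the kind of corner case that needs separate handling.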
### Impact: | |
These changes are expected to enhance performance significantly for square root calculations on the specified modern processors, while also improving accuracy and handling of edge cases in computations involving special floating-point values." | |
873 (https://gitlab.com/libeigen/eigen/-/merge_requests/873),Disable deprecated warnings in SVD tests.,"Our tests currently purposely check assertions and consistent behavior | |
of legacy SVD methods that accept runtime options. Now that we have | |
marked them deprecated, GCC/clang spew out many | |
`-Wdeprecated-declarations` warnings. Here we disable them for our tests
to remove the crud from the build logs.",Antonio Sánchez,2022-02-23T18:32:01.085Z,NA,NA,"## Title: | |
Disable deprecated warnings in SVD tests. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request focuses on suppressing deprecated warnings in the SVD tests of the Eigen C++ library, which arise from the legacy SVD methods that have been marked as deprecated. | |
### Key Changes: | |
- Disabled `-Wdeprecated-declarations` warnings for the SVD tests to clean up build logs.
### Improvements: | |
- Enhanced clarity and readability of build logs by removing unnecessary warning messages related to deprecated methods. | |
### Impact: | |
- Streamlined testing process by reducing noise in logs, allowing developers to focus on relevant issues without distraction from deprecated warnings." | |
874 (https://gitlab.com/libeigen/eigen/-/merge_requests/874),Fix gcc-5 packetmath_12 bug.,"There seems to be a gcc-5 bug that's causing `data1` within | |
`packetmath_minus_zero_add()` to be filled with | |
``` | |
(-0, 0), (-0, NaN) | |
``` | |
when optimizations `-O2` or higher are on. This is before any packet | |
operations are called. Printing the values causes the bug to disappear. I've | |
double-checked we're not running into any aliasing bugs (the cast from | |
`std::complex<double>*` to `double*` is legal, and using c++ casts | |
doesn't fix the issue). Compiling with `-fno-strict-aliasing` does *not* | |
solve the issue, so seems to be related to something else. The test works | |
with gcc-6 and later, and all other compilers/versions. | |
Initializing the memory to zeroes causes the bug to disappear.",Antonio Sánchez,2022-02-23T21:56:26.227Z,NA,NA,"## Title: | |
Fix gcc-5 packetmath_12 bug. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request works around what appears to be a gcc-5 compiler bug in the packetmath tests. With optimization level `-O2` or higher, `data1` in `packetmath_minus_zero_add()` is filled with unexpected values such as `(-0, 0), (-0, NaN)` before any packet operations are called. Initializing the memory to zeros makes the problem disappear.
### Key Changes: | |
- Identified a bug in gcc-5 affecting the initialization of `data1` in `packetmath_minus_zero_add()`. | |
- Implemented a fix by initializing memory to zeroes, which resolves the problem at hand. | |
### Improvements: | |
- Enhances compatibility with gcc-5, ensuring accurate behavior in memory initialization. | |
### Impact: | |
- The packetmath test now passes with gcc-5. The test already worked with gcc-6 and later and with all other compilers and versions."
875 (https://gitlab.com/libeigen/eigen/-/merge_requests/875),Fix packetmath compilation error.,"Unfortunately we can't pass a pointer to `psqrt<Packet>` as a functor, since such a | |
function might not exist (e.g. if `HasSqrt` is `false`). The only way to pass an overloaded function | |
as a functor is to wrap it in a struct. Here we create a simple wrapper | |
around `psqrt`, `prsqrt`.",Antonio Sánchez,2022-02-23T23:27:08.997Z,NA,NA,"## Title: | |
Fix packetmath compilation error. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request fixes a compilation error in the packetmath tests. A pointer to `psqrt<Packet>` cannot be passed as a functor because such a function might not exist, for example when `HasSqrt` is `false`.
### Key Changes: | |
- Introduced simple wrapper structs around `psqrt` and `prsqrt` so that these overloaded functions can be passed as functors.
### Improvements: | |
- `psqrt` and `prsqrt` can now be passed to generic test helpers as functors, regardless of which overloads exist for a given packet type.
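The wrapper idiom can be sketched with a small overload set. The names `twice`, `twice_functor`, and `apply_functor` are hypothetical and only stand in for the packet-math functions and test helpers involved.

```cpp
#include <cassert>

// Hypothetical overload set standing in for a packet-math function.
inline int twice(int x) { return 2 * x; }
inline double twice(double x) { return 2.0 * x; }

// Wrapper struct: overload resolution is deferred until operator() is
// instantiated, so naming the wrapper never requires a specific overload
// to exist.
struct twice_functor {
  template <typename T>
  T operator()(const T& x) const {
    return twice(x);
  }
};

// Generic helper that accepts any callable.
template <typename F, typename T>
T apply_functor(const F& f, const T& x) {
  return f(x);
}
```

Passing `twice` itself to `apply_functor` would not compile, because the name refers to a whole overload set and the template parameter `F` cannot be deduced from it.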
### Impact: | |
This change improves the robustness of the packet math operations by allowing consistent usage of the `psqrt` function in scenarios where `HasSqrt` may be `false`, ultimately enhancing the functionality and stability of the Eigen library." | |
877 (https://gitlab.com/libeigen/eigen/-/merge_requests/877),Disable deprecated warnings for SVD tests on MSVC.,"We are currently purposely testing the deprecated behavior, and the | |
warnings are cluttering up the build log.",Antonio Sánchez,2022-02-24T21:20:50.841Z,NA,NA,"## Title: | |
Disable deprecated warnings for SVD tests on MSVC. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request focuses on disabling deprecated warnings in the Singular Value Decomposition (SVD) tests for the MSVC compiler. The goal is to keep the build log clean since these warnings are intentionally being generated to test deprecated behavior. | |
### Key Changes: | |
- Deprecated warnings for SVD tests on MSVC have been disabled. | |
### Improvements: | |
- Cleaner build logs by eliminating unnecessary warning messages related to deprecated features. | |
### Impact: | |
- Enhances readability of the build output, making it easier for developers to focus on relevant messages while still testing deprecated behavior effectively." | |
878 (https://gitlab.com/libeigen/eigen/-/merge_requests/878),Fix frexp packetmath tests for MSVC.,"Silly MSVC, `frexp` sets the exponent to 1 for non-finite inputs when the docs say the output | |
is ""unspecified"". Our tests assume it remains zero, so we need to reset it for our reference value.",Antonio Sánchez,2022-02-24T22:16:38.443Z,NA,NA,"## Title: | |
Fix frexp packetmath tests for MSVC. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses an issue with the `frexp` function in the Eigen C++ library, specifically for MSVC. The existing tests incorrectly assumed that the exponent for non-finite inputs remained zero, contrary to the MSVC implementation which sets it to one. | |
### Key Changes: | |
- Adjusted the tests for the `frexp` function to correctly handle cases where the input is non-finite, ensuring the exponent is reset to the expected reference value. | |
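A reference helper of the kind described could look like the sketch below. The name `ref_frexp` is hypothetical and this is not the code from the test suite. It only shows the idea of forcing the exponent to zero for non-finite inputs, where the standard leaves the stored value unspecified.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical reference helper: behaves like std::frexp, but forces the
// exponent to zero for non-finite inputs (infinity or NaN), where the
// value written by std::frexp is unspecified and varies by platform.
template <typename T>
T ref_frexp(T x, int& exponent) {
  exponent = 0;
  T mantissa = std::frexp(x, &exponent);
  if (!std::isfinite(x)) {
    exponent = 0;  // reset whatever the platform wrote
  }
  return mantissa;
}
```

For finite inputs the helper is a plain pass-through, so the comparison against the packet implementation only changes for infinity and NaN.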
### Improvements: | |
- Enhanced test accuracy for the `frexp` function under MSVC, aligning the behavior of tests with the actual implementation of the function. | |
### Impact: | |
- This change prevents incorrect test failures when running under MSVC, improving the reliability and robustness of the test suite for the Eigen library." | |
876 (https://gitlab.com/libeigen/eigen/-/merge_requests/876),Fix mixingtypes for g++-11.,"The `_mm512_broadcast_f64x2` instruction is no faster than the | |
`_mm512_broadcast_f32x4` instruction, and is warned about in [intel's | |
documentation](https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html#text=_mm512_broadcast_f64x2&ig_expand=554). There's also a funny interaction within our GEBP kernels with g++-11 that | |
causes the real component to be duplicated to the imaginary component | |
when optimizations are turned on (-O1 or higher). Removing this | |
instruction fixes the issue. | |
This fixes both `mixingtypes_6` and `mixingtypes_7` tests for AVX512.",Antonio Sánchez,2022-02-25T19:28:11.129Z,NA,NA,"## Title: | |
Fix mixingtypes for g++-11 | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses an issue with the `_mm512_broadcast_f64x2` instruction in the Eigen C++ library, particularly in relation to g++-11 optimization behavior. | |
### Key Changes: | |
- Removed the `_mm512_broadcast_f64x2` instruction due to its performance parity with `_mm512_broadcast_f32x4` and a warning in Intel's documentation. | |
- Resolved an issue where the real component was erroneously duplicated to the imaginary component during GEBP kernel operations with optimization levels -O1 or higher. | |
### Improvements: | |
- The `mixingtypes_6` and `mixingtypes_7` tests now pass for AVX512. | |
### Impact: | |
This update fixes incorrect complex results in the AVX512 GEBP kernels when compiling with g++-11 at `-O1` or higher. No slowdown is expected, since the removed instruction is no faster than `_mm512_broadcast_f32x4`." | |
880 (https://gitlab.com/libeigen/eigen/-/merge_requests/880),Fix SVD for MSVC.,"There's an odd bug for MSVC where the `Options` template parameter seems | |
to be forgotten (is always treated as zero) unless we store it within | |
the class as an `enum` or `static constexpr` member. This was causing | |
the `ShouldComputeThinU`, ... members to all always evaluate to `false`, | |
breaking all the fixed-sized SVDs. This was happening with all versions | |
of msvc tested. Storing it and using the stored version fixes this. | |
Also fixed some warnings on MSVC, and temporarily disabled the | |
`EIGEN_DEPRECATED` attribute on one constructor until I get a chance to | |
fix another library that uses it and has `-Werror=deprecated-declarations` | |
enabled by default.",Antonio Sánchez,2022-02-28T19:53:15.957Z,NA,NA,"## Title: | |
Fix SVD for MSVC | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses a critical bug affecting the Singular Value Decomposition (SVD) functionality in the Eigen C++ library when compiled with Microsoft Visual Studio (MSVC). The issue was due to the `Options` template parameter not being properly recognized, leading to incorrect evaluations of SVD configurations. | |
### Key Changes: | |
- Modified the handling of the `Options` template parameter to ensure it is stored as an `enum` or `static constexpr` member within the class. | |
- Resolved various warnings in MSVC. | |
- Temporarily disabled the `EIGEN_DEPRECATED` attribute on one constructor until another library that uses it, and that enables `-Werror=deprecated-declarations` by default, can be fixed. | |
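The workaround of storing the template parameter inside the class can be sketched as below. All names and flag values here (`kThinU`, `kThinV`, `svd_sketch`) are hypothetical and chosen for illustration. Only the member name `ShouldComputeThinU` comes from the merge request, and the real Eigen class is considerably more involved.

```cpp
#include <cassert>

// Hypothetical option flags, for illustration only (not Eigen's values).
enum : int { kThinU = 0x1, kThinV = 0x2 };

template <int Options_>
struct svd_sketch {
  // Store the template parameter as a class member and refer to the
  // stored copy everywhere else, instead of to the raw parameter.
  static constexpr int Options = Options_;
  static constexpr bool ShouldComputeThinU = (Options & kThinU) != 0;
  static constexpr bool ShouldComputeThinV = (Options & kThinV) != 0;
};
```

With the stored copy, `svd_sketch<kThinU>::ShouldComputeThinU` evaluates to `true` and `svd_sketch<0>::ShouldComputeThinU` to `false` on a conforming compiler. The point of the pattern is to make the flags depend on a class member rather than directly on the template parameter.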
### Improvements: | |
- Fixes the evaluation logic for `ShouldComputeThinU` and related members, restoring the functionality of fixed-sized SVDs in all MSVC versions tested. | |
- Enhanced code compatibility and reduced warning clutter when compiling with MSVC. | |
### Impact: | |
This update ensures that users compiling with MSVC can reliably utilize the SVD functionality, improving overall stability and usability of the Eigen library in MSVC environments." | |
879 (https://gitlab.com/libeigen/eigen/-/merge_requests/879),Fix any/all reduction in the case of row-major layout,Fix any/all reduction inefficiency in the case of row-major layout.,Yury Gitman,2022-03-01T05:27:51.110Z,NA,NA,"## Title: | |
Fix any/all reduction in the case of row-major layout | |
## Authors: | |
Yury Gitman | |
## Summary: | |
This merge request addresses inefficiencies in the any/all reduction operations for row-major layout in the Eigen C++ library. | |
### Key Changes: | |
- Improved the implementation of any/all reduction functions specifically for row-major matrix layouts. | |
### Improvements: | |
- Enhanced performance of reduction operations, leading to more efficient handling of row-major layouts. | |
### Impact: | |
- Users working with row-major matrices will experience improved efficiency in reduction operations, resulting in better overall performance for applications relying on these functionalities." | |
882 (https://gitlab.com/libeigen/eigen/-/merge_requests/882),Fix SVD for MSVC+CUDA.,"CUDA gets confused about the definition of `Index`, and can't match the | |
out-of-line definitions of a few functions with their declarations. | |
If we just import the `SVDBase::Index` definition, then gcc/clang get confused | |
about the out-of-line definition. To fix all, we import | |
`SVDBase::Index`, and modify all definitions to use the internal `Index` | |
type. | |
Also addressed annoying warnings about not returning anything at the end | |
of a non-void function.",Antonio Sánchez,2022-03-01T21:35:23.386Z,NA,NA,"## Title: | |
Fix SVD for MSVC+CUDA. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request resolves compatibility issues with the Singular Value Decomposition (SVD) implementation in the Eigen C++ library when using Microsoft Visual Studio (MSVC) in conjunction with CUDA. It focuses on the `Index` type discrepancies and addresses function return warnings. | |
### Key Changes: | |
- Imported `SVDBase::Index` definition to prevent mismatches in function declarations. | |
- Modified all affected function definitions to consistently use the internal `Index` type. | |
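The pattern of importing the base class's `Index` and using it in out-of-line definitions can be sketched as follows. The class names `base_sketch` and `solver_sketch` are hypothetical, and the example assumes a simple CRTP layout. It is not the actual SVD code.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical CRTP base that defines the Index type.
template <typename Derived>
struct base_sketch {
  typedef std::ptrdiff_t Index;
};

template <typename Scalar>
class solver_sketch : public base_sketch<solver_sketch<Scalar> > {
 public:
  typedef base_sketch<solver_sketch<Scalar> > Base;
  // Import the base class's Index so the class has its own internal name.
  typedef typename Base::Index Index;

  explicit solver_sketch(Index n) : m_size(n) {}
  Index size() const;  // declared here, defined out of line below

 private:
  Index m_size;
};

// The out-of-line definition spells the return type through the class's
// own Index, so declaration and definition name exactly the same type.
template <typename Scalar>
typename solver_sketch<Scalar>::Index solver_sketch<Scalar>::size() const {
  return m_size;
}
```

Spelling the type the same way in both places gives every compiler a single, unambiguous name to match the definition against its declaration.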
### Improvements: | |
- Eliminated compiler confusion regarding function declarations vs. definitions, enhancing compatibility with both MSVC and CUDA. | |
- Resolved warnings related to non-void functions not returning a value. | |
### Impact: | |
These changes improve the usability of the Eigen library on systems utilizing MSVC and CUDA, ensuring smoother compilation and execution while adhering to coding standards." | |
883 (https://gitlab.com/libeigen/eigen/-/merge_requests/883),Adjust tolerance of matrix_power test for MSVC.,Was failing pretty consistently for MSVC 19.16.,Antonio Sánchez,2022-03-01T23:34:00.195Z,NA,NA,"## Title: | |
Adjust tolerance of matrix_power test for MSVC. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses a consistency issue with the `matrix_power` test specifically for MSVC version 19.16. The tolerance levels for the test were adjusted to prevent frequent failures. | |
### Key Changes: | |
- Modified tolerance settings for the `matrix_power` test to enhance compatibility with MSVC 19.16. | |
### Improvements: | |
- Reduced the frequency of test failures in the MSVC environment, ensuring more reliable test results. | |
### Impact: | |
- Enhances the stability of the testing process in MSVC, leading to a more robust development cycle for the Eigen library." | |
872 (https://gitlab.com/libeigen/eigen/-/merge_requests/872),Modified sqrt/rsqrt for denormal handling.,"This updates the new generic sqrt/rsqrt implementation after !868 | |
to account for the following: | |
- Better handling of `std::numeric_limits<T>::denorm_min()` (the | |
original incorrectly returns `NaN` for AVX512) | |
- Better handling of denormals in general (will often give correct | |
answers rather than flushing to 0/`inf`) | |
- Faster `sqrt` and `rsqrt` for AVX512 (but slightly slower rsqrt for | |
SSE, AVX had no change) | |
Google benchmark numbers (only significant changes shown): | |
``` | |
Comparing ./sqrt_old_sse4.2 to ./sqrt_new_sse4.2 | |
Benchmark Time CPU Time Old Time New CPU Old CPU New | |
---------------------------------------------------------------------------------------------------------------------- | |
BM_Rsqrt<float>/8/1 +0.1165 +0.1165 5 5 5 5 | |
BM_Rsqrt<float>/64/1 +0.1355 +0.1355 25 28 25 28 | |
BM_Rsqrt<float>/512/1 +0.1340 +0.1340 195 221 195 221 | |
BM_Rsqrt<float>/2048/1 +0.0715 +0.0714 1016 1089 1016 1089 | |
Comparing ./sqrt_old_avx512dq to ./sqrt_new_avx512dq | |
Benchmark Time CPU Time Old Time New CPU Old CPU New | |
---------------------------------------------------------------------------------------------------------------------- | |
BM_Sqrt<float>/8/1 -0.0226 -0.0226 9 8 9 8 | |
BM_Sqrt<float>/64/1 -0.3050 -0.3050 14 9 14 9 | |
BM_Sqrt<float>/512/1 -0.3282 -0.3282 104 70 104 70 | |
BM_Sqrt<float>/2048/1 -0.2790 -0.2790 469 338 469 338 | |
BM_Sqrt<double>/8/1 -0.1990 -0.1990 5 4 5 4 | |
BM_Sqrt<double>/64/1 -0.2366 -0.2366 34 26 34 26 | |
BM_Sqrt<double>/512/1 -0.2236 -0.2236 313 243 313 243 | |
BM_Sqrt<double>/2048/1 -0.2237 -0.2237 1287 999 1287 999 | |
BM_Rsqrt<float>/8/1 +0.0166 +0.0165 5 5 5 5 | |
BM_Rsqrt<float>/64/1 -0.0715 -0.0715 11 10 11 10 | |
BM_Rsqrt<float>/512/1 -0.1097 -0.1097 82 73 82 73 | |
BM_Rsqrt<float>/2048/1 -0.1323 -0.1323 387 335 387 335 | |
BM_Rsqrt<double>/8/1 -0.0874 -0.0874 5 5 5 5 | |
BM_Rsqrt<double>/64/1 -0.1198 -0.1198 31 27 31 27 | |
BM_Rsqrt<double>/512/1 -0.1499 -0.1499 287 244 287 244 | |
BM_Rsqrt<double>/2048/1 -0.1728 -0.1727 1181 977 1181 977 | |
OVERALL_GEOMEAN -0.1616 -0.1616 0 0 0 0 | |
```",Antonio Sánchez,2022-03-02T17:20:49.074Z,NA,NA,"## Title: | |
Modified sqrt/rsqrt for denormal handling. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request updates the generic `sqrt` and `rsqrt` implementations in the Eigen C++ library, enhancing the handling of denormal numbers and improving computational efficiency on specific architectures. | |
### Key Changes: | |
- Improved handling of `std::numeric_limits<T>::denorm_min()`, for which the original implementation incorrectly returned `NaN` on AVX512. | |
- Improved handling of denormals in general, which now often give correct answers rather than being flushed to zero or infinity. | |
- Faster `sqrt` and `rsqrt` on AVX512; `rsqrt` is slightly slower on SSE, and AVX is unchanged. | |
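To see why denormal inputs have well-defined, non-zero square roots, the scalar sketch below rescales a subnormal input into the normal range before taking the root. This is an assumption-laden illustration, not the technique used by Eigen's packet implementation, and `sqrt_rescaled` is a hypothetical name.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Illustrative only: rescale subnormal inputs into the normal range,
// take the square root, then undo the scaling.
// Relies on the identity sqrt(x * 2^128) * 2^-64 == sqrt(x).
inline double sqrt_rescaled(double x) {
  const double min_normal = std::numeric_limits<double>::min();
  if (x > 0.0 && x < min_normal) {
    const double scaled = std::ldexp(x, 128);   // exact power-of-two scaling
    return std::ldexp(std::sqrt(scaled), -64);  // result is a normal number
  }
  return std::sqrt(x);
}
```

For `denorm_min()` of `double` (2^-1074) the result is 2^-537, a perfectly ordinary normal number, so returning `NaN` or flushing to zero for such inputs is an implementation artifact rather than a mathematical necessity.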
### Improvements: | |
- Reduced `sqrt` time on AVX512 by up to about 33% in the listed benchmarks (for example, `BM_Sqrt<float>/512/1` dropped from 104 to 70). | |
- Better accuracy and reliability for operations involving denormal numbers. | |
### Impact: | |
The changes lead to faster and more accurate computation for standard square root operations, particularly in AVX512 environments, thereby optimizing performance and numerical stability in calculations dealing with denormal numbers." | |
884 (https://gitlab.com/libeigen/eigen/-/merge_requests/884),Remove poor non-convergence checks in NonLinearOptimization.,"Both the `levenberg_marquardt` and `NonLinearOptimization` tests are | |
essentially the same. For some reason, they purposely check for a | |
specific status that indicates non-convergence, even though there are | |
many closely related statuses based on tolerances. They also check | |
for exact (or near exact) number of iterations. This makes no sense | |
in general when dealing with multiple architectures, nor when | |
considering things like FMA and optimization levels (e.g. `-O3`) can | |
slightly change numerical results. | |
Removed a bunch of poor checks to allow these tests to pass. | |
Note: the NonLinearOptimization subpackage seems to be an ""updated"" | |
version of LevenbergMarquardt. They have a bunch of colliding symbols | |
and the test is nearly identical. We may consider removing one (or | |
both) of these.",Antonio Sánchez,2022-03-02T19:31:20.798Z,NA,NA,"## Title: | |
Remove poor non-convergence checks in NonLinearOptimization. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request focuses on simplifying the convergence checks in the `NonLinearOptimization` and `levenberg_marquardt` tests by removing overly strict checks that hinder test success, given the variability in numerical results across different architectures and optimization levels. | |
### Key Changes: | |
- Removed specific status checks that indicated non-convergence, which were unnecessarily stringent. | |
- Eliminated the requirement for tests to match an exact number of iterations. | |
### Improvements: | |
- Enhanced flexibility in the tests to accommodate varying numerical results due to different architectures and compiler optimizations. | |
### Impact: | |
- Improved test reliability by allowing more cases to pass under normal operational variability, resulting in a more robust testing framework for the optimization algorithms." | |
886 (https://gitlab.com/libeigen/eigen/-/merge_requests/886),Skip denormal test if `Cond` is false.,"Minor bug, the test should be skipped if the packet operation doesn't | |
exist.",Antonio Sánchez,2022-03-03T04:32:13.780Z,NA,NA,"## Title: | |
Skip denormal test if `Cond` is false. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses a minor bug in the Eigen C++ library related to denormal tests. The update ensures that tests are only executed when the corresponding packet operation exists. | |
### Key Changes: | |
- Added a conditional check to skip the denormal test when `Cond` is false. | |
### Improvements: | |
- Improved the test suite’s efficiency by avoiding unnecessary test executions for non-existent packet operations. | |
### Impact: | |
- This change prevents spurious test failures by ensuring that the denormal test runs only when the corresponding packet operation exists." | |
885 (https://gitlab.com/libeigen/eigen/-/merge_requests/885),Fix enum conversion warnings in BooleanRedux.,NA,Antonio Sánchez,2022-03-03T05:04:27.500Z,NA,NA,"## Title: | |
Fix enum conversion warnings in BooleanRedux. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses and resolves enum conversion warnings within the BooleanRedux component of the Eigen C++ library. | |
### Key Changes: | |
- Modified the enum conversion methods to eliminate warnings during compilation. | |
### Improvements: | |
- Enhanced code clarity and compliance by resolving type conversion issues related to enums. | |
### Impact: | |
- The build no longer emits enum conversion warnings from BooleanRedux, which makes the build log easier to read." | |
888 (https://gitlab.com/libeigen/eigen/-/merge_requests/888),Speed lscg by using .noalias,"Hi, Eigen developers | |
This MR aims to speed up `least_square_conjugate_gradient()` by adding `.noalias()`.",Zhuo Zhang,2022-03-03T18:42:19.213Z,NA,NA,"## Title: | |
Speed lscg by using .noalias | |
## Authors: | |
Zhuo Zhang | |
## Summary: | |
This merge request introduces optimizations to the `least_square_conjugate_gradient()` function in the Eigen C++ library by applying the `.noalias()` optimization. | |
### Key Changes: | |
- Added `.noalias()` to `least_square_conjugate_gradient()` to prevent unnecessary temporary allocations and improve performance. | |
### Improvements: | |
- Enhanced computation speed of the least squares conjugate gradient function. | |
### Impact: | |
- The performance boost is expected to lead to faster computations in applications utilizing the least squares conjugate gradient algorithm, improving overall efficiency in numerical methods that rely on Eigen." | |
851 (https://gitlab.com/libeigen/eigen/-/merge_requests/851),Fix JacobiSVD_LAPACKE bindings,"### What does this implement/fix? | |
JacobiSVD_LAPACKE.h doesn't reflect current SVD module for runtime options. :upside_down: This just corrects that. | |
Tested these changes with apple Accelerate.",Arthur,2022-03-03T19:24:08.236Z,NA,NA,"## Title: | |
Fix JacobiSVD_LAPACKE bindings | |
## Authors: | |
Arthur | |
## Summary: | |
This merge request addresses an inconsistency in the JacobiSVD_LAPACKE header file related to the current SVD module's runtime options. | |
### Key Changes: | |
- Updated the JacobiSVD_LAPACKE.h file to align with the latest SVD module specifications. | |
### Improvements: | |
- Ensured accurate reflection of runtime options in the JacobiSVD_LAPACKE bindings. | |
- Tested and validated changes using Apple Accelerate. | |
### Impact: | |
- Restores the correctness and usability of the LAPACKE-backed JacobiSVD bindings with the current SVD module's runtime options." | |
887 (https://gitlab.com/libeigen/eigen/-/merge_requests/887),Update vectorization_logic tests for all platforms.,"A few of the tests depend on packet (and half packet) availability, | |
implementation of specific packet functions (e.g. add, div), and | |
unrolling limits and costs, which can vary by platform. Here we add | |
the ability to ignore the traversal or unrolling when | |
it is not informative. | |
Also, with some half-packets, if the size of the packet is smaller | |
than 16 bytes (e.g. `Packet2f`), then matrices of certain sizes | |
no longer vectorize with `EIGEN_UNALIGNED_VECTORIZE=0`, since they | |
are assumed unaligned. Adjusted matrix sizes in such cases to | |
get the tests to pass. | |
This *should* help tests pass on all platforms. Tested with | |
SSE, AVX, AVX512, NEON, AltiVec.",Antonio Sánchez,2022-03-03T19:54:16.335Z,NA,NA,"## Title: | |
Update vectorization_logic tests for all platforms. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request focuses on enhancing the vectorization logic tests across various platforms by addressing dependencies on packet availability, specific packet function implementations, and unrolling limits. | |
### Key Changes: | |
- Introduced the capability to ignore traversal or unrolling when it's not informative, allowing for greater flexibility in tests. | |
- Adjusted matrix sizes for specific half-packets (e.g., `Packet2f`) to ensure compatibility with the `EIGEN_UNALIGNED_VECTORIZE=0` setting. | |
### Improvements: | |
- The updates aim to ensure that tests pass consistently across different platforms, including SSE, AVX, AVX512, NEON, and AltiVec. | |
### Impact: | |
- This change is expected to enhance the reliability of tests and facilitate smoother integration across multiple architectures, improving overall test success rates in the Eigen library." | |
864 (https://gitlab.com/libeigen/eigen/-/merge_requests/864),Removed EIGEN_UNUSED decorations from many functions that are in fact used,"### Reference issue | |
### What does this implement/fix? | |
### Additional information",Sean McBride,2022-03-03T20:19:33.917Z,NA,NA,"## Title: | |
Removed EIGEN_UNUSED decorations from many functions that are in fact used | |
## Authors: | |
Sean McBride | |
## Summary: | |
This merge request addresses the cleanup of the Eigen C++ library by removing unnecessary `EIGEN_UNUSED` decorations from various functions that are still actively used in the codebase. | |
### Key Changes: | |
- Eliminated `EIGEN_UNUSED` annotations from several functions where such decorations were incorrectly applied. | |
### Improvements: | |
- Enhanced code clarity by ensuring that `EIGEN_UNUSED` is only used where applicable. | |
- Reduced potential confusion for developers regarding function usage. | |
### Impact: | |
- This cleanup helps improve maintainability and readability of the code, promoting better practices for function annotations in the Eigen library." | |
890 (https://gitlab.com/libeigen/eigen/-/merge_requests/890),Remove duplicate IsRowMajor declaration.,"It already exists in DenseBase, and this shadow copy is generating warnings.",Antonio Sánchez,2022-03-04T21:22:03.523Z,NA,NA,"## Title: | |
Remove duplicate IsRowMajor declaration. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses the issue of a duplicate `IsRowMajor` declaration in the Eigen C++ library. The duplicate declaration was found to be unnecessary as the original definition already exists in the `DenseBase` class. Removing this redundancy resolves warnings generated during compilation. | |
### Key Changes: | |
- Eliminated the duplicate `IsRowMajor` declaration. | |
### Improvements: | |
- Reduction of compilation warnings by removing unnecessary code. | |
### Impact: | |
- Removes the shadowing declaration, giving a cleaner, warning-free build and more maintainable code." | |
891 (https://gitlab.com/libeigen/eigen/-/merge_requests/891),Split and reduce SVD test sizes.,"The original tests consume A LOT of memory - causing MSVC to | |
constantly run out of heap space, gcc to sometimes crash without | |
warning, and our CI machines to start using swap space, drastically | |
slowing down compile times. | |
Here we split up and re-number a bunch of the tests. Also | |
reduced some fixed-size matrix sizes - the new stack allocation for U/V | |
gets pretty big, which also seems to be drastically slowing down | |
compile times.",Antonio Sánchez,2022-03-05T00:15:28.613Z,NA,NA,"## Title: | |
Split and reduce SVD test sizes. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses excessive memory consumption in the SVD tests within the Eigen C++ library, which has caused issues with compilers and continuous integration processes. | |
### Key Changes: | |
- Split and re-numbered multiple SVD tests to optimize memory usage. | |
- Reduced the sizes of some fixed-size matrices used in the tests. | |
### Improvements: | |
- The changes lead to lower memory consumption during testing. | |
- Helps prevent compilers, specifically MSVC and gcc, from running out of memory or crashing unexpectedly. | |
### Impact: | |
- The adjustments significantly improve compile times on CI machines by alleviating heavy swap space usage." | |
893 (https://gitlab.com/libeigen/eigen/-/merge_requests/893),Adds new CMake Options for controlling build components.,"Adds build options `EIGEN_BUILD_BLAS`, `EIGEN_BUILD_LAPACK`, and `EIGEN_BUILD_CMAKE_PACKAGE`. Resurrected from !512.",Antonio Sánchez,2022-03-05T05:49:46.376Z,NA,NA,"## Title: | |
Adds new CMake Options for controlling build components. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request introduces new CMake configuration options that allow users to control various components during the build process of the Eigen C++ library. | |
### Key Changes: | |
- Added new CMake options: | |
- `EIGEN_BUILD_BLAS` | |
- `EIGEN_BUILD_LAPACK` | |
- `EIGEN_BUILD_CMAKE_PACKAGE` | |
### Improvements: | |
- Enhanced flexibility in the build system by allowing users to select specific components to include or exclude. | |
### Impact: | |
- These changes provide developers with greater control over the build configuration, potentially leading to more efficient builds tailored to specific needs." | |
894 (https://gitlab.com/libeigen/eigen/-/merge_requests/894),"Fix broken tensor executor test, allow tensor packets of size 1.","The cxx11_tensor_executor test assumed vectorization was always possible for the given types - though they may not be depending on the platform. Added a check via `packet_traits<T>`. | |
Also modified tensor ops to actually allow `PacketSize == 1`, | |
such as for `Packet1cd`. The README previously said complex | |
was known to be broken, but we've since added some fixes for that.",Antonio Sánchez,2022-03-07T20:30:39.287Z,NA,NA,"## Title: | |
Fix broken tensor executor test, allow tensor packets of size 1. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses issues found in the `cxx11_tensor_executor` test, which erroneously assumed vectorization was always applicable for certain types. It introduces checks to ensure compatibility based on platform capabilities and updates tensor operations to support packet sizes of one. | |
### Key Changes: | |
- Implemented checks using `packet_traits<T>` to validate vectorization based on platform. | |
- Modified tensor operations to permit `PacketSize == 1`, including support for `Packet1cd`. | |
### Improvements: | |
- Improved the robustness of the tensor executor tests by aligning assumptions with actual platform capabilities. | |
- Enhanced the handling of complex types, correcting previous issues documented in the README. | |
### Impact: | |
These changes ensure more reliable tensor operations across different platforms, improving test accuracy and expanding the functionality for low packet sizes in the Eigen C++ library." | |
856 (https://gitlab.com/libeigen/eigen/-/merge_requests/856),Add support for Apple's Accelerate sparse matrix solvers,"### What does this implement/fix? | |
This merge request implements support for the sparse matrix solvers found in Apple's Accelerate framework. Wrappers around the following solvers are provided: | |
* AccelerateLLT: Cholesky (LL^T) factorization | |
* AccelerateLDLT: Default LDL^T factorization (Currently an alias to AccelerateLDLTTPP) | |
* AccelerateLDLTUnpivoted: Cholesky-like LDL^T with only 1x1 pivots and no pivoting | |
* AccelerateLDLTSBK: LDL^T with Supernode Bunch-Kaufman and static pivoting | |
* AccelerateLDLTTPP: LDL^T with full threshold partial pivoting | |
* AccelerateQR: QR factorization | |
* AccelerateCholeskyAtA: QR factorization without storing Q (equivalent to A^TA = R^T R) | |
### Performance | |
Notes: | |
* Default solver settings are used in each test. | |
* Eigen QR did not finish in a reasonable amount of time. | |
* Tests run on an Apple M1 Devkit (4e4p cores) | |
Solving system of size: 1946848x1946848 | |
| Solver | Analysis (ms) | Factorization (ms) | Solve (ms) | Total (ms) | | |
|-------------------------|----------|---------------|-------|-------| | |
| AccelerateLDLT | 722 | 9861 | 169 | 10753 | | |
| AccelerateLLT | 744 | 774 | 242 | 1760 | | |
| Eigen LDLT | 1376 | 45685 | 343 | 47405 | | |
| Eigen LLT | 1366 | 45716 | 346 | 47429 | | |
| Cholmod Simplicial LDLT | 12134 | 10171 | 79 | 22385 | | |
| Cholmod Supernodal LLT | 13138 | 864 | 110 | 14114 | | |
Solving system of size: 57504x57504 | |
| Solver | Analysis (ms) | Factorization (ms) | Solve (ms) | Total (ms) | | |
|--------------|----------|---------------|-------|-------| | |
| AccelerateQR | 221 | 2939 | 217 | 3378 | | |
| SPQR | 69 | | 961 | 4371 | | |
| Eigen QR | 64 | > 60000 | | | | |
 | |
### Additional information | |
For Accelerate to work, you need to specify which triangle to use (for LDLT and LLT) and the kind of matrix you are factorizing, e.g. symmetric, ordinary or triangular. I currently determine this by exploiting the `UpLo` template argument. | |
The question I have is what would be a sensible `UpLo` default? If you try and run Accelerate's LDLT solver and `m_sparseKind` is not `SparseSymmetric`, it will assert and crash the program. In the way that I'm currently handling it, you would therefore need to do: `AccelerateLDLT<SparseMatrix<double>, Symmetric | Upper> LDLT;` This doesn't exactly follow what the other solvers do however. We could always assume that whatever you're passing into the solver is positive definite, which is what the PARDISO wrapper seems to do. This would work for LDLT and LLT, but not necessarily for QR. | |
QR is a bit tricky as your matrix may not be upper or lower triangular, and may also not be symmetric - ie it's ""ordinary"" as Apple calls it. However UpLoType does not have an appropriate entry for this case, so you must pass 0 in order to set `m_sparseKind` to `SparseOrdinary`. Eg: `AccelerateQR<SparseMatrix<double>, 0> QR;` You can see an example of this in the test program.",John Mather,2022-03-08T00:09:19.119Z,NA,NA,"## Title: | |
Add support for Apple's Accelerate sparse matrix solvers | |
## Authors: | |
John Mather | |
## Summary: | |
This merge request introduces support for sparse matrix solvers from Apple's Accelerate framework, adding wrappers for various factorization methods. These implementations aim to enhance performance in solving sparse matrices. | |
### Key Changes: | |
- Added wrappers for the following Accelerate solvers: | |
- **AccelerateLLT**: Cholesky (LL^T) factorization | |
- **AccelerateLDLT**: Default LDL^T factorization | |
- **AccelerateLDLTUnpivoted**: Cholesky-like LDL^T without pivoting | |
- **AccelerateLDLTSBK**: LDL^T with Supernode Bunch-Kaufman | |
- **AccelerateLDLTTPP**: LDL^T with threshold partial pivoting | |
- **AccelerateQR**: QR factorization | |
- **AccelerateCholeskyAtA**: QR factorization without storing Q | |
### Improvements: | |
- Performance benchmarks indicate significant speed enhancements over Eigen's existing solvers: | |
- For a system size of 1,946,848: | |
- `AccelerateLDLT` achieved a total time of 10,753 ms compared to Eigen's 47,405 ms. | |
- `AccelerateLLT` completed in 1,760 ms, again outperforming Eigen's implementation. | |
- For a smaller system size of 57,504: | |
- `AccelerateQR` outperformed both Eigen's QR and SPQR, completing in 3,378 ms. | |
### Impact: | |
This integration allows users of the Eigen library to leverage Apple’s Accelerate framework, particularly on Apple hardware, providing faster and more efficient sparse matrix computations. The new solvers are particularly beneficial for applications requiring high-performance linear algebra operations, likely increasing the library's usability in performance-critical applications." | |
897 (https://gitlab.com/libeigen/eigen/-/merge_requests/897),Remove copy_bool workaround for gcc 4.3,"### Reference issue | |
### What does this implement/fix? | |
This is a split-off from MR !881. | |
The testsuite uses a workaround for gcc 4.3, a noughties-era compiler which implements something called C++0x as its highest version of C++. In other words, this compiler can no longer be used to compile Eigen. Torn between turning this workaround into a `constexpr` and outright removing it, I opted for the latter. | |
### Additional information | |
I ran the first few hundred testsuite checks on my PC and found no regressions.",Tobias Schlüter,2022-03-08T17:43:12.165Z,NA,NA,"## Title: | |
Remove copy_bool workaround for gcc 4.3 | |
## Authors: | |
Tobias Schlüter | |
## Summary: | |
This merge request removes a workaround for the gcc 4.3 compiler from the Eigen C++ library testsuite. gcc 4.3 supports at most C++0x and can no longer be used to compile Eigen, so the workaround is no longer needed. | |
### Key Changes: | |
- Removed the `copy_bool` workaround specifically designed for gcc 4.3. | |
### Improvements: | |
- Simplifies the codebase by removing obsolete compatibility workarounds, enhancing maintainability. | |
### Impact: | |
- Streamlines the testsuite without affecting modern compilers; gcc 4.3 could already no longer compile Eigen. The author ran the first few hundred testsuite checks locally and found no regressions." | |
895 (https://gitlab.com/libeigen/eigen/-/merge_requests/895),make SparseSolverBase and IterativeSolverBase move constructable,"Hello everyone, | |
Thanks for this powerful library. | |
I witnessed an issue in my code where I wanted to move iterative solvers around. | |
Since they derive from SparseSolverBase they are not copyable. | |
This is fine, but I'm interested in the reason why this is the case. | |
Nevertheless, I propose the change to include move constructors, if C++11 is supported. | |
### What does this implement/fix? | |
This fixes the issue that IterativeSolvers cannot be moved. I.e., it solves the issue outlined in https://godbolt.org/z/rM8M76bqW",Alex_M,2022-03-08T19:41:30.587Z,NA,NA,"## Title: | |
make SparseSolverBase and IterativeSolverBase move constructable | |
## Authors: | |
Alex_M | |
## Summary: | |
This merge request introduces move constructors for the `SparseSolverBase` and `IterativeSolverBase` classes in the Eigen C++ library, resolving the limitations related to moving these objects. | |
### Key Changes: | |
- Added move constructors to `SparseSolverBase` and `IterativeSolverBase`. | |
- Ensured compatibility with C++11 standards. | |
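The idea can be sketched as follows. This is a minimal illustration under stated assumptions, not Eigen's actual code: `SolverBaseSketch` and `IterativeSketch` are hypothetical names, and the only state tracked is an initialization flag.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical stand-in for a solver base class: copying stays disabled,
// moving is allowed.
class SolverBaseSketch {
 public:
  SolverBaseSketch() : m_isInitialized(false) {}
  SolverBaseSketch(const SolverBaseSketch&) = delete;
  SolverBaseSketch& operator=(const SolverBaseSketch&) = delete;

  // Move constructor: take over the state and reset the moved-from object.
  SolverBaseSketch(SolverBaseSketch&& other) noexcept
      : m_isInitialized(other.m_isInitialized) {
    other.m_isInitialized = false;
  }

  bool isInitialized() const { return m_isInitialized; }
  void markInitialized() { m_isInitialized = true; }

 private:
  bool m_isInitialized;
};

// Hypothetical derived solver: its defaulted move constructor forwards to
// the base class, and its copy constructor is implicitly deleted.
class IterativeSketch : public SolverBaseSketch {
 public:
  IterativeSketch() = default;
  IterativeSketch(IterativeSketch&&) = default;
  int maxIterations = 100;
};
```

With this in place, `IterativeSketch b(std::move(a));` compiles, while `IterativeSketch b(a);` is still rejected.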
### Improvements: | |
- Enhanced the flexibility of iterative solvers by enabling them to be moved, rather than copied. | |
### Impact: | |
This change allows users to efficiently handle instances of iterative solvers in their code, improving performance and usability when managing solver objects." | |
889 (https://gitlab.com/libeigen/eigen/-/merge_requests/889),"Add construct_at, destroy_at wrappers. Use throughout.","### Reference issue | |
### What does this implement/fix? | |
This is the first independent part of my previous merge request 881 https://gitlab.com/libeigen/eigen/-/merge_requests/881 | |
This implements wrappers for C++17's `std::destroy_at` and C++20's `std::construct_at` and uses them throughout instead of placement new and explicit destructor calls. Since this is independent of the previous merge request, I chose to take a less targeted approach and change all occurrences. | |
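For illustration, such wrappers could look roughly like the sketch below. The `sketch` namespace and the `Tracked` helper type are hypothetical and only serve the example; the wrappers in the MR itself may differ in namespace, qualifiers and device annotations.

```cpp
#include <new>
#include <utility>

namespace sketch {

// Hypothetical wrapper: construct a T in already-allocated storage at p.
template <typename T, typename... Args>
T* construct_at(T* p, Args&&... args) {
  return ::new (static_cast<void*>(p)) T(std::forward<Args>(args)...);
}

// Hypothetical wrapper: run the destructor without releasing the storage.
template <typename T>
void destroy_at(T* p) {
  p->~T();
}

}  // namespace sketch

// Small type that counts live instances so the effect is observable.
struct Tracked {
  static int& live() {
    static int n = 0;
    return n;
  }
  int value;
  explicit Tracked(int v) : value(v) { ++live(); }
  ~Tracked() { --live(); }
};
```

A call site then reads `sketch::construct_at(ptr, 7)` instead of `::new (ptr) Tracked(7)`, and `sketch::destroy_at(ptr)` instead of `ptr->~Tracked()`.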
### Additional information | |
Except for a trap in JacobiSVD (where the order of arguments is reversed between subsequent invocations, so simple copy and paste would be wrong) and in IterativeLinearSolvers/IterativeSolverBase.h (where type names are used inconsistently before this patch so the change may look wrong), there are no surprises here.",Tobias Schlüter,2022-03-08T20:43:23.217Z,NA,NA,"## Title: | |
Add construct_at, destroy_at wrappers. Use throughout. | |
## Authors: | |
Tobias Schlüter | |
## Summary: | |
This merge request introduces wrappers for `std::construct_at` (C++20) and `std::destroy_at` (C++17). The proposal replaces the traditional use of placement new and explicit destructor calls throughout the Eigen library. | |
### Key Changes: | |
- Implemented `construct_at` and `destroy_at` wrappers. | |
- Replaced all occurrences of placement new and explicit destructor calls with the new wrappers. | |
### Improvements: | |
- Enhances code clarity and consistency by using standardized modern C++ practices. | |
- Makes construction and destruction of objects in raw storage explicit at the call site, replacing hand-written placement new and explicit destructor calls. | |
### Impact: | |
The change improves maintainability and safety of the code, aligning with modern C++ standards, ultimately benefiting developers working with the Eigen library." | |
898 (https://gitlab.com/libeigen/eigen/-/merge_requests/898),Fix edge-case in zeta for large inputs.,"Reported by MLIR folks, large inputs are triggering NaNs in TF/Eigen. | |
NaNs are being triggered in the tail sum correction term due to an overflow | |
in the `a` parameter. Returning zero for such large inputs (e.g. 2000, 2000) | |
is consistent with scipy.",Antonio Sánchez,2022-03-08T21:21:20.766Z,NA,NA,"## Title: | |
Fix edge-case in zeta for large inputs. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses an issue with the zeta function in the Eigen C++ library that causes NaNs when processing large input values. Specifically, it corrects an overflow in the `a` parameter during the tail sum correction term, ensuring consistent behavior with scipy. | |
### Key Changes: | |
- Resolved NaN issue for large inputs in the zeta function. | |
- Modified the handling of the `a` parameter to prevent overflow. | |
### Improvements: | |
- The fix allows the zeta function to return zero for large inputs, aligning its behavior with that of scipy. | |
### Impact: | |
This change enhances the reliability of the zeta function when dealing with large inputs, preventing runtime errors and improving compatibility with existing numerical libraries." | |
896 (https://gitlab.com/libeigen/eigen/-/merge_requests/896),Remove ComputeCpp-specific code from SYCL Vptr,"### Reference issue | |
### What does this implement/fix? | |
The virtual pointer code used to rely on ComputeCpp-specific details, but can now be implemented using the reinterpret functionality of the SYCL buffer class. | |
### Additional information | |
",Duncan McBain,2022-03-08T22:44:18.751Z,NA,NA,"## Title: | |
Remove ComputeCpp-specific code from SYCL Vptr | |
## Authors: | |
Duncan McBain | |
## Summary: | |
This merge request removes ComputeCpp-specific code from the SYCL virtual pointer (Vptr) implementation. The new implementation uses the reinterpret functionality of the SYCL buffer class instead of relying on ComputeCpp-specific details. | |
### Key Changes: | |
- Eliminated ComputeCpp-specific code in the SYCL Vptr implementation. | |
- Introduced use of the reinterpret functionality within the SYCL buffer class for better handling. | |
### Improvements: | |
- Enhanced compatibility with SYCL standards by removing dependency on a particular vendor's implementation. | |
- Potentially improved performance and maintainability through a more generalized solution. | |
### Impact: | |
The changes are expected to streamline the SYCL Vptr implementation, making it more adaptable to various SYCL environments and reducing limitations tied to ComputeCpp." | |
901 (https://gitlab.com/libeigen/eigen/-/merge_requests/901),Fix construct_at compilation breakage on ROCm.,"Enable the recently added construct_at and destroy_at functions to build on HIP. | |
/cc @cantonios",Rohit Santhanam,2022-03-09T17:48:18.620Z,NA,NA,"## Title: | |
Fix construct_at compilation breakage on ROCm. | |
## Authors: | |
Rohit Santhanam | |
## Summary: | |
This merge request addresses a compilation issue with the `construct_at` and `destroy_at` functions in the Eigen C++ library when used in ROCm (Radeon Open Compute) environments, by enabling their successful build on HIP (Heterogeneous-Compute Interface for Portability). | |
### Key Changes: | |
- Enabled the building of `construct_at` and `destroy_at` functions for ROCm. | |
### Improvements: | |
- Improved compatibility with ROCm, enhancing the library's usability on AMD GPUs. | |
### Impact: | |
- Ensures that users can effectively utilize `construct_at` and `destroy_at` in HIP environments, thereby improving the library's functionality for a broader range of hardware platforms." | |
902 (https://gitlab.com/libeigen/eigen/-/merge_requests/902),Temporarily disable aarch64 CI.,Disable Arm CI as WoA machines are temporarily down.,Everton Constantino,2022-03-10T14:46:41.657Z,NA,NA,"## Title: | |
Temporarily disable aarch64 CI. | |
## Authors: | |
Everton Constantino | |
## Summary: | |
This merge request temporarily disables the Arm (aarch64) continuous integration (CI) for the Eigen C++ library because the WoA machines it runs on are temporarily down. | |
### Key Changes: | |
- Disabled the aarch64 CI to prevent build failures while the necessary hardware is down. | |
### Improvements: | |
- Allows continued development and integration testing without interruptions related to CI failures. | |
### Impact: | |
- Ensures that the development process can proceed smoothly despite the temporary unavailability of Arm CI resources." | |
900 (https://gitlab.com/libeigen/eigen/-/merge_requests/900),Fix swap test for size 1 inputs.,"The swap of a matrix and its first row actually passes in this case, | |
causing the assertion test to fail sporadically.",Antonio Sánchez,2022-03-10T15:05:59.488Z,NA,NA,"## Title: | |
Fix swap test for size 1 inputs. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses a problem in the Eigen C++ library related to the swap test for matrices of size 1. The issue caused sporadic failures in the assertion tests when attempting to swap a matrix with its first row. | |
### Key Changes: | |
- Correction of the swap test logic for size 1 inputs to ensure it functions as intended. | |
### Improvements: | |
- Stability and reliability of the assertion tests for edge cases involving size 1 matrices. | |
### Impact: | |
- Enhances the robustness of the Eigen library by preventing assertion failures in specific scenarios, thereby improving overall test reliability." | |
903 (https://gitlab.com/libeigen/eigen/-/merge_requests/903),"Convert bit calculation to constexpr, avoid casts.","### Reference issue | |
### What does this implement/fix? | |
This is a simple patch, which changes `enum`s to `static constexpr int` in evaluation of floating point bit sizes. The advantage of this is that a few casts can be removed. `static constexpr int` is quite a mouthful, so I chose to not repeat it for every single variable. | |
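As a rough sketch of the pattern (the trait name `float_bits_sketch` is hypothetical, and the example assumes IEEE 754 `float`/`double` with 8-bit bytes), several constants can share one `static constexpr int` declaration, and because they are plain `int`s they can be compared and combined without casts:

```cpp
#include <limits>

// Hypothetical trait computing bit-layout constants for a floating point type.
template <typename Scalar>
struct float_bits_sketch {
  // One declaration, several constants: the keyword sequence is written once.
  static constexpr int TotalBits = static_cast<int>(sizeof(Scalar)) * 8,
                       MantissaBits = std::numeric_limits<Scalar>::digits - 1,
                       ExponentBits = TotalBits - MantissaBits - 1;
};

// Plain ints compare directly; with enumerators from unrelated enum types,
// a comparison like this typically needs casts to avoid compiler warnings.
constexpr bool double_has_wider_exponent =
    float_bits_sketch<double>::ExponentBits > float_bits_sketch<float>::ExponentBits;
```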
### Additional information | |
I was looking more globally at replacing `enum`s in traits with `constexpr`, mainly to not have to cast to int all the time when comparing the dimensions of matrices but that is a larger project of which I could make this little spin-off. | |
I ran most of the testsuite, in particular the `packetmath` tests with no errors.",Tobias Schlüter,2022-03-14T18:59:35.958Z,NA,NA,"## Title: | |
Convert bit calculation to constexpr, avoid casts. | |
## Authors: | |
Tobias Schlüter | |
## Summary: | |
This merge request implements a change in the Eigen C++ library by replacing `enum`s with `static constexpr int` in the evaluation of floating point bit sizes, leading to simplified code and the elimination of unnecessary type casts. | |
### Key Changes: | |
- Replaced `enum`s with `static constexpr int` for evaluating floating point bit sizes. | |
### Improvements: | |
- Reduction of type casts within the code. | |
- Streamlined variable declarations by minimizing the repetition of `static constexpr int`. | |
### Impact: | |
- Enhances code readability and maintainability. | |
- Paves the way for a broader initiative to replace `enum`s with `constexpr` in traits, potentially improving performance and reducing cast-related issues in future developments." | |
907 (https://gitlab.com/libeigen/eigen/-/merge_requests/907),Fix up PowerPC MMA flags so it builds by default.,"Introduces the build option | |
``` | |
EIGEN_ALTIVEC_ENABLE_MMA_DYNAMIC_DISPATCH | |
``` | |
If set and MMA builtins are available, will allow the dynamic dispatch | |
path. If building for power10 and mma (`-mmma -mcpu=power10`), switches | |
to always use MMA. | |
~~Removed the `EIGEN_ALTIVEC_DISABLE_MMA` option.~~ edit: put it back in. | |
Fixes #2457, and partly fixes #2324 in that the LTO issue should now | |
be avoided by default.",Antonio Sánchez,2022-03-15T20:22:23.690Z,NA,NA,"## Title: | |
Fix up PowerPC MMA flags so it builds by default. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request introduces a new build option for the Eigen library that enhances the compatibility and performance of PowerPC MMA (Matrix Multiply Acceleration) by enabling dynamic dispatch when certain conditions are met. | |
### Key Changes: | |
- Introduced the build option `EIGEN_ALTIVEC_ENABLE_MMA_DYNAMIC_DISPATCH`, which allows dynamic dispatch if MMA builtins are available. | |
- When building for Power10 with MMA (`-mmma -mcpu=power10`), the build will default to always using MMA. | |
- The `EIGEN_ALTIVEC_DISABLE_MMA` option has been reinstated. | |
### Improvements: | |
- Resolves issue #2457 and provides a partial fix for issue #2324 by preventing LTO (Link Time Optimization) issues by default. | |
### Impact: | |
These changes improve the build process for PowerPC architectures and optimize performance by leveraging hardware capabilities more effectively, particularly for users building for Power10 processors." | |
910 (https://gitlab.com/libeigen/eigen/-/merge_requests/910),"Revert ""Fix up PowerPC MMA flags so it builds by default.""",Premature merge.,Rasmus Munk Larsen,2022-03-15T20:51:04.306Z,NA,NA,"## Title: | |
Revert ""Fix up PowerPC MMA flags so it builds by default."" | |
## Authors: | |
Rasmus Munk Larsen | |
## Summary: | |
This merge request is a revert of a previous change aimed at fixing PowerPC MMA flags to ensure they build by default. The revert was initiated due to the determination that the initial merge was premature. | |
### Key Changes: | |
- Reverted the previously merged modifications related to PowerPC MMA flags. | |
### Improvements: | |
- Addresses potential issues arising from the premature merge, ensuring a more stable code base. | |
### Impact: | |
- Restores the library to its prior state regarding PowerPC MMA flag configurations, which may lead to better compatibility and stability in builds on PowerPC architecture." | |
909 (https://gitlab.com/libeigen/eigen/-/merge_requests/909),Remove workarounds for bad GCC-4 warnings,"### Reference issue | |
No reference, just stumbled on these | |
### What does this implement/fix? | |
Minor changes. This removes a few smelly workarounds that were done to avoid bad warnings in old versions of GCC. | |
### Additional information | |
I ran tests with gcc 5.5 and didn't see any of the warnings mentioned. | |
Also tried making these changes from the 3.4 branch, against gcc 4.9.",Arthur,2022-03-16T00:08:17.649Z,NA,NA,"## Title: | |
Remove workarounds for bad GCC-4 warnings | |
## Authors: | |
Arthur | |
## Summary: | |
This merge request focuses on eliminating outdated workarounds implemented to address warnings from older versions of the GCC compiler (specifically GCC-4). The changes ensure the code is cleaner and more maintainable by removing unnecessary complexity. | |
### Key Changes: | |
- Removed legacy workarounds designed to suppress warnings from GCC-4. | |
### Improvements: | |
- Simplified codebase by eliminating obsolete warning-related modifications. | |
- Enhanced readability and maintainability of the code. | |
### Impact: | |
- No functional change is expected: in the author's tests with gcc 5.5, none of the warnings that the workarounds targeted appeared. | |
- Streamlined development process by removing unnecessary code." | |
829 (https://gitlab.com/libeigen/eigen/-/merge_requests/829),Replace Eigen type metaprogramming with corresponding std types and make use of alias templates,"This MR removes a bunch of metaprogramming facilities in eigen in favour of their std versions. It also replaces | |
`typename type_trait<X>::type` with `type_trait_t<X>` for these types. (Overall, this reduces the number of such constructs, `typename .*::type`, from about 2000 to 1300). | |
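The difference between the two spellings can be shown with standard traits. This is only an illustration, assuming C++14 or later; `plain_type_verbose` and `plain_type_t` are hypothetical names, not Eigen's.

```cpp
#include <type_traits>

// Verbose spelling: every dependent trait needs typename and ::type.
template <typename T>
struct plain_type_verbose {
  typedef typename std::remove_const<
      typename std::remove_reference<T>::type>::type type;
};

// Alias-template spelling of the same computation.
template <typename T>
using plain_type_t = std::remove_const_t<std::remove_reference_t<T>>;
```

Both forms name the same type; the alias form simply drops the `typename ...::type` noise at each use.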
I've run the test suite on CPU, but not on other devices. Given the large amount of small changes that look extremely similar, I think it would be good to also run extensive tests on CUDA and SYCL before merging.",Erik Schultheis,2022-03-16T16:43:40.744Z,NA,NA,"## Title: | |
Replace Eigen type metaprogramming with corresponding std types and make use of alias templates | |
## Authors: | |
Erik Schultheis | |
## Summary: | |
This merge request updates the Eigen C++ library by replacing its internal type metaprogramming constructs with standard library (std) alternatives. This change aims to streamline the codebase and improve maintainability by leveraging the modern features of C++. | |
### Key Changes: | |
- Removal of numerous Eigen-specific metaprogramming facilities, substituted with corresponding standard library types. | |
- Replacement of `typename type_trait<X>::type` with the more concise `type_trait_t<X>`. | |
- Reduction of occurrences of `typename .*::type` constructs from approximately 2000 to 1300. | |
### Improvements: | |
- Simplifies the codebase, leading to improved readability and maintainability. | |
- Decreases the complexity of type trait definitions, making it easier for future development. | |
### Impact: | |
- While the test suite has been run on CPU, further extensive testing is recommended on CUDA and SYCL platforms to ensure stability across all supported devices before final merging. This transition is expected to enhance the library's compatibility with modern C++ standards and practices." | |
914 (https://gitlab.com/libeigen/eigen/-/merge_requests/914),Disable schur non-convergence test.,"It seems that about half the time, the schur decomposition passes when | |
the maximum number of iterations is set to 1, so checking that it | |
doesn't converge leads to flaky results. Disabling the non-convergence | |
check. | |
Fixes #2458",Antonio Sánchez,2022-03-16T17:33:54.017Z,NA,NA,"## Title: | |
Disable schur non-convergence test. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request proposes disabling the Schur non-convergence test due to inconsistent results when the maximum number of iterations is set to 1. | |
### Key Changes: | |
- Disabled the non-convergence check for the Schur decomposition. | |
### Improvements: | |
- Removes a source of flaky test results in the Schur decomposition test. | |
### Impact: | |
- Makes the test suite more reliable: the decomposition itself is unchanged, but the test no longer fails when it converges with the maximum number of iterations set to 1." | |
834 (https://gitlab.com/libeigen/eigen/-/merge_requests/834),AVX512 Optimizations for Triangular Solve,"### What does this implement/fix? | |
This MR includes optimized AVX512 kernels to improve fp32/fp64 Triangular Solve performance. These kernels are ""nocopy"" (i.e., matrices are not packed and `GEBP` is not used) and are meant to get better performance for smaller problem sizes (only inner strides of 1 are supported). The existing generic implementation of this functionality was pulled into separate functions `trsmKernelL/trsmKernelR` to allow dropping in the optimized versions when needed in `TriangularSolverMatrix.h`. | |
Changes: | |
- `Eigen/src/Core/products/TriangularSolverMatrix.h`: Replaced the previous inner triangular solve loop with a wrapper to the original implementation (`trsmKernelL`, `trsmKernelR`). | |
- `Eigen/src/Core/arch/AVX512/trsmKernel_impl.hpp`: The optimized kernels are implemented here. Template specializations to fp32/fp64 `trsmKernelL` and `trsmKernelR` are here as well. | |
- The solve kernel, `trisolve`, solves `AX=B` where `A` is `MxM` triangular, and `B` is `MxN`. `A` and `B` can be row/col-major and `A` can be upper/lower triangular. Combinations of these layouts can be used to handle the cases where `A` is on the right. | |
- `gemm_MNK__` is used to update panels of `B` and computes `C -= A*B`. This can be reused for Matrix Multiply optimizations (smaller sizes for certain transpose cases) | |
- Both these kernels use various unrolls which are generated recursively using templates. | |
- For small/medium sizes the solve kernel (**built with clang**) is generally faster and is used directly for the entire problem. TODO: improve heuristics for determining when to use kernels directly (current cutoffs determined from quick benchmarking). | |
- **Note**: we have noticed increases in compile time as a result of these changes. | |
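As a point of reference for what the solve kernel computes, here is a plain scalar sketch of the left, lower-triangular case of `AX=B`. It is not the MR's AVX512 kernel: `trisolve_lower_sketch` is a hypothetical helper, both matrices are assumed row-major with inner stride 1, and no unrolling or vectorization is attempted.

```cpp
#include <cstddef>
#include <vector>

// Reference (unoptimized) solve of A * X = B, overwriting B with X.
// A is M x M lower triangular, B is M x N, both stored row-major.
void trisolve_lower_sketch(const std::vector<double>& A, std::vector<double>& B,
                           std::size_t M, std::size_t N) {
  for (std::size_t i = 0; i < M; ++i) {
    for (std::size_t j = 0; j < N; ++j) {
      double acc = B[i * N + j];
      // Subtract the contributions of rows that are already solved
      // (the same C -= A*B update, done one element at a time).
      for (std::size_t k = 0; k < i; ++k) {
        acc -= A[i * M + k] * B[k * N + j];
      }
      // Divide by the diagonal entry to finish row i of X.
      B[i * N + j] = acc / A[i * M + i];
    }
  }
}
```

The optimized kernels perform the same arithmetic, but on panels of `B` with the update step handled by the gemm kernel.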
### Additional information | |
Here are some performance results of fp32/fp64 triangular solve with the optimized kernels. The charts are for the RUN (right, upper, non-transposed) and LLN (left, lower, non-transposed) trsm cases. The metric is flops/cycle, measured on `Intel(R) Xeon(R) Gold 6336Y` (peak is 64 flops/cycle for fp32, 32 flops/cycle for fp64). Compilers used were `g++`/`clang++` with versions `8.4.1` and `11.0.0` respectively. | |
For the RUN case, the data in the matrices is laid out optimally (both A and B are row-major), so it gives the best performance of the 8 cases. For the LLN case, we do intermediate transposes, so performance is lower. For large problem sizes, triangular solve performance depends entirely on `GEBP` performance. Currently GNU compilers generate sub-optimal code for the gemm micro kernel: we see register spilling that is not present with clang (this is mentioned in comments in the code). This only impacts performance for smaller sizes; for larger sizes, performance with either compiler was similar. | |
 | |
 | |
 | |
 | |
+@aaraujom",b-shi,2022-03-16T18:04:51.497Z,NA,NA,"## Title: | |
AVX512 Optimizations for Triangular Solve | |
## Authors: | |
b-shi | |
## Summary: | |
This merge request introduces optimized AVX512 kernels aimed at enhancing the performance of floating-point triangular solve operations (fp32/fp64) within the Eigen C++ library. The optimization targets smaller problem sizes and involves a revision of the existing implementations to allow for improved efficiency without the need for matrix packing. | |
### Key Changes: | |
- Replaced the previous inner triangular solve loop in `TriangularSolverMatrix.h` with new wrappers for optimized implementations (`trsmKernelL`, `trsmKernelR`). | |
- Developed optimized kernels in `trsmKernel_impl.hpp` that handle various triangular matrices in both row and column-major formats. | |
- Introduced specializations for both fp32 and fp64 types in the triangular solve kernels. | |
- Included a new approach to update matrix `B` via `gemm_MNK__`, facilitating potential reusability for matrix multiplication optimizations. | |
### Improvements: | |
- Optimized kernels provide significant performance enhancements for triangular solves when working with smaller matrix sizes, thanks to refined templates and structure in the implementations. | |
- Elimination of the packing requirement for matrices improves computational efficiency and simplicity in implementation. | |
- Resulting performance metrics indicate marked improvements in flops/cycle for both RUN and LLN cases on specific hardware (Intel Xeon Gold 6336Y). | |
### Impact: | |
- Enhanced triangular solve performance will benefit applications requiring frequent solving of triangular systems, particularly in numerical methods and linear algebra applications. | |
- Increased reliance on these optimized kernels may lead to longer compile times, which should be taken into account in future optimizations. | |
- Performance gains can establish a baseline for future improvements, including further developments in heuristic decision-making for kernel usage based on problem size." | |
913 (https://gitlab.com/libeigen/eigen/-/merge_requests/913),Fix up PowerPC MMA flags so it builds by default.,"Introduces the build option | |
``` | |
EIGEN_ALTIVEC_ENABLE_MMA_DYNAMIC_DISPATCH | |
``` | |
If set and MMA builtins are available, this allows the dynamic dispatch path. Otherwise, if building for power10 with MMA (`-mmma -mcpu=power10`), Eigen switches to always using MMA. | |
Fixes #2457, and partly fixes #2324 in that the LTO issue should now be avoided by default.",Antonio Sánchez,2022-03-16T19:16:29.262Z,NA,NA,"## Title: | |
Fix up PowerPC MMA flags so it builds by default. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request introduces a build option that enhances the PowerPC architecture's capabilities in the Eigen C++ library, specifically concerning the dynamic dispatch for MMA (Matrix-Multiply-Accumulate) operations. | |
### Key Changes: | |
- Added the build option `EIGEN_ALTIVEC_ENABLE_MMA_DYNAMIC_DISPATCH`. | |
- If this option is enabled and MMA builtins are available, it allows for the dynamic dispatch path. | |
- For builds targeting power10 with MMA (`-mmma -mcpu=power10`), it now defaults to always using MMA. | |
### Improvements: | |
- Resolves issue #2457 regarding MMA flag configurations. | |
- Partly addresses issue #2324 by avoiding the Link Time Optimization (LTO) problem by default. | |
### Impact: | |
These changes streamline the build process for PowerPC, ensuring that users can efficiently utilize MMA features without additional configuration, improving performance in matrix operations on supported architectures." | |
915 (https://gitlab.com/libeigen/eigen/-/merge_requests/915),Fix missing pound,NA,Antonio Sánchez,2022-03-16T19:26:49.477Z,NA,NA,"## Title: | |
Fix missing pound | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses an issue related to a missing pound directive within the Eigen C++ library. | |
### Key Changes: | |
- Added the missing pound directive to ensure proper compilation and functionality. | |
### Improvements: | |
- Corrects potential compilation errors that could arise from the missing directive, enhancing code robustness. | |
### Impact: | |
- Improves the overall stability and reliability of the Eigen library by preventing compilation issues." | |
917 (https://gitlab.com/libeigen/eigen/-/merge_requests/917),Work around g++-10 docker issue for geo_orthomethods_4.,"There seems to be a weird compiler bug in the g++-10 version of the | |
`ubuntu:20.04` docker image that is optimizing out one of our test vectors | |
in `geo_orthomethods_4` and causing a test failure. The weirder part is that on a machine | |
actually running ubuntu 20.04, with the exact same version of g++-10, | |
the test passes as-is. Reported version on both is | |
``` | |
g++-10 (Ubuntu 10.3.0-1ubuntu1~20.04) 10.3.0 | |
``` | |
To reproduce the original failure in docker, | |
``` | |
docker run -it ubuntu:20.04 | |
export DEBIAN_FRONTEND=noninteractive | |
apt update | |
apt-get install -y --no-install-recommends software-properties-common git | |
add-apt-repository -y ppa:ubuntu-toolchain-r/test | |
apt update | |
apt install g++-10 | |
git clone https://gitlab.com/libeigen/eigen.git | |
g++-10 -Ieigen -DEIGEN_TEST_PART_4=1 -O3 -mfma eigen/test/geo_orthomethods.cpp -o geo_orthomethods_4 | |
./geo_orthomethods_4 | |
``` | |
By storing the casted vector into `v2`, we seem to work around the issue.",Antonio Sánchez,2022-03-16T21:46:05.208Z,NA,NA,"## Title: | |
Work around g++-10 docker issue for geo_orthomethods_4. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses a compiler bug encountered in the g++-10 version of the `ubuntu:20.04` docker image, which led to a test failure in `geo_orthomethods_4`. The issue arises because the test vector is being optimized out in the docker environment, despite passing in a native installation of the same OS and compiler version. A workaround is implemented by storing the casted vector into `v2`, which resolves the issue. | |
### Key Changes: | |
- Introduced a workaround that prevents the optimization bug in g++-10 within the docker environment by modifying how the test vector is handled. | |
### Improvements: | |
- Ensured that tests in the docker environment consistently pass, thus improving reliability when using containerized builds. | |
### Impact: | |
- This fix enhances the robustness of the Eigen library's testing framework in docker environments, facilitating better development practices and reducing inconsistencies across different environments." | |
919 (https://gitlab.com/libeigen/eigen/-/merge_requests/919),Completed a missing parenthesis in tutorial.,Added a missing parenthesis in tutorial code.,Øystein Sørensen,2022-03-17T14:52:08.842Z,NA,NA,"## Title: | |
Completed a missing parenthesis in tutorial. | |
## Authors: | |
Øystein Sørensen | |
## Summary: | |
This merge request addresses a minor but important correction in the Eigen C++ library tutorial by adding a missing parenthesis in the example code. | |
### Key Changes: | |
- Added a missing parenthesis in the tutorial code. | |
### Improvements: | |
- Enhances code clarity and correctness for users learning from the tutorial. | |
### Impact: | |
- Helps prevent confusion and errors for users following the tutorial example, ensuring a smoother learning experience." | |
911 (https://gitlab.com/libeigen/eigen/-/merge_requests/911),Fix RowMajorBit <-> RowMajor mixup.," | |
### Reference issue | |
### What does this implement/fix? | |
This fixes a bit of code that assumes that `RowMajorBit == RowMajor` and `ColMajor == 0`. Both are true, but both shouldn't be relied on. | |
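A minimal sketch of the difference, using local stand-in constants that mirror the current values of Eigen's `RowMajorBit`, `RowMajor` and `ColMajor` so that it compiles without Eigen. All names ending in `Sketch` and the helper functions are hypothetical; the MR does not show the affected code.
```cpp
// Local stand-ins that mirror the values of Eigen's constants; they are
// redeclared here so the sketch compiles without Eigen.
enum StorageOptionsSketch { ColMajorSketch = 0, RowMajorSketch = 0x1 };
const unsigned int RowMajorBitSketch = 0x1;

// Fragile: compares a masked flag word with a storage option and only works
// because the bit and the option happen to share the value 1.
inline bool is_row_major_fragile(unsigned int flags) {
  return (flags & RowMajorBitSketch) == static_cast<unsigned int>(RowMajorSketch);
}

// Robust: asks only whether the bit is set.
inline bool is_row_major(unsigned int flags) {
  return (flags & RowMajorBitSketch) != 0;
}

// Robust: maps the bit to a storage option explicitly instead of reusing it.
inline StorageOptionsSketch storage_order(unsigned int flags) {
  return is_row_major(flags) ? RowMajorSketch : ColMajorSketch;
}
```
Both variants agree today; only the robust ones would keep working if the numeric values of the constants ever diverged.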
### Additional information | |
",Tobias Schlüter,2022-03-17T15:28:13.578Z,NA,NA,"## Title: | |
Fix RowMajorBit <-> RowMajor mixup. | |
## Authors: | |
Tobias Schlüter | |
## Summary: | |
This merge request fixes code in the Eigen C++ library that assumed `RowMajorBit == RowMajor` and `ColMajor == 0`. Both equalities currently hold, but code should not rely on them. | |
### Key Changes: | |
- Removed the code's reliance on `RowMajorBit` being equal to `RowMajor` and on `ColMajor` being 0. | |
### Improvements: | |
- Makes the code robust to a future change in the values of the storage-order constants. | |
### Impact: | |
This change will improve code reliability and maintainability, reducing potential bugs arising from incorrect assumptions about bit definitions in matrix storage order." | |
922 (https://gitlab.com/libeigen/eigen/-/merge_requests/922),Work around MSVC compiler bug dropping `const`.,"MSVC seems to drop the `const` from the underlying `Const**ReturnType` | |
when trying to match the out-of-line definition of `transpose()` and | |
`diagonal()` to the declaration. When using `is_same` and `is_const` | |
to inspect the types the `const` *is* actually there... it's just | |
ignored when trying to find the corresponding definition. | |
Adding an extra `const` seems to fix this. | |
Fixes #2464",Antonio Sánchez,2022-03-17T20:50:26.983Z,NA,NA,"## Title: | |
Work around MSVC compiler bug dropping `const`. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses a bug in the MSVC compiler that incorrectly drops the `const` qualifier from the types `Const**ReturnType` when matching out-of-line definitions of `transpose()` and `diagonal()` to their declarations. By adding an extra `const`, the issue is resolved. | |
### Key Changes: | |
- Added an extra `const` to the definitions of `transpose()` and `diagonal()` to ensure proper type matching with the declarations. | |
### Improvements: | |
- Resolves type consistency issues caused by the MSVC compiler bug, ensuring that the functionality of `transpose()` and `diagonal()` works as intended. | |
### Impact: | |
- Fixes compilation errors related to `const` handling in MSVC, improving compatibility and stability of the Eigen library on this compiler." | |
916 (https://gitlab.com/libeigen/eigen/-/merge_requests/916),Change EIGEN_ALTIVEC_ENABLE_MMA_DYNAMIC_DISPATCH and EIGEN_ALTIVEC_DISABLE_MMA flags to be like TensorFlow's...,"Change EIGEN_ALTIVEC_ENABLE_MMA_DYNAMIC_DISPATCH and EIGEN_ALTIVEC_DISABLE_MMA flags to be like a TensorFlow's flag (either 0 or 1 - rather than undefined or defined). This will allow TensorFlow to control this flag. | |
Update documentation of new Altivec MMA flags.",Chip Kerchner,2022-03-17T22:35:28.858Z,NA,NA,"## Title: | |
Change EIGEN_ALTIVEC_ENABLE_MMA_DYNAMIC_DISPATCH and EIGEN_ALTIVEC_DISABLE_MMA flags to be like TensorFlow's... | |
## Authors: | |
Chip Kerchner | |
## Summary: | |
This merge request updates the EIGEN_ALTIVEC_ENABLE_MMA_DYNAMIC_DISPATCH and EIGEN_ALTIVEC_DISABLE_MMA flags to function similarly to TensorFlow's implementation, allowing these flags to be set to either 0 or 1 rather than being undefined or defined. This modification enables more straightforward control by TensorFlow. | |
### Key Changes: | |
- Modified the EIGEN_ALTIVEC_ENABLE_MMA_DYNAMIC_DISPATCH and EIGEN_ALTIVEC_DISABLE_MMA flags to accept binary values (0 or 1). | |
- Updated documentation to reflect the new behavior of the Altivec MMA flags. | |
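The 0-or-1 convention described above can be sketched as follows. The default of 0 and the constant `kUseDynamicDispatchSketch` are assumptions for illustration; the MR text does not state Eigen's actual default or how the flag is consumed.
```cpp
// Give the flag a value when the build system did not set it, so that it is
// always defined and always 0 or 1. The default of 0 is an assumption.
#ifndef EIGEN_ALTIVEC_ENABLE_MMA_DYNAMIC_DISPATCH
#define EIGEN_ALTIVEC_ENABLE_MMA_DYNAMIC_DISPATCH 0
#endif

// Test the value with #if rather than the existence with #ifdef; this way
// passing -DEIGEN_ALTIVEC_ENABLE_MMA_DYNAMIC_DISPATCH=0 really disables the path.
#if EIGEN_ALTIVEC_ENABLE_MMA_DYNAMIC_DISPATCH
const bool kUseDynamicDispatchSketch = true;
#else
const bool kUseDynamicDispatchSketch = false;
#endif
```
With a defined-or-undefined flag, a build system that always passes the macro (set to 0 or 1) could not switch the feature off, which is why a value test is easier for an embedding project to control.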
### Improvements: | |
- Enhanced compatibility with TensorFlow by allowing for easier flag management. | |
- Improved user documentation regarding the new flag settings. | |
### Impact: | |
This change simplifies the integration process with TensorFlow, potentially improving performance and ease of use for developers working on applications that utilize both Eigen and TensorFlow." | |
923 (https://gitlab.com/libeigen/eigen/-/merge_requests/923),Fix AVX512 builds with MSVC.,"Fix AVX512 builds with MSVC. | |
MSVC doesn't allow `reinterpret_cast` between types `__m512` and | |
`__m512d` and the like. Requires the use of the explicit cast | |
intrinsic. | |
Also actually add the AVX512 test option for MSVC for our CI... | |
previously was always testing only SSE. | |
Fixes #2466.",Antonio Sánchez,2022-03-18T16:04:53.960Z,NA,NA,"## Title: | |
Fix AVX512 builds with MSVC. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses issues with building AVX512 support using Microsoft's Visual C++ (MSVC) compiler. It introduces necessary changes to ensure compatibility with specific type conversions and enhances testing coverage. | |
### Key Changes: | |
- Replaced `reinterpret_cast` between `__m512` and `__m512d` types with the appropriate explicit cast intrinsic. | |
- Added AVX512 test options for MSVC in the Continuous Integration (CI) process, which were previously limited to testing only SSE. | |
### Improvements: | |
- Improved compatibility of Eigen library with AVX512 when using MSVC. | |
- Ensured comprehensive testing for AVX512 support in CI, enhancing the reliability of the builds. | |
### Impact: | |
These changes fix the AVX512 build issues with MSVC, thereby allowing users to leverage the performance benefits of AVX512 support without compatibility problems. The addition of relevant tests will help catch any potential issues in future updates." | |
925 (https://gitlab.com/libeigen/eigen/-/merge_requests/925),Fix ODR violation in trsm.,"Functions need to be marked inline. | |
Fixes #2468.",Antonio Sánchez,2022-03-20T15:56:55.474Z,NA,NA,"## Title: | |
Fix ODR violation in trsm. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request fixes a One Definition Rule (ODR) violation in the trsm functionality of the Eigen C++ library by marking the affected functions `inline`. | |
### Key Changes: | |
- Functions in the trsm module have been modified to be marked as inline. | |
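The MR does not say which functions were affected, so the following is a generic sketch of why functions defined in a header need `inline`. All names are hypothetical.
```cpp
// Imagine this code lives in a header that several .cpp files include.

// A primary function template may be defined in a header as is.
template <typename T>
T scale_sketch(T x) { return x + x; }

// An explicit (full) specialization is an ordinary function. Without inline,
// every translation unit that includes the header would emit its own
// definition, which violates the ODR and typically fails at link time.
template <>
inline float scale_sketch<float>(float x) { return 2.0f * x; }

// The same holds for plain non-template functions defined in headers.
inline int add_one_sketch(int x) { return x + 1; }
```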
### Improvements: | |
- The change ensures compliance with the ODR, enhancing the stability and reliability of the trsm implementation. | |
### Impact: | |
- This fix prevents potential linker errors and undefined behavior when trsm is used from multiple translation units." | |
926 (https://gitlab.com/libeigen/eigen/-/merge_requests/926),Fix usages of wrong namespace,"### What does this implement/fix? | |
Fix compilation errors introduced by 421cbf086. | |
Changes were formatted and tested with SYCL tests.",Romain Biessy,2022-03-21T15:07:56.022Z,NA,NA,"## Title: | |
Fix usages of wrong namespace | |
## Authors: | |
Romain Biessy | |
## Summary: | |
This merge request addresses compilation errors caused by a previous commit (421cbf086). The issue was resolved by correcting the usage of namespaces in the code. | |
### Key Changes: | |
- Corrected instances of incorrect namespace usage that were leading to compilation errors. | |
### Improvements: | |
- Enhanced code stability by ensuring proper namespace alignment, thereby reducing potential conflicts. | |
### Impact: | |
- This fix allows for successful compilation of the Eigen library with SYCL tests, improving the overall reliability of the codebase." | |
927 (https://gitlab.com/libeigen/eigen/-/merge_requests/927),Update warning suppression to latest.,Fixes #2453.,Antonio Sánchez,2022-03-21T15:56:04.552Z,NA,NA,"## Title: | |
Update warning suppression to latest. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request updates the warning suppression methods in the Eigen C++ library to align with the latest standards. This change addresses issue #2453. | |
### Key Changes: | |
- Updated warning suppression techniques to the latest version. | |
### Improvements: | |
- Enhanced compatibility with newer compilers and coding standards. | |
### Impact: | |
- Reduces compatibility issues and ensures better adherence to current best practices in code maintenance." | |
921 (https://gitlab.com/libeigen/eigen/-/merge_requests/921),Optimize visitor traversal in case of RowMajor.,"Non-vectorized paths previously always traversed in col-major order. | |
Here we check for layout and traverse in row-major for RowMajor inputs. | |
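The idea can be sketched for a plain buffer: when the data is row-major, the outer loop should run over rows so that the inner loop touches contiguous memory. `MaxLocSketch` and `max_coeff_row_major` are hypothetical names, and the sketch assumes a non-empty matrix; it is not Eigen's visitor code.
```cpp
#include <cstddef>
#include <vector>

struct MaxLocSketch { double value; std::size_t row; std::size_t col; };

// Find the largest coefficient of a rows x cols matrix stored in row-major
// order, where element (i, j) lives at data[i * cols + j].
// Rows are the outer loop, so the inner loop walks contiguous memory.
inline MaxLocSketch max_coeff_row_major(const std::vector<double>& data,
                                        std::size_t rows, std::size_t cols) {
  MaxLocSketch best = {data[0], 0, 0};
  for (std::size_t i = 0; i < rows; ++i) {
    for (std::size_t j = 0; j < cols; ++j) {
      if (data[i * cols + j] > best.value) {
        best.value = data[i * cols + j];
        best.row = i;
        best.col = j;
      }
    }
  }
  return best;
}
```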
Fixes #2463.",Antonio Sánchez,2022-03-23T15:27:58.595Z,NA,NA,"## Title: | |
Optimize visitor traversal in case of RowMajor. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request introduces optimizations in the Eigen C++ library by adjusting the traversal method for visitor patterns when dealing with RowMajor inputs. Previously, non-vectorized paths only followed a column-major order, but this update allows them to properly traverse in row-major order when applicable. | |
### Key Changes: | |
- Implemented row-major traversal for non-vectorized paths specifically for RowMajor inputs. | |
### Improvements: | |
- Enhanced performance by optimizing traversal methods according to data layout, potentially reducing computation time for operations on row-major matrices. | |
### Impact: | |
This change improves efficiency in processing row-major matrices, addressing issue #2463 and contributing to more versatile and performant matrix operations in the Eigen library." | |
929 (https://gitlab.com/libeigen/eigen/-/merge_requests/929),Split general_matrix_vector_product interface for Power into two macros - one ColMajor and RowMajor.,Fixes TensorFlow compilation issues related to GEMV.,Chip Kerchner,2022-03-23T18:09:34.188Z,NA,NA,"## Title: | |
Split general_matrix_vector_product interface for Power into two macros - one ColMajor and RowMajor. | |
## Authors: | |
Chip Kerchner | |
## Summary: | |
This merge request addresses compilation issues encountered in TensorFlow related to the general matrix-vector product (GEMV) interface for Power architectures. It introduces a split in the interface by defining separate macros for column-major and row-major formats. | |
### Key Changes: | |
- Created two distinct macros for the general_matrix_vector_product: one for ColMajor and another for RowMajor. | |
### Improvements: | |
- Enhances compatibility with TensorFlow by resolving GEMV-related compilation issues. | |
### Impact: | |
- Improves the usability of the Eigen library with Power architectures, potentially increasing integration and performance in applications like TensorFlow." | |
798 (https://gitlab.com/libeigen/eigen/-/merge_requests/798),Add a NNLS solver to unsupported - issue #655,"### Reference issue | |
#655 | |
### What does this implement/fix? | |
This adds a non-negative least squares (NNLS) solver to unsupported. | |
The algorithm is the standard active-set algorithm from the text ""Solving Least Squares Problems"", Charles L. Lawson and Richard J. Hanson. | |
It's also the same algorithm described on the Wikipedia page https://en.wikipedia.org/wiki/Non-negative_least_squares (accessed 2022-01-04). | |
The API is similar to the Sparse Solver concept (i.e., it has `compute`, `solve`, and `info` methods). | |
However, it does not inherit from a Sparse Solver class, because this code applies to dense matrices. | |
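To make the problem statement concrete, here is a small sketch that minimizes the squared residual of `A x - b` subject to `x >= 0`. It deliberately uses projected gradient descent, which is a simpler technique than the Lawson-Hanson active-set algorithm this MR implements, and it does not mirror the `compute`/`solve`/`info` API. The function name and parameters are hypothetical, and the caller is assumed to supply a step no larger than the reciprocal of the largest eigenvalue of `A^T A`.
```cpp
#include <cstddef>
#include <vector>

// Minimize 0.5 * ||A x - b||^2 subject to x >= 0 by projected gradient descent.
// A is m x n in row-major order. Hypothetical illustration, not Eigen's solver.
inline std::vector<double> nnls_projected_gradient_sketch(
    const std::vector<double>& A, const std::vector<double>& b,
    std::size_t m, std::size_t n, double step, int iterations) {
  std::vector<double> x(n, 0.0), r(m, 0.0), g(n, 0.0);
  for (int it = 0; it < iterations; ++it) {
    for (std::size_t i = 0; i < m; ++i) {  // residual r = A x - b
      r[i] = -b[i];
      for (std::size_t j = 0; j < n; ++j) r[i] += A[i * n + j] * x[j];
    }
    for (std::size_t j = 0; j < n; ++j) {  // gradient g = A^T r
      g[j] = 0.0;
      for (std::size_t i = 0; i < m; ++i) g[j] += A[i * n + j] * r[i];
    }
    for (std::size_t j = 0; j < n; ++j) {  // gradient step, then project
      x[j] -= step * g[j];
      if (x[j] < 0.0) x[j] = 0.0;          // clamp onto the constraint x >= 0
    }
  }
  return x;
}
```
For an identity `A` and `b = (1, -1)`, the unconstrained solution would be `(1, -1)`; the constraint clamps the second component to zero.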
### Additional information | |
The code has a bit of history. | |
I believe it was started by Hannes Matuschek (who opened issue #655) in 2014 and that history is visible on his git repo https://github.com/hmatuschek/eigen3-nnls. | |
Based on the contents of that repo, I suspect it began life as an f2c translation from an older FORTRAN program. | |
Between 2018-2022 I'm aware of no work on this solver. | |
Then, at the beginning of 2022, I did some significant refactoring and testing to prepare this PR. | |
The hmatuschek/eigen3-nnls version has been in use for many years. | |
The main changes relative to that version are: | |
- lots more tests | |
- refactoring the API to resemble Eigen's other iterative solvers, e.g. the 'compute' and 'solve' and 'info' API. | |
- deleting some non-essential and broken parts",Essex Edwards,2022-03-23T20:20:45.502Z,NA,NA,"## Title: | |
Add a NNLS solver to unsupported - issue #655 | |
## Authors: | |
Essex Edwards | |
## Summary: | |
This merge request introduces a non-negative least squares (NNLS) solver to the Eigen C++ library, utilizing the active-set algorithm from ""Solving Least Squares Problems"" by Lawson and Hanson. The implementation features a user-friendly API that aligns with Eigen's other iterative solvers, despite being designed for dense matrices rather than sparse ones. | |
### Key Changes: | |
- Added NNLS solver based on an established active-set algorithm. | |
- API includes `compute`, `solve`, and `info` methods, similar to other Eigen iterative solvers. | |
- Enhanced testing coverage. | |
- Refactored code to improve usability and maintainability. | |
- Removed non-essential and broken components from previous versions. | |
### Improvements: | |
- Increased reliability and robustness through extensive testing. | |
- User-friendly API design that conforms with existing Eigen conventions. | |
### Impact: | |
The addition of the NNLS solver equips users with a powerful tool for solving non-negative least squares problems, enhancing the library's functionality and meeting a broader range of computational needs." | |
918 (https://gitlab.com/libeigen/eigen/-/merge_requests/918),Add missing explicit reinterprets," | |
### What does this implement/fix? | |
In https://gitlab.com/libeigen/eigen/-/merge_requests/834, explicit reinterprets are missing in `_mm512_shuffle_f32x4`, which causes build errors when using g++.",b-shi,2022-03-23T21:10:26.752Z,NA,NA,"## Title: | |
Add missing explicit reinterprets | |
## Authors: | |
b-shi | |
## Summary: | |
This merge request fixes build errors with g++ in the Eigen C++ library by adding the explicit reinterprets that were missing where the `_mm512_shuffle_f32x4` intrinsic is used. | |
### Key Changes: | |
- Added the missing explicit reinterprets where the `_mm512_shuffle_f32x4` intrinsic is used. | |
### Improvements: | |
- Enhanced compatibility with g++ by resolving build errors tied to the missing reinterprets. | |
### Impact: | |
- The changes improve the robustness of the Eigen library’s compatibility with the g++ compiler, reducing the likelihood of build errors related to this intrinsic." | |
930 (https://gitlab.com/libeigen/eigen/-/merge_requests/930),added a missing typename and fixed a unused typedef warning,this adds a missing typename that breaks compilation on gcc 9. Also removes an unused typedef to get rid of the associated warning.,Erik Schultheis,2022-03-24T19:48:47.393Z,NA,NA,"## Title: | |
Added a missing typename and fixed an unused typedef warning | |
## Authors: | |
Erik Schultheis | |
## Summary: | |
This merge request addresses a compilation issue in GCC 9 by adding a missing typename. It also removes an unused typedef to eliminate a related warning during compilation. | |
### Key Changes: | |
- Added a missing typename to ensure compatibility with GCC 9. | |
- Removed an unused typedef to resolve associated warnings. | |
### Improvements: | |
- Enhanced code compatibility with GCC 9, preventing compilation failures. | |
- Cleaned up code by eliminating unnecessary warnings, improving overall code quality. | |
### Impact: | |
These changes improve the compilation process on GCC 9 and contribute to cleaner code, which benefits maintainability and reduces potential confusion for developers." | |
931 (https://gitlab.com/libeigen/eigen/-/merge_requests/931),Enable Aarch64 CI,Aarch64 CI machines are back online so we should reenable those pipelines.,Everton Constantino,2022-03-24T20:10:51.717Z,NA,NA,"## Title: | |
Enable Aarch64 CI | |
## Authors: | |
Everton Constantino | |
## Summary: | |
This merge request re-enables the Continuous Integration (CI) pipelines for Aarch64 machines following their restoration. | |
### Key Changes: | |
- Reinstatement of Aarch64 CI pipelines. | |
### Improvements: | |
- Allows for testing and validation of the Eigen library on Aarch64 architecture. | |
### Impact: | |
- Improved reliability and support for Aarch64 users, ensuring code quality across more platforms." | |
892 (https://gitlab.com/libeigen/eigen/-/merge_requests/892),"Add is_constant_evaluated, update alignment checks"," | |
### Reference issue | |
### What does this implement/fix? | |
This is the second of the separate MRs for MR https://gitlab.com/libeigen/eigen/-/merge_requests/881 | |
It adds a wrapper for `std::is_constant_evaluated` that evaluates to false for C++ versions that lack that function. The alignment check assertions are disabled during constant evaluation using the new wrapper. Lastly, in line with the existing comment in the `block_evaluator` constructor, `eigen_assert` there is replaced with `eigen_internal_assert` (this is an unrelated change of course, but it seemed prudent to do it while someone is actually looking at the code).",Tobias Schlüter,2022-03-25T04:00:58.776Z,NA,NA,"## Title: | |
Add is_constant_evaluated, update alignment checks | |
## Authors: | |
Tobias Schlüter | |
## Summary: | |
This merge request introduces functionality for better C++ constant evaluation support and updates the alignment check assertions in the Eigen C++ library. | |
### Key Changes: | |
- Added a wrapper for `std::is_constant_evaluated` that returns false for C++ versions without this feature. | |
- Disabled alignment check assertions during constant evaluations using the new wrapper. | |
- Replaced `eigen_assert` with `eigen_internal_assert` in the `block_evaluator` constructor. | |
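A wrapper of the kind listed above might look like the following sketch. The name `is_constant_evaluated_sketch` and the use of the standard `__cpp_lib_is_constant_evaluated` feature-test macro are assumptions for illustration; the MR text does not show how Eigen detects the feature.
```cpp
#include <type_traits>

#if defined(__cpp_lib_is_constant_evaluated) && __cpp_lib_is_constant_evaluated >= 201811L
// C++20 and later: forward to the standard library function.
constexpr bool is_constant_evaluated_sketch() noexcept {
  return std::is_constant_evaluated();
}
#else
// Older standards: the query is unavailable, so conservatively answer false.
constexpr bool is_constant_evaluated_sketch() noexcept {
  return false;
}
#endif
```
Code that must not run during constant evaluation, such as a pointer alignment check, can then be guarded by this wrapper and still compile under older standards, where the guard is simply never taken.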
### Improvements: | |
- Enhanced compatibility with older C++ versions by safeguarding against the absence of `std::is_constant_evaluated`. | |
- Improved code clarity and assertion handling by using a more appropriate assertion function in the constructor. | |
### Impact: | |
These changes improve the robustness of the Eigen library during constant evaluations and ensure that assertion checks are appropriately handled in varied compilation scenarios, contributing to overall code quality and reliability." | |
937 (https://gitlab.com/libeigen/eigen/-/merge_requests/937),Eliminate trace unused warning.,NA,Antonio Sánchez,2022-03-29T22:25:20.090Z,NA,NA,"## Title: | |
Eliminate trace unused warning | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses and removes warnings related to unused trace statements in the Eigen C++ library. | |
### Key Changes: | |
- Eliminated warnings associated with unused trace code. | |
### Improvements: | |
- Cleaner code with no unnecessary warnings during compilation. | |
### Impact: | |
- Enhances the user experience by reducing noise in the build process, leading to a more efficient development environment." | |
924 (https://gitlab.com/libeigen/eigen/-/merge_requests/924),Disable f16c scalar conversions for MSVC.,"MSVC seems to be lacking the scalar conversion instructions. | |
It has the vector ones. | |
Bug reported over at TensorFlow: https://github.com/tensorflow/tensorflow/issues/54397 | |
Similar issue found in an Intel repo: https://github.com/intel/tinycbor/pull/193",Antonio Sánchez,2022-03-30T18:35:33.143Z,NA,NA,"## Title: | |
Disable f16c scalar conversions for MSVC. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses the issue of unsupported f16c scalar conversions in the MSVC compiler, which only has vector support for these conversions. The change disables these scalar conversions to prevent compatibility issues. | |
### Key Changes: | |
- Disabled f16c scalar conversions specifically for the MSVC compiler. | |
### Improvements: | |
- Ensures that the Eigen library remains compatible with MSVC by avoiding unsupported operations that could lead to errors. | |
### Impact: | |
- Enhances reliability of the Eigen library when compiled with MSVC, reducing potential bugs and increasing stability in scenarios where scalar conversions are relevant." | |
939 (https://gitlab.com/libeigen/eigen/-/merge_requests/939),Don't include .cpp in lapack.,"This is bad practice in general. | |
I didn't rename these as `.h` since they're not really headers | |
either, they are special macro-dependent implementation details | |
that *could* be considered ""textual headers"" from a modules | |
perspective (https://clang.llvm.org/docs/Modules.html).",Antonio Sánchez,2022-03-30T21:41:57.372Z,NA,NA,"## Title: | |
Don't include .cpp in lapack. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request stops the LAPACK module from including `.cpp` files, which is bad practice in general. The files were not renamed to `.h` because they are not really headers either; they are macro-dependent implementation details that could be considered ""textual headers"" from a modules perspective. | |
### Key Changes: | |
- Removal of `.cpp` file inclusions in the LAPACK module. | |
### Improvements: | |
- Enhances code clarity by avoiding the misuse of file extensions. | |
### Impact: | |
- Promotes better coding practices within the Eigen codebase, ensuring a more organized and maintainable code structure." | |
934 (https://gitlab.com/libeigen/eigen/-/merge_requests/934),fixed order of arguments in blas syrk,"During my work on !906 I came across this error: The order of the template arguments is wrong. This becomes noticeable as a compile error once the Row/ColMajor arguments are no longer implicitly convertible to/from bool -- then the template instantiation will be rejected. | |
TODO: Figure out why this was not caught by any automated test. | |
Verify that the new code is actually doing what BLAS specifies.",Erik Schultheis,2022-03-30T22:05:08.191Z,NA,NA,"## Title: | |
Fixed order of arguments in BLAS SYRK | |
## Authors: | |
Erik Schultheis | |
## Summary: | |
This merge request addresses an issue related to the order of template arguments in the BLAS SYRK implementation of the Eigen C++ library. It corrects a compilation error that arises when the Row/ColMajor arguments are not implicitly convertible to/from bool. | |
### Key Changes: | |
- Corrected the order of template arguments in the BLAS SYRK implementation. | |
### Improvements: | |
- Fixes a compile error that appears once the Row/ColMajor arguments are no longer implicitly convertible to/from bool, at which point the template instantiation is rejected. | |
### Impact: | |
- The error was found during work on !906. The MR description leaves two items open: finding out why no automated test caught the error, and verifying that the new code does what BLAS specifies." | |
941 (https://gitlab.com/libeigen/eigen/-/merge_requests/941),Consider inf/nan in scalar test_isApprox.,"Otherwise we periodically get comparisons of `VERIFY_IS_APPROX(inf, | |
inf)`, which probably should be `true`, but instead fails.",Antonio Sánchez,2022-04-01T17:00:25.259Z,NA,NA,"## Title: | |
Consider inf/nan in scalar test_isApprox. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request modifies the behavior of the scalar `test_isApprox` function in the Eigen C++ library to correctly handle comparisons involving infinity (`inf`) and not-a-number (`nan`) values. Previously, comparisons such as `VERIFY_IS_APPROX(inf, inf)` would fail, despite logically being expected to return `true`. | |
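The failure mode can be illustrated with a minimal sketch. The helper name `approx_equal_with_nonfinite`, the tolerance, and the choice to treat two NaNs as a match are assumptions made for illustration; this is not Eigen's `test_isApprox` implementation:

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical helper, not Eigen's actual code.
inline bool approx_equal_with_nonfinite(double a, double b, double tol = 1e-12) {
  // Exact equality covers inf == inf and -inf == -inf.
  if (a == b) return true;
  // Assumption of this sketch: two NaNs count as a match.
  if (std::isnan(a) && std::isnan(b)) return true;
  // Any remaining non-finite operand cannot be approximately equal.
  if (!std::isfinite(a) || !std::isfinite(b)) return false;
  // Relative comparison. For inf and inf, a - b would be NaN and the
  // comparison would fail, which is why that case is handled first.
  return std::abs(a - b) <= tol * std::min(std::abs(a), std::abs(b));
}
```

Without the early exact-equality check, a purely relative comparison of `inf` with `inf` evaluates `inf - inf`, which is NaN, so the check would report a mismatch.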
### Key Changes: | |
- Updated the `test_isApprox` function to accommodate `inf` and `nan` comparisons. | |
### Improvements: | |
- Ensures that comparing `inf` with `inf` passes instead of failing. | |
- Prevents false failures in tests that rely on the approximation checks. | |
### Impact: | |
This change enhances the robustness of the library's testing framework by providing accurate behavior for edge cases involving infinite and NaN values, resulting in more reliable unit tests and less likelihood of misleading results." | |
940 (https://gitlab.com/libeigen/eigen/-/merge_requests/940),Add back std::remove* aliases - third-party libraries rely on these.,"Removing them broke a bunch of 3P libs that need to be updated. That will likely take some time, so putting aliases back in.",Antonio Sánchez,2022-04-01T17:23:11.380Z,NA,NA,"## Title: | |
Add back std::remove* aliases - third-party libraries rely on these. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request aims to reintroduce the `std::remove*` aliases which were previously removed. The removal caused issues for numerous third-party libraries that depend on these aliases, necessitating their restoration to facilitate smoother integration and functionality. | |
### Key Changes: | |
- Reintroduced the `std::remove*` aliases. | |
### Improvements: | |
- Restoring compatibility for third-party libraries that rely on the `std::remove*` functionality. | |
### Impact: | |
- Reduces the immediate burden on third-party library maintainers by allowing existing codebases to function without requiring immediate updates." | |
854 (https://gitlab.com/libeigen/eigen/-/merge_requests/854),Added Scaling function overload for vector rvalue reference,"### Reference issue | |
#2431 | |
### What does this implement/fix? | |
Provides an overload to `Scaling` that takes an rvalue reference, so that if a user attempts to generate a diagonal (scaling) matrix from one, e.g. a temporary Vector<Scalar, Size>, they obtain a valid matrix. The current `Scaling` overload that is used when an eigen user tries to do this returns a `DiagonalWrapper<Derived>` unlike all the other `Scaling` functions. | |
The linked issue explains the issue in more detail, and shows an example of a user falling into this trap, as well as a code example demonstrating this overload working in the desired case. | |
### Additional information | |
An alternative fix is to not supply an additional overload, but change the current overload that takes a vector and returns a `DiagonalWrapper<Derived>` and instead return a `DiagonalMatrix`, such as by the method used in the new overload in this MR.",William Talbot,2022-04-04T16:50:09.765Z,NA,NA,"## Title: | |
Added Scaling function overload for vector rvalue reference | |
## Authors: | |
William Talbot | |
## Summary: | |
This merge request introduces an additional overload for the `Scaling` function in the Eigen C++ library, specifically designed to handle rvalue references of vectors. This enhancement addresses an issue where previously users would receive a `DiagonalWrapper<Derived>` instead of the intended `DiagonalMatrix`. | |
### Key Changes: | |
- Added an overload for the `Scaling` function that accepts an rvalue reference to a vector. | |
- Ensured that temporary vectors, when used to create a scaling matrix, produce a valid `DiagonalMatrix`. | |
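A rough sketch of the overload pattern, assuming the trap is a wrapper that keeps a reference to a temporary vector. The types `DiagonalView` and `DiagonalOwned` and the function `make_scaling` are hypothetical stand-ins, not Eigen's `DiagonalWrapper`, `DiagonalMatrix`, or `Scaling`:

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Non-owning view: only valid while the referenced vector is alive.
struct DiagonalView {
  const std::vector<double>& diag;
  double operator()(std::size_t i, std::size_t j) const { return i == j ? diag[i] : 0.0; }
};

// Owning diagonal: stores its own coefficients.
struct DiagonalOwned {
  std::vector<double> diag;
  double operator()(std::size_t i, std::size_t j) const { return i == j ? diag[i] : 0.0; }
};

// Lvalue overload: wrapping by reference is fine while the caller keeps the vector.
inline DiagonalView make_scaling(const std::vector<double>& v) { return DiagonalView{v}; }

// Rvalue overload: take ownership of the temporary so nothing dangles.
inline DiagonalOwned make_scaling(std::vector<double>&& v) { return DiagonalOwned{std::move(v)}; }
```

Overload resolution prefers the rvalue-reference overload for temporaries, so callers passing a temporary get the owning type without changing their code.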
### Improvements: | |
- Resolves a potential source of confusion for users attempting to generate a diagonal matrix from temporary vectors. | |
- Aligns the behavior of the new overload with existing `Scaling` functionalities, providing a more consistent user experience. | |
### Impact: | |
This change significantly improves usability and correctness by preventing users from inadvertently working with an unsuitable `DiagonalWrapper`, thus enhancing the development experience in Eigen for users employing scaling operations on temporary vectors." | |
904 (https://gitlab.com/libeigen/eigen/-/merge_requests/904),static const class members turned into constexpr,"This MR turns `static const` class member variables into `static constexpr`. | |
Based on the discussion in the other MR, I've left the enums as they are. | |
There are also some `static const` variables at function scope, I haven't looked at those.",Erik Schultheis,2022-04-04T17:33:33.782Z,NA,NA,"## Title: | |
static const class members turned into constexpr | |
## Authors: | |
Erik Schultheis | |
## Summary: | |
This merge request turns `static const` class member variables into `static constexpr`, so that they are declared explicitly as compile-time constants within the Eigen C++ library. | |
### Key Changes: | |
- Conversion of `static const` class member variables to `static constexpr`. | |
- Retention of enums in their original form, as discussed in a prior merge request. | |
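The kind of change involved can be sketched as follows. The struct `shape_traits` and its members are hypothetical and used only for illustration; the sketch assumes a C++17 compiler for the message-less `static_assert`:

```cpp
#include <cassert>

// Hypothetical traits struct, not taken from Eigen.
template <int Rows, int Cols>
struct shape_traits {
  // Before the change this would have read: static const int Size = Rows * Cols;
  static constexpr int Size = Rows * Cols;
  static constexpr bool IsVector = (Rows == 1 || Cols == 1);
};

// constexpr members can be used wherever a constant expression is required.
static_assert(shape_traits<3, 4>::Size == 12);
static_assert(shape_traits<1, 4>::IsVector);
```

Since C++17, `static constexpr` data members are implicitly inline, so they need no separate out-of-class definition even when they are odr-used.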
### Improvements: | |
- States explicitly that these members are immutable compile-time values. | |
### Impact: | |
- The change is limited to class members: enums are left as they are, and `static const` variables at function scope were not examined. No performance measurements are reported." | |
943 (https://gitlab.com/libeigen/eigen/-/merge_requests/943),More constexpr helpers,"This MR converts some helper functions in `XprHelper.h` from template metaprogramming to `constexpr` functions. The changed functions are | |
* `compute_default_alignment_helper` | |
* `compute_matrix_flags` | |
* `size_at_compile_time` | |
When replacing usages of the latter, I noticed that in many cases the call site could actually be simplified by using the already existing helper `size_of_xpr_at_compile_time`, so I've updated these. | |
In order to get these to compile, I also had to mark `ignore_unused_variable` as a `constexpr` function.",Erik Schultheis,2022-04-04T18:38:35.562Z,NA,NA,"## Title: | |
More constexpr helpers | |
## Authors: | |
Erik Schultheis | |
## Summary: | |
This merge request converts several helper functions in `XprHelper.h` from template metaprogramming to `constexpr` functions. | |
### Key Changes: | |
- Conversion of the following helper functions to `constexpr`: | |
- `compute_default_alignment_helper` | |
- `compute_matrix_flags` | |
- `size_at_compile_time` | |
- Simplification of call sites by replacing certain usages with the existing helper `size_of_xpr_at_compile_time`. | |
- Marking `ignore_unused_variable` as a `constexpr` function to facilitate compilation. | |
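The difference between the two styles can be sketched as below. The names `kDynamic`, `size_at_compile_time_tmpl`, and `size_at_compile_time_fn` are hypothetical, and the body is an assumption about what such a helper might compute, not a copy of the code in `XprHelper.h`:

```cpp
#include <cassert>

// Hypothetical stand-in for a dynamic-size marker.
constexpr int kDynamic = -1;

// Template metaprogramming style: the result is a nested constant.
template <int Rows, int Cols>
struct size_at_compile_time_tmpl {
  static constexpr int value =
      (Rows == kDynamic || Cols == kDynamic) ? kDynamic : Rows * Cols;
};

// constexpr function style: ordinary function syntax, still usable at compile time.
constexpr int size_at_compile_time_fn(int rows, int cols) {
  return (rows == kDynamic || cols == kDynamic) ? kDynamic : rows * cols;
}

// Both forms agree and both are evaluated by the compiler.
static_assert(size_at_compile_time_tmpl<3, 4>::value == size_at_compile_time_fn(3, 4));
static_assert(size_at_compile_time_fn(kDynamic, 4) == kDynamic);
```

The function form reads like regular code and can also be called with runtime arguments, which the template form cannot.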
### Improvements: | |
- The helpers are now ordinary `constexpr` functions, which are easier to read and call than the template metaprogramming versions. | |
- Call sites are simpler where `size_of_xpr_at_compile_time` replaces a manual computation. | |
### Impact: | |
The results are still computed at compile time, so no runtime behavior change is described. The main effect is a clearer codebase for developers working with these helpers." | |
936 (https://gitlab.com/libeigen/eigen/-/merge_requests/936),Performance improvements in GEMM for Power,"Added vector_pair loads for LHS of GEMM for MMA (10% faster) | |
An extra accumulator for extra_row of GEMM for MMA & VSX (non-vectorized right portion of the matrix executes for essentially free in almost all cases - 1200% / number of columns eliminated) | |
Single pass for extra_col of GEMM for VSX (bottom of the matrix executes in a single pass versus up to 3 passes - 2400% / number of rows faster). | |
Other minor performance changes.",Chip Kerchner,2022-04-05T12:18:53.720Z,NA,NA,"## Title: | |
Performance improvements in GEMM for Power | |
## Authors: | |
Chip Kerchner | |
## Summary: | |
This merge request introduces performance improvements for General Matrix Multiply (GEMM) on the Power architecture, covering both the Matrix-Multiply Assist (MMA) and Vector-Scalar Extension (VSX) code paths. | |
### Key Changes: | |
- Implemented vector_pair loads for the left-hand side (LHS) of GEMM for MMA, reported as 10% faster. | |
- Introduced an extra accumulator for the extra_row of GEMM for MMA and VSX, so the non-vectorized right portion of the matrix executes essentially for free in almost all cases (reported as 1200% / number of columns eliminated). | |
- Processed the extra_col of GEMM for VSX in a single pass instead of up to 3 passes (reported as 2400% / number of rows faster). | |
### Improvements: | |
- Overall GEMM performance is significantly enhanced due to optimized load mechanisms and reduced computational overhead. | |
- The changes streamline the execution for matrix portions that previously required multiple processing passes. | |
### Impact: | |
These performance improvements can vastly accelerate matrix operations in applications utilizing the Eigen library on Power architecture, particularly benefiting computations involving large matrices and enhancing the efficiency of numerical algorithms." | |
944 (https://gitlab.com/libeigen/eigen/-/merge_requests/944),constexpr reshape helper,This changes one more metaprogramming utility from template to constexpr function.,Erik Schultheis,2022-04-05T17:32:18.302Z,NA,NA,"## Title: | |
constexpr reshape helper | |
## Authors: | |
Erik Schultheis | |
## Summary: | |
This merge request converts one more metaprogramming utility, the reshape helper, from a template to a `constexpr` function. | |
### Key Changes: | |
- Conversion of the reshape metaprogramming utility to a `constexpr` function. | |
### Improvements: | |
- Simplified the code structure for better readability and maintenance. | |
### Impact: | |
The change continues the template-to-`constexpr` conversions from earlier merge requests and may make the helper easier to debug and modify." | |
942 (https://gitlab.com/libeigen/eigen/-/merge_requests/942),Fix navbar scroll with toc.,"Looks like doxygen's navtree js has changed, requiring us to change our | |
override hacks to resize the navbar to fit our TOC. Because the `resizeHeight` | |
function is now privately nested in another function, we need to override the | |
entire outer `initResizable()`. | |
Also requires adjusting the position - the absolute position for | |
`div.toc` seems to interfere with scrollbars. | |
Fixes #2467",Antonio Sánchez,2022-04-05T20:14:22.809Z,NA,NA,"## Title: | |
Fix navbar scroll with toc. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses issues with the navigation bar's scrolling functionality due to changes in Doxygen's navtree JavaScript. The solution involves overriding the `initResizable()` function to accommodate these changes and adjusting the position of the table of contents (TOC) to ensure proper scrollbar functionality. | |
### Key Changes: | |
- Overrode the `initResizable()` function to adapt to Doxygen's updated JS. | |
- Adjusted the positioning of `div.toc` to eliminate interference with scrollbars. | |
### Improvements: | |
- Enhances the usability of the navigation bar when scrolling, ensuring it fits the TOC correctly. | |
### Impact: | |
- Resolves the issue reported in #2467, thereby improving the overall user experience of the documentation navigation." | |
945 (https://gitlab.com/libeigen/eigen/-/merge_requests/945),Fix some max size expressions.,These were accidentally replaced in !943.,Antonio Sánchez,2022-04-06T22:19:58.469Z,NA,NA,"## Title: | |
Fix some max size expressions. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses an issue where certain max size expressions were unintentionally modified in a previous merge request (!943). | |
### Key Changes: | |
- Restoration of the originally intended max size expressions that were incorrectly altered. | |
### Improvements: | |
- Ensures that the max size expressions function as originally designed, preserving the intended behavior of the library. | |
### Impact: | |
- Fixes potential issues arising from incorrect max size calculations, which could lead to errors or unexpected behavior in applications using the Eigen library." | |
949 (https://gitlab.com/libeigen/eigen/-/merge_requests/949),Fix ODR issues in lapacke_helpers.,Fixes #2473,Antonio Sánchez,2022-04-08T15:31:30.863Z,NA,NA,"## Title: | |
Fix ODR issues in lapacke_helpers. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses and resolves the One Definition Rule (ODR) issues identified in the `lapacke_helpers` module of the Eigen C++ library. | |
### Key Changes: | |
- Fixed ODR violations in the `lapacke_helpers` code. | |
### Improvements: | |
- Ensured more consistent behavior of the library by adhering to ODR principles, which helps avoid potential runtime issues. | |
### Impact: | |
- Enhances the reliability of the `lapacke_helpers` module by removing the ODR violations reported in #2473." | |
948 (https://gitlab.com/libeigen/eigen/-/merge_requests/948),Fix MSVC+CUDA issues.,"Darn MSVC+CUDA gets confused for diagonal and transpose again, not able | |
to match out-of-line definitions with the corresponding declarations. | |
~~Removed~~ Modified internal typedefs ~~and just use the true type~~ to get around this. | |
MSVC also complained about not passing enough arguments to function-like | |
macro, and about invalid friend declarations. Removed unused macro | |
argument, and explicitly specified friend classes to get around these.",Antonio Sánchez,2022-04-08T18:05:33.193Z,NA,NA,"## Title: | |
Fix MSVC+CUDA issues. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses various compatibility issues between Microsoft Visual C++ (MSVC) and CUDA, particularly related to diagonal and transpose functionality in the Eigen C++ library. | |
### Key Changes: | |
- Modified internal typedefs to resolve confusion in matching out-of-line definitions with declarations. | |
- Adjusted a function-like macro to remove an unused argument, which was causing warnings about insufficient arguments. | |
- Explicitly specified friend classes to fix issues with invalid friend declarations. | |
### Improvements: | |
- Enhanced compatibility with MSVC and CUDA, leading to fewer compilation errors and warnings. | |
- Improved code clarity and maintainability by simplifying typedefs and macro usage. | |
### Impact: | |
These changes help ensure that the Eigen library compiles cleanly when used with MSVC and CUDA, promoting better usability for developers working in these environments." | |
946 (https://gitlab.com/libeigen/eigen/-/merge_requests/946),Remove EIGEN_EMPTY_STRUCT_CTOR,"### Reference issue | |
### What does this implement/fix? | |
This removes the old empty struct workaround for gcc as discussed in !899 | |
### Additional information | |
",Tobias Schlüter,2022-04-08T18:47:57.892Z,NA,NA,"## Title: | |
Remove EIGEN_EMPTY_STRUCT_CTOR | |
## Authors: | |
Tobias Schlüter | |
## Summary: | |
This merge request removes `EIGEN_EMPTY_STRUCT_CTOR`, the old empty struct workaround for GCC, as discussed in !899. | |
### Key Changes: | |
- Elimination of the `EIGEN_EMPTY_STRUCT_CTOR` macro which was used for compatibility with older GCC versions. | |
### Improvements: | |
- Simplification of the codebase by removing obsolete constructs, leading to better clarity and maintainability. | |
### Impact: | |
- This change streamlines the handling of empty structs, which may improve compilation processes and reduce potential confusion around legacy code for developers using the Eigen library." | |
953 (https://gitlab.com/libeigen/eigen/-/merge_requests/953),Fix ambiguous DiagonalMatrix constructors.,"The following became ambiguous: | |
``` | |
const Eigen::DiagonalMatrix<double, 4> m({1, -1, -1, 1}); | |
``` | |
since the initializer list `{1, -1, -1, 1}` could create either a | |
`DiagonalMatrix` with list of scalars, *or* a `DiagonalVectorType` with list of scalars. | |
Added a single initializer list constructor to avoid this ambiguity.",Antonio Sánchez,2022-04-11T19:13:26.180Z,NA,NA,"## Title: | |
Fix ambiguous DiagonalMatrix constructors. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses an ambiguity issue in the Eigen C++ library related to the construction of `DiagonalMatrix` objects using initializer lists. | |
### Key Changes: | |
- Introduced a single initializer list constructor for `DiagonalMatrix` to clarify the construction process and avoid ambiguity with `DiagonalVectorType`. | |
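The effect of a dedicated initializer-list constructor can be sketched with a small stand-alone type. The class `Diag` is hypothetical and much simpler than Eigen's `DiagonalMatrix`; it only shows how a braced list of scalars binds to one constructor:

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <initializer_list>

// Hypothetical fixed-size diagonal type, not Eigen's DiagonalMatrix.
template <typename Scalar, std::size_t N>
class Diag {
 public:
  // A single initializer-list constructor gives a braced list of scalars
  // one obvious overload to bind to.
  Diag(std::initializer_list<Scalar> values) {
    assert(values.size() == N);
    std::size_t i = 0;
    for (const Scalar& v : values) diag_[i++] = v;
  }

  // Element access: diagonal entries are stored, everything else is zero.
  Scalar operator()(std::size_t i, std::size_t j) const {
    return i == j ? diag_[i] : Scalar(0);
  }

 private:
  std::array<Scalar, N> diag_{};
};
```

With this constructor, a declaration such as `const Diag<double, 4> m({1, -1, -1, 1});` selects the initializer-list overload directly.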
### Improvements: | |
- Eliminated the potential confusion when initializing `DiagonalMatrix` with lists of scalars, ensuring that the intended type is constructed. | |
### Impact: | |
- This change prevents compile-time errors and improves the usability of the `DiagonalMatrix` class by providing a clear and unambiguous way to create diagonal matrices from initializer lists." | |
951 (https://gitlab.com/libeigen/eigen/-/merge_requests/951),Fix Power GEMV order of operations in predux for MMA.,A rare unit test failure showed a difference of the order of operations in predux (add all the elements together for a Packet) for GEMV MMA. Changing it to be similar to the VSX version and reduce the number of instructions from 20 to 7. Fix some inline assembly for GCC that no longer compiles/assembles.,Chip Kerchner,2022-04-11T21:29:06.196Z,NA,NA,"## Title: | |
Fix Power GEMV order of operations in predux for MMA. | |
## Authors: | |
Chip Kerchner | |
## Summary: | |
This merge request addresses a rare unit test failure related to the order of operations in the predux function for GEMV MMA. The changes align the implementation more closely with the VSX version, optimizing performance and fixing compatibility issues in inline assembly for GCC. | |
### Key Changes: | |
- Corrected the order of operations in the predux function for GEMV MMA. | |
- Reduced the number of instructions from 20 to 7. | |
- Fixed inline assembly issues that were preventing proper compilation/assembly with GCC. | |
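Why the order of operations in a reduction matters can be shown with a small sketch. The two functions below are hypothetical and are not the PowerPC implementation; they only demonstrate that floating-point addition is not associative, assuming `float` arithmetic is evaluated in single precision:

```cpp
#include <cassert>

// Left-to-right reduction of four lanes.
inline float reduce_sequential(const float v[4]) {
  return ((v[0] + v[1]) + v[2]) + v[3];
}

// Pairwise reduction, closer to what a SIMD horizontal add might do.
inline float reduce_pairwise(const float v[4]) {
  return (v[0] + v[2]) + (v[1] + v[3]);
}
```

For the input `{1e8f, 1.0f, -1e8f, 1.0f}`, the sequential version loses the first `1.0f` when it is added to `1e8f`, while the pairwise version cancels the large terms first and keeps both small ones. Two mathematically equivalent reductions can therefore disagree, which is enough to trip a tolerance-based unit test.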
### Improvements: | |
- Enhanced efficiency of the predux calculation by minimizing instruction count. | |
- Increased compatibility with GCC through fixes in inline assembly. | |
### Impact: | |
The changes improve performance and reliability of the GEMV MMA operations, reducing the likelihood of unit test failures and enhancing overall stability in the Eigen library." | |
952 (https://gitlab.com/libeigen/eigen/-/merge_requests/952),Allow all tests to pass with `EIGEN_TEST_NO_EXPLICIT_VECTORIZATION`,"Some tests currently fail due to alignment assumptions. Here we just | |
work around the failing tests. | |
Identified in #2470.",Antonio Sánchez,2022-04-12T14:48:24.054Z,NA,NA,"## Title: | |
Allow all tests to pass with `EIGEN_TEST_NO_EXPLICIT_VECTORIZATION` | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses the issue where certain tests in the Eigen C++ library fail due to alignment assumptions, allowing all tests to pass by applying workarounds for these failing cases. | |
### Key Changes: | |
- Introduced a workaround for tests that fail due to alignment issues under the `EIGEN_TEST_NO_EXPLICIT_VECTORIZATION` setting. | |
### Improvements: | |
- Enhances test stability by ensuring all tests can pass even with existing alignment assumptions. | |
### Impact: | |
- Improves the robustness of the test suite, making it easier to validate changes and maintain the code quality of the Eigen library. This could lead to a more reliable development process and fewer disruptions caused by test failures." | |
959 (https://gitlab.com/libeigen/eigen/-/merge_requests/959),"Restrict new AVX512 trsm to AVX512VL, rename files for consistency.","Some of the newly added AVX512 packetmath functions, including | |
the masked add for AVX require `__AVX512VL__`. | |
Also renamed headers for consistency with the rest of Eigen: | |
- upper-camel case `.h` | |
- non-standard headers are `.inc` (i.e. require being included in odd places)",Antonio Sánchez,2022-04-14T16:58:33.189Z,NA,NA,"## Title: | |
Restrict new AVX512 trsm to AVX512VL, rename files for consistency. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request restricts the new AVX512 trsm implementation to builds with `__AVX512VL__`, because some of the newly added AVX512 packetmath functions, including the masked add for AVX, require it. It also renames header files for consistency with the rest of Eigen. | |
### Key Changes: | |
- Restricted the new AVX512 trsm code to builds where `__AVX512VL__` is defined. | |
- Renamed header files to follow a consistent naming convention: | |
- Upper-camel case for `.h` files. | |
- Non-standard headers now use the `.inc` extension. | |
### Improvements: | |
- Enhances code consistency within the Eigen library by standardizing header file naming. | |
- Improves clarity regarding the required hardware capabilities for certain functions. | |
### Impact: | |
Guarding the trsm code with `__AVX512VL__` keeps it from being compiled for targets that lack the required instructions. The consistent file naming may also make the code easier to maintain." | |
960 (https://gitlab.com/libeigen/eigen/-/merge_requests/960),Remove AVX512VL dependency in trsm,"### Reference issue | |
### What does this implement/fix? | |
This PR addresses the issue mentioned in https://gitlab.com/libeigen/eigen/-/merge_requests/959. The `_mm256_mask*` intrinsics are not supported in `AVX512F` (`-mfma -mavx512f`) and require `AVX512F + AVX512VL`. To fix this we switch to the corresponding `_mm512_mask*` intrinsics and reinterpret `zmm <-> ymm` when necessary. | |
### Additional information | |
In https://gitlab.com/libeigen/eigen/-/merge_requests/834 `-march=native` was used for performance testing. With `-march=native` the changes here do not cause any performance regressions. With `-mfma -mavx512f` performance is lower for smaller problem sizes in cases requiring intermediate transposes.",b-shi,2022-04-14T20:55:13.848Z,NA,NA,"## Title: | |
Remove AVX512VL dependency in trsm | |
## Authors: | |
b-shi | |
## Summary: | |
This merge request addresses the removal of the AVX512VL dependency in the trsm function of the Eigen C++ library. The change involves switching from `_mm256_mask*` intrinsics, which are incompatible with `AVX512F`, to `_mm512_mask*` intrinsics while handling data reinterpretation between `zmm` and `ymm`. | |
### Key Changes: | |
- Transitioned from using `_mm256_mask*` intrinsics to `_mm512_mask*` intrinsics. | |
- Implemented reinterpretation between `zmm` and `ymm` as needed. | |
### Improvements: | |
- Improved compatibility by eliminating the need for the AVX512VL dependency, simplifying the requirements for using the trsm function. | |
- Performance testing indicated no regressions with `-march=native` settings. | |
### Impact: | |
The changes enhance compatibility across different AVX configurations, especially for users not utilizing AVX512VL, while maintaining performance on larger problem sizes, though some performance degradation may occur for smaller sizes in certain cases." | |
962 (https://gitlab.com/libeigen/eigen/-/merge_requests/962),Update HouseholderSequence.h,"Applying a Householder sequence to the left of a vector (most commonly Q from a QR factorization) results in an avoidable heap allocation. Previous fix did not consider right hand sides with fixed number of columns when the argument `inputIsIdentity` is `true`. This fix checks `inputIsIdentity`, and if true, uses a dynamic size block with `bottomRightCorner()`. Otherwise, a fixed size column block with `bottomRows()` is used which implicitly passes 'Dest::ColsAtCompileTime'. Same logic is applied to the block case. Passing `Dest::ColsAtCompileTime` to `internal::apply_block_householder_on_the_left` revealed a typo: `TFactorSize` should correspond to `VectorsType::ColsAtCompileTime`.",Charles Schlosser,2022-04-15T16:56:17.831Z,NA,NA,"## Title: | |
Update HouseholderSequence.h | |
## Authors: | |
Charles Schlosser | |
## Summary: | |
This merge request addresses an issue related to heap allocation when applying a Householder sequence to the left of a vector. It enhances the handling of right-hand sides with a fixed number of columns when `inputIsIdentity` is set to `true`. | |
### Key Changes: | |
- Added a check for `inputIsIdentity` that uses `bottomRightCorner()` for dynamic size blocks when true, and `bottomRows()` for fixed size blocks otherwise. | |
- Updated logic in block handling to consistently apply the same principles. | |
- Corrected a typo in `internal::apply_block_householder_on_the_left` concerning `TFactorSize`. | |
### Improvements: | |
- Eliminated unnecessary heap allocations when `inputIsIdentity` is true, improving performance. | |
- Streamlined block handling logic for better code clarity and efficiency. | |
### Impact: | |
The changes will result in more efficient memory usage during Householder transformations, especially when working with identity input cases, leading to potentially better performance in scenarios that utilize QR factorization." | |
963 (https://gitlab.com/libeigen/eigen/-/merge_requests/963),Fix cwise NaN propagation for scalar input.,"Was missing a template parameter. Updated tests. | |
Fixes #2474.",Antonio Sánchez,2022-04-16T05:07:44.501Z,NA,NA,"## Title: | |
Fix cwise NaN propagation for scalar input. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses an issue with the propagation of NaN values for scalar inputs in the Eigen C++ library, which was caused by a missing template parameter. It also includes updates to the related tests to ensure proper functionality. | |
### Key Changes: | |
- Added the missing template parameter for NaN propagation in scalar input. | |
- Updated tests to verify the fix. | |
### Improvements: | |
- Enhanced handling of NaN values, leading to more reliable performance for scalar operations. | |
### Impact: | |
This fix improves the correctness of computations involving NaN values, which is crucial for users relying on accurate numerical results in their applications." | |
964 (https://gitlab.com/libeigen/eigen/-/merge_requests/964),Fix HouseholderSequence.h,"The InnerPanel template parameter is not always false and in those cases where it is true, the assignment won't compile. | |
/cc @cantonios",Rohit Santhanam,2022-04-17T06:08:03.072Z,NA,NA,"## Title: | |
Fix HouseholderSequence.h | |
## Authors: | |
Rohit Santhanam | |
## Summary: | |
This merge request addresses a compilation issue in the `HouseholderSequence.h` file related to the `InnerPanel` template parameter. It ensures that the assignment compiles correctly when `InnerPanel` is true. | |
### Key Changes: | |
- Modified the handling of the `InnerPanel` template parameter to ensure compatibility in cases where it is set to true. | |
### Improvements: | |
- Enhanced the code robustness by preventing compilation errors associated with the `InnerPanel` parameter. | |
### Impact: | |
- This fix improves the usability of the `HouseholderSequence` component, allowing it to function correctly in more scenarios without causing compilation failures." | |
965 (https://gitlab.com/libeigen/eigen/-/merge_requests/965),"Add fused multiply functions for PowerPC - pmsub, pnmadd and pnmsub","Add fused multiply functions for PowerPC - pmsub, pnmadd and pnmsub",Chip Kerchner,2022-04-18T16:16:32.757Z,NA,NA,"## Title: | |
Add fused multiply functions for PowerPC - pmsub, pnmadd and pnmsub | |
## Authors: | |
Chip Kerchner | |
## Summary: | |
This merge request introduces new fused multiply functions specifically for PowerPC architecture, enhancing the mathematical capabilities of the Eigen C++ library. | |
### Key Changes: | |
- Added three new fused multiply functions: `pmsub`, `pnmadd`, and `pnmsub`. | |
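
As a rough scalar illustration of what these three operations compute, the sketch below assumes the usual Eigen naming convention (`pmsub` is a*b - c, `pnmadd` is -(a*b) + c, `pnmsub` is -(a*b) - c). The `scalar_*` helpers are hypothetical names built on `std::fma`; they are not Eigen's PowerPC packet implementations.

```cpp
#include <cmath>

// Hypothetical scalar stand-ins for the packet operations; the names mirror
// the packet ops, but these helpers are illustrative and not part of Eigen.

// a*b - c, computed with a single rounding.
inline double scalar_pmsub(double a, double b, double c) { return std::fma(a, b, -c); }

// -(a*b) + c, computed with a single rounding.
inline double scalar_pnmadd(double a, double b, double c) { return std::fma(-a, b, c); }

// -(a*b) - c, computed with a single rounding.
inline double scalar_pnmsub(double a, double b, double c) { return std::fma(-a, b, -c); }
```

Each helper maps onto one fused multiply-add with the appropriate signs flipped, which is why a dedicated hardware instruction can replace a multiply followed by a separate add or subtract.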
### Improvements: | |
- Enhanced efficiency of mathematical operations on PowerPC, leading to more optimized performance in computations involving these fused functions. | |
### Impact: | |
- The addition of these functions provides improved computational speed and accuracy for applications utilizing the Eigen library on PowerPC platforms, potentially benefiting performance-sensitive applications." | |
958 (https://gitlab.com/libeigen/eigen/-/merge_requests/958),Fix compiler bugs for GCC 10 & 11 for Power GEMM,Inline assembly for load vector pair broken for GCC 10 & 11. Using multiple vector pairs broken for GCC 10.,Chip Kerchner,2022-04-20T15:59:00.776Z,NA,NA,"## Title: | |
Fix compiler bugs for GCC 10 & 11 for Power GEMM | |
## Authors: | |
Chip Kerchner | |
## Summary: | |
This merge request addresses specific compiler bugs in the Eigen C++ library related to the use of inline assembly for the Power GEMM (General Matrix Multiplication) on GCC versions 10 and 11. | |
### Key Changes: | |
- Fixed issues with inline assembly for loading vector pairs. | |
- Resolved problems with the handling of multiple vector pairs. | |
### Improvements: | |
- Enhanced compatibility of the Eigen library with GCC 10 and 11, ensuring more reliable performance on systems using these compilers. | |
### Impact: | |
By fixing these bugs, the merge improves the functionality and stability of the Eigen library when compiled with GCC 10 and 11, specifically for users leveraging Power GEMM." | |
966 (https://gitlab.com/libeigen/eigen/-/merge_requests/966),Removed need to supply the Symmetric flag to UpLo argument for Accelerate LLT and LDLT,"This pull request removes the need to supply the Symmetric flag to the UpLo template argument for Accelerate's LLT and LDLT. | |
The Accelerate LLT and LDLT solvers require the supplied matrices to be symmetric. We therefore make it easier to utilize the support module by implicitly ORing the Symmetric flag with the supplied UpLo argument. | |
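
A minimal sketch of the implicit OR described above. The flag values and the `ToySymmetricSolver` wrapper are hypothetical and only illustrate the idea; they are not the Accelerate support module's actual definitions.

```cpp
// Hypothetical flag values for illustration; Eigen's real enum values may differ.
enum : int { Lower = 0x1, Upper = 0x2, Symmetric = 0x20 };

// Toy solver wrapper: the caller passes only Lower or Upper, and the wrapper
// ORs in the Symmetric flag itself.
template <int UpLo>
struct ToySymmetricSolver {
  static constexpr int EffectiveUpLo = UpLo | Symmetric;
};
```

With this shape, `ToySymmetricSolver<Lower>` and `ToySymmetricSolver<Lower | Symmetric>` end up with the same effective flags, so callers no longer need to spell out the symmetry themselves.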
The UpLo argument to AccelerateQR and AccelerateCholeskyAtA has also been removed as it was unnecessary.",John Mather,2022-04-21T20:02:10.996Z,NA,NA,"## Title: | |
Removed need to supply the Symmetric flag to UpLo argument for Accelerate LLT and LDLT | |
## Authors: | |
John Mather | |
## Summary: | |
This merge request simplifies the usage of the Accelerate LLT and LDLT solvers by eliminating the requirement to explicitly supply the Symmetric flag with the UpLo argument. | |
### Key Changes: | |
- Implicitly ORs the Symmetric flag with the UpLo argument for LLT and LDLT solvers. | |
- Removed the UpLo argument from AccelerateQR and AccelerateCholeskyAtA as it was deemed unnecessary. | |
### Improvements: | |
- Streamlines the interface for users, making it easier to leverage these solvers without the need for additional flags. | |
### Impact: | |
- Enhances user experience by reducing complexity and potential for errors in specifying symmetry conditions for matrix operations." | |
967 (https://gitlab.com/libeigen/eigen/-/merge_requests/967),Add load vector_pairs for RHS of GEMM MMA. Improved predux GEMV.,"Add load vector_pairs for RHS of GEMM MMA (10% faster in some situations). Improved predux GEMV - use vectors instead of scalars. General cleanup of GEMV - remove unnecessary typename Index from GEMM, etc.",Chip Kerchner,2022-04-25T16:23:02.162Z,NA,NA,"## Title: | |
Add load vector_pairs for RHS of GEMM MMA. Improved predux GEMV. | |
## Authors: | |
Chip Kerchner | |
## Summary: | |
This merge request introduces enhancements to the Eigen C++ library, specifically focusing on optimizing operations related to GEMM (General Matrix Multiplication) and GEMV (General Matrix-Vector). | |
### Key Changes: | |
- Added loading of vector_pairs for the right-hand side (RHS) of GEMM MMA, which can lead to performance improvements. | |
- Improved the packet reduction (`predux`) step in GEMV by using vectors instead of scalars. | |
- Conducted a general cleanup of GEMV, including the removal of unnecessary typename Index from GEMM. | |
### Improvements: | |
- Achieved up to a 10% speed increase in certain scenarios due to the vector_pairs loading optimization. | |
- Enhanced code clarity and maintainability with the cleanup efforts. | |
### Impact: | |
These changes contribute to better performance and efficiency in matrix multiplication and vector operations, potentially benefiting users with demanding computational tasks." | |
968 (https://gitlab.com/libeigen/eigen/-/merge_requests/968),make diagonal matrix cols() and rows() methods constexpr,"### What does this implement/fix? | |
This PR adds the Eigen constexpr macro `EIGEN_CONSTEXPR` to the `cols()` and `rows()` methods of `Eigen::DiagonalMatrix`. | |
This also aligns it with `Eigen::Matrix`, which already has these constexpr methods, and is in line with https://gitlab.com/libeigen/eigen/-/issues/2152.",Alex_M,2022-05-05T15:29:21.993Z,NA,NA,"## Title: | |
Make diagonal matrix cols() and rows() methods constexpr | |
## Authors: | |
Alex_M | |
## Summary: | |
This merge request enhances the Eigen C++ library by incorporating the `EIGEN_CONSTEXPR` macro into the `cols()` and `rows()` methods of `Eigen::DiagonalMatrix`. | |
### Key Changes: | |
- Added the `EIGEN_CONSTEXPR` macro to the `cols()` and `rows()` methods in `Eigen::DiagonalMatrix`. | |
### Improvements: | |
- Aligns `Eigen::DiagonalMatrix` with `Eigen::Matrix`, which already includes these constexpr methods. | |
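
To illustrate what constexpr size accessors make possible, here is a small sketch using a toy type. `ToyDiagonal`, `kDiag`, and `Row` are hypothetical and stand in for the real `Eigen::DiagonalMatrix`, whose accessors may differ in detail.

```cpp
#include <array>
#include <cstddef>

// Toy fixed-size diagonal matrix; not Eigen's DiagonalMatrix.
template <typename Scalar, std::size_t N>
struct ToyDiagonal {
  Scalar diag[N];
  constexpr std::size_t rows() const { return N; }
  constexpr std::size_t cols() const { return N; }
};

// A compile-time constant instance of the toy type.
constexpr ToyDiagonal<double, 3> kDiag{{1.0, 2.0, 3.0}};

// Because cols() is constexpr, its result can be used as a template argument.
using Row = std::array<double, kDiag.cols()>;
```

Without the constexpr qualifier on `cols()`, the `Row` alias would fail to compile, since a template argument must be a constant expression.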
### Impact: | |
- Enhances compile-time evaluation capabilities for the diagonal matrix, potentially improving performance in constexpr contexts and making it consistent with other matrix types in the library." | |
969 (https://gitlab.com/libeigen/eigen/-/merge_requests/969),Add `uninstall` target only if not already defined.,"### What does this implement/fix? | |
As suggested by https://gitlab.kitware.com/cmake/community/-/wikis/FAQ#can-i-do-make-uninstall-with-cmake it is common to check for the `uninstall` target to exist before adding it. | |
While this is not a problem for standalone projects or for projects using `ExternalProject` (https://cmake.org/cmake/help/latest/module/ExternalProject.html), while using FetchContent (https://cmake.org/cmake/help/latest/module/FetchContent.html) all targets are ""imported"" into a single namespace. | |
It is reasonable to let the user know if a dependency already defines a specific target, but the `uninstall` meta target has a common name and for this reason it is a good practice to check for its existence before defining it. | |
This change should not have any impact on users installing the project separately, but will allow the use of Eigen with FetchContent",Francesco Romano,2022-05-05T17:43:10.496Z,NA,NA,"## Title: | |
Add `uninstall` target only if not already defined. | |
## Authors: | |
Francesco Romano | |
## Summary: | |
This merge request introduces a conditional addition of the `uninstall` target in the Eigen C++ library's CMake configuration. The goal is to prevent conflicts when using the library in conjunction with other projects that may also define an `uninstall` target. | |
### Key Changes: | |
- Implemented a check to determine if the `uninstall` target already exists before defining it in the CMake configuration. | |
### Improvements: | |
- Enhances compatibility with projects using FetchContent, as it prevents importing conflicts with existing `uninstall` targets. | |
### Impact: | |
- No negative impact on users who install the project independently; however, it improves the integration experience when used in conjunction with other libraries or projects that might also define an `uninstall` target." | |
356 (https://gitlab.com/libeigen/eigen/-/merge_requests/356),Adding PocketFFT support in FFT module since kissfft has some flaw in accuracy and performance,"Eigen currently uses KissFFT as its default FFT implementation, but its performance drops sharply in some situations because KissFFT fails to handle ""odd-sized"" inputs (i.e., sizes with large factors) efficiently, as reported in #1717. In PocketFFT, for lengths with very large prime factors, Bluestein's algorithm is used: instead of an FFT of length n, a convolution of length ```n2 >= 2*n-1``` is performed, where n2 is chosen to be highly composite. | |
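
The choice of convolution length can be sketched as follows. This is an illustration only: it assumes that highly composite means having no prime factors other than 2, 3 and 5, and it uses a brute-force search. The helper names `is_5_smooth` and `bluestein_length` are hypothetical, and PocketFFT's own size selection may use a different factor set and algorithm.

```cpp
#include <cstddef>

// True if n has no prime factors other than 2, 3 and 5.
inline bool is_5_smooth(std::size_t n) {
  if (n == 0) return false;
  while (n % 2 == 0) n /= 2;
  while (n % 3 == 0) n /= 3;
  while (n % 5 == 0) n /= 5;
  return n == 1;
}

// Smallest 5-smooth length n2 with n2 >= 2*n - 1 (brute-force search).
inline std::size_t bluestein_length(std::size_t n) {
  std::size_t n2 = (n == 0) ? 1 : 2 * n - 1;
  while (!is_5_smooth(n2)) ++n2;
  return n2;
}
```

For example, a length of 7 needs a convolution of at least 13 points, and the next 5-smooth size is 15, so the FFTs inside the convolution run on a length that factors into small primes.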
In addition, the google/jax project previously used Eigen's FFT as its backend, which also had problems with performance and accuracy; see their discussion [FFT precision/performance #2952](https://github.com/google/jax/issues/2952). They have since switched to PocketFFT as their default FFT implementation. | |
Preliminary performance comparison (complex-to-complex, `-O3` optimization): | |
``` | |
------------------------------------------- | |
length kissfft pocketfft | |
------------------------------------------- | |
100000 7.03 ms 4.36 ms | |
100000*2 13.7 ms 9.67 ms | |
100001 14999 ms 24.3 ms | |
``` | |
Updated benchmarks from fft_benchmark.cpp on AArch64 (times in ns): | |
``` | |
Benchmark Time CPU Time Old Time New CPU Old CPU New | |
------------------------------------------------------------------------------------------------------------------------- | |
test_scalar_float/100 -0.1687 -0.1688 2555 2124 2554 2123 | |
test_scalar_float/512 -0.4176 -0.4176 14773 8603 14767 8600 | |
test_scalar_float/4096 -0.5413 -0.5413 158755 72818 158684 72781 | |
test_scalar_float/32768 -0.4713 -0.4714 1639528 866753 1638255 866018 | |
test_scalar_float/100000 -0.5012 -0.5012 6823375 3403716 6817207 3400439 | |
test_scalar_float/100001 -0.9988 -0.9988 22048197988 25454072 22029408670 25420215 | |
test_scalar_double/100 -0.1298 -0.1298 2421 2107 2420 2106 | |
test_scalar_double/512 -0.3835 -0.3835 13892 8565 13885 8561 | |
test_scalar_double/4096 -0.5060 -0.5060 153078 75624 153005 75581 | |
test_scalar_double/32768 -0.4349 -0.4350 1912198 1080593 1910733 1079557 | |
test_scalar_double/100000 -0.4256 -0.4256 7089696 4072526 7083264 4068571 | |
test_scalar_double/100001 -0.9985 -0.9985 29955156310 44809112 29930479990 44739807 | |
test_complex_float/100 -0.4638 -0.4638 4454 2388 4452 2387 | |
test_complex_float/512 -0.6285 -0.6285 29882 11102 29869 11097 | |
test_complex_float/4096 -0.6174 -0.6174 285366 109192 285238 109137 | |
test_complex_float/32768 -0.6265 -0.6265 3698800 1381495 3695956 1380318 | |
test_complex_float/100000 -0.6730 -0.6730 13653536 4465063 13642033 4460459 | |
test_complex_float/100001 -0.9990 -0.9990 22388960173 22772174 22370358320 22745811 | |
test_complex_double/100 -0.1601 -0.1602 4239 3560 4237 3558 | |
test_complex_double/512 -0.6085 -0.6085 28463 11143 28452 11138 | |
test_complex_double/4096 -0.5924 -0.5924 294522 120061 294352 119977 | |
test_complex_double/32768 -0.5963 -0.5963 4087268 1650075 4083765 1648615 | |
test_complex_double/100000 -0.5011 -0.5011 15675452 7820870 15659329 7811764 | |
test_complex_double/100001 -0.9985 -0.9985 28986634903 43583605 28961759140 43521986 | |
``` | |
[fft_benchmark.cpp](/uploads/163b7e8dcde4510db63e134e1f3ce218/fft_benchmark.cpp)",Guoqiang QI,2022-05-11T17:44:22.976Z,MR to reopen,NA,"## Title: | |
Adding PocketFFT support in FFT module since kissfft has some flaw in accuracy and performance | |
## Authors: | |
Guoqiang QI | |
## Summary: | |
This merge request adds PocketFFT support to the Eigen C++ library's FFT module as an alternative to KissFFT. The change is motivated by KissFFT's accuracy problems and by its poor performance on inputs whose sizes have large prime factors. PocketFFT handles such sizes with Bluestein's algorithm. | |
### Key Changes: | |
- Added PocketFFT as an FFT backend in the FFT module, alongside KissFFT. | |
- Sizes with large prime factors are handled more efficiently through Bluestein's algorithm. | |
### Improvements: | |
- Performance benchmarks show significant improvements in execution times for various input sizes, especially for sizes such as 100,000 and 100,001: | |
- PocketFFT reduced processing time from 14,999 ms (KissFFT) to 24.3 ms (PocketFFT) for input size 100,001. | |
- For input size 100,000, time decreased from 7.03 ms to 4.36 ms. | |
### Impact: | |
- Enhanced accuracy and performance of FFT computations within the Eigen library. | |
- Broader adoption potential, as evidenced by JAX switching to PocketFFT after hitting similar performance and accuracy issues with Eigen's FFT. | |
- Overall user experience is expected to improve due to more efficient handling of complex computational scenarios." | |
860 (https://gitlab.com/libeigen/eigen/-/merge_requests/860),Add AVX512 optimizations for matrix multiply,"## Edit | |
I've refactored the original implementation in this merge request to use packet math and avoid inline asm/intrinsics as much as possible. It also supports double precision. | |
The changes implement optimized compute kernels that use 48x8 and 24x8 unrolls for single and double precision, respectively. Tail handling is done with powers of 2 when possible, that is, when the packing routines (`gemm_pack_rhs` and `gemm_pack_lhs`) support it. If not supported, we loop over ones as before. | |
The new kernels do not support an inner stride other than one for the C matrix, so in that case we fall back to Eigen's previously used kernels (with `nr == 4`). We need to make this decision at the `gebp_traits` stage so that all kernels are compatible, which avoids more intrusive changes in other Eigen drivers that use `gebp_kernel`, `gemm_pack_rhs` and `gemm_pack_lhs`. | |
I've also added a couple of macros to reduce register pressure, which is very high: 24 accumulators + 6 registers for loading A and 2 for loading B. Using `EIGEN_ARCH_AVX512_GEMM_KERNEL_USE_LESS_A_REGS` or `EIGEN_ARCH_AVX512_GEMM_KERNEL_USE_LESS_B_REGS` will reduce register use for A and B by half. The performance loss was small (less than 2% for large sizes). For gcc we use 3 registers to load A by default, since it was the only way I was able to avoid the zmm register spills. | |
I've built the tests and run them for the following architectures: SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, AVX2, AVX512, and AVX512DQ. Other ones need to be checked. | |
### Performance | |
I've done a simple sweep test for square problem sizes on Xeon 8180 (Skylake) in sequential mode. gcc11 and clang11 were used to compile the benchmark code. There are speedups for dgemm (~20%) and sgemm (~15%), but for sgemm there can be some slowdowns for small problem sizes. Below are some more details on the performance. I've also measured the other transpose cases (NT, TN and TT), but the results are similar. | |
According to @b-shi he also saw improvements for trsm with this patch. | |
#### `dgemm` | |
Performance improvements for ""`dgemm`"" seem reasonable, at around 20% compared to Eigen before the changes (c38f91d). Some small sizes also improved, at least if they are multiples of 2. | |
 | |
 | |
For smaller sizes I didn't see very large regressions. | |
 | |
 | |
#### `sgemm` | |
For ""`sgemm`"", I see some speedups as well (up to 15% for clang and a bit more for gcc), but I've also noticed some slowdowns for small sizes and depending on whether the size is a multiple of 4. | |
 | |
 | |
As mentioned before, non-multiples of 4 have some regressions for smaller problem sizes. Hopefully, the benefits outweigh the regressions enough to make these changes worth it. | |
 | |
 | |
For reference here is the performance for multiples of 2 (step = 2): | |
 | |
 | |
These slowdowns can probably be mitigated if we further enable packing to handle the tail with m = 2 for single precision directly instead of looping over ones. For example, for m = 47 tail handling would be 32 + 8 + 4 + **2** + 1 instead of 32 + 8 + 4 + **1 + 1** + 1. | |
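
The tail decomposition above can be sketched as a greedy split over the chunk sizes the packing routines support. The helper `split_tail` and the chunk-size lists are hypothetical and only reproduce the arithmetic of the m = 47 example; they are not the kernel's actual dispatch code.

```cpp
#include <vector>

// Greedily split a tail of m rows into the largest allowed chunk sizes.
// `allowed` must be sorted in descending order and end with 1, so that the
// split always consumes the whole tail.
inline std::vector<int> split_tail(int m, const std::vector<int>& allowed) {
  std::vector<int> chunks;
  for (int size : allowed) {
    while (m >= size) {
      chunks.push_back(size);
      m -= size;
    }
  }
  return chunks;
}
```

Calling `split_tail(47, {32, 16, 8, 4, 2, 1})` gives 32, 8, 4, 2, 1, while dropping the 2 from the allowed sizes gives 32, 8, 4, 1, 1, 1, which is one more pass over a single row.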
## Old stuff before large refactor. | |
### What does this implement/fix? | |
This implements/adds some optimizations for ""sgemm"" compute kernels for AVX512. This is still work in progress since it doesn't use packet math yet. However, it will be useful in getting some early feedback on the changes. | |
Here are some comments/questions worth mentioning: | |
1. It will be useful to know if the inline asm used in the kernel (`Eigen/src/Core/arch/AVX512/sgemm_kern.hpp`) is reasonable/acceptable. In particular, there is some register mapping that was used to avoid gcc register spills. Also, using inline asm for loading A/B elements results in better performance with gcc. | |
2. The kernel is quite verbose, but it should be buildable with c++14. It manually unrolls and handles tails with powers of 2, except for `m/n` equal to 2, where I had to loop over ones so that I could reuse the packing kernels. Does it make sense to rewrite the kernel with c++17 as @b-shi did in !834? | |
3. I'm not really sure if the performance improvements justify the changes. It seems that Eigen's ""sgemm"" performance is already quite good. I see about a 10% to 15% performance increase for large sizes with the changes. Maybe we will need to use some threshold to dispatch the kernel for large sizes only, to avoid regressions for smaller sizes. Is this acceptable? | |
4. I tried to reenable packing with `nr = 8` for `gemm_pack_rhs`, by uncommenting + small changes. It seems to work for matrix multiplication, but I'm not sure if it was commented out for other reasons. Was there a reason? | |
### Additional information | |
Here are some initial performance measurements on `Intel(R) Xeon(R) Platinum 8180` for non-transposed A/B. For clang we can actually remove the register mapping without having register spills, but gcc performance would be lower. | |
#### NN using gcc11 | |
 | |
#### NN using clang11 | |
 | |
+@b-shi",aaraujom,2022-05-12T23:41:20.087Z,NA,NA,"## Title: | |
Add AVX512 optimizations for matrix multiply | |
## Authors: | |
aaraujom | |
## Summary: | |
This merge request introduces significant AVX512 optimizations for matrix multiplication in the Eigen C++ library, particularly enhancing the performance of both single and double precision computing kernels. The implementation minimizes the use of inline assembly and favors packet math where applicable. | |
### Key Changes: | |
- Refactored matrix multiplication to utilize packet math and reduce reliance on inline asm/intrinsics. | |
- Implemented compute kernels with 48x8 and 24x8 unrolls for single and double precision, respectively. | |
- Introduced macros to lower register pressure, improving register management without significant performance loss. | |
- Retained compatibility with existing kernels for certain C matrix configurations. | |
### Improvements: | |
- Achieved approximately 20% speedup for double precision (`dgemm`) and up to 15% for single precision (`sgemm`) on large problem sizes compared to previous versions, notably on Xeon 8180 (Skylake). | |
- Enhanced handling of tail elements in matrix packing routines for better optimization. | |
- Enabled testing on various architectures (SSE2 to AVX512). | |
### Impact: | |
The optimized matrix multiplication routines significantly enhance performance benchmarks for large matrix sizes, making Eigen more efficient in numerical computations utilizing AVX512 capabilities. However, there were some observed slowdowns for smaller problem sizes, particularly with `sgemm`, which may require adjustments to fully mitigate regressions. Overall, these changes are expected to benefit users engaged in computational tasks requiring high-performance matrix operations." | |
908 (https://gitlab.com/libeigen/eigen/-/merge_requests/908),Fix 'Incorrect reference code in STL_interface.hh for ata_product' eigen/isses/2425,"### Reference issue | |
#2425 | |
### What does this implement/fix? | |
This fixes a bug in the `ata_product` reference code.",Rohan Ghige,2022-05-18T14:42:58.314Z,NA,NA,"## Title: | |
Fix 'Incorrect reference code in STL_interface.hh for ata_product' | |
## Authors: | |
Rohan Ghige | |
## Summary: | |
This merge request addresses a bug related to the `ata_product` function in the Eigen library's `STL_interface.hh` file. | |
### Key Changes: | |
- Corrected the reference code for the `ata_product` implementation. | |
### Improvements: | |
- Enhances the reliability and functionality of the `ata_product` function. | |
### Impact: | |
- Fixing this bug improves code accuracy, potentially preventing errors in applications utilizing the `ata_product` function within the Eigen library." | |
974 (https://gitlab.com/libeigen/eigen/-/merge_requests/974),Prevent BDCSVD crash caused by index out of bounds.,"For a large matrix of ones, we end up trying to access `perm(-1)`, which | |
causes a memory access error and crash. | |
Added a basic check for this case, and force `zhat` to zero, in an | |
attempt to continue. Reports a numerical issue. | |
Related to #2491",Antonio Sánchez,2022-05-19T22:29:49.402Z,NA,NA,"## Title: | |
Prevent BDCSVD crash caused by index out of bounds. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses a critical issue in the Eigen C++ library's BDCSVD implementation, which could lead to a crash when processing large matrices of ones due to an index out of bounds error. | |
### Key Changes: | |
- Introduced a check to prevent access to `perm(-1)` that arises in specific cases. | |
- Forced the variable `zhat` to zero to mitigate the crash and allow continued processing. | |
### Improvements: | |
- Enhanced robustness of the BDCSVD algorithm by ensuring it can handle edge cases without crashing. | |
- Added a report for numerical issues, providing better diagnostics for similar scenarios. | |
### Impact: | |
This change significantly reduces the likelihood of crashes during matrix processing, leading to improved stability and reliability of the BDCSVD functionality in the library." | |
973 (https://gitlab.com/libeigen/eigen/-/merge_requests/973),Add arg() to tensor,"### What does this implement/fix? | |
Adds a .arg() method to Tensors.",Tobias Wood,2022-05-20T03:51:50.545Z,NA,NA,"## Title: | |
Add arg() to tensor | |
## Authors: | |
Tobias Wood | |
## Summary: | |
This merge request introduces a new method, `.arg()`, to the Tensor class in the Eigen C++ library, enhancing its functionality. | |
### Key Changes: | |
- Implementation of the `.arg()` method for Tensors. | |
### Improvements: | |
- Provides an element-wise `.arg()` operation on Tensors (the complex argument, or phase angle, of each coefficient, as with `arg()` on Eigen arrays). | |
### Impact: | |
- Enhances the ease of use and versatility of Tensors, allowing for more efficient data manipulation and analysis in numerical computations." | |
977 (https://gitlab.com/libeigen/eigen/-/merge_requests/977),Fix BDCSVD condition for failing with numerical issue.,NA,Antonio Sánchez,2022-05-20T15:40:38.069Z,NA,NA,"## Title: | |
Fix BDCSVD condition for failing with numerical issue. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses a numerical issue within the BDCSVD (Bidiagonal Divide and Conquer SVD) algorithm in the Eigen C++ library. | |
### Key Changes: | |
- Modified the condition handling within the BDCSVD function to prevent failures due to numerical instability. | |
### Improvements: | |
- Enhanced the robustness of the BDCSVD algorithm, ensuring more reliable results across various use cases. | |
### Impact: | |
- Users of the Eigen library will experience improved stability and performance when using the BDCSVD functionality, reducing the likelihood of encountering numerical errors." | |
984 (https://gitlab.com/libeigen/eigen/-/merge_requests/984),unset executable flag,"### What does this implement/fix? | |
Subject says it all.",Eisuke Kawashima,2022-05-23T04:13:16.289Z,NA,NA,"## Title: | |
Unset executable flag | |
## Authors: | |
Eisuke Kawashima | |
## Summary: | |
This merge request addresses the removal of the executable flag from certain files in the Eigen C++ library. | |
### Key Changes: | |
- Unset the executable flag on specified files to ensure proper permissions. | |
### Improvements: | |
- Enhances the management of file permissions within the project structure. | |
### Impact: | |
- Prevents execution of files that should not be executable, thereby improving security and compliance with development best practices." | |
985 (https://gitlab.com/libeigen/eigen/-/merge_requests/985),Improve plogical_shift_* implementations and fix typo in SVE/PacketMath.h,"These changes were part of work I did before. I think they may be useful for Eigen, so I created an MR.",Guoqiang QI,2022-05-23T14:44:42.521Z,NA,NA,"## Title: | |
Improve plogical_shift_* implementations and fix typo in SVE/PacketMath.h | |
## Authors: | |
Guoqiang QI | |
## Summary: | |
This merge request enhances the implementations of logical shift operations in the Eigen C++ library and corrects a typographical error in the SVE/PacketMath.h file. These changes aim to improve functionality and maintain code clarity. | |
### Key Changes: | |
- Improved implementations of the plogical_shift_* functions. | |
- Fixed a typo in the SVE/PacketMath.h header file. | |
### Improvements: | |
- The enhancements to logical shift operations are expected to lead to better performance and more robust functionality in mathematical computations. | |
### Impact: | |
These changes contribute to the overall quality and reliability of the Eigen library, potentially increasing the accuracy and efficiency of operations that utilize logical shifts." | |
983 (https://gitlab.com/libeigen/eigen/-/merge_requests/983),[SYCL] Extending SYCL queue interface extension.,This PR extends the `QueueInterface` in the SYCL backend to accept an existing SYCL queue. This will enable us to integrate Eigen SYCL in high-level frameworks that already have SYCL-queue. Reusing the existing SYCL queue will avoid the extra context creation and potential unnecessary memory movement which is expensive.,Mehdi Goli,2022-05-23T14:45:28.119Z,NA,NA,"## Title: | |
[SYCL] Extending SYCL queue interface extension. | |
## Authors: | |
Mehdi Goli | |
## Summary: | |
This merge request enhances the SYCL backend by extending the `QueueInterface` to permit the use of an existing SYCL queue. This modification aims to improve integration with high-level frameworks that already utilize SYCL queues. | |
### Key Changes: | |
- Extended the `QueueInterface` to accept an existing SYCL queue. | |
### Improvements: | |
- Enables integration of Eigen SYCL with existing frameworks. | |
- Reduces the need for creating a new context, thereby minimizing unnecessary memory movement. | |
### Impact: | |
- This change is expected to improve performance by avoiding costly operations associated with context creation and memory management, leading to more efficient execution of SYCL tasks within Eigen." | |
980 (https://gitlab.com/libeigen/eigen/-/merge_requests/980),Avoid signed integer overflow in adjoint test.,Sanitizers complain about this.,Antonio Sánchez,2022-05-23T14:46:17.414Z,NA,NA,"## Title: | |
Avoid signed integer overflow in adjoint test. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses the issue of signed integer overflow in the adjoint test within the Eigen C++ library. The change was prompted by feedback from sanitizers indicating potential problems. | |
### Key Changes: | |
- Implemented modifications to prevent signed integer overflow during the execution of the adjoint test. | |
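A minimal sketch of one common way to avoid this class of problem, assuming the test only needs wraparound arithmetic; it is not the actual change made in the adjoint test. The helper name `wrapping_mul` is hypothetical.

```cpp
#include <cstdint>

// Hypothetical helper: multiply two 32-bit signed integers with wraparound.
// Signed overflow is undefined behavior, so the product is computed on
// unsigned values, where arithmetic is defined modulo 2^32.
inline std::int32_t wrapping_mul(std::int32_t a, std::int32_t b) {
  const std::uint32_t ua = static_cast<std::uint32_t>(a);
  const std::uint32_t ub = static_cast<std::uint32_t>(b);
  const std::uint32_t product = ua * ub;  // wraps instead of overflowing
  // Converting back to signed is modular since C++20; before that it is
  // implementation-defined, and modular on mainstream compilers.
  return static_cast<std::int32_t>(product);
}
```

Widening to a 64-bit type before multiplying is an alternative when the exact result is needed rather than a wrapped one.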
### Improvements: | |
- Increased reliability of the adjoint test by ensuring that integer overflow scenarios are handled properly. | |
### Impact: | |
- Keeps the adjoint test free of undefined behavior, so sanitizer builds of the test suite run cleanly." | |
975 (https://gitlab.com/libeigen/eigen/-/merge_requests/975),Add subMappers to Power GEMM packing - simplifies the address calculations (10% faster),Add subMappers to Power GEMM packing - simplifies the address calculations (10% faster). Added missing getSubMapper & getLinearMapper for TensorContractionMapper - fixed compilation issues. Other minor complex packing improvements.,Chip Kerchner,2022-05-23T15:18:30.414Z,NA,NA,"## Title: | |
Add subMappers to Power GEMM packing - simplifies the address calculations (10% faster) | |
## Authors: | |
Chip Kerchner | |
## Summary: | |
This merge request introduces subMappers for Power GEMM packing, which streamline address calculations, resulting in a performance increase of approximately 10%. It also adds the missing `getSubMapper` and `getLinearMapper` functions for TensorContractionMapper and makes other minor improvements to complex packing. | |
### Key Changes: | |
- Introduction of subMappers to improve Power GEMM packing. | |
- Added missing `getSubMapper` and `getLinearMapper` functions. | |
- Fixed compilation issues related to these additions. | |
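The idea behind a sub-mapper can be sketched as follows. This is an illustration only: `ColMajorMapper` and `block_sum` are hypothetical and do not reproduce Eigen's data mapper classes; only the name `getSubMapper` comes from the merge request.

```cpp
#include <cstddef>

// Hypothetical column-major mapper; not Eigen's actual data mapper class.
struct ColMajorMapper {
  const double* data;
  std::ptrdiff_t stride;  // distance between consecutive columns

  double operator()(std::ptrdiff_t i, std::ptrdiff_t j) const {
    return data[i + j * stride];
  }

  // Fold the block offset (i0, j0) into the base pointer once, so that
  // inner loops can index the block starting from (0, 0).
  ColMajorMapper getSubMapper(std::ptrdiff_t i0, std::ptrdiff_t j0) const {
    return ColMajorMapper{data + i0 + j0 * stride, stride};
  }
};

// Sum a rows x cols block starting at (i0, j0) through a sub-mapper.
inline double block_sum(const ColMajorMapper& m, std::ptrdiff_t i0,
                        std::ptrdiff_t j0, std::ptrdiff_t rows,
                        std::ptrdiff_t cols) {
  const ColMajorMapper sub = m.getSubMapper(i0, j0);
  double sum = 0.0;
  for (std::ptrdiff_t j = 0; j < cols; ++j)
    for (std::ptrdiff_t i = 0; i < rows; ++i)
      sum += sub(i, j);  // no i0/j0 arithmetic inside the hot loop
  return sum;
}
```

Computing the offset once per block rather than once per element is the kind of simplification the title refers to; how much of the reported 10% comes from it is not stated in the description.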
### Improvements: | |
- Performance enhancement of about 10% due to simplified address calculations. | |
- Various minor enhancements to complex packing mechanisms. | |
### Impact: | |
The author reports a speedup of about 10%; applications on Power that rely on GEMM-based matrix operations should benefit." | |
982 (https://gitlab.com/libeigen/eigen/-/merge_requests/982),Avoid ambiguous Tensor comparison operators for C++20 compatibility,Should be compatible with any C++ version.,Benjamin Kramer,2022-05-23T17:36:03.918Z,NA,NA,"## Title: | |
Avoid ambiguous Tensor comparison operators for C++20 compatibility | |
## Authors: | |
Benjamin Kramer | |
## Summary: | |
This merge request focuses on refining the Tensor comparison operators in the Eigen C++ library to eliminate ambiguities that affect compatibility with C++20. The changes ensure that the library functions correctly across different versions of C++. | |
### Key Changes: | |
- Resolved ambiguities in Tensor comparison operators. | |
### Improvements: | |
- Enhanced compatibility with C++20 while maintaining functionality with earlier C++ standards. | |
### Impact: | |
- The changes provide a clearer and more reliable implementation of Tensor comparisons, fostering better code interoperability and adherence to modern programming standards." | |
986 (https://gitlab.com/libeigen/eigen/-/merge_requests/986),[SYCL] SYCL-2020 range does not have default constructor.,According to the [SYCL-2020 spec](https://www.khronos.org/registry/SYCL/specs/sycl-2020/html/sycl-2020.html#table.constructors.range) the range class does not have a default constructor anymore. This PR changes all default constructed ranges to the range of size 1 to make sure at least one thread will be created to run the `parallel_for`.,Mehdi Goli,2022-05-24T03:11:47.315Z,NA,NA,"## Title: | |
[SYCL] SYCL-2020 range does not have default constructor. | |
## Authors: | |
Mehdi Goli | |
## Summary: | |
This merge request addresses the absence of a default constructor for the range class as specified in the SYCL-2020 standard. It modifies existing code to replace default constructed ranges with ranges of size 1, ensuring that at least one thread is created for executing the `parallel_for` function. | |
### Key Changes: | |
- Updated all instances of default constructed ranges in the code to ranges of size 1. | |
### Improvements: | |
- Enhances compliance with the SYCL-2020 specification, which eliminates the default constructor for the range class. | |
### Impact: | |
- Guarantees that the `parallel_for` will always execute with at least one thread, potentially improving the reliability and functionality of parallel operations within the library." | |
976 (https://gitlab.com/libeigen/eigen/-/merge_requests/976),fix: issue 2481: LDLT produce wrong results with AutoDiffScalar,"<!-- | |
Thanks for contributing a merge request! Please name and fully describe your MR as you would for a commit message. | |
If the MR fixes an issue, please include ""Fixes #issue"" in the commit message and the MR description. | |
In addition, we recommend that first-time contributors read our [contribution guidelines](https://eigen.tuxfamily.org/index.php?title=Contributing_to_Eigen) and [git page](https://eigen.tuxfamily.org/index.php?title=Git), which will help you submit a more standardized MR. | |
Before submitting the MR, you also need to complete the following checks: | |
- Make one PR per feature/bugfix (don't mix multiple changes into one PR). Avoid committing unrelated changes. | |
- Rebase before committing | |
- For code changes, run the test suite (at least the tests that are likely affected by the change). | |
See our [test guidelines](https://eigen.tuxfamily.org/index.php?title=Tests). | |
- If possible, add a test (both for bug-fixes as well as new features) | |
- Make sure new features are documented | |
Note that we are a team of volunteers; we appreciate your patience during the review process. | |
Again, thanks for contributing! --> | |
### Reference issue | |
#2481 | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
When the value of an `AutoDiffScalar` is 0 (as in the minimal example), some updates in the derivative that should take place in a `triangular_solve_vector<...>::run` are being skipped. | |
See for instance: | |
https://gitlab.com/libeigen/eigen/-/blob/master/Eigen/src/Core/products/TriangularSolverVector.h#L118 | |
The error does not occur when the value of the `AutoDiffScalar` is something different than 0. | |
Proposed fix: | |
- The check must take into account the value of the derivatives when checking `AutoDiffScalar` for zeroes | |
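A minimal sketch of why a value-only zero check loses derivative information, assuming a scalar with a single derivative. The `Dual` type and the helper names are hypothetical stand-ins; this is not Eigen's `AutoDiffScalar` or the code in `TriangularSolverVector.h`.

```cpp
// Hypothetical stand-in for an AutoDiffScalar with one derivative.
struct Dual {
  double value;
  double deriv;
};

inline Dual operator*(const Dual& a, const Dual& b) {
  // Product rule for the derivative part.
  return Dual{a.value * b.value, a.value * b.deriv + a.deriv * b.value};
}

inline Dual operator-(const Dual& a, const Dual& b) {
  return Dual{a.value - b.value, a.deriv - b.deriv};
}

// Value-only check: treats (0, d) as zero even when d != 0.
inline bool is_zero_value_only(const Dual& x) { return x.value == 0.0; }

// Check that also takes the derivative into account.
inline bool is_zero_with_deriv(const Dual& x) {
  return x.value == 0.0 && x.deriv == 0.0;
}

// One elimination step of a triangular solve: rhs1 -= l10 * x0, skipped
// when the chosen predicate reports x0 as zero.
template <typename IsZero>
Dual eliminate(const Dual& rhs1, const Dual& l10, const Dual& x0,
               IsZero is_zero) {
  if (is_zero(x0)) return rhs1;  // shortcut: the update is skipped
  return rhs1 - l10 * x0;
}
```

With `x0 = {0, 1}`, the value-only predicate skips the update and the derivative contribution of `l10 * x0` never reaches `rhs1`, while the derivative-aware predicate performs it.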
### Additional information | |
<!--Any additional information you think is important.-->",Mario Rincon-Nigro,2022-05-25T15:26:11.707Z,NA,NA,"## Title: | |
Fix: issue 2481: LDLT produce wrong results with AutoDiffScalar | |
## Authors: | |
Mario Rincon-Nigro | |
## Summary: | |
This merge request addresses an issue where the LDLT decomposition produces incorrect results when using `AutoDiffScalar` with a value of 0. The fix ensures that the derivative values are properly considered during calculations, preventing skipped updates in the triangular solve operation. | |
### Key Changes: | |
- Updated the logic in `triangular_solve_vector<...>::run` to account for both the value of `AutoDiffScalar` and its derivatives when checking for zero values. | |
### Improvements: | |
- Ensures accurate handling of `AutoDiffScalar` when its value is 0, so derivative updates are no longer skipped in calculations involving LDLT decomposition. | |
### Impact: | |
- This fix prevents erroneous results in computations using `AutoDiffScalar`, enhancing the reliability of the Eigen library in applications relying on automatic differentiation." | |
971 (https://gitlab.com/libeigen/eigen/-/merge_requests/971),Add R-Bidiagonalization step to BDCSVD,"### What does this implement/fix? | |
<!--Please explain your changes.--> | |
This adds an R-Bidiagonalization step to BDCSVD. (mentioned in section 8.6.3 of the Golub and Van Loan ""Matrix Computations"" book.) | |
- When the input matrix ``A`` is sufficiently tall (or wide), this first computes the QR decomposition of ``A`` (or ``A*``). It then just runs the normal bidiagonalization/svd routines on the top rows of R. Finally, it has to apply Q on the left of ``U`` (or ``V``) to get the actual SVD of ``A = Q(R/0) = Q(USV^T/0)``. | |
- Optimization for tall matrices since less work is done in bidiagonalization. | |
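For a tall m-by-n matrix with m >= n, the factorization described above can be written out as follows. This is a sketch of the standard derivation; the symbols $R_1$, $U_1$ and the block notation are introduced here for illustration and do not appear in the code.

```latex
\begin{aligned}
A &= Q \begin{pmatrix} R_1 \\ 0 \end{pmatrix},
  \qquad R_1 \in \mathbb{C}^{n \times n} \text{ upper triangular},\\
R_1 &= U_1 \Sigma V^{*}
  \qquad \text{(SVD of the small square factor)},\\
A &= Q \begin{pmatrix} U_1 \Sigma V^{*} \\ 0 \end{pmatrix}
   = \underbrace{Q \begin{pmatrix} U_1 \\ 0 \end{pmatrix}}_{U} \, \Sigma V^{*}.
\end{aligned}
```

The textbook flop counts from Golub and Van Loan are roughly $4mn^2 - 4n^3/3$ for direct Householder bidiagonalization and $2mn^2 + 2n^3$ for the QR-first variant, so the latter does less work once $m$ exceeds about $5n/3$. The threshold actually used by this change is not stated here.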
As far as I can tell, LAPACK always does this in [dgesdd](http://www.netlib.org/lapack/explore-html/d1/d7e/group__double_g_esing_gad8e0f1c83a78d3d4858eaaa88a1c5ab1.html#gad8e0f1c83a78d3d4858eaaa88a1c5ab1) | |
when there's sufficient workspace. | |
I'm not sure if adding additional workspace here is acceptable, but this mr shouldn't add much more. A lot of the new workspace needed for the QR decomposition should be cancelled out because UpperBidiagonalization uses less. | |
Looked like an easy win, so I thought I'd try it out. (There was also some recent discussion on Discord where people were using BDCSVD on super tall matrices, so I think this meets a real need.) | |
### Additional information | |
<!--Any additional information you think is important.--> | |
I think it's working pretty well. Passes the test suite and I'm pleased with the speedup I'm seeing. | |
These are from some informal benchmarks: | |
- Run on an AMD EPYC 7742, with gcc-10.2.0, -O3 -march=core-avx2 | |
- Results are shown for computing both thin U and V, and for computing only thin V, for double and complex<double>. | |
##### Small-ish | |
~~~~ | |
Benchmark Time CPU Time Old Time New CPU Old CPU New | |
----------------------------------------------------------------------------------------------------------------------------------------- | |
bdcsvd_double_computeThinUV/38/38 -0.0150 -0.0146 560730 552343 558549 550367 | |
bdcsvd_double_computeThinUV/38/138 -0.0311 -0.0317 665697 644985 663199 642201 | |
bdcsvd_double_computeThinUV/38/238 -0.0851 -0.0852 783044 716412 780009 713560 | |
bdcsvd_double_computeThinUV/38/338 -0.0897 -0.0905 884546 805231 881206 801484 | |
bdcsvd_double_computeThinUV/38/438 -0.0891 -0.0896 969086 882702 965227 878719 | |
bdcsvd_double_computeThinUV/38/538 -0.1708 -0.1712 1118555 927540 1114049 923328 | |
bdcsvd_double_computeThinUV/38/638 -0.1533 -0.1534 1213303 1027309 1208212 1022824 | |
bdcsvd_double_computeThinUV/38/738 -0.2031 -0.2033 1334411 1063345 1329151 1058981 | |
bdcsvd_double_computeThinUV/38/838 -0.1994 -0.1998 1430272 1145130 1424248 1139691 | |
bdcsvd_double_computeThinV/38/38 +0.0027 +0.0025 495862 497220 494012 495256 | |
bdcsvd_double_computeThinV/38/138 +0.0156 +0.0153 526925 535143 524829 532842 | |
bdcsvd_double_computeThinV/38/238 -0.0104 -0.0108 564038 558148 562120 556069 | |
bdcsvd_double_computeThinV/38/338 -0.0942 -0.0950 654858 593161 652641 590671 | |
bdcsvd_double_computeThinV/38/438 -0.1722 -0.1726 746768 618174 743960 615546 | |
bdcsvd_double_computeThinV/38/538 -0.1859 -0.1867 780695 635544 777907 632657 | |
bdcsvd_double_computeThinV/38/638 -0.2150 -0.2149 842719 661564 839380 658989 | |
bdcsvd_double_computeThinV/38/738 -0.2572 -0.2572 926010 687850 922064 684915 | |
bdcsvd_double_computeThinV/38/838 -0.2878 -0.2876 996698 709868 992685 707202 | |
~~~~ | |
##### Larger | |
~~~~ | |
Benchmark Time CPU Time Old Time New CPU Old CPU New | |
------------------------------------------------------------------------------------------------------------------------------------------- | |
bdcsvd_double_computeThinUV/500/500 -0.0223 -0.0218 120934799 118241344 120206450 117587615 | |
bdcsvd_double_computeThinUV/500/1000 -0.0781 -0.0768 168039073 154912570 166927595 154101147 | |
bdcsvd_double_computeThinUV/500/2000 -0.0841 -0.0839 252251583 231030995 250837282 229786319 | |
bdcsvd_double_computeThinUV/500/3000 -0.2933 -0.2931 354488660 250504511 352466612 249157515 | |
bdcsvd_double_computeThinUV/500/4000 -0.3849 -0.3845 498082335 306358967 495251025 304808607 | |
bdcsvd_double_computeThinUV/500/5000 -0.4102 -0.4101 611187632 360497410 607805405 358562461 | |
bdcsvd_double_computeThinUV/500/6000 -0.5858 -0.5855 993030518 411281976 986559546 408897937 | |
bdcsvd_double_computeThinV/500/500 +0.0023 +0.0025 96144720 96369521 95591914 95829350 | |
bdcsvd_double_computeThinV/500/1000 -0.0108 -0.0104 125544195 124185888 124790609 123487358 | |
bdcsvd_double_computeThinV/500/2000 +0.0516 +0.0515 169231449 177963941 168368397 177041849 | |
bdcsvd_double_computeThinV/500/3000 -0.3211 -0.3211 220459022 149679402 219407709 148955709 | |
bdcsvd_double_computeThinV/500/4000 -0.3434 -0.3435 292084731 191771479 290552164 190751061 | |
bdcsvd_double_computeThinV/500/5000 -0.4782 -0.4776 404396172 211012044 401878571 209961360 | |
bdcsvd_double_computeThinV/500/6000 -0.6294 -0.6291 598237028 221678815 594283542 220415778 | |
bdcsvd_complex_double_computeThinUV/500/500 +0.0916 +0.0897 255520963 278933399 254162600 276962257 | |
bdcsvd_complex_double_computeThinUV/500/1000 -0.0046 -0.0063 470148880 467963802 467394496 464450050 | |
bdcsvd_complex_double_computeThinUV/500/2000 +0.0331 +0.0314 917793280 948137549 912521721 941157706 | |
bdcsvd_complex_double_computeThinUV/500/3000 -0.2198 -0.2211 1360012279 1061090288 1351584140 1052804356 | |
bdcsvd_complex_double_computeThinUV/500/4000 -0.3050 -0.3053 1915540819 1331249650 1903755375 1322459094 | |
bdcsvd_complex_double_computeThinUV/500/5000 -0.2804 -0.2804 2421978472 1742944820 2405996094 1731417102 | |
bdcsvd_complex_double_computeThinUV/500/6000 -0.3471 -0.3468 2871235454 1874670684 2851029992 1862315016 | |
bdcsvd_complex_double_computeThinV/500/500 -0.0028 -0.0027 197499887 196955268 196285795 195757475 | |
bdcsvd_complex_double_computeThinV/500/1000 -0.0236 -0.0235 318554734 311042543 316618729 309192952 | |
bdcsvd_complex_double_computeThinV/500/2000 -0.0431 -0.0423 567077441 542655686 563321471 539468600 | |
bdcsvd_complex_double_computeThinV/500/3000 -0.4819 -0.4812 915616107 474385083 909423431 471792729 | |
bdcsvd_complex_double_computeThinV/500/4000 -0.4868 -0.4862 1144582829 587365450 1137211810 584304876 | |
bdcsvd_complex_double_computeThinV/500/5000 -0.5118 -0.5114 1436150418 701125911 1427595270 697484811 | |
bdcsvd_complex_double_computeThinV/500/6000 -0.5330 -0.5328 1733322684 809429173 1722951672 805023436 | |
OVERALL_GEOMEAN -0.2741 -0.2740 0 0 0 0 | |
~~~~ | |
##### Very Tall | |
~~~~ | |
Benchmark Time CPU Time Old Time New CPU Old CPU New | |
---------------------------------------------------------------------------------------------------------------------------------------------- | |
bdcsvd_double_computeThinUV/32/100000 -0.3170 -0.3175 226529586 154717160 225472766 153888702 | |
bdcsvd_double_computeThinUV/264/100000 -0.6096 -0.6095 5149984761 2010317098 5119070612 1998970140 | |
bdcsvd_double_computeThinUV/1024/100000 -0.6538 -0.6530 59780980101 20698847700 59313139187 20578710233 | |
bdcsvd_double_computeThinV/32/100000 -0.2606 -0.2609 258735994 191316443 257246119 190123143 | |
bdcsvd_double_computeThinV/264/100000 -0.6068 -0.6065 5527338202 2173494907 5493199942 2161352169 | |
bdcsvd_double_computeThinV/1024/100000 -0.6419 -0.6411 60740516522 21750376091 60278909057 21633052616 | |
bdcsvd_complex_double_computeThinUV/32/100000 -0.2993 -0.2989 636792072 446202903 633206937 443951859 | |
bdcsvd_complex_double_computeThinUV/264/100000 -0.4889 -0.4878 18326362603 9366890170 18194148881 9318136331 | |
bdcsvd_complex_double_computeThinUV/1024/100000 -0.4089 -0.4083 195544107745 115591737930 194190302999 114900596242 | |
bdcsvd_complex_double_computeThinV/32/100000 -0.4411 -0.4417 368514577 205950727 366569465 204654500 | |
bdcsvd_complex_double_computeThinV/264/100000 -0.6587 -0.6583 12379953027 4225803248 12292426264 4200267389 | |
bdcsvd_complex_double_computeThinV/1024/100000 -0.6676 -0.6669 125078044965 41579651579 124086840510 41338883708 | |
OVERALL_GEOMEAN -0.5259 -0.5255 7 3 7 3 | |
~~~~",Arthur,2022-05-27T02:00:25.878Z,NA,NA,"## Title: | |
Add R-Bidiagonalization step to BDCSVD | |
## Authors: | |
Arthur | |
## Summary: | |
This merge request introduces an R-Bidiagonalization step to the BDCSVD implementation in the Eigen C++ library. This improvement is based on techniques discussed in the ""Matrix Computations"" book by Golub and Van Loan. The addition optimizes the SVD process for tall or wide matrices by leveraging QR decomposition, ultimately enhancing computational efficiency. | |
### Key Changes: | |
- Implemented R-Bidiagonalization as an initial step in the BDCSVD algorithm. | |
- Uses QR decomposition on the input matrix, followed by standard SVD routines on a reduced R matrix. | |
- Includes an application of matrix Q to the left of U or V to achieve the final SVD. | |
### Improvements: | |
- Enhanced performance for tall matrices by reducing the computational work required in the bidiagonalization process. | |
- The author's informal benchmarks show little change for square inputs and time reductions of roughly 26% to 67% in the Very Tall set (sizes 32, 264, and 1024 by 100000), with a geometric-mean reduction of about 53% for that set. | |
### Impact: | |
The changes improve the speed of SVD computations in the Eigen library, especially for large and tall matrices, addressing a demand identified in user discussions. The implementation passes the test suite and shows promising performance enhancements across various benchmarks, confirming its utility and effectiveness." | |
987 (https://gitlab.com/libeigen/eigen/-/merge_requests/987),Fix integer shortening warnings in visitor tests.,NA,Rasmus Munk Larsen,2022-05-27T18:51:38.136Z,NA,NA,"## Title: | |
Fix integer shortening warnings in visitor tests. | |
## Authors: | |
Rasmus Munk Larsen | |
## Summary: | |
This merge request addresses and fixes integer shortening warnings that were present in the visitor tests of the Eigen C++ library. | |
### Key Changes: | |
- Resolved integer shortening warnings in visitor tests. | |
### Improvements: | |
- Enhanced code quality by eliminating warnings, leading to cleaner and safer code. | |
### Impact: | |
- A reduction in compiler warnings, which can improve maintainability and readability of the codebase, ensuring that future contributions encounter fewer issues related to integer type handling." | |
972 (https://gitlab.com/libeigen/eigen/-/merge_requests/972),Add AVX512 s/dgemm optimizations for compute kernel (2nd try),"This is a follow up to resolve issues that pop up after [!860](https://gitlab.com/libeigen/eigen/-/merge_requests/860) was merged but reverted. | |
It addresses the following: | |
* Build issue on NEON64 | |
* Rename/move of transpose for trsm | |
* Rename the data mapper member to something other than `incr` to avoid shadowing | |
I'm keeping the commits separate for now to help with readability. I will squash all of them on final rebase.",aaraujom,2022-05-28T02:00:22.752Z,NA,NA,"## Title: | |
Add AVX512 s/dgemm optimizations for compute kernel (2nd try) | |
## Authors: | |
aaraujom | |
## Summary: | |
This merge request implements optimizations for the AVX512 s/dgemm compute kernel, addressing several issues that arose after a previous merge which was subsequently reverted. | |
### Key Changes: | |
- Resolved a build issue on NEON64. | |
- Renamed and moved transpose functionality for trsm. | |
- Changed a member name in the data mapper to avoid naming conflicts with `incr`. | |
### Improvements: | |
- Enhancements ensure better compatibility and functionality across different architectures, specifically addressing issues that emerged in previous implementations. | |
### Impact: | |
These changes reintroduce the AVX512 s/dgemm kernel optimizations from !860 while fixing the NEON64 build break and the `incr` shadowing that followed the original merge." | |
981 (https://gitlab.com/libeigen/eigen/-/merge_requests/981),Adding an MKL adapter in FFT module.,"Fix missing template argument bug in FFT header for mkl adapter. | |
Add adapter implementations for the MKL, KFR, and FFTS FFT libraries. oneAPI MKL is now free for both Windows and Linux and gives performance similar to FFTW on Intel CPUs. KFR FFT gives performance similar to FFTW; it requires either a GPL or a commercial license. FFTS gives good performance, but there has been no developer activity for the last three years. Another drawback is that FFTS has not completed real-to-complex and 2D transforms for double precision. For 1D real-to-complex and inverse transforms I have to convert the real array to a complex one with Im = 0.",Oleg Shirokobrod,2022-06-02T18:10:44.351Z,NA,NA,"## Title: | |
Adding an MKL adapter in FFT module. | |
## Authors: | |
Oleg Shirokobrod | |
## Summary: | |
This merge request introduces an MKL adapter to the FFT module, addressing a bug related to missing template arguments in the FFT header for the MKL adapter. Additionally, it includes adapter implementations for the MKL, KFR, and FFTS FFT libraries. | |
### Key Changes: | |
- Fixed a missing template argument bug in the FFT header for the MKL adapter. | |
- Added adapter implementations for MKL, KFR, and FFTS FFT libraries. | |
### Improvements: | |
- oneAPI MKL is now free for both Windows and Linux and offers performance comparable to FFTW on Intel CPUs. | |
- The KFR FFT library provides performance similar to FFTW but requires either a GPL or a commercial license. | |
- The FFTS library performs well, although it has seen no developer activity in the past three years and lacks real-to-complex and 2D transforms in double precision. | |
### Impact: | |
The addition of the MKL adapter and other FFT library implementations enhances the flexibility and performance options available in the Eigen C++ library for FFT operations, catering to different licensing requirements and performance needs." | |
989 (https://gitlab.com/libeigen/eigen/-/merge_requests/989),Fix c++20 ambiguity of comparisons.,"Via Google core library team: usage of comparison operators and | |
templates are now ambiguous in c++20 due to operator reversal. This change | |
resolves the ambiguity.",Antonio Sánchez,2022-06-03T05:11:07.793Z,NA,NA,"## Title: | |
Fix c++20 ambiguity of comparisons. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses the issue of ambiguity in comparison operators and templates introduced by C++20 due to operator reversal. The changes made resolve these ambiguities, ensuring clearer behavior in comparisons. | |
### Key Changes: | |
- Resolution of ambiguity in comparison operators as per C++20 standards. | |
### Improvements: | |
- Enhanced clarity in the behavior of comparison templates, preventing potential errors in code that relies on these operations. | |
### Impact: | |
- The changes improve compliance with C++20, contributing to more robust and reliable code within the Eigen library, minimizing the risk of unexpected behavior in user applications." | |
988 (https://gitlab.com/libeigen/eigen/-/merge_requests/988),Fix build issues with MSVC for AVX512,"There are a couple of build issues with MSVC currently. I've seen them in this [pipeline](https://gitlab.com/libeigen/eigen_ci_cross_testing/-/jobs/2516428634). This addresses the ones I've seen so far. Tests are still building. MSVC is being really slow to build and is consuming quite a bit of memory. | |
--- | |
~Edit: I've seen a single process of `cl.exe` taking more than 60GB of memory. I think it will be more prudent to disable the recent optimizations added for AVX512 (GemmKernel and TrsmKernel) to avoid build issues on Windows with MSVC. I think this might be related to the template recursion used in both kernels. Compilation aborts with fatal error below.~ | |
``` | |
fatal error C1002: compiler is out of heap space in pass 2 | |
```",aaraujom,2022-06-03T14:55:41.443Z,NA,NA,"## Title: | |
Fix build issues with MSVC for AVX512 | |
## Authors: | |
aaraujom | |
## Summary: | |
This merge request addresses build issues encountered when using MSVC for AVX512 in the Eigen C++ library. The author has identified several problems related to high memory consumption and slow build times. To mitigate these issues, particularly those causing compilation failures, recent optimizations for AVX512 have been temporarily disabled. | |
### Key Changes: | |
- Identification and addressing of build issues with MSVC for AVX512-related components. | |
- Temporary disabling of optimizations for GemmKernel and TrsmKernel. | |
### Improvements: | |
- Reduction in memory consumption during the build process by avoiding optimizer-related failures. | |
- Enhancement of stability in the build process under MSVC, preventing fatal compilation errors. | |
### Impact: | |
- Improved build reliability for users compiling the Eigen library with MSVC, particularly in environments where AVX512 optimizations were causing resource exhaustion. | |
- Mitigation of slow build processes that could hinder development and integration efforts." | |
990 (https://gitlab.com/libeigen/eigen/-/merge_requests/990),Provide DiagonalMatrix Product and Initializers,"### What does this implement/fix? | |
This includes two tiny additions to DiagonalMatrix: 1. diag*diag products and 2. static initializers for zero and identity. | |
~~~~ | |
// dumb example | |
DiagType m = ...; | |
DiagType n = ...; | |
DiagType B = (m * n) + 2 * DiagType::Identity(); | |
~~~~ | |
### Additional information | |
<!--Any additional information you think is important.--> | |
These are just things I've tried before and sort of assumed would work. Just for convenience, so nbd if stuff like this isn't desired! | |
I'm guessing DiagonalMatrix is meant to be lightweight since most stuff is pretty easy to do with vector ops on ``.diagonal()``? Maybe there could be other methods for stuff like operatorNorm, determinant, etc as well that have simple, but not totally obvious, implementations for diagonal matrices :shrug:",Arthur,2022-06-06T21:43:22.741Z,NA,NA,"## Title: | |
Provide DiagonalMatrix Product and Initializers | |
## Authors: | |
Arthur | |
## Summary: | |
This merge request introduces two enhancements to the DiagonalMatrix class in the Eigen C++ library: the ability to perform diagonal matrix multiplication and the addition of static initializers for zero and identity matrices. | |
### Key Changes: | |
1. Implements the product operation for diagonal matrices (diag * diag). | |
2. Adds static initializers for zero and identity diagonal matrices. | |
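The two additions can be illustrated with a small stand-alone sketch. The `Diag` type below is a hypothetical stand-in that stores only the diagonal; it is not Eigen's `DiagonalMatrix`, and the operator set is an assumption chosen to mirror the example in the description.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a diagonal matrix: only the diagonal is stored.
struct Diag {
  std::vector<double> d;

  static Diag Zero(std::size_t n) { return Diag{std::vector<double>(n, 0.0)}; }
  static Diag Identity(std::size_t n) { return Diag{std::vector<double>(n, 1.0)}; }
};

// diag * diag: multiply matching diagonal entries, O(n) work.
inline Diag operator*(const Diag& a, const Diag& b) {
  assert(a.d.size() == b.d.size());
  Diag out = Diag::Zero(a.d.size());
  for (std::size_t i = 0; i < a.d.size(); ++i) out.d[i] = a.d[i] * b.d[i];
  return out;
}

// scalar * diag: scale every diagonal entry.
inline Diag operator*(double s, const Diag& a) {
  Diag out = a;
  for (double& x : out.d) x *= s;
  return out;
}

// diag + diag: add matching diagonal entries.
inline Diag operator+(const Diag& a, const Diag& b) {
  assert(a.d.size() == b.d.size());
  Diag out = a;
  for (std::size_t i = 0; i < out.d.size(); ++i) out.d[i] += b.d[i];
  return out;
}
```

With these pieces an expression of the form `(m * n) + 2.0 * Diag::Identity(k)` never touches off-diagonal entries, which is the convenience the merge request is after.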
### Improvements: | |
These changes make it easier to work with diagonal matrices by simplifying multiplication and providing convenient initialization options. | |
### Impact: | |
The enhancements improve the usability of the DiagonalMatrix class, allowing for more straightforward operations and potentially reducing the need for additional computations when working with diagonal structures in matrix algebra." | |
991 (https://gitlab.com/libeigen/eigen/-/merge_requests/991),Fix ambiguous comparisons for c++20 (again again),"C++20 introduces a reversibility lookup for comparison operators, | |
which leads to ambiguous comparison warnings in clang. | |
Modify comparisons in `TensorBase` to be symmetric. | |
This is a redo of !982 and !989.",Antonio Sánchez,2022-06-07T17:06:18.241Z,NA,NA,"## Title: | |
Fix ambiguous comparisons for c++20 (again again) | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses issues with ambiguous comparison warnings in clang due to changes in C++20, specifically regarding reversibility lookup for comparison operators. The modifications focus on making the comparisons in `TensorBase` symmetric. | |
### Key Changes: | |
- Adjusted comparison operators in `TensorBase` to resolve ambiguities introduced by C++20. | |
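A minimal sketch of what a symmetric comparison can look like for a CRTP hierarchy, assuming a plain `bool` result. `Base`, `A`, `B`, `C`, and `id()` are hypothetical; the real `TensorBase` operators return expression objects, and the exact form of the fix is not shown in the description.

```cpp
// Hypothetical CRTP base, standing in for a class like TensorBase.
template <typename Derived>
struct Base {
  const Derived& derived() const { return static_cast<const Derived&>(*this); }
};

// Asymmetric form (shown only as a comment): a member template such as
//   template <typename Other> bool operator==(const Other& rhs) const;
// converts the left operand to Base<Derived> but matches the right operand
// exactly. In C++20 the reversed candidate does the opposite, so neither
// candidate is a better match and a == b can be reported as ambiguous.

// Symmetric form: both operands go through the same derived-to-base
// conversion, so the non-reversed candidate wins the tie-break.
template <typename L, typename R>
bool operator==(const Base<L>& lhs, const Base<R>& rhs) {
  return lhs.derived().id() == rhs.derived().id();
}

struct A : Base<A> { int id() const { return 1; } };
struct B : Base<B> { int id() const { return 1; } };
struct C : Base<C> { int id() const { return 2; } };
```

The same source compiles under C++17 and C++20, because the symmetric free function does not depend on operator reversal.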
### Improvements: | |
- Enhanced clarity and functionality of comparison operations within the library, leading to more predictable behavior during comparisons. | |
### Impact: | |
- Reduces compilation warnings in clang, improving the developer experience while working with Eigen, particularly for C++20 users. This change is expected to maintain code robustness and prevent potential comparison-related bugs." | |
993 (https://gitlab.com/libeigen/eigen/-/merge_requests/993),Fix row vs column vector typo in Matrix class tutorial,"In the matrix class tutorial, the role of column- and row-vector is swapped in one of the example code snippets. This is just a typo, but it occurs at a very prominent place and might confuse first-time users of Eigen.",sfalmo,2022-06-07T17:28:20.125Z,NA,NA,"## Title: | |
Fix row vs column vector typo in Matrix class tutorial | |
## Authors: | |
sfalmo | |
## Summary: | |
This merge request corrects a typo in the Matrix class tutorial of the Eigen C++ library, where the roles of row-vectors and column-vectors are mistakenly swapped in an example code snippet. | |
### Key Changes: | |
- Corrected the swap of row- and column-vector terminology in the tutorial. | |
### Improvements: | |
- Enhanced clarity and accuracy in the documentation, making it more user-friendly for first-time users. | |
### Impact: | |
- Reduces potential confusion for new users, ensuring that they receive correct information when learning about the Matrix class, thus improving their overall experience with the Eigen library." | |
992 (https://gitlab.com/libeigen/eigen/-/merge_requests/992),AVX512 TRSM Kernels respect EIGEN_NO_MALLOC,"### What does this implement/fix? | |
Addresses `malloc` comments in https://gitlab.com/libeigen/eigen/-/merge_requests/988. Switched `malloc` calls to eigen's `handmade` versions. To respect `EIGEN_NO_MALLOC`, the `trsmKernelL` kernels are disabled if `malloc` is not allowed. The previous struct `trsm_kernels` is split into `trsmKernelL`/`trsmKernelR` to make disabling the left-variant kernels simpler. `EIGEN_USE_AVX512_TRSM_KERNELS` and `EIGEN_ENABLE_AVX512_NOCOPY_TRSM_CUTOFFS` macros are split apart similarly. | |
### Additional information | |
<!--Any additional information you think is important.--> | |
It seems that the general triangular solve driver does not fully support `EIGEN_NO_MALLOC` even with the AVX512 optimizations (GEBP and TRSM) disabled. From some quick testing, I saw that for double precision and moderate sized problems (> M=N=~200), eigen's `check_that_malloc_is_allowed()` fails.",b-shi,2022-06-07T18:53:54.510Z,NA,NA,"## Title: | |
AVX512 TRSM Kernels respect EIGEN_NO_MALLOC | |
## Authors: | |
b-shi | |
## Summary: | |
This merge request enhances the AVX512 TRSM kernels in the Eigen C++ library by ensuring they adhere to the `EIGEN_NO_MALLOC` configuration, implementing changes to reduce `malloc` usage and improve memory management. | |
### Key Changes: | |
- Switched from standard `malloc` calls to Eigen’s custom memory allocation methods. | |
- Introduced a mechanism to disable `trsmKernelL` kernels when `malloc` is not permitted under `EIGEN_NO_MALLOC`. | |
- Split the previous `trsm_kernels` struct into `trsmKernelL` and `trsmKernelR` for easier disabling of left-variant kernels. | |
- Separated the macros `EIGEN_USE_AVX512_TRSM_KERNELS` and `EIGEN_ENABLE_AVX512_NOCOPY_TRSM_CUTOFFS` to provide clearer configuration options. | |
### Improvements: | |
- The new structure and memory handling enhance the library’s compliance with user-defined malloc settings, potentially reducing memory allocation issues for users who specify `EIGEN_NO_MALLOC`. | |
### Impact: | |
This update improves the robustness of the Eigen library when used in environments where dynamic memory allocation is restricted. However, the MR author notes that the general triangular solve driver does not appear to fully support `EIGEN_NO_MALLOC` even with the AVX512 optimizations (GEBP and TRSM) disabled: in quick tests with double precision and problem sizes above roughly M=N=200, `check_that_malloc_is_allowed()` fails." | |
994 (https://gitlab.com/libeigen/eigen/-/merge_requests/994),Mark `index_remap` as `EIGEN_DEVICE_FUNC` in `src/Core/Reshaped.h` (Fixes #2493),"### Reference issue | |
Fixes #2493 | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
Marking `index_remap` as `EIGEN_DEVICE_FUNC` allows usage of expression reshape in GPU code. | |
### Additional information | |
<!--Any additional information you think is important.-->",Binhao Qin,2022-06-07T20:10:48.427Z,NA,NA,"## Title: | |
Mark `index_remap` as `EIGEN_DEVICE_FUNC` in `src/Core/Reshaped.h` (Fixes #2493) | |
## Authors: | |
Binhao Qin | |
## Summary: | |
This merge request enhances the Eigen C++ library by modifying the `index_remap` function within the `src/Core/Reshaped.h` file, marking it as `EIGEN_DEVICE_FUNC`. This change facilitates the utilization of expression reshaping in GPU code. | |
### Key Changes: | |
- Marked `index_remap` as `EIGEN_DEVICE_FUNC`. | |
### Improvements: | |
- Reshaped expressions can now be used in device (GPU) code. | |
### Impact: | |
- Extends the set of Eigen expressions that compile for GPU targets. The change is an annotation on `index_remap` only; the MR makes no performance claim." | |
995 (https://gitlab.com/libeigen/eigen/-/merge_requests/995),Document DiagonalBase,"### What does this implement/fix? | |
DiagonalBase is undocumented and doesn't show up in the [class lists](https://eigen.tuxfamily.org/dox/group__Core__Module.html). It has the math-y diagonal methods so I think it's important that it has some docs. I tried to base these off of other docs I saw in MatrixBase. | |
I also included some cleanup and clang-formatted the updated portions. | |
The type aliases are kind of ugly, but seem necessary because the return types can be very long and easily go past the end of my screen in the doc website. IDK if there's a better way to handle that. | |
Happy to make any changes!",Arthur,2022-06-08T17:46:32.660Z,NA,NA,"## Title: | |
Document DiagonalBase | |
## Authors: | |
Arthur | |
## Summary: | |
This merge request focuses on adding documentation for the undocumented DiagonalBase class in the Eigen C++ library. The updates aim to enhance clarity and usability by organizing information related to its mathematical methods. | |
### Key Changes: | |
- Added documentation for the DiagonalBase class, addressing its mathematical functions. | |
- Improved integration of DiagonalBase in the class listings on the documentation website. | |
- Included code cleanup and ran clang-format on the updated portions. | |
### Improvements: | |
- Enhanced visibility of DiagonalBase through comprehensive documentation, making it easier for users to understand and utilize its features. | |
- Addressed long return types with type aliases to maintain readability in the documentation. | |
### Impact: | |
The addition of documentation for DiagonalBase helps users better understand the functionalities available, thus improving the overall usability of the Eigen library. It facilitates developers’ ability to leverage DiagonalBase effectively in their applications." | |
996 (https://gitlab.com/libeigen/eigen/-/merge_requests/996),[SYCL-Spec] According to [SYCL-2020 spec](...,"According to the [SYCL-2020 spec](https://www.khronos.org/registry/SYCL/specs/sycl-2020/html/sycl-2020.html#sec:naming.kernels), the types used for kernel names must be C++ types and must be forward declarable. | |
Since Eigen's kernels are built from expression trees, the expression type is used as the | |
name of the kernel. When an enum is used in the kernel's name, the SYCL integration header must forward declare that enum. [To allow this forward declaration](https://docs.microsoft.com/en-us/cpp/cpp/enumerations-cpp?view=msvc-170), either the unscoped enum must have an underlying type specified or a scoped enum (e.g. `enum class`) must be used. However, using a scoped enum requires each enumerator to be qualified by the enum type (e.g. `EnumCLASS::VALUE`), which can be a more intrusive change. This PR gives the enums an `int` underlying type, which is less intrusive and complies with the SYCL 2020 spec for kernel names.",Mehdi Goli,2022-06-13T15:52:30.230Z,NA,NA,"## Title: | |
SYCL-Spec Compliance for Kernel Names | |
## Authors: | |
Mehdi Goli | |
## Summary: | |
This merge request updates the Eigen C++ library to ensure compliance with the SYCL-2020 specification regarding kernel naming conventions. The modification involves using appropriate types for kernel names, which are derived from the expression tree within the Eigen framework. | |
### Key Changes: | |
- Made the enum types that appear in kernel names forward declarable, as the SYCL 2020 spec requires. | |
- Gave the affected enums an explicit `int` underlying type instead of converting them to scoped enums, which would have required qualifying every enumerator. | |
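
The forward-declaration rule behind this change can be sketched in plain C++. The names `StorageKind`, `DenseKind`, `SparseKind` and `KernelName` are hypothetical and are not taken from Eigen or SYCL; the sketch only shows that an unscoped enum with a fixed underlying type can be declared before it is defined.

```cpp
#include <type_traits>

// Opaque (forward) declaration: legal only because the underlying type is fixed.
// A plain `enum StorageKind;` without `: int` is ill-formed in standard C++.
enum StorageKind : int;

// A template can name the enum type before its enumerators are visible,
// which is what a generated header needs in order to name a kernel type.
template <StorageKind K>
struct KernelName;

// The definition; enumerators stay unqualified, unlike with `enum class`.
enum StorageKind : int { DenseKind = 0, SparseKind = 1 };

template <StorageKind K>
struct KernelName {
  static constexpr int value = static_cast<int>(K);
};

// True when the enum really uses int as its underlying type.
constexpr bool underlying_is_int =
    std::is_same<std::underlying_type<StorageKind>::type, int>::value;
```

Existing code that writes `DenseKind` or `SparseKind` without qualification keeps compiling, which is the reason the MR describes this route as less intrusive than a scoped enum.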
### Improvements: | |
- Enhanced compatibility with SYCL-2020 specifications, particularly in how kernels are named using enumeration types. | |
### Impact: | |
- Ensures better integration with SYCL, resulting in more robust and standard-compliant code for users leveraging SYCL features in the Eigen library." | |
998 (https://gitlab.com/libeigen/eigen/-/merge_requests/998),Fix tanh and erf to use vectorized version for EIGEN_FAST_MATH in VSX.,Fix tanh and erf to use vectorized version for EIGEN_FAST_MATH in VSX.,Chip Kerchner,2022-06-15T16:06:44.934Z,NA,NA,"## Title: | |
Fix tanh and erf to use vectorized version for EIGEN_FAST_MATH in VSX. | |
## Authors: | |
Chip Kerchner | |
## Summary: | |
This merge request addresses the implementation of the hyperbolic tangent (tanh) and error function (erf) in the Eigen C++ library, ensuring they leverage a vectorized approach when the EIGEN_FAST_MATH option is enabled in the VSX architecture. | |
### Key Changes: | |
- Updated the implementations of the tanh and erf functions to utilize vectorized operations specifically for EIGEN_FAST_MATH in the VSX environment. | |
### Improvements: | |
- Enhanced computational efficiency for the tanh and erf functions by using vectorized versions, which can lead to faster execution times. | |
### Impact: | |
- Users on the VSX architecture will benefit from improved performance in calculations involving tanh and erf, making mathematical operations in Eigen more efficient." | |
997 (https://gitlab.com/libeigen/eigen/-/merge_requests/997),AVX512 TRSM kernels use alloca if EIGEN_NO_MALLOC requested,"### What does this implement/fix? | |
Follow-up PR to address comments in https://gitlab.com/libeigen/eigen/-/merge_requests/992. In that PR, LHS variants of TRSM kernels are disabled if `EIGEN_NO_MALLOC` is requested. In particular the use of `alloca` was suggested [here](https://gitlab.com/libeigen/eigen/-/merge_requests/992#note_974476732) instead of completely disabling the LHS variant AVX512 TRSM kernels. | |
This PR changes the behaviour as follows: | |
- If `EIGEN_NO_MALLOC` is requested: | |
- If max temp workspace size using default blocking sizes is less than `EIGEN_STACK_ALLOCATION_LIMIT` then use `alloca`. | |
- Otherwise, reduce blocking size up to the minimum supported then use `alloca` (perf. is still better than generic trsm kernel, see graph below) | |
- If max temp workspace size using minimum blocking sizes is still larger than `EIGEN_STACK_ALLOCATION_LIMIT` then throw assertion. | |
- If `EIGEN_NO_MALLOC` is not requested we use `handmade_aligned_malloc` | |
### Additional information | |
There is a noticeable performance hit (see graph below) when using `alloca` vs `malloc`, so `malloc` is still used if allowed. | |
 | |
- Non-optimized: generic trsm kernels, code-path used when `EIGEN_NO_MALLOC` is requested (behaviour as of https://gitlab.com/libeigen/eigen/-/merge_requests/992) | |
- Min-blocking: AVX512 trsm kernels with minimum required blocking sizes + `alloca`. | |
- Default-blocking: AVX512 trsm kernels with default blocking sizes + `alloca`. | |
- Malloc: Default-blocking: AVX512 trsm kernels with default blocking sizes + `malloc`.",b-shi,2022-06-17T18:05:27.791Z,NA,NA,"## Title: | |
AVX512 TRSM kernels use alloca if EIGEN_NO_MALLOC requested | |
## Authors: | |
b-shi | |
## Summary: | |
This merge request introduces changes to the AVX512 TRSM kernels in the Eigen C++ library. It allows for the use of `alloca` instead of completely disabling the left-hand side (LHS) variants of TRSM kernels when memory allocation is restricted by the `EIGEN_NO_MALLOC` flag. | |
### Key Changes: | |
- Introduced behavior where `alloca` is used for temporary workspace allocations when `EIGEN_NO_MALLOC` is requested. | |
- If the maximum temporary workspace size using the default blocking sizes is below the `EIGEN_STACK_ALLOCATION_LIMIT`, `alloca` is utilized. | |
- If it exceeds the limit, the blocking size is reduced, down to the minimum supported size if necessary, and `alloca` is still used. | |
- An assertion is triggered if the workspace size exceeds the limit even at the minimum blocking size. | |
- When `EIGEN_NO_MALLOC` is not set, `handmade_aligned_malloc` is used. | |
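
The decision order listed above can be sketched as a small selection function. This is an illustration only, not the code from the MR: `Workspace`, `choose_workspace` and all parameter names are hypothetical, and the handling of a workspace exactly equal to the limit is an assumption.

```cpp
#include <cstddef>

// Hypothetical labels for the four outcomes described in the MR.
enum class Workspace { Heap, StackDefaultBlocking, StackReducedBlocking, TooLarge };

// Picks a workspace strategy from byte counts.
// no_malloc              : true when heap allocation is forbidden
// bytes_default_blocking : workspace needed with the default blocking sizes
// bytes_min_blocking     : workspace needed with the minimum blocking sizes
// stack_limit            : largest allocation allowed on the stack
inline Workspace choose_workspace(bool no_malloc,
                                  std::size_t bytes_default_blocking,
                                  std::size_t bytes_min_blocking,
                                  std::size_t stack_limit) {
  // Heap allocation is preferred whenever it is allowed.
  if (!no_malloc) return Workspace::Heap;
  // Boundary handling (<= versus <) is an assumption of this sketch.
  if (bytes_default_blocking <= stack_limit) return Workspace::StackDefaultBlocking;
  if (bytes_min_blocking <= stack_limit) return Workspace::StackReducedBlocking;
  // The MR raises an assertion in this case.
  return Workspace::TooLarge;
}
```

Checking the heap case first mirrors the MR's note that `malloc` remains the faster option and is used whenever it is permitted.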
### Improvements: | |
- Enhanced memory management in the AVX512 TRSM kernels by allowing stack allocation when dynamic allocation is disabled. | |
- Performance retention by ensuring that even when using `alloca`, the performance remains better compared to generic TRSM kernels. | |
### Impact: | |
The merge request improves usability for scenarios where dynamic memory allocation is restricted, helping maintain performance efficiencies in the Eigen library's AVX512 implementation. It addresses previous limitations by providing a method to allocate necessary resources without reverting to slower generic implementations." | |
999 (https://gitlab.com/libeigen/eigen/-/merge_requests/999),Use numext::sqrt in Householder.h.,"This is to make it easier to apply to custom types - by using the | |
`numext` version, the user can specialize the function more easily. | |
Otherwise we require a `sqrt` function to be defined prior to including | |
the Eigen headers, which can be awkward and lead to header-include-order | |
issues. | |
Related to #2496",Antonio Sánchez,2022-06-21T16:30:00.070Z,NA,NA,"## Title: | |
Use numext::sqrt in Householder.h. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request updates the Householder.h file in the Eigen C++ library to use the `numext::sqrt` function. This change aims to facilitate the application of custom types by allowing users to specialize the square root function more easily. | |
### Key Changes: | |
- Replaced the existing square root function in Householder.h with `numext::sqrt`. | |
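
The customization pattern this relies on can be sketched without Eigen. Everything below is hypothetical (`mylib`, `norm2`, `Fixed`) and is not Eigen's actual implementation; it only illustrates why a qualified call to a library-owned `sqrt` template gives users a single, well-defined place to plug in their own type.

```cpp
#include <cmath>

namespace mylib {
namespace numext {
// Generic fallback: forwards to std::sqrt for built-in floating-point types.
template <typename T>
inline T sqrt(const T& x) { return std::sqrt(x); }
}  // namespace numext

// Library code calls the qualified numext::sqrt, not an unqualified sqrt.
template <typename T>
inline T norm2(const T& a, const T& b) { return numext::sqrt(a * a + b * b); }
}  // namespace mylib

// A user-defined scalar type with the arithmetic norm2 needs.
struct Fixed { double v; };
inline Fixed operator*(Fixed a, Fixed b) { return Fixed{a.v * b.v}; }
inline Fixed operator+(Fixed a, Fixed b) { return Fixed{a.v + b.v}; }

// The user specializes the library's function after including the library header.
namespace mylib {
namespace numext {
template <>
inline Fixed sqrt<Fixed>(const Fixed& x) { return Fixed{std::sqrt(x.v)}; }
}  // namespace numext
}  // namespace mylib
```

In this sketch the specialization is written after the library header, so no free `sqrt` overload has to be visible before the include.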
### Improvements: | |
- Simplifies the use of custom types with Eigen by reducing dependencies on user-defined `sqrt` functions prior to including Eigen headers. | |
### Impact: | |
- Mitigates potential header-include-order issues, leading to a more user-friendly experience when integrating custom types with Eigen." | |
1003 (https://gitlab.com/libeigen/eigen/-/merge_requests/1003),Eliminate undef warnings when not compiling for AVX512.,"The original code assumes some macros are defined, when they are only | |
ever defined for AVX512. Here we add some guards to eliminate the | |
warnings.",Antonio Sánchez,2022-06-24T15:10:11.268Z,NA,NA,"## Title: | |
Eliminate undef warnings when not compiling for AVX512. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses `undef` warnings in the Eigen C++ library that arise when compiling without AVX512 support. The original code tested macros that are only ever defined for AVX512 builds; adding guards around those tests eliminates the warnings. | |
### Key Changes: | |
- Added guards around uses of macros that are only defined when compiling for AVX512. | |
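
A guard of this kind can be sketched as follows. The macro name `MYLIB_USE_AVX512_KERNELS` is hypothetical and stands in for any macro that only an AVX512 build defines; the MR does not list the macros it guards.

```cpp
// With -Wundef, a bare `#if MYLIB_USE_AVX512_KERNELS` warns when the macro
// has never been defined. Testing defined(...) first means the undefined
// identifier is never evaluated.
#if defined(MYLIB_USE_AVX512_KERNELS) && MYLIB_USE_AVX512_KERNELS
constexpr bool kUseAvx512Kernels = true;
#else
constexpr bool kUseAvx512Kernels = false;
#endif
```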
### Improvements: | |
- Enhanced code stability by removing unnecessary compiler warnings, leading to cleaner build outputs. | |
### Impact: | |
- Improved user experience during compilation by eliminating confusing warnings, making it easier for developers to identify genuine issues in the code." | |
1001 (https://gitlab.com/libeigen/eigen/-/merge_requests/1001),Skip f16/bf16 bessel specializations on AVX512 if unavailable.,"The bessel functions are not available for AVX512 on msvc prior to 1923 (VS 2019) | |
or old versions of gcc (prior to 5.3). This causes a build error since | |
`pexp` is not available for these half->float specializations. | |
Fixes #2499.",Antonio Sánchez,2022-06-24T15:10:37.063Z,NA,NA,"## Title: | |
Skip f16/bf16 bessel specializations on AVX512 if unavailable. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses a build issue related to the Bessel functions for f16/bf16 specializations on AVX512. It ensures compatibility by skipping these specializations when the required support is unavailable in certain compiler versions. | |
### Key Changes: | |
- Added a check to skip the f16/bf16 Bessel function specializations on AVX512 for MSVC prior to version 1923 and GCC versions earlier than 5.3. | |
### Improvements: | |
- Resolves build errors caused by missing `pexp` functions in unsupported compiler environments. | |
### Impact: | |
- Enhances the portability of the Eigen library, allowing it to build successfully on a wider range of compilers and settings, thereby improving user experience and reducing potential build failures." | |
1002 (https://gitlab.com/libeigen/eigen/-/merge_requests/1002),Fix clang-tidy warnings about function definitions in headers.,"Clang-tidy is generating warnings for this. | |
Also clang-formatted, since the weird indenting made the original hard to read.",Antonio Sánchez,2022-06-24T15:19:57.232Z,NA,NA,"## Title: | |
Fix clang-tidy warnings about function definitions in headers. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses and resolves warnings generated by clang-tidy regarding function definitions in header files. In addition, the code has been reformatted using clang-format to enhance readability. | |
### Key Changes: | |
- Resolved clang-tidy warnings related to function definitions in headers. | |
- Reformatted the affected code for improved readability. | |
### Improvements: | |
- Enhanced code clarity and structure through proper formatting. | |
- Reduced potential issues by adhering to clang-tidy's recommendations. | |
### Impact: | |
The changes improve code maintainability and reduce warning noise, leading to a cleaner codebase for future development." | |
1000 (https://gitlab.com/libeigen/eigen/-/merge_requests/1000),Better performance for Power10 using more load and store vector pairs for GEMV,Better performance for Power10 using more load and store vector pairs,Chip Kerchner,2022-06-27T18:11:56.165Z,NA,NA,"## Title: | |
Better performance for Power10 using more load and store vector pairs for GEMV | |
## Authors: | |
Chip Kerchner | |
## Summary: | |
This merge request introduces optimizations aimed at enhancing the performance of the Eigen C++ library specifically for Power10 architecture. By increasing the number of load and store vector pairs utilized in General Matrix-Vector multiplication (GEMV), the changes are expected to leverage the unique capabilities of Power10 processors. | |
### Key Changes: | |
- Increased the number of load and store vector pairs in GEMV operations tailored for Power10. | |
### Improvements: | |
- Improved efficiency of GEMV computations on Power10, potentially leading to faster execution times for matrix-vector multiplications. | |
### Impact: | |
- The optimizations may lead to significant performance enhancements for applications and libraries utilizing Eigen on Power10 systems, benefiting users with faster computation times in relevant scenarios." | |
947 (https://gitlab.com/libeigen/eigen/-/merge_requests/947),"Add pload_partial, pstore_partial (and unaligned versions), pgather_partial, pscatter_partial, loadPacketPartial and storePacketPartial.","Add ploadN, pstoreN (and unaligned versions), pgatherN, pscatterN, loadPacketN and storePacketN. | |
Useful for: | |
1) memory access - prevent reading/writing past end of data (only elements needed), | |
2) performance - eliminates masking, one Packet vs N scalars, less complexity for edge condition functions/templates (better i-cache), etc. | |
3) partial Packet operations - simplified Packet operations instead of read scalars, merge with Packet, operation, get scalar, write scalars. | |
4) consistent results - reduces variations for scalar vs packet operations",Chip Kerchner,2022-06-27T19:18:01.129Z,NA,NA,"## Title: | |
Add pload_partial, pstore_partial (and unaligned versions), pgather_partial, pscatter_partial, loadPacketPartial and storePacketPartial. | |
## Authors: | |
Chip Kerchner | |
## Summary: | |
This merge request introduces new functionalities for memory access and performance optimization by adding partial loading and storing operations to the Eigen C++ library. These enhancements focus on efficient handling of data by accessing only the necessary elements and streamlining operations with improved consistency between scalar and packet operations. | |
### Key Changes: | |
- Added `pload_partial` and `pstore_partial` (with unaligned versions), `pgather_partial`, and `pscatter_partial`. | |
- Introduced `loadPacketPartial` and `storePacketPartial`. | |
- The MR description refers to these operations as `ploadN`, `pstoreN`, `pgatherN`, `pscatterN`, `loadPacketN`, and `storePacketN`, while the MR title uses the `_partial` names. | |
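
The idea of a partial load and store can be sketched with a scalar stand-in for a SIMD packet. `Packet4f`, `load_partial` and `store_partial` are hypothetical names, not Eigen's signatures, and zero-filling the unused lanes is an assumption of this sketch.

```cpp
#include <cstddef>

// Hypothetical 4-lane packet; real SIMD packets are architecture-specific types.
struct Packet4f {
  float lane[4];
};

// Loads only the first n elements (n <= 4) and zero-fills the remaining lanes,
// so memory past from[n - 1] is never read.
inline Packet4f load_partial(const float* from, std::size_t n) {
  Packet4f p = {{0.0f, 0.0f, 0.0f, 0.0f}};
  for (std::size_t i = 0; i < n && i < 4; ++i) p.lane[i] = from[i];
  return p;
}

// Stores only the first n lanes, leaving to[n] and beyond untouched.
inline void store_partial(float* to, const Packet4f& p, std::size_t n) {
  for (std::size_t i = 0; i < n && i < 4; ++i) to[i] = p.lane[i];
}
```

With helpers like these, the tail of an array whose length is not a multiple of the packet width can go through the same packet code path as the bulk of the data.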
### Improvements: | |
- Enhanced memory access by avoiding reading or writing beyond data boundaries. | |
- Improved performance by minimizing masking and using a single Packet instead of multiple scalars. | |
- Simplified edge condition handling, leading to better utilization of instruction cache (i-cache). | |
- Provided consistent results across scalar and packet operations. | |
### Impact: | |
The changes result in optimized memory usage, better execution speed, and more consistent behavior in operations, making the Eigen library more efficient for both developers and end-users." | |
1005 (https://gitlab.com/libeigen/eigen/-/merge_requests/1005),Enable subtests which use device side malloc since this has been fixed in ROCm 5.2.,"Device side malloc functionality has been fixed in the recently released ROCm 5.2 so reenable the unit tests that use it. | |
/cc @cantonios",Rohit Santhanam,2022-06-29T21:52:08.511Z,NA,NA,"## Title: | |
Enable subtests which use device side malloc since this has been fixed in ROCm 5.2. | |
## Authors: | |
Rohit Santhanam | |
## Summary: | |
This merge request re-enables unit tests that utilize device side malloc functionality following its fix in ROCm 5.2. | |
### Key Changes: | |
- Reenabled unit tests that use device side malloc. | |
### Improvements: | |
- Leverages the stability of device side malloc in ROCm 5.2 for improved testing. | |
### Impact: | |
- Enhances the testing framework by validating functionality that was previously disabled due to issues, ensuring better reliability of device functionalities within the Eigen library." | |
1007 (https://gitlab.com/libeigen/eigen/-/merge_requests/1007),Fix ODR violations.,"This declaration in a header: | |
``` | |
typedef enum { ... } E; | |
``` | |
creates a new unnamed type every time that header is `#include`d, | |
resulting in an (undiagnosed) ODR violation. | |
With header modules such ODR violations cause build failure with a cryptic | |
error message. | |
Fix this by creating a named type instead. | |
Courtesy of Paul Pluzhnikov.",Antonio Sánchez,2022-07-09T04:56:37.478Z,NA,NA,"## Title: | |
Fix ODR violations. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses the issue of One Definition Rule (ODR) violations caused by the repeated declaration of an unnamed type in a header. The fix involves changing the declaration to create a named type, reducing the risk of build failures associated with header modules. | |
### Key Changes: | |
- Replaced the unnamed type declaration with a named type in the header file. | |
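
The shape of the fix can be sketched as below. The names `Mode`, `ModeA`, `ModeB` and `mode_index` are hypothetical; the MR description elides the actual enum.

```cpp
// Before (the pattern the MR describes as problematic in a header):
// the enum itself has no name, only a typedef alias.
//   typedef enum { ModeA, ModeB } Mode;

// After: the enum has its own name.
enum Mode { ModeA, ModeB };

// Code that uses the type by name is unchanged by the fix.
inline int mode_index(Mode m) { return static_cast<int>(m); }
```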
### Improvements: | |
- Avoids undiagnosed ODR violations which can lead to build failures with unclear error messages. | |
### Impact: | |
- Enhances the stability and clarity of the codebase by preventing potential build issues linked to ODR violations." | |
1006 (https://gitlab.com/libeigen/eigen/-/merge_requests/1006),"AutoDiff depends on Core, so include appropriate header.","Our other top-level headers include their dependencies, so this probably | |
should too.",Antonio Sánchez,2022-07-09T23:57:10.476Z,NA,NA,"## Title: | |
AutoDiff depends on Core, so include appropriate header. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses the inclusion of a necessary header file for the AutoDiff module within the Eigen C++ library, aligning it with the approach used for other top-level headers. | |
### Key Changes: | |
- Included the appropriate header for the AutoDiff module to ensure it properly depends on the Core functionality. | |
### Improvements: | |
- Enhanced consistency in header file management across the library. | |
### Impact: | |
- This change improves the reliability and maintainability of the AutoDiff module by ensuring all required dependencies are explicitly included." | |
1009 (https://gitlab.com/libeigen/eigen/-/merge_requests/1009),Fix wrong doxygen group usage,Fix wrong usage of doxygen groups,Mathieu Westphal,2022-07-12T15:17:58.052Z,NA,NA,"## Title: | |
Fix wrong doxygen group usage | |
## Authors: | |
Mathieu Westphal | |
## Summary: | |
This merge request addresses the incorrect usage of Doxygen groups within the Eigen C++ library documentation. | |
### Key Changes: | |
- Correction of the Doxygen group usage in the documentation. | |
### Improvements: | |
- Enhanced clarity and structure of the library's documentation through proper grouping. | |
### Impact: | |
- Improved documentation quality, which facilitates better understanding and usage of the library for developers and users." | |
1013 (https://gitlab.com/libeigen/eigen/-/merge_requests/1013),Add option to disable avx512 GEBP kernels,"### Reference issue | |
Issue mentioned [here](https://gitlab.com/libeigen/eigen/-/merge_requests/972#note_1022907267). | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
This is a quick fix by allowing enabling/disabling of AVX512 GEBP kernels via compiler flag. | |
### Additional information | |
<!--Any additional information you think is important.--> | |
Fixed some `undef` warnings when avx512 trsm kernels are enabled, but not using `clang`.",b-shi,2022-07-18T17:59:09.960Z,NA,NA,"## Title: | |
Add option to disable AVX512 GEBP kernels | |
## Authors: | |
b-shi | |
## Summary: | |
This merge request introduces a feature allowing users to enable or disable the AVX512 GEBP (general block-panel) kernels using a compiler flag. It also fixes some `undef` warnings that occur when the AVX512 triangular solve (TRSM) kernels are enabled and the compiler is not Clang. | |
### Key Changes: | |
- Introduced a compiler flag to enable/disable AVX512 GEBP kernels. | |
### Improvements: | |
- Resolved `undef` warnings that appear when the AVX512 TRSM kernels are enabled with a compiler other than Clang. | |
### Impact: | |
This change enhances user flexibility in configuring the Eigen library for various compiler settings, potentially improving compatibility and reducing warning messages related to AVX512 features." | |
1014 (https://gitlab.com/libeigen/eigen/-/merge_requests/1014),Fix aligned_realloc to call check_that_malloc_is_allowed() if ptr == 0,"### Reference issue | |
### What does this implement/fix? | |
The macros `EIGEN_RUNTIME_NO_MALLOC` and `EIGEN_NO_MALLOC` help developers detect dynamic memory allocations requested by Eigen by triggering an assertion. It is possible to circumvent this behavior by declaring an empty object and subsequently resizing it. For example: `VectorXd x; x.conservativeResize(100);` Here `conservativeResize()` internally calls `std::realloc(ptr,new_size)`, which is equivalent to `std::malloc(new_size)` if `ptr == 0`. This is fixed by deferring to `aligned_malloc` if `ptr == 0`. | |
### Additional information | |
https://godbolt.org/z/Pb6xzdzbP",Charles Schlosser,2022-07-19T20:59:07.769Z,NA,NA,"## Title: | |
Fix aligned_realloc to call check_that_malloc_is_allowed() if ptr == 0 | |
## Authors: | |
Charles Schlosser | |
## Summary: | |
This merge request addresses an issue in the Eigen C++ library regarding the handling of dynamic memory allocations when resizing vectors. Specifically, it ensures that the allocator checks are correctly applied when a pointer is null. | |
### Key Changes: | |
- Updated `aligned_realloc` to call `check_that_malloc_is_allowed()` when the pointer (`ptr`) passed is zero. This resolves an issue where developers could bypass memory allocation checks by resizing an empty vector. | |
### Improvements: | |
- Reinforces the integrity of memory management by maintaining the constraints imposed by the macros `EIGEN_RUNTIME_NO_MALLOC` and `EIGEN_NO_MALLOC`. | |
### Impact: | |
- Enhances the library's error checking for memory allocations, preventing potential misuse that could lead to unintended dynamic memory allocations in applications relying on those macros. This ensures better adherence to memory management practices within the Eigen library." | |
1015 (https://gitlab.com/libeigen/eigen/-/merge_requests/1015),Disable AVX512 GEMM kernels by default.,"They are causing segfaults in application. Bug reproducer to be | |
investigated.",Antonio Sánchez,2022-07-20T21:22:48.653Z,NA,NA,"## Title: | |
Disable AVX512 GEMM kernels by default. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request implements a change to disable AVX512 GEMM (General Matrix Multiply) kernels by default due to the occurrence of segmentation faults in applications. A bug reproducer is to be investigated to identify the underlying issue. | |
### Key Changes: | |
- AVX512 GEMM kernels are now disabled by default. | |
### Improvements: | |
- Enhances stability by preventing applications from crashing due to segmentation faults related to the AVX512 GEMM implementation. | |
### Impact: | |
- Users will experience fewer crashes and improved reliability in applications that rely on the Eigen C++ library, pending the resolution of the underlying bug." | |
1016 (https://gitlab.com/libeigen/eigen/-/merge_requests/1016),Include immintrin.h header for enscripten.,"The other headers seem to fail. | |
Fixes #2514.",Antonio Sánchez,2022-07-22T02:27:43.397Z,NA,NA,"## Title: | |
Include immintrin.h header for enscripten. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request includes the `immintrin.h` header when building with Emscripten, because the other headers seem to fail in that environment. | |
### Key Changes: | |
- Included the `immintrin.h` header for Emscripten builds. | |
### Improvements: | |
- Resolves header inclusion issues that were affecting compatibility with Emscripten. | |
### Impact: | |
- Enhances the functionality and compatibility of the Eigen C++ library when used with the Emscripten compiler, ensuring smoother compilation and execution in web environments. Fixes a specific issue tracked under #2514." | |
978 (https://gitlab.com/libeigen/eigen/-/merge_requests/978),Add Sparse Subset of Matrix Inverse,"### What does this implement/fix? | |
Certain problems require access to specific elements of the inverse of a sparse matrix. Calculating the full inverse will usually result in a dense matrix, and the alternative of calculating single columns of the inverse quickly becomes expensive, tending towards the whole matrix if, e.g., the block-diagonal elements of the inverse are needed. | |
This MR implements a method to efficiently calculate a sparse subset of the inverse, corresponding to the dense elements in an LU decomposition, plus any additional elements required. | |
### Additional information | |
<!--Any additional information you think is important.--> | |
The Takahashi method (https://dl.acm.org/doi/10.1145/360680.360704) allows the computation of a sparse subset of the inverse, corresponding to the dense elements of a sparse LU factorization. Once this sparse subset is calculated, any additional elements of the inverse can be calculated at the cost of a single sparse dot product. In this implementation, this was achieved by calculating the inverse value for all dense values in the LU. Thus if a user needs specific values of the inverse, they can insert corresponding zeros into the sparse non-inverted matrix before calculating the inverse. | |
Large sparse matrices, and interesting problems in general, easily become close to rank deficient, and the algorithm is recursive, so it is very sensitive to numerical errors, particularly on the first elements. To reduce this sensitivity, the dot products were modified to use [Kahan Summation](https://en.wikipedia.org/wiki/Kahan_summation_algorithm#The_algorithm) for the accumulator. In testing (and in the test case added) this makes many problems go from intractable to easily solvable. However, the Kahan summation does come at a cost: roughly 4x the operations of a simple summation. In addition, [Neumaier summation](https://en.wikipedia.org/wiki/Kahan_summation_algorithm#Further_enhancements) was tested, which only merges back the accumulated error at the end of the summation rather than at each iteration. This was found to be less accurate at a similar speed, so Kahan summation was used. | |
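For reference, a compensated dot product in the spirit of the Kahan accumulator described above might look like the sketch below. The function name `kahan_dot` and the use of `std::vector` are illustrative assumptions; the actual implementation works on Eigen's sparse types.

```cpp
#include <cstddef>
#include <vector>

// Dot product with Kahan (compensated) summation. Illustrative sketch only.
double kahan_dot(const std::vector<double>& a, const std::vector<double>& b) {
  double sum = 0.0;
  double comp = 0.0;  // running compensation for lost low-order bits
  const std::size_t n = a.size() < b.size() ? a.size() : b.size();
  for (std::size_t i = 0; i < n; ++i) {
    const double term = a[i] * b[i] - comp;  // apply the correction from the previous step
    const double t = sum + term;             // low-order bits of term may be lost here
    comp = (t - sum) - term;                 // recover what was lost
    sum = t;
  }
  return sum;
}
```

Each iteration performs four additions or subtractions instead of one, which matches the roughly 4x cost mentioned above. The sketch relies on the compiler not reassociating floating-point operations, so it should not be built with options such as `-ffast-math`.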
Despite the addition of the Kahan summation, the Takahashi method is still very fast, particularly for larger, sparser matrices. A potential follow-up to this would be to make the algorithm block-aware, to be more efficient on block sparse matrices.",Julian Kent,2022-07-28T18:04:41.005Z,NA,NA,"## Title: | |
Add Sparse Subset of Matrix Inverse | |
## Authors: | |
Julian Kent | |
## Summary: | |
This merge request introduces a method for efficiently calculating a sparse subset of the inverse of a sparse matrix, which is crucial for problems requiring access to specific elements of the inverse without generating a full dense matrix. | |
### Key Changes: | |
- Implemented a new method based on the Takahashi algorithm for computing sparse subsets of matrix inverses. | |
- Computes the inverse entries corresponding to the dense elements of the LU decomposition; any additional entries can then be obtained at the cost of a single sparse dot product each. | |
- Incorporated Kahan Summation to improve numerical stability during accumulation, addressing sensitivity to numerical errors. | |
### Improvements: | |
- Kahan Summation makes many previously intractable problems easily solvable, at roughly 4x the operations of a simple summation. | |
- Maintained speed efficiency of the Takahashi method, particularly for larger and sparser matrices despite the additional computational costs introduced by Kahan Summation. | |
### Impact: | |
This improvement significantly optimizes the computational efficiency and numerical accuracy of sparse matrix operations in the Eigen library, making it particularly beneficial for users working with complex or nearly rank-deficient problems. Future enhancements may focus on making the algorithm block-aware for better efficiency in block sparse matrices." | |
1021 (https://gitlab.com/libeigen/eigen/-/merge_requests/1021),Updated AccelerateSupport documentation after PR 966.,This fixes the documentation after the changes incorporated in PR 966.,John Mather,2022-07-29T17:42:31.839Z,NA,NA,"## Title: | |
Updated AccelerateSupport documentation after PR 966. | |
## Authors: | |
John Mather | |
## Summary: | |
This merge request focuses on correcting and updating the documentation for the AccelerateSupport section of the Eigen C++ library following the changes made in pull request 966. | |
### Key Changes: | |
- Revised documentation to reflect recent updates from PR 966. | |
### Improvements: | |
- Enhanced clarity and accuracy of AccelerateSupport documentation. | |
### Impact: | |
- Ensures users have up-to-date information, improving usability and understanding of the AccelerateSupport functionality within the Eigen library." | |
1019 (https://gitlab.com/libeigen/eigen/-/merge_requests/1019),Avoid including <sstream> with EIGEN_NO_IO,"### What does this implement/fix? | |
This allows using the Eigen/Dense header in embedded environments using | |
libc++ built with -DLIBCXX_ENABLE_LOCALIZATION=OFF. Without this change, | |
including Eigen/Dense will result in the following compiler error: | |
`error: ""<locale.h> is not supported since libc++ has been configured without support for localization` | |
I did not include a test for this change since we would have to mock | |
headers for an entire C++ standard library without iostream support. | |
### Additional information | |
I used the following test binary to check that I can build against my custom embedded libc++ | |
```c++ | |
#define EIGEN_NO_IO | |
#define EIGEN_NO_MALLOC | |
#include <eigen3/Eigen/Dense> | |
#include <eigen3/Eigen/Sparse> | |
int main() { | |
Eigen::Vector3f f; | |
return f.size(); | |
} | |
```",Alexander Richardson,2022-07-29T18:02:51.996Z,NA,NA,"## Title: | |
Avoid including <sstream> with EIGEN_NO_IO | |
## Authors: | |
Alexander Richardson | |
## Summary: | |
This merge request addresses a compilation issue when using the Eigen library in environments that restrict certain C++ features, specifically in embedded systems utilizing libc++ without localization support. | |
### Key Changes: | |
- Modified the Eigen/Dense header to prevent the inclusion of `<sstream>` when the `EIGEN_NO_IO` macro is defined. | |
- Enables usage of Eigen in embedded contexts with `-DLIBCXX_ENABLE_LOCALIZATION=OFF`, avoiding errors related to unsupported localization headers. | |
### Improvements: | |
- Enhances compatibility of Eigen with custom embedded libc++ configurations. | |
- Reduces unnecessary dependencies on localization headers, promoting more modular and lightweight code usage in resource-constrained environments. | |
### Impact: | |
This change significantly broadens the applicability of the Eigen library for developers working in embedded systems, facilitating easier integration without the risk of compiler errors associated with unsupported features." | |
1004 (https://gitlab.com/libeigen/eigen/-/merge_requests/1004),Add true determinant to QR and its variants,"### What does this implement/fix? | |
<!--Please explain your changes.--> | |
Fixes #471. | |
Implements a `determinant()` method, which gives the true determinant, for `HouseholderQR`, `ColPivHouseholderQR`, `FullPivHouseholderQR`, and `CompleteOrthogonalDecomposition`. | |
Documentation and test code are included. | |
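As a rough sketch of the arithmetic for the real, unpivoted case: the determinant is det(Q) times the product of the diagonal of R, and each non-trivial real Householder reflection contributes a factor of -1 to det(Q). The helper below is hypothetical and is not the `householder_determinant` struct added by this MR; the pivoting variants additionally need the sign of their permutations, which is not modeled here.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (assumption, not Eigen code).
// diag_r: diagonal entries of the triangular factor R.
// num_reflections: number of non-trivial real Householder reflections forming Q.
double det_from_real_qr(const std::vector<double>& diag_r, std::size_t num_reflections) {
  // det(Q) is +1 for an even number of reflections and -1 for an odd number.
  double det = (num_reflections % 2 == 0) ? 1.0 : -1.0;
  // det(R) is the product of its diagonal entries.
  for (double r : diag_r) det *= r;
  return det;
}
```

For example, a 2x2 factorization with one reflection and diagonal entries 2 and 3 has determinant -6.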
### Additional information | |
To calculate the determinant of the Q matrix, `struct householder_determinant` is added to `namespace internal`. | |
This class is specialized for real scalar types, so that it can make use of the fact that each reflection negates the determinant for real matrices.",sjusju,2022-07-29T18:24:15.274Z,NA,NA,"## Title: | |
Add true determinant to QR and its variants | |
## Authors: | |
sjusju | |
## Summary: | |
This merge request implements a `determinant()` method that computes the true determinant for various QR decomposition classes in the Eigen library, specifically targeting `HouseholderQR`, `ColPivHouseholderQR`, `FullPivHouseholderQR`, and `CompleteOrthogonalDecomposition`. | |
### Key Changes: | |
- Introduced a `determinant()` method for the aforementioned QR classes. | |
- Added a `struct householder_determinant` in `namespace internal` to calculate the determinant of the Q matrix, with a specialization for real scalar types that exploits the fact that each reflection negates the determinant. | |
### Improvements: | |
- Enhanced the functionality of QR decomposition classes by allowing users to easily obtain the true determinant. | |
- Included documentation and test code to support the new feature and ensure reliability. | |
### Impact: | |
This addition allows for accurate determinant calculations directly from QR decompositions, improving the library's utility for users needing precise numerical linear algebra operations." | |
1011 (https://gitlab.com/libeigen/eigen/-/merge_requests/1011),Improve pblend AVX implementation,"blendv only cares about top bit of a mask, so we can use ints. | |
Removes vcvtdq2ps instruction and makes pblend faster: | |
BM_blend 1.31ns ± 1% 0.98ns ±15% -24.84% (p=0.008 n=5+5)",Ilya Tokar,2022-07-29T18:45:33.813Z,NA,NA,"## Title: | |
Improve pblend AVX implementation | |
## Authors: | |
Ilya Tokar | |
## Summary: | |
This merge request enhances the pblend AVX implementation by optimizing blendv operations which only require the top bit of a mask, allowing the use of integers instead. | |
### Key Changes: | |
- Removed the `vcvtdq2ps` instruction. | |
- Optimized the operation of the pblend, resulting in a performance boost. | |
### Improvements: | |
- Reduced `BM_blend` time from 1.31ns ± 1% to 0.98ns ± 15%, a change of -24.84% (p=0.008, n=5+5). | |
### Impact: | |
The changes lead to a more efficient implementation of the pblend operation in the Eigen library, enhancing performance in applications relying on this function." | |
1020 (https://gitlab.com/libeigen/eigen/-/merge_requests/1020),Use numext::sqrt in ConjugateGradient.,"This allows us to apply the method for types that have custom sqrt | |
functions, e.g. gcc `__float128`. | |
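A common C++ idiom for this kind of dispatch is an unqualified `sqrt` call with `std::sqrt` brought into scope, so that built-in types use the standard overloads and user-defined types are found by argument-dependent lookup. The sketch below illustrates that idiom only; `generic_sqrt` and `demo::Fixed` are hypothetical names, and it is an assumption that `numext::sqrt` behaves comparably for custom types.

```cpp
#include <cmath>

namespace demo {
// Hypothetical user-defined scalar with its own sqrt overload.
struct Fixed {
  double v;
};
inline Fixed sqrt(const Fixed& x) { return Fixed{std::sqrt(x.v)}; }
}  // namespace demo

// Generic square root: std::sqrt for built-in types, ADL for custom types.
template <typename T>
T generic_sqrt(const T& x) {
  using std::sqrt;
  return sqrt(x);
}
```

Calling `generic_sqrt` on a `double` resolves to `std::sqrt`, while calling it on a `demo::Fixed` resolves to `demo::sqrt`, whereas a hard-coded `std::sqrt` call would fail to compile for the custom type.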
Fixes #2519.",Antonio Sánchez,2022-07-29T20:17:24.802Z,NA,NA,"## Title: | |
Use numext::sqrt in ConjugateGradient. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request modifies the ConjugateGradient implementation to utilize the `numext::sqrt` function, enabling compatibility with types that have custom square root functions, such as `__float128` from GCC. | |
### Key Changes: | |
- Replaced standard square root calls with `numext::sqrt` within the ConjugateGradient algorithm. | |
### Improvements: | |
- Enhances flexibility by allowing ConjugateGradient to work with custom numeric types that implement their own square root functions. | |
### Impact: | |
- Broadens the usability of the ConjugateGradient method across different numeric types, improving overall functionality in various contexts." | |
1023 (https://gitlab.com/libeigen/eigen/-/merge_requests/1023),Fix flaky packetmath_1 test.,"The pmsub and pnmadd tests often fail due to cancellation of values. | |
Here we adjust the inputs so that they don't.",Antonio Sánchez,2022-08-02T17:42:45.862Z,NA,NA,"## Title: | |
Fix flaky packetmath_1 test. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses the inconsistency in the `packetmath_1` test, which was frequently failing due to value cancellations. Adjustments to the input values have been made to eliminate these failures. | |
### Key Changes: | |
- Adjusted inputs in the `pmsub` and `pnmadd` tests to prevent cancellation of values that led to test failures. | |
### Improvements: | |
- Increased stability and reliability of the `packetmath_1` test suite by reducing flakiness. | |
### Impact: | |
- Enhances the robustness of the testing process, ensuring more reliable results in the Eigen C++ library's test suite." | |
1010 (https://gitlab.com/libeigen/eigen/-/merge_requests/1010),Fix inner iterator for sparse block.,"The original incorrectly ignores the outer index of the block. | |
It looks like the main two-arg constructor for `InnerIterator` | |
does correctly consider the appropriate `outerIndexPtr`, so | |
we simply forward to that constructor. | |
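To illustrate why the outer index matters, here is a minimal sketch using a hand-rolled compressed-column layout. `CscMatrix` and `block_column_sum` are hypothetical and much simpler than Eigen's `SparseMatrix` and `InnerIterator`; the point is only that column `j` of a block starting at `start_col` must be read at outer position `start_col + j`.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical minimal compressed-column storage (assumption, not Eigen code).
struct CscMatrix {
  std::vector<double> values;      // nonzero values, stored column by column
  std::vector<std::size_t> inner;  // row index of each nonzero
  std::vector<std::size_t> outer;  // outer[j]..outer[j + 1] delimits column j
};

// Sum the nonzeros of column j of a block of whole columns starting at start_col.
double block_column_sum(const CscMatrix& m, std::size_t start_col, std::size_t j) {
  const std::size_t col = start_col + j;  // offset by the block's outer start
  double sum = 0.0;
  for (std::size_t k = m.outer[col]; k < m.outer[col + 1]; ++k) sum += m.values[k];
  return sum;
}
```

Dropping the `start_col` offset would silently iterate over column `j` of the underlying matrix instead of column `j` of the block, which is the kind of error described above.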
Fixes #2507.",Antonio Sánchez,2022-08-03T17:26:13.579Z,NA,NA,"## Title: | |
Fix inner iterator for sparse block. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses an issue with the inner iterator in the sparse block implementation of the Eigen C++ library. The original implementation incorrectly ignored the outer index of the block, which led to potential inaccuracies in sparse matrix operations. | |
### Key Changes: | |
- Corrected the implementation of the inner iterator to properly consider the outer index. | |
- Updated the `InnerIterator` to use the main two-arg constructor that correctly handles the `outerIndexPtr`. | |
### Improvements: | |
- Ensured that sparse block operations accurately reflect the intended indices, enhancing the reliability of the inner iterator. | |
### Impact: | |
This fix resolves issue #2507, improving the overall functionality of sparse matrix handling in the library and potentially preventing future bugs related to index handling." | |
1024 (https://gitlab.com/libeigen/eigen/-/merge_requests/1024),Partial Packet support for GEMM real-only (PowerPC). Also fix compilation warnings & errors for some conditions in new API.,"Partial Packet support for GEMM real-only (PowerPC). Also fix compilation warnings & errors for some conditions in new API. | |
Up to 40% reduction in binary size. Minor performance improvements.",Chip Kerchner,2022-08-03T18:15:21.141Z,NA,NA,"## Title: | |
Partial Packet support for GEMM real-only (PowerPC). Also fix compilation warnings & errors for some conditions in new API. | |
## Authors: | |
Chip Kerchner | |
## Summary: | |
This merge request introduces support for Partial Packet in GEMM (General Matrix Multiply) for real-only operations on PowerPC architecture. Additionally, it addresses various compilation warnings and errors related to the new API. | |
### Key Changes: | |
- Added Partial Packet support for GEMM operations specifically for real-only cases on PowerPC. | |
- Resolved compilation warnings and errors in the new API. | |
### Improvements: | |
- Achieved up to a 40% reduction in binary size. | |
- Introduced minor performance improvements. | |
### Impact: | |
The changes enhance the efficiency of GEMM operations on PowerPC, streamline the compilation process, and improve overall code quality by reducing binary size and fixing issues in the API." | |
1025 (https://gitlab.com/libeigen/eigen/-/merge_requests/1025),Fix use of Packet2d type for non-VSX.,Fix use of Packet2d type for non-VSX.,Chip Kerchner,2022-08-03T20:48:13.923Z,NA,NA,"## Title: | |
Fix use of Packet2d type for non-VSX. | |
## Authors: | |
Chip Kerchner | |
## Summary: | |
This merge request addresses issues with the Packet2d type in the Eigen C++ library specifically for environments that do not support VSX. | |
### Key Changes: | |
- Corrected the implementation of the Packet2d type for compatibility with non-VSX platforms. | |
### Improvements: | |
- Enhances the library's portability and usability across various architectures by ensuring that Packet2d functions correctly outside of VSX. | |
### Impact: | |
- This fix allows users on non-VSX architectures to utilize the Packet2d type without encountering compatibility issues, improving the overall reliability of the library on a wider range of systems." | |
1028 (https://gitlab.com/libeigen/eigen/-/merge_requests/1028),Fix non-VSX PowerPC build,"Fix non-VSX PowerPC build. | |
This should resolve [issue2513](https://gitlab.com/libeigen/eigen/-/issues/2513)",Chip Kerchner,2022-08-08T18:18:18.299Z,NA,NA,"## Title: | |
Fix non-VSX PowerPC build | |
## Authors: | |
Chip Kerchner | |
## Summary: | |
This merge request fixes the Eigen build for PowerPC targets without VSX, which should resolve issue 2513. | |
### Key Changes: | |
- Implemented fixes to ensure compatibility for non-VSX PowerPC architectures. | |
### Improvements: | |
- Enhanced build functionality for PowerPC systems that do not utilize VSX. | |
### Impact: | |
- This change allows for smoother builds on non-VSX PowerPC platforms, improving accessibility and usability of the library on these systems." | |
1027 (https://gitlab.com/libeigen/eigen/-/merge_requests/1027),Fix code and unit test for a few corner cases in vectorized pow(),"Due to a bad test, a few corner cases were not handled correctly by the vectorized implementation of `pow()`. Specifically, the following two specifications would not be satisfied: | |
1. pow(-∞, exp) returns -∞ if exp is a positive odd integer. | |
2. pow(-0, exp), where exp is a negative odd integer, returns -∞. | |
Instead, the erroneous code returned +∞ in these cases. | |
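The two cases can be checked against the scalar `std::pow` as a reference. This sketch assumes a platform whose math library follows IEEE 754 semantics for these inputs; the helper names are hypothetical and the exponents 3 and -3 are arbitrary odd integers chosen for illustration.

```cpp
#include <cmath>
#include <limits>

// True if x is negative infinity.
inline bool is_negative_infinity(double x) { return std::isinf(x) && std::signbit(x); }

// Case 1: pow(-inf, exp) with exp a positive odd integer.
inline double pow_neg_inf_positive_odd() {
  return std::pow(-std::numeric_limits<double>::infinity(), 3.0);
}

// Case 2: pow(-0, exp) with exp a negative odd integer.
inline double pow_neg_zero_negative_odd() { return std::pow(-0.0, -3.0); }
```

Both helpers are expected to yield negative infinity, which is the behavior the vectorized implementation has to reproduce.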
Thanks to @chuckyschluz for reporting this.",Rasmus Munk Larsen,2022-08-08T18:48:36.771Z,NA,NA,"## Title: | |
Fix code and unit test for a few corner cases in vectorized pow() | |
## Authors: | |
Rasmus Munk Larsen | |
## Summary: | |
This merge request addresses incorrect handling of specific corner cases in the vectorized implementation of the `pow()` function. It corrects the function's behavior to align with mathematical expectations in edge cases. | |
### Key Changes: | |
- Fixed the handling of `pow(-∞, exp)` for positive odd integers, now correctly returning -∞. | |
- Adjusted the return value of `pow(-0, exp)` for negative odd integers to -∞, correcting a previous error that returned +∞. | |
### Improvements: | |
- Enhanced the robustness of the `pow()` function by ensuring it adheres to defined mathematical specifications in edge cases. | |
- Improved unit tests to adequately cover these corner cases, preventing future regressions. | |
### Impact: | |
The fixes contribute to the accuracy and reliability of the `pow()` function in the Eigen C++ library, ensuring that it provides correct results for a broader range of inputs, thereby enhancing the library’s overall fidelity for mathematical computations." | |
1012 (https://gitlab.com/libeigen/eigen/-/merge_requests/1012),Fix vectorized Jacobi Rotation,"### What does this implement/fix? | |
There seems to be a bug in the `apply_rotation_in_the_plane_selector`, so the packet math vectorized version is never used. (Modern clang and gcc seem to vectorize the default version pretty well FWIW.) | |
This also makes some fixes to get the ""fixed-size"" code path to pass the test suite. | |
### Additional information | |
Just for reference, this seems to be the reason that the packet-math version isn't being used atm: | |
~~~~ | |
const bool Vectorizable = (int(VectorX::Flags) & int(VectorY::Flags) & PacketAccessBit) ... | |
~~~~ | |
`Vectorizable` is always false because `VectorX` and `VectorY` are block expressions, which seem to not set the PacketAccessBit at all. Doing something like checking the `Flags` of the block `evaluator` instead seems to work better: | |
~~~~ | |
const bool Vectorizable = (int(evaluator<VectorX>::Flags) & int(evaluator<VectorY>::Flags) & PacketAccessBit) ... | |
~~~~",Arthur,2022-08-08T19:29:57.698Z,NA,NA,"## Title: | |
Fix vectorized Jacobi Rotation | |
## Authors: | |
Arthur | |
## Summary: | |
This merge request addresses a bug in the `apply_rotation_in_the_plane_selector` that prevents the packet math vectorized version from being utilized. It also includes fixes to ensure the ""fixed-size"" code path passes the test suite. | |
### Key Changes: | |
- Corrected the logic in determining whether vectorized operations can be applied by checking the flags of the block evaluator instead of the flags of the block expressions directly. | |
### Improvements: | |
- Ensures that the vectorized version of the Jacobi rotation is used, thus enabling potential performance enhancements when applicable. | |
- Successfully passes the test suite for the fixed-size code path, enhancing overall reliability. | |
### Impact: | |
The changes improve the functionality of the Jacobi rotation implementation in the Eigen library, potentially leading to better performance in scenarios where vectorization is advantageous." | |
1026 (https://gitlab.com/libeigen/eigen/-/merge_requests/1026),Vectorize the sign operator in Eigen.,"This fixes an old TODO to vectorize `scalar_sign_op` for real types. | |
Measured speedup on Intel(R) Xeon(R) Gold 6154 CPU @ 3.00GHz: | |
(ctype == `std::complex<float>`, cdtype == `std::complex<double>`) | |
``` | |
--march=nehalem (SSE*) | |
before: | |
BM_eigen_sign_double/8_mean 6.44 6.41 109201388 | |
BM_eigen_sign_double/64_mean 34.0 34.0 20234383 | |
BM_eigen_sign_double/512_mean 260 260 2666898 | |
BM_eigen_sign_double/2k_mean 1016 1016 687992 | |
BM_eigen_sign_float/8_mean 9.71 9.71 72262781 | |
BM_eigen_sign_float/64_mean 21.7 21.7 32433773 | |
BM_eigen_sign_float/512_mean 117 117 5997151 | |
BM_eigen_sign_float/2k_mean 449 448 1200000 | |
BM_eigen_sign_ctype/8_mean 57.2 57.2 12000000 | |
BM_eigen_sign_ctype/64_mean 509 509 1200000 | |
BM_eigen_sign_ctype/512_mean 4095 4086 167682 | |
BM_eigen_sign_ctype/2k_mean 16289 16288 42865 | |
BM_eigen_sign_cdtype/8_mean 75.5 75.2 9199233 | |
BM_eigen_sign_cdtype/64_mean 722 722 982228 | |
BM_eigen_sign_cdtype/512_mean 6704 6682 105444 | |
BM_eigen_sign_cdtype/2k_mean 27827 27832 24978 | |
after: | |
BM_eigen_sign_double/8_mean 5.74 5.74 120000000 | |
BM_eigen_sign_double/64_mean 31.7 31.7 21927020 | |
BM_eigen_sign_double/512_mean 252 252 2706623 | |
BM_eigen_sign_double/2k_mean 982 982 712900 | |
BM_eigen_sign_float/8_mean 9.68 9.68 72326995 *not vectorized | |
BM_eigen_sign_float/64_mean 21.4 21.4 32621349 *not vectorized | |
BM_eigen_sign_float/512_mean 116 116 6013897 *not vectorized | |
BM_eigen_sign_float/2k_mean 447 447 1200000 *not vectorized | |
BM_eigen_sign_ctype/8_mean 15.6 15.6 45293266 | |
BM_eigen_sign_ctype/64_mean 88.6 88.6 7900154 | |
BM_eigen_sign_ctype/512_mean 680 680 1031326 | |
BM_eigen_sign_ctype/2k_mean 2689 2691 257909 | |
BM_eigen_sign_cdtype/8_mean 28.4 28.4 24440221 | |
BM_eigen_sign_cdtype/64_mean 257 257 2711124 | |
BM_eigen_sign_cdtype/512_mean 2077 2077 336557 | |
BM_eigen_sign_cdtype/2k_mean 8312 8313 84323 | |
--march=skylake (AVX2) | |
before: | |
BM_eigen_sign_double/8_mean 12.0 12.0 58310104 | |
BM_eigen_sign_double/64_mean 38.5 38.5 17869356 | |
BM_eigen_sign_double/512_mean 250 249 2775578 | |
BM_eigen_sign_double/2k_mean 996 996 714018 | |
BM_eigen_sign_float/8_mean 12.7 12.7 55495035 | |
BM_eigen_sign_float/64_mean 32.4 32.4 21378522 | |
BM_eigen_sign_float/512_mean 122 122 5698877 | |
BM_eigen_sign_float/2k_mean 414 413 1618011 | |
BM_eigen_sign_ctype/8_mean 58.2 58.2 11965594 | |
BM_eigen_sign_ctype/64_mean 518 518 1200000 | |
BM_eigen_sign_ctype/512_mean 4080 4063 169364 | |
BM_eigen_sign_ctype/2k_mean 16333 16332 42414 | |
BM_eigen_sign_cdtype/8_mean 79.7 79.7 8728939 | |
BM_eigen_sign_cdtype/64_mean 718 717 971102 | |
BM_eigen_sign_cdtype/512_mean 6705 6700 105340 | |
BM_eigen_sign_cdtype/2k_mean 28451 28447 24539 | |
after: | |
BM_eigen_sign_double/8_mean 10.2 10.2 68883377 | |
BM_eigen_sign_double/64_mean 27.8 27.8 25028814 | |
BM_eigen_sign_double/512_mean 169 169 4134488 | |
BM_eigen_sign_double/2k_mean 631 631 1102968 | |
BM_eigen_sign_float/8_mean 16.4 16.4 42694836 | |
BM_eigen_sign_float/64_mean 21.1 21.1 33109093 | |
BM_eigen_sign_float/512_mean 96.9 96.9 7209604 | |
BM_eigen_sign_float/2k_mean 326 326 2110066 | |
BM_eigen_sign_ctype/8_mean 27.7 27.7 25070458 | |
BM_eigen_sign_ctype/64_mean 96.1 96.1 7270548 | |
BM_eigen_sign_ctype/512_mean 634 634 1102494 | |
BM_eigen_sign_ctype/2k_mean 2467 2467 280365 | |
BM_eigen_sign_cdtype/8_mean 28.3 28.3 24573556 | |
BM_eigen_sign_cdtype/64_mean 241 241 2869555 | |
BM_eigen_sign_cdtype/512_mean 1946 1946 358788 | |
BM_eigen_sign_cdtype/2k_mean 7793 7793 89187 | |
--march=skylake-avx512 (AVX512) | |
before: | |
BM_eigen_sign_double/8_mean 11.5 11.5 61014691 | |
BM_eigen_sign_double/64_mean 41.5 41.5 16519411 | |
BM_eigen_sign_double/512_mean 285 285 2438747 | |
BM_eigen_sign_double/2k_mean 1140 1140 598276 | |
BM_eigen_sign_float/8_mean 11.5 11.5 61125484 | |
BM_eigen_sign_float/64_mean 29.4 29.4 23598559 | |
BM_eigen_sign_float/512_mean 103 103 6759891 | |
BM_eigen_sign_float/2k_mean 371 371 1851622 | |
BM_eigen_sign_ctype/8_mean 58.4 58.4 11787688 | |
BM_eigen_sign_ctype/64_mean 509 509 1200000 | |
BM_eigen_sign_ctype/512_mean 4091 4093 168895 | |
BM_eigen_sign_ctype/2k_mean 16314 16295 42802 | |
BM_eigen_sign_cdtype/8_mean 79.6 79.6 8605987 | |
BM_eigen_sign_cdtype/64_mean 717 717 965781 | |
BM_eigen_sign_cdtype/512_mean 6831 6828 101197 | |
BM_eigen_sign_cdtype/2k_mean 28749 28743 24183 | |
after: | |
BM_eigen_sign_double/8_mean 16.4 16.4 42809039 | |
BM_eigen_sign_double/64_mean 23.2 23.2 30221446 | |
BM_eigen_sign_double/512_mean 74.3 74.3 9397251 | |
BM_eigen_sign_double/2k_mean 258 259 2604931 | |
BM_eigen_sign_float/8_mean 16.5 16.5 42515449 | |
BM_eigen_sign_float/64_mean 31.3 31.3 22136770 | |
BM_eigen_sign_float/512_mean 60.9 60.9 11516560 | |
BM_eigen_sign_float/2k_mean 153 153 4570941 | |
BM_eigen_sign_ctype/8_mean 62.7 62.7 10956854 | |
BM_eigen_sign_ctype/64_mean 121 121 5783435 | |
BM_eigen_sign_ctype/512_mean 501 501 1200000 | |
BM_eigen_sign_ctype/2k_mean 1835 1835 379651 | |
BM_eigen_sign_cdtype/8_mean 57.7 57.8 11825262 | |
BM_eigen_sign_cdtype/64_mean 203 203 3418327 | |
BM_eigen_sign_cdtype/512_mean 1392 1392 497932 | |
BM_eigen_sign_cdtype/2k_mean 5406 5406 120000 | |
```",Rasmus Munk Larsen,2022-08-09T19:54:58.036Z,NA,NA,"## Title: | |
Vectorize the sign operator in Eigen. | |
## Authors: | |
Rasmus Munk Larsen | |
## Summary: | |
This merge request implements vectorization of the `scalar_sign_op` for real types in Eigen, addressing a long-standing TODO. The changes aim to enhance performance for calculating the sign of scalar values in vectorized formats. | |
### Key Changes: | |
- Vectorization of the sign operator for real types; the benchmarks also cover `std::complex<float>` and `std::complex<double>`. | |
- Performance benchmarks before and after the changes were included for various data types across different CPU architectures (Nehalem, Skylake, and Skylake-AVX512). | |
### Improvements: | |
- Benchmark times drop most at the larger sizes: with AVX512, `BM_eigen_sign_double/2k` goes from 1140 to 258 and `BM_eigen_sign_float/2k` from 371 to 153; with AVX2, `BM_eigen_sign_ctype/2k` goes from 16333 to 2467. The smallest size can get slower (AVX512 `BM_eigen_sign_double/8`: 11.5 to 16.4). | |
- Across different architectures (Nehalem, Skylake, and Skylake-AVX512), the vectorization resulted in increased throughput, as shown by the benchmark results post-implementation. | |
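As a rough illustration of the kind of branch-free formulation that lends itself to vectorization, here is a scalar sketch in plain C++. The names `real_sign`, `complex_sign`, and `sign_inplace` are hypothetical and this is not Eigen's implementation; treating the complex sign as `z / |z|` (and `0` for `z == 0`) is an assumption based on the usual convention.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>

// Hypothetical helper: sign of a real value without branches.
// Each comparison yields 0 or 1, so the result is -1, 0, or +1.
// Note: this sketch maps NaN to 0, which may differ from Eigen.
template <typename T>
inline T real_sign(T x) {
  return static_cast<T>(x > T(0)) - static_cast<T>(x < T(0));
}

// Hypothetical helper: complex sign as z / |z|, with 0 mapped to 0.
template <typename T>
inline std::complex<T> complex_sign(const std::complex<T>& z) {
  const T magnitude = std::abs(z);
  if (magnitude == T(0)) return std::complex<T>(T(0), T(0));
  return z / magnitude;
}

// Applies real_sign to every element of a contiguous buffer.
// A loop body without branches is a good candidate for SIMD code.
template <typename T>
inline void sign_inplace(T* data, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i) data[i] = real_sign(data[i]);
}
```

Whether a compiler auto-vectorizes such a loop depends on flags and target; Eigen instead uses explicit packet operations.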
### Impact: | |
- The vectorization of the sign operator is expected to yield better performance for applications utilizing Eigen, especially in cases involving computations with complex numbers. This optimization enhances the efficiency of mathematical operations, which can lead to overall improvements in application performance relying on the Eigen library." | |
1030 (https://gitlab.com/libeigen/eigen/-/merge_requests/1030),Don't double-define Half functions on aarch64,"### What does this implement/fix? | |
This change fixes a compilation error that occurs when compiling for GPU (CUDA) with an aarch64 (aka Arm64) host. Two sets of Half functions are defined, which conflict with each other. To fix this, the change specifically disables the aarch64 versions during the GPU compile phase.",Lexi Bromfield,2022-08-09T20:00:34.784Z,NA,NA,"## Title: | |
Don't double-define Half functions on aarch64 | |
## Authors: | |
Lexi Bromfield | |
## Summary: | |
This merge request addresses a compilation error encountered when compiling the Eigen library for GPU (CUDA) on aarch64 (Arm64) platforms. It resolves the issue of conflicting definitions of Half functions that were causing the compilation failure by disabling the aarch64-specific versions during GPU compilation. | |
### Key Changes: | |
- Disabled aarch64 versions of Half functions during GPU compile phase to prevent conflicts. | |
### Improvements: | |
- Resolves compilation errors specific to aarch64 when targeting GPU, thus enhancing compatibility and stability of the codebase. | |
### Impact: | |
- This fix enables successful compilation for GPU on aarch64 hosts, improving support for diverse architectures within the Eigen library." | |
1031 (https://gitlab.com/libeigen/eigen/-/merge_requests/1031),Eliminate bool bitwise warnings.,These previously triggered `-Wbitwise-instead-of-logical` warnings.,Antonio Sánchez,2022-08-09T22:42:32.300Z,NA,NA,"## Title: | |
Eliminate bool bitwise warnings. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses and eliminates warnings related to the use of bitwise operations on boolean types, specifically targeting the `-Wbitwise-instead-of-logical` warnings that were previously triggered in the Eigen C++ library. | |
### Key Changes: | |
- Refactored code to replace bitwise operations on boolean values with logical operations. | |
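To make the difference concrete, the following sketch contrasts two warning-free ways of combining boolean results. The helper names are hypothetical and the code is not taken from the merge request; it only illustrates why `a() & b()` on `bool` operands (the pattern that `-Wbitwise-instead-of-logical` flags) is not always interchangeable with `a() && b()`.

```cpp
// Hypothetical predicate that records how often it is evaluated.
inline bool is_positive(int v, int& evaluations) {
  ++evaluations;
  return v > 0;
}

// Evaluates both operands first, then combines them logically.
// This keeps the always-evaluate-both behavior of the bitwise form
// without applying a bitwise operator to bool operands.
inline bool both_positive_eager(int a, int b, int& evaluations) {
  const bool lhs = is_positive(a, evaluations);
  const bool rhs = is_positive(b, evaluations);
  return lhs && rhs;
}

// Short-circuit form: the second operand is skipped when the first is false.
inline bool both_positive_lazy(int a, int b, int& evaluations) {
  return is_positive(a, evaluations) && is_positive(b, evaluations);
}
```

The eager form is the safer drop-in replacement when the operands have side effects; the lazy form is fine when they do not.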
### Improvements: | |
- Improved code clarity and correctness by adhering to standard practices for boolean operations. | |
### Impact: | |
- Reduces warning clutter during compilation, enhancing the overall code quality and maintainability of the Eigen library." | |
1032 (https://gitlab.com/libeigen/eigen/-/merge_requests/1032),"Disable bad ""deprecated warning"" edge-case in BDCSVD","<!-- | |
Thanks for contributing a merge request! Please name and fully describe your MR as you would for a commit message. | |
If the MR fixes an issue, please include ""Fixes #issue"" in the commit message and the MR description. | |
In addition, we recommend that first-time contributors read our [contribution guidelines](https://eigen.tuxfamily.org/index.php?title=Contributing_to_Eigen) and [git page](https://eigen.tuxfamily.org/index.php?title=Git), which will help you submit a more standardized MR. | |
Before submitting the MR, you also need to complete the following checks: | |
- Make one PR per feature/bugfix (don't mix multiple changes into one PR). Avoid committing unrelated changes. | |
- Rebase before committing | |
- For code changes, run the test suite (at least the tests that are likely affected by the change). | |
See our [test guidelines](https://eigen.tuxfamily.org/index.php?title=Tests). | |
- If possible, add a test (both for bug-fixes as well as new features) | |
- Make sure new features are documented | |
Note that we are a team of volunteers; we appreciate your patience during the review process. | |
Again, thanks for contributing! --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
This fixes a couple of invalid deprecation warnings in BDCSVD. | |
If no computationOptions are set with the runtime parameters, BDCSVD dispatches to JacobiSVD at runtime. BDCSVD has to pass a ``0`` as the computationOptions to JacobiSVD's deprecated constructor. Unfortunately, I don't think there is a simple way to naturally prevent this in the implementation. So, I just disable the warning when bdcsvd tries to call jacobisvd's deprecated constructor. :shrug: | |
~~~~ | |
#include <Eigen/Core> | |
#include <Eigen/SVD> | |
int main() { | |
Eigen::MatrixXd m = Eigen::MatrixXd::Random(100, 100); | |
// Valid input, to get the singular values, but shows deprecation warning for JacobiSVD! | |
Eigen::BDCSVD<Eigen::MatrixXd> svd1(m); | |
// Deprecated warning for both BDCSVD and JacobiSVD. | |
Eigen::BDCSVD<Eigen::MatrixXd> svd2(m, Eigen::ComputeFullU); | |
return 0; | |
} | |
~~~~ | |
I just borrowed the stuff to disable the warning from [highway](https://github.com/google/highway/blob/33b43877277c438d25247e6624fe6a30616b44ae/hwy/base.h#L48). This should work for the compiler versions Eigen supports...",Arthur,2022-08-11T18:43:32.535Z,NA,NA,"## Title: Disable bad ""deprecated warning"" edge-case in BDCSVD | |
## Authors: Arthur | |
## Summary: | |
This merge request addresses the issue of invalid deprecation warnings in the BDCSVD class of the Eigen C++ library. | |
### Key Changes: | |
- Implemented a mechanism to disable deprecation warnings when BDCSVD calls JacobiSVD's deprecated constructor with a `0` as the computationOptions. | |
### Improvements: | |
- The change prevents unnecessary deprecation warnings that could confuse users, especially when using valid inputs for singular value decomposition. | |
### Impact: | |
- Users of the BDCSVD class will have a cleaner output without misleading warnings, improving the overall user experience while utilizing the Eigen library." | |
1033 (https://gitlab.com/libeigen/eigen/-/merge_requests/1033),[SYCL] Fix some SYCL tests,"### What does this implement/fix? | |
Sigmoid failed in tensor_math because of the specializations in PacketMath. | |
The binary logic operators were casting floating types with rounding but they | |
are meant to do bitwise casting. The generic implementations of and, or, xor, | |
andnot are working as expected with SYCL. | |
tensor_builtin test could fail for certain seeds because of log | |
producing too small outputs for the test precision. | |
tensor_random could fail for certain seeds. The neighborhood check is | |
removed as it is not a safe way to check for the distribution. | |
This matches the behavior of the CUDA test.",Romain Biessy,2022-08-16T17:37:54.733Z,NA,NA,"## Title: | |
[SYCL] Fix some SYCL tests | |
## Authors: | |
Romain Biessy | |
## Summary: | |
This merge request addresses several issues in the SYCL tests of the Eigen library, specifically focusing on the execution of tensor math operations and the accuracy of certain tests. | |
### Key Changes: | |
- Corrected the implementation of the sigmoid function in tensor_math due to incorrect specializations in PacketMath. | |
- Fixed binary logic operators to ensure they perform bitwise casting instead of incorrectly rounding floating types. | |
- Resolved failures in the tensor_builtin test linked to the log function generating outputs that were too small for the specified precision. | |
- Modified the tensor_random test by removing the neighborhood check, which was deemed unsafe for checking distribution. | |
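The distinction between a bitwise cast and a rounding (value) cast can be sketched in plain C++ as follows. The function names are hypothetical and this is not the SYCL or Eigen code; it only shows why a logic operator on floating-point data must act on the bit pattern.

```cpp
#include <cstdint>
#include <cstring>

// Reinterprets the 32 bits of a float as an unsigned integer.
inline std::uint32_t float_to_bits(float x) {
  std::uint32_t bits;
  std::memcpy(&bits, &x, sizeof bits);
  return bits;
}

// Reinterprets 32 bits as a float.
inline float bits_to_float(std::uint32_t bits) {
  float x;
  std::memcpy(&x, &bits, sizeof x);
  return x;
}

// Intended behavior: AND the bit patterns of the two floats.
inline float and_bitwise_cast(float a, float b) {
  return bits_to_float(float_to_bits(a) & float_to_bits(b));
}

// Faulty behavior for comparison: convert the values to integers first.
// Only meaningful here for small non-negative inputs.
inline float and_value_cast(float a, float b) {
  return static_cast<float>(static_cast<std::uint32_t>(a) &
                            static_cast<std::uint32_t>(b));
}
```

For example, `and_bitwise_cast(x, x)` returns `x` unchanged, whereas `and_value_cast(1.5f, 1.5f)` truncates to `1.0f`.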
### Improvements: | |
- Enhanced overall accuracy and reliability of the tensor math operations within the SYCL implementation. | |
- Ensured that logic operations correctly follow the intended bitwise behavior, thus improving computational consistency. | |
### Impact: | |
These changes improve the robustness and correctness of SYCL tests in the Eigen library. Users will experience more reliable tensor operations and reduced test failures, thereby enhancing the overall user experience and confidence in the library's functionality." | |
1035 (https://gitlab.com/libeigen/eigen/-/merge_requests/1035),Removed unnecessary checks for FP16C,"### What does this implement/fix? | |
The AVX512 packetmath unnecessarily checks for the presence of FP16C when using the intrinsics `_mm512_cvtps_ph` and `_mm512_cvtph_ps` - these are AVX512F intrinsics, and do not need this flag to be set. Currently, if -mfp16c is not set, a scalar typecast is used for float2half and half2float. | |
### Additional information | |
Checking various versions of GCC, clang, MSVC, all seem to compile the intrinsics fine with only AVX512F enabled. This makes a massive performance difference if someone has set -mavx512f but not -mfp16c, as it avoids very slow scalar typecasts.",Matthew Sterrett,2022-08-16T19:09:47.278Z,NA,NA,"## Title: | |
Removed unnecessary checks for FP16C | |
## Authors: | |
Matthew Sterrett | |
## Summary: | |
This merge request eliminates redundant checks for the FP16C feature when using specific AVX512 intrinsics within the Eigen C++ library. The current implementation conditionally performs scalar typecasts, which degrades performance when certain compiler flags are not set. | |
### Key Changes: | |
- Removed unnecessary FP16C checks for AVX512 intrinsics `_mm512_cvtps_ph` and `_mm512_cvtph_ps`. | |
- Streamlined code execution by avoiding scalar typecasts when only AVX512F is enabled. | |
### Improvements: | |
- Enhanced performance by eliminating inefficient scalar typecasting, leading to faster float-to-half and half-to-float conversions. | |
### Impact: | |
This change leads to significant speed improvements for users compiling with AVX512F without FP16C, optimizing the computation efficiency of the affected intrinsics in the Eigen library." | |
1029 (https://gitlab.com/libeigen/eigen/-/merge_requests/1029),add fixed power unary operation,"### Reference issue | |
\#1425 | |
### What does this implement/fix? | |
Adds a unary expression for performing a coefficientwise real-valued power operation on an array. If the exponent argument is an integer, the operation is performed using repeated squaring. Otherwise, the operation defers to Eigen's existing vectorized pow routine with a fixed exponent. Both cases are IEEE compliant with respect to error handling. | |
This MR replaces !1022 as the functionality is quite different, though the result is the same. | |
### Additional information | |
Some benchmark measurements are here: https://gitlab.com/libeigen/eigen/-/snippets/2388067. The speedups for integer exponents are very impressive. | |
The benchmarks measure the following simple functions for float and double arrays of varying sizes: | |
``` | |
template<typename T> | |
void eigen_powquarter(const Eigen::Matrix<T, Eigen::Dynamic, 1>& v, | |
Eigen::Matrix<T, Eigen::Dynamic, 1>* u) { | |
*u = v.array().pow(T(0.25)); | |
} | |
template<typename T> | |
void eigen_pow4(const Eigen::Matrix<T, Eigen::Dynamic, 1>& v, | |
Eigen::Matrix<T, Eigen::Dynamic, 1>* u) { | |
*u = v.array().pow(int(4)); | |
} | |
template<typename T> | |
void eigen_pow4float(const Eigen::Matrix<T, Eigen::Dynamic, 1>& v, | |
Eigen::Matrix<T, Eigen::Dynamic, 1>* u) { | |
*u = v.array().pow(T(4)); | |
} | |
``` | |
<!--Any additional information you think is important.-->",Charles Schlosser,2022-08-16T21:32:36.602Z,NA,NA,"## Title: | |
Add Fixed Power Unary Operation | |
## Authors: | |
Charles Schlosser | |
## Summary: | |
This merge request introduces a new unary operation to the Eigen C++ library that allows for coefficientwise real-valued power operations on arrays. It handles both integer and non-integer exponents efficiently, enhancing the library's mathematical capabilities. | |
### Key Changes: | |
- Implemented a unary expression for real-valued power operations on arrays. | |
- Introduced optimization for integer exponents using repeated squaring. | |
- Non-integer exponents defer to Eigen's existing vectorized `pow` routine with a fixed exponent. | |
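Repeated squaring can be sketched for a single scalar as below. `pow_by_squaring` is a hypothetical name and this is not Eigen's implementation: it ignores the IEEE edge cases the merge request mentions and works on one value rather than on packets.

```cpp
// Hypothetical helper: computes base^exponent with O(log |exponent|)
// multiplications by scanning the exponent bits from least significant.
template <typename T>
inline T pow_by_squaring(T base, int exponent) {
  // Use the unsigned magnitude so that INT_MIN does not overflow.
  unsigned int n = exponent < 0 ? 0u - static_cast<unsigned int>(exponent)
                                : static_cast<unsigned int>(exponent);
  T result = T(1);
  while (n != 0u) {
    if (n & 1u) result *= base;  // this bit contributes base^(2^k)
    n >>= 1;
    if (n != 0u) base *= base;   // square only if more bits remain
  }
  // Negative exponents are handled by taking the reciprocal at the end.
  return exponent < 0 ? T(1) / result : result;
}
```

With this scheme an exponent of 4 needs two squarings and one multiplication by the initial `1`, instead of a general `pow` evaluation.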
### Improvements: | |
- Significant speedup for operations with integer exponents, as demonstrated in benchmark tests. | |
- Compliance with IEEE standards regarding error handling is maintained. | |
### Impact: | |
This feature enhances the functionality of the Eigen library by providing an efficient and versatile way to perform power operations on arrays, making it more robust for users requiring advanced mathematical calculations." | |
1017 (https://gitlab.com/libeigen/eigen/-/merge_requests/1017),Add support for AVX512-FP16 for vectorizing half precision math,"### What does this implement/fix? | |
This merge request takes advantage of the AVX512-FP16 instruction set to vectorize half floating-point operations. It implements Packet32h and replaces many packet operations for the pre-existing Packet16h and Packet8h. The pre-existing AVX implementations used typecasting to float and back, so this could improve performance significantly by avoiding these intermediate typecasts. | |
### Additional information | |
<!--Any additional information you think is important.--> | |
I've run all the tests against this, and they pass consistently except for packetmath_13, which checks half precision packet math. The specific tests that fail are the fused math operations, particularly pmsub. From my testing, this seems to be caused by the lower precision of the AVX512-FP16 fused intrinsics compared to the reference calculations. I'm not sure what the best thing to do here is; it would be nice to get some feedback on how to handle this. | |
On my experimental setup, I got a consistent 8-9 times improvement in the performance of the bench_gemm benchmark changing from using AVX512F only to using AVX512-FP16, with OpenMP disabled. This performance gain was consistent for matrices varying from size 256x256 - 4096x4096. | |
Specifically, the speedup was about 8.4x for 256x256, 8.3x for 512x512, 8.8x for 1024x1024, 8.7x for 2048x2048, and 9.4x for 4096x4096.",Matthew Sterrett,2022-08-17T18:15:22.710Z,NA,NA,"## Title: | |
Add support for AVX512-FP16 for vectorizing half precision math | |
## Authors: | |
Matthew Sterrett | |
## Summary: | |
This merge request introduces support for the AVX512-FP16 instruction set, enhancing the performance of half-precision floating-point operations in the Eigen C++ library. It implements new packet types and replaces existing types, optimizing computation without relying on typecasting. | |
### Key Changes: | |
- Implementation of `Packet32h` for efficient half-precision operations. | |
- Replacement of several packet operations for `Packet16h` and `Packet8h`. | |
- Elimination of typecasting between half and float types in existing AVX implementations. | |
### Improvements: | |
- Speedups of about 8.3x to 9.4x in the `bench_gemm` benchmark when moving from AVX512F only to AVX512-FP16, with OpenMP disabled. | |
- Consistent performance gains for various matrix sizes (256x256 to 4096x4096). | |
### Impact: | |
This merge request is expected to reduce computational overhead in half-precision calculations by avoiding conversions to float and back. The fused math tests in packetmath_13 (particularly pmsub) still fail, apparently because the AVX512-FP16 fused intrinsics are less precise than the reference calculations; the author asks for feedback on how to handle this." | |
1034 (https://gitlab.com/libeigen/eigen/-/merge_requests/1034),Use proper double word division algorithm for pow<double>. Gives 11-15% speedup.,"### What does this implement/fix? | |
Fix TODO in `accurate_log2<double>` used by `pow<double>`. This change replaces the division via an approximate reciprocal with Algorithm 15 from: | |
""Tight and rigorous error bounds for basic building blocks of double-word arithmetic"", | |
Joldes, Muller, & Popescu, 2017. https://hal.archives-ouvertes.fr/hal-01351529 | |
This speeds up `pow<double>` by 11-15%. Benchmark measurements: https://gitlab.com/libeigen/eigen/-/snippets/2390173 | |
Comparison against MPFR shows no change in accuracy. The algorithm still returns faithfully rounded results when the result is normal. | |
Thanks to David Majnemer ([email protected]) for pointing me to the paper. | |
### Additional information | |
<!--Any additional information you think is important.-->",Rasmus Munk Larsen,2022-08-17T18:36:24.370Z,NA,NA,"## Title: | |
Use proper double word division algorithm for pow<double>. Gives 11-15% speedup. | |
## Authors: | |
Rasmus Munk Larsen | |
## Summary: | |
This merge request speeds up `pow<double>` in the Eigen C++ library. It replaces the division via an approximate reciprocal in `accurate_log2<double>` with a double-word division algorithm from the referenced paper, with no change in measured accuracy. | |
### Key Changes: | |
- Implemented double-word division in `accurate_log2<double>`, which `pow<double>` uses, based on Algorithm 15 from Joldes, Muller, & Popescu (2017). | |
- Replaced the existing division via an approximate reciprocal with this algorithm. | |
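For readers unfamiliar with double-word arithmetic, the sketch below divides a double-word value (an unevaluated sum `hi + lo`) by a single `double`, using an error-free product to correct the first quotient estimate. All names are hypothetical, the code is scalar rather than packet-based, and whether it matches Algorithm 15 of the cited paper step for step is an assumption.

```cpp
#include <cmath>

// A double-word value is the unevaluated sum hi + lo,
// where lo is much smaller in magnitude than hi.
struct DoubleWord {
  double hi;
  double lo;
};

// Error-free product: hi + lo equals a * b exactly
// (barring overflow and underflow), using a fused multiply-add.
inline DoubleWord two_prod(double a, double b) {
  const double hi = a * b;
  const double lo = std::fma(a, b, -hi);
  return {hi, lo};
}

// Fast two-sum: assumes |a| >= |b| and returns the rounded sum
// together with its rounding error.
inline DoubleWord fast_two_sum(double a, double b) {
  const double sum = a + b;
  const double err = b - (sum - a);
  return {sum, err};
}

// Divides the double-word x by the double y.
inline DoubleWord div_doubleword_by_double(DoubleWord x, double y) {
  const double t_hi = x.hi / y;             // first quotient estimate
  const DoubleWord pi = two_prod(t_hi, y);  // exact t_hi * y
  const double delta_hi = x.hi - pi.hi;     // remainder, high part
  const double delta_t = delta_hi - pi.lo;  // subtract product error
  const double delta = delta_t + x.lo;      // add low word of x
  const double t_lo = delta / y;            // quotient correction
  return fast_two_sum(t_hi, t_lo);
}
```

Dividing `{1.0, 0.0}` by `3.0` this way yields `hi == 1.0 / 3.0` plus a small positive `lo` that carries the part of the quotient a single `double` cannot hold.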
### Improvements: | |
- Achieved an 11-15% speedup in the performance of `pow<double>`. | |
- Comparison against MPFR shows no change in accuracy; results remain faithfully rounded when the result is normal. | |
### Impact: | |
The changes significantly enhance the performance of the `pow<double>` function without compromising on accuracy, potentially benefiting applications that rely heavily on mathematical computations within the Eigen library." | |
1037 (https://gitlab.com/libeigen/eigen/-/merge_requests/1037),Protect new pblend implementation with EIGEN_VECTORIZE_AVX2,"### What does this implement/fix? | |
!1011 broke the AVX build without AVX2. | |
<!--Please explain your changes.--> | |
### Additional information | |
<!--Any additional information you think is important.-->",Rasmus Munk Larsen,2022-08-22T18:28:04.190Z,NA,NA,"## Title: | |
Protect new pblend implementation with EIGEN_VECTORIZE_AVX2 | |
## Authors: | |
Rasmus Munk Larsen | |
## Summary: | |
This merge request introduces a protection mechanism for the new `pblend` implementation by utilizing `EIGEN_VECTORIZE_AVX2`, addressing an issue that arose from merge request !1011 which broke the AVX build when AVX2 support was absent. | |
### Key Changes: | |
- Implemented a safeguard in the `pblend` code to ensure compatibility with environments lacking AVX2 support. | |
- Adjusted the conditional compilation to rely on `EIGEN_VECTORIZE_AVX2`. | |
### Improvements: | |
- Prevents build failures when Eigen is compiled with AVX enabled but without AVX2. | |
### Impact: | |
- This change lowers the risk of compilation errors in projects that build Eigen for AVX targets without AVX2, thereby broadening compatibility and ensuring more stable builds." | |
1039 (https://gitlab.com/libeigen/eigen/-/merge_requests/1039),"Fix psign for unsigned integer types, such as bool.","### What does this implement/fix? | |
<!--Please explain your changes.--> | |
### Additional information | |
<!--Any additional information you think is important.--> | |
Fixes bug in !1026 where `psign<bool>` would try to return bool(-1).",Rasmus Munk Larsen,2022-08-22T20:19:36.046Z,NA,NA,"## Title: | |
Fix psign for unsigned integer types, such as bool. | |
## Authors: | |
Rasmus Munk Larsen | |
## Summary: | |
This merge request addresses a bug in the Eigen C++ library related to the `psign` function for unsigned integer types, specifically `bool`. The previous implementation erroneously attempted to return `bool(-1)`. | |
### Key Changes: | |
- The implementation of `psign` has been modified to correctly handle unsigned integer types, including `bool`, preventing it from trying to return invalid values. | |
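One way to avoid producing `-1` for types that cannot represent it is to dispatch on signedness, as in the sketch below. `sign_of` is a hypothetical name and this is not the `psign` code from the merge request; it only illustrates the idea for scalars.

```cpp
#include <type_traits>

// Signed types (including floating point): result is -1, 0, or +1.
template <typename T>
inline typename std::enable_if<std::is_signed<T>::value, T>::type
sign_of(T x) {
  return static_cast<T>((x > T(0)) - (x < T(0)));
}

// Unsigned types, including bool: the value can never be negative,
// so the result is just 0 or 1 and no -1 is ever constructed.
template <typename T>
inline typename std::enable_if<!std::is_signed<T>::value, T>::type
sign_of(T x) {
  return static_cast<T>(x > T(0));
}
```

With this split, `sign_of(true)` is `true` and `sign_of(7u)` is `1u`, while `sign_of(-3)` is still `-1`.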
### Improvements: | |
- The fix enhances the robustness of the `psign` function, ensuring it behaves correctly for all unsigned types, which improves type compatibility within the library. | |
### Impact: | |
- This change eliminates a potential source of bugs when using `psign` with boolean values, thereby increasing reliability and correctness in applications relying on the Eigen library for matrix and vector operations involving unsigned integers." | |
1036 (https://gitlab.com/libeigen/eigen/-/merge_requests/1036),Sparse Core: Replace malloc/free with conditional_aligned,"### What does this implement/fix? | |
The sparse classes currently use a mix of `std::malloc`, `std::realloc`, and `std::free` for memory management instead of the `aligned_malloc` family of functions defined in `Memory.h`. Other than consistency with the dense classes, this will enable users to track heap allocations with `#define EIGEN_RUNTIME_NO_MALLOC` and related mechanisms. Also, there may be some advantage to ensuring that the sparse matrices use aligned storage for vectorization. | |
### Additional information | |
<!--Any additional information you think is important.--> | |
Benchmark measurement: | |
``` | |
double sse no align Total duration: 21531, per iteration: 2153.1 | |
double sse align Total duration: 21691, per iteration: 2169.1 | |
double avx no align Total duration: 21548, per iteration: 2154.8 | |
double avx align Total duration: 21448, per iteration: 2144.8 | |
double old sse Total duration: 95201, per iteration: 9520.1 | |
double old avx Total duration: 97138, per iteration: 9713.8 | |
``` | |
I did this test 10 times for `size = 1000`. | |
``` | |
template<typename Scalar, bool Align> | |
void testSparseAlign(Index size) | |
{ | |
SparseMatrix<Scalar, ColMajor, int, Align> A(size, size); | |
for (int i = 0; i < size; i++) | |
for (int j = 0; j <= i; j++) | |
A.coeffRef(size-i-1, j) = i*j; | |
std::cout << A.nonZeros() << ""\n""; | |
std::cout << A.data().allocatedSize() << ""\n""; | |
std::cout << A.sum() << ""\n""; | |
} | |
```",Charles Schlosser,2022-08-23T21:44:23.506Z,NA,NA,"## Title: | |
Sparse Core: Replace malloc/free with conditional_aligned | |
## Authors: | |
Charles Schlosser | |
## Summary: | |
This merge request updates the memory management strategy for the sparse classes in the Eigen C++ library by replacing the use of `std::malloc`, `std::realloc`, and `std::free` with the `aligned_malloc` functions from `Memory.h`. This change aligns the memory management in sparse classes with that of dense classes, enhancing consistency. | |
### Key Changes: | |
- Transition from standard memory allocation functions (`malloc/realloc/free`) to aligned memory allocation functions. | |
- Sparse heap allocations can now be tracked with `#define EIGEN_RUNTIME_NO_MALLOC` and related mechanisms, as with the dense classes.
### Improvements: | |
- Ensures that sparse matrices utilize aligned storage, which can potentially improve performance for vectorized operations. | |
- Provides enhanced tracking of memory allocation, aiding debugging and profiling. | |
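As a rough illustration of what aligned allocation on top of `std::malloc` involves, here is a minimal sketch. The names `sketch_aligned_malloc` and `sketch_aligned_free` are hypothetical and are not Eigen's functions from `Memory.h`; the sketch assumes the alignment is a power of two no smaller than `sizeof(void*)`.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical helper (not Eigen's API): allocate `size` bytes whose address
// is a multiple of `alignment`. Assumes `alignment` is a power of two and at
// least sizeof(void*).
inline void* sketch_aligned_malloc(std::size_t size, std::size_t alignment = 16) {
  // Over-allocate so an aligned address plus a back-pointer always fits.
  void* original = std::malloc(size + alignment + sizeof(void*));
  if (original == nullptr) return nullptr;
  std::uintptr_t raw = reinterpret_cast<std::uintptr_t>(original) + sizeof(void*);
  // Round up to the next multiple of `alignment`.
  std::uintptr_t aligned =
      (raw + alignment - 1) & ~(static_cast<std::uintptr_t>(alignment) - 1);
  // Remember the pointer returned by malloc just before the aligned block.
  reinterpret_cast<void**>(aligned)[-1] = original;
  return reinterpret_cast<void*>(aligned);
}

// Hypothetical counterpart: release a block obtained from sketch_aligned_malloc.
inline void sketch_aligned_free(void* ptr) {
  if (ptr != nullptr) std::free(reinterpret_cast<void**>(ptr)[-1]);
}
```

Routing every allocation through one pair of functions like this is also what makes a single allocation-tracking hook possible.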
### Impact: | |
This change aligns memory management across the library and lets sparse storage be aligned. In the author's benchmark (10 runs, `size = 1000`), aligned and unaligned storage performed about the same (roughly 2,145 to 2,170 per iteration, in the benchmark's own time units, for both SSE and AVX), while the old implementation took roughly 9,520 to 9,714 per iteration."
1044 (https://gitlab.com/libeigen/eigen/-/merge_requests/1044),Add missing ptr in realloc call.,Introduced in !1036,Antonio Sánchez,2022-08-25T05:05:17.064Z,NA,NA,"## Title: | |
Add missing ptr in realloc call. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request fixes a `realloc` call, introduced in !1036, that was missing its pointer argument.
### Key Changes:
- Added the missing pointer argument to the `realloc` call.
### Improvements: | |
- Enhances the reliability of memory allocation handling within the Eigen library. | |
### Impact: | |
- Reduces the risk of memory-related errors, potentially improving overall stability and performance in scenarios where realloc is utilized." | |
1045 (https://gitlab.com/libeigen/eigen/-/merge_requests/1045),Fix GeneralizedEigenSolver::info() and Asserts,"### Reference issue
#2524 | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
small fix for ``GeneralizedEigenSolver::info()`` :-) | |
basically, m_valuesOkay was used to check if the decomp was initialized, but was only set to true when m_realQZ was successful. So info() would raise an assert when the decomposition failed. IMO info() should always be accessible when the decomp is initialized (and I think that's how all the other decomps work). | |
This just replaces m_valuesOkay with m_isInitialized and replaces spots where m_valuesOkay was used with ``info() == Success``. It also changes some of the error messages to make them more accurate (I.e., they now say the decomposition failed, rather than always just saying ""uninitialized"")",Arthur,2022-08-25T22:05:05.397Z,NA,NA,"## Title: | |
Fix GeneralizedEigenSolver::info() and Asserts | |
## Authors: | |
Arthur | |
## Summary: | |
This merge request addresses an issue in the `GeneralizedEigenSolver::info()` method, improving the way initialization checks are performed and error messages are conveyed. | |
### Key Changes: | |
- Replaced the variable `m_valuesOkay` with `m_isInitialized` to check if the decomposition is initialized. | |
- Adjusted the implementation to ensure that `info()` can always be accessed when the decomposition is initialized, aligning it with the behavior of other decompositions. | |
- Updated error messages to accurately reflect the state of the decomposition, specifying when decomposition has failed instead of stating it is ""uninitialized."" | |
### Improvements: | |
- Enhanced the robustness of the `info()` method by aligning its logic with the initialization state of the decomposition. | |
- Improved clarity of error messages, making it easier for users to understand the cause of failures in the decomposition process. | |
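The separation between being initialized and having succeeded can be sketched with a toy class. `SketchSolver` and its members are hypothetical and only mirror the pattern described above; this is not the code of `GeneralizedEigenSolver`.

```cpp
#include <cassert>

enum SketchInfo { SketchSuccess, SketchNoConvergence };

// Hypothetical toy solver illustrating the pattern only.
class SketchSolver {
 public:
  SketchSolver() : m_isInitialized(false), m_info(SketchSuccess) {}

  // Pretend decomposition whose outcome is chosen by the caller.
  SketchSolver& compute(bool converges) {
    m_info = converges ? SketchSuccess : SketchNoConvergence;
    m_isInitialized = true;  // set even when the decomposition fails
    return *this;
  }

  // Requires initialization only, so a failed decomposition can be inspected.
  SketchInfo info() const {
    assert(m_isInitialized);
    return m_info;
  }

  // Results require both initialization and success.
  bool valuesAvailable() const {
    return m_isInitialized && m_info == SketchSuccess;
  }

 private:
  bool m_isInitialized;
  SketchInfo m_info;
};
```

With this split, calling `info()` after a failed `compute` reports the failure instead of tripping an assertion.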
### Impact: | |
These changes improve both usability and reliability of the `GeneralizedEigenSolver` class, allowing users to better diagnose issues with decomposition and ensuring that functionality is consistently accessible when initialized." | |
1042 (https://gitlab.com/libeigen/eigen/-/merge_requests/1042),Avoid undefined behavior in array_cwise test due to signed integer overflow,"### Reference issue
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
### Additional information | |
<!--Any additional information you think is important.-->",Rasmus Munk Larsen,2022-08-26T16:19:04.541Z,NA,NA,"## Title: | |
Avoid undefined behavior in array_cwise test due to signed integer overflow | |
## Authors: | |
Rasmus Munk Larsen | |
## Summary: | |
This merge request addresses the issue of undefined behavior in the Eigen library's `array_cwise` test caused by signed integer overflow. | |
### Key Changes: | |
- Modified the `array_cwise` test to prevent occurrences of signed integer overflow. | |
### Improvements: | |
- Enhanced the stability and reliability of the test by ensuring that operations do not lead to undefined behavior. | |
### Impact: | |
- Improved the integrity of the test suite, thus ensuring that tests yield consistent and correct results, contributing to the overall robustness of the Eigen library." | |
1040 (https://gitlab.com/libeigen/eigen/-/merge_requests/1040),"Specialize psign<Packet8i> for AVX2, don't vectorize psign<bool>.","### Reference issue
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
### Additional information | |
<!--Any additional information you think is important.--> | |
Speedup for `psign<Packet8i>` with AVX2 enabled: | |
``` | |
name old cpu/op new cpu/op delta | |
BM_eigen_sign_int/1 2.73ns ± 0% 0.56ns ± 1% -79.45% (p=0.000 n=52+56) | |
BM_eigen_sign_int/8 6.81ns ± 1% 5.33ns ± 0% -21.75% (p=0.000 n=49+55) | |
BM_eigen_sign_int/64 16.1ns ± 1% 7.9ns ± 0% -50.95% (p=0.000 n=52+57) | |
BM_eigen_sign_int/512 58.0ns ± 0% 28.2ns ± 0% -51.40% (p=0.000 n=58+49) | |
BM_eigen_sign_int/4k 405ns ± 1% 198ns ± 1% -51.05% (p=0.000 n=46+60) | |
BM_eigen_sign_int/32k 3.83µs ± 1% 2.46µs ± 1% -35.76% (p=0.000 n=42+54) | |
BM_eigen_sign_int/256k 78.5µs ± 2% 78.5µs ± 1% ~ (p=0.369 n=59+51) | |
BM_eigen_sign_int/1M 315µs ± 2% 315µs ± 1% ~ (p=0.983 n=32+30) | |
```",Rasmus Munk Larsen,2022-08-26T17:02:38.413Z,NA,NA,"## Title: | |
Specialize psign<Packet8i> for AVX2, don't vectorize psign<bool>. | |
## Authors: | |
Rasmus Munk Larsen | |
## Summary: | |
This merge request enhances the performance of the `psign<Packet8i>` function by specializing it for AVX2 architecture. Additionally, it modifies the implementation to avoid vectorizing `psign<bool>`. | |
### Key Changes: | |
- Specialized the `psign<Packet8i>` function for improved performance with AVX2 support. | |
- Removed the vectorization of the `psign<bool>` function. | |
### Improvements: | |
- Significant speedup observed in benchmarks: | |
- 79.45% less CPU time for `BM_eigen_sign_int/1`.
- About 51% less CPU time for 64 to 4k elements, 36% less at 32k, 22% less at 8 elements, and no measurable change at 256k and 1M elements.
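For reference, the scalar behavior that a sign operation is expected to have can be sketched as follows. `sketch_sign` is a hypothetical helper, not Eigen's `psign`, and it says nothing about how the AVX2 specialization is written.

```cpp
// Hypothetical scalar helper (not Eigen's psign).
// (x > 0) - (x < 0) evaluates to -1, 0, or 1 without branching. For unsigned
// types and bool the second comparison is always false, so the result is
// only ever 0 or 1 and never wraps around to a large value.
template <typename T>
constexpr T sketch_sign(T x) {
  return static_cast<T>((x > T(0)) - (x < T(0)));
}
```

For example, `sketch_sign(-7)` evaluates to `-1`, while `sketch_sign(true)` stays `true`.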
### Impact: | |
- The specialized AVX2 implementation drastically improves the efficiency of operations involving `psign<Packet8i>`, particularly beneficial for performance-critical applications in numerical computing. | |
- The decision to not vectorize `psign<bool>` may simplify implementation while maintaining appropriate performance characteristics for boolean operations." | |
1046 (https://gitlab.com/libeigen/eigen/-/merge_requests/1046),re-enable pow for complex types,"### Reference issue
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2022-08-29T16:06:08.524Z,NA,NA,"## Title: | |
Re-enable pow for complex types | |
## Authors: | |
Charles Schlosser | |
## Summary: | |
This merge request re-enables the power function (`pow`) for complex types in the Eigen C++ library, facilitating more comprehensive mathematical computations involving complex numbers. | |
### Key Changes: | |
- The `pow` function has been restored for complex number types, expanding the functionality of the library. | |
### Improvements: | |
- Allows the use of the power function with complex numbers, providing users with more flexibility and enhancing mathematical operations within the library. | |
### Impact: | |
- Users working with complex numbers can call `pow` again without additional workarounds."
1043 (https://gitlab.com/libeigen/eigen/-/merge_requests/1043),Vectorize pow for integer base / exponent types,"### Reference issue
#2522 | |
### What does this implement/fix? | |
Vectorized `pow` with handling for negative exponents. e.g. `ArrayXi x, y; y = x.pow(3);` | |
In general, integer divide by zero and signed integer overflow are undefined behavior. Under these conditions, output may vary from `std::pow` (or Eigen's `square` and `cube` for that matter) depending on implementation. For example: msvc and clang always (?) return `lowest()` for overflow and underflow for signed types -- while gcc returns `highest()` or `lowest()` depending on the arguments and data types.
Unsigned integers do not overflow per standard. | |
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2022-08-29T19:23:55.104Z,NA,NA,"## Title: | |
Vectorize pow for integer base / exponent types | |
## Authors: | |
Charles Schlosser | |
## Summary: | |
This merge request introduces a vectorized implementation of the `pow` function for integer base and exponent types in the Eigen C++ library. It handles negative exponents and notes where integer arithmetic runs into undefined behavior.
### Key Changes: | |
- Implemented a vectorized `pow` function for integer types. | |
- Added handling for negative exponents when computing powers. | |
### Improvements: | |
- Enhanced performance through vectorization, allowing for more efficient computations of powers in bulk. | |
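- Documented that integer divide by zero and signed integer overflow are undefined behavior, so results in those cases may differ from `std::pow` and between compilers (msvc, clang, gcc); unsigned integers do not overflow per the standard.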
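A scalar sketch of integer power with negative exponents is shown below. `sketch_int_pow` is a hypothetical helper, and treating a negative exponent as a truncating integer reciprocal is an assumption made for illustration; the merge request does not spell out its exact convention here.

```cpp
// Hypothetical scalar helper (not Eigen's implementation).
// Integer power by repeated squaring. Signed overflow is still undefined
// behavior, and a zero base with a negative exponent divides by zero.
inline int sketch_int_pow(int base, int exponent) {
  if (exponent < 0) {
    // Assumed convention: the integer reciprocal truncates toward zero,
    // so only bases 1 and -1 give a nonzero result.
    if (base == 1) return 1;
    if (base == -1) return (exponent % 2 == 0) ? 1 : -1;
    return 1 / base;  // 0 when |base| > 1, undefined when base == 0
  }
  int result = 1;
  while (exponent > 0) {
    if (exponent & 1) result *= base;  // use this bit of the exponent
    exponent >>= 1;
    if (exponent > 0) base *= base;  // skip the final, unused squaring
  }
  return result;
}
```

Skipping the last squaring matters for integers: without it the loop could overflow `base` even when the final result fits.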
### Impact: | |
The changes vectorize power computations for integer base and exponent types. Results under integer divide by zero or signed overflow remain undefined and may differ from `std::pow` depending on the compiler."
1038 (https://gitlab.com/libeigen/eigen/-/merge_requests/1038),"Vectorize acos, asin, and atan for float.","### Reference issue
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
This change vectorizes the `acos`, `asin`, and `atan` operators in Eigen. | |
### Additional information | |
**Accuracy:** Exhaustive testing for all float arguments in [-1:1] shows that this implementation | |
is accurate to 2.6 ulps for `pacos`, and 3.8 ulps for `pasin`. Maximum relative error for `patan` is 2 ulps. | |
**Speed**: See: libeigen/eigen$2393114 | |
Speedup for 4k element vector: | |
| | SSE | AVX | AVX512 | | |
| ---------- | ----- | ----- | ------ | | |
| `asin()` | 6x | 15.9x | 18.6x | | |
| `acos()` | 11.4x | 29.5x | 29.8x | | |
| `atan()` | 5.5x | 9.3x | 17x | | |
<!--Any additional information you think is important.-->",Rasmus Munk Larsen,2022-08-29T19:49:33.827Z,NA,NA,"## Title: | |
Vectorize acos, asin, and atan for float. | |
## Authors: | |
Rasmus Munk Larsen | |
## Summary: | |
This merge request introduces vectorized implementations of the `acos`, `asin`, and `atan` functions for `float` in the Eigen library, and reports their measured accuracy and speedups.
### Key Changes: | |
- Vectorization of `acos`, `asin`, and `atan` operators for float types. | |
- Reported accuracy and speed measurements for the new packet implementations (`pacos`, `pasin`, `patan`).
### Improvements: | |
- Reported accuracy:
- `pacos` is accurate to 2.6 ulps and `pasin` to 3.8 ulps, per exhaustive testing of all float arguments in [-1, 1],
- `patan` has a maximum relative error of 2 ulps.
- Significant speed improvements, notably: | |
- `asin()` achieves up to 18.6x speedup with AVX512, | |
- `acos()` reaches a maximum 29.8x speedup under the same conditions, | |
- `atan()` shows a peak speedup of 17x. | |
### Impact: | |
These enhancements result in more efficient mathematical computations in environments where performance is critical, thereby improving the overall utility of the Eigen library for scientific computing and other applications relying on trigonometric functions." | |
1048 (https://gitlab.com/libeigen/eigen/-/merge_requests/1048),Fix some test build errors in new unary pow.,"For `real^complex`, we need the return type to be the output of | |
`ScalarBinaryOpTraits`. For `complex<real>^int`, we need the exponent to be | |
promoted to real for use in `ScalarBinaryOpTraits`.
Also removed some `const` qualifiers on return types, since these are often overly restrictive.",Antonio Sánchez,2022-08-30T17:24:15.511Z,NA,NA,"## Title: | |
Fix some test build errors in new unary pow. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses and resolves test build errors related to the new unary power functionality in the Eigen library. Specifically, it focuses on the behavior of operations involving real and complex numbers. | |
### Key Changes: | |
- Adjusted return types for `real^complex` to reflect the output of `ScalarBinaryOpTraits`. | |
- Promoted the exponent to a real type for the `complex<real>^int` case. | |
- Removed unnecessary `const` qualifiers on return types to enhance flexibility. | |
### Improvements: | |
- Enhanced type handling for power operations among different numeric types. | |
- Streamlined return type definitions by eliminating overly restrictive qualifiers. | |
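The two type rules above can be sketched with the standard library alone. `SketchPowResult` and `sketch_pow` are hypothetical names; the trait only loosely mirrors the role that `ScalarBinaryOpTraits` plays in Eigen and is not its definition.

```cpp
#include <complex>

// Hypothetical trait naming the result type of base^exponent.
template <typename Base, typename Exponent>
struct SketchPowResult { typedef Base type; };

// real^complex produces a complex result.
template <typename Real>
struct SketchPowResult<Real, std::complex<Real> > {
  typedef std::complex<Real> type;
};

// complex^int: promote the integer exponent to the real type first.
template <typename Real>
std::complex<Real> sketch_pow(const std::complex<Real>& base, int exponent) {
  return std::pow(base, static_cast<Real>(exponent));
}

// real^complex: the return type comes from the trait, not from the base.
template <typename Real>
typename SketchPowResult<Real, std::complex<Real> >::type
sketch_pow(Real base, const std::complex<Real>& exponent) {
  return std::pow(base, exponent);
}
```

Returning by non-const value, as both overloads do, also matches the removal of `const` qualifiers on return types.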
### Impact: | |
These changes fix the build errors in the unary power tests and make the result types of mixed real and complex power operations consistent with `ScalarBinaryOpTraits`."
1049 (https://gitlab.com/libeigen/eigen/-/merge_requests/1049),2 typos fix in the 3rd table.,"### What does this implement/fix? | |
It seems to me that there are 2 typos in the 3rd table of the slicing tutorial page. One shall read matrix `A` and not vector `v`.",Gilles Aouizerate,2022-08-31T22:45:04.405Z,NA,NA,"## Title: | |
2 typos fix in the 3rd table. | |
## Authors: | |
Gilles Aouizerate | |
## Summary: | |
This merge request addresses two typos found in the third table of the slicing tutorial page of the Eigen C++ library documentation. | |
### Key Changes: | |
- Corrected instances where ""vector `v`"" should be replaced with ""matrix `A`"" in the third table. | |
### Improvements: | |
- Enhances the accuracy of the documentation, ensuring clarity for users. | |
### Impact: | |
- Improves user understanding of the slicing tutorial, reducing potential confusion regarding the distinction between matrix and vector notation." | |
1051 (https://gitlab.com/libeigen/eigen/-/merge_requests/1051),Fix mixingtypes tests.,"The new unary pow op no longer calls the binary op plugin, so some of | |
these tests were failing. Modified the test to account for this... | |
I think this is better than forcing the unary op to call a binary | |
plugin.",Antonio Sánchez,2022-09-02T15:30:13.933Z,NA,NA,"## Title: | |
Fix mixingtypes tests. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses failures in the mixingtypes tests caused by the new unary power operation (pow) no longer relying on the binary operation plugin. The tests have been modified to accommodate this change. | |
### Key Changes: | |
- Updated mixingtypes tests to reflect the behavior of the new unary pow operation. | |
### Improvements: | |
- The changes enhance the clarity and reliability of the tests by avoiding unnecessary calls to the binary plugin for unary operations. | |
### Impact: | |
- The modification ensures that the tests accurately assess the unary pow functionality, leading to greater robustness in the library's testing framework." | |
1052 (https://gitlab.com/libeigen/eigen/-/merge_requests/1052),Fix some cmake issues.,"We shouldn't be building benchmarks by default on peoples' systems. | |
Also fixed some test dependency issues if sparse libraries are detected. | |
Fixes #2529.",Antonio Sánchez,2022-09-02T16:43:15.460Z,NA,NA,"## Title: | |
Fix some cmake issues. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses and resolves some CMake-related issues in the Eigen C++ library. It specifically adjusts the default behavior for building benchmarks and improves the handling of test dependencies in the presence of sparse libraries. | |
### Key Changes: | |
- Disabled the default building of benchmarks on user systems. | |
- Fixed issues related to test dependencies when sparse libraries are detected. | |
### Improvements: | |
- Enhances user experience by preventing unnecessary benchmark builds. | |
- Ensures proper functionality of tests with sparse library configurations. | |
### Impact: | |
These changes contribute to a cleaner installation process and better compatibility in diverse environments, ultimately making it easier for users to set up the Eigen library without encountering common CMake issues." | |
1050 (https://gitlab.com/libeigen/eigen/-/merge_requests/1050),Add asserts for index-out-of-bounds in IndexedView.,Fixes #2530.,Antonio Sánchez,2022-09-02T17:28:03.803Z,NA,NA,"## Title: | |
Add asserts for index-out-of-bounds in IndexedView. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses issue #2530 by adding assertions to the IndexedView component of the Eigen C++ library to handle index-out-of-bounds situations. | |
### Key Changes: | |
- Implemented assertions to detect and prevent index-out-of-bounds errors in the IndexedView functionality. | |
### Improvements: | |
- Enhanced error checking in the IndexedView, which will lead to more robust behaviors when invalid indices are accessed. | |
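The idea of asserting on indices before touching the data can be sketched as follows. `sketch_indices_in_bounds` and `sketch_gather` are hypothetical helpers operating on `std::vector`; they are not part of IndexedView.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: true when every index addresses a valid position.
inline bool sketch_indices_in_bounds(const std::vector<std::ptrdiff_t>& indices,
                                     std::ptrdiff_t size) {
  for (std::size_t k = 0; k < indices.size(); ++k) {
    if (indices[k] < 0 || indices[k] >= size) return false;
  }
  return true;
}

// Hypothetical indexed gather that asserts before reading any element.
inline std::vector<double> sketch_gather(const std::vector<double>& data,
                                         const std::vector<std::ptrdiff_t>& indices) {
  assert(sketch_indices_in_bounds(indices, static_cast<std::ptrdiff_t>(data.size())));
  std::vector<double> out;
  out.reserve(indices.size());
  for (std::size_t k = 0; k < indices.size(); ++k) {
    out.push_back(data[static_cast<std::size_t>(indices[k])]);
  }
  return out;
}
```

Because the check is an assertion, it costs nothing in builds that define `NDEBUG`.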
### Impact: | |
- This change improves the overall reliability of the Eigen library by preventing potential runtime errors due to invalid indexing, making it safer for developers." | |
1053 (https://gitlab.com/libeigen/eigen/-/merge_requests/1053),fixed msvc compilation error in GeneralizedEigenSolver.h,"### Reference issue | |
#2532 | |
### What does this implement/fix? | |
Fixed a compilation error happening at least on MSVC due to a missing semi-colon.",Michael Palomas,2022-09-05T15:55:32.954Z,NA,NA,"## Title: | |
Fixed MSVC Compilation Error in GeneralizedEigenSolver.h | |
## Authors: | |
Michael Palomas | |
## Summary: | |
This merge request addresses a compilation error encountered in the Eigen C++ library's GeneralizedEigenSolver.h file when using Microsoft Visual Studio Compiler (MSVC). The issue was caused by a missing semi-colon. | |
### Key Changes: | |
- Added a missing semi-colon to resolve compilation issues. | |
### Improvements: | |
- Enhanced code stability by fixing the compilation error specific to MSVC environments. | |
### Impact: | |
- Prevents compilation failures on MSVC, which improves the usability of the Eigen library for developers using this compiler." | |
1054 (https://gitlab.com/libeigen/eigen/-/merge_requests/1054),fix typo in doc/TutorialSparse.dox,"### What does this implement/fix? | |
fix typo in doc/TutorialSparse.dox (2 columns were inverted)",Gilles Aouizerate,2022-09-06T17:58:00.432Z,NA,NA,"## Title: | |
Fix typo in doc/TutorialSparse.dox | |
## Authors: | |
Gilles Aouizerate | |
## Summary: | |
This merge request addresses a typographical error in the documentation of the Eigen C++ library, specifically in the file TutorialSparse.dox, where two columns were inverted. | |
### Key Changes: | |
- Restored the correct order of two inverted columns in `doc/TutorialSparse.dox`.
### Improvements: | |
- Enhances the clarity and accuracy of the TutorialSparse documentation. | |
### Impact: | |
- Increases the readability and correctness of the documentation for users, aiding in a better understanding of the sparse matrix functionalities in Eigen." | |
1055 (https://gitlab.com/libeigen/eigen/-/merge_requests/1055),Call check_that_malloc_is_allowed() in aligned_realloc(),"### Reference issue
https://gitlab.com/libeigen/eigen/-/issues/2526 | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
This prevents reallocs when EIGEN_RUNTIME_NO_MALLOC is defined. The check is not needed with EIGEN_NO_MALLOC as no memory could be malloc'd in the first place. We only call the assert if result != ptr which means an allocation happened. | |
<!--Please explain your changes.--> | |
### Additional information | |
<!--Any additional information you think is important.-->",Florian Richer,2022-09-06T18:18:57.788Z,NA,NA,"## Title: | |
Call check_that_malloc_is_allowed() in aligned_realloc() | |
## Authors: | |
Florian Richer | |
## Summary: | |
This merge request introduces a safeguard in the `aligned_realloc()` function to prevent memory reallocations when the `EIGEN_RUNTIME_NO_MALLOC` flag is defined. | |
### Key Changes: | |
- Implemented a check for `check_that_malloc_is_allowed()` within the `aligned_realloc()` function. | |
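- The check runs only when the returned pointer differs from the input pointer, which means a new allocation happened while `EIGEN_RUNTIME_NO_MALLOC` is defined. It is not needed with `EIGEN_NO_MALLOC`, since no memory could have been allocated in the first place.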
### Improvements: | |
- Enhances the robustness of memory management by preventing reallocations in environments where dynamic memory allocation is disabled, thereby avoiding potential runtime errors and improving stability. | |
### Impact: | |
The changes will improve the reliability of the Eigen library in scenarios where memory allocation is restricted, ensuring that users experience fewer issues related to memory handling in constrained environments." | |
1056 (https://gitlab.com/libeigen/eigen/-/merge_requests/1056),Reduce compiler warnings for tests.,NA,Antonio Sánchez,2022-09-06T18:21:14.674Z,NA,NA,"## Title: | |
Reduce compiler warnings for tests. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request focuses on minimizing compiler warnings that arise during testing in the Eigen C++ library. | |
### Key Changes: | |
- Adjustments made in test code to address and reduce warnings. | |
### Improvements: | |
- Cleaner build output with fewer warnings, enhancing code quality. | |
### Impact: | |
- Improved maintainability of the codebase by ensuring cleaner test outputs, which can facilitate easier debugging and development." | |
1057 (https://gitlab.com/libeigen/eigen/-/merge_requests/1057),Adjust overflow threshold bound for pow tests.,"The original bound was too weak and did result in some overflows causing | |
our CI pipelines to fail (https://gitlab.com/libeigen/eigen_ci_cross_testing/-/pipelines/631839537). | |
In particular, it allowed `int32_t` `pow(256, 4)`, which overflowed to 0 | |
for integers, and to an inexact result for float. Here we adjust the | |
bounds.",Antonio Sánchez,2022-09-06T19:53:29.762Z,NA,NA,"## Title: | |
Adjust overflow threshold bound for pow tests. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses the overflow issues encountered in the CI pipelines due to weak bounds in the power tests for the Eigen C++ library. | |
### Key Changes: | |
- Adjusted the overflow threshold bounds for the `pow` function tests to prevent overflow scenarios. | |
### Improvements: | |
- Enhanced the reliability of CI pipelines by preventing integer overflows (e.g., `pow(256, 4)` resulting in 0) and ensuring more accurate results for floating-point calculations. | |
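As a rough illustration of the overflow mentioned above, the sketch below raises 256 to the 4th power in 32-bit arithmetic. `ipow_u32` is a hypothetical helper, not Eigen code, and it uses unsigned arithmetic so that the wraparound is well defined (signed overflow is undefined behavior in C++). Since 256^4 = 2^32, the result wraps to 0.

```cpp
#include <cstdint>

// Hypothetical helper (not part of Eigen): integer power by repeated
// multiplication. Unsigned 32-bit arithmetic wraps modulo 2^32.
inline std::uint32_t ipow_u32(std::uint32_t base, unsigned exponent) {
  std::uint32_t result = 1;
  for (unsigned i = 0; i < exponent; ++i) {
    result *= base;  // wraps silently once the product reaches 2^32
  }
  return result;
}
```

Calling `ipow_u32(256, 3)` still fits in 32 bits, while `ipow_u32(256, 4)` wraps to 0, which is the kind of input the tightened bound is meant to exclude.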
### Impact: | |
- The adjustments improve the stability and reliability of the testing process, reducing the likelihood of CI pipeline failures caused by overflow errors." | |
899 (https://gitlab.com/libeigen/eigen/-/merge_requests/899),"Add constexpr, test for C++14 constexpr."," | |
### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
This is the next split-off from !881. It adds the C++14 parts of the patch which allows `constexpr` initialization of `Map`s from static `constexpr` memory and some of the basic operations (including, to my surprise, my memory is a sieve) addition and subtraction. | |
I included a testcase which is the testcase in !881 with the C++20 parts reduced to empty placeholders. | |
Edit: this is done. ~~I realize now that I forgot to follow up on @cantonios comment re destructors (""Whatever we do for NVCC, we probably also need to do for HIP as well.""), so I'll have to leave this as a draft until I can review what that implies. Edit: I think `#if defined(EIGEN_GPUCC)` should be the right thing.~~ | |
Compared to the version in !881 this re-instates (I think) all the inline keywords that I had removed when adding `constexpr`. | |
### Additional information | |
<!--Any additional information you think is important.-->",Tobias Schlüter,2022-09-07T04:02:46.824Z,NA,NA,"## Title: Add constexpr, test for C++14 constexpr. | |
## Authors: Tobias Schlüter | |
## Summary: | |
This merge request introduces C++14 `constexpr` support to the Eigen C++ library, enabling the initialization of `Map`s from static `constexpr` memory and allowing some basic operations such as addition and subtraction to be executed at compile time. A test case has been included to validate these changes. | |
### Key Changes: | |
- Added C++14 `constexpr` initialization for `Map`s. | |
- Supported basic operations (addition and subtraction) as `constexpr`. | |
- Included a new test case based on prior work, adjusted to focus on C++14 features. | |
### Improvements: | |
- Enhanced compile-time capabilities, improving the library's performance and usability for `constexpr` contexts. | |
- Reinstated the `inline` keywords that the earlier version in !881 had removed when adding `constexpr`. | |
### Impact: | |
These changes improve the functionality and flexibility of the Eigen library in C++14 environments, enabling more efficient compile-time computations and potentially reducing runtime overhead for certain tasks." | |
1058 (https://gitlab.com/libeigen/eigen/-/merge_requests/1058),Add missing comparison operators for GPU packets.," | |
### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
Adds missing comparison operators for GPUs. Without these, the recent vectorized version of psign (!1026) does not build with CUDA. | |
### Additional information | |
<!--Any additional information you think is important.-->",Rasmus Munk Larsen,2022-09-07T21:34:16.479Z,NA,NA,"## Title: | |
Add missing comparison operators for GPU packets. | |
## Authors: | |
Rasmus Munk Larsen | |
## Summary: | |
This merge request implements the addition of missing comparison operators specifically for GPU packets in the Eigen C++ library. This change addresses a critical build issue with the recent vectorized version of the `psign` function when used with CUDA. | |
### Key Changes: | |
- Introduced missing comparison operators needed for GPU operations. | |
- Ensured compatibility with the vectorized implementation of `psign`. | |
### Improvements: | |
- Enhances functionality of the Eigen library for GPU computations. | |
- Resolves build failures related to CUDA by providing necessary operator definitions. | |
### Impact: | |
This change enables successful integration and compilation of the vectorized `psign` functionality with CUDA, thereby improving the library's capability for GPU-accelerated linear algebra computations." | |
1061 (https://gitlab.com/libeigen/eigen/-/merge_requests/1061),Tweak bound for pow to account for floating-point types.,"Floating-point types can only represent up to 2^digits integers | |
exactly, so we need to adjust the corrected bound from !1057. | |
This fixes an `array_cwise_3` failure.",Antonio Sánchez,2022-09-08T17:40:45.985Z,NA,NA,"## Title: | |
Tweak bound for pow to account for floating-point types. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses an issue related to the representation of integers in floating-point types in the Eigen C++ library. It adjusts the corrected bound from a previous merge request in order to fix a failure encountered in `array_cwise_3`. | |
### Key Changes: | |
- Updated the bound for the `pow` function to accommodate the limitations of floating-point types. | |
### Improvements: | |
- Enhanced the reliability of operations involving floating-point numbers to prevent potential errors. | |
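The 2^digits limit can be checked directly. The sketch below is illustrative only: `exact_integer_bound` and `survives_float_round_trip` are hypothetical names, not Eigen functions, and the concrete numbers assume an IEEE 754 `float` with 24 significand bits.

```cpp
#include <limits>

// Hypothetical helper (not part of Eigen): returns 2^digits for type T,
// the bound up to which every integer is exactly representable in T.
template <typename T>
double exact_integer_bound() {
  double bound = 1.0;
  for (int i = 0; i < std::numeric_limits<T>::digits; ++i) {
    bound *= 2.0;
  }
  return bound;
}

// Returns true if the integer n is unchanged by a conversion to float
// and back, i.e. if float can represent it exactly.
inline bool survives_float_round_trip(long long n) {
  const float f = static_cast<float>(n);
  return static_cast<long long>(f) == n;
}
```

Under the IEEE 754 assumption, `exact_integer_bound<float>()` is 16777216 (2^24); that value survives the round trip, while 16777217 does not.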
### Impact: | |
- Fixes a specific failure in the `array_cwise_3`, thereby improving the stability and correctness of the Eigen library's functionalities dealing with floating-point arithmetic." | |
1060 (https://gitlab.com/libeigen/eigen/-/merge_requests/1060),Fix realloc for non-trivial types.,"Fix realloc for non-trivial types. | |
If a non-trivial type `RequiresInitialization`, then we unfortunately | |
can't simply rely on `realloc` to move the memory to a new buffer. | |
If the data type contains self-referencing pointers (as does | |
`AnnoyingScalar` in tests), then those pointers become invalid. | |
Instead, we need to allocate a new buffer and copy-construct (or | |
move-construct) existing elements. | |
This fixes some failing tests in `sparse_block`, which tickled the | |
bug.",Antonio Sánchez,2022-09-08T19:39:37.176Z,NA,NA,"## Title: | |
Fix realloc for non-trivial types. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses a bug related to reallocating memory for non-trivial types in the Eigen C++ library. It introduces a method to handle types that require specific initialization, ensuring that self-referencing pointers remain valid during memory reallocation. | |
### Key Changes: | |
- Implemented a new approach for reallocating memory that avoids relying on `realloc` for non-trivial types. | |
- Introduced logic to allocate a new buffer and copy-construct or move-construct existing elements instead. | |
### Improvements: | |
- Enhanced stability in memory handling for complex data types containing self-referencing pointers, such as `AnnoyingScalar`. | |
- Resolved failing tests in the `sparse_block` module that were affected by the previous implementation. | |
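The allocate-and-move idea can be sketched in a few lines. This is not Eigen's implementation: `SelfRef` is a hypothetical stand-in for a type like `AnnoyingScalar`, `grow_buffer` is an illustrative name, and the sketch ignores alignment requirements beyond what `malloc` provides and does not handle exceptions thrown by constructors.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <utility>

// Hypothetical type holding a pointer into itself. A bytewise move
// (as done by realloc) would leave self pointing at the old buffer.
struct SelfRef {
  int value;
  int* self;
  explicit SelfRef(int v = 0) : value(v), self(&value) {}
  SelfRef(const SelfRef& other) : value(other.value), self(&value) {}
  SelfRef& operator=(const SelfRef& other) {
    value = other.value;  // self keeps pointing at our own value
    return *this;
  }
  bool consistent() const { return self == &value; }
};

// Grows a buffer by allocating new storage and move-constructing the
// existing elements into it, instead of calling realloc.
template <typename T>
T* grow_buffer(T* old_data, std::size_t old_size, std::size_t new_size) {
  assert(new_size >= old_size);
  T* new_data = static_cast<T*>(std::malloc(new_size * sizeof(T)));
  if (new_data == nullptr) throw std::bad_alloc();
  for (std::size_t i = 0; i < old_size; ++i) {
    ::new (static_cast<void*>(new_data + i)) T(std::move(old_data[i]));
  }
  for (std::size_t i = old_size; i < new_size; ++i) {
    ::new (static_cast<void*>(new_data + i)) T();  // default-construct the tail
  }
  for (std::size_t i = 0; i < old_size; ++i) {
    old_data[i].~T();  // destroy the originals before freeing their storage
  }
  std::free(old_data);
  return new_data;
}
```

After `grow_buffer`, every element's `self` pointer refers to its own `value` in the new buffer, because each element was rebuilt by a constructor rather than copied byte by byte.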
### Impact: | |
This change improves the robustness of memory management within the Eigen library, particularly for types that require initialization, thereby reducing potential runtime errors and improving the overall reliability of the library." | |
1047 (https://gitlab.com/libeigen/eigen/-/merge_requests/1047),Feature/skew symmetric matrix3,"### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
https://gitlab.com/libeigen/eigen/-/issues/1474 | |
### Additional information | |
This MR implements a skew symmetric matrix class for Vector3. | |
It is based on DiagonalMatrix and implements its exponential using | |
[Rodrigues' rotation formula](https://mathworld.wolfram.com/RodriguesRotationFormula.html).",Thomas Gloor,2022-09-08T20:44:41.220Z,NA,NA,"## Title: Feature/skew symmetric matrix3 | |
## Authors: Thomas Gloor | |
## Summary: | |
This merge request introduces a new class for skew symmetric matrices specifically for Vector3 in the Eigen C++ library. | |
### Key Changes: | |
- Addition of a skew symmetric matrix class for 3D vectors. | |
- Implementation of matrix exponential using Rodrigues' rotation formula. | |
### Improvements: | |
- Enhances the library's functionality for 3D vector transformations by providing a dedicated skew symmetric matrix representation. | |
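For reference, Rodrigues' formula gives the exponential of the skew-symmetric matrix K of a vector w with norm theta as I + (sin(theta)/theta) K + ((1 - cos(theta))/theta^2) K^2. The sketch below evaluates it with plain `std::array` types; it is not the class added by this MR, and the names `skew`, `multiply` and `exp_skew` as well as the small-angle threshold are assumptions made for illustration.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

using Vec3 = std::array<double, 3>;
using Mat3 = std::array<std::array<double, 3>, 3>;

// Skew-symmetric (cross-product) matrix of w: skew(w) * v equals w x v.
inline Mat3 skew(const Vec3& w) {
  Mat3 k{};  // value-initialized to all zeros
  k[0][1] = -w[2];
  k[0][2] = w[1];
  k[1][0] = w[2];
  k[1][2] = -w[0];
  k[2][0] = -w[1];
  k[2][1] = w[0];
  return k;
}

inline Mat3 multiply(const Mat3& a, const Mat3& b) {
  Mat3 c{};
  for (std::size_t i = 0; i < 3; ++i)
    for (std::size_t j = 0; j < 3; ++j)
      for (std::size_t k = 0; k < 3; ++k) c[i][j] += a[i][k] * b[k][j];
  return c;
}

// exp(skew(w)) via Rodrigues' rotation formula.
inline Mat3 exp_skew(const Vec3& w) {
  const double theta = std::sqrt(w[0] * w[0] + w[1] * w[1] + w[2] * w[2]);
  const Mat3 k = skew(w);
  const Mat3 k2 = multiply(k, k);
  // For tiny angles use the limits of the two coefficients (1 and 1/2).
  const bool tiny = theta < 1e-8;
  const double a = tiny ? 1.0 : std::sin(theta) / theta;
  const double b = tiny ? 0.5 : (1.0 - std::cos(theta)) / (theta * theta);
  Mat3 r{};
  for (std::size_t i = 0; i < 3; ++i)
    for (std::size_t j = 0; j < 3; ++j)
      r[i][j] = (i == j ? 1.0 : 0.0) + a * k[i][j] + b * k2[i][j];
  return r;
}
```

With w = (0, 0, pi/2) the result is the rotation by a quarter turn about the z axis, which maps the x axis onto the y axis.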
### Impact: | |
- Enables more efficient and accurate computations involving rotations in 3D space, thus improving the usability of the Eigen library for applications in graphics and robotics." | |
1064 (https://gitlab.com/libeigen/eigen/-/merge_requests/1064),Fix g++-6 constexpr and c++20 constexpr build errors.,"Apparently g++-6 requires that all variables in a constexpr function | |
need to be initialized on construction. | |
Also added more `constexpr` labels that are required post c++20. | |
And clang has a bug for `consteval` that claims it's not a constexpr (https://stackoverflow.com/questions/63364918/clang-says-call-to-void-consteval-function-is-not-a-constant-expression). Re-wrote the `assert_constexpr` ""function"" using template parameter and macro. | |
Fixes #2536",Antonio Sánchez,2022-09-09T03:41:45.852Z,NA,NA,"## Title: | |
Fix g++-6 constexpr and c++20 constexpr build errors. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request fixes `constexpr` build errors with g++-6, which requires every variable in a `constexpr` function to be initialized on construction, and adds further `constexpr` labels that are needed when building as C++20. It also works around a clang `consteval` bug by rewriting the `assert_constexpr` functionality. | |
### Key Changes: | |
- Ensured all variables in `constexpr` functions are initialized upon construction to meet g++-6 requirements. | |
- Added additional `constexpr` labels as required for C++20 compliance. | |
- Rewrote the `assert_constexpr` mechanism using template parameters and macros due to a bug in clang. | |
### Improvements: | |
- Enhanced compatibility with g++-6 and C++20. | |
- Improved robustness of `assert_constexpr` by addressing cross-compiler inconsistencies. | |
### Impact: | |
This fix will facilitate smoother builds across different compiler versions, thereby improving the library's portability and adherence to modern C++ standards." | |
1065 (https://gitlab.com/libeigen/eigen/-/merge_requests/1065),[ROCm] Fix for sparse matrix related breakage on ROCm.,"The following commit caused a compilation failure on ROCm: https://gitlab.com/rohitsan/eigen/-/commit/ec9c7163a3acd941163dc26aa1bea913a4a5c3a8 | |
This MR fixes this. | |
/cc @cantonios",Rohit Santhanam,2022-09-09T15:48:05.107Z,NA,NA,"## Title: | |
[ROCm] Fix for sparse matrix related breakage on ROCm. | |
## Authors: | |
Rohit Santhanam | |
## Summary: | |
This merge request addresses a compilation failure on the ROCm backend caused by a previous commit. | |
### Key Changes: | |
- Introduced fixes to resolve issues in the sparse matrix functionality that arose from a specific commit. | |
### Improvements: | |
- Enhanced compatibility of the Eigen library with ROCm for sparse matrix operations. | |
### Impact: | |
- Resolves compilation issues, ensuring that users can effectively utilize sparse matrices on the ROCm platform without encountering errors." | |
1063 (https://gitlab.com/libeigen/eigen/-/merge_requests/1063),Fix a couple of issues with unary pow():," | |
### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
1. Explicitly cast value returned by std::pow() to result_type. | |
1. Discard const when comparing types. | |
### Additional information | |
<!--Any additional information you think is important.-->",Rasmus Munk Larsen,2022-09-09T17:21:11.889Z,NA,NA,"## Title: | |
Fix a couple of issues with unary pow() | |
## Authors: | |
Rasmus Munk Larsen | |
## Summary: | |
This merge request addresses two specific issues related to the unary `pow()` function in the Eigen C++ library, ensuring better type safety and correctness in type comparisons. | |
### Key Changes: | |
1. Explicitly casts the value returned by `std::pow()` to `result_type`. | |
2. Discards `const` when comparing types to enhance type compatibility. | |
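Both points can be illustrated with standard library facilities. The sketch below is not the code from this MR: `same_ignoring_const` and `pow_as` are hypothetical names. It relies on two standard facts: `std::is_same` treats `const float` and `float` as different types, and `std::pow` called with a `float` base and an `int` exponent returns `double`.

```cpp
#include <cmath>
#include <type_traits>

// Hypothetical trait: compares two types after stripping top-level const.
template <typename A, typename B>
struct same_ignoring_const
    : std::is_same<typename std::remove_const<A>::type,
                   typename std::remove_const<B>::type> {};

// Hypothetical wrapper: casts the value returned by std::pow back to the
// scalar type of the base, so the result type does not silently widen.
template <typename Scalar, typename Exponent>
Scalar pow_as(Scalar base, Exponent exponent) {
  return static_cast<Scalar>(std::pow(base, exponent));
}
```

Here `same_ignoring_const<const float, float>::value` is true although `std::is_same<const float, float>::value` is false, and `pow_as(2.0f, 3)` has type `float`.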
### Improvements: | |
- The changes improve type safety, reducing the risk of type mismatch errors during computation. | |
- By explicitly managing types, the adjustments simplify future maintenance and modifications of the `pow()` function. | |
### Impact: | |
These updates make the unary `pow()` operation more reliable: the result is returned as the expected `result_type`, and const-qualified and unqualified types are treated as the same type when they are compared." | |
1069 (https://gitlab.com/libeigen/eigen/-/merge_requests/1069),Remove bad skew_symmetric_matrix3 test.,"We can't compare an uninitialized matrix since this results in msan | |
errors, and it does actually pass the approx test with compiler | |
optimizations since the old memory address is re-used. | |
Fixes #2537.",Antonio Sánchez,2022-09-10T07:08:38.726Z,NA,NA,"## Title: | |
Remove bad skew_symmetric_matrix3 test. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request removes a faulty skew_symmetric_matrix3 test from the Eigen C++ library. The test compared an uninitialized matrix, which triggers msan errors; with compiler optimizations it passed the approx test only because the old memory address happened to be re-used. | |
### Key Changes: | |
- Removed the faulty skew_symmetric_matrix3 test due to issues with uninitialized matrix comparisons. | |
### Improvements: | |
- Mitigated potential msan errors associated with the comparison of uninitialized matrices. | |
### Impact: | |
- Enhances the robustness of the testing framework by eliminating unreliable tests, ensuring future tests yield correct results." | |
1066 (https://gitlab.com/libeigen/eigen/-/merge_requests/1066),"Allow mixed types for pow(), as long as the exponent is exactly representable in the base type."," | |
### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
The current implementation of unary_pow breaks a lot of existing code because of the hard constraint that base type == exponent type for floating point types. Promoting the exponent from float to double is safe. This change is an attempt to write this in a general way, so we can add support for mixed `pow()` for more types in the future, such as `pow(double, bfloat16)`. | |
### Additional information | |
<!--Any additional information you think is important.-->",Rasmus Munk Larsen,2022-09-12T21:55:31.867Z,NA,NA,"## Title: | |
Allow mixed types for pow(), as long as the exponent is exactly representable in the base type. | |
## Authors: | |
Rasmus Munk Larsen | |
## Summary: | |
This merge request introduces modifications to the `pow()` function within the Eigen C++ library, allowing for mixed types as long as the exponent can be exactly represented in the base type. Previously, the implementation required both base and exponent types to be the same, which led to issues in existing code. | |
### Key Changes: | |
- Removed the constraint that base type must equal exponent type for floating point operations. | |
- Enabled safe promotion of the exponent from `float` to `double`. | |
- Established a foundation for future support of mixed type operations in `pow()`, such as `pow(double, bfloat16)`. | |
### Improvements: | |
- Increased flexibility in using the `pow()` function, allowing for a broader range of type combinations without breaking existing code. | |
- Enhanced usability by accommodating type promotions that are generally safe. | |
### Impact: | |
This change aims to resolve compatibility issues in existing codebases using the `pow()` function while setting the stage for future expansions in type support. It facilitates easier integration of different numeric types in mathematical computations." | |
1070 (https://gitlab.com/libeigen/eigen/-/merge_requests/1070),Fix test for pow with mixed integer types. We do not convert the exponent if it is an integer type.," | |
### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
### Additional information | |
<!--Any additional information you think is important.-->",Rasmus Munk Larsen,2022-09-12T23:13:38.030Z,NA,NA,"## Title: | |
Fix test for pow with mixed integer types. We do not convert the exponent if it is an integer type. | |
## Authors: | |
Rasmus Munk Larsen | |
## Summary: | |
This merge request fixes the test for the `pow` function with mixed integer types in the Eigen C++ library. The exponent is not converted when it is an integer type, and the test now reflects that. | |
### Key Changes: | |
- Adjusted the test for the `pow` function to ensure proper handling of mixed integer types. | |
- Updated the test expectations so that integer-type exponents are treated as unconverted. | |
### Improvements: | |
- Enhanced accuracy and reliability of the `pow` function when dealing with mixed integer types. | |
### Impact: | |
- This change improves the functionality of the `pow` function and prevents potential errors related to type conversions, thus ensuring consistent results in mathematical calculations involving mixed integer types." | |
1073 (https://gitlab.com/libeigen/eigen/-/merge_requests/1073),Add AVX int32_t pdiv," | |
### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
Enables vectorized int32_t division by internally casting to double and truncating the result. Approximately 2x throughput compared to native integer division. | |
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2022-09-16T17:06:30.044Z,NA,NA,"## Title: | |
Add AVX int32_t pdiv | |
## Authors: | |
Charles Schlosser | |
## Summary: | |
This merge request introduces a new implementation for vectorized division of `int32_t` using AVX instructions. This method enhances performance by internally casting the operands to double and truncating the result. | |
### Key Changes: | |
- Added support for vectorized `int32_t` division utilizing AVX. | |
- Division operation casts integers to double for computation, then truncates the results. | |
### Improvements: | |
- The new implementation achieves approximately double the throughput compared to the native integer division. | |
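A scalar analogue of the cast-and-truncate approach is sketched below. It is not the AVX code from this MR: `div_via_double` is a hypothetical name, and the sketch assumes a nonzero divisor and excludes the one quotient that does not fit in `int32_t` (the minimum value divided by -1). Every `int32_t` value is exactly representable in a `double`, which is what makes the detour through floating point viable.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical scalar sketch: divide two int32_t values by converting to
// double, dividing, and truncating toward zero on the way back.
inline std::int32_t div_via_double(std::int32_t a, std::int32_t b) {
  assert(b != 0);
  assert(!(a == std::numeric_limits<std::int32_t>::min() && b == -1));
  const double quotient = static_cast<double>(a) / static_cast<double>(b);
  return static_cast<std::int32_t>(quotient);  // conversion truncates toward zero
}
```

For example, `div_via_double(7, 2)` yields 3 and `div_via_double(-7, 2)` yields -3, matching the truncating behavior of native integer division.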
### Impact: | |
This enhancement significantly boosts the performance of integer division operations within the Eigen library, making it particularly beneficial for applications that rely on intensive numerical computations." | |
1074 (https://gitlab.com/libeigen/eigen/-/merge_requests/1074),"Revert ""Add constexpr, test for C++14 constexpr.""."," | |
### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
### Additional information | |
<!--Any additional information you think is important.-->",Rasmus Munk Larsen,2022-09-16T21:14:30.500Z,NA,NA,"## Title: | |
Revert ""Add constexpr, test for C++14 constexpr."" | |
## Authors: | |
Rasmus Munk Larsen | |
## Summary: | |
This merge request reverts a previous change that introduced `constexpr` support and corresponding tests for C++14. | |
### Key Changes: | |
- Removed the addition of `constexpr` features and tests that were introduced in a prior commit. | |
### Improvements: | |
- Reverting the changes may improve code stability if the `constexpr` implementation was causing issues or not functioning as expected. | |
### Impact: | |
- The reversion aims to restore functionality to a state prior to the changes, potentially addressing any regressions or compatibility problems introduced by the `constexpr` features." | |
1077 (https://gitlab.com/libeigen/eigen/-/merge_requests/1077),[ROCm] fixed gpuGetDevice unused message,"### Reference issue | |
https://gist.github.com/cheshire/f81fd036f07ea9267e3abb8c14c12aec#file-gistfile1-txt-L7-L8 | |
We are fixing unused-result warnings for TensorFlow in all Linux builds, and this unused-result warning exists in both ROCm and CUDA. | |
### What does this implement/fix? | |
Add the warning status check for `gpuGetDevice`, and it will print out ""Failed to get the GPU devices "". | |
This is identical to `gpuGetDeviceCount`.",Chao Chen,2022-09-20T21:38:21.447Z,NA,NA,"## Title: | |
[ROCm] fixed gpuGetDevice unused message | |
## Authors: | |
Chao Chen | |
## Summary: | |
This merge request addresses an unused-result warning in the ROCm integration of the Eigen C++ library, specifically related to the `gpuGetDevice` function. It ensures that warnings are properly reported when there is a failure to acquire GPU device information. | |
### Key Changes: | |
- Implemented a warning status check for the `gpuGetDevice` function. | |
- Added a message that prints ""Failed to get the GPU devices"" upon failure, aligning with the existing behavior of `gpuGetDeviceCount`. | |
### Improvements: | |
- Enhances error reporting for GPU device acquisition failures. | |
- Helps maintain cleaner build outputs by addressing unused-result warnings, particularly for TensorFlow on Linux platforms. | |
### Impact: | |
- Improved clarity in diagnostics for developers by providing explicit feedback when GPU devices cannot be retrieved, which could assist in debugging and ensure better compatibility with TensorFlow." | |
1076 (https://gitlab.com/libeigen/eigen/-/merge_requests/1076),"Add vectorized integer division for int32 with AVX512, AVX or SSE.","<!-- | |
Thanks for contributing a merge request! Please name and fully describe your MR as you would for a commit message. | |
If the MR fixes an issue, please include ""Fixes #issue"" in the commit message and the MR description. | |
In addition, we recommend that first-time contributors read our [contribution guidelines](https://eigen.tuxfamily.org/index.php?title=Contributing_to_Eigen) and [git page](https://eigen.tuxfamily.org/index.php?title=Git), which will help you submit a more standardized MR. | |
Before submitting the MR, you also need to complete the following checks: | |
- Make one PR per feature/bugfix (don't mix multiple changes into one PR). Avoid committing unrelated changes. | |
- Rebase before committing | |
- For code changes, run the test suite (at least the tests that are likely affected by the change). | |
See our [test guidelines](https://eigen.tuxfamily.org/index.php?title=Tests). | |
- If possible, add a test (both for bug-fixes as well as new features) | |
- Make sure new features are documented | |
Note that we are a team of volunteers; we appreciate your patience during the review process. | |
Again, thanks for contributing! --> | |
### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
This implements vectorized integer division for int32 on AVX512, AVX and SSE. | |
The original author is @chuckyschluz. | |
This change: | |
- sets HasDiv=1 for Packet4i, Packet8i, and Packet16i | |
- raises SIGFPE upon division by zero | |
### Additional information | |
<!--Any additional information you think is important.--> | |
Benchmark data for SSE & AVX here: https://gitlab.com/libeigen/eigen/-/snippets/2412525",Rasmus Munk Larsen,2022-09-21T00:27:23.997Z,NA,NA,"## Title: | |
Add vectorized integer division for int32 with AVX512, AVX or SSE. | |
## Authors: | |
Rasmus Munk Larsen, Original author @chuckyschluz | |
## Summary: | |
This merge request introduces a vectorized implementation of integer division for int32 data types utilizing AVX512, AVX, and SSE instructions, enhancing the performance of division operations in the Eigen C++ library. | |
### Key Changes: | |
- Implemented vectorized integer division for int32. | |
- Set the `HasDiv=1` flag for `Packet4i`, `Packet8i`, and `Packet16i`. | |
- Implemented a mechanism to raise SIGFPE (Signal Floating Point Exception) when division by zero occurs. | |
### Improvements: | |
- Increased performance for integer division operations on supported architectures by leveraging vectorization. | |
- Enhanced error handling through the SIGFPE signal when dividing by zero, improving robustness. | |
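SSE, AVX and AVX512 provide no packed int32 division instruction, so a vectorized quotient has to be emulated. The sketch below shows one common approach in scalar form: divide through `double` and truncate. This is an assumption for illustration and is not taken from the MR; `div_via_double` and `div4` are hypothetical names. It assumes a nonzero divisor and excludes INT32_MIN / -1.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not an Eigen function): divide two int32 values by
// going through double. Every int32 is exactly representable as a double,
// and truncating the rounded quotient gives the same result as integer
// division. Preconditions: b != 0, and not (a == INT32_MIN && b == -1).
inline std::int32_t div_via_double(std::int32_t a, std::int32_t b) {
  const double q = static_cast<double>(a) / static_cast<double>(b);
  return static_cast<std::int32_t>(q);  // conversion truncates toward zero
}

// Lane-by-lane loop standing in for a 4-lane packet such as Packet4i.
inline void div4(const std::int32_t* a, const std::int32_t* b,
                 std::int32_t* out) {
  for (int i = 0; i < 4; ++i) out[i] = div_via_double(a[i], b[i]);
}
```

A real SIMD version would perform the same convert, divide and truncate steps on whole registers and would need its own handling of zero divisors.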
### Impact: | |
This optimization is expected to significantly improve computational efficiency for applications utilizing integer division, particularly in high-performance computing scenarios. The introduction of proper error handling further strengthens the reliability of the library's functionality." | |
1018 (https://gitlab.com/libeigen/eigen/-/merge_requests/1018),Use 3px8/2px8/1px8/1x8 gebp_kernel on arm64-neon,"### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> -->#2518 | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
I found that the gebp_kernel used by Eigen on NEON is 3px4/2px4/1px4 by default. This is reasonable on x86 (AVX/FMA) and arm32. However, arm64 NEON has 32 registers, so a larger gebp_kernel can be used to get better data reuse and improve performance. | |
Therefore, I implemented the 3px8/2px8/1px8 gebp_kernel in Eigen (for 3px8: 24 registers for acc, 3 for lhs, 1 for rhs). | |
### Additional information | |
<!--Any additional information you think is important.--> | |
**benchmark** | |
# clang | |
## dgemm | |
 | |
 | |
## sgemm | |
 | |
 | |
## hgemm | |
 | |
 | |
# gcc | |
## dgemm | |
 | |
 | |
## sgemm | |
 | |
 | |
## hgemm | |
 | |
 | |
**platform** : Ampere® Altra | |
Architecture: aarch64 | |
CPU op-mode(s): 32-bit, 64-bit | |
Byte Order: Little Endian | |
CPU(s): 128 | |
On-line CPU(s) list: 0-127 | |
Vendor ID: ARM | |
Model name: Neoverse-N1 | |
Model: 1 | |
Thread(s) per core: 1 | |
Core(s) per socket: 64 | |
Socket(s): 2 | |
Stepping: r3p1 | |
CPU max MHz: 3000.0000 | |
CPU min MHz: 1000.0000 | |
BogoMIPS: 50.00 | |
Flags: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp ssbs | |
Caches (sum of all): | |
L1d: 8 MiB (128 instances) | |
L1i: 8 MiB (128 instances) | |
L2: 128 MiB (128 instances)",Lianhuang Li,2022-09-21T16:36:41.318Z,NA,NA,"## Title: | |
Use 3px8/2px8/1px8/1x8 gebp_kernel on arm64-neon | |
## Authors: | |
Lianhuang Li | |
## Summary: | |
This merge request introduces an optimized gebp_kernel configuration specifically for arm64-neon architecture. The new configuration utilizes a 3px8/2px8/1px8 kernel, aiming to leverage the 32 registers available on arm64 to enhance performance through better data reuse. | |
### Key Changes: | |
- Transition from the default gebp_kernel of 3px4/2px4/1px4 to 3px8/2px8/1px8. | |
- The new implementation efficiently utilizes 24 registers for accumulation, 3 for the left-hand side (lhs), and 1 for the right-hand side (rhs) of matrix multiplications. | |
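The register budget quoted above can be checked with a little arithmetic. The sketch below is illustrative only: `gebp_registers` and `kArm64NeonRegisters` are hypothetical names, and the formula simply restates the MR's own count (accumulators, plus lhs packets, plus one rhs register).

```cpp
#include <cassert>

// Number of vector registers on arm64 NEON, as stated in the MR description.
constexpr int kArm64NeonRegisters = 32;

// Hypothetical helper: registers used by a micro-kernel that holds
// `lhs_packets` lhs packets and `nr` rhs columns at once.
constexpr int gebp_registers(int lhs_packets, int nr) {
  return lhs_packets * nr  // accumulators
         + lhs_packets     // lhs packets
         + 1;              // rhs value
}

// 3px4 needs 16 registers, 3px8 needs 28, which still fits in 32.
static_assert(gebp_registers(3, 4) == 16);
static_assert(gebp_registers(3, 8) == 28);
static_assert(gebp_registers(3, 8) <= kArm64NeonRegisters);
```

By the same count, a 3px16 kernel would need 52 registers, so 8 is the largest power-of-two width that fits the arm64 budget.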
### Improvements: | |
- Enhanced performance on arm64-neon platforms by optimizing register usage. | |
- Significant improvements observed in benchmark tests for different matrix multiplication types (dgemm, sgemm, hgemm) utilizing both clang and gcc compilers. | |
### Impact: | |
- This change is expected to boost the computational efficiency of matrix operations on arm64 architectures, leading to faster execution times for applications relying on the Eigen library for linear algebra computations." | |
1078 (https://gitlab.com/libeigen/eigen/-/merge_requests/1078),Add a macro to set the nr trait in the GEBP kernel for NEON.,"### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
### Additional information | |
<!--Any additional information you think is important.-->",Rasmus Munk Larsen,2022-09-22T23:56:34.752Z,NA,NA,"## Title: | |
Add a macro to set the nr trait in the GEBP kernel for NEON. | |
## Authors: | |
Rasmus Munk Larsen | |
## Summary: | |
This merge request adds a macro that sets the `nr` trait of the GEBP (general block-panel product) kernel for NEON. The MR description does not state the motivation. | |
### Key Changes: | |
- Added a new macro to configure the `nr` trait for the GEBP kernel tailored for NEON support. | |
### Improvements: | |
- Lets NEON builds choose the `nr` value, the register-block width along the right-hand-side columns, through a macro. | |
### Impact: | |
- Gives users on NEON hardware a way to tune the GEBP kernel; any performance effect depends on the chosen value and is not quantified in the MR." | |
1080 (https://gitlab.com/libeigen/eigen/-/merge_requests/1080),Remove unused typedef.,"### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
### Additional information | |
<!--Any additional information you think is important.-->",Rasmus Munk Larsen,2022-09-23T19:33:55.589Z,NA,NA,"## Title: | |
Remove unused typedef | |
## Authors: | |
Rasmus Munk Larsen | |
## Summary: | |
This merge request focuses on cleaning up the Eigen C++ library by removing an unused typedef, thereby enhancing code clarity and maintainability. | |
### Key Changes: | |
- Removed a typedef that was not utilized within the codebase. | |
### Improvements: | |
- Simplified the code, making it more readable and easier to maintain. | |
### Impact: | |
- Contributes to improved code cleanliness and may reduce confusion for future contributors regarding unused components." | |
1079 (https://gitlab.com/libeigen/eigen/-/merge_requests/1079),Try to reduce compilation time/memory for GEBP kernel using EIGEN_IF_CONSTEXPR,"### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
### Additional information | |
<!--Any additional information you think is important.-->",Rasmus Munk Larsen,2022-09-23T20:09:43.550Z,NA,NA,"## Title: | |
Try to reduce compilation time/memory for GEBP kernel using EIGEN_IF_CONSTEXPR | |
## Authors: | |
Rasmus Munk Larsen | |
## Summary: | |
This merge request tries to reduce compilation time and memory usage for the GEBP (general block-panel product) kernel in the Eigen C++ library. It uses the `EIGEN_IF_CONSTEXPR` macro so that branches on compile-time constants can be resolved at compile time. | |
### Key Changes: | |
- Implementation of `EIGEN_IF_CONSTEXPR` in the GEBP kernel. | |
- Adjustments made to enhance the handling of compile-time constants. | |
### Improvements: | |
- Aims to shorten compile times by letting the compiler drop branches that are never taken. | |
- Aims to lower memory consumption when compiling the GEBP kernel. | |
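As a rough illustration of the idea, and assuming `EIGEN_IF_CONSTEXPR` maps to C++17 `if constexpr` where the compiler supports it (the MR description does not say), the sketch below shows how a compile-time branch limits what gets instantiated. `columns_per_step` is a hypothetical function, not part of Eigen.

```cpp
#include <cassert>

// Hypothetical stand-in for a kernel that supports several block widths.
// With `if constexpr`, only the branch selected by the template argument
// is instantiated; the other branches are discarded at compile time.
template <int Nr>
int columns_per_step() {
  if constexpr (Nr >= 8) {
    return 8;  // wide path, instantiated only when Nr >= 8
  } else if constexpr (Nr >= 4) {
    return 4;  // medium path
  } else {
    return 1;  // one column at a time
  }
}
```

With a plain `if`, every branch would have to be compiled for every `Nr`, even those that can never run, which is where the extra compile time and memory would come from.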
### Impact: | |
The improvements are expected to enhance the development experience by making compile times shorter and resource usage more efficient, which can contribute to a more responsive workflow for developers working with large projects that utilize the Eigen library." | |
1083 (https://gitlab.com/libeigen/eigen/-/merge_requests/1083),Try to reduce size of GEBP kernel for non-ARM targets.,"### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
This is an experiment to try and prevent MSVC from running out of heap memory when building TensorFlow. | |
### Additional information | |
<!--Any additional information you think is important.-->",Rasmus Munk Larsen,2022-09-28T02:37:18.984Z,NA,NA,"## Title: | |
Try to reduce size of GEBP kernel for non-ARM targets. | |
## Authors: | |
Rasmus Munk Larsen | |
## Summary: | |
This merge request tries to reduce the size of the GEBP (general block-panel product) kernel for non-ARM targets. It is an experiment to prevent MSVC from running out of heap memory when building TensorFlow. | |
### Key Changes: | |
- Reduced the size of the GEBP kernel code for non-ARM targets. | |
### Improvements: | |
- Intended to lower compiler heap usage at build time, in particular with MSVC. | |
### Impact: | |
- This change is expected to facilitate successful builds of TensorFlow on systems where MSVC encounters heap memory limitations, improving the overall usability of the Eigen library in such contexts." | |
1082 (https://gitlab.com/libeigen/eigen/-/merge_requests/1082),Add a vectorized implementation of atan2 to Eigen.,"### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
This adds support for the array syntax `z = x.atan2(y)` and the corresponding global function `z = atan2(x,y)`. Since the common case is `atan2(x) = atan(y/x)` and `atan()` is already vectorized, this MR mostly adds global declarations and vectorized handling of special cases, as specified at https://en.cppreference.com/w/cpp/numeric/math/atan2. | |
### Additional information | |
<!--Any additional information you think is important.--> | |
Speedup is about 12.4x for AVX512 on large arrays. See detailed benchmark numbers here: | |
https://gitlab.com/libeigen/eigen/-/snippets/2418433",Rasmus Munk Larsen,2022-09-28T20:46:50.727Z,NA,NA,"## Title: | |
Add a vectorized implementation of atan2 to Eigen. | |
## Authors: | |
Rasmus Munk Larsen | |
## Summary: | |
This merge request introduces a vectorized implementation of the `atan2` function in the Eigen C++ library. It allows users to compute the `atan2` of two arrays with an efficient array syntax, enhancing computational performance for large datasets. | |
### Key Changes: | |
- Added support for the array syntax `z = x.atan2(y)` and the global function `z = atan2(x,y)`. | |
- Implemented vectorized handling of special cases as per specifications from the C++ reference documentation. | |
### Improvements: | |
- Achieves approximately 12.4x speedup for AVX512 executions on large arrays. | |
- Leverages the existing vectorized `atan()` function, optimizing the computation process for more efficient processing. | |
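The relation between `atan2` and `atan` can be sketched in scalar code. This is a simplified reference under stated assumptions, not Eigen's implementation: `atan2_sketch` is a hypothetical name, it uses the standard argument order (y first, then x), and it covers finite inputs only. Signed zeros, infinities and NaN need the extra cases listed on cppreference.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical scalar reference: atan2 built from atan(y / x) plus a
// quadrant correction. Finite inputs only; signed zeros are not handled.
inline double atan2_sketch(double y, double x) {
  const double pi = 3.14159265358979323846;
  if (x > 0.0) return std::atan(y / x);  // common case, right half-plane
  if (x < 0.0) {
    // Left half-plane: shift by pi, with the sign taken from y.
    return std::atan(y / x) + (y >= 0.0 ? pi : -pi);
  }
  if (y > 0.0) return pi / 2;   // x == 0, positive y axis
  if (y < 0.0) return -pi / 2;  // x == 0, negative y axis
  return 0.0;                   // x == 0 and y == 0
}
```

A vectorized version would evaluate the common case for all lanes and then blend in the corrections with selects instead of branching.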
### Impact: | |
This enhancement significantly improves the performance of calculations involving the `atan2` function in Eigen, making it more suitable for operations on large-scale arrays and providing a better experience for users who require high-performance mathematical computations." | |
1084 (https://gitlab.com/libeigen/eigen/-/merge_requests/1084),Vectorize atan() for double.,"### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
See benchmarks: https://gitlab.com/libeigen/eigen/-/snippets/2420060 | |
### Additional information | |
<!--Any additional information you think is important.--> | |
Sampling the function for arguments in the interval [0:1] with multiplicative stepsize of 1+1e-7 shows a maximum relative error of 2 ULPs.",Rasmus Munk Larsen,2022-10-01T01:49:30.973Z,NA,NA,"## Title: | |
Vectorize atan() for double. | |
## Authors: | |
Rasmus Munk Larsen | |
## Summary: | |
This merge request introduces a vectorized implementation of the `atan()` function specifically for double precision floats in the Eigen C++ library. | |
### Key Changes: | |
- Implemented a vectorized version of the `atan()` function for double precision, optimizing its performance. | |
### Improvements: | |
- Achieves a maximum relative error of 2 ULPs when sampling the function on the interval [0:1] with a multiplicative step size of 1+1e-7. | |
- Enhances computational efficiency for applications utilizing `atan()` with double precision data. | |
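An error bound stated in ULPs can be measured by comparing bit patterns. The sketch below is one way to do that, not the author's test harness: `ulp_distance` and `max_ulp_error` are hypothetical helpers, and the sampling loop assumes a start value above zero and a step above one, since a multiplicative step cannot start at 0.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical helper: distance in ULPs between two finite doubles of the
// same sign, computed from their IEEE-754 bit patterns.
inline std::int64_t ulp_distance(double a, double b) {
  std::int64_t ia, ib;
  std::memcpy(&ia, &a, sizeof(double));
  std::memcpy(&ib, &b, sizeof(double));
  return ia > ib ? ia - ib : ib - ia;
}

// Largest ULP distance between `approx` and `reference` on [lo, 1],
// sampled with a multiplicative step. Requires lo > 0 and step > 1.
template <typename F, typename G>
std::int64_t max_ulp_error(F approx, G reference, double lo, double step) {
  std::int64_t worst = 0;
  for (double x = lo; x <= 1.0; x *= step) {
    const std::int64_t d = ulp_distance(approx(x), reference(x));
    if (d > worst) worst = d;
  }
  return worst;
}
```

Passing a vectorized `atan` as `approx` and `std::atan` as `reference` would give a figure comparable to the one quoted above, with the caveat that the reference itself is only accurate to within its own rounding error.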
### Impact: | |
The vectorization of the `atan()` function will likely lead to significant performance improvements in applications that rely on this mathematical operation, especially for large datasets or in computationally intensive scenarios." | |
1086 (https://gitlab.com/libeigen/eigen/-/merge_requests/1086),Only vectorize atan<double> for Altivec if VSX is available.,"### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
### Additional information | |
<!--Any additional information you think is important.-->",Rasmus Munk Larsen,2022-10-03T22:06:59.594Z,NA,NA,"## Title: | |
Only vectorize atan<double> for Altivec if VSX is available. | |
## Authors: | |
Rasmus Munk Larsen | |
## Summary: | |
This merge request implements conditional vectorization of the `atan<double>` function for the Altivec architecture, specifically ensuring that the vectorization only takes place if VSX (Vector-Scalar Extension) support is available. | |
### Key Changes: | |
- Vectorization of `atan<double>` for Altivec is now only performed when VSX is available. | |
### Improvements: | |
- Avoids enabling the vectorized `atan<double>` path on Altivec targets without VSX, where double-precision vector arithmetic is not available. | |
### Impact: | |
- Should keep Altivec builds without VSX working, while VSX builds retain the vectorized `atan<double>`." | |
1085 (https://gitlab.com/libeigen/eigen/-/merge_requests/1085),Fix 4x4 inverse when compiling with -Ofast.,"Fast mode doesn't respect -0, causing sign flips in the inverse. | |
Fixes #2549",Antonio Sánchez,2022-10-04T16:05:49.760Z,NA,NA,"## Title: | |
Fix 4x4 inverse when compiling with -Ofast. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request fixes the 4x4 matrix inverse in the Eigen C++ library when compiling with -Ofast. Fast mode does not respect the sign of zero (-0), which caused sign flips in the computed inverse. Fixes #2549. | |
### Key Changes: | |
- Changed the 4x4 inverse so that its result no longer depends on the sign of zero being preserved under -Ofast. | |
### Improvements: | |
- Enhanced the reliability of the matrix inverse functionality in optimized builds, preventing erroneous results. | |
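To see why the sign of zero matters, consider two expressions that are equal for every nonzero input but differ at zero. The sketch below shows standard IEEE-754 behavior in a normal build; it does not reproduce the MR's fix, and `negate` and `zero_minus` are hypothetical names. In GCC and Clang, -Ofast turns on fast-math optimizations, under which the compiler may treat the two forms as interchangeable.

```cpp
#include <cassert>
#include <cmath>

// Negating +0.0 gives -0.0 under IEEE-754 rules.
inline double negate(double x) { return -x; }

// Subtracting +0.0 from +0.0 gives +0.0, so this differs from negate()
// exactly when x is +0.0.
inline double zero_minus(double x) { return 0.0 - x; }
```

Code that flips signs by manipulating the sign bit, or by multiplying with a signed zero, is therefore sensitive to whether the compiler preserves -0.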
### Impact: | |
- Ensures consistent and accurate inverse computations for 4x4 matrices, particularly in performance-tuned applications where -Ofast is employed. This reinforces the integrity of mathematical operations within the library." | |
1088 (https://gitlab.com/libeigen/eigen/-/merge_requests/1088),Replace assert with eigen_assert.,This is for consistency and ability to disable entirely.,Antonio Sánchez,2022-10-04T17:11:23.683Z,NA,NA,"## Title: | |
Replace assert with eigen_assert. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request replaces instances of `assert` with `eigen_assert` in the Eigen C++ library to enhance consistency and provide the option to disable assertions entirely. | |
### Key Changes: | |
- Replaced uses of the standard `assert` with `eigen_assert`. | |
### Improvements: | |
- Improved consistency across the codebase. | |
- These checks can now be disabled entirely along with Eigen's other assertions, for example by defining `EIGEN_NO_DEBUG`. | |
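The pattern behind a library-specific assert can be sketched with a pair of macros. The names below (`MYLIB_NO_DEBUG`, `mylib_assert`, `checked_div`) are hypothetical and only mirror the general idea; Eigen's actual macro definitions are not reproduced here.

```cpp
#include <cassert>

// Hypothetical library-level switch: defining MYLIB_NO_DEBUG compiles the
// checks out without touching the standard assert used by other code.
#ifdef MYLIB_NO_DEBUG
#define mylib_assert(cond) ((void)0)
#else
#define mylib_assert(cond) assert(cond)
#endif

// Example use: the precondition is checked only when asserts are enabled.
inline int checked_div(int a, int b) {
  mylib_assert(b != 0);
  return a / b;
}
```

Routing every check through one macro is what makes it possible to switch them all off, or to redirect them to a custom handler, in a single place.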
### Impact: | |
This change enhances flexibility for users by allowing them to control assertion behavior, potentially improving performance and debugging based on user needs." | |
1089 (https://gitlab.com/libeigen/eigen/-/merge_requests/1089),Unconditionally enable CXX11 math.,It should be supported on all compilers for C++14 and up.,Antonio Sánchez,2022-10-04T17:37:47.753Z,NA,NA,"## Title: | |
Unconditionally enable CXX11 math. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request enables CXX11 math features unconditionally across the Eigen library, ensuring compatibility with all compilers that support C++14 and later. | |
### Key Changes: | |
- Conditional checks for enabling CXX11 math have been removed. | |
- CXX11 math is now fully supported in all relevant compiler environments. | |
### Improvements: | |
- Simplified code management by eliminating compiler-specific conditions. | |
- Enhanced consistency in CXX11 math functionality across different environments. | |
### Impact: | |
- Broadens compatibility for C++14 and up, making it easier for users to leverage CXX11 math features without additional configurations. | |
- Improves the overall user experience and reduces potential errors related to compiler support." | |
1087 (https://gitlab.com/libeigen/eigen/-/merge_requests/1087),Simpler range reduction strategy for atan<float>().,"### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
### Additional information | |
<!--Any additional information you think is important.--> | |
This change saves a division and some `pselect` logic, in exchange for a couple of extra FMAs. The relative error is still <= 2 ulps, while speedup is 20-40% on x86. libeigen/eigen$2421160 | |
Unfortunately, the same change is not viable for `double` without going to a very high polynomial degree, negating the benefit. | |
Also, this change refactors the inner polynomial approximations for `atan<float>()` and `atan<double>()` to separate functions for future use in a more efficient implementation of `atan2()`.",Rasmus Munk Larsen,2022-10-04T18:11:01.153Z,NA,NA,"## Title: | |
Simpler range reduction strategy for atan<float>(). | |
## Authors: | |
Rasmus Munk Larsen | |
## Summary: | |
This merge request introduces a refined approach to the range reduction strategy for the `atan<float>()` function in the Eigen C++ library, leading to improved performance and reduced computational complexity. | |
### Key Changes: | |
- Implemented a simpler range reduction strategy that eliminates a division and some `pselect` logic. | |
- Introduced a few additional fused multiply-add (FMA) operations. | |
- Refactored the polynomial approximations for both `atan<float>()` and `atan<double>()` into separate functions, paving the way for a more efficient implementation of `atan2()`. | |
### Improvements: | |
- Achieved a speedup of 20-40% on x86 architectures while maintaining a relative error of less than or equal to 2 ulps for `atan<float>()`. | |
- Enhanced code organization and maintainability through the refactoring of polynomial approximations. | |
### Impact: | |
This change makes `atan<float>()` 20-40% faster on x86 while keeping the relative error at or below 2 ulps. The same range reduction is not viable for `double`, because it would require a very high polynomial degree that negates the benefit." | |
1091 (https://gitlab.com/libeigen/eigen/-/merge_requests/1091),[clang-format] Add a few macros to AttributeMacros,"<!-- | |
Thanks for contributing a merge request! Please name and fully describe your MR as you would for a commit message. | |
If the MR fixes an issue, please include ""Fixes #issue"" in the commit message and the MR description. | |
In addition, we recommend that first-time contributors read our [contribution guidelines](https://eigen.tuxfamily.org/index.php?title=Contributing_to_Eigen) and [git page](https://eigen.tuxfamily.org/index.php?title=Git), which will help you submit a more standardized MR. | |
Before submitting the MR, you also need to complete the following checks: | |
- Make one PR per feature/bugfix (don't mix multiple changes into one PR). Avoid committing unrelated changes. | |
- Rebase before committing | |
- For code changes, run the test suite (at least the tests that are likely affected by the change). | |
See our [test guidelines](https://eigen.tuxfamily.org/index.php?title=Tests). | |
- If possible, add a test (both for bug-fixes as well as new features) | |
- Make sure new features are documented | |
Note that we are a team of volunteers; we appreciate your patience during the review process. | |
Again, thanks for contributing! --> | |
### What does this implement/fix? | |
This improves the automated formatting with clang-format. I used this for !1090 since without it, some lines were formatted rather oddly.",Alexander Richardson,2022-10-10T16:44:47.781Z,NA,NA,"## Title: | |
[clang-format] Add a few macros to AttributeMacros | |
## Authors: | |
Alexander Richardson | |
## Summary: | |
This merge request adds a few macros to the `AttributeMacros` option of the clang-format configuration used by Eigen, so that clang-format treats them as attributes when formatting. The change was motivated by oddly formatted lines encountered while preparing !1090. | |
### Key Changes: | |
- Added a few macros to the `AttributeMacros` option of the clang-format configuration. | |
### Improvements: | |
- Enhances automated code formatting, resulting in cleaner and more consistent code style. | |
- Fixes the odd line formatting that clang-format produced without these macros. | |
### Impact: | |
The introduction of these macros will likely lead to a more streamlined codebase, reducing the likelihood of formatting inconsistencies and improving the overall developer experience when working with the Eigen library." | |
1092 (https://gitlab.com/libeigen/eigen/-/merge_requests/1092),Remove references to M_PI_2 and M_PI_4.,"### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
Fixes #2553 | |
### Additional information | |
<!--Any additional information you think is important.-->",Rasmus Munk Larsen,2022-10-11T00:27:17.517Z,NA,NA,"## Title: | |
Remove references to M_PI_2 and M_PI_4. | |
## Authors: | |
Rasmus Munk Larsen | |
## Summary: | |
This merge request removes all references to the constants M_PI_2 and M_PI_4 from the Eigen C++ library codebase, addressing issue #2553. | |
### Key Changes: | |
- Eliminated occurrences of M_PI_2 and M_PI_4 within the library. | |
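`M_PI_2` and `M_PI_4` are not part of standard C++; they come from POSIX, and some toolchains only expose them when an extra macro is defined. A minimal sketch of a portable replacement, assuming locally defined constants (the names `kHalfPi` and `kQuarterPi` are hypothetical and are not what the merge request uses):

```cpp
#include <cassert>
#include <cmath>

// Hypothetical constant names, for illustration only.
constexpr double kHalfPi = 1.57079632679489661923;     // pi / 2
constexpr double kQuarterPi = 0.78539816339744830962;  // pi / 4

// Sanity check against values computed by the standard library.
inline void pi_constants_demo() {
  assert(std::abs(kHalfPi - std::acos(0.0)) < 1e-15);
  assert(std::abs(kQuarterPi - std::atan(1.0)) < 1e-15);
}
```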
### Improvements: | |
- Streamlined the code by removing unnecessary dependencies on these constants. | |
### Impact: | |
- Increases code clarity and potentially improves portability across different platforms where these constants may not be defined." | |
1093 (https://gitlab.com/libeigen/eigen/-/merge_requests/1093),Handle NaN inputs to atan2.,"### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
### Additional information | |
<!--Any additional information you think is important.-->",Rasmus Munk Larsen,2022-10-11T17:35:52.548Z,NA,NA,"## Title: | |
Handle NaN inputs to atan2. | |
## Authors: | |
Rasmus Munk Larsen | |
## Summary: | |
This merge request addresses the handling of NaN (Not a Number) inputs in the `atan2` function of the Eigen C++ library. It ensures that NaN values are processed correctly, enhancing the robustness of the library. | |
### Key Changes: | |
- Implementation of proper handling for NaN inputs in the `atan2` function. | |
### Improvements: | |
- Improved reliability and accuracy of calculations involving `atan2` when NaN values are present. | |
### Impact: | |
- Enhances the stability of mathematical computations within the Eigen library, reducing the likelihood of unexpected behavior or errors when using NaN values as inputs." | |
1094 (https://gitlab.com/libeigen/eigen/-/merge_requests/1094),Eigen/Sparse: fix warnings -Wunused-but-set-variable,"### What does this implement/fix? | |
This merge-request fixes the following warnings, reported by clang 16.0.0git (compiled from the `main` branch of LLVM/clang) in `Eigen/Sparse`: | |
``` | |
/usr/include/eigen3/Eigen/src/SparseLU/SparseLU_heap_relax_snode.h:78:9: warning: variable 'nsuper_et_post' set but not used [-Wunused-but-set-variable] | |
Index nsuper_et_post = 0; // Number of relaxed snodes in postordered etree | |
^ | |
/usr/include/eigen3/Eigen/src/SparseLU/SparseLU_heap_relax_snode.h:79:9: warning: variable 'nsuper_et' set but not used [-Wunused-but-set-variable] | |
Index nsuper_et = 0; // Number of relaxed snodes in the original etree | |
^ | |
``` | |
``` | |
/usr/include/eigen3/Eigen/src/SparseCore/TriangularSolver.h:273:13: warning: variable 'count' set but not used [-Wunused-but-set-variable] | |
Index count = 0; | |
^ | |
``` | |
### Additional information: Context | |
Those warnings can be seen in context in the [testsuite page](https://cgal.geometryfactory.com/CGAL/testsuite/) of [CGAL (The Computational Geometry Algorithms Library)](https://www.cgal.org/) and in particular here: https://cgal.geometryfactory.com/CGAL/testsuite/CGAL-5.6-I-86/Heat_method_3/TestReport_gimeno_Debian-testing-clang-main.gz. | |
### Additional information: Testsuite passed locally | |
I have compiled and run tests successfully on my machine (x86_64, with compiler `clang version 14.0.5 (Fedora 14.0.5-1.fc36)`). | |
``` | |
100% tests passed, 0 tests failed out of 1121 | |
Label Time Summary: | |
Official = 8562.54 sec*proc (778 tests) | |
Unsupported = 1867.33 sec*proc (233 tests) | |
smoketest = 176.24 sec*proc (109 tests) | |
Total Test time (real) = 1041.58 sec | |
```",Laurent Rineau,2022-10-11T17:37:05.033Z,NA,NA,"## Title: | |
Eigen/Sparse: fix warnings -Wunused-but-set-variable | |
## Authors: | |
Laurent Rineau | |
## Summary: | |
This merge request addresses warnings in the Eigen/Sparse module related to unused variables. These warnings were reported by clang 16.0.0git and occur in specific source files, indicating some variables are set but not utilized in the code. | |
### Key Changes: | |
- Removed or modified the following unused variables: | |
- `nsuper_et_post` in `SparseLU_heap_relax_snode.h` | |
- `nsuper_et` in `SparseLU_heap_relax_snode.h` | |
- `count` in `TriangularSolver.h` | |
### Improvements: | |
- Enhanced code cleanliness by eliminating warnings related to unused variables, which can help prevent potential confusion and maintain a cleaner codebase. | |
### Impact: | |
- The resolution of these warnings contributes to better code quality and readability in the Eigen library. Additionally, it ensures that the library compiles without unnecessary warnings, which can aid in maintenance and future development efforts. The testsuite passed successfully, indicating that the changes did not introduce any issues." | |
1075 (https://gitlab.com/libeigen/eigen/-/merge_requests/1075),Don't use generic sign function for sign(complex) unless it is vectorizable,"### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
Don't use the generic psign function for sign(complex) unless the type is vectorizable, which implies that it has a data member `v` that we use to access complex packets as a vector of reals. | |
Bug reported on PowerPC without VSX support by Chip Kerchner. | |
### Additional information | |
<!--Any additional information you think is important.-->",Rasmus Munk Larsen,2022-10-12T16:03:30.585Z,NA,NA,"## Title: | |
Don't use generic sign function for sign(complex) unless it is vectorizable | |
## Authors: | |
Rasmus Munk Larsen | |
## Summary: | |
This merge request restricts when the generic `psign` function is used for `sign(complex)` in the Eigen C++ library: it is now used only when the complex type is vectorizable. It fixes a bug reported on PowerPC without VSX support. | |
### Key Changes: | |
- Stopped using the generic `psign` function for `sign(complex)` unless the complex type is vectorizable. | |
- Relies on the fact that vectorizable complex packets have a data member `v`, which is used to access the complex packet as a vector of reals. | |
### Improvements: | |
- Fixes the bug reported by Chip Kerchner on PowerPC without VSX support. | |
- Avoids accessing the `v` member on types that are not vectorizable and therefore may not have it. | |
### Impact: | |
- `sign(complex)` no longer depends on the generic `psign` path for non-vectorizable complex types, while vectorizable types keep using it. This resolves the reported bug on the affected platform." | |
1095 (https://gitlab.com/libeigen/eigen/-/merge_requests/1095),"Refactor special values test for pow, and add a similar test for atan2","### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
### Additional information | |
<!--Any additional information you think is important.-->",Rasmus Munk Larsen,2022-10-12T20:12:09.584Z,NA,NA,"## Title: | |
Refactor special values test for pow, and add a similar test for atan2 | |
## Authors: | |
Rasmus Munk Larsen | |
## Summary: | |
This merge request involves a refactor of the existing tests for special values related to the `pow` function and introduces a new test for the `atan2` function in the Eigen C++ library. The changes aim to enhance the robustness and coverage of the library's mathematical functions. | |
### Key Changes: | |
- Refactored the special values test for the `pow` function to improve clarity and maintainability. | |
- Added a new test case for the `atan2` function to address special value scenarios. | |
### Improvements: | |
- Increased code readability and organization in the `pow` testing. | |
- Expanded test coverage for mathematical functions, ensuring better handling of edge cases. | |
### Impact: | |
These changes enhance the reliability of the Eigen library's mathematical operations, as they now account for special values in both `pow` and `atan2`. This can lead to fewer unexpected behaviors and errors when users perform mathematical computations using these functions." | |
1096 (https://gitlab.com/libeigen/eigen/-/merge_requests/1096),Fix bug atan2,"### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
Using a packet with just a single bit set (in this case the sign bit) as the predicate for `pselect` does not work on some platforms. | |
### Additional information | |
<!--Any additional information you think is important.-->",Rasmus Munk Larsen,2022-10-12T23:49:33.472Z,NA,NA,"## Title: | |
Fix bug atan2 | |
## Authors: | |
Rasmus Munk Larsen | |
## Summary: | |
This merge request fixes a bug in the `atan2` function in the Eigen C++ library. A packet with only a single bit set (here, the sign bit) was used as the predicate for `pselect`, which does not work on some platforms. | |
### Key Changes: | |
- Changed `atan2` so that it no longer passes a packet with only the sign bit set as the `pselect` predicate. | |
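A scalar sketch of why a sign-bit-only predicate can fail, assuming the target implements the select as a purely bitwise blend (the merge request does not say which platforms are affected, and the helper names here are hypothetical). With a bitwise blend, only the bits that are set in the mask are taken from the first operand, so the mask has to be all ones to select the whole value:

```cpp
#include <cassert>
#include <cstdint>

// Bitwise select: take bits of a where mask is 1, bits of b where mask is 0.
constexpr std::uint32_t bitwise_select(std::uint32_t mask, std::uint32_t a, std::uint32_t b) {
  return (mask & a) | (~mask & b);
}

// Expand the sign bit of m into a full mask (all ones or all zeros).
constexpr std::uint32_t expand_sign_bit(std::uint32_t m) {
  return (m & 0x80000000u) ? 0xFFFFFFFFu : 0u;
}

inline void select_demo() {
  const std::uint32_t sign_only = 0x80000000u;  // predicate with a single bit set
  const std::uint32_t a = 0x3F800000u;          // bit pattern of 1.0f
  const std::uint32_t b = 0x40000000u;          // bit pattern of 2.0f
  // Sign-bit-only mask: almost every bit still comes from b.
  assert(bitwise_select(sign_only, a, b) == b);
  // Full mask: the whole value comes from a, as intended.
  assert(bitwise_select(expand_sign_bit(sign_only), a, b) == a);
}
```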
### Improvements: | |
- Ensured that the `atan2` function operates correctly across different platforms, particularly those where a single-bit predicate for `pselect` does not work. | |
### Impact: | |
- Enhances the reliability and cross-platform compatibility of the Eigen library, thereby improving its overall robustness in mathematical computations involving the `atan2` function." | |
1099 (https://gitlab.com/libeigen/eigen/-/merge_requests/1099),Explicitly state that indices must be sorted.,Fixes #2558.,Antonio Sánchez,2022-10-19T18:15:29.847Z,NA,NA,"## Title: Explicitly state that indices must be sorted. | |
## Authors: Antonio Sánchez | |
## Summary: This merge request addresses issue #2558 by clarifying the requirement that indices must be sorted within the Eigen C++ library documentation and relevant code areas. | |
### Key Changes: | |
- Updated documentation to explicitly state that indices used must be sorted. | |
### Improvements: | |
- Enhanced clarity in the library's documentation, thereby reducing potential user errors related to unsorted indices. | |
### Impact: | |
- Users will have a better understanding of the index requirements, leading to fewer errors and improved usability of the library." | |
1101 (https://gitlab.com/libeigen/eigen/-/merge_requests/1101),Change handmade_aligned_malloc/realloc/free to store a 1 byte offset instead of absolute address,"### Reference issue | |
Fixes #2554 | |
### What does this implement/fix? | |
The `handmade_aligned_malloc` family of functions assumes that `malloc` will return an address that is aligned to at least `sizeof(void*)`, i.e. `alignof(max_align_t) % sizeof(void*) == 0`. If this is not true, there may be insufficient space to store `original` in the offset between `aligned` and `original`. | |
One solution is to store a 1-byte offset `aligned-original` at `aligned-1` and deduce `original` when needed for `realloc` and `free`. This should work for `alignment<256` and any `alignof(max_align_t)`. | |
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2022-10-22T22:51:32.357Z,NA,NA,"## Title: | |
Change handmade_aligned_malloc/realloc/free to store a 1 byte offset instead of absolute address | |
## Authors: | |
Charles Schlosser | |
## Summary: | |
This merge request modifies the `handmade_aligned_malloc` functions to improve memory address alignment handling. The existing approach was at risk of failing when the allocated address wasn't adequately aligned for storing the original address needed for memory management. | |
### Key Changes: | |
- Transformation of the `handmade_aligned_malloc`, `realloc`, and `free` functions to utilize a 1-byte offset (`aligned-original`) instead of storing the absolute original address. | |
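The 1-byte offset scheme described above can be sketched as follows. This is an illustration under stated assumptions, not Eigen's implementation: the function names are hypothetical, the block is over-allocated by `alignment` bytes, and `alignment` is assumed to be a power of two below 256 so that the offset fits in one byte.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical helper: returns a pointer aligned to `alignment` and stores
// the distance back to the malloc'ed block in the byte just before it.
inline void* sketch_aligned_malloc(std::size_t size, std::size_t alignment) {
  assert(alignment > 0 && alignment < 256 && (alignment & (alignment - 1)) == 0);
  unsigned char* original = static_cast<unsigned char*>(std::malloc(size + alignment));
  if (original == nullptr) return nullptr;
  const std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(original);
  // Offset is in [1, alignment], so aligned - 1 always lies inside the block.
  const std::size_t offset = alignment - (addr & (alignment - 1));
  unsigned char* aligned = original + offset;
  aligned[-1] = static_cast<unsigned char>(offset);
  return aligned;
}

// Hypothetical helper: recovers the original pointer from the stored offset.
inline void sketch_aligned_free(void* ptr) {
  if (ptr == nullptr) return;
  unsigned char* aligned = static_cast<unsigned char*>(ptr);
  std::free(aligned - aligned[-1]);
}
```

Because only one byte is needed in front of the aligned pointer, the scheme does not depend on `malloc` returning an address aligned to `sizeof(void*)`.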
### Improvements: | |
- The new implementation addresses potential alignment issues when `malloc` does not yield an adequately aligned address, ensuring compatibility across different architectures and alignment requirements. | |
### Impact: | |
- This change enhances the reliability of memory management in the Eigen library, making it more robust against alignment failures that could lead to undefined behavior in the handling of memory blocks." | |
1105 (https://gitlab.com/libeigen/eigen/-/merge_requests/1105),Fix pragma check for disabling fastmath.,Fixes #2563.,Antonio Sánchez,2022-10-26T22:50:58.352Z,NA,NA,"## Title: | |
Fix pragma check for disabling fastmath. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses issue #2563 by fixing the pragma check responsible for disabling fastmath in the Eigen C++ library. | |
### Key Changes: | |
- Corrected the pragma directive to ensure it properly disables fastmath. | |
### Improvements: | |
- Enhances the reliability of fastmath functionality, ensuring that it can be effectively turned off as intended. | |
### Impact: | |
- Improves code robustness and compiler compatibility, paving the way for better numerical stability in computations that rely on the disabling of fastmath optimizations." | |
1102 (https://gitlab.com/libeigen/eigen/-/merge_requests/1102),Add assert for invalid outerIndexPtr array in SparseMapBase.,"The outer index array must have size equal to `outerSize + 1`, with the | |
last element being the size of the `valuePtr` array. | |
Fixes #2561.",Antonio Sánchez,2022-10-26T22:51:34.067Z,NA,NA,"## Title: | |
Add assert for invalid outerIndexPtr array in SparseMapBase. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request introduces an assertion in the SparseMapBase class to ensure that the outer index array is correctly sized. Specifically, it enforces that the outer index array must have a size equal to `outerSize + 1`, with the last element indicating the size of the `valuePtr` array. | |
### Key Changes: | |
- Added an assertion to validate the size of the outer index array in SparseMapBase. | |
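The invariant being asserted can be written as a small standalone check. This is a sketch over plain arrays rather than Eigen types; the function name and signature are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical check: the outer index array of a compressed sparse matrix
// must have outer_size + 1 entries, and its last entry must equal the
// number of stored values (the size of the value array).
inline bool outer_index_is_valid(const std::vector<int>& outer_index,
                                 std::size_t outer_size, std::size_t value_count) {
  if (outer_index.size() != outer_size + 1) return false;
  if (outer_index.back() < 0) return false;
  return static_cast<std::size_t>(outer_index.back()) == value_count;
}
```

For example, a matrix with 3 outer vectors and 4 stored values could use the outer index array `{0, 2, 3, 4}`; an array such as `{0, 2, 3}` is one entry short and fails the check.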
### Improvements: | |
- Enhanced input validation for the SparseMapBase class, ensuring that incorrect configurations are caught early in the execution. | |
### Impact: | |
- This change improves the robustness of the SparseMapBase implementation by preventing potential runtime errors related to misconfigured outer index arrays." | |
1106 (https://gitlab.com/libeigen/eigen/-/merge_requests/1106),Fix handmade_aligned_malloc offset computation.,"Turns out `-` takes precedence over `&`, which was causing a bunch of | |
compiler warnings. We have an oss-fuzz bug reported via chromium as | |
well, reporting writing to out-of-bounds memory, and I *think* this solves | |
that issue.",Antonio Sánchez,2022-10-27T17:33:47.768Z,NA,NA,"## Title: | |
Fix handmade_aligned_malloc offset computation. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses an issue with the `handmade_aligned_malloc` function where the precedence of the `-` operator over the `&` operator led to compiler warnings and potential out-of-bounds memory writes, as reported by an OSS-Fuzz bug in Chromium. | |
### Key Changes: | |
- Modified the offset computation in `handmade_aligned_malloc` to ensure correct operator precedence. | |
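To illustrate the precedence issue: in C++, binary `-` binds more tightly than `&`, so an expression written as `alignment - addr & mask` is parsed as `(alignment - addr) & mask`. The sketch below uses a hypothetical offset formula (the exact expression in Eigen is not quoted in this record) and shows that the two readings differ precisely when the address is already aligned:

```cpp
#include <cassert>
#include <cstdint>

// Intended computation: distance from addr to the next aligned address,
// in the range [1, alignment].
constexpr std::uintptr_t offset_intended(std::uintptr_t addr, std::uintptr_t alignment) {
  return alignment - (addr & (alignment - 1));
}

// How the expression is parsed when the inner parentheses are omitted.
constexpr std::uintptr_t offset_as_parsed(std::uintptr_t addr, std::uintptr_t alignment) {
  return (alignment - addr) & (alignment - 1);
}

inline void precedence_demo() {
  // Unaligned address: both readings happen to agree.
  assert(offset_intended(0x1008, 16) == 8);
  assert(offset_as_parsed(0x1008, 16) == 8);
  // Already aligned address: the readings differ (16 versus 0).
  assert(offset_intended(0x1000, 16) == 16);
  assert(offset_as_parsed(0x1000, 16) == 0);
}
```

Under this assumed formula, an offset of 0 would leave no room in front of the aligned pointer for the stored offset byte, which is consistent with the out-of-bounds write mentioned in the description.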
### Improvements: | |
- Reduces compiler warnings associated with operator precedence. | |
- Potentially resolves out-of-bounds memory writing issues, enhancing overall memory safety. | |
### Impact: | |
This change improves the stability and safety of memory allocation in the Eigen C++ library, thereby reducing the risk of memory-related bugs." | |
1107 (https://gitlab.com/libeigen/eigen/-/merge_requests/1107),Disable patan for double on PPC.,"It's not defined, leading to build failures.",Antonio Sánchez,2022-10-27T17:56:09.225Z,NA,NA,"## Title: Disable patan for double on PPC | |
## Authors: Antonio Sánchez | |
## Summary: | |
This merge request addresses a build issue in the Eigen C++ library by disabling the `patan` function for double precision on PowerPC (PPC) architecture, as it is not defined. | |
### Key Changes: | |
- Disabled the `patan` function for double precision on PPC. | |
### Improvements: | |
- Rectified build failures associated with the undefined `patan` function on PPC. | |
### Impact: | |
- Enhances the compatibility of the Eigen library on PPC architecture by preventing build errors related to unsupported functions." | |
1100 (https://gitlab.com/libeigen/eigen/-/merge_requests/1100),Allow empty matrices to be resized.,"Previously we had the storage fixed to 0-by-0 if the compile-time size | |
of the storage was 0, but this conflicts with the compile-time | |
matrix size. This was preventing dynamic empty matrices from being | |
properly resized, and causing the matrix's number of rows | |
and columns to be misreported. | |
Fixes #2557.",Antonio Sánchez,2022-10-27T20:33:36.040Z,NA,NA,"## Title: | |
Allow empty matrices to be resized. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses an issue with the resizing of dynamic empty matrices in the Eigen C++ library. Previously, empty matrices were restricted to a fixed 0-by-0 storage if their compile-time size was also 0, leading to conflicts and inaccuracies in the reported number of rows and columns. | |
### Key Changes: | |
- Enabled resizing of dynamic empty matrices. | |
- Resolved conflict between storage size and compile-time matrix size. | |
### Improvements: | |
- Enhanced flexibility for users working with empty matrices. | |
- Improved accuracy in the reporting of matrix dimensions. | |
### Impact: | |
This change allows for better handling of dynamic empty matrices, thereby improving usability and functionality within the Eigen library. It directly addresses issue #2557, leading to a more robust matrix management system." | |
1110 (https://gitlab.com/libeigen/eigen/-/merge_requests/1110),Remove unused parameter name.,NA,Antonio Sánchez,2022-11-01T23:13:50.521Z,NA,NA,"## Title: | |
Remove unused parameter name. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses the removal of an unused parameter name in the Eigen C++ library's codebase. | |
### Key Changes: | |
- The unnecessary parameter name has been removed from the relevant function. | |
### Improvements: | |
- Streamlining the codebase by eliminating unused elements, which enhances code readability and maintainability. | |
### Impact: | |
- This change reduces code clutter and, as is typical for this kind of cleanup, avoids unused-parameter compiler warnings. It does not affect runtime behavior." | |
1109 (https://gitlab.com/libeigen/eigen/-/merge_requests/1109),Remove recently added sparse assert in SparseMapBase.,"Turns out we have existing use-cases where we use the map to populate | |
the sparse matrix, so the map may not be valid on construction here.",Antonio Sánchez,2022-11-03T17:29:06.295Z,NA,NA,"## Title: | |
Remove recently added sparse assert in SparseMapBase. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request removes the recently added assert in the SparseMapBase class that checked the validity of the sparse map upon construction. There are existing use cases where the map itself is used to populate the sparse matrix, so the map may not yet be valid when it is constructed. | |
### Key Changes: | |
- Removed an assertion that required the sparse map to be valid on construction. | |
### Improvements: | |
- Allows for greater flexibility in using SparseMapBase in real-world use cases where the map is populated after construction. | |
### Impact: | |
- Existing code that constructs a map before the underlying data is fully populated no longer triggers an assertion failure." | |
1097 (https://gitlab.com/libeigen/eigen/-/merge_requests/1097),Add signbit function,"### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
Checking the sign of a floating point value appears trivial, except when checking for -0. This is necessary for some frequently used math functions, such as `pow`. A consistent and efficient signbit will simplify implementation and improve performance of many floating point functions. This may also be a faster method of checking for negative values, as it uses shift operations instead of floating point comparisons. | |
AVX arithmetic shift: `_mm256_srai_epi32`, latency 1, CPI: 0.5 | |
AVX floating point compare: `_mm256_cmp_ps`, latency 4, CPI: 0.5 | |
`std::signbit` checks if the leading bit is set and returns a `bool`. These functions return a bitmask (all 1's for true, all 0's for false) of the same type so that it may be used for logical operations (but will evaluate to the same boolean value). | |
Also added an AVX2 packet op to perform arithmetic shift of `int64_t` packets -- `Packet4l`. | |
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2022-11-04T00:31:21.802Z,NA,NA,"## Title: | |
Add signbit function | |
## Authors: | |
Charles Schlosser | |
## Summary: | |
This merge request introduces a new `signbit` function to the Eigen C++ library, aimed at efficiently determining the sign of floating point values, with special consideration for the representation of -0. The implementation focuses on enhancing performance in various mathematical operations. | |
### Key Changes: | |
- Added a `signbit` function that utilizes bit manipulation for sign checking. | |
- Implemented an AVX2 packet operation for performing arithmetic shifts on `int64_t` packets. | |
- Provides a bitmask output for the sign check, enhancing usability in logical operations. | |
### Improvements: | |
- Increased efficiency in checking the sign of floating point numbers, particularly with -0 representation. | |
- Potential performance boost in frequently used math functions like `pow` due to simplified implementation logic. | |
- Reduced latency through the use of arithmetic shifts compared to floating point comparisons. | |
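As a scalar illustration of the bitmask idea above, the following is a minimal sketch and not Eigen's implementation. The name `signbit_mask` is hypothetical, and the sketch returns an integer mask rather than a value of the input type to keep it portable.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper (not Eigen's API): returns all ones when the sign bit
// of x is set and all zeros otherwise.
inline std::uint32_t signbit_mask(float x) {
  std::int32_t bits;
  std::memcpy(&bits, &x, sizeof(bits));  // reinterpret the float's bit pattern
  // An arithmetic right shift by 31 replicates the sign bit into every
  // position. Right-shifting a negative signed value is arithmetic on
  // mainstream compilers and is guaranteed from C++20 onwards.
  return static_cast<std::uint32_t>(bits >> 31);
}
```

Unlike the comparison `x < 0`, this distinguishes -0 from +0, because it inspects the sign bit directly instead of comparing values.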
### Impact: | |
The introduction of this function is expected to improve the performance of mathematical functions within the Eigen library, catering to applications that require precise sign checks of floating point values, thereby enhancing overall computational efficiency." | |
1111 (https://gitlab.com/libeigen/eigen/-/merge_requests/1111),fix neon,"### Reference issue | |
### What does this implement/fix? | |
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2022-11-08T20:03:02.287Z,NA,NA,"## Title: | |
Fix Neon | |
## Authors: | |
Charles Schlosser | |
## Summary: | |
This merge request fixes an issue in the NEON (ARM SIMD) code path of the Eigen C++ library. The MR description is empty, so the specific defect is not stated. | |
### Key Changes: | |
- Corrected the NEON implementation; no further detail is given in the MR. | |
### Improvements: | |
- Restores the intended behavior of the affected NEON code on ARM targets. | |
### Impact: | |
The fix benefits applications that use Eigen's NEON code path on ARM platforms." | |
1112 (https://gitlab.com/libeigen/eigen/-/merge_requests/1112),Fix typo in CholmodSupport,"Whadyaknow, it is pretty easy to edit in the gui. | |
Fixes #2566.",Antonio Sánchez,2022-11-08T23:49:56.815Z,NA,NA,"## Title: | |
Fix typo in CholmodSupport | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request focuses on correcting a typo in the CholmodSupport module of the Eigen C++ library. | |
### Key Changes: | |
- Fixed a typo in the CholmodSupport code. | |
### Improvements: | |
- Enhances code readability and accuracy. | |
### Impact: | |
- Contributes to overall code quality by eliminating errors, which may improve user experience and reduce confusion in the module." | |
1118 (https://gitlab.com/libeigen/eigen/-/merge_requests/1118),Fix ambiguity in PPC for vec_splats call.,"`uint64_t` is `unsigned long` in clang, but the IBM intrinsic is only defined for `unsigned long long`.",Antonio Sánchez,2022-11-14T18:58:16.974Z,NA,NA,"## Title: | |
Fix ambiguity in PPC for vec_splats call. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses an ambiguity in the PowerPC (PPC) implementation concerning the `vec_splats` call, specifically with the treatment of the `uint64_t` type. | |
### Key Changes: | |
- Modified the implementation to ensure compatibility with IBM's intrinsic, which requires `unsigned long long` instead of `unsigned long`. | |
### Improvements: | |
- Clarified type usage in PPC code to prevent potential ambiguity during compilation with clang. | |
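The ambiguity can be reproduced without PowerPC intrinsics. The sketch below is an illustration under assumptions: `splat_sketch` and `splat_u64` are hypothetical stand-ins for an intrinsic that is overloaded only on the `long long` family, and they are not the IBM `vec_splats` or Eigen's code.

```cpp
#include <cstdint>

// Hypothetical stand-ins for an intrinsic overloaded only on long long types.
inline int splat_sketch(long long) { return 1; }
inline int splat_sketch(unsigned long long) { return 2; }

// Where uint64_t is unsigned long, calling splat_sketch(value) directly is
// ambiguous: both overloads require an integral conversion of equal rank.
// The explicit cast selects the unsigned overload on every platform.
inline int splat_u64(std::uint64_t value) {
  return splat_sketch(static_cast<unsigned long long>(value));
}
```

The cast is a no-op on platforms where `uint64_t` is already `unsigned long long`, so the same call compiles everywhere.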
### Impact: | |
- Ensures correct functionality of the `vec_splats` function on PPC architectures, enhancing compatibility and reducing errors related to type definitions." | |
1119 (https://gitlab.com/libeigen/eigen/-/merge_requests/1119),Put brackets around unsigned type names.,NA,Antonio Sánchez,2022-11-15T17:32:32.188Z,NA,NA,"## Title: | |
Put brackets around unsigned type names. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request introduces changes to the Eigen C++ library by implementing bracket notation around unsigned type names. This adjustment aims to improve code clarity and maintain consistency in type declarations. | |
### Key Changes: | |
- Brackets have been added around all unsigned type names in the codebase. | |
### Improvements: | |
- Enhanced readability of type declarations. | |
- Consistent style applied across the library, making it easier for contributors to understand and modify code. | |
### Impact: | |
- The modification aims to improve code maintainability and collaboration among developers by fostering a more uniform coding style, which may reduce potential errors during development." | |
1116 (https://gitlab.com/libeigen/eigen/-/merge_requests/1116),Correct pnegate for floating-point zero.,"The original formulation of `0 - x` incorrectly handles +/-0 for | |
floating point numbers. We need to instead flip the sign bit | |
explicitly.",Antonio Sánchez,2022-11-15T18:07:24.310Z,NA,NA,"## Title: | |
Correct pnegate for floating-point zero. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses an issue in the Eigen C++ library related to the handling of floating-point zero in the pnegate function. The previous implementation of negation, specifically `0 - x`, did not correctly account for the distinct representations of positive and negative zero. | |
### Key Changes: | |
- The implementation now flips the sign bit explicitly, so that negating +0 yields -0 and vice versa, whereas `0 - x` returns +0 for an input of +0. | |
### Improvements: | |
- Enhanced accuracy in the negation of floating-point numbers. | |
- Improved compliance with floating-point arithmetic standards. | |
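The difference between the two formulations shows up only at zero. The following scalar sketch uses hypothetical helper names and is not Eigen's packet code; it contrasts `0 - x` with an explicit sign-bit flip.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Negation written as a subtraction: 0 - (+0) evaluates to +0 under the
// default rounding mode, so the sign of zero is lost.
inline float negate_by_subtraction(float x) { return 0.0f - x; }

// Negation by toggling the IEEE 754 sign bit: correct for every input,
// including both zeros.
inline float negate_by_sign_flip(float x) {
  std::uint32_t bits;
  std::memcpy(&bits, &x, sizeof(bits));
  bits ^= 0x80000000u;  // flip only the sign bit
  std::memcpy(&x, &bits, sizeof(x));
  return x;
}
```

For nonzero finite inputs both functions agree; they differ for +0, where only the sign flip produces -0.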
### Impact: | |
- This change corrects a potential source of errors in numerical computations involving floating-point zero, leading to more reliable and predictable behavior in mathematical operations using the Eigen library." | |
1098 (https://gitlab.com/libeigen/eigen/-/merge_requests/1098),Cross product for vectors of size 2. Fixes #1037,"### Reference issue | |
Fixes #1037 | |
### What does this implement/fix? | |
Implements cross product for vectors of size 2. | |
The result is a scalar equal to the signed area of the parallelogram spanned by the input vectors. | |
Or, to put it differently, the cross product of (v1, v2) and (w1, w2) is the 3rd coordinate of the cross product of (v1, v2, v3) and (w1, w2, w3). | |
### Additional information | |
This is my first time contributing to Eigen! I tried to do as best as I could, but I still have some doubts | |
- I think we can consider this MR fixes #1037 because the issue specifically requests a cross product of 2D vectors. Actually, that issue seems to be a little bit wider, also proposing a more general implementation for arbitrary-size vectors returning skew-symmetric matrices, possibly named `wedge`. This is not implemented by this MR, but since the main focus of that issue seems to be 2D vectors I think it can be closed. A comment in the thread mentions `numpy`, allowing arbitrary combinations of 2D and 3D vectors, where a 2D vector is interpreted as a 3D vector with vanishing z-component. This behavior is not implemented in this MR either; the cross product is only defined between vectors of the same size | |
- A few words about `cross_product_return_type` | |
- As a shift in paradigm, I took its definition out of `MatrixBase`, turning it from a nested struct into an independent templated struct. I thought it more elegant. This change is not backward compatible, but I took this liberty because the struct was undocumented anyways, and also it was used nowhere in the code base. If requested, I can easily revert this change. | |
- I decided to document it as a type rather than hide it behind something like `PlainObject` as it was previously done. It makes it easier to explain the output. Also, I didn't like the idea of obfuscating the doc | |
- I am not sure about naming conventions, especially now that it's a public type. Should it be `CrossProductReturnType::Type` instead of `cross_product_return_type::type`? Maybe it could even be renamed to something completely different such as `CrossProductTraits::ReturnType`? | |
- In passing, I removed a seemingly-orphaned forward declaration of a class called `Cross` which looked like a relic, as it wasn't used anywhere in the code base and indeed I could not even find its full declaration. Again, if requested, I can put it back | |
- A simpler implementation (mentioned in #1037) would have consisted in giving the method for size-2 vectors a different name, like `cross2`. However, I decided to call it `cross` because I liked it better as an interface, which seems to be shared by other important resources such as `numpy`. Also, I actively enjoyed template metaprogramming!",Gabriele Buondonno,2022-11-15T22:39:43.056Z,NA,NA,"## Title: | |
Cross product for vectors of size 2. Fixes #1037 | |
## Authors: | |
Gabriele Buondonno | |
## Summary: | |
This merge request implements the cross product for 2-dimensional vectors in the Eigen C++ library, addressing issue #1037. The new functionality computes a scalar that represents the signed area of the parallelogram spanned by the input vectors. | |
### Key Changes: | |
- Introduced a cross product function for 2D vectors. | |
- Defined the result as a scalar reflecting the third coordinate of the corresponding 3D cross product. | |
- Refactored the `cross_product_return_type` from a nested struct within `MatrixBase` to an independent templated struct, enhancing its elegance and documentation. | |
- Removed an unused forward declaration of a class named `Cross`. | |
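The scalar described above is the quantity v1*w2 - v2*w1. A minimal standalone sketch follows; `cross2` is a hypothetical free function on `std::array`, not the `cross` member that the MR adds to Eigen.

```cpp
#include <array>

// Hypothetical helper: the 2D cross product, i.e. the third coordinate of the
// 3D cross product of (v1, v2, 0) and (w1, w2, 0).
template <typename Scalar>
Scalar cross2(const std::array<Scalar, 2>& v, const std::array<Scalar, 2>& w) {
  return v[0] * w[1] - v[1] * w[0];
}
```

The sign encodes orientation: the result is positive when `w` lies counterclockwise from `v`, and swapping the arguments negates it.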
### Improvements: | |
- The new cross product function allows for clearer and more intuitive usage of 2D vectors in mathematical operations, aligning with functionality seen in other libraries like `numpy`. | |
- The refactor of `cross_product_return_type` improves the type’s clarity and usability as a public type, aiding documentation. | |
### Impact: | |
The addition of the cross product for 2D vectors enriches the Eigen library's capabilities, making it more versatile for users dealing with 2D geometrical calculations. The adjustments to type structuring also improve the codebase’s organization and documentation, though some changes may affect backward compatibility." | |
1113 (https://gitlab.com/libeigen/eigen/-/merge_requests/1113),Fix duplicate execution code for Power 8 Altivec in pstore_partial.,Fix duplicate execution code for Power 8 Altivec in pstore_partial.,Chip Kerchner,2022-11-16T13:41:44.026Z,NA,NA,"## Title: | |
Fix duplicate execution code for Power 8 Altivec in pstore_partial. | |
## Authors: | |
Chip Kerchner | |
## Summary: | |
This merge request addresses an issue related to redundant execution code specifically for Power 8 Altivec within the `pstore_partial` function. | |
### Key Changes: | |
- Resolved duplication in the execution code for Power 8 Altivec. | |
### Improvements: | |
- Streamlined code in the `pstore_partial` function, enhancing clarity and maintainability. | |
### Impact: | |
- Reduction of code redundancy may lead to improved performance and easier future updates within the Power 8 Altivec implementation." | |
1115 (https://gitlab.com/libeigen/eigen/-/merge_requests/1115),Fix AVX2 psignbit,"### Reference issue | |
Fixes #2569 | |
Thank you Ogre Transporter for identifying this | |
### What does this implement/fix? | |
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2022-11-16T13:43:11.755Z,NA,NA,"## Title: | |
Fix AVX2 psignbit | |
## Authors: | |
Charles Schlosser | |
## Summary: | |
This merge request addresses a bug in the AVX2 implementation of the `psignbit` function, fixing an issue identified in issue #2569. | |
### Key Changes: | |
- Corrected the implementation of the `psignbit` function to ensure accurate behavior under AVX2. | |
### Improvements: | |
- Enhanced the reliability and correctness of operations involving `psignbit` when utilizing AVX2 instructions, which may benefit performance and ensure expected results in applications relying on this functionality. | |
### Impact: | |
This fix resolves a critical issue that could lead to incorrect results when using the `psignbit` function, thus improving the overall robustness of the Eigen library's AVX2 capabilities. Users relying on this function can expect improved accuracy in their computations." | |
1117 (https://gitlab.com/libeigen/eigen/-/merge_requests/1117),Small cleanup of IDRS.h,"Removed a set-but-unused variable in IDRS.h and cleaned up the odd line breaks in the comments on lines 94-100.",Chris,2022-11-16T13:51:23.918Z,NA,NA,"## Title: | |
Small cleanup of IDRS.h | |
## Authors: | |
Chris | |
## Summary: | |
This merge request focuses on minor improvements within the IDRS.h file of the Eigen C++ library, specifically addressing code cleanliness and readability. | |
### Key Changes: | |
- Removed an unused variable from IDRS.h. | |
- Cleaned up inconsistent line breaks in comments located on lines 94-100. | |
### Improvements: | |
- The removal of unused variables reduces potential confusion and clutter in the codebase. | |
- Improved readability of comments enhances maintainability and understanding for future contributors. | |
### Impact: | |
These changes contribute to a cleaner and more maintainable codebase, facilitating easier navigation and comprehension for developers working with the IDRS.h file." | |
1120 (https://gitlab.com/libeigen/eigen/-/merge_requests/1120),Fix bug in handmade_aligned_realloc,"### Reference issue | |
### What does this implement/fix? | |
Addresses two bugs in `handmade_aligned_realloc`: | |
1) `memmove` now copies `min(old_size, new_size)` bytes instead of `new_size` bytes (the parameter formerly named `size`), so that it does not overrun the bounds of the allocated memory. If `new_size > old_size`, then the array was expanded, and we need to copy the entire original array (`old_size`). If `new_size < old_size`, then the array was shrunk, and we need to copy the entire new array (`new_size`). Thus, in general, we need the minimum of these two sizes. We explicitly avoid the case where `new_size == old_size`, as the behavior is possibly undefined (at best, this would result in a no-op anyway). | |
2) If `std::realloc` returns a new address, the reallocation is performed by ""allocating a new memory block of size `new_size` bytes, copying memory area with size equal the **lesser of the new and the old sizes**, and **freeing** the old block."" Therefore, there is no guarantee that the memory still exists at `ptr`/`old_original`, and should instead be copied from `original`. | |
https://en.cppreference.com/w/cpp/memory/c/realloc | |
Also added an additional condition that `alignment <= 128`. Although a byte can store a maximum offset of `255`, `128` is the largest such power of two. If this becomes insufficient, we can set aside two bytes for the offset with a little more effort. | |
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2022-11-18T22:35:32.048Z,NA,NA,"## Title: | |
Fix bug in handmade_aligned_realloc | |
## Authors: | |
Charles Schlosser | |
## Summary: | |
This merge request addresses two critical bugs in the `handmade_aligned_realloc` function of the Eigen C++ library to ensure proper memory management during reallocation. | |
### Key Changes: | |
- Adjusted the `memmove` operation to copy the correct number of bytes, using `min(old_size, new_size)` to prevent memory overruns. | |
- When `std::realloc` returns a new address, the data is now copied from the new block (`original`) rather than from the old block, which `std::realloc` may already have freed. | |
- Added a condition to limit `alignment` to a maximum of `128`. | |
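The three points above can be sketched with a hand-rolled aligned allocator that stores its offset in the byte just before the aligned pointer. This is an illustration under assumptions, not Eigen's `handmade_aligned_realloc`: every name below is hypothetical, and only the one-byte offset scheme, the `min(old_size, new_size)` copy, and the `alignment <= 128` limit are taken from the MR description.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Hypothetical helper: alignment must be a power of two no larger than 128 so
// that the offset (1..alignment) fits in a single byte.
inline bool alignment_ok(std::size_t alignment) {
  return alignment != 0 && alignment <= 128 && (alignment & (alignment - 1)) == 0;
}

// Distance from the raw block to the next aligned address, always in 1..alignment.
inline std::uint8_t offset_for(const void* original, std::size_t alignment) {
  const std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(original);
  return static_cast<std::uint8_t>(alignment - (addr & (alignment - 1)));
}

inline void* sketch_aligned_malloc(std::size_t size, std::size_t alignment) {
  if (!alignment_ok(alignment)) return nullptr;
  std::uint8_t* original = static_cast<std::uint8_t*>(std::malloc(size + alignment));
  if (original == nullptr) return nullptr;
  const std::uint8_t offset = offset_for(original, alignment);
  original[offset - 1] = offset;  // remember the offset just before the payload
  return original + offset;
}

inline void* sketch_aligned_realloc(void* ptr, std::size_t new_size,
                                    std::size_t old_size, std::size_t alignment) {
  if (ptr == nullptr) return sketch_aligned_malloc(new_size, alignment);
  if (!alignment_ok(alignment)) return nullptr;
  std::uint8_t* old_aligned = static_cast<std::uint8_t*>(ptr);
  const std::uint8_t old_offset = old_aligned[-1];
  std::uint8_t* original = static_cast<std::uint8_t*>(
      std::realloc(old_aligned - old_offset, new_size + alignment));
  if (original == nullptr) return nullptr;
  const std::uint8_t offset = offset_for(original, alignment);
  if (offset != old_offset) {
    // realloc already copied the payload, which now sits at original + old_offset.
    // Read only from the new block (the old one may have been freed) and copy
    // no more than the smaller of the two sizes.
    std::memmove(original + offset, original + old_offset, std::min(old_size, new_size));
  }
  original[offset - 1] = offset;
  return original + offset;
}

inline void sketch_aligned_free(void* ptr) {
  if (ptr == nullptr) return;
  std::uint8_t* aligned = static_cast<std::uint8_t*>(ptr);
  std::free(aligned - aligned[-1]);
}
```

The sketch never touches the old address after calling `std::realloc`; it compares offsets instead of pointers, which avoids reading a pointer value that may refer to freed memory.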
### Improvements: | |
- The changes prevent potential undefined behavior by ensuring the correct handling of memory sizes during reallocation. | |
- Improved safety in memory management by managing pointers more effectively, thereby reducing the risk of memory corruption. | |
### Impact: | |
These updates enhance the reliability and safety of memory allocation within the Eigen library, potentially preventing crashes or erratic behavior caused by improper memory handling." | |
1121 (https://gitlab.com/libeigen/eigen/-/merge_requests/1121),Add serialization for sparse matrix and sparse vector.,"This was required for another project in order to simplify reproduction | |
of a sparse solver issue.",Antonio Sánchez,2022-11-21T19:43:08.273Z,NA,NA,"## Title: | |
Add serialization for sparse matrix and sparse vector. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request introduces serialization capabilities for sparse matrices and sparse vectors in the Eigen C++ library. This functionality was developed to assist with another project, specifically to simplify the reproduction of issues related to a sparse solver. | |
### Key Changes: | |
- Implemented serialization methods for both sparse matrices and sparse vectors. | |
### Improvements: | |
- Enhances the usability of sparse matrix and vector data structures by allowing them to be easily saved and loaded. | |
- Facilitates debugging and testing of sparse solvers by enabling the reproduction of specific sparse matrix configurations. | |
### Impact: | |
This addition will significantly aid developers and researchers in reproducing and diagnosing issues within sparse solver algorithms, improving overall efficiency and reliability in handling sparse data structures." | |
1122 (https://gitlab.com/libeigen/eigen/-/merge_requests/1122),Fix a bunch of annoying compiler warnings in tests,"### Reference issue | |
### What does this implement/fix? | |
Should substantially reduce the number of warnings emitted during the MR smoketests, specifically visitor.cpp, adjoint.cpp, and array_cwise.cpp. This is mostly due to narrowing conversions -- nothing material. | |
Job buildsmoketests:x86-64:linux:clang-10:cxx11-on log: | |
Before: 2189 lines | |
After : 1193 lines | |
The only remaining warnings are associated with the deprecation of `pow`, which is intentional. | |
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2022-11-21T20:07:20.390Z,NA,NA,"## Title: | |
Fix a bunch of annoying compiler warnings in tests | |
## Authors: | |
Charles Schlosser | |
## Summary: | |
This merge request addresses numerous compiler warnings that emerged during the MR smoketests in the Eigen C++ library, particularly in the files visitor.cpp, adjoint.cpp, and array_cwise.cpp. The changes primarily focus on eliminating warnings related to narrowing conversions. | |
### Key Changes: | |
- Shortened the build log of the `buildsmoketests:x86-64:linux:clang-10:cxx11-on` job from 2189 lines to 1193 lines, a proxy for the number of warnings emitted. | |
- Focused improvements on specific test files: visitor.cpp, adjoint.cpp, and array_cwise.cpp. | |
### Improvements: | |
- Substantial reduction in compiler warnings improves code quality and maintainability. | |
- Warning cleanup enhances the clarity of the test outputs. | |
### Impact: | |
- The changes lead to a cleaner build process with significantly fewer warnings, facilitating easier debugging and development. | |
- The only remaining warnings are related to the intentional deprecation of the `pow` function, indicating that the modifications do not introduce new issues." | |
1125 (https://gitlab.com/libeigen/eigen/-/merge_requests/1125),Add synchronize method to all devices.,"This is to simplify writing generic device code. Previously only the GPU | |
and Sycl tensor devices had a `synchronize` method, which is required | |
in testing to ensure all operations are performed. Added a dummy method | |
for threadpool and default devices, which are synchronous by default.",Antonio Sánchez,2022-11-29T19:35:03.227Z,NA,NA,"## Title: | |
Add synchronize method to all devices. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request introduces a `synchronize` method to all device types within the Eigen C++ library, enhancing the library's flexibility in handling device operations. | |
### Key Changes: | |
- Added a `synchronize` method to all devices, including previously unsupported types (like threadpool and default devices). | |
- Implemented a dummy `synchronize` method for threadpool and default devices, which are inherently synchronous. | |
### Improvements: | |
- Facilitates the writing of more generic device code. | |
- Enhances testing capabilities by ensuring all operations across devices are completed before proceeding. | |
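The benefit for generic code can be sketched with two toy device types. Both are hypothetical and are not Eigen's tensor devices: one runs work immediately and has a no-op `synchronize`, while the other defers work until `synchronize` is called.

```cpp
#include <functional>
#include <utility>
#include <vector>

// Hypothetical synchronous device: work runs immediately, so synchronize()
// has nothing to wait for and is an empty (dummy) method.
struct SyncDeviceSketch {
  template <typename F>
  void enqueue(F&& task) { std::forward<F>(task)(); }
  void synchronize() const {}
};

// Hypothetical deferred device: work only runs once synchronize() is called.
struct DeferredDeviceSketch {
  std::vector<std::function<void()>> pending;
  template <typename F>
  void enqueue(F&& task) { pending.emplace_back(std::forward<F>(task)); }
  void synchronize() {
    for (auto& task : pending) task();
    pending.clear();
  }
};

// Generic code can call synchronize() without knowing the device type.
template <typename Device>
int run_and_wait(Device& device, int value) {
  int result = 0;
  device.enqueue([&result, value] { result = value * 2; });
  device.synchronize();  // after this call the result is safe to read
  return result;
}
```

Without the dummy method on the synchronous device, `run_and_wait` would need a separate overload or a compile-time branch per device type.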
### Impact: | |
This change ensures consistency in operation across all device types, improving the reliability of the library's device management and testing processes." | |
1124 (https://gitlab.com/libeigen/eigen/-/merge_requests/1124),Fix sparseLU solver when destination has a non-unit stride.,"The previous code had an implicit assumption that the destination is | |
directly accessible and has unit stride. This is not the case for | |
block expressions or views like `complex_matrix.real()`, which | |
has a non-unit stride. Converting the code to use a block expression | |
addresses this. | |
Fixes #2562.",Antonio Sánchez,2022-11-29T19:37:04.391Z,NA,NA,"## Title: | |
Fix sparseLU solver when destination has a non-unit stride. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses an issue in the sparseLU solver related to the handling of destinations with non-unit stride. The previous implementation assumed that the destination was directly accessible and had a unit stride, which caused problems when working with block expressions or views, such as `complex_matrix.real()`. | |
### Key Changes: | |
- Modified the sparseLU solver to handle destinations with non-unit stride. | |
- Implemented the use of block expressions for improved compatibility. | |
### Improvements: | |
- Enhanced capability of the sparseLU solver for diverse types of matrix representations. | |
- Increased robustness of the library by eliminating previous assumptions about memory access. | |
### Impact: | |
This fix improves the usability of the sparseLU solver, allowing it to work correctly with different matrix formats and structures, thus broadening its application in numerical computations within the Eigen library." | |
1114 (https://gitlab.com/libeigen/eigen/-/merge_requests/1114),Changing BiCGSTAB parameters initialization so that it works with custom types,"### Reference issue | |
### What does this implement/fix? | |
### Additional information | |
<!--Any additional information you think is important.-->",Alexandre Hoffmann,2022-11-29T19:37:46.887Z,NA,NA,"## Title: | |
Changing BiCGSTAB parameters initialization so that it works with custom types | |
## Authors: | |
Alexandre Hoffmann | |
## Summary: | |
This merge request modifies the initialization of parameters in the BiCGSTAB algorithm to enhance compatibility with custom types. | |
### Key Changes: | |
- Updated the parameter initialization process for the BiCGSTAB algorithm. | |
### Improvements: | |
- Enhanced flexibility of the implementation to support custom data types, thereby broadening the usability of the BiCGSTAB solver. | |
### Impact: | |
This change enables users to utilize the BiCGSTAB algorithm with custom defined types, enhancing the library's versatility and application in diverse contexts." | |
1123 (https://gitlab.com/libeigen/eigen/-/merge_requests/1123),Fix reshape strides when input has non-zero inner stride.,"Fixes #2560. The outer stride needs to depend on the inner stride | |
if we have direct access.",Antonio Sánchez,2022-11-29T19:39:30.827Z,NA,NA,"## Title: | |
Fix reshape strides when input has non-zero inner stride. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses an issue with the Eigen C++ library concerning the handling of reshaping operations when inputs possess non-zero inner strides. | |
### Key Changes: | |
- Corrected the calculation of outer stride to be contingent on the value of the inner stride in cases with direct access. | |
### Improvements: | |
- Ensures accurate handling of array reshaping, which maintains data integrity during operations that involve non-zero inner strides. | |
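To see why the outer stride must scale with the inner stride, consider a column-major view over strided storage. The sketch below is an assumption-laden illustration: `StridedViewSketch` and `reshape_sketch` are hypothetical, and the formula is inferred from the MR description rather than copied from Eigen's source.

```cpp
#include <cstddef>

// Hypothetical strided column-major view, not an Eigen type.
struct StridedViewSketch {
  const int* data;
  std::ptrdiff_t rows;
  std::ptrdiff_t cols;
  std::ptrdiff_t inner_stride;  // step between consecutive rows
  std::ptrdiff_t outer_stride;  // step between consecutive columns
  int operator()(std::ptrdiff_t i, std::ptrdiff_t j) const {
    return data[i * inner_stride + j * outer_stride];
  }
};

// Reinterprets a view whose columns are stored back to back
// (outer_stride == rows * inner_stride) with a new shape. The new outer stride
// is new_rows * inner_stride; using new_rows alone is only right when the
// inner stride is 1.
inline StridedViewSketch reshape_sketch(const StridedViewSketch& view,
                                        std::ptrdiff_t new_rows, std::ptrdiff_t new_cols) {
  return StridedViewSketch{view.data, new_rows, new_cols, view.inner_stride,
                           new_rows * view.inner_stride};
}
```

For a 6x1 view with inner stride 2 over the buffer 0..11, reshaping to 2x3 gives an outer stride of 4, so element (1, 2) reads the value stored at buffer index 10, which is the sixth element of the original view.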
### Impact: | |
- Enhances the robustness and reliability of the reshaping functionality within the Eigen library, particularly for cases involving complex data layouts. This fix resolves issue #2560, ultimately improving user experience and functionality." | |
1127 (https://gitlab.com/libeigen/eigen/-/merge_requests/1127),Fix serialization for non-compressed matrices.,"The size of the data buffer was incorrect - it is not the number of | |
non-zeros, that is only true for compressed matrices. | |
It turned out that clang previously always converted the test matrices | |
to compressed mode because of the move-constructor, whereas this move | |
construction was optimized out by gcc. Added an explicit move constructor | |
to fix this.",Antonio Sánchez,2022-11-30T18:16:48.488Z,NA,NA,"## Title: | |
Fix serialization for non-compressed matrices. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses a bug in the serialization process for non-compressed matrices in the Eigen C++ library. The issue arose from an incorrect data buffer size calculation, which mistakenly relied on the number of non-zeros, a condition applicable only to compressed matrices. | |
### Key Changes: | |
- Corrected the data buffer size calculation for non-compressed matrices. | |
- Added an explicit move constructor to ensure consistent behavior across compilers (clang and gcc). | |
### Improvements: | |
- Enhanced the robustness of matrix serialization for non-compressed formats. | |
- Mitigated discrepancies between compiler behavior by explicitly managing the move constructor. | |
### Impact: | |
This fix is essential for accurate serialization of non-compressed matrices, ensuring that developers using Eigen can rely on proper matrix representation, regardless of the compiler in use." | |
1008 (https://gitlab.com/libeigen/eigen/-/merge_requests/1008),Add support for Power10 (AltiVec) MMA instructions for bfloat16.,"Hello everyone, long time no see! | |
I hope to find everyone safe and sound :smile_cat: | |
## What? | |
This merge request adds MMA support for bfloat16 on Power10 machines. As Power10 has bfloat16 support, this is way faster compared to what we had before, which used only VSX instructions and fell back to float32 for any computation. | |
## How | |
**Briefly**, Power10 MMA instructions for 16-bit types have a rank-2 operation, `xvbf16ger2pp`, that is able to do two rank-1 updates simultaneously using 2 rows/columns. | |
It takes a `4x2` matrix block against a `2x4` matrix block and does a rank-2 update. Below is a scheme of how my MMA registers need to be laid out, and the result. One thing worth mentioning is that the result is a `4x4` float32 matrix: there's a ""type upgrade"" in this operation. | |
<table> | |
<tr> | |
<td>A</td> | |
<td>B</td> | |
<td>C</td> | |
<td>D</td> | |
<td>E</td> | |
<td>F</td> | |
<td>G</td> | |
<td>H</td> | |
</tr> | |
</table> | |
<table> | |
<tr> | |
<td>I</td> | |
<td>J</td> | |
<td>K</td> | |
<td>L</td> | |
<td>M</td> | |
<td>N</td> | |
<td>O</td> | |
<td>P</td> | |
</tr> | |
</table> | |
<table> | |
<tr> | |
<td>A*I + B*J</td> | |
<td>A*K + B*L</td> | |
<td>A*M + B*N</td> | |
<td>A*O + B*P</td> | |
</tr> | |
<tr> | |
<td>C*I + D*J</td> | |
<td>C*K + D*L</td> | |
<td>C*M + D*N</td> | |
<td>C*O + D*P</td> | |
</tr> | |
<tr> | |
<td>E*I + F*J</td> | |
<td>E*K + F*L</td> | |
<td>E*M + F*N</td> | |
<td>E*O + F*P</td> | |
</tr> | |
<tr> | |
<td>G*I + H*J</td> | |
<td>G*K + H*L</td> | |
<td>G*M + H*N</td> | |
<td>G*O + H*P</td> | |
</tr> | |
</table> | |
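As a rough illustration of the table above, here is a scalar sketch of a single rank-2 update. The type aliases and the function name `rank2_update` are hypothetical, plain `float` stands in for the bfloat16 inputs, and this is not the MMA builtin itself; it only mirrors the arithmetic shown in the result table.

```c++
#include <array>
#include <cstddef>

// Hypothetical block types for illustration: LHS is 4x2, RHS is 2x4,
// and the accumulator is a 4x4 float32 block.
using Lhs4x2 = std::array<std::array<float, 2>, 4>;
using Rhs2x4 = std::array<std::array<float, 4>, 2>;
using Acc4x4 = std::array<std::array<float, 4>, 4>;

// acc += lhs * rhs, i.e. two rank-1 updates applied at once.
// With lhs rows (A,B), (C,D), (E,F), (G,H) and rhs columns (I,J), (K,L),
// (M,N), (O,P), entry (0,0) receives A*I + B*J, matching the table.
inline void rank2_update(Acc4x4& acc, const Lhs4x2& lhs, const Rhs2x4& rhs) {
  for (std::size_t i = 0; i < 4; ++i) {
    for (std::size_t j = 0; j < 4; ++j) {
      acc[i][j] += lhs[i][0] * rhs[0][j] + lhs[i][1] * rhs[1][j];
    }
  }
}
```

Calling `rank2_update` repeatedly on consecutive `4x2` / `2x4` blocks accumulates a full `4x4` tile of the product, which is the role the hardware accumulator plays in the real kernel.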
In short, `gemmMMAbfloat16` loads `4x2` and `2x4` blocks from LHS/RHS, organizes them in the registers and runs a rank-2 update. As standard packing wasn't created with this situation in mind, it can be a little confusing how I'm accessing memory. If you think a detailed explanation is necessary I don't mind drawing something to make myself as clear as possible :smile: | |
~~Out of curiosity, I did try to change packing to make code more friendly but I couldn't make my custom packing work for triangular so I went back.~~ | |
## Code | |
### Temporary float32 result | |
Talking further about the result being a float32 matrix: to avoid converting back and forth between float32 and bfloat16 in GEMM, I created a temporary float32 matrix to hold the result. | |
```c++ | |
float** result = new float*[cols]; | |
for(int i = 0; i < cols; i++) result[i] = new float[rows]; | |
``` | |
I didn't see any code using `new` so if that's a problem I'm open to suggestions. :wink: | |
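One possible alternative to raw `new`, sketched under the assumption that a single contiguous, column-major float32 scratch buffer is acceptable here. The `FloatScratch` type is hypothetical and not part of Eigen or of this MR; it only shows how the allocation could be owned by a container so that no matching `delete[]` calls are needed.

```c++
#include <cstddef>
#include <vector>

// Hypothetical scratch buffer: rows x cols floats in one zero-initialized,
// column-major allocation that is released automatically.
struct FloatScratch {
  std::size_t rows;
  std::size_t cols;
  std::vector<float> data;

  FloatScratch(std::size_t r, std::size_t c)
      : rows(r), cols(c), data(r * c, 0.0f) {}

  // Element (i, j): column j starts at offset j * rows.
  float& operator()(std::size_t i, std::size_t j) { return data[j * rows + i]; }
};
```

A single allocation also keeps each column adjacent to the next one in memory, which an array of separately allocated columns does not guarantee.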
### Long and ugly switch statement | |
`pgerMMAbfloat16` is basically running rank-2 update instructions. This set of instructions has a mask feature that lets me ignore some parts of the result matrix. | |
This is useful when we are running the last section of a matrix that can't fill a whole block of 4 elements and/or doesn't have two rows/columns. Using masks I'm able to ignore the result for those non-existent values. | |
Now comes the ugly part: I don't know the masks at compile time, so I can't write something like: | |
`__builtin_mma_pmxvbf16ger2pp(acc, reinterpret_cast<Packet16uc>(a.m_val), reinterpret_cast<Packet16uc>(b.m_val), maskX, maskY, 0b11);` | |
I don't have the exact compiler error at this moment but it was something that mentions `literals`. I bet it's because, after compilation, these masked rank updates are different instructions (instead of an instruction with masks as arguments). | |
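The pattern behind the switch statement can be sketched as follows: a mask known only at run time is mapped onto a call whose mask is a compile-time constant. Both function names are hypothetical, and `masked_keep` is only a stand-in for a builtin that requires literal mask operands; the real code would call the MMA builtin in each case.

```c++
// Stand-in for a builtin whose mask operand must be a compile-time literal.
template <unsigned Mask>
inline unsigned masked_keep(unsigned lanes) {
  return lanes & Mask;
}

// Map a run-time mask onto the matching compile-time instantiation.
// Only the masks that can actually occur need a case.
inline unsigned masked_keep_dispatch(unsigned mask, unsigned lanes) {
  switch (mask) {
    case 0x1: return masked_keep<0x1>(lanes);
    case 0x3: return masked_keep<0x3>(lanes);
    case 0x7: return masked_keep<0x7>(lanes);
    case 0xF: return masked_keep<0xF>(lanes);
    default:  return 0u;  // mask not supported in this sketch
  }
}
```

The cost is one case per mask combination, which is why the switch grows long when two masks are involved.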
## Testing | |
For testing I've changed files below (not submitted): | |
* test/product_syrk.cpp | |
* test/product_large.cpp | |
* test/product_symm.cpp | |
* test/triangular.cpp | |
I had a problem creating those tests because `bfloat16` doesn't support int scaling (i.e. k * A), so some tests that had expressions like `2*m1` don't work for `bfloat16`. To make them work I've changed: | |
This: | |
`VERIFY_IS_APPROX(res, 2*(square + m1 * m2.transpose()));` | |
To: | |
`VERIFY_IS_APPROX(res, (square + m1 * m2.transpose()) + ( square + m1 * m2.transpose()) );` | |
There's also this code section that doesn't work for bfloat16 and I don't have any idea why: | |
```c++ | |
if(!MatrixType::IsRowMajor) | |
{ | |
typedef Matrix<Scalar,Dynamic,Dynamic> MatrixX; | |
MatrixX buffer(2*rows,2*rows); | |
Map<RowSquareMatrixType,0,Stride<Dynamic,2> > map1(buffer.data(),rows,rows,Stride<Dynamic,2>(2*rows,2)); | |
buffer.setZero(); | |
VERIFY_IS_APPROX(map1 = m1 * m2.transpose(), (m1 * m2.transpose()).eval()); | |
buffer.setZero(); | |
VERIFY_IS_APPROX(map1.noalias() = m1 * m2.transpose(), (m1 * m2.transpose()).eval()); | |
buffer.setZero(); | |
VERIFY_IS_APPROX(map1.noalias() += m1 * m2.transpose(), (m1 * m2.transpose()).eval()); | |
} | |
``` | |
If you think it's important to update the test files to include my bfloat16 tests I don't mind doing it. Honestly I didn't give much thought to this matter but maybe I can specialize the `product` function in `product.h` for a Matrix of bfloat16. Suggestions are also appreciated here :smile: | |
## Last considerations | |
As this is a lot of changes I can imagine this will go back and forth. Any suggestion/consideration will be much appreciated! | |
Thanks a lot for your time. | |
PS: | |
This is a collaboration between me, Chip Kerchner and Rafael Souza. (co-authors on commit message)",Pedro Caldeira,2022-11-30T23:33:37.637Z,NA,NA,"## Title: | |
Add support for Power10 (AltiVec) MMA instructions for bfloat16. | |
## Authors: | |
Pedro Caldeira, Chip Kerchner, Rafael Souza | |
## Summary: | |
This merge request adds bfloat16 GEMM support on Power10 machines using MMA (Matrix-Multiply Assist) instructions. It is faster than the previous path, which used only VSX instructions and fell back to float32 for every computation. | |
### Key Changes: | |
- Implementation of MMA support for bfloat16 on Power 10. | |
- Added `gemmMMAbfloat16` function to utilize the `xvbf16ger2pp` instruction for efficient rank-2 updates. | |
- Accumulates results in float32, since the rank-2 update takes bfloat16 inputs and produces a `4x4` float32 block. | |
### Improvements: | |
- The inclusion of temporary float32 matrices to facilitate operations without redundant conversions between bfloat16 and float32. | |
- Enhanced handling for cases where matrix dimensions do not fit perfectly by utilizing masking features in the MMA instructions. | |
### Impact: | |
- The performance of bfloat16 computations on Power 10 architectures is significantly enhanced due to the direct use of dedicated MMA instructions, leading to potentially faster execution times and reduced computational overhead for matrix operations." | |
1104 (https://gitlab.com/libeigen/eigen/-/merge_requests/1104),Fix the bug using neon instruction fmla for data type half,"<!-- | |
Thanks for contributing a merge request! Please name and fully describe your MR as you would for a commit message. | |
If the MR fixes an issue, please include ""Fixes #issue"" in the commit message and the MR description. | |
In addition, we recommend that first-time contributors read our [contribution guidelines](https://eigen.tuxfamily.org/index.php?title=Contributing_to_Eigen) and [git page](https://eigen.tuxfamily.org/index.php?title=Git), which will help you submit a more standardized MR. | |
Before submitting the MR, you also need to complete the following checks: | |
- Make one PR per feature/bugfix (don't mix multiple changes into one PR). Avoid committing unrelated changes. | |
- Rebase before committing | |
- For code changes, run the test suite (at least the tests that are likely affected by the change). | |
See our [test guidelines](https://eigen.tuxfamily.org/index.php?title=Tests). | |
- If possible, add a test (both for bug-fixes as well as new features) | |
- Make sure new features are documented | |
Note that we are a team of volunteers; we appreciate your patience during the review process. | |
Again, thanks for contributing! --> | |
### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
https://gitlab.com/libeigen/eigen/-/merge_requests/1018 | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
The register used as operand 3 of `fmla` for the half data type must be in v0~v15, so inline assembly can't be used here to avoid the bug where `vfmaq_lane_f16` is implemented through a costly `dup` in the GCC compiler. However, when the GCC compiler is used, the intrinsics lead to performance degradation, so I add a restriction here. | |
### Additional information | |
<!--Any additional information you think is important.--> | |
By coincidence, this bug is not triggered when EIGEN_NEON_GEBP_NR=8. If EIGEN_NEON_GEBP_NR is set to 4, the GCC compiler will report the following error | |
",Lianhuang Li,2022-12-01T17:28:57.765Z,NA,NA,"## Title: | |
Fix the bug using neon instruction fmla for data type half | |
## Authors: | |
Lianhuang Li | |
## Summary: | |
This merge request addresses a bug related to the use of NEON instructions for the 'fmla' operation specifically for the half data type. | |
### Key Changes: | |
- Accounted for the constraint that operand 3 of `fmla` for the half data type must be a register in v0 to v15, which rules out the inline assembly used to avoid GCC's costly `dup`-based implementation of `vfmaq_lane_f16`. | |
- Added a restriction for GCC builds, where using the intrinsics would degrade performance. | |
### Improvements: | |
- This change enhances the handling of the 'fmla' function for the half data type, ensuring that it operates correctly without performance penalties associated with faulty implementations. | |
### Impact: | |
- This fix prevents compiler errors when EIGEN_NEON_GEBP_NR is set to 4, thus ensuring stability and performance reliability in operations involving half-precision data types in the Eigen C++ library." | |
1103 (https://gitlab.com/libeigen/eigen/-/merge_requests/1103),add sparse sort inner vectors function,"<!-- | |
Thanks for contributing a merge request! Please name and fully describe your MR as you would for a commit message. | |
If the MR fixes an issue, please include ""Fixes #issue"" in the commit message and the MR description. | |
In addition, we recommend that first-time contributors read our [contribution guidelines](https://eigen.tuxfamily.org/index.php?title=Contributing_to_Eigen) and [git page](https://eigen.tuxfamily.org/index.php?title=Git), which will help you submit a more standardized MR. | |
Before submitting the MR, you also need to complete the following checks: | |
- Make one PR per feature/bugfix (don't mix multiple changes into one PR). Avoid committing unrelated changes. | |
- Rebase before committing | |
- For code changes, run the test suite (at least the tests that are likely affected by the change). | |
See our [test guidelines](https://eigen.tuxfamily.org/index.php?title=Tests). | |
- If possible, add a test (both for bug-fixes as well as new features) | |
- Make sure new features are documented | |
Note that we are a team of volunteers; we appreciate your patience during the review process. | |
Again, thanks for contributing! --> | |
### Reference issue | |
Maybe #299 , #364 , #2558 | |
### What does this implement/fix? | |
Adds utility to sort the inner vectors of a sparse matrix / vector with comparison function (default is `std::less<>`). Some sparse algorithms implicitly assume sorted inner vectors, and will not function properly otherwise. Other algorithms could benefit from sorted inner vectors, such as sparse transpositions, factorizations, and so on. | |
The sort is implemented with `std::sort` using a custom iterator `CompressedStorageIterator` that sorts the inner indices and values in parallel. The usual alternative uses a temporary vector of indices to apply the sorting permutation to the indices and values. The iterator-based approach requires only one pass and no auxiliary storage. | |
`CompressedStorageIterator` can be used for many STL algorithms to operate on both the indices and values in parallel, such as `std::swap_ranges`, `std::rotate`, which may simplify further improvements to the sparse module. | |
Adapted from https://artificial-mind.net/blog/2020/11/28/std-sort-multiple-ranges | |
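For contrast, the conventional approach mentioned above (a temporary permutation vector) can be sketched as below. This is not the code in this MR: the function name `sort_inner_vector` is hypothetical, and plain `std::vector` stands in for the compressed storage of one inner vector. It shows the auxiliary storage and extra pass that the iterator-based approach avoids.

```c++
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Sort one inner vector (parallel index and value arrays) by index,
// using a temporary permutation plus temporary output arrays.
inline void sort_inner_vector(std::vector<int>& indices,
                              std::vector<double>& values) {
  // Step 1: compute the permutation that sorts the indices.
  std::vector<std::size_t> perm(indices.size());
  std::iota(perm.begin(), perm.end(), std::size_t{0});
  std::sort(perm.begin(), perm.end(), [&](std::size_t a, std::size_t b) {
    return indices[a] < indices[b];
  });

  // Step 2: apply the permutation to both arrays (second pass, extra storage).
  std::vector<int> sorted_indices(indices.size());
  std::vector<double> sorted_values(values.size());
  for (std::size_t k = 0; k < perm.size(); ++k) {
    sorted_indices[k] = indices[perm[k]];
    sorted_values[k] = values[perm[k]];
  }
  indices.swap(sorted_indices);
  values.swap(sorted_values);
}
```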
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2022-12-01T19:28:56.698Z,NA,NA,"## Title: | |
Add Sparse Sort Inner Vectors Function | |
## Authors: | |
Charles Schlosser | |
## Summary: | |
This merge request introduces a new utility for sorting the inner vectors of sparse matrices and vectors using a customizable comparison function. The new feature directly addresses issues related to the assumptions made by certain sparse algorithms, enhancing their functionality and performance. | |
### Key Changes: | |
- Implemented a function to sort the inner vectors of sparse matrices and vectors. | |
- Utilized `std::sort` with a custom iterator, `CompressedStorageIterator`, to sort indices and values in parallel without the need for auxiliary storage. | |
- The sorting mechanism requires only a single pass and improves efficiency. | |
### Improvements: | |
- The sorting capability for inner vectors will allow for better performance of sparse algorithms that rely on sorted data, such as transpositions and factorizations. | |
- `CompressedStorageIterator` can now facilitate other STL algorithms, potentially simplifying future enhancements in the sparse module. | |
### Impact: | |
- The inclusion of this sorting function is expected to improve the reliability and efficiency of existing sparse algorithms, thereby broadening the applicability and performance of the Eigen library in handling sparse data structures." | |
1130 (https://gitlab.com/libeigen/eigen/-/merge_requests/1130),Fix index type for sparse index sorting.,Type typo.,Antonio Sánchez,2022-12-06T00:02:32.462Z,NA,NA,"## Title: | |
Fix index type for sparse index sorting. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses a type typo related to the index type used in sparse index sorting within the Eigen C++ library. | |
### Key Changes: | |
- Corrected a type typo in the implementation of sparse index sorting. | |
### Improvements: | |
- Ensures that the index type used is accurate, enhancing the reliability of the sparse index sorting functionality. | |
### Impact: | |
- The fix improves the accuracy and functionality of sparse operations in the Eigen library, potentially preventing errors in computations that rely on sparse index sorting." | |
1131 (https://gitlab.com/libeigen/eigen/-/merge_requests/1131),Increase L2 and L3 cache size for Power10.,"Increasing the L2 and L3 sizes for Power10 improves performance by 1.33X in situations that rely on breaking matrices into sub-matrices for packing and GEMM (like triangular matrix solve).",Chip Kerchner,2022-12-07T18:20:33.935Z,NA,NA,"## Title: | |
Increase L2 and L3 cache size for Power10. | |
## Authors: | |
Chip Kerchner | |
## Summary: | |
This merge request increases the L2 and L3 cache sizes configured for the Power10 architecture, aiming to improve performance for operations that break matrices into sub-matrices for packing and GEMM, such as triangular matrix solve. | |
### Key Changes: | |
- Increased L2 and L3 cache sizes for the Power10 architecture. | |
### Improvements: | |
- Performance increased by 1.33X for tasks involving matrix packing and GEMM. | |
### Impact: | |
The changes will significantly enhance computational efficiency for applications relying on matrix operations within the Power10 framework." | |
1128 (https://gitlab.com/libeigen/eigen/-/merge_requests/1128),Enable direct access for NestByValue.,"If the underlying expression has direct access, then so does | |
`NestByValue`. Conditionally adds the appropriate accessors for this case. | |
Fixes #2574.",Antonio Sánchez,2022-12-07T18:21:46.505Z,NA,NA,"## Title: | |
Enable direct access for NestByValue. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request introduces direct access functionality for the `NestByValue` construct in the Eigen C++ library. If the underlying expression supports direct access, `NestByValue` will now provide the same capability. | |
### Key Changes: | |
- Added conditional accessors for `NestByValue` based on the underlying expression's access type. | |
### Improvements: | |
- Enhanced performance by allowing direct access for `NestByValue` when applicable, leading to more efficient operations. | |
### Impact: | |
This change improves the usability and efficiency of `NestByValue`, allowing users to benefit from direct access where possible, which can lead to better performance in expressions that utilize this feature." | |
1129 (https://gitlab.com/libeigen/eigen/-/merge_requests/1129),Add BDCSVD_LAPACKE binding,"### What does this implement/fix? | |
When `EIGEN_USE_LAPACKE` is set, this calls ?gesdd for BDCSVD which is the corresponding LAPACK SVD divide & conquer variant. | |
### Additional information | |
Unfortunately, ?gesdd can only calculate fullU/fullV ('A'), thinU/thinV ('S'), or none of them ('N'). | |
So there is slightly more mapping code compared to JacobiSVD (?gesvd) to make all variants work (e.g. when thinU and fullV is set).",Melven Roehrig-Zoellner,2022-12-09T18:50:13.395Z,NA,NA,"## Title: | |
Add BDCSVD_LAPACKE binding | |
## Authors: | |
Melven Roehrig-Zoellner | |
## Summary: | |
This merge request adds a binding for the BDCSVD using the LAPACKE library in the Eigen C++ library, enabling the use of the ?gesdd LAPACK function for Singular Value Decomposition (SVD). | |
### Key Changes: | |
- Implemented BDCSVD_LAPACKE binding when `EIGEN_USE_LAPACKE` is enabled. | |
- Integrated the ?gesdd LAPACK function, allowing for SVD computation with options for full and thin matrices. | |
### Improvements: | |
- Increased flexibility in SVD computations to accommodate different matrix output requirements (fullU/fullV, thinU/thinV, or none). | |
- Introduced additional mapping code to handle various output configurations effectively. | |
### Impact: | |
This enhancement allows users to benefit from the performance and functionality of the LAPACKE library for SVD calculations, potentially improving computational efficiency and flexibility in matrix operations within the Eigen library." | |
1133 (https://gitlab.com/libeigen/eigen/-/merge_requests/1133),add EqualSpaced / setEqualSpaced,"<!-- | |
Thanks for contributing a merge request! Please name and fully describe your MR as you would for a commit message. | |
If the MR fixes an issue, please include ""Fixes #issue"" in the commit message and the MR description. | |
In addition, we recommend that first-time contributors read our [contribution guidelines](https://eigen.tuxfamily.org/index.php?title=Contributing_to_Eigen) and [git page](https://eigen.tuxfamily.org/index.php?title=Git), which will help you submit a more standardized MR. | |
Before submitting the MR, you also need to complete the following checks: | |
- Make one PR per feature/bugfix (don't mix multiple changes into one PR). Avoid committing unrelated changes. | |
- Rebase before committing | |
- For code changes, run the test suite (at least the tests that are likely affected by the change). | |
See our [test guidelines](https://eigen.tuxfamily.org/index.php?title=Tests). | |
- If possible, add a test (both for bug-fixes as well as new features) | |
- Make sure new features are documented | |
Note that we are a team of volunteers; we appreciate your patience during the review process. | |
Again, thanks for contributing! --> | |
### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
This MR introduces a new function: `setEqualSpaced`. It is analogous to `setLinSpaced` when `high-low` is an exact multiple of `size-1`. It is vectorized for all types (with add and multiply), and somewhat more intuitive. The current `setLinSpaced` implementation is not vectorized for integer types, and has somewhat awkward floating point logic. | |
``` | |
Index size = 10; | |
int low = 0; | |
int step = 1; | |
std::cout << VectorXi::EqualSpaced(size, low, step).transpose() << ""\n""; // {0,1,2,3,4,5,6,7,8,9} | |
VectorXi test(size); | |
// basically performs this simple loop | |
for(Index i = 0; i < size; i++) | |
test(i) = low + i * step; | |
``` | |
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2022-12-13T00:54:58.505Z,NA,NA,"## Title: | |
Add EqualSpaced / setEqualSpaced | |
## Authors: | |
Charles Schlosser | |
## Summary: | |
This merge request introduces a new method `setEqualSpaced` to the Eigen C++ library, providing a more intuitive and vectorized approach to creating equally spaced vectors, particularly addressing issues with the current `setLinSpaced` function. | |
### Key Changes: | |
- Introduction of the `setEqualSpaced` function, which fills a vector from a starting value `low` and a fixed `step`; it matches `setLinSpaced` when `high-low` is an exact multiple of `size-1`. | |
- Vectorization across all data types that support addition and multiplication. | |
### Improvements: | |
- Enhances performance for integer types by offering vectorized operations, which were lacking in the previous `setLinSpaced` method. | |
- Simplifies logic for generating equally spaced values, making the function more intuitive for users. | |
### Impact: | |
The addition of `setEqualSpaced` improves the usability and efficiency of vector creation in Eigen, particularly for integer types, which may benefit from faster execution due to vectorization. This change is expected to enhance user experience when handling equally spaced numeric sequences." | |
1134 (https://gitlab.com/libeigen/eigen/-/merge_requests/1134),optimize equalspace packetop,"<!-- | |
Thanks for contributing a merge request! Please name and fully describe your MR as you would for a commit message. | |
If the MR fixes an issue, please include ""Fixes #issue"" in the commit message and the MR description. | |
In addition, we recommend that first-time contributors read our [contribution guidelines](https://eigen.tuxfamily.org/index.php?title=Contributing_to_Eigen) and [git page](https://eigen.tuxfamily.org/index.php?title=Git), which will help you submit a more standardized MR. | |
Before submitting the MR, you also need to complete the following checks: | |
- Make one PR per feature/bugfix (don't mix multiple changes into one PR). Avoid committing unrelated changes. | |
- Rebase before committing | |
- For code changes, run the test suite (at least the tests that are likely affected by the change). | |
See our [test guidelines](https://eigen.tuxfamily.org/index.php?title=Tests). | |
- If possible, add a test (both for bug-fixes as well as new features) | |
- Make sure new features are documented | |
Note that we are a team of volunteers; we appreciate your patience during the review process. | |
Again, thanks for contributing! --> | |
### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2022-12-13T01:22:25.951Z,NA,NA,"## Title: | |
Optimize equalspace packetop | |
## Authors: | |
Charles Schlosser | |
## Summary: | |
This merge request focuses on optimizing the `equalspace` packet operation within the Eigen C++ library, improving performance and efficiency in specific calculations involving equal spacing. | |
### Key Changes: | |
- Optimized the packet operation behind `EqualSpaced` / `setEqualSpaced`; the MR description gives no further detail. | |
### Improvements: | |
- The title indicates a performance optimization of the vectorized path; no measurements are reported in the MR. | |
### Impact: | |
These changes are expected to enhance the overall performance of the Eigen library, particularly in mathematical computations where equal spacing is frequently employed. This optimization can lead to faster execution times in user applications, promoting better responsiveness and efficiency." | |
1090 (https://gitlab.com/libeigen/eigen/-/merge_requests/1090),Allow std::initializer_list constructors in constexpr expressions,"<!-- | |
Thanks for contributing a merge request! Please name and fully describe your MR as you would for a commit message. | |
If the MR fixes an issue, please include ""Fixes #issue"" in the commit message and the MR description. | |
In addition, we recommend that first-time contributors read our [contribution guidelines](https://eigen.tuxfamily.org/index.php?title=Contributing_to_Eigen) and [git page](https://eigen.tuxfamily.org/index.php?title=Git), which will help you submit a more standardized MR. | |
Before submitting the MR, you also need to complete the following checks: | |
- Make one PR per feature/bugfix (don't mix multiple changes into one PR). Avoid committing unrelated changes. | |
- Rebase before committing | |
- For code changes, run the test suite (at least the tests that are likely affected by the change). | |
See our [test guidelines](https://eigen.tuxfamily.org/index.php?title=Tests). | |
- If possible, add a test (both for bug-fixes as well as new features) | |
- Make sure new features are documented | |
Note that we are a team of volunteers; we appreciate your patience during the review process. | |
Again, thanks for contributing! --> | |
### What does this implement/fix? | |
Previously attempting to declare a constexpr Eigen::Matrix/Eigen::Array | |
would result in a compiler error, now it will succeed for fixed-size | |
matrices as well as dynamic-sized ones with fixed storage size if all | |
elements are initialized. | |
This works when targeting C++20 and some basic functionality is also | |
supported when using Clang targeting C++14/17, but GCC rejects declaring a | |
matrix as constexpr until C++20.",Alexander Richardson,2022-12-14T17:05:38.398Z,NA,NA,"## Title: | |
Allow std::initializer_list constructors in constexpr expressions | |
## Authors: | |
Alexander Richardson | |
## Summary: | |
This merge request introduces support for `std::initializer_list` constructors in constexpr expressions for the Eigen C++ library, allowing fixed-size matrices and arrays to be declared as constexpr in C++20 and partially in C++14/17 with Clang. | |
### Key Changes: | |
- Enabled `constexpr` support for `Eigen::Matrix` and `Eigen::Array` with `std::initializer_list` constructors. | |
- Fixed-size matrices can now be declared as `constexpr` when all elements are initialized. | |
- Dynamic-sized matrices are supported if they have a fixed storage size. | |
### Improvements: | |
- Enhances the usability of Eigen with compile-time expressions, allowing for more constexpr-friendly programming in C++. | |
- Provides better compatibility with modern C++ standards, especially C++20. | |
### Impact: | |
- Facilitates the use of Eigen types in constexpr contexts, improving performance and enabling optimizations at compile time. | |
- Broadens the scope of possible applications for Eigen, particularly in template metaprogramming and contexts requiring compile-time computations." | |
1135 (https://gitlab.com/libeigen/eigen/-/merge_requests/1135),Avoid using std::raise() for divide by zero,"<!-- | |
Thanks for contributing a merge request! Please name and fully describe your MR as you would for a commit message. | |
If the MR fixes an issue, please include ""Fixes #issue"" in the commit message and the MR description. | |
In addition, we recommend that first-time contributors read our [contribution guidelines](https://eigen.tuxfamily.org/index.php?title=Contributing_to_Eigen) and [git page](https://eigen.tuxfamily.org/index.php?title=Git), which will help you submit a more standardized MR. | |
Before submitting the MR, you also need to complete the following checks: | |
- Make one PR per feature/bugfix (don't mix multiple changes into one PR). Avoid committing unrelated changes. | |
- Rebase before committing | |
- For code changes, run the test suite (at least the tests that are likely affected by the change). | |
See our [test guidelines](https://eigen.tuxfamily.org/index.php?title=Tests). | |
- If possible, add a test (both for bug-fixes as well as new features) | |
- Make sure new features are documented | |
Note that we are a team of volunteers; we appreciate your patience during the review process. | |
Again, thanks for contributing! --> | |
### Reference issue | |
https://gitlab.com/libeigen/eigen/-/merge_requests/1076 | |
### What does this implement/fix? | |
Since commit 7b2901e2 there is a dependency | |
on std::raise(), but the header `<csignal>` might not exist or be usable on | |
embedded targets or operating systems where UNIX signals are not | |
implemented. Instead of calling std::raise(), we can force evaluation of | |
an integer division by zero. This should not result in any functional | |
changes on most systems, as that division will be translated to a SIGFPE | |
and the process exits with exit code 136. | |
We use volatile variables here which forces the compiler to evaluate the | |
expression even though it is unused: see https://godbolt.org/z/qYjdaq94s | |
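A minimal sketch of the technique described above, with hypothetical function names (this is not Eigen's actual code). Note that integer division by zero is undefined behavior in C++; the trap relies on the target raising a hardware exception, as x86 does, and other targets may simply produce a value.

```c++
// Hypothetical helper: force an integer division by zero without <csignal>.
// The volatile variables oblige the compiler to perform the load, the
// division and the store even though the result is otherwise unused.
inline int trap_on_zero_divisor() {
  volatile int zero = 0;
  volatile int result = 1 / zero;
  return result;
}

// Hypothetical caller: ordinary division, trapping only for a zero divisor.
inline int checked_int_div(int a, int b) {
  if (b == 0) {
    return trap_on_zero_divisor();
  }
  return a / b;
}
```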
### Additional information | |
This is required to use Eigen on embedded operating systems.",Alexander Richardson,2022-12-14T20:06:16.993Z,NA,NA,"## Title: | |
Avoid using std::raise() for divide by zero | |
## Authors: | |
Alexander Richardson | |
## Summary: | |
This merge request modifies the Eigen C++ library to eliminate the dependency on `std::raise()` for handling divide by zero errors. This change aims to enhance compatibility with embedded systems and operating systems where UNIX signals may not be available. | |
### Key Changes: | |
- Removed the use of `std::raise()` in favor of forcing evaluation of an integer division by zero. | |
- Utilized volatile variables to ensure the compiler evaluates the division, thus triggering a division by zero error correctly. | |
### Improvements: | |
- Increased compatibility for embedded systems and other platforms lacking support for UNIX signal handling. | |
- Maintained the functional behavior on most systems, as a division by zero still results in a SIGFPE, exiting the process with an error code. | |
### Impact: | |
This change allows the Eigen library to be used more broadly across different hardware platforms, particularly in environments where traditional signal handling is not feasible, thereby enhancing its usability and flexibility." | |
1137 (https://gitlab.com/libeigen/eigen/-/merge_requests/1137),"Use numext::signbit instead of std::signbit, which is not defined for bfloat16.",NA,Rasmus Munk Larsen,2022-12-15T19:33:19.408Z,NA,NA,"## Title: | |
Use numext::signbit instead of std::signbit, which is not defined for bfloat16. | |
## Authors: | |
Rasmus Munk Larsen | |
## Summary: | |
This merge request replaces the usage of `std::signbit` with `numext::signbit` within the Eigen C++ library. The change specifically addresses the lack of `std::signbit` support for the bfloat16 data type. | |
### Key Changes: | |
- Replaced all instances of `std::signbit` with `numext::signbit` to ensure compatibility with bfloat16. | |
### Improvements: | |
- Enhances support for the bfloat16 data type by integrating a compatible method for checking sign bits. | |
### Impact: | |
- This change improves the stability and functionality of the Eigen library regarding bfloat16, ensuring that calculations involving this data type can accurately determine the sign bit." | |
1138 (https://gitlab.com/libeigen/eigen/-/merge_requests/1138),Update test of numext::signbit.,NA,Rasmus Munk Larsen,2022-12-15T20:21:10.477Z,NA,NA,"## Title: | |
Update test of numext::signbit. | |
## Authors: | |
Rasmus Munk Larsen | |
## Summary: | |
This merge request updates the unit test for the `numext::signbit` function in the Eigen C++ library to enhance its reliability and accuracy. | |
### Key Changes: | |
- Improved test cases for `numext::signbit` to ensure robustness and correctness. | |
### Improvements: | |
- Increased coverage of different input scenarios to better validate the functionality of `signbit`. | |
### Impact: | |
- Enhances the reliability of numerical operations involving sign bit checks, which can lead to more accurate results in applications using the Eigen library." | |
1139 (https://gitlab.com/libeigen/eigen/-/merge_requests/1139),Add operators to CompressedStorageIterator,"### Reference issue | |
### What does this implement/fix? | |
Added all the comparison operators (couldn't hurt), `+=` and `-=`. There are many operators required to satisfy `RandomAccessIterator`, and the vast majority appear to be satisfied. There are a few exceptions -- the iterator doesn't satisfy all the requirements of `LegacyForwardIterator` as it is not `DefaultConstructible`. Let me know if something else breaks your tests. | |
### Additional information | |
",Charles Schlosser,2022-12-16T16:48:51.459Z,NA,NA,"## Title: | |
Add operators to CompressedStorageIterator | |
## Authors: | |
Charles Schlosser | |
## Summary: | |
This merge request enhances the `CompressedStorageIterator` by adding several key operators, including all comparison operators and the `+=` and `-=` operators, which are essential for fulfilling the requirements of the `RandomAccessIterator`. | |
### Key Changes: | |
- Added comparison operators to `CompressedStorageIterator`. | |
- Implemented `+=` and `-=` operators. | |
### Improvements: | |
- Increased functionality of `CompressedStorageIterator`, making it more compliant with iterator standards. | |
- Enhanced usability by fulfilling more requirements for the `RandomAccessIterator`. | |
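As a rough illustration of the operators listed above, the following sketch shows a toy index-based iterator with `+=`, `-=`, a difference operator, and all six comparisons. `IndexIterator` and its members are hypothetical names invented for this example; this is not the code of `CompressedStorageIterator`. Like the iterator discussed in the MR, the toy type deliberately has no default constructor.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical toy iterator for illustration; NOT Eigen's CompressedStorageIterator.
class IndexIterator {
 public:
  // The only constructor: the type is deliberately not DefaultConstructible.
  explicit IndexIterator(std::ptrdiff_t pos) : pos_(pos) {}

  // Compound assignment, as expected of a random-access iterator.
  IndexIterator& operator+=(std::ptrdiff_t n) { pos_ += n; return *this; }
  IndexIterator& operator-=(std::ptrdiff_t n) { pos_ -= n; return *this; }

  // Distance between two iterators.
  friend std::ptrdiff_t operator-(const IndexIterator& a, const IndexIterator& b) {
    return a.pos_ - b.pos_;
  }

  // All six comparisons, defined on the stored position.
  friend bool operator==(const IndexIterator& a, const IndexIterator& b) { return a.pos_ == b.pos_; }
  friend bool operator!=(const IndexIterator& a, const IndexIterator& b) { return a.pos_ != b.pos_; }
  friend bool operator<(const IndexIterator& a, const IndexIterator& b) { return a.pos_ < b.pos_; }
  friend bool operator>(const IndexIterator& a, const IndexIterator& b) { return a.pos_ > b.pos_; }
  friend bool operator<=(const IndexIterator& a, const IndexIterator& b) { return a.pos_ <= b.pos_; }
  friend bool operator>=(const IndexIterator& a, const IndexIterator& b) { return a.pos_ >= b.pos_; }

 private:
  std::ptrdiff_t pos_;  // current position in the underlying storage
};
```

The `std::is_default_constructible` trait from `<type_traits>` reports `false` for this type, which mirrors the remaining gap with respect to `LegacyForwardIterator` mentioned in the MR.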
### Impact: | |
These changes improve the iterator's interoperability and ease of use in various contexts, although it still does not meet all the requirements of `LegacyForwardIterator` due to not being `DefaultConstructible`. This could potentially limit its application in some scenarios but overall contributes positively to the library's extensibility." | |
1141 (https://gitlab.com/libeigen/eigen/-/merge_requests/1141),Enable NEON pabs for unsigned int types,"### What does this implement/fix? | |
The packet_traits for NEON `uint16_t`, `uint32_t`, `uint64_t` set `HasAbs = 0`. But, these types seem to have [pabs implemented](https://gitlab.com/libeigen/eigen/-/blob/master/Eigen/src/Core/arch/NEON/PacketMath.h#L2358) (just as an identity operation). | |
This MR just sets `HasAbs = 1` for these types. This also matches pabs in [GenericPacketMath.h](https://gitlab.com/libeigen/eigen/-/blob/master/Eigen/src/Core/GenericPacketMath.h#L547) since numext::abs is also an identity for unsigned types. | |
I think that ATM, if an expression of uint32_t matrices uses `.cwiseAbs()`, then the entire expression ends up not using Eigen's packet operations...? So HasAbs=1 should be better? | |
I guess this only matters (if at all) for some generic code, since people probably aren't using abs if they know they have an unsigned type :-)",Arthur,2022-12-19T18:12:33.894Z,NA,NA,"## Title: | |
Enable NEON pabs for unsigned int types | |
## Authors: | |
Arthur | |
## Summary: | |
This merge request sets `HasAbs = 1` in the NEON `packet_traits` for `uint16_t`, `uint32_t`, and `uint64_t`, so that the existing `pabs` implementation (an identity operation for unsigned types) is advertised as available. | |
### Key Changes: | |
- Updated `packet_traits` for NEON `uint16_t`, `uint32_t`, and `uint64_t` to set `HasAbs = 1`. | |
- Ensured consistency with the behavior defined in `GenericPacketMath.h`. | |
### Improvements: | |
- Likely allows expressions on unsigned integer matrices that call `.cwiseAbs()` to keep using Eigen's packet operations; the author suspects, without confirming, that such expressions previously did not use them. | |
- Aligns the implementation of absolute functions for unsigned types across different header files. | |
### Impact: | |
This change potentially improves the performance of generic code that operates on unsigned integer matrices by allowing the use of optimized packet operations. While it primarily affects specific use cases, it enhances overall efficiency for those using `.cwiseAbs()`." | |
1143 (https://gitlab.com/libeigen/eigen/-/merge_requests/1143),"Revert ""Avoid mixing types in CompressedStorage.h""",NA,Rasmus Munk Larsen,2022-12-19T20:09:38.048Z,NA,NA,"## Title: Revert ""Avoid mixing types in CompressedStorage.h"" | |
## Authors: Rasmus Munk Larsen | |
## Summary: | |
This merge request reverts a previous change that aimed to prevent mixing types in the `CompressedStorage.h` file of the Eigen C++ library. | |
### Key Changes: | |
- Reversion of the commit that introduced type mixing restrictions in `CompressedStorage.h`. | |
### Improvements: | |
- Restores the previous functionality and flexibility regarding type handling in `CompressedStorage.h`. | |
### Impact: | |
- Allows for potential compatibility and easier integration of different data types, which may improve user experience and flexibility in using the library." | |
1142 (https://gitlab.com/libeigen/eigen/-/merge_requests/1142),Fix incorrect NEON native fp16 multiplication.,"TensorFlow's tensor contractions currently fail on ARM hardware with native fp16 | |
support due to a bug in the specialized kernel implementation. | |
All the NEON GEBP specializations are a bit hacky, in that they replace the | |
RHS packet with a single scalar, then use special instructions to | |
perform a `Packet += Packet * Scalar`. In the case of native `__fp16`, | |
where we have a `Packet8h`, this broke an assumption that we can split | |
the left-hand packet into groups of 4 elements, then multiply by a RHS loaded | |
via `ploadquad`. The hack works for floats, since the packet size is | |
4, so `ploadquad` fills the packet with a single value, which we _can_ | |
mimic using multiplication by a single scalar. However, the assumption breaks | |
down when the packet size is 8. | |
Put in a fallback in the general GEBP kernel to avoid `ploadquad` when | |
not feasible, and added an assertion to the NEON `__fp16` | |
specialization.",Antonio Sánchez,2022-12-19T20:46:44.891Z,NA,NA,"## Title: | |
Fix incorrect NEON native fp16 multiplication. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses a bug in the Eigen C++ library's implementation of native fp16 multiplication on ARM hardware, specifically affecting TensorFlow's tensor contractions. The issue arises from the specialized kernel's handling of NEON GEBP operations, which failed due to incorrect assumptions about packet sizes. | |
### Key Changes: | |
- Introduced a fallback mechanism in the general GEBP kernel to avoid the use of `ploadquad` when inappropriate. | |
- Added an assertion to the NEON `__fp16` specialization to ensure correctness. | |
### Improvements: | |
- Enhanced the robustness of the NEON GEBP specializations by addressing assumptions regarding packet sizes, specifically for native `__fp16`. | |
- Improved the implementation's ability to handle multiplication without breaking functionality, thus ensuring proper tensor contractions in TensorFlow on ARM hardware. | |
### Impact: | |
These changes prevent failures in tensor contractions on ARM devices with native fp16 support, improving the correctness and reliability of the Eigen library in scientific and machine learning applications utilizing TensorFlow." | |
1144 (https://gitlab.com/libeigen/eigen/-/merge_requests/1144),Fix up C++ version detection macros and cmake tests.,"Eigen was reporting the wrong c++ version for intermediate versions of | |
`__cplusplus`. Also disabling explicit c++17/c++20 constexpr tests, | |
since these are breaking on our CI. It looks like some versions of | |
clang are reporting `__cplusplus` as 20, but don't support all c++20 | |
features. | |
CI failures: https://gitlab.com/libeigen/eigen_ci_cross_testing/-/pipelines/725775539 | |
Fixes #2584",Antonio Sánchez,2022-12-20T18:06:04.427Z,NA,NA,"## Title: | |
Fix up C++ version detection macros and cmake tests. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses issues with C++ version detection within the Eigen library, correcting how intermediate versions of `__cplusplus` are reported. It also disables certain constexpr tests for C++17 and C++20 that are causing failures in the continuous integration (CI) environment. | |
### Key Changes: | |
- Corrected the reporting of C++ version for intermediate `__cplusplus` versions. | |
- Disabled explicit constexpr tests for C++17 and C++20 due to CI failures. | |
### Improvements: | |
- Enhanced the accuracy of C++ version detection, ensuring compatibility with various compiler versions. | |
- Reduced CI pipeline failures by disabling problematic tests. | |
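To make the notion of an intermediate `__cplusplus` value concrete, here is a minimal sketch of threshold-based version detection. The helper name `hypothetical_cxx_version` is invented for this example, and the convention shown (an intermediate value is attributed to the last finalized standard it reaches) is an assumption for illustration; it is not taken from the MR's diff or from Eigen's macros.

```cpp
#include <cassert>

// Hypothetical helper: maps a __cplusplus-style value to a standard version.
// The thresholds are the values published for each finalized standard.
// Assumption: a value between two thresholds counts as the lower standard.
constexpr int hypothetical_cxx_version(long cplusplus) {
  return cplusplus >= 202002L ? 20
       : cplusplus >= 201703L ? 17
       : cplusplus >= 201402L ? 14
       : cplusplus >= 201103L ? 11
       : 3;  // C++98/03
}

// The version this translation unit is being compiled as.
constexpr int kThisBuild = hypothetical_cxx_version(__cplusplus);
```

Under this convention a compiler that reports a value such as 201707L in an experimental C++20 mode is treated as C++17, which avoids assuming C++20 features that the compiler may not fully support.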
### Impact: | |
These changes improve the reliability of the Eigen library's version detection mechanism and stabilize the CI process, which helps maintain code quality and reduces interruptions in development and testing workflows." | |
1146 (https://gitlab.com/libeigen/eigen/-/merge_requests/1146),"Enable NEON pcmp, plset, and complex psqrt","### What does this implement/fix? | |
Another small MR: just enables NEON's complex `psqrt`. It works fine as far as I can tell, so I'm guessing it was just missed and that it's safe to enable? | |
Also noticed that `plset` isn't enabled. It seems to work with the nullary.cpp tests, so I'm assuming this is safe as well.",Arthur,2022-12-22T05:38:35.382Z,NA,NA,"## Title: | |
Enable NEON pcmp, plset, and complex psqrt | |
## Authors: | |
Arthur | |
## Summary: | |
This merge request enables additional NEON packet operations within the Eigen C++ library, specifically the complex square root (`psqrt`) and `plset`, which fills a packet with a linearly increasing sequence. The title also lists `pcmp`, which the description does not discuss. | |
### Key Changes: | |
- Enabled complex `psqrt` for NEON. | |
- Enabled `plset` operation for NEON. | |
### Improvements: | |
- Increases the functionality of NEON support in Eigen, potentially enhancing performance for certain operations involving complex square roots and linearly spaced packets. | |
### Impact: | |
These changes may lead to improved computational efficiency for users utilizing NEON instructions, particularly in applications that rely on complex mathematical operations." | |
1145 (https://gitlab.com/libeigen/eigen/-/merge_requests/1145),Adjust thresholds for bfloat16 product tests that are currently failing.,"For bfloat16, the default epsilon for `areNotApprox` ends up being | |
relatively large, causing some tests to fail even though practically | |
the sides being compared _are_ significantly different. | |
Also, some of the `VERIFY_IS_APPROX` tests when comparing matrices that | |
are multiplications of 3 matrices are failing for bfloat16, since coefficients can | |
grow quite large, but we don't adjust the threshold. Doubling the | |
threshold seems to allow the test to pass reliably.",Antonio Sánchez,2022-12-28T19:32:26.383Z,NA,NA,"## Title: | |
Adjust thresholds for bfloat16 product tests that are currently failing. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses the issues with failing bfloat16 product tests in the Eigen C++ library by adjusting the thresholds for comparisons, specifically for the `areNotApprox` function and the `VERIFY_IS_APPROX` tests. | |
### Key Changes: | |
- Adjusted the threshold used with `areNotApprox` for bfloat16, whose default epsilon is large enough that significantly different values were being treated as approximately equal. | |
- Doubled the threshold used in `VERIFY_IS_APPROX` tests for products of three matrices, where coefficients can grow quite large, to ensure reliable test outcomes. | |
### Improvements: | |
- Makes the bfloat16 product tests pass reliably by matching the comparison tolerances to the low precision of the type. | |
- Provides a more robust testing framework for matrix operations involving bfloat16 data types. | |
### Impact: | |
The changes reduce spurious test failures and improve confidence in the Eigen library's bfloat16 support, which matters for users working with this data type in reduced-precision workloads." | |
1140 (https://gitlab.com/libeigen/eigen/-/merge_requests/1140),Patch SparseLU,"### Reference issue | |
#2582 Does not directly address root cause, but the generic dense GEMM kernel yields the expected result. The two kernels produce results that are very similar, and pass the `isApprox` test. However, after hundreds/thousands of iterations, these differences could be enough to influence the detection of a pivot in a badly conditioned matrix. As for the difference between Eigen 3.3 and 3.4? I chalk that up to changes in global behavior (blocking sizes in the triangular solvers, etc) that produce small numerical differences that can eventually influence a very sensitive parameter. | |
### What does this implement/fix? | |
Currently, Sparse LU has a custom dense GEMM kernel that appears fundamentally correct, but references a lot of dated code. For example: `Alignment = PacketSize>1 ? Aligned : 0`. Here, `Aligned` is deprecated and doesn't utilize alignments greater than 16. Also, this precludes Eigen from switching to a user's preferred BLAS backend (such as MKL) for dense GEMM kernels, among other Eigen features (CPU tuned blocking sizes) that have been added in the years since Gael contributed this code. | |
In `SparseLUTransposeView` there is a subtle bug that causes the SparseLU tests to fail. `APIBase()` calls the default constructor, which in turn sets `m_isInitialized = false`, even if `view` is initialized. | |
### Additional information | |
",Charles Schlosser,2022-12-31T04:52:37.090Z,NA,NA,"## Title: | |
Patch SparseLU | |
## Authors: | |
Charles Schlosser | |
## Summary: | |
This merge request focuses on updating the SparseLU implementation in the Eigen C++ library. It addresses outdated code supporting the dense GEMM kernel and fixes a subtle bug within the SparseLUTransposeView. | |
### Key Changes: | |
- Reworked the dense GEMM path in SparseLU, whose custom kernel relied on dated constructs such as the deprecated `Aligned` flag, which does not utilize alignments greater than 16. | |
- Fixed a bug in `SparseLUTransposeView` where `APIBase()` called the default constructor and set `m_isInitialized = false` even when `view` was initialized. | |
### Improvements: | |
- Removes an obstacle to using a user's preferred BLAS backend (such as MKL) and CPU-tuned blocking sizes for dense GEMM, which the custom kernel precluded. | |
- Updated the SparseLU implementation to better reflect modern practices and ensure stability across iterations. | |
### Impact: | |
With the generic dense GEMM kernel, SparseLU yields the expected result for the badly conditioned case in #2582, although the MR notes that this does not directly address the root cause. The changes also allow the use of various BLAS backends, enhancing the versatility and robustness of the Eigen library." | |
1149 (https://gitlab.com/libeigen/eigen/-/merge_requests/1149),Fixes git add . doesn't include scripts/buildtests.in,"### Reference issue | |
`git add .` doesn't include `scripts/buildtests.in`. | |
### What does this implement/fix? | |
In `.gitignore`, there is a rule to ignore `*build*` [ref](https://gitlab.com/libeigen/eigen/-/blob/master/.gitignore#L15). | |
This rule causes `scripts/buildtests.in` to be excluded when we run `git add .`. | |
To fix this issue, this PR modifies `.gitignore` so that `scripts/buildtests.in` is exempt from that rule.",LAI Bruce,2023-01-03T17:06:36.896Z,NA,NA,"## Title: | |
Fixes git add . doesn't include scripts/buildtests.in | |
## Authors: | |
LAI Bruce | |
## Summary: | |
This merge request addresses a bug in the Eigen C++ library where the command `git add .` fails to include the file `scripts/buildtests.in` due to its exclusion in the `.gitignore` file. | |
### Key Changes: | |
- Modified the `.gitignore` file to allow the inclusion of `scripts/buildtests.in`. | |
### Improvements: | |
- Ensures that `scripts/buildtests.in` can now be added to the git staging area with `git add .`. | |
### Impact: | |
- This fix allows developers to easily include the necessary build test script in their commits, improving the workflow and reducing potential oversight during the development process." | |
1151 (https://gitlab.com/libeigen/eigen/-/merge_requests/1151),Fix EIGEN_HAS_CXX17_OVERALIGN for icc,"### Reference issue | |
Fixes #2575 | |
### What does this implement/fix? | |
### Additional information | |
",Charles Schlosser,2023-01-03T17:58:14.027Z,NA,NA,"## Title: | |
Fix EIGEN_HAS_CXX17_OVERALIGN for icc | |
## Authors: | |
Charles Schlosser | |
## Summary: | |
This merge request addresses an issue with the EIGEN_HAS_CXX17_OVERALIGN configuration for the Intel C++ Compiler (icc), as reported in #2575. It resolves compatibility problems to ensure that the Eigen library functions correctly with this compiler. | |
### Key Changes: | |
- Fixed the implementation of EIGEN_HAS_CXX17_OVERALIGN specifically for the icc compiler. | |
### Improvements: | |
- Enhances the compatibility of the Eigen library with the Intel C++ Compiler, ensuring that C++17 features related to over-alignment are correctly recognized. | |
### Impact: | |
- Users of the Eigen library who utilize the Intel C++ Compiler will experience improved functionality and fewer compatibility issues related to C++17 over-alignment, potentially leading to better performance and stability in their applications." | |
1155 (https://gitlab.com/libeigen/eigen/-/merge_requests/1155),Fix overalign check.,"EIGEN_COMP_ICC is always defined, having a value of 0 if we're not using | |
icc. Modified the check.",Antonio Sánchez,2023-01-05T17:10:49.610Z,NA,NA,"## Title: | |
Fix overalign check. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses an issue with the overalign check in the Eigen C++ library by modifying how the preprocessor directive EIGEN_COMP_ICC is evaluated. | |
### Key Changes: | |
- Revised the evaluation of the EIGEN_COMP_ICC directive to ensure the check functions correctly when icc is not in use. | |
### Improvements: | |
- Enhances the reliability of the overalign check to better handle different compiler configurations. | |
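The pitfall behind this fix can be sketched as follows: a macro that is always defined, with value 0 when the compiler is absent, passes a `defined(...)` test even when it should not. The macro `HYPOTHETICAL_COMP_ICC` and the two constants are invented for this example, and whether the original check was written with `defined(...)` is an assumption; the MR only states that `EIGEN_COMP_ICC` is always defined and that the check was modified.

```cpp
#include <cassert>

// Hypothetical macro mimicking a compiler-detection flag that is always
// defined: 0 when the compiler is absent, nonzero when it is present.
#define HYPOTHETICAL_COMP_ICC 0

// A defined()-style check is true even though the value is 0.
#if defined(HYPOTHETICAL_COMP_ICC)
constexpr bool kDefinedCheck = true;
#else
constexpr bool kDefinedCheck = false;
#endif

// Testing the value of the macro gives the intended answer.
#if HYPOTHETICAL_COMP_ICC
constexpr bool kValueCheck = true;
#else
constexpr bool kValueCheck = false;
#endif
```

Here `kDefinedCheck` ends up `true` and `kValueCheck` ends up `false`, so only the value-based test correctly reports that the compiler is not in use.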
### Impact: | |
- Improves code stability and compatibility when using various compilers, particularly ensuring that non-icc environments are accurately recognized." | |
1156 (https://gitlab.com/libeigen/eigen/-/merge_requests/1156),Fix a bunch of minor build and test issues.,"- `SPQRSupport` included a header with the wrong relative path | |
- `minmax` visitor is only vectorized if vectorized comparisons are | |
available | |
- `half_float`/`bfloat16_float` included a superfluous internal header | |
- `gpu_basic`/`gpu_example` didn't properly define `EIGEN_USE_GPU`, which meant | |
GPU packets weren't actually used | |
- `incomplete_cholesky`/`sparselu` included unused ""unsupported"" headers | |
- `sparse_extra` unnecessarily duplicated the original sparse_product test",Antonio Sánchez,2023-01-06T16:37:27.572Z,NA,NA,"## Title: | |
Fix a bunch of minor build and test issues. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses several minor build and test issues within the Eigen C++ library, enhancing its overall reliability and performance. | |
### Key Changes: | |
- Corrected the relative path of a header in `SPQRSupport`. | |
- Ensured that the `minmax` visitor is vectorized only when vectorized comparisons are available. | |
- Removed a superfluous internal header from `half_float`/`bfloat16_float`. | |
- Properly defined `EIGEN_USE_GPU` in `gpu_basic`/`gpu_example` to ensure GPU packets are utilized. | |
- Eliminated unused ""unsupported"" headers in `incomplete_cholesky`/`sparselu`. | |
- Removed unnecessary duplication of the original sparse_product test in `sparse_extra`. | |
### Improvements: | |
These changes streamline the build process and reduce potential confusion or issues during testing, leading to a cleaner and more efficient codebase. | |
### Impact: | |
By resolving these issues, the library becomes more robust, ensuring that features operate as intended and minimizing the chances of build errors or inefficiencies related to GPU usage and header management." | |
1153 (https://gitlab.com/libeigen/eigen/-/merge_requests/1153),Fix guard macros for emulated FP16 operators on GPU,"### What does this implement/fix? | |
This change fixes the following two issues: | |
- The macro guards for emulating FP16 operations have slightly different conditions for the push vs pop macro. This can result in an error when compiling with `__CUDA__` but `EIGEN_CUDACC` is not defined. These guards originally matched, but only the push_macro guard was updated in a [previous commit](https://gitlab.com/libeigen/eigen/-/commit/b08527b0c1ffdbd44347ca3a7869f10b0cb3cbb6#27a0d506a487d29a0bac9d54459393b0a6d2d673_318_318). | |
- The comment on [line 459](https://gitlab.com/RSenApps/eigen/-/blob/1235ad596cd35d8f4fecad344c9d46ba8f7f785d/Eigen/src/Core/arch/Default/Half.h#L459) claims that these emulated FP16 operations should be available for both HIP and CUDA, but `EIGEN_CUDACC` is used instead of `EIGEN_GPUCC`.",Ryan Senanayake,2023-01-06T22:02:51.826Z,NA,NA,"## Title: | |
Fix guard macros for emulated FP16 operators on GPU | |
## Authors: | |
Ryan Senanayake | |
## Summary: | |
This merge request addresses issues related to macro guards for emulated FP16 operations in the Eigen C++ library, improving compatibility when compiling with CUDA. | |
### Key Changes: | |
- Updated the macro guards for push and pop operations for FP16 emulation to align their conditions, preventing compilation errors when `__CUDA__` is defined but `EIGEN_CUDACC` is not. | |
- Corrected a discrepancy where comments indicated support for both HIP and CUDA, ensuring the correct macro `EIGEN_GPUCC` is used instead of `EIGEN_CUDACC`. | |
### Improvements: | |
- Enhances the reliability of FP16 operations across different GPU compilation environments. | |
- Ensures consistency in macro conditions, reducing potential compilation issues. | |
### Impact: | |
This change improves the usability of the Eigen library for developers working with GPU implementations, particularly those using emulated FP16 operations, thereby promoting broader compatibility and reducing errors during compilation." | |
1154 (https://gitlab.com/libeigen/eigen/-/merge_requests/1154),Improve performance for Power10 MMA bfloat16 GEMM,"Improve performance for Power10 MMA bfloat16 GEMM. | |
Includes packing for rank-2 friendly data, better indexing variables, elimination of MMA masking, improved edge handling, hardware bfloat16 conversions, fixes slowdown with LLVM, use of LinearMappers, general cleanup, etc. | |
It is now up to 61X faster than the generic GEMM code, and 2.3X faster for GCC and 7-12X faster for LLVM than the previous version.",Chip Kerchner,2023-01-06T23:08:38.173Z,NA,NA,"## Title: | |
Improve performance for Power10 MMA bfloat16 GEMM | |
## Authors: | |
Chip Kerchner | |
## Summary: | |
This merge request focuses on enhancing the performance of the Power10 MMA bfloat16 GEMM implementation in the Eigen C++ library. | |
### Key Changes: | |
- Introduced packing for rank-2 friendly data. | |
- Improved indexing variables. | |
- Eliminated MMA masking. | |
- Enhanced edge handling. | |
- Added hardware bfloat16 conversions. | |
- Addressed slowdowns associated with LLVM. | |
- Incorporated the use of LinearMappers. | |
- General code cleanup. | |
### Improvements: | |
The new implementation is significantly more efficient, achieving up to 61X faster performance compared to the generic GEMM code. It is also reported to be 2.3X faster for GCC and 7-12X faster for LLVM compared to the previous version. | |
### Impact: | |
These enhancements dramatically boost the performance of GEMM operations on Power10 architecture, making the library more efficient for applications relying on bfloat16 computations." | |
1158 (https://gitlab.com/libeigen/eigen/-/merge_requests/1158),Modified spbenchsolver help message because it could be misunderstood,"<!-- | |
Thanks for contributing a merge request! Please name and fully describe your MR as you would for a commit message. | |
If the MR fixes an issue, please include ""Fixes #issue"" in the commit message and the MR description. | |
In addition, we recommend that first-time contributors read our [contribution guidelines](https://eigen.tuxfamily.org/index.php?title=Contributing_to_Eigen) and [git page](https://eigen.tuxfamily.org/index.php?title=Git), which will help you submit a more standardized MR. | |
Before submitting the MR, you also need to complete the following checks: | |
- Make one PR per feature/bugfix (don't mix multiple changes into one PR). Avoid committing unrelated changes. | |
- Rebase before committing | |
- For code changes, run the test suite (at least the tests that are likely affected by the change). | |
See our [test guidelines](https://eigen.tuxfamily.org/index.php?title=Tests). | |
- If possible, add a test (both for bug-fixes as well as new features) | |
- Make sure new features are documented | |
Note that we are a team of volunteers; we appreciate your patience during the review process. | |
Again, thanks for contributing! --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
The help message stated that the matrix should be named MatrixName.mtx and MatrixName_b.mtx. In the case of a SPD matrix, you have to name it MatrixName_SPD.mtx. I think it was not clear enough that as a consequence, the rhs should be named MatrixName_SPD_b.mtx. I actually made the mistake when trying to use spbenchsolver.",Robin Miquel,2023-01-07T21:35:46.786Z,NA,NA,"## Title: | |
Modified spbenchsolver help message because it could be misunderstood | |
## Authors: | |
Robin Miquel | |
## Summary: | |
This merge request addresses the clarity of the help message for the spbenchsolver in the Eigen C++ library. The original message regarding file naming conventions for matrices was unclear, particularly for symmetric positive definite (SPD) matrices and their corresponding right-hand side (rhs) files. | |
### Key Changes: | |
- Clarified the naming convention in the help message for matrices and corresponding rhs files. | |
- Specified that SPD matrices should be named as ""MatrixName_SPD.mtx"" and their rhs files as ""MatrixName_SPD_b.mtx"". | |
### Improvements: | |
- Enhanced user understanding, reducing potential confusion for new users when naming files for matrix operations. | |
### Impact: | |
- Aims to prevent user errors when using the spbenchsolver tool, improving the overall usability and experience for contributors and users of the Eigen library." | |
1147 (https://gitlab.com/libeigen/eigen/-/merge_requests/1147),Overhaul Sparse Core,"
### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
Various improvements to `SparseMatrix` and related classes. Minor improvements include: | |
- replace legacy code with STL algorithms where possible. This reduces bloat, is easier to maintain, and is probably faster
- use aligned maps now that buffers are aligned to make use of packet ops. | |
- replace a bunch of assignment loops with `smart_memmove` (and `moveChunk`, which now exclusively calls `smart_memmove`) | |
More substantial improvements include: | |
- reworked `setFromTriplets` to perform fewer passes over the triplets. Currently, a transposed and unordered copy of the matrix is constructed, and the matrix is ordered using the transposition assignment. This version constructs an unordered matrix, handles the duplicates, and sorts the inner indices after construction. It also performs one allocation (possibly on the stack) to track nonzeros and collapse duplicates instead of two. | |
- add `setFromSortedTriplets` which performs all steps in one pass (after scanning to determine allocation size) and uses no temporary storage to achieve the same outcome as `setFromTriplets`. A container (such as a `std::vector`) of triplets can be easily sorted using `std::sort` (as is demonstrated in the tests), so this function should be used whenever possible. | |
- reworked `conservativeResize` to use binary search and general cleanup. Currently, decreasing the inner size always uncompresses the matrix. In this version, the matrix is uncompressed (if not already) only if inner size is decreased and nonzeros are lost due to inner size change. Additionally, `data().resize()` is appropriately called to minimize subsequent reallocations. | |
- separate search and insertion functions in `insert`. Currently, `coeffRef` performs a binary search to find an element. If it does not exist, it calls `insert`, where the search is performed again. Added `insertAt`, `insertUncompressedAt` and `insertCompressedAt`, which use a previously determined insertion location from `insert` or `coeffRef`. Updated and deprecated `insertUncompressed` and `insertCompressed` in case anyone uses those. Will gladly delete.
- `prune` now works on uncompressed matrices, resolving a long standing TODO | |
`setFromTriplets` benchmarks (initialize large sparse matrix 50 times): | |
- setFromTriplets (old), shuffled triplets: 12s | |
- setFromTriplets (old), sorted triplets: 9s | |
- setFromTriplets (new), shuffled triplets: 12s | |
- setFromTriplets (new), sorted triplets: 3s | |
- setFromSortedTriplets, sorted triplets: 1.6s | |
While the new `setFromTriplets` is on par with the previous algorithm when the triplets are randomly shuffled, the performance gap widens when the triplets are sorted. The new `setFromSortedTriplets` performs even better, with no temporary storage, and thus should be the preferred method of initializing a sparse matrix. | |
There are many more fairly simple opportunities for improvement throughout the sparse module, but this MR will focus on code in `SparseMatrix.h`. | |
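The one-pass construction from sorted triplets described above can be sketched in plain C++. This is an illustration under stated assumptions, not Eigen's implementation: `Triplet`, `CsrMatrix`, `rowMajorLess` and `fromSortedTriplets` are hypothetical names, the layout is row-major, and duplicates are summed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <tuple>
#include <vector>

// Hypothetical types for illustration; these are not Eigen classes.
struct Triplet {
  int row;
  int col;
  double value;
};

struct CsrMatrix {
  int rows = 0;
  int cols = 0;
  std::vector<int> outerStart;  // size rows + 1, start of each row
  std::vector<int> innerIndex;  // column of each stored entry
  std::vector<double> values;   // value of each stored entry
};

// Orders triplets by (row, col), the order a row-major compressed layout needs.
inline bool rowMajorLess(const Triplet& a, const Triplet& b) {
  return std::tie(a.row, a.col) < std::tie(b.row, b.col);
}

// Builds a compressed row-major matrix from triplets sorted by (row, col).
// A counting scan sizes the storage; one filling pass then appends entries
// in order and sums duplicates into the previous entry. No temporary matrix.
inline CsrMatrix fromSortedTriplets(int rows, int cols, const std::vector<Triplet>& t) {
  assert(std::is_sorted(t.begin(), t.end(), rowMajorLess));
  CsrMatrix m;
  m.rows = rows;
  m.cols = cols;

  // Scan: count entries that differ from their predecessor.
  std::size_t unique = 0;
  for (std::size_t k = 0; k < t.size(); ++k) {
    if (k == 0 || rowMajorLess(t[k - 1], t[k])) ++unique;
  }
  m.innerIndex.reserve(unique);
  m.values.reserve(unique);
  m.outerStart.assign(static_cast<std::size_t>(rows) + 1, 0);

  // Fill: append new entries, collapse duplicates, count entries per row.
  for (std::size_t k = 0; k < t.size(); ++k) {
    assert(t[k].row >= 0 && t[k].row < rows && t[k].col >= 0 && t[k].col < cols);
    if (k > 0 && !rowMajorLess(t[k - 1], t[k])) {
      m.values.back() += t[k].value;
    } else {
      m.innerIndex.push_back(t[k].col);
      m.values.push_back(t[k].value);
      ++m.outerStart[static_cast<std::size_t>(t[k].row) + 1];
    }
  }

  // Turn the per-row counts into row start offsets.
  for (std::size_t r = 0; r + 1 < m.outerStart.size(); ++r) {
    m.outerStart[r + 1] += m.outerStart[r];
  }
  return m;
}
```

A caller would first sort the container with `std::sort(t.begin(), t.end(), rowMajorLess)`, mirroring the `std::sort` step the description recommends before calling `setFromSortedTriplets`.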
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2023-01-07T22:09:43.319Z,NA,NA,"## Title: Overhaul Sparse Core | |
## Authors: Charles Schlosser | |
## Summary: | |
This merge request introduces significant improvements to the `SparseMatrix` class and related components of the Eigen C++ library. Key changes enhance performance and maintainability, particularly in handling sparse matrix operations. | |
### Key Changes: | |
- Substituted legacy code with STL algorithms for reduced bloat and improved performance. | |
- Implemented aligned maps to leverage packet operations with aligned buffers. | |
- Replaced multiple assignment loops with `smart_memmove` for efficiency. | |
- Reworked `setFromTriplets` to reduce the number of passes over triplets and minimize memory allocations. | |
- Introduced `setFromSortedTriplets`, which processes triplets in a single pass with no temporary storage. | |
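- Reworked `conservativeResize` to use binary search; the matrix is now uncompressed only when shrinking the inner size actually drops nonzeros, and `data().resize()` is called to limit later reallocations.
- Separated search and insertion in `insert` by adding `insertAt`, `insertUncompressedAt` and `insertCompressedAt`, which reuse an insertion location already found by `insert` or `coeffRef`; `insertUncompressed` and `insertCompressed` are deprecated.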
- Enabled `prune` functionality for uncompressed matrices. | |
### Improvements: | |
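- Initializing large sparse matrices is faster when the triplets are sorted: in the MR's benchmark (50 initializations), `setFromTriplets` drops from 9s to 3s and `setFromSortedTriplets` takes 1.6s. With shuffled triplets, `setFromTriplets` is unchanged at 12s.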
- Memory management is more efficient, with fewer allocations and better handling of duplicates and nonzeros. | |
- The restructured code promotes maintainability and clarity, encouraging easier future enhancements. | |
### Impact: | |
These changes significantly enhance the performance of sparse matrix operations, especially during initialization with sorted data. The new methods reduce computational overhead and memory usage, making the library more efficient overall. This overhaul lays the groundwork for further optimization opportunities in the sparse module." | |
1159 (https://gitlab.com/libeigen/eigen/-/merge_requests/1159),Add missing header for GPU tests.,"Oops, accidentally deleted this earlier.",Antonio Sánchez,2023-01-09T19:45:07.065Z,NA,NA,"## Title: | |
Add missing header for GPU tests. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request addresses an issue where a necessary header for GPU tests was accidentally deleted. The reintroduction of this header ensures that GPU-related tests can be executed properly. | |
### Key Changes: | |
- Re-added the missing header for GPU tests. | |
### Improvements: | |
- Restores functionality for GPU tests, ensuring they run without errors. | |
### Impact: | |
- Enhances the reliability of GPU testing in the Eigen C++ library, preventing potential issues that could arise from missing dependencies in the test suite." | |
1161 (https://gitlab.com/libeigen/eigen/-/merge_requests/1161),"Fix error: unused parameter 'tmp' [-Werror,-Wunused-parameter] on clang/32-bit arm","Fixes this error, when building on 32-bit arm with clang | |
``` | |
In file included from ../../qt6_local_build/eigen/eigen-3.4.0/Eigen/Dense:1: | |
In file included from ../../qt6_local_build/eigen/eigen-3.4.0/Eigen/Core:352: | |
../../qt6_local_build/eigen/eigen-3.4.0/Eigen/src/Core/arch/NEON/GeneralBlockPanelKernel.h:27:56: error: unused parameter 'tmp' [-Werror,-Wunused-parameter] | |
Packet4f& c, Packet4f& tmp, | |
```",Martin Burchell,2023-01-10T21:15:28.707Z,NA,NA,"## Title: | |
Fix error: unused parameter 'tmp' [-Werror,-Wunused-parameter] on clang/32-bit arm | |
## Authors: | |
Martin Burchell | |
## Summary: | |
This merge request addresses a compilation error encountered while building the Eigen C++ library on 32-bit ARM architectures with the Clang compiler. The specific issue pertains to an unused parameter, 'tmp', in the `GeneralBlockPanelKernel.h` file. | |
### Key Changes: | |
- Removal or modification of the unused parameter 'tmp' in the `GeneralBlockPanelKernel.h` file. | |
### Improvements: | |
- Eliminates the `-Wunused-parameter` diagnostic, which `-Werror` turns into a build-stopping error on 32-bit ARM with Clang.
### Impact: | |
- Enables successful compilation of the Eigen library on 32-bit ARM platforms using Clang, thereby improving compatibility and user experience for developers working in this environment." | |
1160 (https://gitlab.com/libeigen/eigen/-/merge_requests/1160),change insert strategy,"
### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
Previously, it was reported that repeatedly calling `insert` on a large compressed sparse matrix was slow. This is expected as the user is repeatedly performing sorted insertions. However, this appears to be a common usage pattern. | |
This MR changes `insertAtByOuterInner` , which pertains to `insert` and `coeffRef`, and will now always uncompress the matrix (a no-op if already uncompressed) and call `insertUncompressedAtByOuterInner`. Users may still opt to call `insertCompressed()` if they do not want their matrix to be uncompressed, which can still be useful if inserting a few elements or only pushing to back. | |
If `insertUncompressedAtByOuterInner` fails to find a vector with capacity, each vector's capacity will increase to a minimum of two to avoid future reallocations and reduce insertion times. This strategy can be tuned two ways: | |
- change `kReserveSizePerVector` from `2` to some other number, like `10`, or a complex expression like `reserve(IndexVector::AlignedMapType(outerSize(), innerNonZeroPtr()))` (possibly double the reserve size of each vector!)
- instead of searching from `outer` to `outerSize()` for insertion capacity, limit the search e.g. `outer + 2` to avoid lengthy searches | |
This may remediate some issues with users calling `insert` in a loop. However, it is still best to call `reserve`, e.g. `reserve(VectorXi::Constant(10))` rather than rely on this heuristic. | |
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2023-01-11T06:24:50.362Z,NA,NA,"## Title: | |
Change Insert Strategy | |
## Authors: | |
Charles Schlosser | |
## Summary: | |
This merge request introduces a new strategy for handling the `insert` operation on large compressed sparse matrices in the Eigen C++ library, addressing performance issues related to repeated insertions. | |
### Key Changes: | |
- Modified the `insertAtByOuterInner` method to always uncompress matrices before inserting elements, defaulting to the `insertUncompressedAtByOuterInner` function. | |
- Users retain the option to use `insertCompressed()` for cases where uncompression is not desired. | |
- If `insertUncompressedAtByOuterInner` finds no vector with spare capacity, each vector's capacity is increased to a minimum of two to avoid future reallocations.
- The heuristic can be tuned in the source by changing `kReserveSizePerVector` (currently `2`) or by limiting how far the search for insertion capacity extends.
### Improvements: | |
- Enhances insertion performance by preemptively handling matrix uncompression. | |
- Reduces the frequency and impact of reallocations during repeated inserts. | |
- Leaves room to tune the capacity strategy by adjusting `kReserveSizePerVector` or the capacity search range in the source.
### Impact: | |
These changes are expected to significantly improve the efficiency of inserting elements into compressed sparse matrices, particularly in scenarios involving multiple insertions, thus enhancing the overall user experience and performance of the library. Users are still encouraged to use the `reserve` function to optimize performance further." | |
1152 (https://gitlab.com/libeigen/eigen/-/merge_requests/1152),"Add template to specify QR permutation index type, Fix ColPivHouseholderQR Lapacke bindings","
### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
Fixes #2586 | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
The permutation index type can now be specified for `ColPivHouseholderQR`, `FullPivHouseholderQR`, and `CompleteOrthogonalDecomposition`. The default type is either `int` or `lapack_int` if the lapacke bindings are used. Fixed `ColPivHouseholderQR` lapacke bindings so that they are called when passed by non-const reference, and if `lapack_int` is `int64_t`. Also fixed `determinant()` to produce correct sign. Removed many of the macros to make it easier to debug.
TODO: apply the same changes to `LU` and `SVD`. The changes are pretty simple, mostly copy and paste work. Would be a good project for someone who is starting out. | |
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2023-01-11T15:57:29.492Z,NA,NA,"## Title: | |
Add template to specify QR permutation index type, Fix ColPivHouseholderQR Lapacke bindings | |
## Authors: | |
Charles Schlosser | |
## Summary: | |
This merge request introduces a new feature allowing users to specify the permutation index type for `ColPivHouseholderQR`, `FullPivHouseholderQR`, and `CompleteOrthogonalDecomposition`. It also fixes the LAPACKE bindings for `ColPivHouseholderQR`, corrects the sign returned by `determinant()`, and simplifies debugging by reducing macro usage.
### Key Changes: | |
- Added a template to specify the permutation index type for relevant QR decomposition classes. | |
- Fixed the `ColPivHouseholderQR` LAPACKE bindings so that they are called when the matrix is passed by non-const reference and when `lapack_int` is `int64_t`.
- Corrected the implementation of the `determinant()` function to reflect accurate sign results. | |
- Removed several macros to enhance code readability and debugging ease. | |
### Improvements: | |
- Lets users choose the permutation index type, so that it can match `lapack_int` when the LAPACKE bindings are used.
- Increases accuracy in computational results with the updated `determinant()` function. | |
- Simplifies the codebase by reducing macro complexity, making it easier for future development and debugging. | |
### Impact: | |
These changes provide greater versatility and correctness within the QR decomposition functionalities of the Eigen library, benefiting users who rely on accurate numerical computations and boosting the overall usability and maintainability of the library." | |
1150 (https://gitlab.com/libeigen/eigen/-/merge_requests/1150),Altivec fixes for Darwin: do not use unsupported VSX insns,"Existing macros checking for the `__VSX__` define do not work correctly on macOS: GCC thinks VSX is available, but the ISA used in 7450 and 970 CPUs does not support those insns (they were introduced in v2.06, while the 970 supports the ISA only up to v2.03).
Example of why it matters (and how I discovered the problem in the first place): `nanoflann` examples use Eigen headers and invoke VSX when the build is for ppc32 on macOS, see: https://trac.macports.org/ticket/66602
Additional PPC-related fix: for macOS ppc64 to be recognized, either `__ppc64__` or `__POWERPC__` (which encompasses both `__ppc__` and `__ppc64__`) should be specified. The latter is perhaps preferable, since it would also cover BeOS.
(I am not sure if any non-Apple OS uses `__ppc__`; if not, `__POWERPC__` is sufficient.)",Sergey Fedorov,2023-01-12T16:33:34.323Z,NA,NA,"## Title: | |
Altivec fixes for Darwin: do not use unsupported VSX insns | |
## Authors: | |
Sergey Fedorov | |
## Summary: | |
This merge request addresses issues with the handling of Altivec support in the Eigen C++ library on macOS systems, specifically regarding the use of unsupported VSX instructions on older PowerPC CPUs. | |
### Key Changes: | |
- Updated macros for VSX detection to function correctly on macOS, avoiding the use of unsupported VSX instructions on CPUs like 7450 and 970. | |
- Extended PowerPC detection so that macOS ppc64 is recognized, via `__ppc64__` or `__POWERPC__` (the description prefers the latter, which covers both `__ppc__` and `__ppc64__`).
### Improvements: | |
- Ensures compatibility and stability of the Eigen library on macOS, particularly for PPC architecture builds, by preventing the compilation of unsupported instructions. | |
### Impact: | |
- Improves the reliability of Eigen for ppc32 builds on macOS by avoiding VSX instructions that these CPUs cannot execute."
1162 (https://gitlab.com/libeigen/eigen/-/merge_requests/1162),"Fix QR, again",This is a rollback of https://gitlab.com/libeigen/eigen/-/commit/6156797016164b87b3e360e02d0e4107f7f66fbc after fixing a build error due to conflicting definitions of `StorageIndex`.,Charles Schlosser,2023-01-13T03:23:18.268Z,NA,NA,"## Title: | |
Fix QR, again | |
## Authors: | |
Charles Schlosser | |
## Summary: | |
This merge request rolls back commit 6156797016164b87b3e360e02d0e4107f7f66fbc in the Eigen library, after fixing a build error caused by conflicting definitions of `StorageIndex`.
### Key Changes:
- Rollback of commit 6156797016164b87b3e360e02d0e4107f7f66fbc.
### Improvements:
- Fixes the build error caused by conflicting definitions of `StorageIndex`.
### Impact: | |
- Ensures the QR functionality is stable and compiles successfully without build errors, maintaining the integrity of the Eigen library." | |
1167 (https://gitlab.com/libeigen/eigen/-/merge_requests/1167),avoid move assignment in ColPivHouseholderQR,"
### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
Fussy compilers may not like initializing variables with a move assignment, e.g. `m_colsPermutation = PermutationType(cols)`. This can be circumvented by simply calling `m_colsPermutation.resize(cols);`, which serves the same purpose.
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2023-01-15T01:34:10.980Z,NA,NA,"## Title: | |
Avoid move assignment in ColPivHouseholderQR | |
## Authors: | |
Charles Schlosser | |
## Summary: | |
This merge request addresses an issue with move assignment in the `ColPivHouseholderQR` class of the Eigen C++ library. It aims to replace problematic move assignments that may cause issues with certain compilers, ensuring more robust code. | |
### Key Changes: | |
- Replaced the move assignment for `m_colsPermutation` with a direct call to `resize(cols)` for better compatibility with fussy compilers. | |
### Improvements: | |
- Enhanced code stability by eliminating potential initialization issues caused by move assignment. | |
### Impact: | |
- Increased compatibility of the library with various compilers, potentially avoiding compilation errors and improving overall code robustness." | |
1165 (https://gitlab.com/libeigen/eigen/-/merge_requests/1165),Add missing EIGEN_DEVICE_FUNC in a few places when called by asserts.,"Also removed an old gcc 4.7 workaround which is UB anyways, silenced some pedantic warnings when internal asserts are enabled, and added a missing `inline` specifier.",Antonio Sánchez,2023-01-15T02:06:17.785Z,NA,NA,"## Title: | |
Add missing EIGEN_DEVICE_FUNC in a few places when called by asserts. | |
## Authors: | |
Antonio Sánchez | |
## Summary: | |
This merge request adds the `EIGEN_DEVICE_FUNC` annotation in a few places that are called by asserts in the Eigen C++ library. It also removes an old workaround and silences some compiler warnings.
### Key Changes:
- Added missing `EIGEN_DEVICE_FUNC` in a few places that are called by asserts.
- Removed an old gcc 4.7 workaround that is undefined behavior (UB) anyway.
- Silenced pedantic warnings when internal asserts are enabled. | |
- Included a missing `inline` specifier in relevant places. | |
### Improvements: | |
- Improved compatibility and correctness of assertions in the library. | |
- Enhanced code clarity by removing obsolete workarounds and addressing warnings. | |
### Impact: | |
These changes lead to more robust and cleaner code, potentially reducing issues related to assertions and improving the development experience by decreasing compiler warnings." | |
1164 (https://gitlab.com/libeigen/eigen/-/merge_requests/1164),improve sparse permutations,"
### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
Currently, sparse permutations are performed in the following manner: | |
1. the right hand side expression is evaluated into a sparse matrix temporary (allocation) | |
2. the result is created in a new sparse matrix temporary (allocation) | |
3. the result is **assigned** to the destination (allocation) | |
If the inverse permutation is requested, then the inverse permutation is computed (allocation) | |
If the inner indices are permuted, the **transpose** of the matrix is constructed so that the transpose assignment implicitly sorts the new inner indices, which requires random access and is considerably slower. | |
This MR seeks to reduce the number of allocations and improve performance. The new sequence of events is: | |
1. if the right hand side is a plain sparse object, just reference it. otherwise, evaluate it into a temporary (no allocation in many cases) | |
2. the result is created in a new sparse matrix temporary (allocation) | |
3. the result is **moved** to the destination (no allocation if the objects are compatible, i.e. storage orders match) | |
If the inner indices are permuted, and the inverse permutation is requested, then the inverse permutation is computed (allocation in one of the four use cases). | |
If the inner indices are permuted, the elements with their updated indices are inserted into the result in an unsorted fashion. the indices are sorted in place after the matrix is finalized. | |
In summary, the previous strategy always had 3 copies of the matrix, and 50% of the use cases (outer/inner inverse permutation) required a copy of the permutation.
Now, permuting a plain matrix involves 1 copy, and 25% of use cases (inner inverse) require a copy of the permutation. In all cases, the permuted matrix is now constructed in a manner that allows contiguous chunks of data to be copied, instead of elements one-at-a-time in random order.
| Operation | Before | After | % Change | | |
| ------ | ------ | ----- | ----- | | |
| Outer | 7196 | 3847 | -47% | | |
| Inner | 16125 | 4069 | -75% | | |
| Inverse Outer | 6895 | 4433 | -36% | | |
| Inverse Inner | 15918 | 4257 | -73% | | |
Note that the largest improvement is due to avoiding the construction of the transposed matrix (inner permutations). All cases benefit from two fewer copies of the matrix (reference `xpr`, move `result`).
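The inner-permutation strategy described above (copy each outer vector as a contiguous chunk with relabelled indices, then sort in place) can be sketched in plain C++. This is an illustration only, not Eigen's code: `Entry`, `CompressedMatrix` and `permuteInner` are hypothetical names, and the sketch assumes index and value are stored together so that `std::sort` can reorder them directly.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical compressed layout for illustration. Eigen keeps indices and
// values in separate arrays; this sketch pairs them to keep the sort simple.
struct Entry {
  int inner;
  double value;
};

struct CompressedMatrix {
  std::vector<int> outerStart;  // size outerSize + 1
  std::vector<Entry> entries;   // entries of each outer vector, contiguous
};

// Applies perm to the inner indices: an entry at inner index i moves to perm[i].
// The entries are copied in bulk and relabelled, then each outer vector is
// sorted in place, so no transposed copy of the matrix is ever built.
inline CompressedMatrix permuteInner(const CompressedMatrix& src, const std::vector<int>& perm) {
  CompressedMatrix dst;
  dst.outerStart = src.outerStart;
  dst.entries = src.entries;  // contiguous copy, indices still unpermuted

  // Relabel the inner indices; each outer vector is now unsorted.
  for (Entry& e : dst.entries) {
    assert(e.inner >= 0 && static_cast<std::size_t>(e.inner) < perm.size());
    e.inner = perm[static_cast<std::size_t>(e.inner)];
  }

  // Restore sorted inner indices within each outer vector.
  for (std::size_t j = 0; j + 1 < dst.outerStart.size(); ++j) {
    std::sort(dst.entries.begin() + dst.outerStart[j],
              dst.entries.begin() + dst.outerStart[j + 1],
              [](const Entry& a, const Entry& b) { return a.inner < b.inner; });
  }
  return dst;
}
```

An outer permutation needs no sorting at all, since whole outer vectors move as units; that is consistent with the smaller gains the table reports for the outer cases.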
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2023-01-15T03:21:25.855Z,NA,NA,"## Title: | |
Improve Sparse Permutations | |
## Authors: | |
Charles Schlosser | |
## Summary: | |
This merge request focuses on enhancing the performance of sparse permutations within the Eigen C++ library by reducing memory allocations and improving data handling strategies. The changes significantly optimize the process of permuting sparse matrices, leading to faster operations and less resource consumption. | |
### Key Changes: | |
- Revised the method of handling sparse matrix permutations to minimize temporary allocations. | |
- Reduced the number of matrix copies from three to one when permuting a plain sparse matrix.
- The inverse permutation is now computed only when inner indices are permuted with an inverse permutation (one of the four use cases, down from two).
- Implemented in-place sorting of indices post-finalization of the permuted matrix. | |
### Improvements: | |
- Reduced memory allocations during permutation operations, leading to streamlined processes. | |
- Enhanced performance across various operations, particularly in inner permutations and inverse permutations. | |
- Higher efficiency in managing data arrangements by facilitating contiguous data copying rather than random access. | |
### Impact: | |
- The reported figures dropped by 73% to 75% for inner and inverse inner permutations and by 36% to 47% for outer and inverse outer permutations. | |
- Applications that permute sparse matrices heavily benefit most from the reduced copying and allocation." | |
1168 (https://gitlab.com/libeigen/eigen/-/merge_requests/1168),Support per-thread is_malloc_allowed() state,"<!-- | |
Thanks for contributing a merge request! Please name and fully describe your MR as you would for a commit message. | |
If the MR fixes an issue, please include ""Fixes #issue"" in the commit message and the MR description. | |
In addition, we recommend that first-time contributors read our [contribution guidelines](https://eigen.tuxfamily.org/index.php?title=Contributing_to_Eigen) and [git page](https://eigen.tuxfamily.org/index.php?title=Git), which will help you submit a more standardized MR. | |
Before submitting the MR, you also need to complete the following checks: | |
- Make one PR per feature/bugfix (don't mix multiple changes into one PR). Avoid committing unrelated changes. | |
- Rebase before committing | |
- For code changes, run the test suite (at least the tests that are likely affected by the change). | |
See our [test guidelines](https://eigen.tuxfamily.org/index.php?title=Tests). | |
- If possible, add a test (both for bug-fixes as well as new features) | |
- Make sure new features are documented | |
Note that we are a team of volunteers; we appreciate your patience during the review process. | |
Again, thanks for contributing! --> | |
### Reference issue | |
#2576 | |
### What does this implement/fix? | |
This merge request makes the state of `is_malloc_allowed()` thread-local, so it can be used in multi-threaded applications without data races. | |
### Additional information | |
By default, the `is_malloc_allowed()` state is now `thread_local` on systems | |
that support it. To disable this behavior, and to use a single global | |
variable instead (e.g. for single-threaded applications), the compiler | |
flag `-DEIGEN_AVOID_THREAD_LOCAL=1` can be used.",tttapa,2023-01-16T01:34:57.246Z,NA,NA,"## Title: | |
Support per-thread is_malloc_allowed() state | |
## Authors: | |
tttapa | |
## Summary: | |
This merge request introduces thread-local storage for the `is_malloc_allowed()` state in the Eigen C++ library, enhancing its usage in multi-threaded applications to avoid data races. | |
### Key Changes: | |
- The `is_malloc_allowed()` state is now `thread_local` by default on systems that support it. | |
- A compiler flag (`-DEIGEN_AVOID_THREAD_LOCAL=1`) is provided to revert to a global state for single-threaded applications. | |
### Improvements: | |
- Thread-local storage allows for safer concurrent usage of memory allocation checks in multi-threaded scenarios. | |
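The difference between a global and a per-thread flag can be sketched in standard C++. All names below (`SKETCH_AVOID_THREAD_LOCAL`, `sketch_is_malloc_allowed`, and so on) are hypothetical stand-ins and not Eigen's implementation; the sketch only assumes a compiler with C++11 `thread_local` support.

```cpp
#include <cassert>
#include <thread>

// Hypothetical opt-out macro, loosely modelled on the flag described above.
#ifdef SKETCH_AVOID_THREAD_LOCAL
#define SKETCH_STORAGE static               // one flag shared by all threads
#else
#define SKETCH_STORAGE static thread_local  // one flag per thread
#endif

// Returns a reference to the flag visible to the calling thread.
inline bool& sketch_malloc_flag() {
  SKETCH_STORAGE bool allowed = true;
  return allowed;
}

inline bool sketch_is_malloc_allowed() { return sketch_malloc_flag(); }

inline void sketch_set_is_malloc_allowed(bool allowed) {
  sketch_malloc_flag() = allowed;
}

// Starts a worker thread and reports the flag value that the worker observes.
inline bool flag_seen_by_new_thread() {
  bool seen = false;
  std::thread worker([&seen] { seen = sketch_is_malloc_allowed(); });
  worker.join();
  return seen;
}
```

With per-thread storage, a thread that forbids allocation around its own critical section does not change what other threads observe, so no synchronization is needed. With the opt-out macro defined, all threads share one flag again.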
### Impact: | |
This change removes the data race on the `is_malloc_allowed()` state, so the check can be used safely in multi-threaded applications. The `-DEIGEN_AVOID_THREAD_LOCAL=1` flag keeps the previous single global variable available for single-threaded applications." | |
1126 (https://gitlab.com/libeigen/eigen/-/merge_requests/1126),[SYCL-2020 Support] Enabling Intel DPCPP Compiler support to Eigen,"<!-- | |
Thanks for contributing a merge request! Please name and fully describe your MR as you would for a commit message. | |
If the MR fixes an issue, please include ""Fixes #issue"" in the commit message and the MR description. | |
In addition, we recommend that first-time contributors read our [contribution guidelines](https://eigen.tuxfamily.org/index.php?title=Contributing_to_Eigen) and [git page](https://eigen.tuxfamily.org/index.php?title=Git), which will help you submit a more standardized MR. | |
Before submitting the MR, you also need to complete the following checks: | |
- Make one PR per feature/bugfix (don't mix multiple changes into one PR). Avoid committing unrelated changes. | |
- Rebase before committing | |
- For code changes, run the test suite (at least the tests that are likely affected by the change). | |
See our [test guidelines](https://eigen.tuxfamily.org/index.php?title=Tests). | |
- If possible, add a test (both for bug-fixes as well as new features) | |
- Make sure new features are documented | |
Note that we are a team of volunteers; we appreciate your patience during the review process. | |
Again, thanks for contributing! --> | |
### Additional information | |
<!--Any additional information you think is important.--> | |
This PR enables the [Intel open-source DPC++ compiler](https://github.com/intel/llvm) for the Eigen SYCL backend. | |
C++17 features have been enabled because SYCL 2020 requires them. | |
With this compiler, the Eigen SYCL backend can be compiled and run on a set of accelerators including (but not limited to) IntelCPU, IntelGPU, NvidiaGPU, etc.",Mehdi Goli,2023-01-16T07:04:08.926Z,NA,NA,"## Title: | |
[SYCL-2020 Support] Enabling Intel DPCPP Compiler support to Eigen | |
## Authors: | |
Mehdi Goli | |
## Summary: | |
This merge request introduces support for the Intel DPC++ compiler in the Eigen library, particularly for the SYCL backend. This enhancement ensures compatibility with SYCL-2020 features and broadens the range of hardware accelerators supported by Eigen. | |
### Key Changes: | |
- Enabled Intel's open-source DPC++ compiler for the Eigen SYCL backend. | |
- Incorporated C++17 features to comply with SYCL-2020 requirements. | |
### Improvements: | |
- Expanded support for executing Eigen on various accelerators, including IntelCPU, IntelGPU, and NvidiaGPU. | |
### Impact: | |
The integration enhances portabi |