Last active
March 19, 2025 13:57
MR ID/Link,Title/Subject,Description/Summary,Author,Merge Date,Category/Labels,Impacted Areas/Components | |
515 (https://gitlab.com/libeigen/eigen/-/merge_requests/515),Add random matrix generation via SVD,"Add random matrix generation via singular value decomposition as proposed by C. C. Paige and M. A. Saunders. | |
Fixes #2250. | |
### Reference issue | |
See #2250 for details. | |
### What does this implement/fix? | |
Adds a generator for random matrices with prescribed singular values. This allows for fine-tuning the difficulty of test problems with respect to the l2-norm. Two strategies for generating the singular values and unit tests for verification are included.",Kolja Brix,2021-08-23T16:00:05.986Z,NA,NA | |
544 (https://gitlab.com/libeigen/eigen/-/merge_requests/544),Add support for Eigen::Block types to GDB pretty printer,"### Reference issue | |
#1539 | |
### What does this implement/fix? | |
Add support for Eigen::Block types to GDB pretty printer. | |
### Additional information | |
* Thanks a lot to Allan Leal who provided the patch, see also #1539. | |
* See also !543.",Kolja Brix,2021-08-23T16:11:49.626Z,NA,NA | |
606 (https://gitlab.com/libeigen/eigen/-/merge_requests/606),removed sparse dynamic matrix,"It was deprecated already. | |
It is an API break in unsupported.",Jens Wehner,2021-08-24T15:53:34.943Z,NA,NA | |
607 (https://gitlab.com/libeigen/eigen/-/merge_requests/607),Add flowchart to unsupported sparse iterative solvers,"Add a flowchart to help people choose the best solver for their problem. | |
It is a dot graph and it is not really pretty but it gets the job done. | |
I am a dot beginner so feedback and assistance is very welcome.",Jens Wehner,2021-08-24T17:12:05.554Z,NA,NA | |
608 (https://gitlab.com/libeigen/eigen/-/merge_requests/608),Remove c++11-off CI jobs.,This is step 1 in transitioning beyond c++03.,Antonio Sánchez,2021-08-24T17:59:45.188Z,NA,NA | |
609 (https://gitlab.com/libeigen/eigen/-/merge_requests/609),optimize predux if architecture is aarch64,"### What does this implement/fix? | |
This PR is going to optimize predux, predux_min and predux_max. | |
On aarch64 with NEON, we can use the `v(add|min|max)v` intrinsics to do the reduction, because using many `vp(add|min|max)` calls hurts performance.",Han-Kuan Chen,2021-08-25T19:18:55.109Z,NA,NA | |
489 (https://gitlab.com/libeigen/eigen/-/merge_requests/489),AVX512 and AVX2 support for Packet16i and Packet8i added,"<!-- | |
Thanks for contributing a merge request! Please name and fully describe your MR as you would for a commit message. | |
If the MR fixes an issue, please include ""Fixes #issue"" in the commit message and the MR description. | |
In addition, we recommend that first-time contributors read our [contribution guidelines](https://eigen.tuxfamily.org/index.php?title=Contributing_to_Eigen) and [git page](https://eigen.tuxfamily.org/index.php?title=Git), which will help you submit a more standardized MR. | |
Before submitting the MR, you also need to complete the following checks: | |
- Make one PR per feature/bugfix (don't mix multiple changes into one PR). Avoid committing unrelated changes. | |
- Rebase before committing | |
- For code changes, run the test suite (at least the tests that are likely affected by the change). | |
See our [test guidelines](https://eigen.tuxfamily.org/index.php?title=Tests). | |
- If possible, add a test (both for bug-fixes as well as new features) | |
- Make sure new features are documented | |
Note that we are a team of volunteers; we appreciate your patience during the review process. | |
Again, thanks for contributing! --> | |
### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
#2244 | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
Adds support for Packet16i and Packet8i for AVX512 and AVX2 backends respectively. | |
### Additional information | |
<!--Any additional information you think is important.-->",Jakub Lichman,2021-08-25T19:38:24.142Z,NA,NA | |
610 (https://gitlab.com/libeigen/eigen/-/merge_requests/610),Bump CMake files to at least c++11.,"Removed all configurations that explicitly test or set the c++ standard | |
flags. The only place the standard is now configured is at the top of | |
the main `CMakeLists.txt` file, which can easily be updated (e.g. if | |
we decide to move to c++14+). This can also be set via command-line using | |
``` | |
> cmake -DCMAKE_CXX_STANDARD=14 | |
``` | |
Kept the `EIGEN_TEST_CXX11` flag for now - that still controls whether to | |
build/run the `cxx11_*` tests. We will likely end up renaming these | |
tests and removing the `CXX11` subfolder.",Antonio Sánchez,2021-08-25T20:24:09.725Z,NA,NA | |
605 (https://gitlab.com/libeigen/eigen/-/merge_requests/605),SparseExtra: updated RandomSetter,"As the master branch has C++11 now | |
- updated the `RandomSetter` to use unordered_map",Jens Wehner,2021-08-25T20:47:41.570Z,NA,NA | |
543 (https://gitlab.com/libeigen/eigen/-/merge_requests/543),Fix PEP8 and formatting issues in GDB pretty printer.,"### What does this implement/fix? | |
Several PEP8 and formatting issues were found in the GDB pretty printer, which uses the Python interface of GDB to visualize the values of Eigen data structures. These get fixed in this MR. | |
### Additional information | |
Details on which PEP8 errors or warnings were found are listed in the individual commit messages.",Kolja Brix,2021-08-26T15:22:28.836Z,NA,NA | |
611 (https://gitlab.com/libeigen/eigen/-/merge_requests/611),included unordered_map header,fixes https://gitlab.com/libeigen/eigen/-/issues/2311,Jens Wehner,2021-08-27T16:53:32.855Z,NA,NA | |
613 (https://gitlab.com/libeigen/eigen/-/merge_requests/613),Fix fix<N> when variable templates are not supported.,"There were some typos that checked `EIGEN_HAS_CXX14` that should have | |
checked `EIGEN_HAS_CXX14_VARIABLE_TEMPLATES`, causing a mismatch | |
in some of the `Eigen::fix<N>` assumptions. | |
Also fixed the `symbolic_index` test when | |
`EIGEN_HAS_CXX14_VARIABLE_TEMPLATES` is 0. | |
Fixes #2308",Antonio Sánchez,2021-08-30T16:06:51.591Z,NA,NA | |
612 (https://gitlab.com/libeigen/eigen/-/merge_requests/612),Add EIGEN_TENSOR_PLUGIN support per issue #2052.,"### Reference issue | |
Fixes #2052 | |
### What does this implement/fix? | |
Adds support for using EIGEN_TENSOR_PLUGIN, EIGEN_TENSORBASE_PLUGIN, and EIGEN_READONLY_TENSORBASE_PLUGIN to expand the functionality of those classes. | |
### Additional information | |
<!--Any additional information you think is important.-->",Turing Eret,2021-08-30T19:36:56.697Z,NA,NA | |
614 (https://gitlab.com/libeigen/eigen/-/merge_requests/614),Lapack flags,Allow old Fortran code for LAPACK tests to compile despite argument mismatch errors (REAL passed to COMPLEX workspace argument) with GNU Fortran 10.,Rasmus Munk Larsen,2021-08-30T20:24:00.026Z,NA,NA | |
615 (https://gitlab.com/libeigen/eigen/-/merge_requests/615),win: include intrin header for Windows on ARM,"It is necessary to include the intrin header for BitScanReverse and BitScanReverse64. | |
Fixes #2314",Ádám Kallai,2021-08-31T15:14:03.743Z,NA,NA | |
616 (https://gitlab.com/libeigen/eigen/-/merge_requests/616),Disable cuda Eigen::half vectorization on host.,"All cuda `__half` functions are device-only in CUDA 9, including | |
conversions. Host-side conversions were added in CUDA 10. | |
The existing code doesn't build prior to 10.0. | |
All arithmetic functions are always device-only, so there's | |
no reason to use vectorization on the host at all. | |
Modified the code to disable vectorization for `__half` on host, | |
which required also updating the `TensorReductionGpu` implementation | |
which previously made assumptions about available packets.",Antonio Sánchez,2021-08-31T19:29:19.377Z,NA,NA | |
621 (https://gitlab.com/libeigen/eigen/-/merge_requests/621),GCC 4.8 arm EIGEN_OPTIMIZATION_BARRIER fix (#2315).,"GCC 4.8 doesn't seem to like the `g` register constraint, failing to | |
compile with ""error: 'asm' operand requires impossible reload"". | |
Tested `r` instead, and that seems to work, even with latest compilers. | |
Also fixed some minor macro issues to eliminate warnings on armv7. | |
Fixes #2315.",Antonio Sánchez,2021-08-31T20:37:12.522Z,NA,NA | |
619 (https://gitlab.com/libeigen/eigen/-/merge_requests/619),fixed unsupported linear solvers documentation,Fixed the header of the unsupported sparse iterative solvers and deleted a commented out include which no longer exists.,Jens Wehner,2021-08-31T23:15:13.631Z,NA,NA | |
629 (https://gitlab.com/libeigen/eigen/-/merge_requests/629),Fix EIGEN_OPTIMIZATION_BARRIER for arm-clang.,"Clang doesn't like !621, needs the ""g"" constraint back. | |
The ""g"" constraint also works for GCC >= 5. | |
This fixes our gitlab CI.",Antonio Sánchez,2021-09-01T16:39:50.615Z,NA,NA | |
628 (https://gitlab.com/libeigen/eigen/-/merge_requests/628),Rename 'vec_all_nan' of cxx11_tensor_expr test because this symbol is used by altivec.h,"Rename 'vec_all_nan' of cxx11_tensor_expr test because this symbol is used by altivec.h | |
This patch fixes the build failures of ppc64le tests.",Maxiwell S. Garcia,2021-09-01T17:00:06.170Z,NA,NA | |
485 (https://gitlab.com/libeigen/eigen/-/merge_requests/485),cmake: remove deprecated package config variables,"### What does this implement/fix? | |
This MR removes deprecated `EIGEN3_*` variables and related CMake files that were provided for compatibility purposes with previously provided find module. Their use, however, was discouraged and therefore declared deprecated almost five years ago with the introduction of the relocatable package config in 5c516e4e0a1290b9a233c8f3c379fd6bde5ef9c2. | |
### Additional information | |
Removing deprecated CMake variables exposed by the package config can help avoid subtle errors that can be caused by intermixing `FindEigen3.cmake` and the `Eigen3Config.cmake` package config. The documentation does not mention the deprecated CMake variables anyway. The changes also effectively eliminate the need for workarounds required to solve #1386. | |
We should also likely remove `FindEigen3.cmake` (and maybe `FindEigen2.cmake`?) which mostly duplicate `Eigen3Config.cmake` while preventing best CMake practices by caching all the paths and increasing the maintenance burden at no additional value.",Sergiu Deitsch,2021-09-01T17:22:28.249Z,NA,NA | |
630 (https://gitlab.com/libeigen/eigen/-/merge_requests/630),Fix AVX integer packet issues.,"Most are instances of AVX2 functions not protected by | |
`EIGEN_VECTORIZE_AVX2`. There was also a missing semi-colon | |
for AVX512.",Antonio Sánchez,2021-09-01T21:32:51.342Z,NA,NA | |
622 (https://gitlab.com/libeigen/eigen/-/merge_requests/622),(GPU Testing Part 1) Rename Tuple -> Pair.,"This is to make way for a new `Tuple` class that mimics `std::tuple`, | |
but can be reliably used on device and with aligned Eigen types. | |
The existing Tuple has very few references, and is actually an | |
analogue of `std::pair`. | |
This is part 1 of a set of changes to simplify creating generic GPU tests.",Antonio Sánchez,2021-09-02T02:36:06.654Z,NA,NA | |
618 (https://gitlab.com/libeigen/eigen/-/merge_requests/618),Missing EIGEN_DEVICE_FUNCs to get `gpu_basic` passing with CUDA 9.,"CUDA 9 seems to require labelling defaulted constructors as | |
`EIGEN_DEVICE_FUNC`, despite giving warnings that such labels are | |
ignored. Without these labels, the `gpu_basic` test fails to | |
compile, with errors about calling `__host__` functions from | |
`__host__ __device__` functions. | |
With this and !616, the `gpu_basic` test now passes for CUDA 9.1.",Antonio Sánchez,2021-09-02T03:21:08.665Z,NA,NA | |
632 (https://gitlab.com/libeigen/eigen/-/merge_requests/632),cmake: remove unused interface definitions,"### What does this implement/fix? | |
This MR removes `EIGEN_DEFINITIONS` from interface definitions as `EIGEN_DEFINITIONS` is not defined anywhere.",Sergiu Deitsch,2021-09-02T16:07:39.760Z,NA,NA | |
633 (https://gitlab.com/libeigen/eigen/-/merge_requests/633),cmake: use ARCH_INDEPENDENT versioning if available,"### What does this implement/fix? | |
CMake 3.14 added the `ARCH_INDEPENDENT` option to `write_basic_package_version_file` from the `CMakePackageConfigHelpers` module greatly simplifying versioning of architecture independent package configs as used by Eigen. | |
This MR adds an alternative code path which makes use of the option. The legacy code path can be removed once Eigen increases the minimum required CMake version to at least 3.14.",Sergiu Deitsch,2021-09-02T16:09:12.711Z,NA,NA | |
617 (https://gitlab.com/libeigen/eigen/-/merge_requests/617),Matrixmarket extension,"### What does this implement/fix? | |
Currently the matrixmarket reader/writer can only parse sparse matrices and dense dynamic vectors; this extends the reader/writer to read and write any kind of dense matrix. Reading self-adjoint and triangular matrices is not yet supported. | |
Also added the documentation for these features.",Jens Wehner,2021-09-02T17:23:33.992Z,NA,NA | |
634 (https://gitlab.com/libeigen/eigen/-/merge_requests/634),cmake: populate package registry by default,"### What does this implement/fix? | |
With CMake 3.15 and above the `export` command does not populate the package registry by default unless the `CMAKE_EXPORT_PACKAGE_REGISTRY` variable is set. For backwards compatibility, prefer the old behavior (even though the old one occasionally causes some [confusion](https://gitlab.com/libeigen/eigen/-/issues/1386#note_254719335) and is considered deprecated.)",Sergiu Deitsch,2021-09-02T17:52:54.570Z,NA,NA | |
635 (https://gitlab.com/libeigen/eigen/-/merge_requests/635),Fix tridiagonalization_inplace_selector.,"The `Options` of the new `hCoeffs` vector do not necessarily match | |
those of the `MatrixType`, leading to build errors if they differ. Having the | |
`CoeffVectorType` be a template parameter relieves this restriction.",Antonio Sánchez,2021-09-02T19:45:01.334Z,NA,NA | |
636 (https://gitlab.com/libeigen/eigen/-/merge_requests/636),Remove stray DynamicSparseMatrix references.,DynamicSparseMatrix has been removed. These shouldn't be here anymore.,Antonio Sánchez,2021-09-02T20:03:43.632Z,NA,NA | |
637 (https://gitlab.com/libeigen/eigen/-/merge_requests/637),Remove more DynamicSparseMatrix references.,Missed some in unsupported/. Also fixed some typos clang-tidy was complaining about.,Antonio Sánchez,2021-09-02T22:53:22.916Z,NA,NA | |
638 (https://gitlab.com/libeigen/eigen/-/merge_requests/638),Add missing packet types in pset1 call.,"Oops, introduced this when ""fixing"" integer packets.",Antonio Sánchez,2021-09-02T23:39:03.436Z,NA,NA | |
639 (https://gitlab.com/libeigen/eigen/-/merge_requests/639),Fix AVX2 PacketMath.h.,"There were a couple typos ps -> epi32, and an unaligned load issue.",Antonio Sánchez,2021-09-03T20:03:50.162Z,NA,NA | |
482 (https://gitlab.com/libeigen/eigen/-/merge_requests/482),Add LLDB Pretty Printer,"### What does this implement/fix? | |
An LLDB synthetic child provider was implemented, with which the LLDB debugger can show the items in a structured view. | |
This script supports dense matrices with fixed or dynamic storage, and compressed or uncompressed sparse matrices. Both row-major and column-major matrices are supported. | |
#### Usage | |
Import the script into LLDB using the following command: | |
``` | |
command script import eigenlldb.py | |
``` | |
#### Effects | |
##### Before | |
Previous screenshots during debugging in CLion on Windows: | |
 | |
 | |
##### After | |
The effects in CLion on Windows: | |
 | |
 | |
The effects in command line on Ubuntu 20.04 with LLDB 10: | |
``` | |
(lldb) frame variable vector3 | |
(Eigen::Vector3d) vector3 = ([0,0] = 1, [1,0] = 1, [2,0] = 1) | |
(lldb) frame variable matrix3 | |
(Eigen::Matrix3d) matrix3 = ([0,0] = 1, [1,0] = 1, [2,0] = 1, [0,1] = 1, [1,1] = 1, [2,1] = 1, [0,2] = 1, [1,2] = 1, [2,2] = 1) | |
(lldb) frame variable sparse_mat | |
(Eigen::SparseMatrix<double, 0, int>) sparse_mat = ([0,0] = 1, [0,1] = 2, [1,1] = 3) | |
``` | |
","Huang, Zhaoquan",2021-09-07T17:28:25.975Z,NA,NA | |
624 (https://gitlab.com/libeigen/eigen/-/merge_requests/624),(GPU Testing Part 3) Add a simple serialization mechanism.,"The `Serializer<T>` class implements a binary serialization that | |
can write to (`serialize`) and read from (`deserialize`) a byte | |
buffer. Also added convenience routines for serializing | |
a list of arguments. | |
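As a rough illustration of the idea, a byte-buffer serializer for trivially copyable types might look like the sketch below. The names `PodSerializer` and `serialize_all` are hypothetical and are not taken from this MR; the actual `Serializer<T>` interface may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for illustration; not Eigen's Serializer<T>.
// Only valid for trivially copyable types.
template <typename T>
struct PodSerializer {
  // Number of bytes needed to store one value.
  static std::size_t size(const T&) { return sizeof(T); }

  // Copies value into [dest, end). Returns the next write position,
  // or nullptr if the buffer is too small.
  static std::uint8_t* serialize(std::uint8_t* dest, std::uint8_t* end, const T& value) {
    if (dest == nullptr) return nullptr;
    assert(dest <= end);
    if (static_cast<std::size_t>(end - dest) < sizeof(T)) return nullptr;
    std::memcpy(dest, &value, sizeof(T));
    return dest + sizeof(T);
  }

  // Reads value from [src, end). Returns the next read position,
  // or nullptr if not enough bytes remain.
  static const std::uint8_t* deserialize(const std::uint8_t* src, const std::uint8_t* end, T& value) {
    if (src == nullptr) return nullptr;
    assert(src <= end);
    if (static_cast<std::size_t>(end - src) < sizeof(T)) return nullptr;
    std::memcpy(&value, src, sizeof(T));
    return src + sizeof(T);
  }
};

// Base case of the convenience routine: nothing left to write.
inline std::uint8_t* serialize_all(std::uint8_t* dest, std::uint8_t*) { return dest; }

// Convenience routine: serialize a list of arguments back to back.
template <typename T, typename... Rest>
std::uint8_t* serialize_all(std::uint8_t* dest, std::uint8_t* end, const T& first, const Rest&... rest) {
  dest = PodSerializer<T>::serialize(dest, end, first);
  return serialize_all(dest, end, rest...);
}
```

Returning the advanced pointer lets calls be chained, and a null result propagates through the chain when the buffer runs out.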
This will mainly be for testing, specifically to transfer data to | |
and from the GPU. | |
This is part 3 of a set of changes to simplify creating generic GPU tests.",Antonio Sánchez,2021-09-08T20:05:19.942Z,NA,NA | |
623 (https://gitlab.com/libeigen/eigen/-/merge_requests/623),(GPU Testing Part 2) Device-compatible Tuple implementation.,"An analogue of `std::tuple` that works on device. | |
Context: I've tried `std::tuple` in various versions of NVCC and clang, | |
and although code seems to compile, it often fails to run - generating | |
""illegal memory access"" errors, or ""illegal instruction"" errors. | |
This replacement does work on device. | |
This is part 2 of a set of changes to simplify creating generic GPU tests.",Antonio Sánchez,2021-09-08T22:34:02.622Z,NA,NA | |
641 (https://gitlab.com/libeigen/eigen/-/merge_requests/641),Remove unnecessary std::tuple reference.,"Doesn't seem to be used anyways, since we never actually include `<tuple>` anywhere.",Antonio Sánchez,2021-09-09T16:06:12.783Z,NA,NA | |
631 (https://gitlab.com/libeigen/eigen/-/merge_requests/631),Issue an error in case of direct inclusion of internal headers.,This change was mostly autogenerated.,Rasmus Munk Larsen,2021-09-10T19:12:27.443Z,NA,NA | |
625 (https://gitlab.com/libeigen/eigen/-/merge_requests/625),(GPU Testing Part 4) New GPU test utilities and example.,"This introduces three new functions: | |
``` | |
// returns kernel(args...) running on the CPU. | |
Eigen::run_on_cpu(Kernel kernel, Args&&... args); | |
// returns kernel(args...) running on the GPU. | |
Eigen::run_on_gpu(Kernel kernel, Args&&... args); | |
Eigen::run_on_gpu_with_hint(size_t buffer_capacity_hint, Kernel kernel, Args&&... args); | |
// returns kernel(args...) running on the GPU if using | |
// a GPU compiler, or CPU otherwise. | |
Eigen::run(Kernel kernel, Args&&... args); | |
Eigen::run_with_hint(size_t buffer_capacity_hint, Kernel kernel, Args&&... args); | |
``` | |
Running on the GPU is accomplished by: | |
- Serializing the kernel inputs on the CPU | |
- Transferring the inputs to the GPU | |
- Passing the kernel and serialized inputs to a GPU kernel | |
- Deserializing the inputs on the GPU | |
- Running `kernel(inputs...)` on the GPU | |
- Serializing all output parameters and the return value | |
- Transferring the serialized outputs back to the CPU | |
- Deserializing the outputs and return value on the CPU | |
- Returning the deserialized return value | |
All inputs must be serializable (currently POD types, `Eigen::Matrix` | |
and `Eigen::Array`). The kernel must also be POD (though usually | |
contains no actual data). | |
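The round trip listed above can be sketched on the host alone. This is only an emulation under stated assumptions: no GPU is involved, plain vector copies stand in for the device transfers, and `AddKernel`, `fake_device_entry` and `run_via_buffers` are hypothetical names, not the functions introduced by this MR.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical POD kernel that holds no data.
struct AddKernel {
  int operator()(int a, int b) const { return a + b; }
};

// Stand-in for the device side: deserialize the inputs, run the kernel,
// then serialize the return value into the output buffer.
template <typename Kernel>
void fake_device_entry(Kernel kernel, const std::vector<std::uint8_t>& in,
                       std::vector<std::uint8_t>& out) {
  assert(in.size() == 2 * sizeof(int));
  int a = 0;
  int b = 0;
  std::memcpy(&a, in.data(), sizeof(int));
  std::memcpy(&b, in.data() + sizeof(int), sizeof(int));
  const int result = kernel(a, b);
  out.resize(sizeof(int));
  std::memcpy(out.data(), &result, sizeof(int));
}

template <typename Kernel>
int run_via_buffers(Kernel kernel, int a, int b) {
  // Serialize the inputs on the host.
  std::vector<std::uint8_t> host_in(2 * sizeof(int));
  std::memcpy(host_in.data(), &a, sizeof(int));
  std::memcpy(host_in.data() + sizeof(int), &b, sizeof(int));
  // Transfer to the device (a plain copy here; a real version would
  // use a device memory copy instead).
  std::vector<std::uint8_t> device_in = host_in;
  std::vector<std::uint8_t> device_out;
  fake_device_entry(kernel, device_in, device_out);
  // Transfer the serialized output back and deserialize it.
  std::vector<std::uint8_t> host_out = device_out;
  int result = 0;
  std::memcpy(&result, host_out.data(), sizeof(int));
  return result;
}
```

Calling `run_via_buffers(AddKernel{}, 2, 3)` should give the same value as calling the kernel directly, which is the property a generic GPU test would check.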
Tested on CUDA 9.1, 10.2, 11.3, with g++-6, g++-8, g++-10 respectively. | |
This MR depends on !622, !623, !624.",Antonio Sánchez,2021-09-10T22:33:06.246Z,NA,NA | |
643 (https://gitlab.com/libeigen/eigen/-/merge_requests/643),Minor fix for compilation error on HIP.,Minor fix to enable successful compilation on HIP.,Rohit Santhanam,2021-09-12T17:57:38.971Z,NA,NA | |
645 (https://gitlab.com/libeigen/eigen/-/merge_requests/645),Default eigen_packet_wrapper constructor.,"This makes it trivial, allowing use of `memcpy`. | |
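To illustrate why defaulting matters, the sketch below contrasts a user-provided constructor with a defaulted one. The two structs are hypothetical and are not Eigen's `eigen_packet_wrapper`; they only show how `std::is_trivial` reacts to the change.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Hypothetical wrapper with a user-provided constructor: not trivial.
struct UserProvidedCtor {
  UserProvidedCtor() {}
  int value;
};

// Hypothetical wrapper with a defaulted constructor: stays trivial.
struct DefaultedCtor {
  DefaultedCtor() = default;
  int value;
};

// Small helper so the trait can be queried in ordinary expressions.
template <typename T>
constexpr bool is_trivial_type() {
  return std::is_trivial<T>::value;
}

// Copies a trivial wrapper byte by byte.
inline DefaultedCtor copy_bytes(const DefaultedCtor& src) {
  DefaultedCtor dst;
  std::memcpy(&dst, &src, sizeof(DefaultedCtor));
  return dst;
}
```
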
Fixes #2326.",Antonio Sánchez,2021-09-14T19:47:50.618Z,NA,NA | |
648 (https://gitlab.com/libeigen/eigen/-/merge_requests/648),Fix typos in copyright dates,"### What does this implement/fix? | |
This just fixes some typos in copyright dates noticed when doing some extraction of copyright statements from the project. | |
It's on the 3.4 branch, but I can rebase to master if you prefer.",Rylie Pavlik,2021-09-15T20:46:24.854Z,NA,NA | |
647 (https://gitlab.com/libeigen/eigen/-/merge_requests/647),Clean up EIGEN_STATIC_ASSERT to only use standard c++11 static_assert.,"* Move static assertions out of constructors. | |
* Remove mechanism to turn static assertions into runtime checks and update affected tests. | |
* Break the large static assert in PlainObjectBase.h into individual checks to improve error messages.",Rasmus Munk Larsen,2021-09-16T20:43:54.846Z,NA,NA | |
651 (https://gitlab.com/libeigen/eigen/-/merge_requests/651),Remove -fabi-version=6 flag from AVX512 builds.,Fixes #2328,Rasmus Munk Larsen,2021-09-16T23:44:36.157Z,NA,NA | |
646 (https://gitlab.com/libeigen/eigen/-/merge_requests/646),Add buildtests_gpu and check_gpu to simplify GPU testing.,"This is in preparation of adding GPU tests to the CI, allowing | |
us to limit building/testing of GPU-specific tests for a given | |
GPU-capable runner. | |
GPU tests are tagged with the label ""gpu"". The new targets | |
``` | |
make buildtests_gpu | |
make check_gpu | |
``` | |
allow building and running only the gpu tests.",Antonio Sánchez,2021-09-17T01:06:15.209Z,NA,NA | |
653 (https://gitlab.com/libeigen/eigen/-/merge_requests/653),Disable specific subtests that fail on HIP due to non-functional device side malloc/free (on HIP).,"Disable specific subtests that use dynamic data structures since device side malloc/free is not currently available on HIP. | |
This functionality is forthcoming and once it is publicly available, these subtests will be reenabled. | |
/cc @cantonios",Rohit Santhanam,2021-09-17T16:37:35.636Z,NA,NA | |
649 (https://gitlab.com/libeigen/eigen/-/merge_requests/649),"Move Eigen::all,last,lastp1 back to Eigen::placeholders::.","These names are so common, IMO they should not exist directly in the | |
`Eigen::` namespace. This prevents us from using the `last` or `all` | |
names for any parameters or local variables, otherwise the compiler spews | |
warnings about shadowing or hiding the global values. Many external | |
projects (and our own examples) also heavily use | |
``` | |
using namespace Eigen; | |
``` | |
which means these conflict with external libraries as well, e.g. | |
`std::fill(first,last,value)`. | |
It seems originally these were placed in a separate namespace | |
`Eigen::placeholders`, which has since been deprecated. I propose | |
to un-deprecate this, and restore the original locations. | |
These symbols are also imported into `Eigen::indexing`, which | |
additionally imports `fix` and `seq`. An alternative is to remove the | |
`placeholders` namespace and stick with `indexing`. | |
NOTE: this is an API-breaking change. | |
Fixes #2321.",Antonio Sánchez,2021-09-17T17:44:08.696Z,API change,NA | |
652 (https://gitlab.com/libeigen/eigen/-/merge_requests/652),"Added a macro to pass arguments to ctest, e.g. to run tests in parallel.","Example: To build and run 32 tests in parallel, build Eigen with: | |
``` | |
cmake -DEIGEN_CTEST_ARGS=-j32 ../eigen | |
make -j32 check | |
```",Rasmus Munk Larsen,2021-09-17T18:33:13.440Z,NA,NA | |
654 (https://gitlab.com/libeigen/eigen/-/merge_requests/654),Silence string overflow warning for GCC in initializer_list_construction test.,"This looks to be a GCC bug. It doesn't seem to reproduce in a smaller example, | |
making it hard to isolate. | |
Warning: | |
``` | |
In file included from ../Eigen/Core:296, | |
from ../Eigen/QR:11, | |
from ../test/main.h:340, | |
from ../test/initializer_list_construction.cpp:10: | |
In constructor ‘Eigen::PlainObjectBase<Derived>::PlainObjectBase(const std::initializer_list<std::initializer_list<typename Eigen::internal::traits<T>::Scalar> >&) [with Derived = Eigen::Matrix<unsigned char, 5, 4, 0, 5, 4>]’, | |
inlined from ‘Eigen::Matrix<Scalar_, Rows_, Cols_, Options_, MaxRows_, MaxCols_>::Matrix(const std::initializer_list<std::initializer_list<typename Eigen::internal::traits<Eigen::Matrix<Scalar_, Rows_, Cols_, Options_, MaxRows_, MaxCols_> >::Scalar> >&) [with Scalar_ = unsigned char; int Rows_ = 5; int Cols_ = 4; int Options_ = 0; int MaxRows_ = 5; int MaxCols_ = 4]’ at ../Eigen/src/Core/Matrix.h:319:118, | |
inlined from ‘void initializerListMatrixConstruction() [with Scalar = unsigned char]’ at ../test/initializer_list_construction.cpp:176:26: | |
../Eigen/src/Core/PlainObjectBase.h:583:44: warning: writing 1 byte into a region of size 0 [-Wstringop-overflow=] | |
583 | coeffRef(row_index, col_index) = e; | |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~ | |
In file included from ../Eigen/Core:289, | |
from ../Eigen/QR:11, | |
from ../test/main.h:340, | |
from ../test/initializer_list_construction.cpp:10: | |
../Eigen/src/Core/DenseStorage.h: In function ‘void initializerListMatrixConstruction() [with Scalar = unsigned char]’: | |
../Eigen/src/Core/DenseStorage.h:48:5: note: at offset 30 into destination object ‘Eigen::internal::plain_array<unsigned char, 20, 0, 0>::array’ of size 20 | |
48 | T array[Size]; | |
| ^~~~~ | |
``` | |
I've verified we never actually access offset `30` in the destination.",Antonio Sánchez,2021-09-17T18:57:10.262Z,NA,NA | |
656 (https://gitlab.com/libeigen/eigen/-/merge_requests/656),Fix strict aliasing bug causing product_small failure.,"Packet loading is skipped due to aliasing violation, leading to nullopt matrix | |
multiplication. | |
Fixes #2327.",Antonio Sánchez,2021-09-17T21:24:32.775Z,NA,NA | |
655 (https://gitlab.com/libeigen/eigen/-/merge_requests/655),Run CI tests in parallel on all cores.,NA,Rasmus Munk Larsen,2021-09-17T22:35:23.466Z,NA,NA | |
657 (https://gitlab.com/libeigen/eigen/-/merge_requests/657),Fix implicit conversion warnings in tuple_test.,Fixes #2329.,Antonio Sánchez,2021-09-18T02:58:52.873Z,NA,NA | |
572 (https://gitlab.com/libeigen/eigen/-/merge_requests/572),[AutodiffScalar] Remove const when returning by value,"clang-tidy: Return type 'const T' is 'const'-qualified at the top level, | |
which may reduce code readability without improving const correctness | |
The types are somewhat long, but the affected return types are of the form: | |
``` | |
const T my_func() { /**/ } | |
``` | |
Change to: | |
``` | |
T my_func() { /**/ } | |
```",Alexander Karatarakis,2021-09-18T21:38:57.382Z,NA,NA | |
659 (https://gitlab.com/libeigen/eigen/-/merge_requests/659),Fix alias violation in BFloat16,"### What does this implement/fix? | |
Using a `reinterpret_cast` to access the bits of a float value is undefined behavior. With GCC 10 on PPC platforms we have seen actual failures (wrong values) due to this, which are fixed by (the equivalent of) this change. See https://github.com/easybuilders/easybuild-easyconfigs/pull/14025 | |
An easy testcase for that with TF 2.2.3 is: | |
``` | |
import numpy as np | |
from tensorflow.python import _pywrap_bfloat16 | |
bfloat16 = _pywrap_bfloat16.TF_bfloat16_type() | |
print(np.arange(-10.5, 7.8, 0.5, dtype=bfloat16)) | |
``` | |
Which prints `[bfloat16(-10.5) bfloat16(-10) bfloat16(-20) bfloat16(-30) bfloat16(-40)...` | |
printf-debugging into the TF bfloat16 shows that during conversion from bfloat16->float the step value gets calculated wrong. | |
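A minimal sketch of the well-defined alternative is shown below: the float bits are copied with `std::memcpy` instead of being read through a cast pointer. The helper names are hypothetical, the sketch assumes a 32-bit IEEE 754 `float`, and the conversion truncates rather than rounds, so it is a simplification and not the code changed by this MR.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Reads the bit pattern of a float without violating strict aliasing.
// Assumes float is 32 bits wide.
inline std::uint32_t float_bits(float f) {
  std::uint32_t bits = 0;
  std::memcpy(&bits, &f, sizeof(bits));
  return bits;
}

// Builds a float from a 32-bit pattern, again via memcpy.
inline float bits_to_float(std::uint32_t bits) {
  float f = 0.0f;
  std::memcpy(&f, &bits, sizeof(f));
  return f;
}

// Simplified conversion: keep only the upper 16 bits (truncation).
inline std::uint16_t float_to_bf16_truncate(float f) {
  return static_cast<std::uint16_t>(float_bits(f) >> 16);
}

// Widening conversion: the lower 16 bits are filled with zeros.
inline float bf16_to_float(std::uint16_t h) {
  return bits_to_float(static_cast<std::uint32_t>(h) << 16);
}
```

Compilers typically turn these fixed-size `memcpy` calls into plain register moves, so the well-defined version need not cost anything at run time.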
### Additional information | |
Not only is the proposed solution correct, it is even (potentially) faster. See the generated ASM: https://godbolt.org/z/4dT4a9d1b and https://github.com/tensorflow/tensorflow/commit/6b853c8f2020a446d7c04e75deff7866a35a7658#diff-17ca5d26579d2089aa9c41eacf8570b066e5c83dc957dc9bf1647a266de990f1 (see commit message)",Alexander Grund,2021-09-20T14:25:12.004Z,NA,NA | |
660 (https://gitlab.com/libeigen/eigen/-/merge_requests/660),fix various typos,NA,sciencewhiz,2021-09-22T16:15:06.955Z,NA,NA | |
661 (https://gitlab.com/libeigen/eigen/-/merge_requests/661),Fix some typos found,"### What does this implement/fix? | |
This MR fixes some typos in English text that I found using a spell checker. | |
@cantonios, @sciencewhiz: Could you please review my changes? | |
### Reference issue | |
See also (discussion in) !660. | |
### Additional information | |
There are also some occurrences of ""assignement"" which should be changed to ""assignment"", e.g. in lines 52, 56, 58, 87, 140, and 144 of `unsupported/test/cxx11_tensor_builtins_sycl.cpp`.",Kolja Brix,2021-09-23T15:22:01.068Z,NA,NA | |
663 (https://gitlab.com/libeigen/eigen/-/merge_requests/663),Disable more CUDA warnings.,"For cuda 9.2 and 11.4, they changed the numbers again. | |
Fixes #2331.",Antonio Sánchez,2021-09-25T04:51:45.289Z,NA,NA | |
662 (https://gitlab.com/libeigen/eigen/-/merge_requests/662),Reorganize test main file,"### What does this implement/fix? | |
Reorganize test main file `main.h` as discussed with @rmlarsen1 in !515. | |
* Move random matrix generators to separate file `random_matrix_helper.h` | |
* Protect forward declarations with EIGEN_COMP_ICC. | |
### Reference issue | |
See also !515.",Kolja Brix,2021-09-27T18:30:48.634Z,NA,NA | |
664 (https://gitlab.com/libeigen/eigen/-/merge_requests/664),Disable testing of complex compound assignment operators for MSVC.,"MSVC does not support specializing compound assignments for | |
`std::complex`, since it already specializes them (contrary to the | |
standard). | |
Trying to use one of these on device will currently lead to a | |
duplicate definition error. This is still probably preferable | |
to no error though. If we remove the definitions for MSVC, then | |
it will compile, but the kernel will fail silently. | |
The only proper solution would be to define our own custom `Complex` | |
type.",Antonio Sánchez,2021-09-28T15:39:59.670Z,NA,NA | |
671 (https://gitlab.com/libeigen/eigen/-/merge_requests/671),Fix gpu special function tests.,"Some checks used incorrect values, partly from copy-paste errors, | |
partly from the change in behaviour introduced in !398. | |
Modified results to match scipy, simplified tests by updating | |
`VERIFY_IS_CWISE_APPROX` to work for scalars.",Antonio Sánchez,2021-10-02T04:33:13.347Z,NA,NA | |
669 (https://gitlab.com/libeigen/eigen/-/merge_requests/669),Reduce tensor_contract_gpu test.,"The original test times out after 60 minutes on Windows, even when | |
setting flags to optimize for speed. Reducing the number of | |
contractions performed from 3600 to 27 for subtests 8 and 9 allows the | |
two to run in just over a minute each.",Antonio Sánchez,2021-10-02T04:51:14.721Z,NA,NA | |
667 (https://gitlab.com/libeigen/eigen/-/merge_requests/667),Speed up tensor reduction,"Speed up tensor reduction by strip mining & unrolling loops in `InnerMostDimReducer` and `InnerMostDimPreserved`. | |
This change also cleans up a few redundant pieces of code, where deferring to an existing specialization was possible. | |
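As a rough scalar illustration of the technique (not the packet-based code in `InnerMostDimReducer`; the function name `unrolled_sum` and the unroll factor of four are assumptions made for this sketch), strip mining splits the loop into fixed-size strips, and unrolling gives each strip position its own accumulator so the additions do not form one long dependency chain:

```cpp
#include <cstddef>

// Illustrative only: sums n floats with four independent accumulators.
inline float unrolled_sum(const float* data, std::size_t n) {
  float acc0 = 0.f, acc1 = 0.f, acc2 = 0.f, acc3 = 0.f;
  const std::size_t unrolled_end = n - (n % 4);  // last index covered by full strips
  std::size_t i = 0;
  for (; i < unrolled_end; i += 4) {
    // Each accumulator depends only on its own previous value.
    acc0 += data[i];
    acc1 += data[i + 1];
    acc2 += data[i + 2];
    acc3 += data[i + 3];
  }
  float total = (acc0 + acc1) + (acc2 + acc3);
  // Remainder loop for the elements that do not fill a whole strip.
  for (; i < n; ++i) total += data[i];
  return total;
}
```

Note that this changes the order of the floating-point additions, so results can differ slightly from a purely sequential sum.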
Below are measurements of full-, row-, and column- sum reductions of square 2D float tensors with sizes ranging from 3 x 3 to 10k x 10k. These were measured single-threaded on a Skylake core, and compiled with clang approximately at head. | |
AVX2: | |
``` | |
name old cpu/op new cpu/op delta | |
BM_fullReduction_1T/3 [using 1 threads] 13.4ns ± 8% 15.1ns ± 4% +12.18% (p=0.000 n=49+60) | |
BM_fullReduction_1T/4 [using 1 threads] 13.0ns ± 4% 15.4ns ±18% +18.06% (p=0.000 n=48+60) | |
BM_fullReduction_1T/7 [using 1 threads] 15.0ns ±13% 16.4ns ± 3% +9.29% (p=0.000 n=48+47) | |
BM_fullReduction_1T/8 [using 1 threads] 15.8ns ±19% 16.7ns ±13% +5.84% (p=0.000 n=60+60) | |
BM_fullReduction_1T/10 [using 1 threads] 18.7ns ±11% 18.5ns ± 9% ~ (p=0.292 n=48+60) | |
BM_fullReduction_1T/15 [using 1 threads] 31.1ns ±12% 22.4ns ±17% -27.80% (p=0.000 n=52+57) | |
BM_fullReduction_1T/16 [using 1 threads] 34.3ns ±10% 23.0ns ±13% -32.75% (p=0.000 n=50+58) | |
BM_fullReduction_1T/31 [using 1 threads] 125ns ± 5% 48ns ± 9% -61.81% (p=0.000 n=60+60) | |
BM_fullReduction_1T/32 [using 1 threads] 134ns ± 6% 50ns ± 8% -62.70% (p=0.000 n=60+57) | |
BM_fullReduction_1T/64 [using 1 threads] 535ns ± 4% 160ns ± 5% -70.06% (p=0.000 n=60+60) | |
BM_fullReduction_1T/128 [using 1 threads] 2.15µs ± 4% 0.69µs ± 8% -67.65% (p=0.000 n=60+60) | |
BM_fullReduction_1T/256 [using 1 threads] 8.55µs ± 4% 2.77µs ± 5% -67.65% (p=0.000 n=60+55) | |
BM_fullReduction_1T/512 [using 1 threads] 34.5µs ± 3% 11.6µs ± 6% -66.52% (p=0.000 n=50+60) | |
BM_fullReduction_1T/1k [using 1 threads] 155µs ± 4% 158µs ± 4% +1.73% (p=0.000 n=60+60) | |
BM_fullReduction_1T/2k [using 1 threads] 682µs ±20% 684µs ±17% ~ (p=0.475 n=40+45) | |
BM_fullReduction_1T/4k [using 1 threads] 6.34ms ±12% 5.71ms ±11% -9.98% (p=0.000 n=39+35) | |
BM_fullReduction_1T/10k [using 1 threads] 37.4ms ± 7% 37.4ms ±32% ~ (p=0.481 n=10+10) | |
name old cpu/op new cpu/op delta | |
BM_rowReduction_1T/3 [using 1 threads] 29.0ns ± 7% 30.5ns ± 4% +5.10% (p=0.000 n=54+50) | |
BM_rowReduction_1T/4 [using 1 threads] 33.5ns ± 3% 38.5ns ± 4% +15.07% (p=0.000 n=50+50) | |
BM_rowReduction_1T/7 [using 1 threads] 54.6ns ± 4% 60.8ns ± 8% +11.40% (p=0.000 n=59+60) | |
BM_rowReduction_1T/8 [using 1 threads] 55.1ns ± 8% 52.1ns ± 9% -5.40% (p=0.000 n=60+60) | |
BM_rowReduction_1T/10 [using 1 threads] 75.8ns ± 7% 72.2ns ± 7% -4.66% (p=0.000 n=60+60) | |
BM_rowReduction_1T/15 [using 1 threads] 114ns ± 5% 123ns ± 6% +7.98% (p=0.000 n=60+60) | |
BM_rowReduction_1T/16 [using 1 threads] 102ns ± 5% 95ns ± 7% -6.74% (p=0.000 n=60+60) | |
BM_rowReduction_1T/31 [using 1 threads] 250ns ± 5% 264ns ± 4% +5.56% (p=0.000 n=55+55) | |
BM_rowReduction_1T/32 [using 1 threads] 232ns ± 4% 203ns ± 9% -12.47% (p=0.000 n=55+60) | |
BM_rowReduction_1T/64 [using 1 threads] 651ns ± 4% 482ns ± 6% -25.95% (p=0.000 n=60+60) | |
BM_rowReduction_1T/128 [using 1 threads] 1.90µs ± 3% 1.30µs ± 7% -31.67% (p=0.000 n=60+60) | |
BM_rowReduction_1T/256 [using 1 threads] 7.03µs ± 5% 3.69µs ± 5% -47.44% (p=0.000 n=60+49) | |
BM_rowReduction_1T/512 [using 1 threads] 28.6µs ± 4% 13.3µs ± 6% -53.36% (p=0.000 n=54+60) | |
BM_rowReduction_1T/1k [using 1 threads] 158µs ± 9% 157µs ± 4% ~ (p=0.948 n=60+60) | |
BM_rowReduction_1T/2k [using 1 threads] 733µs ±37% 657µs ±13% -10.36% (p=0.000 n=45+40) | |
BM_rowReduction_1T/4k [using 1 threads] 6.65ms ±11% 6.19ms ± 9% -6.89% (p=0.032 n=30+38) | |
BM_rowReduction_1T/10k [using 1 threads] 41.4ms ±11% 37.8ms ± 1% ~ (p=0.080 n=12+10) | |
name old cpu/op new cpu/op delta | |
BM_colReduction_1T/3 [using 1 threads] 21.8ns ± 5% 22.4ns ± 4% +2.34% (p=0.000 n=58+55) | |
BM_colReduction_1T/4 [using 1 threads] 20.8ns ± 6% 27.7ns ± 6% +33.27% (p=0.000 n=60+55) | |
BM_colReduction_1T/7 [using 1 threads] 32.0ns ± 4% 43.9ns ± 6% +37.53% (p=0.000 n=48+60) | |
BM_colReduction_1T/8 [using 1 threads] 28.7ns ±11% 24.8ns ± 3% -13.81% (p=0.000 n=53+55) | |
BM_colReduction_1T/10 [using 1 threads] 39.9ns ± 7% 37.8ns ± 4% -5.12% (p=0.000 n=53+50) | |
BM_colReduction_1T/15 [using 1 threads] 65.0ns ±10% 77.2ns ± 6% +18.79% (p=0.000 n=58+57) | |
BM_colReduction_1T/16 [using 1 threads] 56.5ns ± 7% 43.0ns ±21% -23.92% (p=0.000 n=48+60) | |
BM_colReduction_1T/31 [using 1 threads] 203ns ± 5% 210ns ± 6% +3.46% (p=0.000 n=60+59) | |
BM_colReduction_1T/32 [using 1 threads] 170ns ± 8% 95ns ± 7% -44.18% (p=0.000 n=60+60) | |
BM_colReduction_1T/64 [using 1 threads] 677ns ± 7% 261ns ± 4% -61.43% (p=0.000 n=60+55) | |
BM_colReduction_1T/128 [using 1 threads] 3.14µs ± 4% 1.40µs ± 5% -55.45% (p=0.000 n=50+60) | |
BM_colReduction_1T/256 [using 1 threads] 14.8µs ± 4% 5.4µs ± 6% -63.24% (p=0.000 n=60+60) | |
BM_colReduction_1T/512 [using 1 threads] 65.2µs ± 5% 25.2µs ± 5% -61.31% (p=0.000 n=60+55) | |
BM_colReduction_1T/1k [using 1 threads] 754µs ± 6% 393µs ± 5% -47.92% (p=0.000 n=60+45) | |
BM_colReduction_1T/2k [using 1 threads] 3.24ms ±18% 1.66ms ±17% -48.61% (p=0.000 n=35+42) | |
BM_colReduction_1T/4k [using 1 threads] 70.3ms ± 3% 34.5ms ± 3% -50.93% (p=0.000 n=44+25) | |
BM_colReduction_1T/10k [using 1 threads] 69.5ms ± 0% 69.6ms ± 2% ~ (p=0.605 n=10+15) | |
``` | |
SSE4.3: | |
``` | |
name old cpu/op new cpu/op delta | |
BM_fullReduction_1T/3 [using 1 threads] 13.5ns ± 6% 13.1ns ± 4% -2.72% (p=0.000 n=59+60) | |
BM_fullReduction_1T/4 [using 1 threads] 13.2ns ± 8% 12.8ns ± 4% -2.60% (p=0.000 n=60+60) | |
BM_fullReduction_1T/7 [using 1 threads] 14.7ns ± 4% 14.5ns ± 5% -1.16% (p=0.014 n=48+60) | |
BM_fullReduction_1T/8 [using 1 threads] 14.8ns ± 4% 14.6ns ± 4% -1.59% (p=0.001 n=48+60) | |
BM_fullReduction_1T/10 [using 1 threads] 17.8ns ± 4% 16.5ns ± 5% -7.15% (p=0.000 n=48+60) | |
BM_fullReduction_1T/15 [using 1 threads] 29.9ns ± 7% 24.7ns ± 3% -17.59% (p=0.000 n=54+55) | |
BM_fullReduction_1T/16 [using 1 threads] 33.1ns ± 7% 27.1ns ± 4% -18.35% (p=0.000 n=47+54) | |
BM_fullReduction_1T/31 [using 1 threads] 123ns ± 4% 70ns ± 7% -43.38% (p=0.000 n=60+57) | |
BM_fullReduction_1T/32 [using 1 threads] 131ns ± 4% 78ns ± 7% -40.77% (p=0.000 n=60+60) | |
BM_fullReduction_1T/64 [using 1 threads] 534ns ± 4% 281ns ± 4% -47.40% (p=0.000 n=60+55) | |
BM_fullReduction_1T/128 [using 1 threads] 2.13µs ± 4% 1.23µs ± 4% -42.17% (p=0.000 n=60+60) | |
BM_fullReduction_1T/256 [using 1 threads] 8.54µs ± 4% 4.95µs ± 5% -42.10% (p=0.000 n=60+60) | |
BM_fullReduction_1T/512 [using 1 threads] 34.5µs ± 4% 20.2µs ± 4% -41.43% (p=0.000 n=50+60) | |
BM_fullReduction_1T/1k [using 1 threads] 158µs ± 6% 154µs ± 5% -2.46% (p=0.000 n=60+60) | |
BM_fullReduction_1T/2k [using 1 threads] 687µs ±25% 668µs ±23% ~ (p=0.093 n=47+46) | |
BM_fullReduction_1T/4k [using 1 threads] 5.86ms ± 6% 5.82ms ±10% ~ (p=0.736 n=28+35) | |
BM_fullReduction_1T/10k [using 1 threads] 36.0ms ± 3% 35.5ms ± 3% ~ (p=0.095 n=10+9) | |
name old cpu/op new cpu/op delta | |
BM_rowReduction_1T/3 [using 1 threads] 28.8ns ± 4% 27.8ns ± 4% -3.64% (p=0.000 n=53+54) | |
BM_rowReduction_1T/4 [using 1 threads] 33.6ns ± 4% 33.7ns ± 6% ~ (p=0.465 n=50+50) | |
BM_rowReduction_1T/7 [using 1 threads] 54.4ns ± 4% 52.9ns ± 4% -2.81% (p=0.000 n=60+60) | |
BM_rowReduction_1T/8 [using 1 threads] 53.8ns ± 4% 51.6ns ± 4% -4.05% (p=0.000 n=60+60) | |
BM_rowReduction_1T/10 [using 1 threads] 74.4ns ± 4% 71.2ns ± 4% -4.39% (p=0.000 n=60+58) | |
BM_rowReduction_1T/15 [using 1 threads] 113ns ± 4% 109ns ± 4% -3.49% (p=0.000 n=60+60) | |
BM_rowReduction_1T/16 [using 1 threads] 101ns ± 6% 97ns ± 6% -3.91% (p=0.000 n=60+60) | |
BM_rowReduction_1T/31 [using 1 threads] 250ns ± 4% 271ns ± 4% +8.24% (p=0.000 n=55+55) | |
BM_rowReduction_1T/32 [using 1 threads] 232ns ± 3% 222ns ± 4% -4.31% (p=0.000 n=55+59) | |
BM_rowReduction_1T/64 [using 1 threads] 654ns ± 4% 501ns ± 5% -23.43% (p=0.000 n=60+60) | |
BM_rowReduction_1T/128 [using 1 threads] 1.90µs ± 4% 1.62µs ± 5% -14.84% (p=0.000 n=60+59) | |
BM_rowReduction_1T/256 [using 1 threads] 7.07µs ± 4% 5.51µs ± 4% -21.99% (p=0.000 n=60+59) | |
BM_rowReduction_1T/512 [using 1 threads] 28.7µs ± 6% 21.1µs ± 4% -26.28% (p=0.000 n=55+60) | |
BM_rowReduction_1T/1k [using 1 threads] 156µs ±10% 153µs ± 4% -2.07% (p=0.007 n=60+60) | |
BM_rowReduction_1T/2k [using 1 threads] 705µs ±26% 678µs ±33% -3.86% (p=0.035 n=41+39) | |
BM_rowReduction_1T/4k [using 1 threads] 7.04ms ±10% 6.31ms ± 8% -10.45% (p=0.000 n=41+36) | |
BM_rowReduction_1T/10k [using 1 threads] 42.6ms ± 6% 38.8ms ± 4% -8.82% (p=0.000 n=12+9) | |
name old cpu/op new cpu/op delta | |
BM_colReduction_1T/3 [using 1 threads] 22.0ns ± 7% 22.1ns ± 7% ~ (p=0.614 n=54+46) | |
BM_colReduction_1T/4 [using 1 threads] 20.6ns ± 5% 20.6ns ± 5% ~ (p=0.771 n=60+48) | |
BM_colReduction_1T/7 [using 1 threads] 31.6ns ± 4% 31.6ns ± 3% ~ (p=0.935 n=50+40) | |
BM_colReduction_1T/8 [using 1 threads] 27.8ns ± 9% 27.5ns ± 4% ~ (p=0.113 n=45+44) | |
BM_colReduction_1T/10 [using 1 threads] 39.0ns ± 4% 38.6ns ± 5% -0.86% (p=0.048 n=50+40) | |
BM_colReduction_1T/15 [using 1 threads] 63.9ns ± 4% 63.1ns ± 4% -1.20% (p=0.005 n=60+48) | |
BM_colReduction_1T/16 [using 1 threads] 56.5ns ± 8% 47.2ns ± 9% -16.50% (p=0.000 n=59+49) | |
BM_colReduction_1T/31 [using 1 threads] 200ns ± 5% 145ns ± 8% -27.33% (p=0.000 n=60+60) | |
BM_colReduction_1T/32 [using 1 threads] 170ns ± 5% 100ns ± 6% -40.78% (p=0.000 n=60+55) | |
BM_colReduction_1T/64 [using 1 threads] 673ns ± 4% 291ns ± 5% -56.83% (p=0.000 n=60+55) | |
BM_colReduction_1T/128 [using 1 threads] 3.14µs ± 4% 2.43µs ± 6% -22.70% (p=0.000 n=50+55) | |
BM_colReduction_1T/256 [using 1 threads] 14.7µs ± 4% 9.6µs ± 5% -35.06% (p=0.000 n=60+60) | |
BM_colReduction_1T/512 [using 1 threads] 65.4µs ± 4% 44.2µs ± 5% -32.42% (p=0.000 n=59+59) | |
BM_colReduction_1T/1k [using 1 threads] 761µs ± 8% 756µs ± 8% ~ (p=0.274 n=60+60) | |
BM_colReduction_1T/2k [using 1 threads] 3.22ms ±13% 3.27ms ±23% ~ (p=0.629 n=37+37) | |
BM_colReduction_1T/4k [using 1 threads] 70.9ms ±10% 69.8ms ± 8% -1.47% (p=0.028 n=40+40) | |
BM_colReduction_1T/10k [using 1 threads] 69.7ms ± 3% 79.6ms ± 2% +14.22% (p=0.000 n=13+14) | |
```",Rasmus Munk Larsen,2021-10-02T14:58:24.275Z,NA,NA | |
668 (https://gitlab.com/libeigen/eigen/-/merge_requests/668),Fix Windows CMake compiler/OS detection.,"Replaced deprecated `DetermineVSServicePack` macro with recommended | |
`CMAKE_CXX_COMPILER_VERSION`. | |
Deleted custom `OSVersion` detection. The windows-specific code is | |
highly outdated, and on other systems simply returns `CMAKE_SYSTEM`. | |
We will get values like `windows-10.0.17763`, but this is preferable | |
to `unknownwin`, and saves us needing to maintain a separate cmake file.",Antonio Sánchez,2021-10-02T16:47:06.748Z,NA,NA | |
673 (https://gitlab.com/libeigen/eigen/-/merge_requests/673),Vectorize Visitor.h.,"This change adds a vectorized codepath in `Visitor.h`, which speeds up `maxCoeff(&row, &col)` etc. by about 5x on machines with AVX2. | |
Benchmark of `maxCoeff(&row, &col)` on a random square matrix of `float`: | |
``` | |
name old cpu/op new cpu/op delta | |
BM_EigenCoeffMax/16 317ns ± 0% 73ns ± 0% -77.16% (p=0.000 n=44+47) | |
BM_EigenCoeffMax/64 5.30µs ± 0% 0.92µs ± 5% -82.56% (p=0.000 n=42+60) | |
BM_EigenCoeffMax/128 21.3µs ± 0% 3.6µs ± 1% -83.21% (p=0.000 n=45+48) | |
BM_EigenCoeffMax/512 341µs ± 0% 56µs ± 0% -83.65% (p=0.000 n=38+60) | |
BM_EigenCoeffMax/1k 1.42ms ± 0% 0.24ms ± 1% -83.31% (p=0.000 n=36+33) | |
``` | |
This also speeds up various matrix decompositions that perform pivot search using `maxCoeff`: | |
``` | |
name old cpu/op new cpu/op delta | |
BM_EigenPartialPivLU/16 1.99µs ± 1% 1.96µs ± 1% -1.32% (p=0.000 n=60+59) | |
BM_EigenPartialPivLU/64 23.2µs ± 1% 21.7µs ± 2% -6.63% (p=0.000 n=56+58) | |
BM_EigenPartialPivLU/128 116µs ± 2% 108µs ± 2% -6.56% (p=0.000 n=60+60) | |
BM_EigenPartialPivLU/512 3.53ms ± 1% 3.40ms ± 2% -3.83% (p=0.000 n=38+38) | |
BM_EigenPartialPivLU/1k 17.0ms ± 1% 16.4ms ± 1% -3.98% (p=0.000 n=29+27) | |
BM_EigenFullPivLU/16 3.17µs ± 1% 2.76µs ± 1% -12.99% (p=0.000 n=49+50) | |
BM_EigenFullPivLU/64 79.2µs ± 2% 53.3µs ± 3% -32.75% (p=0.000 n=58+56) | |
BM_EigenFullPivLU/128 560µs ± 2% 361µs ± 3% -35.61% (p=0.000 n=60+50) | |
BM_EigenFullPivLU/512 26.7ms ± 3% 16.5ms ± 2% -38.26% (p=0.000 n=47+47) | |
BM_EigenFullPivLU/1k 234ms ± 3% 165ms ± 4% -29.52% (p=0.000 n=15+21) | |
BM_EigencolPivQR/16 4.61µs ± 3% 4.61µs ± 4% ~ (p=0.881 n=58+59) | |
BM_EigencolPivQR/64 51.7µs ± 2% 51.0µs ± 2% -1.44% (p=0.000 n=58+57) | |
BM_EigencolPivQR/128 277µs ± 3% 272µs ± 3% -1.97% (p=0.000 n=55+54) | |
BM_EigencolPivQR/512 9.05ms ± 3% 9.00ms ± 2% ~ (p=0.197 n=45+44) | |
BM_EigencolPivQR/1k 127ms ± 4% 127ms ± 5% ~ (p=0.421 n=27+26) | |
BM_EigenfullPivQR/16 5.45µs ± 3% 5.02µs ± 4% -7.78% (p=0.000 n=59+60) | |
BM_EigenfullPivQR/64 108µs ± 3% 76µs ± 4% -29.07% (p=0.000 n=59+59) | |
BM_EigenfullPivQR/128 682µs ± 3% 452µs ± 2% -33.78% (p=0.000 n=59+57) | |
BM_EigenfullPivQR/512 33.1ms ± 4% 20.0ms ± 3% -39.57% (p=0.000 n=44+40) | |
BM_EigenfullPivQR/1k 323ms ± 1% 225ms ± 3% -30.20% (p=0.000 n=8+15) | |
``` | |
Closes #2345",Rasmus Munk Larsen,2021-10-20T16:58:02.155Z,NA,NA | |
678 (https://gitlab.com/libeigen/eigen/-/merge_requests/678),"Move CUDA/Complex.h to GPU/Complex.h, remove TensorReductionCuda.h","Move CUDA/Complex.h to GPU/Complex.h, remove TensorReductionCuda.h | |
The `Complex.h` file applies equally to HIP/CUDA, so placing under the | |
generic `GPU` folder. | |
The `TensorReductionCuda.h` has already been deprecated, now removing | |
for the next Eigen version.",Antonio Sánchez,2021-10-20T19:17:53.230Z,NA,NA | |
665 (https://gitlab.com/libeigen/eigen/-/merge_requests/665),Fix tuple compilation for VS2017.,"VS2017 doesn't like deducing alias types, leading to a bunch of compile | |
errors for functions involving the `tuple` alias. Replacing with | |
`TupleImpl` seems to solve this, allowing the test to compile/pass.",Antonio Sánchez,2021-10-20T19:37:50.457Z,NA,NA | |
666 (https://gitlab.com/libeigen/eigen/-/merge_requests/666),Fix MSVC+NVCC EIGEN_INHERIT_ASSIGNMENT_EQUAL_OPERATOR compilation.,"Looks like we need to update the | |
`EIGEN_INHERIT_ASSIGNMENT_EQUAL_OPERATOR` for newer versions of MSVC as | |
well when compiling with NVCC. Fixes build issues for VS 2017.",Antonio Sánchez,2021-10-20T19:53:31.809Z,NA,NA | |
676 (https://gitlab.com/libeigen/eigen/-/merge_requests/676),Improve accuracy of full tensor reduction for half and bfloat16,"We use a tree summation algorithm for full tensor reduction. The relative error in summing n (positive) elements this way is bounded by `~2*eps*(log(n/B) + B)`, where `B` is the size of the leaves in the tree, where we sum sequentially in the interest of speed. For less accurate types (i.e. types with larger eps), we reduce B to keep the relative error significantly below 1.",Rasmus Munk Larsen,2021-10-20T20:11:32.498Z,NA,NA | |
679 (https://gitlab.com/libeigen/eigen/-/merge_requests/679),Disable Tree reduction for GPU.,"For moderately sized inputs, running the Tree reduction quickly | |
overflows the GPU thread stack space, leading to memory errors. | |
This was happening in the `cxx11_tensor_complex_gpu` test, for example. | |
Disabling tree reduction on GPU fixes this.",Antonio Sánchez,2021-10-20T21:34:18.676Z,NA,NA | |
677 (https://gitlab.com/libeigen/eigen/-/merge_requests/677),Use reinterpret_cast on GPU for bit_cast.,"This seems to be the recommended approach for doing type punning in | |
CUDA. See for example | |
- https://stackoverflow.com/questions/47037104/cuda-type-punning-memcpy-vs-ub-union | |
- https://developer.nvidia.com/blog/faster-parallel-reductions-kepler/ | |
(the latter puns a double to an `int2`). The issue is that for CUDA, the `memcpy` is not elided, and ends up | |
being an expensive operation. We already have similar `reinterpret_cast`s across | |
the Eigen codebase for GPU (as does TensorFlow).",Antonio Sánchez,2021-10-20T21:50:30.064Z,NA,NA | |
686 (https://gitlab.com/libeigen/eigen/-/merge_requests/686),Revert bit_cast to use memcpy for CUDA.,"To elide the memcpy, we need to first load the `src` value into | |
registers by making a local copy. This avoids the need to resort | |
to potential UB by using `reinterpret_cast`. | |
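The pattern described above might look roughly like this. It is a sketch under assumptions, not Eigen's actual `bit_cast`: the name `bit_cast_sketch` and the exact checks are hypothetical. The point is the local copy of `src`, which gives the compiler a value it can keep in registers before the `memcpy`.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical sketch of a memcpy-based bit cast with a staged local copy.
template <typename To, typename From>
inline To bit_cast_sketch(const From& src) {
  static_assert(sizeof(To) == sizeof(From));
  static_assert(std::is_trivially_copyable<To>::value);
  static_assert(std::is_trivially_copyable<From>::value);
  const From staged = src;  // local copy, so the source no longer refers to caller memory
  To dst;
  std::memcpy(&dst, &staged, sizeof(To));  // well-defined type punning
  return dst;
}
```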
This change doesn't seem to affect CPU (at least not with gcc/clang). | |
With optimizations on, the copy is also elided.",Antonio Sánchez,2021-10-21T19:18:17.611Z,NA,NA | |
687 (https://gitlab.com/libeigen/eigen/-/merge_requests/687),Add nan-propagation options to matrix and array plugins.,"The ability to control nan-propagation in elementwise min/max and min/max reduction was added in Eigen 3.4, but we missed adding them to the corresponding array and matrix plugins. Should we consider backporting this change to 3.4 (would have to make it c++03 compliant)?",Rasmus Munk Larsen,2021-10-21T20:06:54.199Z,NA,NA | |
691 (https://gitlab.com/libeigen/eigen/-/merge_requests/691),Fix -Wbitwise-instead-of-logical clang warning,Fixes #2353.,Nico,2021-10-22T05:50:17.301Z,NA,NA | |
693 (https://gitlab.com/libeigen/eigen/-/merge_requests/693),Included note on inner stride for compile-time vectors. Fixes #2355,"### Reference issue | |
#2355 | |
### What does this implement/fix? | |
Added note in the documentation on the `Stride` class for compile-time vectors, which always use the inner stride.",Lennart Steffen,2021-10-22T15:14:14.956Z,NA,NA | |
692 (https://gitlab.com/libeigen/eigen/-/merge_requests/692),Extend EIGEN_QT_SUPPORT to Qt6,"When building with Qt6, excludes the functions `inline Transform(const QMatrix& other);`, `inline Transform& operator=(const QMatrix& other);` and `inline QMatrix toQMatrix(void) const;` from `Transform.h`. | |
Fixes #2350",benardp,2021-10-23T23:43:07.383Z,NA,NA | |
688 (https://gitlab.com/libeigen/eigen/-/merge_requests/688),Add nan-propagation options to matrix and array plugins.,NA,Rasmus Munk Larsen,2021-10-25T19:11:00.256Z,NA,NA | |
696 (https://gitlab.com/libeigen/eigen/-/merge_requests/696),Remove const from visitor return type.,"This seems to interfere with `pload`/`ploadu`, since `pload<const Packet**>` are not defined. | |
This should unbreak the arm/ppc builds.",Antonio Sánchez,2021-10-25T19:26:04.904Z,NA,NA | |
689 (https://gitlab.com/libeigen/eigen/-/merge_requests/689),Fix broadcasting oob error.,"For vectorized 1-dimensional inputs that do not take the special | |
blocking path (e.g. `std::complex<...>`), there was an | |
index-out-of-bounds error causing the broadcast size to be | |
computed incorrectly. Here we fix this, and make other minor | |
cleanup changes. | |
Fixes #2351.",Antonio Sánchez,2021-10-25T19:48:13.063Z,NA,NA | |
698 (https://gitlab.com/libeigen/eigen/-/merge_requests/698),Ensure comma initializer reuses fixed dimensions,"### Reference issue | |
This relates to #2346. | |
### What does this implement/fix? | |
Ensures that the block in `CommaInitializer` is fixed-size when the input is, by replacing | |
``` | |
m_xpr.block(0, 0, other.rows(), other.cols()) = other; | |
``` | |
with | |
``` | |
m_xpr.template block<OtherDerived::RowsAtCompileTime, OtherDerived::ColsAtCompileTime>(0, 0, other.rows(), other.cols()) = other; | |
``` | |
### Additional information | |
No additional information :)",Stresspresso,2021-10-25T20:10:15.448Z,NA,NA | |
695 (https://gitlab.com/libeigen/eigen/-/merge_requests/695),test: fix boostmultiprec test to compile with older Boost versions,"test: fix boostmultiprec test to compile with older Boost versions | |
The Eigen boostmultiprec test redefines a symbol that is already defined inside Boost Math [1]. Boost fixed this recently [2], but this patch avoids errors with Boost versions older than 1.77. | |
[1] https://github.com/boostorg/math/blob/boost-1.76.0/include/boost/math/policies/policy.hpp#L18 | |
[2] https://github.com/boostorg/math/commit/68307123029676ba5cb316f8dd1d1c98d1fc7b23#diff-c7a8e5911c2e6be4138e1a966d762200f147792ac16ad96fdcc724313d11f839",Maxiwell S. Garcia,2021-10-25T20:48:18.344Z,NA,NA | |
681 (https://gitlab.com/libeigen/eigen/-/merge_requests/681),Avoid integer overflows in EigenMetaKernel indexing,"- The current implementation computes `size + total_threads`, which can overflow and cause `CUDA_ERROR_ILLEGAL_ADDRESS` when size is close to the maximum representable value. | |
- The `num_blocks` calculation can also overflow due to the implementation of `divup()`. | |
- This patch prevents these overflows and allows the kernel to work correctly for the full representable range of tensor sizes. | |
- Also adds relevant tests. | |
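The `divup()` overflow mentioned above can be illustrated with a small sketch. The function names and the use of 32-bit unsigned integers are assumptions chosen for the example; Eigen's index types and its actual fix may differ.

```cpp
#include <cassert>
#include <cstdint>

// Naive ceiling division: x + y - 1 wraps around when x is near the maximum.
inline std::uint32_t divup_naive(std::uint32_t x, std::uint32_t y) {
  return (x + y - 1) / y;
}

// Overflow-safe variant: never forms an intermediate value larger than x.
inline std::uint32_t divup_safe(std::uint32_t x, std::uint32_t y) {
  assert(y > 0);
  return x == 0 ? 0 : (x - 1) / y + 1;
}
```

With `x = 0xFFFFFFFF` and `y = 1024`, the naive version wraps and returns 0 blocks, while the safe version returns 4194304.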
cc @nluehr",Ben Barsdell,2021-10-26T00:20:35.402Z,NA,NA | |
701 (https://gitlab.com/libeigen/eigen/-/merge_requests/701),ZVector: Move alignas qualifier to come first,"We currently have plenty of type definitions with the alignment | |
qualifier coming after the type. The compiler warns about ignoring | |
them: | |
int EIGEN_ALIGN16 ai[4]; | |
Turn this into: | |
EIGEN_ALIGN16 int ai[4];",Andreas Krebbel,2021-10-26T16:54:18.888Z,NA,NA | |
700 (https://gitlab.com/libeigen/eigen/-/merge_requests/700),Vectorize fp16 tanh and logistic functions on Neon,Adds vectorization to the current implementation of the tanh and logistic functions when they run on Neon.,Alex Druinsky,2021-10-27T16:24:58.672Z,NA,NA | |
694 (https://gitlab.com/libeigen/eigen/-/merge_requests/694),Fix ZVector build.,"Cross-compiled via `s390x-linux-gnu-g++`, run via qemu. This allows the packetmath tests to pass.",Antonio Sánchez,2021-10-27T16:50:52.175Z,NA,NA | |
534 (https://gitlab.com/libeigen/eigen/-/merge_requests/534),Preliminary HIP bfloat16 GPU support.,"The purpose of this MR is to deliver bfloat16 data type support for AMD GPUs and the HIP software stack. | |
The changes captured herein are a work in progress in the sense that basic functionality is provided. | |
Performance optimizations and similar changes will be forthcoming. | |
The existing bfloat16 header has been enhanced with support for HIP. | |
Also, a GPU specific bfloat16 unit test suite has been introduced.",Rohit Santhanam,2021-10-27T18:36:46.828Z,NA,NA | |
697 (https://gitlab.com/libeigen/eigen/-/merge_requests/697),optimize cmake scripts for subproject use,"### Reference issue | |
Fixes #2347 | |
### What does this implement/fix? | |
This improves support for subprojects, e.g. no tests are built by default. | |
### Additional information | |
The use of CMAKE_CXX_FLAGS for tests should be replaced by a test_interface target",Fabian Keßler,2021-10-28T15:04:48.377Z,NA,NA | |
703 (https://gitlab.com/libeigen/eigen/-/merge_requests/703),"Fix min/max nan-propagation for scalar ""other"".","Copied input type from `EIGEN_MAKE_CWISE_BINARY_OP`. | |
Fixes #2362.",Antonio Sánchez,2021-10-28T16:47:50.777Z,NA,NA | |
702 (https://gitlab.com/libeigen/eigen/-/merge_requests/702),Add AVX vector path to float2half/half2float,"Add AVX vector path to float2half/half2float | |
Makes, e.g., matrix multiplication about 3x faster: | |
name old cpu/op new cpu/op delta | |
BM_convers 181ms ± 1% 62ms ± 9% -65.82% (p=0.016 n=4+5) | |
Direct translation of the scalar code from half_to_float and float_to_half_rtne (Eigen/src/Core/arch/Default/Half.h). | |
Tested on all possible input values (not adding those, since they take a long time, especially in debug build).",Ilya Tokar,2021-10-28T21:04:41.543Z,NA,NA | |
680 (https://gitlab.com/libeigen/eigen/-/merge_requests/680),Invert rows and depth in non-vectorized portion of packing (PowerPC).,"Invert rows and depth in non-vectorized portion of packing for RHS (PowerPC). | |
This shows up as bad results in the following: | |
``` | |
export EIGEN_SEED=1629216664 | |
test/product_syrk_3 | |
test/product_mmtr_3 | |
``` | |
The previous packing did NOT allow us to know the correct end of a row in some cases, so it would pick up incorrect values from the wrong locations. | |
In the process of fixing this, I simplified the code and added performance improvements (extra rows are now 5X faster and overall 10% gains).",Chip Kerchner,2021-10-28T21:59:41.561Z,NA,NA | |
705 (https://gitlab.com/libeigen/eigen/-/merge_requests/705),Fix TensorReduction warnings and error bound for sum accuracy test.,"The sum accuracy test currently uses the default test precision for | |
the given scalar type. However, scalars are generated via a normal | |
distribution, and given a large enough count and strong enough random | |
generator, the expected sum is zero. This causes the test to | |
periodically fail. | |
Here we estimate an upper-bound for the error as `N * prec` for | |
summing N values, with each having an approximate epsilon of `prec`. | |
In practice, this is much larger than it probably needs to be, since errors | |
are generally both positive and negative. | |
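A check of this kind could be sketched as below. The helper name `sum_within_bound` is hypothetical, and scaling the `N * prec` bound by the sum of magnitudes is an assumption of this example rather than something taken from the test itself. An absolute bound like this stays meaningful when the expected sum is close to zero, where a purely relative comparison fails.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>

// Hypothetical check: is a float sum within N * prec * sum|x_i| of a double reference?
inline bool sum_within_bound(const float* x, std::size_t n) {
  float sum = 0.f;       // value under test, accumulated in float
  double ref = 0.0;      // reference, accumulated in double
  double abs_sum = 0.0;  // sum of magnitudes, used to scale the bound
  for (std::size_t i = 0; i < n; ++i) {
    sum += x[i];
    ref += x[i];
    abs_sum += std::fabs(x[i]);
  }
  const double prec = std::numeric_limits<float>::epsilon();
  const double bound = static_cast<double>(n) * prec * abs_sum;
  return std::fabs(static_cast<double>(sum) - ref) <= bound;
}
```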
Also fixed a few warnings generated by MSVC when compiling the | |
reduction test.",Antonio Sánchez,2021-11-01T17:03:50.849Z,NA,NA | |
704 (https://gitlab.com/libeigen/eigen/-/merge_requests/704),"Remove bad ""take"" impl that causes g++-11 crash.","For some reason, having `take<n, numeric_list<T>>` for `n > 0` causes | |
g++-11 to ICE with | |
``` | |
sorry, unimplemented: unexpected AST of kind nontype_argument_pack | |
``` | |
It does work with other versions of gcc, and with clang. | |
I filed a GCC bug | |
[here](https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102999). | |
Technically we should never actually run into this case, since you | |
can't take n > 0 elements from an empty list. Commenting it out | |
allows our Eigen tests to pass",Antonio Sánchez,2021-11-01T17:20:39.950Z,NA,NA | |
707 (https://gitlab.com/libeigen/eigen/-/merge_requests/707),"Fix total deflation issue in BDCSVD, when & only when M is already diagonal.","### Reference issue | |
- #1980 | |
- #2174 | |
### What does this implement/fix? | |
1. Add more unit tests for BDCSVD to capture the earlier total deflation issue. | |
2. Fix total deflation being triggered when it is not supposed to be.",Xinle Liu,2021-11-02T16:53:55.746Z,NA,NA | |
709 (https://gitlab.com/libeigen/eigen/-/merge_requests/709),"Fix BDCSVD's total deflation in branch 3.4, similar to that of master in MR 707.","### Reference issue | |
* #1980 | |
* #2174 | |
* Cherry pick for !707 into branch `3.4` | |
### What does this implement/fix? | |
Fix BDCSVD's total deflation logic, to be triggered when and only when M is diagonal.",Xinle Liu,2021-11-03T18:16:04.555Z,NA,NA | |
714 (https://gitlab.com/libeigen/eigen/-/merge_requests/714),nestbyvalue test: fix uninitialized matrix,"- The test was doing computations with an uninitialized matrix (possibly zeroed, thanks to Linux), or | |
worse, NaN on non-Linux systems. | |
- This commit fixes it by initializing to Random(). | |
Note: can we have this cherry-picked into 3.4 too, please?",Minh Quan Ho,2021-11-04T16:19:21.437Z,NA,NA | |
712 (https://gitlab.com/libeigen/eigen/-/merge_requests/712),Documentation of Quaternion constructor from MatrixBase (fixes #2368),"Added documentation to clarify that the Quaternion constructor from MatrixBase assumes the matrix is in the order qx, qy, qz, qw. | |
Fixes #2368.",Gilad Barach,2021-11-04T16:38:08.734Z,NA,NA | |
711 (https://gitlab.com/libeigen/eigen/-/merge_requests/711),Bug Fix: correct the bug that won't define EIGEN_HAS_FP16_C,"### Reference issue | |
### What does this implement/fix? | |
This patch fixes a bug in Eigen: | |
If we use a compiler which is not clang, `EIGEN_COMP_CLANG` is defined as 0, so `(!defined(EIGEN_COMP_CLANG) || EIGEN_COMP_CLANG>=380))` is always false and `EIGEN_HAS_FP16_C` is never defined. | |
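The preprocessor behaviour described above can be reproduced in isolation. The sketch below uses hypothetical `DEMO_*` macros rather than Eigen's own, and the second check is only one plausible way to repair the condition; it is an assumption, not necessarily the change this MR made.

```cpp
#include <cassert>

// Hypothetical stand-in for EIGEN_COMP_CLANG on a non-clang compiler:
// the macro is defined, but its value is 0.
#define DEMO_COMP_CLANG 0

// Original style of check: defined(...) is true even when the value is 0,
// so the left operand is false, and 0 >= 380 is false as well.
#if !defined(DEMO_COMP_CLANG) || DEMO_COMP_CLANG >= 380
#define DEMO_OLD_CHECK 1
#else
#define DEMO_OLD_CHECK 0
#endif

// Assumed repair: test the value of the macro instead of its existence.
#if !DEMO_COMP_CLANG || DEMO_COMP_CLANG >= 380
#define DEMO_NEW_CHECK 1
#else
#define DEMO_NEW_CHECK 0
#endif

// Small helpers so the outcome of each check can be inspected at run time.
constexpr bool old_check_passes() { return DEMO_OLD_CHECK != 0; }
constexpr bool new_check_passes() { return DEMO_NEW_CHECK != 0; }
```

With the macro defined as 0, `old_check_passes()` is false while `new_check_passes()` is true, which matches the symptom reported in this MR.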
### Additional information | |
<!--Any additional information you think is important.-->",Gengxin Xie,2021-11-04T22:29:42.448Z,NA,NA | |
715 (https://gitlab.com/libeigen/eigen/-/merge_requests/715),Fix failing test for tensor reduction.,Compare summation results against forward error bound.,Rasmus Munk Larsen,2021-11-05T01:25:46.968Z,NA,NA | |
713 (https://gitlab.com/libeigen/eigen/-/merge_requests/713),Avoid integer overflow in EigenMetaKernel indexing (v2),"This is a re-submission of https://gitlab.com/libeigen/eigen/-/merge_requests/681, which was reverted due to build issues on Windows. | |
This version has two changes compared to the previous version: | |
- It doesn't use inline PTX, so there shouldn't be any build issues on Windows. | |
- It only uses saturated addition in each loop iteration when overflow is possible (i.e., when the size is within total_threads of the max representable index). When overflow is not possible, regular addition is used. | |
Summary of changes: | |
- The current implementation computes `size + total_threads`, which can | |
overflow and cause CUDA_ERROR_ILLEGAL_ADDRESS when size is close to | |
the maximum representable value. | |
- The num_blocks calculation can also overflow due to the implementation | |
of divup(). | |
- This patch prevents these overflows and allows the kernel to work | |
correctly for the full representable range of tensor sizes. | |
- Also adds relevant tests. | |
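The overflow handling summarized above can be sketched on the host side as follows. All function names here are hypothetical, the operands are assumed to be non-negative, and the remark about `(x + y - 1) / y` describes a common way to write a ceiling division, not a confirmed quote of Eigen's `divup()`.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Saturating addition for non-negative operands: clamps to the maximum
// representable value instead of wrapping around.
template <typename Index>
constexpr Index saturated_add(Index a, Index b) {
  return (a > std::numeric_limits<Index>::max() - b)
             ? std::numeric_limits<Index>::max()
             : a + b;
}

// True when size + total_threads could exceed the maximum index, i.e. when
// size is within total_threads of the largest representable value.
template <typename Index>
constexpr bool may_overflow(Index size, Index total_threads) {
  return size > std::numeric_limits<Index>::max() - total_threads;
}

// Ceiling division that never forms x + y - 1, so it cannot overflow for
// non-negative x and positive y.
template <typename Index>
constexpr Index divup_no_overflow(Index x, Index y) {
  return x / y + (x % y != 0 ? 1 : 0);
}
```

A loop would call `may_overflow` once up front and fall back to `saturated_add` only when it returns true, keeping plain addition on the common path.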
cc @nluehr",Ben Barsdell,2021-11-05T18:28:54.257Z,NA,NA | |
121 (https://gitlab.com/libeigen/eigen/-/merge_requests/121),Added a make format command,"Using `make format` formats the whole source code according to a .clang-format file, which specifies the exact layout. | |
This makes it possible to later check whether new merge requests follow these guidelines. | |
The copyright on the FindCLANG_FORMAT can still be changed. : Done",Jens Wehner,2021-11-10T17:35:21.520Z,infrastructure::build system,NA | |
720 (https://gitlab.com/libeigen/eigen/-/merge_requests/720),fix a typo,what the title says,Erik Schultheis,2021-11-15T03:42:11.454Z,NA,NA | |
716 (https://gitlab.com/libeigen/eigen/-/merge_requests/716),Convert diag pragmas to nv_diag take 2.,This is a re-submission of MR https://gitlab.com/libeigen/eigen/-/merge_requests/670.,Nathan Luehr,2021-11-15T19:01:01.372Z,NA,NA | |
717 (https://gitlab.com/libeigen/eigen/-/merge_requests/717),moved pruning code to SparseVector.h,"I'm planning to start working on the storage code for sparse matrices. There are a few feature requests for this, see below. | |
This MR does not yet address any of those, but instead moves the prune function, which is only used (and useful) for sparse vectors from the storage implementation to the sparse vector implementation. As this is not related to how the data is stored, this shouldn't (imo) be part of the CompressedStorage class anyway. | |
So this can be seen as a first cleanup step. | |
### Reference issue | |
https://gitlab.com/libeigen/eigen/-/issues/2371 | |
https://gitlab.com/libeigen/eigen/-/issues/2238 | |
https://gitlab.com/libeigen/eigen/-/issues/2207 | |
https://gitlab.com/libeigen/eigen/-/issues/1729",Erik Schultheis,2021-11-15T22:16:03.142Z,NA,NA | |
718 (https://gitlab.com/libeigen/eigen/-/merge_requests/718),use consistent `StorageIndex`,"`SparseMatrix::Map` and `SparseMatrix::TransposedSparseMatrix` are defined in `SparseMatrix`, but they always use the default `StorageIndex`. | |
Now they use the same `StorageIndex` as the `SparseMatrix` object.",Erik Schultheis,2021-11-15T22:35:41.734Z,NA,NA | |
327 (https://gitlab.com/libeigen/eigen/-/merge_requests/327),Reimplemented the Tensor stream output.,"The implementation structure is based on the IO for Eigen::Matrix. | |
Predefined formats in struct Eigen::TensorIOFormat are defined for numpy-like output, native output, plain output and Legacy output for backwards compatibility. | |
Documentation and tests are added.",cpp977,2021-11-16T17:36:59.423Z,MR to reopen,NA | |
723 (https://gitlab.com/libeigen/eigen/-/merge_requests/723),Fix tensor broadcast off-by-one error.,Caught by JAX unit tests. Triggered if broadcast size is smaller than packet size.,Antonio Sánchez,2021-11-16T17:53:58.713Z,NA,NA | |
722 (https://gitlab.com/libeigen/eigen/-/merge_requests/722),Update Umeyama.h,"Update Umeyama.h: `src_var` is only used when `with_scaling == true`. Therefore, the actual computation can be avoided when `with_scaling == false`.",Pablo Speciale,2021-11-16T18:14:11.908Z,NA,NA | |
724 (https://gitlab.com/libeigen/eigen/-/merge_requests/724),Make the new TensorIO implementation work with TensorMap with const elements.,Fix issue with `TensorMap<Tensor<const T...>>` in the new TensorIO implementation from !327.,Rasmus Munk Larsen,2021-11-18T17:45:30.916Z,NA,NA | |
719 (https://gitlab.com/libeigen/eigen/-/merge_requests/719),Fixed Sparse-Sparse Product in case of mixed StorageIndex types,"The Sparse-Sparse product implementation converts its arguments to different storage order, but always uses the storage index of the result matrix. This can cause problems if the input indices are not representable as such. | |
The two cases are | |
1) Result is small, but the inputs are large, e.g. (8 x 512) x (512 x 8) -> (8 x 8). This is the case that was broken. | |
2) Result is large, but inputs are small, e.g. (127 x 8) x ( 8 x 127) -> (127 x 127). That worked, but I've added a test-case nonetheless. | |
Note that case 1) can still lead to problems, because the `StorageIndex` imposes 2 limits: | |
a) The maximum coordinate any nonzero can have | |
b) The maximum number of nonzeros. | |
While a) is clearly not violated in case 1), we might create too many nonzeros and the product would still fail. In this case the result is truly not representable with the given types. This MR only fixes the case when the operation itself is valid.",Erik Schultheis,2021-11-18T18:33:33.309Z,NA,NA | |
728 (https://gitlab.com/libeigen/eigen/-/merge_requests/728),Fix errors for Windows build.,NA,Antonio Sánchez,2021-11-19T04:41:33.660Z,NA,NA | |
726 (https://gitlab.com/libeigen/eigen/-/merge_requests/726),Add basic iterator support for Eigen::array to ease transition to std::array,"In particular, for code built with EIGEN_AVOID_STL_ARRAY, the inconsistency in syntax makes it cumbersome to remove this compilation option.",Rasmus Munk Larsen,2021-11-19T05:31:16.026Z,NA,NA | |
725 (https://gitlab.com/libeigen/eigen/-/merge_requests/725),don't use deprecated MappedSparseMatrix,"This MR removes Eigen-internal references to the deprecated MappedSparseMatrix type. | |
Q: should the MappedSparseMatrix type be removed entirely, now that the C++14 jump is happening?",Erik Schultheis,2021-11-19T15:58:05.090Z,NA,NA | |
727 (https://gitlab.com/libeigen/eigen/-/merge_requests/727),Make numeric_limits members constexpr as per the newer C++ standards.,Author: [email protected].,Rasmus Munk Larsen,2021-11-19T16:14:06.379Z,NA,NA | |
729 (https://gitlab.com/libeigen/eigen/-/merge_requests/729),Implement Eigen::array<...>::reverse_iterator if std::reverse_iterator exists.,"This is needed by the new TensorIO implementation, and it is handy to have available. TODO: Implement for CUDA.",Rasmus Munk Larsen,2021-11-20T00:22:46.929Z,NA,NA | |
733 (https://gitlab.com/libeigen/eigen/-/merge_requests/733),Fix warnings about shadowing definitions.,NA,Rasmus Munk Larsen,2021-11-23T22:52:25.977Z,NA,NA | |
732 (https://gitlab.com/libeigen/eigen/-/merge_requests/732),remove EIGEN_HAS_CXX11,This MR removes the EIGEN_HAS_CXX11 macro (see #2372) and all corresponding ifs. It also removes the test cases that explicitly set the C++ version to less than 11. I've also removed two conditional compilations for GCC versions less than 4.,Erik Schultheis,2021-11-24T20:08:50.121Z,NA,NA | |
737 (https://gitlab.com/libeigen/eigen/-/merge_requests/737),split up large Lapacke LLT macro,"### Reference issue | |
Came across this when working on !731 | |
### What does this implement/fix? | |
Currently, the binding of LLT to Lapacke is done using a large macro. This factors out a large part of the functionality of the macro and implements it explicitly. This results in an increase in line count, though if you take away the number of new comment or blank lines I think the change is marginal. | |
### Additional information | |
On my system, with both `liblapacke` and `liblapacke64` some tests fail. They also fail without the changes here, though.",Erik Schultheis,2021-11-25T16:11:25.952Z,NA,NA | |
740 (https://gitlab.com/libeigen/eigen/-/merge_requests/740),Remove DenseBase::nonZeros() which just calls DenseBase::size(),NA,David Tellenbach,2021-11-27T14:31:02.742Z,NA,NA | |
741 (https://gitlab.com/libeigen/eigen/-/merge_requests/741),Fix for HIP compilation failure in DenseBase.,"Commit 96e537d6fd1a7187feb853c1bdbdea69ee7b99ec introduced EIGEN_DEVICE_FUNC modifiers to some DenseBase functions. | |
The corresponding functions in DenseBase.h were missing the modifiers and this caused a compilation failure in HIP. | |
/cc @cantonios",Rohit Santhanam,2021-11-28T15:59:31.010Z,NA,NA | |
735 (https://gitlab.com/libeigen/eigen/-/merge_requests/735),removed EIGEN_HAS_CXX11_* and redundant EIGEN_COMP_CXXVER checks,"This MR removes conditional compilation for C++11 features `CONTAINERS`, `RVALUE_REFERENCES`, `NOEXCEPT`, `ATOMIC` and `OVERRIDE`. Essentially, these are the feature macros that had a simple condition for when they are enabled. I have merged the minimum compiler versions from these checks into the `this compiler is too old` error path. | |
I have also removed explicit checks for `EIGEN_COMP_CXXVER` and `EIGEN_MAX_CPP_VER` (which seem to have rather inconsistent spelling) that are always true (false) when we have at least C++11. | |
Q: What is the `#define EIGEN_INCLUDE_TYPE_TRAITS` doing in the noexcept part? It seems that it is possible to have `EIGEN_HAS_TYPE_TRAITS` 0 but still get `EIGEN_INCLUDE_TYPE_TRAITS`? | |
#2372",Erik Schultheis,2021-11-29T19:18:58.362Z,NA,NA | |
742 (https://gitlab.com/libeigen/eigen/-/merge_requests/742),Updated CMake,"This MR updates the minimum CMake version required to 3.10, which is supported by both Ubuntu 18 (3.10 supported till April 23) and Debian Buster (3.13 EOL August 22, LTS?). It is not included in Debian stretch, which is EOL but still receives LTS until June 22. | |
I have also removed the option to disable C++11 tests from the CMake file, and cleaned up the corresponding names in the CI. | |
I have also included a second commit which changes the minimum version of gcc to GCC 5. This might conflict or be redundant with !739, so if you like I can also submit a PR without the second change.",Erik Schultheis,2021-11-29T20:24:21.383Z,NA,NA | |
658 (https://gitlab.com/libeigen/eigen/-/merge_requests/658),Update SVD Module to allow specifying computation options with a template parameter. Resolves #2051,"### Reference issue | |
This is related to issue #2051 | |
### What does this implement/fix? | |
This refactoring is an API improvement that changes the ``QRPreconditioner`` template parameter in JacobiSVD to a generic ``Options`` template parameter. For consistency, it also adds a similar optional ``Options`` parameter to BDCSVD. | |
This allows users to request thin unitaries for fixed-size matrices, which currently does not work. This change seems particularly beneficial when using ``.solve(y)``. This does not make any algorithmic changes, **but it is a backwards-incompatible change to the existing API**. | |
[Example of Ceres having to work around this assertion](https://github.com/ceres-solver/ceres-solver/blob/ec4f2995bbde911d6861fb5c9bb7353ad796e02b/internal/ceres/invert_psd_matrix.h#L71) | |
Here is an example of how the new API gets used: | |
~~~~ | |
#include <Eigen/Dense> | |
#include <iostream> | |
using namespace Eigen; | |
using FixedSizeMatrix = Matrix<double, 4, 5>; | |
int main() { | |
  FixedSizeMatrix m = FixedSizeMatrix::Random(); | |
  // The following line triggers an assertion because the matrix type is fixed-size. | |
  // JacobiSVD<FixedSizeMatrix> svd1(m, ComputeThinU | ComputeThinV); | |
  // With the new API, the options are passed as a template parameter instead. | |
  JacobiSVD<FixedSizeMatrix, ColPivHouseholderQRPreconditioner | ComputeThinU | ComputeThinV> svd2(m); | |
  std::cout << svd2.singularValues().transpose() << '\n'; | |
  return 0; | |
} | |
~~~~ | |
### Additional information | |
For testing I mostly wanted to try all combinations of options and make sure that sizes and storage orders are set properly. It does run computation checks, but since it's not an algorithmic change I thought it would be redundant to redo everything. | |
I'm also a little unsure about the BDCSVD tests. It seems like the existing tests occasionally fail on the master branch if they're run a lot. Is this actually the case? They still all pass most of the time, but I think the additional computation checks are just making failures a little bit more common. | |
All feedback to improve this is greatly appreciated!",Arthur,2021-11-29T20:50:47.484Z,"API change, MR to reopen",NA | |
734 (https://gitlab.com/libeigen/eigen/-/merge_requests/734),Select AVX2 even if the data size is not a multiple of 8,"This is a second version of https://gitlab.com/libeigen/eigen/-/merge_requests/46 . That PR contained very useful comments that now seem to be lost. | |
In any case, it was first merged, and then reverted by @rmlarsen1 because some tests were failing. I fixed the tests and added some more tests on top, but it was never merged back. | |
Moreover, the branch was reverted wrongly, in the sense that the revert still left some changes in: | |
``` | |
% git diff 52a2fbbb008a47c5e3fb8ac1c65c2feecb0c511c..5ca10480b0756e40b0723d90adeba8506291fc7c Eigen/src/Core/util/XprHelper.h | |
diff --git a/Eigen/src/Core/util/XprHelper.h b/Eigen/src/Core/util/XprHelper.h | |
index fd2db56a4..26aa609fe 100644 | |
--- a/Eigen/src/Core/util/XprHelper.h | |
+++ b/Eigen/src/Core/util/XprHelper.h | |
@@ -195,7 +195,7 @@ template<typename T> struct unpacket_traits | |
}; | |
template<int Size, typename PacketType, | |
- bool Stop = Size==Dynamic || (Size%unpacket_traits<PacketType>::size)==0 || is_same<PacketType,typename unpacket_traits<PacketType>::half>::value> | |
+ bool Stop = Size==Dynamic || Size >= unpacket_traits<PacketType>::size || is_same<PacketType,typename unpacket_traits<PacketType>::half>::value> | |
struct find_best_packet_helper; | |
template< int Size, typename PacketType> | |
``` | |
So the first fix of my MR (commit 5ca10480b0756e40b0723d90adeba8506291fc7c) actually mistakenly remained in master, but without the further improvements later. | |
It's quite unfortunate that the very useful discussion that happened in https://gitlab.com/libeigen/eigen/-/merge_requests/46 is now gone. Is there a way to recover it?",Francesco Mazzoli,2021-11-29T21:13:26.011Z,NA,NA | |
730 (https://gitlab.com/libeigen/eigen/-/merge_requests/730),bugfix: issue #2375,"### Reference issue | |
Fixes #2375 | |
### What does this implement/fix? | |
Fixed indexed views for indices of non-Eigen, non-built-in types (e.g. `std::array`) | |
### Additional information | |
The underlying problem was with the way strides were being computed in the `traits` class for indexed views. When the increment value for the supplied index was undefined (as with e.g. `std::array`), the stride value was computed as the product of the undefined increment value (i.e. `0xfffffe`) and the expression stride. Since nothing in the test suite depends on this stride value, this went unnoticed (perhaps this value is altogether irrelevant for indexed views?). However, for sufficiently large (i.e. 129 or more) values of the expression stride, the aforementioned product resulted in a signed integer overflow, which gcc detected at compile time. The fix consisted of explicitly checking for an undefined increment value and setting the stride to `Dynamic` when it was detected.",Jakub Gałecki,2021-11-29T22:26:16.173Z,NA,NA | |
736 (https://gitlab.com/libeigen/eigen/-/merge_requests/736),SFINAE'ing away non-const overloads if selfAdjoint/triangular view is not referring to an lvalue,"### Reference issue | |
None, I noticed this while working on !731 | |
### What does this implement/fix? | |
Currently, the non-const transpose methods in selfadjoint and triangular views static_assert that the view represents an lvalue. With this change, this overload is automatically disabled for non-lvalues and the const version is considered. | |
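The overload selection described above can be sketched with a small stand-alone class. `DemoView` and its `IsLvalue` flag are hypothetical stand-ins for illustration only; they are not Eigen's view classes or their actual template parameters.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical view type: IsLvalue says whether the viewed expression
// can be written to.
template <bool IsLvalue>
struct DemoView {
  // Non-const overload: only takes part in overload resolution when the
  // view refers to an lvalue. For IsLvalue == false, substitution fails
  // and the overload is silently dropped (SFINAE), so no static_assert
  // is needed.
  template <bool L = IsLvalue,
            typename std::enable_if<L, int>::type = 0>
  int transpose() { return 1; }

  // Const overload: always available.
  int transpose() const { return 2; }
};
```

Calling `transpose()` on a non-const `DemoView<false>` then resolves to the const overload instead of failing to compile, while a non-const `DemoView<true>` still prefers the non-const one.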
### Additional information | |
Currently clang-tidy produces some ""return type is const-qualified at the top level which may reduce readability without improving const correctness"" warnings. The const-qualification there is in fact necessary, because otherwise the wrong `.transpose` overload would be selected. I have only removed that overload from consideration, but left the return types const for now. | |
There is a second method which has the same static_assert (coeffRef). In this case, though, I have not changed anything, because there is no valid alternative overload, and I think the static assert provides a more informative error message than just getting the compiler error about a missing method, for which you manually have to decipher the SFINAE that disabled it.",Erik Schultheis,2021-11-29T22:51:28.571Z,NA,NA | |
745 (https://gitlab.com/libeigen/eigen/-/merge_requests/745),Fix for HIP compilation breakage in selfAdjoint and triangular view classes.,"HIP related compilation fix for selfAdjoint and triangular view class changes. | |
/cc @cantonios",Rohit Santhanam,2021-11-30T14:01:00.453Z,NA,NA | |
746 (https://gitlab.com/libeigen/eigen/-/merge_requests/746),fixed cholesky with 0 sized matrix (cf. #785),"Lapacke considers 0-sized matrices in LLT to be an error, whereas the Eigen test suite expects them to be a success. | |
This change turns lapacke-based LLT into a no-op that returns success if the input has zero size, thus making the corresponding Eigen test cases pass.",Erik Schultheis,2021-11-30T17:17:42.252Z,NA,NA | |
749 (https://gitlab.com/libeigen/eigen/-/merge_requests/749),"Revert ""Update SVD Module to allow specifying computation options with a...","This change broke a lot of third party libraries, and without an associated version change, this change is too disruptive. We will revert until we come up with a better solution.",Rasmus Munk Larsen,2021-11-30T18:45:55.438Z,NA,NA | |
744 (https://gitlab.com/libeigen/eigen/-/merge_requests/744),Require recent GCC and MSCV and removed `EIGEN_HAS_CXX14` and some other feature test macros,"This MR removes the `EIGEN_HAS_VARIADIC_TEMPLATES` and `EIGEN_HAS_STATIC_ARRAY_TEMPLATE`, | |
`EIGEN_HAS_ALIGNAS` (which seemed to be unused) as well as `EIGEN_HAS_CXX14` and the corresponding version checks. | |
It also removes the checks for GCC older than 5.1 and MSVC older than 1900. | |
What is the new minimum version for ICC? I think 1600 would make all the current checks pass. | |
Also, there are some checks that checked `EIGEN_MAX_CPP` for 14 and then used standard feature-test macros directly. So far I've only removed the MAX_CPP part of the check. | |
I've updated the list in #2372",Erik Schultheis,2021-12-01T00:48:35.494Z,NA,NA | |
739 (https://gitlab.com/libeigen/eigen/-/merge_requests/739),Disable GCC-4.8 tests.,This is to unblock moving the minimum requirement to C++14.,Antonio Sánchez,2021-12-01T02:12:52.637Z,NA,NA | |
752 (https://gitlab.com/libeigen/eigen/-/merge_requests/752),Deprecate macro EIGEN_GPU_TEST_C99_MATH as it's only used in one file and always true.,"### What does this implement/fix? | |
Minor fix for a deprecated macro that previously depended on the cxx11 macro.",Xinle Liu,2021-12-01T14:48:57.304Z,NA,NA | |
748 (https://gitlab.com/libeigen/eigen/-/merge_requests/748),Improved lapacke binding code for HouseholderQR and PartialPivLU,"This MR replaces the binding macros with C++ code for HouseholderQR and PartialPivLU and factors out common binding code into a new file. | |
For the remaining Lapacke bindings, I'm not so sure what the best way to handle them is, since they do explicitly specialize functions of an existing template class instead of the class itself. Maybe refactor the class so that the computation is separated from the interface. | |
Currently, `ColPivHouseholderQR` allocates working arrays `m_colsTranspositions`, `m_temp`, `m_colNormsUpdated`, `m_colNormsDirect` which are unused in the Lapacke code path. Encapsulating this in a separate subobject that handles the computations would mean we could get rid of this wasted space in the Lapacke binding, and potentially instead allocate a buffer that is used by the lapacke functions as the work space.",Erik Schultheis,2021-12-02T00:10:58.956Z,NA,NA | |
755 (https://gitlab.com/libeigen/eigen/-/merge_requests/755),fixed leftover else branch,"This fixes (removes) the leftover else branch that was part of an `#ifdef` that got deleted with the recent changes. I guess the smoketests don't even try to parse the unsupported module. Maybe it would be good to have some tests there that at least try to include the unsupported parts, i.e. they are unsupported, but we still might want to guarantee that at least including their headers into an empty file works.",Erik Schultheis,2021-12-02T18:13:20.537Z,NA,NA | |
447 (https://gitlab.com/libeigen/eigen/-/merge_requests/447),Bicgstabl,"Adds the BiCGSTAB(L) algorithm for solving linear systems. | |
Although IDR(s) is often more efficient, for strongly non-symmetric systems with large imaginary eigenvalues, BiCGSTAB(L) typically converges faster. | |
This implementation sometimes does not pass the test and fails with a norm of 1e-15. Advice on how to improve the numerical stability is welcome. To some extent this seems to be a limitation of this specific algorithm.",Jens Wehner,2021-12-02T22:48:23.453Z,MR to reopen,NA | |
757 (https://gitlab.com/libeigen/eigen/-/merge_requests/757),Idrs refactoring,Basically reformats IDRS code and replaces calls to norm() by calls to StableNorm(),Jens Wehner,2021-12-02T23:32:08.066Z,NA,NA | |
756 (https://gitlab.com/libeigen/eigen/-/merge_requests/756),Only include <atomic> if needed.,"### What does this implement/fix? | |
This will allow (e.g. embedded) toolchains without atomic support to continue compiling Eigen Core by setting EIGEN_DONT_PARALLELIZE.",Rasmus Munk Larsen,2021-12-02T23:55:25.835Z,NA,NA | |
759 (https://gitlab.com/libeigen/eigen/-/merge_requests/759),fix typo `StableNorm` -> `stableNorm`,"the new code in `IDRS.h` misspells `stableNorm`. | |
This seems to have not been caught by the smoketests, maybe they should be extended?",Erik Schultheis,2021-12-04T14:52:10.315Z,NA,NA | |
762 (https://gitlab.com/libeigen/eigen/-/merge_requests/762),fixed snippets,see !760,Erik Schultheis,2021-12-05T17:31:12.703Z,NA,NA | |
761 (https://gitlab.com/libeigen/eigen/-/merge_requests/761),Some further cleanup,"This removes further compiler versions checks for obsolete versions, and the `EIGEN_HAS_CXX14_VARIABLES_TEMPLATES`, `EIGEN_HAS_TYPE_TRAITS`, `EIGEN_HAS_SFINAE` flags.",Erik Schultheis,2021-12-06T18:01:15.774Z,NA,NA | |
577 (https://gitlab.com/libeigen/eigen/-/merge_requests/577),Idrsstabl,"This implements IDR(s)STAB(l) | |
The IDR(s)STAB(l) is a combination of IDR(s) and BiCGSTAB(l). It is a short-recurrences Krylov method for sparse square problems. | |
It can outperform both IDR(s) and BiCGSTAB(l). IDR(s)STAB(l) generally closely follows the optimal GMRES convergence in | |
terms of the number of matrix-vector products, but without the increasing cost per iteration of GMRES. IDR(s)STAB(l) | |
is suitable for both indefinite systems and systems with complex eigenvalues.",Jens Wehner,2021-12-06T20:00:01.016Z,MR to reopen,NA | |
765 (https://gitlab.com/libeigen/eigen/-/merge_requests/765),disambiguate overloads for empty index list,"Clang complains about an ambiguous overload for creating compile time indices when the index list is empty. | |
(e.g. https://gitlab.com/libeigen/eigen/-/jobs/1856367113#L8459) | |
``` | |
../unsupported/test/../../unsupported/Eigen/CXX11/src/Tensor/TensorMeta.h:284:12: error: call to 'customIndices2Array' is ambiguous | |
return customIndices2Array(idx, typename gen_numeric_list<Index, NumIndices>::type{}); | |
``` | |
In principle, the generic version should be enough I think, but the git blame said that the second overload was introduced to prevent some warnings, so I've kept the two overloads. | |
Instead, the first overload now explicitly mentions at least one single entry, thus not being viable for the empty case and preventing the ambiguous call.",Erik Schultheis,2021-12-07T19:40:10.444Z,NA,NA | |
760 (https://gitlab.com/libeigen/eigen/-/merge_requests/760),get rid of `using namespace Eigen` in sample code,"Even if we cannot get rid of the bad examples elsewhere, a first step of showing that `using namespace Eigen` is not good practice is to not do it in the examples that eigen comes with.",Erik Schultheis,2021-12-07T19:57:39.184Z,NA,NA | |
767 (https://gitlab.com/libeigen/eigen/-/merge_requests/767),Make sure exp(-Inf) is zero for vectorized expressions.,"### Reference issue | |
This fixes #2385 | |
### What does this implement/fix? | |
Before this fix exp() would return a non-zero value for -Inf arguments if the expression was vectorized (i.e. the array being at least as long as the packet size of the corresponding scalar type). | |
### Additional information | |
For AVX2 this change gives a small speedup for float and is neutral for double. | |
AVX2 on Skylake: | |
``` | |
name old cpu/op new cpu/op delta | |
BM_eigen_exp_double/1 3.54ns ± 0% 3.54ns ± 0% -0.09% (p=0.005 n=50+49) | |
BM_eigen_exp_double/8 58.8ns ± 1% 59.0ns ± 3% ~ (p=0.385 n=43+56) | |
BM_eigen_exp_double/64 201ns ± 4% 200ns ± 4% ~ (p=0.299 n=59+60) | |
BM_eigen_exp_double/512 1.29µs ± 2% 1.28µs ± 3% -0.73% (p=0.001 n=59+59) | |
BM_eigen_exp_double/4k 9.92µs ± 2% 9.90µs ± 3% ~ (p=0.435 n=59+59) | |
BM_eigen_exp_double/32k 78.8µs ± 2% 78.9µs ± 3% ~ (p=0.584 n=58+59) | |
BM_eigen_exp_double/256k 634µs ± 2% 628µs ± 3% -0.96% (p=0.000 n=59+58) | |
BM_eigen_exp_double/1M 2.54ms ± 2% 2.51ms ± 2% -1.24% (p=0.000 n=34+33) | |
BM_eigen_exp_float/1 3.27ns ± 0% 3.27ns ± 0% -0.10% (p=0.000 n=50+47) | |
BM_eigen_exp_float/8 30.3ns ± 5% 29.6ns ± 0% -2.34% (p=0.001 n=54+50) | |
BM_eigen_exp_float/64 81.3ns ± 2% 79.6ns ± 2% -2.11% (p=0.000 n=58+58) | |
BM_eigen_exp_float/512 471ns ± 4% 455ns ± 3% -3.40% (p=0.000 n=60+58) | |
BM_eigen_exp_float/4k 3.58µs ± 3% 3.45µs ± 3% -3.53% (p=0.000 n=50+49) | |
BM_eigen_exp_float/32k 28.5µs ± 3% 27.5µs ± 3% -3.52% (p=0.000 n=54+52) | |
BM_eigen_exp_float/256k 227µs ± 4% 220µs ± 3% -3.27% (p=0.000 n=49+49) | |
BM_eigen_exp_float/1M 908µs ± 4% 884µs ± 2% -2.65% (p=0.000 n=42+43) | |
``` | |
For SSE, the change nets a 4-6% speedup: | |
``` | |
name old cpu/op new cpu/op delta | |
BM_eigen_exp_double/1 1.90ns ± 0% 1.90ns ± 1% ~ (p=0.567 n=48+60) | |
BM_eigen_exp_double/8 48.2ns ± 0% 45.9ns ± 0% -4.76% (p=0.000 n=49+51) | |
BM_eigen_exp_double/64 348ns ± 2% 328ns ± 2% -5.94% (p=0.000 n=50+49) | |
BM_eigen_exp_double/512 2.74µs ± 0% 2.56µs ± 0% -6.61% (p=0.000 n=44+53) | |
BM_eigen_exp_double/4k 21.9µs ± 0% 20.5µs ± 0% -6.41% (p=0.000 n=58+50) | |
BM_eigen_exp_double/32k 175µs ± 0% 163µs ± 0% -6.52% (p=0.000 n=52+50) | |
BM_eigen_exp_double/256k 1.40ms ± 0% 1.31ms ± 0% -6.45% (p=0.000 n=54+51) | |
BM_eigen_exp_double/1M 5.59ms ± 0% 5.23ms ± 0% -6.41% (p=0.000 n=43+43) | |
BM_eigen_exp_float/1 1.87ns ± 2% 1.89ns ± 0% +1.06% (p=0.000 n=60+53) | |
BM_eigen_exp_float/8 22.5ns ± 0% 25.3ns ± 0% +12.65% (p=0.000 n=54+48) | |
BM_eigen_exp_float/64 149ns ± 0% 142ns ± 0% -4.84% (p=0.000 n=59+50) | |
BM_eigen_exp_float/512 1.17µs ± 0% 1.11µs ± 0% -5.07% (p=0.000 n=54+52) | |
BM_eigen_exp_float/4k 9.36µs ± 0% 8.87µs ± 0% -5.21% (p=0.000 n=52+55) | |
BM_eigen_exp_float/32k 74.9µs ± 0% 70.9µs ± 0% -5.41% (p=0.000 n=54+53) | |
BM_eigen_exp_float/256k 599µs ± 0% 569µs ± 0% -5.11% (p=0.000 n=58+53) | |
BM_eigen_exp_float/1M 2.39ms ± 0% 2.27ms ± 0% -4.92% (p=0.000 n=33+32) | |
```",Rasmus Munk Larsen,2021-12-08T17:57:24.619Z,NA,NA | |
758 (https://gitlab.com/libeigen/eigen/-/merge_requests/758),Build unit tests for HIP using C++14.,"Build GPU unit tests for HIP using C++14. | |
/cc @cantonios",Rohit Santhanam,2021-12-09T08:04:20.163Z,NA,NA | |
763 (https://gitlab.com/libeigen/eigen/-/merge_requests/763),removed helper cmake macro and don't use deprecated COMPILE_FLAGS anymore.,"This is a first attempt at starting to clean up the cmake scripts. | |
It replaces manual manipulation of (deprecated) `COMPILE_FLAGS` with `target_compile_options` and | |
`target_compile_definitions` instead. | |
I don't have a SYCL/HIP version available here, so I haven't tested that part. | |
Q: Should we (in a separate MR) define an interface target for the common test flags and then just `target_link_libraries` with those instead of adding all these options inside `ei_add_test_internal`?",Erik Schultheis,2021-12-09T23:09:56.963Z,NA,NA | |
768 (https://gitlab.com/libeigen/eigen/-/merge_requests/768),removed Find*.cmake scripts for which these are available in cmake itself,"As per #2387, this removes find scripts for packages which have these supplied by cmake. | |
For BLAS, Lapack, GLEW, and GSL cmake provides find scripts: | |
* https://cmake.org/cmake/help/v3.10/module/FindBLAS.html | |
* https://cmake.org/cmake/help/v3.10/module/FindLAPACK.html | |
* https://cmake.org/cmake/help/v3.10/module/FindGLEW.html | |
* https://cmake.org/cmake/help/v3.10/module/FindGSL.html | |
For GSL and GLEW, the cmake versions provide imported targets, (for BLAS and LAPACK only after 3.18), so this change should enable further downstream improvements of the cmake scripts that I haven't looked at yet.",Erik Schultheis,2021-12-10T02:02:35.546Z,NA,NA | |
770 (https://gitlab.com/libeigen/eigen/-/merge_requests/770),fixed customIndices2Array forgetting first index,"In !765 I made sure the overload for `customIndices2Array` was picking up the first element separately to disambiguate the empty case. But I forgot to put this first element into the newly created array. Since this only affects the tensor module, running the smoke tests did not reveal the problem. | |
I'm currently running `./check.sh "".*tensor.*""` to check that this fix really helps.",Erik Schultheis,2021-12-10T16:42:00.358Z,NA,NA | |
769 (https://gitlab.com/libeigen/eigen/-/merge_requests/769),Fix,"Fixes | |
``` | |
Eigen/src/CholmodSupport/CholmodSupport.h:13, | |
from Eigen/SPQRSupport:31, | |
from test/spqr_support.cpp:11: | |
Eigen/src/CholmodSupport/./InternalHeaderCheck.h:2:2: error: #error ""Please include Eigen/CholmodSupport instead of including headers inside the src directory directly."" | |
```",Erik Schultheis,2021-12-10T16:59:49.166Z,NA,NA | |
753 (https://gitlab.com/libeigen/eigen/-/merge_requests/753),turn some macros into constexpr functions,"Now that we have C++14, some actual cleanup. | |
This turns some ""computational"" macros into actual constexpr functions. | |
The added benefit is that there is a bit more checking involved, e.g. you cannot pass floats anymore. | |
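A minimal sketch of the idea, not the code from this MR: the names `SKETCH_SIZE_MIN`, `is_size_like` and `sketch_size_min` are hypothetical, and the `enable_if` constraint is just one assumed way of rejecting floating-point arguments at compile time.

```cpp
#include <type_traits>

// Hypothetical macro in the old style: accepts anything castable to int,
// including floats, which are silently truncated.
#define SKETCH_SIZE_MIN(a, b) (((int)(a) <= (int)(b)) ? (int)(a) : (int)(b))

// True for integers and enums, false for float and double (hypothetical helper).
template <typename T>
using is_size_like =
    std::integral_constant<bool, std::is_integral<T>::value || std::is_enum<T>::value>;

// Hypothetical constexpr replacement: calls with non-integral arguments
// do not compile because the overload is removed by enable_if.
template <typename A, typename B,
          typename = std::enable_if_t<is_size_like<A>::value && is_size_like<B>::value>>
constexpr int sketch_size_min(A a, B b) {
  return (static_cast<int>(a) <= static_cast<int>(b)) ? static_cast<int>(a)
                                                      : static_cast<int>(b);
}
```

With these definitions `sketch_size_min(3, 5)` can be evaluated at compile time, while `sketch_size_min(3.5f, 5)` is rejected. The macro form `SKETCH_SIZE_MIN(3.5f, 5)` still compiles and truncates the float. Both forms cast to `int`, so the narrowing concern raised below applies to the sketch as well.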
Should the type here be `int` or rather `Eigen::Index`? | |
Also, both the old macro and the new code are susceptible to problems stemming from narrowing conversions.",Erik Schultheis,2021-12-10T19:27:02.761Z,NA,NA | |
774 (https://gitlab.com/libeigen/eigen/-/merge_requests/774),Fixes for enabling HIP unit tests. Includes a fix to make this work with the latest cmake.,"Fixes for enabling unit tests on HIP. | |
/cc @cantonios",Rohit Santhanam,2021-12-12T21:03:31.341Z,NA,NA | |
776 (https://gitlab.com/libeigen/eigen/-/merge_requests/776),space separated EIGEN_TEST_CUSTOM_CXX_FLAGS,"Convert spaces in `EIGEN_TEST_CUSTOM_CXX_FLAGS` into `;` to make this a CMake List. | |
This is using the `MODE` version of `separate_arguments` because this respects escapes and quotes. | |
Interestingly, at this point I think `EIGEN_TEST_CUSTOM_LINKER_FLAGS` does not need this processing. I think this is because we are still using the legacy way of specifying `target_link_libraries` without `PUBLIC`/`PRIVATE` specifier.",Erik Schultheis,2021-12-13T15:27:34.232Z,NA,NA | |
773 (https://gitlab.com/libeigen/eigen/-/merge_requests/773),Small speed-up in row-major sparse dense product,"This MR changes the implementation of sparse_time_dense_product_impl for `RowMajor` matrices to use two accumulation variables instead of one. This breaks up the dependency chain of adding the values, and opens up options for the CPU to employ instruction-level parallelism. | |
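The two-accumulator idea can be sketched in plain C++ as follows. This is an illustration under assumptions and not the Eigen implementation: the function name and the CSR-like arguments `values` and `cols` are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// One row of a row-major sparse matrix times a dense vector.
// values[k] is the k-th stored entry of the row, cols[k] its column index.
// Names are hypothetical and not taken from Eigen.
float row_dot_two_accumulators(const std::vector<float>& values,
                               const std::vector<std::size_t>& cols,
                               const std::vector<float>& rhs) {
  float acc0 = 0.0f;
  float acc1 = 0.0f;
  std::size_t k = 0;
  // Even and odd entries go to separate accumulators, so the two additions
  // in one iteration do not depend on each other.
  for (; k + 1 < values.size(); k += 2) {
    acc0 += values[k] * rhs[cols[k]];
    acc1 += values[k + 1] * rhs[cols[k + 1]];
  }
  // Trailing entry when the row has an odd number of stored values.
  if (k < values.size()) acc0 += values[k] * rhs[cols[k]];
  return acc0 + acc1;
}
```

Because the summation order changes, the result may differ from a single-accumulator loop in the last bits for general floating-point inputs.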
I have run the following benchmark on my system (AMD Threadripper 1950, single threaded) | |
``` | |
#define ANKERL_NANOBENCH_IMPLEMENT | |
#include <nanobench.h> | |
#include <Eigen/Sparse> | |
#include <chrono> | |
#include <random> | |
#include <vector> | |
void benchmark_sparse_dense_vector(ankerl::nanobench::Bench& bench) { | |
const Eigen::Index SIZE = 100'000; | |
const Eigen::Index NNZ = 10'000'000; | |
Eigen::SparseMatrix<float, Eigen::RowMajor> matrix(SIZE, SIZE); | |
std::default_random_engine rng; | |
std::uniform_int_distribution<Eigen::Index> dist(0, SIZE - 1); | |
std::vector<Eigen::Triplet<float>> triplets; | |
triplets.reserve(NNZ); | |
auto start = std::chrono::steady_clock::now(); | |
for(int i = 0; i < NNZ; ++i) { | |
triplets.emplace_back(dist(rng), dist(rng), 1.0); | |
} | |
matrix.setFromTriplets(begin(triplets), end(triplets)); | |
Eigen::VectorXf rhs = Eigen::VectorXf::Random(SIZE); | |
Eigen::VectorXf dst(SIZE); | |
bench.run(""Sparse * Dense -> Dense"", [&] {dst = matrix * rhs; }); | |
} | |
int main() { | |
ankerl::nanobench::Bench bench{}; | |
bench.warmup(10).epochs(100); | |
bench.minEpochTime(std::chrono::nanoseconds{1'000'000}); | |
bench.minEpochIterations(10); | |
benchmark_sparse_dense_vector(bench); | |
} | |
``` | |
with the following results | |
# SIZE = 1M | |
## New | |
| ns/op | op/s | err% | ins/op | bra/op | miss% | total | benchmark | |
|--------------------:|--------------------:|--------:|----------------:|---------------:|--------:|----------:|:---------- | |
| 15,099,473.67 | 66.23 | 0.4% | 89,126,601.18 | 14,625,689.82 | 7.2% | 16.55 | `Sparse * Dense -> Dense` | |
## Old | |
| ns/op | op/s | err% | ins/op | bra/op | miss% | total | benchmark | |
|--------------------:|--------------------:|--------:|----------------:|---------------:|--------:|----------:|:---------- | |
| 16,052,703.27 | 62.29 | 1.4% | 90,625,231.30 | 14,125,126.00 | 7.3% | 18.99 | `Sparse * Dense -> Dense` | |
# SIZE = 100k | |
## New | |
| ns/op | op/s | err% | ins/op | bra/op | miss% | total | benchmark | |
|--------------------:|--------------------:|--------:|----------------:|---------------:|--------:|----------:|:---------- | |
| 8,723,373.15 | 114.63 | 1.1% | 62,853,098.64 | 10,452,488.20 | 1.0% | 9.73 | `Sparse * Dense -> Dense` | |
## Old | |
| ns/op | op/s | err% | ins/op | bra/op | miss% | total | benchmark | |
|--------------------:|--------------------:|--------:|----------------:|---------------:|--------:|----------:|:---------- | |
| 9,254,136.85 | 108.06 | 0.4% | 71,994,036.60 | 10,402,745.36 | 1.0% | 10.45 | `Sparse * Dense -> Dense` | |
On the other hand, in my real application (about 60% time spent sparse matrix times dense vector), I don't see any change. I suspect that is because I'm running many of these in parallel, and the computation is severely memory bound anyway.",Erik Schultheis,2021-12-15T18:46:26.435Z,NA,NA | |
782 (https://gitlab.com/libeigen/eigen/-/merge_requests/782),Fix a bug introduced in !751.,A few uses of the EIGEN_IMPLIES macro had side-effects that were conditionally short-circuited in the old but not the new implementation.,Rasmus Munk Larsen,2021-12-15T22:00:40.737Z,NA,NA | |
783 (https://gitlab.com/libeigen/eigen/-/merge_requests/783),Simplify logical_xor(),"### What does this implement/fix? | |
For `a` and `b` of type `bool` we can simplify `(a || b) && !(a && b)` to `a != b`. | |
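The equivalence can be checked exhaustively, since two `bool` arguments have only four combinations. The function names in this sketch are hypothetical and do not correspond to Eigen's `logical_xor()`.

```cpp
// The longer form that the description proposes to simplify.
constexpr bool xor_long(bool a, bool b) { return (a || b) && !(a && b); }

// The simplified form.
constexpr bool xor_short(bool a, bool b) { return a != b; }

// Compare both forms over all four possible inputs.
inline bool forms_agree() {
  const bool values[] = {false, true};
  for (bool a : values)
    for (bool b : values)
      if (xor_long(a, b) != xor_short(a, b)) return false;
  return true;
}
```

The equivalence relies on both operands already being `bool`. For wider integer types `a != b` compares values rather than truthiness.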
Maybe we could even consider using `!=` right away in the code for clarity instead of invoking the function `logical_xor()`. @rmlarsen1 @ngc92 What do you think? | |
### Reference issue | |
See also my comment on commit c20e908e.",Kolja Brix,2021-12-16T20:20:47.898Z,NA,NA | |
785 (https://gitlab.com/libeigen/eigen/-/merge_requests/785),fixed clang warnings about alignment change and floating point precision,"This fixes two warnings that come up in the CI with Clang. | |
The first is related to a conversion of a pointer to a `std::complex` value on the stack to a pointer to `__m64`. We can just align the `res` variable with the same alignment as required by `__m64`, things should be safe then. | |
The other fix is just dropping a `.f` suffix for a double constant. | |
The remaining Clang warnings are about | |
> Eigen/src/Core/Reverse.h:194:51: warning: implicit conversion loses integer precision: 'Eigen::Index' (aka 'long') to 'int' [-Wshorten-64-to-32] | |
Which is because `Eigen::fix<>` has `int` as its template parameter, even though it fulfills the function of an `Eigen::Index` I think. But addressing that seems to be a more complicated change, so I've left that as is for now.",Erik Schultheis,2021-12-18T17:18:16.892Z,NA,NA | |
786 (https://gitlab.com/libeigen/eigen/-/merge_requests/786),Small cleanup of GDB pretty printer code,"### What does this implement/fix? | |
Small cleanup of pretty printer code for the GNU Debugger (GDB): | |
* Rename variable `type` to avoid conflict with Python function `type()`. | |
* Remove import of module that is no longer needed. | |
* Use `+=` for better readability. | |
* Improve formatting.",Kolja Brix,2021-12-18T17:34:38.717Z,NA,NA | |
788 (https://gitlab.com/libeigen/eigen/-/merge_requests/788),Small fixes,"This MR fixes a bunch of smaller issues, making the following changes: | |
* Template parameters in the documentation are documented with `\tparam` instead of `\param` | |
* superfluous semicolon warnings [not enabled by default] fixed | |
* Fixed the type of literals used to initialize float variables",Erik Schultheis,2021-12-21T16:46:10.725Z,NA,NA | |
789 (https://gitlab.com/libeigen/eigen/-/merge_requests/789),Include immintrin.h if F16C is available and vectorization is disabled,"If EIGEN_DONT_VECTORIZE is defined, immintrin.h is not included even if | |
F16C is available. Trying to use F16C intrinsics thus fails. | |
This fixes issue #2395.",David Tellenbach,2021-12-25T19:51:48.138Z,cherry-pick-to-stable::done,NA | |
790 (https://gitlab.com/libeigen/eigen/-/merge_requests/790),Add missing internal namespace,The vectorization logic tests miss some namespace internal qualifiers.,David Tellenbach,2021-12-27T23:50:32.626Z,NA,NA | |
779 (https://gitlab.com/libeigen/eigen/-/merge_requests/779),Improve exp<float>(): Don't flush denormal results + 4% speedup.,"1. Speed up `exp(x)` by reducing the polynomial approximant from degree 7 to degree 6. With exactly representable coefficients computed by the Sollya tool, this still gives a maximum relative error of 1 ulp, i.e. faithfully rounded, for arguments where exp(x) is a normalized float. This change results in a speedup of about 4% for AVX2. | |
2. Extend the range where `exp(x)` returns a non-zero result from ~[-88;88] to ~[-104;88], i.e. return denormalized values for large negative arguments instead of zero. Compared to `exp<double>(x)` the denormalized results gradually decrease in accuracy down to 0.033 relative error for arguments around `x = -104` where `exp(x)` is `~std::numeric_limits<float>::denorm_min()`. This is expected and acceptable. | |
Benchmark numbers for AVX2. | |
``` | |
name old cpu/op new cpu/op delta | |
BM_eigen_exp_float/1 3.27ns ± 0% 3.27ns ± 0% ~ (p=0.218 n=46+48) | |
BM_eigen_exp_float/8 29.6ns ± 0% 30.1ns ± 6% +1.56% (p=0.000 n=41+54) | |
BM_eigen_exp_float/64 80.4ns ± 5% 79.7ns ± 5% -0.85% (p=0.007 n=47+60) | |
BM_eigen_exp_float/512 460ns ± 2% 441ns ± 2% -4.31% (p=0.000 n=60+57) | |
BM_eigen_exp_float/4k 3.48µs ± 2% 3.35µs ± 2% -3.52% (p=0.000 n=49+49) | |
BM_eigen_exp_float/32k 27.6µs ± 3% 26.6µs ± 3% -3.75% (p=0.000 n=54+54) | |
BM_eigen_exp_float/256k 221µs ± 2% 212µs ± 2% -3.81% (p=0.000 n=48+56) | |
BM_eigen_exp_float/1M 887µs ± 3% 848µs ± 2% -4.33% (p=0.000 n=39+54) | |
name old time/op new time/op delta | |
BM_eigen_exp_float/1 3.27ns ± 0% 3.27ns ± 0% ~ (p=0.475 n=49+48) | |
BM_eigen_exp_float/8 29.6ns ± 0% 30.1ns ± 6% +1.54% (p=0.000 n=41+54) | |
BM_eigen_exp_float/64 80.4ns ± 5% 79.7ns ± 5% -0.89% (p=0.006 n=48+60) | |
BM_eigen_exp_float/512 460ns ± 2% 441ns ± 2% -4.31% (p=0.000 n=60+57) | |
BM_eigen_exp_float/4k 3.48µs ± 2% 3.35µs ± 2% -3.52% (p=0.000 n=49+49) | |
BM_eigen_exp_float/32k 27.6µs ± 3% 26.6µs ± 3% -3.73% (p=0.000 n=54+54) | |
BM_eigen_exp_float/256k 221µs ± 2% 212µs ± 2% -3.83% (p=0.000 n=48+56) | |
BM_eigen_exp_float/1M 887µs ± 3% 848µs ± 2% -4.33% (p=0.000 n=39+54) | |
name old INSTRUCTIONS/op new INSTRUCTIONS/op delta | |
BM_eigen_exp_float/1 41.0 ± 0% 41.0 ± 0% ~ (all samples are equal) | |
BM_eigen_exp_float/8 308 ± 0% 308 ± 0% ~ (all samples are equal) | |
BM_eigen_exp_float/64 660 ± 0% 632 ± 0% -4.24% (p=0.000 n=60+60) | |
BM_eigen_exp_float/512 3.29k ± 0% 3.04k ± 0% -7.65% (p=0.000 n=53+55) | |
BM_eigen_exp_float/4k 24.3k ± 0% 22.3k ± 0% -8.39% (p=0.000 n=45+45) | |
BM_eigen_exp_float/32k 193k ± 0% 176k ± 0% -8.50% (p=0.000 n=49+48) | |
BM_eigen_exp_float/256k 1.54M ± 0% 1.41M ± 0% -8.51% (p=0.000 n=44+54) | |
BM_eigen_exp_float/1M 6.16M ± 0% 5.64M ± 0% -8.51% (p=0.000 n=37+52) | |
name old CYCLES/op new CYCLES/op delta | |
BM_eigen_exp_float/1 12.0 ± 0% 12.0 ± 0% ~ (p=0.830 n=49+49) | |
BM_eigen_exp_float/8 109 ± 0% 111 ± 6% +1.52% (p=0.000 n=40+54) | |
BM_eigen_exp_float/64 270 ± 2% 269 ± 5% ~ (p=0.051 n=47+60) | |
BM_eigen_exp_float/512 1.55k ± 2% 1.49k ± 2% -4.10% (p=0.000 n=57+60) | |
BM_eigen_exp_float/4k 11.7k ± 1% 11.3k ± 1% -3.78% (p=0.000 n=50+40) | |
BM_eigen_exp_float/32k 93.0k ± 1% 89.5k ± 1% -3.76% (p=0.000 n=52+48) | |
BM_eigen_exp_float/256k 744k ± 1% 715k ± 1% -3.93% (p=0.000 n=49+46) | |
BM_eigen_exp_float/1M 2.99M ± 1% 2.87M ± 1% -4.02% (p=0.000 n=40+58) | |
```",Rasmus Munk Larsen,2021-12-28T15:00:19.706Z,NA,NA | |
793 (https://gitlab.com/libeigen/eigen/-/merge_requests/793),Remove unused EIGEN_HAS_STATIC_ARRAY_TEMPLATE,NA,David Tellenbach,2021-12-30T15:26:57.137Z,NA,NA | |
794 (https://gitlab.com/libeigen/eigen/-/merge_requests/794),bugfix: *ALTIVEC_H -> *ZVECTOR_H,"Hello, I noticed that some header guards were repeated between the [`AltiVec`](https://gitlab.com/shivaghose/eigen/-/tree/e4d8299a417203980af37f4e544226884b8cc031/Eigen/src/Core/arch/AltiVec) and [`ZVector`](https://gitlab.com/shivaghose/eigen/-/tree/e4d8299a417203980af37f4e544226884b8cc031/Eigen/src/Core/arch/ZVector) packages. This could cause a problem if (for whatever reason) someone attempts to include headers for both architectures. | |
I'm not sure how to write a test for this, but I'm open to suggestions. | |
### Reference issue | |
I couldn't find any existing issues (happy to file an issue if the bug tracking makes your lives easier). | |
### What does this implement/fix? | |
This MR replaces the duplicated header guards found in [`ZVector/Complex.h`](https://gitlab.com/libeigen/eigen/-/blob/22a347b9d2ee8321543e3b15673e1dd1d5456d4e/Eigen/src/Core/arch/ZVector/Complex.h#L11-12) and [`MathFunctions.h`](https://gitlab.com/libeigen/eigen/-/blob/22a347b9d2ee8321543e3b15673e1dd1d5456d4e/Eigen/src/Core/arch/ZVector/MathFunctions.h#L16-17) with unique header guards. | |
### Additional information | |
Ideally we would use something like `pragma once` but I understand that not all compilers and operating systems could easily support it.",Shiva Ghose,2021-12-31T08:43:25.244Z,NA,NA | |
797 (https://gitlab.com/libeigen/eigen/-/merge_requests/797),Add bounds checking to Eigen serializer,"### What does this implement/fix? | |
#2405",Lingzhu Xiang,2022-01-04T18:42:08.354Z,NA,NA | |
771 (https://gitlab.com/libeigen/eigen/-/merge_requests/771),"ensure that eigen::internal::size is not found by ADL, rename to ssize and...","This is an attempt at solving #2391 | |
The naive idea would be to use `std::size` when available (C++17), and otherwise define `Eigen::internal::size`. | |
However, there is a subtle difference between the std:: version and the Eigen version, in that the std version is generally expected to return an unsigned value. C++20 defines `ssize` which is specifically for signed size values. | |
Technically, the `size` function in Eigen is compatible with the std version for Eigen types, as that is defined to return whatever the underlying `object.size()` returns. But for std types and arrays the return types would differ. | |
Therefore, I have changed the function name to `ssize` and put an implementation compatible with the standard (I hope, it is based on https://en.cppreference.com/w/cpp/iterator/size). Since this is an internal function, the name change should be OK I think. | |
There is still a slight subtlety, in that the std ssize function returns `std::common_type_t<std::ptrdiff_t, std::make_signed_t<decltype(c.size())>>`, which means that if the user redefined `Eigen::Index` to be smaller than `std::ptrdiff_t`, the old version would return a value of type `Eigen::Index` but the new one `std::ptrdiff_t`.",Erik Schultheis,2022-01-05T00:46:10.359Z,NA,NA | |
800 (https://gitlab.com/libeigen/eigen/-/merge_requests/800),Some serialization API changes were made in commit...,"Serialization API changes were made which broke the GPU unit tests for HIP. | |
This commit fixes things. | |
/cc @cantonios",Rohit Santhanam,2022-01-05T16:18:46.073Z,NA,NA | |
792 (https://gitlab.com/libeigen/eigen/-/merge_requests/792),Allow specifying inner & outer stride for CWiseUnaryView - fixes #2398,"### Reference issue | |
#2398 | |
### What does this implement/fix? | |
This MR adds the ability to manually specify the inner and/or outer stride for `CWiseUnaryView`, to avoid issues caused by the incorrect automatic derivation of strides. The strides are set in an identical manner to `Map`, via: | |
```cpp | |
CwiseUnaryView<view_op, VectorType, Stride<OuterStride,InnerStride>> vec_view(vec); | |
``` | |
### Additional information | |
Tests are added to check that strides are set correctly",Andrew Johnson,2022-01-05T19:24:47.225Z,NA,NA | |
799 (https://gitlab.com/libeigen/eigen/-/merge_requests/799),Improve plog: 20% speedup for float + handle denormals,"This replaces !784 | |
1. For `float`, replace the degree 10 polynomial approximation of `log(1+x)` on `[sqrt(0.5)-1;sqrt(2)-1]` | |
by a (3,3) rational approximation. This speeds up the function by ~20% for AVX2. | |
The max relative error increases slightly from 2 ulp to 2.2 ulp for arguments > 1e-15. | |
For tiny arguments the error in both the old and new implementation rises to 64 ulp | |
as x approaches `std::numeric_limits<float>::denorm_min()`. This is likely related to | |
the range reduction and remains to be investigated. | |
2. Change argument clamping such that `log(x)` does not incorrectly saturate at `~-88` for | |
denormalized `float` arguments, but continues down to `~-104` for positive denormal arguments. | |
A similar fix is done for `double`. | |
3. Re-enable a test for computing `log(denorm_min)`. | |
Thanks to my colleague James Lottes for suggesting this change and deriving the (3,3) approximant. | |
Benchmark numbers for AVX2: | |
``` | |
name old cpu/op new cpu/op delta | |
BM_eigen_log_float/1 3.55ns ± 0% 3.27ns ± 0% -7.78% (p=0.000 n=48+49) | |
BM_eigen_log_float/8 34.4ns ± 5% 32.7ns ± 0% -4.97% (p=0.000 n=50+38) | |
BM_eigen_log_float/64 107ns ± 5% 86ns ± 3% -19.69% (p=0.000 n=60+60) | |
BM_eigen_log_float/512 640ns ± 5% 502ns ± 5% -21.56% (p=0.000 n=60+60) | |
BM_eigen_log_float/4k 4.94µs ± 5% 3.84µs ± 3% -22.22% (p=0.000 n=60+51) | |
BM_eigen_log_float/32k 39.1µs ± 4% 30.5µs ± 3% -22.07% (p=0.000 n=46+50) | |
BM_eigen_log_float/256k 313µs ± 4% 244µs ± 4% -21.93% (p=0.000 n=45+50) | |
BM_eigen_log_float/1M 1.26ms ± 4% 0.97ms ± 2% -23.06% (p=0.000 n=39+30) | |
name old time/op new time/op delta | |
BM_eigen_log_float/1 3.55ns ± 0% 3.27ns ± 0% -7.79% (p=0.000 n=41+49) | |
BM_eigen_log_float/8 34.4ns ± 5% 32.7ns ± 0% -4.98% (p=0.000 n=50+38) | |
BM_eigen_log_float/64 107ns ± 5% 86ns ± 3% -19.68% (p=0.000 n=60+60) | |
BM_eigen_log_float/512 640ns ± 5% 502ns ± 5% -21.56% (p=0.000 n=60+60) | |
BM_eigen_log_float/4k 4.93µs ± 5% 3.84µs ± 3% -22.19% (p=0.000 n=60+52) | |
BM_eigen_log_float/32k 39.1µs ± 4% 30.5µs ± 3% -22.06% (p=0.000 n=46+50) | |
BM_eigen_log_float/256k 313µs ± 4% 244µs ± 4% -21.94% (p=0.000 n=45+50) | |
BM_eigen_log_float/1M 1.26ms ± 4% 0.97ms ± 2% -23.07% (p=0.000 n=39+30) | |
name old INSTRUCTIONS/op new INSTRUCTIONS/op delta | |
BM_eigen_log_float/1 41.0 ± 0% 41.0 ± 0% ~ (all samples are equal) | |
BM_eigen_log_float/8 328 ± 0% 329 ± 0% +0.30% (p=0.000 n=48+48) | |
BM_eigen_log_float/64 778 ± 0% 684 ± 0% -12.08% (p=0.000 n=56+60) | |
BM_eigen_log_float/512 4.03k ± 0% 3.26k ± 0% -19.03% (p=0.000 n=53+56) | |
BM_eigen_log_float/4k 30.0k ± 0% 23.9k ± 0% -20.47% (p=0.000 n=56+46) | |
BM_eigen_log_float/32k 238k ± 0% 189k ± 0% -20.66% (p=0.000 n=37+44) | |
BM_eigen_log_float/256k 1.90M ± 0% 1.51M ± 0% -20.69% (p=0.000 n=38+45) | |
BM_eigen_log_float/1M 7.60M ± 0% 6.03M ± 0% -20.69% (p=0.000 n=36+35) | |
name old CYCLES/op new CYCLES/op delta | |
BM_eigen_log_float/1 13.1 ± 0% 12.1 ± 0% -7.81% (p=0.000 n=40+50) | |
BM_eigen_log_float/8 127 ± 5% 121 ± 0% -4.98% (p=0.000 n=50+37) | |
BM_eigen_log_float/64 362 ± 2% 293 ± 0% -18.99% (p=0.000 n=56+60) | |
BM_eigen_log_float/512 2.17k ± 2% 1.71k ± 1% -21.00% (p=0.000 n=60+60) | |
BM_eigen_log_float/4k 16.7k ± 2% 13.1k ± 1% -21.65% (p=0.000 n=59+52) | |
BM_eigen_log_float/32k 133k ± 3% 104k ± 1% -21.58% (p=0.000 n=46+45) | |
BM_eigen_log_float/256k 1.06M ± 2% 0.83M ± 1% -21.41% (p=0.000 n=45+50) | |
BM_eigen_log_float/1M 4.26M ± 3% 3.33M ± 1% -21.77% (p=0.000 n=39+38) | |
```",Rasmus Munk Larsen,2022-01-05T23:40:32.471Z,NA,NA | |
802 (https://gitlab.com/libeigen/eigen/-/merge_requests/802),Fixes #i2411,"Fixes #2411 | |
### What does this implement/fix? | |
Commit c20e908ebc42f7174b1d85ce82583a66d185520c introduced a truncation from unsigned int to bool; | |
this MR fixes it. Also, the truncation itself might be a bug (0110 => 0, but it should be 1).",Fabian Keßler,2022-01-06T20:02:38.500Z,NA,NA | |
801 (https://gitlab.com/libeigen/eigen/-/merge_requests/801),Some fixes/cleanups for numeric_limits & fix for related bug in psqrt,"From [email protected]: | |
Some fixes/cleanups for numeric_limits | |
BFloat16: | |
- Set the highest payload bit instead of the lowest for signaling_NaN to match Half | |
- Set has_denorm to denorm_present as this type supports denormals. Otherwise, we should set denorm_min to min as per the standard. | |
Half: | |
- epsilon defined incorrectly | |
- tinyness_before should be identical to the other C++ floating point types | |
- is_bounded defined as false instead of true; is_bounded == false would be true for types with arbitrary precision | |
- traps should be set to what float uses, it is likely false | |
- is_iec559 should be set to true for both types (long double has this true and it has a much weirder encoding) | |
From [email protected]: | |
Add a workaround to the AVX implementation of `psqrt` since `_mm256_rsqrt_ps` appears to flush negative denormal values to zero. | |
Closes #2409",Rasmus Munk Larsen,2022-01-07T01:10:18.838Z,NA,NA | |
791 (https://gitlab.com/libeigen/eigen/-/merge_requests/791),"Add support for Cray, Fujitsu, and Intel ICX compilers","1. This MR adds support for the Cray (CPE), Fujitsu (FCC), and Intel ICX compilers | |
The following preprocessor macros are added: | |
- `EIGEN_COMP_CPE` and `EIGEN_COMP_CLANGCPE` version number of the CRAY compiler if Eigen is compiled with the Cray C++ compiler, `0` otherwise | |
- `EIGEN_COMP_FCC` and `EIGEN_COMP_CLANGFCC` version number of the FCC compiler if Eigen is compiled with the Fujitsu C++ compiler, `0` otherwise | |
- `EIGEN_COMP_CLANGICC` version number of the ICX compiler if Eigen is compiled with the Intel oneAPI C++ compiler, `0` otherwise | |
All three compilers (Cray, Fujitsu, Intel) offer a traditional and a Clang-based frontend. This is distinguished by the `CLANG` fix. | |
2. This MR extends the detection of the IBM XL compiler to V13.1 and V16.1 which use other predefined macros",Matthias Möller,2022-01-07T18:46:16.971Z,NA,NA | |
796 (https://gitlab.com/libeigen/eigen/-/merge_requests/796),Make fixed-size Matrix and Array trivially copyable after C++20,"Making them trivially copyable allows using std::memcpy() without undefined behaviors. | |
Only Matrix and Array with trivially copyable DenseStorage are marked as trivially copyable with an additional type trait. | |
As described in http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p0848r3.html it requires extremely verbose SFINAE to make the special member functions of fixed-size Matrix and Array trivial, unless C++20 concepts are available to simplify the selection of trivial special member functions given template parameters. Therefore only make this feature available to compilers that support C++20 P0848R3. | |
Fix #1855. | |
I have run the official and unsupported tests locally with passing results using the same commands in ci/test.gitlab-ci.yml and `cmake -DCMAKE_CXX_STANDARD=20`, but it's not clear to me how the CI script itself should be updated. | |
Please also let me know if any documentation is needed.",Lingzhu Xiang,2022-01-07T19:04:36.308Z,NA,NA | |
803 (https://gitlab.com/libeigen/eigen/-/merge_requests/803),Fix Gcc8.5 warning about missing base class initialisation (#2404),"### Reference issue | |
Gcc8.5 warning about missing base class initialisation (#2404) | |
### What does this implement/fix? | |
This MR initialises the base class explicitly.",Matthias Möller,2022-01-07T19:34:28.784Z,NA,NA | |
805 (https://gitlab.com/libeigen/eigen/-/merge_requests/805),Make sure the scalar and vectorized paths for array.exp() return consistent values.,This fixes #2413.,Rasmus Munk Larsen,2022-01-07T23:31:36.354Z,NA,NA | |
780 (https://gitlab.com/libeigen/eigen/-/merge_requests/780),Fix accuracy of logistic sigmoid,"Fix accuracy of the specialized float32 implementation of the logistic (sigmoid) function `S(x) = exp(x)/(1+exp(x))` in Eigen. The reason to have this specialization in the first place is that logistic sigmoid is a frequently used function in machine learning and statistics, so even a modestly (~30%) faster version can have an impact in applications. | |
The old implementation was very inaccurate around x=-9 where we would switch from a fast rational approximant to returning `exp(x)`, which has the right asymptotic behavior for large negative x. This approach would have errors >8000 ulps around x=-9 and also be slow for SIMD packets containing elements less than -9, since we would compute both the rational approximant and `exp(x)`. | |
The new algorithm uses a hybrid range reduction method: First the standard range reduction used in `pexp(x)` is applied, and secondly we use the identity `exp(r) = exp(r/2)^2`, to avoid generating denormalized intermediate values for large negative arguments. This enables us to use a fast version of `pldexp` that does not properly handle denormals, but is significantly faster. | |
The final result is an implementation that has a maximum relative error of 4.5 ulps for normalized results. In addition, the old algorithm would return zero for `x < ~-88`, while the new algorithm extends the range to `x ~= -104`, where `S(x) ~= std::numeric_limits<float>::denorm_min()`. Relative accuracy degrades gradually to 0.033 for arguments near that limit. This is expected and acceptable. | |
The new algorithm is about 30% faster than computing `e = pexp(x); s = e / (1 + e)`, while the old inaccurate algorithm was about 40% faster when no arguments `x < -9` are present and about 2x slower when they are. | |
This change also extends the range of the non-specialized version to properly compute denormalized values of S(x) for large negative x, e.g. the range `~[-104;-88]` for float, below which S(x) is too small to be represented even in a denormalized float. This is accomplished by evaluating `S(x)` as `exp(x) / (1 + exp(x))` instead of `1 / (1 + exp(-x))`. | |
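A minimal scalar sketch of why the generic form was changed, assuming IEEE float arithmetic without flush-to-zero. The function names are illustrative and not Eigen's, and the sketch ignores large positive `x`, where `exp(x)` overflows and the first form needs separate handling.

```cpp
#include <cassert>
#include <cmath>

// Form described above for the generic path: the result stays nonzero
// as long as exp(x) is representable, including as a denormal.
// Assumption: x is not large and positive (exp(x) would overflow).
inline float sigmoid_exp_over(float x) {
  assert(!std::isnan(x));
  const float e = std::exp(x);
  return e / (1.0f + e);
}

// Previous form: exp(-x) overflows to +inf for x below about -88,
// so the quotient collapses to exactly zero.
inline float sigmoid_one_over(float x) {
  assert(!std::isnan(x));
  return 1.0f / (1.0f + std::exp(-x));
}
```

With `x = -100`, the first form returns a small positive denormal, while the second returns exactly zero because `exp(100)` overflows to `+inf` in float.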
Thanks to my Google colleagues James Lottes and Sameer Agarwal for the discussions and experimentation that led to this.",Rasmus Munk Larsen,2022-01-08T00:15:15.017Z,NA,NA | |
806 (https://gitlab.com/libeigen/eigen/-/merge_requests/806),Fix IterativeSolverBase referring to itself as ConjugateGradient,"### What does this implement/fix? | |
Two assertion messages in `IterativeSolverBase` refer to it as `ConjugateGradient`. | |
This is incorrect, as the derived class could be several different solvers, not just CG. | |
This MR changes those assertion messages to say `IterativeSolverBase` instead. | |
This is consistent with another identical assertion further down in the file | |
that is already saying `""IterativeSolverBase is not initialized.""` | |
Based on the git history, it looks like IterativeSolverBase was made by extracting | |
code out of `ConjugateGradient` and that these messages were copy-pasted at that | |
time without being modified. A simple copy-paste error.",Essex Edwards,2022-01-08T08:25:16.514Z,NA,NA | |
795 (https://gitlab.com/libeigen/eigen/-/merge_requests/795),Reduce usage of reserved names,"Eigen uses reserved names in quite a few places. For example, we use leading underscores in identifiers, which [are reserved for the implementation](https://timsong-cpp.github.io/cppwp/n3337/reserved.names#global.names-1.2); see also the discussion in #2205 and #361. | |
### Reference issue | |
#2205 | |
See also #361 and !575. | |
### What does this implement/fix? | |
This MR fixes several usages of reserved names. | |
Note: The MR is still work in progress. | |
### Additional information | |
* As in !575 / commit 4ba872bd I am moving the underscore to the end of the name. Please let me know if you prefer another replacement rule. | |
* I would be very happy if somebody could review my MR.",Kolja Brix,2022-01-10T20:53:29.599Z,NA,NA | |
808 (https://gitlab.com/libeigen/eigen/-/merge_requests/808),Explicit type casting,"### What does this implement/fix? | |
The function signature of `pmadd` assumes identical types of all three arguments, i.e. `Eigen/src/Core/GenericPacketMath.h` lines 958ff | |
```cpp | |
/** \internal \returns a * b + c (coeff-wise) */ | |
template<typename Packet> EIGEN_DEVICE_FUNC inline Packet | |
pmadd(const Packet& a, | |
const Packet& b, | |
const Packet& c) | |
{ return padd(pmul(a, b),c); } | |
``` | |
In `Eigen/src/LU/Determinant.h` the function `pmadd` is used with mathematical expressions, i.e. | |
```cpp | |
return internal::pmadd((Scalar)(-m(0,3)),d3_0, (Scalar)(m(1,3)*d3_1)) + | |
internal::pmadd((Scalar)(-m(2,3)),d3_2, (Scalar)(m(3,3)*d3_3)); | |
``` | |
and | |
```cpp | |
return internal::pmadd(m(i0,2), d0, internal::pmadd((Scalar)(-m(i1,2)), d1, (Scalar)(m(i2,2)*d2))); | |
``` | |
which must be explicitly cast to `Scalar`. Otherwise, custom scalar types whose overloads of `operator*` and `operator-` return expression templates rather than the original types will lead to compiler errors. | |
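The deduction failure can be reproduced with a small stand-alone sketch. `Real`, `NegExpr`, and `neg_mul_add` are hypothetical stand-ins for an AD scalar type and are not taken from CoDiPack or Eigen; `pmadd` only mirrors the shape of the generic signature quoted above.

```cpp
#include <cassert>

// Hypothetical scalar type whose unary minus returns a lazy expression
// object instead of a Real, as expression-template libraries often do.
struct Real { double v; };
struct NegExpr {
  double v;
  operator Real() const { return Real{-v}; }  // evaluates the expression
};
inline NegExpr operator-(const Real& a) { return NegExpr{a.v}; }
inline Real operator*(const Real& a, const Real& b) { return Real{a.v * b.v}; }
inline Real operator+(const Real& a, const Real& b) { return Real{a.v + b.v}; }

// Same shape as the generic pmadd: all three arguments share one type.
template <typename Packet>
inline Packet pmadd(const Packet& a, const Packet& b, const Packet& c) {
  return a * b + c;
}

inline Real neg_mul_add(const Real& a, const Real& b, const Real& c) {
  // pmadd(-a, b, c) would not compile: Packet would have to be both
  // NegExpr and Real. The explicit cast makes all arguments Real.
  return pmadd(static_cast<Real>(-a), b, c);
}
```

The cast forces evaluation of the expression object before template argument deduction, which is the effect of the `(Scalar)` casts in the snippets above.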
### Additional information | |
This bug has been observed while differentiating Eigen using the AD library CoDiPack.",Matthias Möller,2022-01-10T22:06:44.478Z,NA,NA | |
810 (https://gitlab.com/libeigen/eigen/-/merge_requests/810),Fix two corner cases in the new implementation of logistic sigmoid.,"1. Truncate at the first point where the interpolant is exactly 1, | |
such that 1 is returned for all arguments greater than or equal | |
to it. | |
2. Make sure that Sigmoid(+Inf) = 1 in the generic implementation.",Rasmus Munk Larsen,2022-01-12T00:41:30.681Z,NA,NA | |
809 (https://gitlab.com/libeigen/eigen/-/merge_requests/809),fix broken asserts,"The conditions in these asserts contained the variable name as a string, so there was no actual checking.",Erik Schultheis,2022-01-12T19:46:45.680Z,NA,NA | |
811 (https://gitlab.com/libeigen/eigen/-/merge_requests/811),fix compilation issue with gcc < 10 and -std=c++2a,"Fixes #2415 | |
This MR requires GCC <= 9.4 to use the old code for signed size as `std::ssize` is not available.",Joerg Buchwald,2022-01-13T01:43:05.391Z,NA,NA | |
764 (https://gitlab.com/libeigen/eigen/-/merge_requests/764),Add MMA and performance improvements for VSX in GEMV for PowerPC.,"Add MMA and performance improvements for VSX in GEMV for PowerPC. | |
Changes include improved complex operations, full packet operations, and addition of VSX and MMA acceleration. | |
Up to 2.5X faster for VSX and 4X for MMA.",Chip Kerchner,2022-01-13T13:23:19.153Z,NA,NA | |
812 (https://gitlab.com/libeigen/eigen/-/merge_requests/812),fix implicit conversion warning in vectorwise_reverse_inplace,"### Reference issue | |
This fixes warning 2 in Issue #2400; the implicit conversion from ``Eigen::Index`` to ``int`` in ``vectorwise_reverse_inplace_impl`` | |
I also noticed this warning when building the BDCSVD tests. | |
### What does this implement/fix? | |
There's a new warning when building the tests about an implicit conversion from ``Index`` to ``int`` in Reverse, when it uses ``fix<N>(n)``. | |
This update just explicitly casts ``half`` from an Index to an int to remove the conversion warning. Since as far as I can see, ``fix<N>(n)`` only works with ``int``, this casting should be equivalent to how it was before the recent changes. | |
Before the recent cleanup, the pre-c++14 version of ``fix`` took the runtime value generically, and then manually cast it down to an int. So I think there was generally no implicit conversion.",Arthur,2022-01-13T20:30:55.569Z,NA,NA | |
814 (https://gitlab.com/libeigen/eigen/-/merge_requests/814),update comment referencing removed macro EIGEN_SIZE_MIN_PREFER_DYNAMIC,"### What does this implement/fix? | |
super minor :sweat_smile: | |
Just stumbled on this comment referencing ``EIGEN_SIZE_MIN_PREFER_DYNAMIC``, which no longer exists. | |
Updated it to reference the new constexpr function instead.",Arthur,2022-01-14T19:29:48.485Z,NA,NA | |
813 (https://gitlab.com/libeigen/eigen/-/merge_requests/813),Minor correction/clarification to LSCG solver documentation,"This applies some minor corrections/clarifications to the docs of LeastSquaresConjugateGradient (LSCG). | |
LSCG solves the least squares problem ""min |Ax-b|"" or, equivalently, the normal equations ""A'Ax=A'b"". The documentation seems a little confused about this. In one place, it is described as solving ""min |A'Ax-b|"", which appears to be a mashup of the two formulas and is not correct. In another place, it is described as solving Ax=b, which is misleading without mentioning least-squares, as it (probably) will not be able to find an x satisfying that equation. I would guess both of these issues are simple copy-paste/editing errors.",Essex Edwards,2022-01-14T19:48:55.119Z,NA,NA | |
815 (https://gitlab.com/libeigen/eigen/-/merge_requests/815),Fix implicit conversion warning in GEBP kernel's packing,"### Reference issue | |
Warning 1 in #2400 | |
### What does this implement/fix? | |
very small change to remove a shortening conversion warning in GeneralBlockPanelKernel.h. Not too important to fix, but I've seen this warning pop up a number of times in unrelated compilation errors and it's a little annoying to scroll past. | |
(Since it doesn't really show up in the diff,) ``gemm_pack_lhs`` in the gebp kernel includes some code like this | |
~~~~ | |
int pack = Pack1; | |
int psize = PacketSize; | |
... | |
Index left = rows - i; // rows remaining to pack | |
... | |
psize = pack = left & ~1; // triggers -Wshorten-64-to-32 | |
~~~~ | |
This MR just changes ``pack``, ``psize``, (and a loop variable) to ``Index`` to remove the warning. | |
### Additional Notes | |
maybe it's preferable to just do ``int left = internal::convert_index<int>(rows - i)``? | |
I think ``left = rows - i`` is the few remaining rows that didn't fit with the current value of ``pack``. So ``left`` should definitely be small enough to fit in an int. (and just printing the value of ``left`` in the product_large tests confirms it's always smaller than ``Pack1``)",Arthur,2022-01-18T12:55:05.320Z,NA,NA | |
816 (https://gitlab.com/libeigen/eigen/-/merge_requests/816),Port EIGEN_OPTIMIZATION_BARRIER to soft float arm,"### Reference issue | |
Currently Eigen does not build if the -march CFLAG is set to ""armv6j+nofp"". This is happening because a ""w"" inline asm constraint is used. | |
### What does this implement/fix? | |
This change detects no-floating-point-hw targets using the __ARM_FP macro, and avoids the ""w"" constraint accordingly.",David Gao,2022-01-20T00:44:17.678Z,NA,NA | |
819 (https://gitlab.com/libeigen/eigen/-/merge_requests/819),Improve clang warning suppressions by checking if warning is supported,NA,Sean McBride,2022-01-21T00:27:43.962Z,NA,NA | |
818 (https://gitlab.com/libeigen/eigen/-/merge_requests/818),Silence some MSVC warnings,"### What does this implement/fix? | |
Silence two warnings in construct_elements_of_array() | |
- C4701: potentially uninitialized local variable 'i' used (in catch handler) | |
- C4702: unreachable code (return NULL) | |
I have been running these mods locally for a while on vs2017, 2019 and now 2022. Has not caused me issues. | |
However, I have not run tests or compiled on other platforms.",Stephen Pierce,2022-01-21T00:47:15.290Z,NA,NA | |
772 (https://gitlab.com/libeigen/eigen/-/merge_requests/772),Cleanup,"Some more cleanup, removing EIGEN_HAS_CONSTEXPR, EIGEN_HAS_INDEX_LIST, EIGEN_HAS_STD_RESULT_OF and Eigen's own implementation of index/integer_sequence. | |
These are the macros currently on the list in #2372; there are some more where I'm not yet sure whether they should be removed.",Erik Schultheis,2022-01-21T01:48:59.990Z,NA,NA | |
817 (https://gitlab.com/libeigen/eigen/-/merge_requests/817),Add support for packets of int64 on x86,Currently only AVX is supported.,Ilya Tokar,2022-01-21T19:55:25.184Z,NA,NA | |
821 (https://gitlab.com/libeigen/eigen/-/merge_requests/821),Prevent heap allocation in diagonal product,"### Reference issue | |
fixes #2408 - product with a diagonal matrix had a heap allocation. | |
### What does this implement/fix? | |
Just sets the ``NestByRefBit`` in the ``DiagonalMatrix`` traits to prevent the heap allocation. | |
Since this bit wasn't set, the ``Product`` class was not using a reference type! Instead, it was copying the diagonal matrix.",Arthur,2022-01-21T21:36:01.524Z,NA,NA | |
820 (https://gitlab.com/libeigen/eigen/-/merge_requests/820),"Add reciprocal packet op and fast specializations for float with SSE, AVX, and AVX512.","Add reciprocal packet op and fast specializations for float with | |
SSE and AVX, which have builtin instructions for approximate reciprocal. | |
The approximation is refined by one step of Newton-Raphson iteration. | |
The result is accurate to 2 ulps for SSE/AVX and within 1 ulp for AVX512, | |
where the `_mm512_rcp14_ps` instruction provides a better starting guess. | |
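One Newton-Raphson step for the reciprocal can be sketched in scalar form as follows. This is an illustration in double precision with hypothetical function names, not the packet code from this MR. If the estimate `y0` has relative error `e`, the refined value has relative error of about `e*e`, so one step roughly doubles the number of correct bits.

```cpp
#include <cassert>
#include <cmath>

// One Newton-Raphson step for f(y) = 1/y - x: y1 = y0 * (2 - x * y0).
inline double reciprocal_newton_step(double x, double y0) {
  assert(x != 0.0);  // this plain sketch does not handle x == 0
  return y0 * (2.0 - x * y0);
}

// Relative error of y as an approximation of 1/x.
inline double reciprocal_rel_error(double x, double y) {
  return std::fabs(x * y - 1.0);
}
```

Starting from an estimate of `1/3` that is off by a relative `1e-3`, a single step brings the relative error down to about `1e-6`.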
TODO: Add specializations for more ISAs with fast approximate reciprocal instructions. | |
Benchmark numbers measured on Intel Xeon Gold 6154 (Skylake): | |
``` | |
AVX512 (packet size 16) | |
name old cpu/op new cpu/op delta | |
BM_eigen_inverse_float/1 2.72ns ± 0% 0.61ns ± 2% -77.39% (p=0.000 n=53+59) | |
BM_eigen_inverse_float/8 5.73ns ± 1% 6.25ns ± 2% +9.10% (p=0.000 n=58+60) | |
BM_eigen_inverse_float/64 31.2ns ± 2% 13.4ns ± 2% -56.96% (p=0.000 n=51+60) | |
BM_eigen_inverse_float/512 115ns ± 2% 43ns ± 3% -62.38% (p=0.000 n=59+57) | |
BM_eigen_inverse_float/4k 781ns ± 2% 290ns ± 2% -62.88% (p=0.000 n=60+53) | |
BM_eigen_inverse_float/32k 6.12µs ± 2% 2.94µs ± 3% -51.99% (p=0.000 n=59+48) | |
BM_eigen_inverse_float/256k 80.1µs ± 2% 81.2µs ± 2% +1.28% (p=0.000 n=60+56) | |
BM_eigen_inverse_float/1M 321µs ± 2% 324µs ± 1% +0.91% (p=0.000 n=33+29) | |
AVX (packet size 8): | |
name old cpu/op new cpu/op delta | |
BM_eigen_inverse_float/1 2.72ns ± 0% 3.28ns ± 1% +20.53% (p=0.000 n=54+45) | |
BM_eigen_inverse_float/8 5.72ns ± 0% 6.65ns ± 0% +16.21% (p=0.000 n=56+56) | |
BM_eigen_inverse_float/64 19.0ns ± 0% 12.6ns ± 2% -33.75% (p=0.000 n=58+48) | |
BM_eigen_inverse_float/512 95.1ns ± 0% 50.7ns ± 4% -46.65% (p=0.000 n=52+55) | |
BM_eigen_inverse_float/4k 704ns ± 0% 368ns ± 2% -47.65% (p=0.000 n=56+50) | |
BM_eigen_inverse_float/32k 5.57µs ± 0% 3.47µs ± 3% -37.75% (p=0.000 n=57+50) | |
BM_eigen_inverse_float/256k 78.2µs ± 1% 80.7µs ± 2% +3.29% (p=0.000 n=59+58) | |
BM_eigen_inverse_float/1M 313µs ± 1% 323µs ± 1% +3.42% (p=0.000 n=33+33) | |
SSE (packet size 4): | |
name old cpu/op new cpu/op delta | |
BM_eigen_inverse_float/1 0.83ns ± 1% 0.83ns ± 0% -0.10% (p=0.006 n=56+50) | |
BM_eigen_inverse_float/8 3.26ns ± 0% 2.45ns ± 0% -24.68% (p=0.000 n=48+50) | |
BM_eigen_inverse_float/64 14.7ns ± 0% 11.1ns ± 1% -24.48% (p=0.000 n=53+54) | |
BM_eigen_inverse_float/512 106ns ± 0% 86ns ± 0% -18.47% (p=0.000 n=55+55) | |
BM_eigen_inverse_float/4k 835ns ± 0% 640ns ± 0% -23.44% (p=0.000 n=55+54) | |
BM_eigen_inverse_float/32k 6.67µs ± 0% 5.81µs ± 0% -12.83% (p=0.000 n=51+56) | |
BM_eigen_inverse_float/256k 78.4µs ± 2% 79.0µs ± 1% +0.71% (p=0.000 n=55+54) | |
BM_eigen_inverse_float/1M 313µs ± 1% 316µs ± 1% +0.88% (p=0.000 n=29+30) | |
``` | |
Thanks to @sandwichmaker for reviewing a preliminary version of this at Google.",Rasmus Munk Larsen,2022-01-21T23:49:19.363Z,NA,NA | |
822 (https://gitlab.com/libeigen/eigen/-/merge_requests/822),make casts explicit and fixed the type,"I think there is a bug in the implementation of the rand test. | |
The variable is called short offset, and assigned a maximum value of 16'000, yet is stored in a signed char. | |
I admittedly don't quite understand how this is used exactly, so it would be good if someone with more knowledge about this test could take a look. | |
I've also changed the maximum number that can be assigned to short offset, because `24345 + 16000 > 2^15` would result in an overflow in line 97.",Erik Schultheis,2022-01-24T18:19:22.495Z,NA,NA | |
824 (https://gitlab.com/libeigen/eigen/-/merge_requests/824),"Remove inline assembly for FMA (AVX) and add remaining extensions as packet ops: pmsub, pnmadd, and pnmsub.","Adding the additional variation can save explicit negations in various low-level implementations. In a followup to this change, they will be used to make `preciprocal` IEEE compliant with minimal overhead. | |
This change also removes the old workaround for register spilling in `Eigen/src/Core/arch/AVX/PacketMath.h`, which appears very counterproductive on modern compiler/CPU combos. For example, compiling a matrix multiplication benchmark with clang 11 without the workaround yields the following speedups on a Skylake core (in addition to the improved readability). | |
| flags | speedup | | |
| ------ | ------ | | |
| -march=skylake | 25% (!) | | |
| -mavx -mfma | 12% (!) | | |
| -mavx | unchanged | | |
Closes #2231",Rasmus Munk Larsen,2022-01-26T04:25:41.636Z,NA,NA | |
825 (https://gitlab.com/libeigen/eigen/-/merge_requests/825),reduce float warnings (comparisons and implicit conversions),"This MR consists of three changes to reduce the number of warnings that would be generated due to floating point concerns: | |
1) It adds new utility functions for performing exact floating point comparisons to 0 and 1. Why the extra function? Because for one, these are the most common types of exact floating point comparisons, and the ones where we usually clearly want exact comparison to happen (e.g. do I need to add this scalar/do I need to multiply with this scalar/can I ignore this term). Furthermore, having this as an extra function means we can hide generating the correctly typed 0 and 1 in their code, and enforce that the correct type is used. | |
2) Replace `==` with `equal_strict` in regular code, and `VERIFY(... == ...)` with `VERIFY_IS_EQUAL` in the tests. | |
3) wrote out some implicit conversions into explicit casts. This is only in the test suite. | |
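A minimal sketch of what such helpers could look like. The `_sketch` names and bodies are hypothetical and only illustrate the idea of hiding the correctly typed 0 and 1 inside the helper; they are not the code added by this MR.

```cpp
#include <cassert>

// Exact comparison against a correctly typed zero.
template <typename Scalar>
inline bool is_zero_strict_sketch(const Scalar& x) {
  return x == Scalar(0);
}

// Exact comparison against a correctly typed one.
template <typename Scalar>
inline bool is_one_strict_sketch(const Scalar& x) {
  return x == Scalar(1);
}

// Typical use: skip a term only when the scalar is exactly zero.
template <typename Scalar>
inline Scalar axpy_sketch(const Scalar& alpha, const Scalar& x, const Scalar& y) {
  if (is_zero_strict_sketch(alpha)) return y;  // term contributes nothing
  return alpha * x + y;
}
```

Keeping the comparison in one place also gives a single spot where a float-equality warning can be suppressed.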
Q: What should be the name for the comparison to 0/1 numbers? Right now I'm using `is_zero_strict` in analogy to `equal_strict`, but a more descriptive name (maybe) could also be `is_exactly_zero` (this is more how I would read this out aloud in an if, for example).",Erik Schultheis,2022-01-26T18:16:19.578Z,NA,NA | |
827 (https://gitlab.com/libeigen/eigen/-/merge_requests/827),Make preciprocal IEEE compliant w.r.t. 1/0 and 1/inf.,"The new implementation takes advantage of the precondition that the starting approximation is 0 for infinite arguments and vice versa. Since one term in the Newton-Raphson step is the product of the argument and the approximation, we can detect zeros and infinities by checking if this term is NaN. This is faster than explicitly testing whether the argument is inf or 0. | |
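A scalar sketch of the idea, assuming the starting estimate `y0` is 0 for infinite `x` and infinite for `x = 0`, as stated above. The function name is hypothetical, and the real packet code would presumably use a select rather than a branch.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Newton-Raphson refinement that keeps the estimate when x * y0 is NaN,
// which happens for 0 * inf and inf * 0 (and for NaN inputs).
inline float refine_reciprocal_sketch(float x, float y0) {
  const float xy = x * y0;
  if (std::isnan(xy)) return y0;  // 1/0 = inf and 1/inf = 0 pass through
  return y0 * (2.0f - xy);
}
```

Without the NaN check, the refinement step would turn the correct estimates for `1/0` and `1/inf` into NaN.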
Here are benchmark results comparing this implementation with `pdiv` (i.e. the state before commit ea2c02060cc329cc83f6a77e8247b195b5defcd9.) The change also appears to speed up the scalar path significantly (not sure why, but I'll take it). | |
``` | |
SSE+FMA (-mfma) | |
name old cpu/op new cpu/op delta | |
BM_eigen_inverse_float/1 2.72ns ± 0% 1.09ns ± 0% -59.73% (p=0.000 n=53+58) | |
BM_eigen_inverse_float/8 5.71ns ± 1% 5.70ns ± 0% -0.20% (p=0.000 n=49+54) | |
BM_eigen_inverse_float/64 19.0ns ± 0% 10.4ns ± 3% -45.25% (p=0.000 n=52+60) | |
BM_eigen_inverse_float/512 95.0ns ± 0% 63.8ns ± 3% -32.85% (p=0.000 n=55+59) | |
BM_eigen_inverse_float/4k 703ns ± 0% 508ns ± 3% -27.80% (p=0.000 n=55+60) | |
BM_eigen_inverse_float/32k 5.57µs ± 0% 4.89µs ± 3% -12.30% (p=0.000 n=56+60) | |
BM_eigen_inverse_float/256k 78.3µs ± 1% 80.4µs ± 2% +2.62% (p=0.000 n=60+60) | |
BM_eigen_inverse_float/1M 313µs ± 2% 322µs ± 3% +2.82% (p=0.000 n=35+35) | |
AVX+FMA (-march=skylake) | |
name old cpu/op new cpu/op delta | |
BM_eigen_inverse_float/1 2.72ns ± 0% 1.10ns ± 0% -59.68% (p=0.000 n=47+60) | |
BM_eigen_inverse_float/8 5.72ns ± 0% 5.70ns ± 0% -0.39% (p=0.000 n=52+55) | |
BM_eigen_inverse_float/64 19.0ns ± 0% 11.0ns ± 2% -41.86% (p=0.000 n=52+60) | |
BM_eigen_inverse_float/512 95.1ns ± 0% 64.2ns ± 3% -32.44% (p=0.000 n=54+60) | |
BM_eigen_inverse_float/4k 704ns ± 0% 510ns ± 3% -27.53% (p=0.000 n=54+60) | |
BM_eigen_inverse_float/32k 5.57µs ± 0% 4.89µs ± 3% -12.32% (p=0.000 n=56+60) | |
BM_eigen_inverse_float/256k 78.4µs ± 1% 80.6µs ± 1% +2.88% (p=0.000 n=60+53) | |
BM_eigen_inverse_float/1M 314µs ± 1% 322µs ± 1% +2.79% (p=0.000 n=33+29) | |
AVX512+FMA (-march=skylake-avx512) | |
name old cpu/op new cpu/op delta | |
BM_eigen_inverse_float/1 2.72ns ± 0% 0.62ns ± 2% -77.23% (p=0.000 n=50+60) | |
BM_eigen_inverse_float/8 5.73ns ± 1% 6.28ns ± 2% +9.65% (p=0.000 n=55+59) | |
BM_eigen_inverse_float/64 31.4ns ± 2% 13.5ns ± 2% -57.17% (p=0.000 n=50+60) | |
BM_eigen_inverse_float/512 115ns ± 2% 46ns ± 2% -59.80% (p=0.000 n=60+59) | |
BM_eigen_inverse_float/4k 787ns ± 2% 324ns ± 3% -58.87% (p=0.000 n=60+49) | |
BM_eigen_inverse_float/32k 6.17µs ± 2% 3.20µs ± 3% -48.16% (p=0.000 n=60+50) | |
BM_eigen_inverse_float/256k 80.5µs ± 2% 80.6µs ± 1% ~ (p=0.262 n=57+60) | |
BM_eigen_inverse_float/1M 322µs ± 1% 322µs ± 1% ~ (p=0.429 n=30+29) | |
```",Rasmus Munk Larsen,2022-01-26T20:38:07.115Z,NA,NA | |
828 (https://gitlab.com/libeigen/eigen/-/merge_requests/828),Fix number of block columns to NOT overflow the cache (PowerPC) abnormally in GEMV,Fix number of block columns to NOT overflow the cache (PowerPC) abnormally in GEMV. Timings were unusual when number of columns was between 2500-3200-ish.,Chip Kerchner,2022-01-27T20:35:53.586Z,NA,NA | |
830 (https://gitlab.com/libeigen/eigen/-/merge_requests/830),removed some documentation referencing c++98 behaviour,This MR removes some comments/documentation that reference C++98/03 behaviour.,Erik Schultheis,2022-01-30T20:22:51.206Z,NA,NA | |
826 (https://gitlab.com/libeigen/eigen/-/merge_requests/826),Update SVD Module with Options template parameter,"### Reference issue | |
Take 2 of !658. This MR is a less devastating API break :sweat_smile: | |
Updated following discussion in !750. | |
### What does this implement/fix? | |
This makes several API breaking changes to the SVD module. However, it stays compatible with the most common use of the old API. | |
1. Adds the ``Options`` template parameter to the SVD module: ``JacobiSVD<MatrixType, ComputeThinU | ComputeThinV> svd(m);`` | |
- This ""improved"" API allows computing thin unitaries of fixed-size matrices, which is not possible at the moment. | |
2. ""Deprecates"" the constructor taking computation options: ``SvdType svd(m, ComputeThinU);`` to stay compatible with the old version. | |
3. Disallows using both the ``Options`` template parameter and deprecated constructor at the same time. | |
- This is so it can use a static assert to fail early, rather than failing at runtime or silently preferring one setting over the other. | |
- I.e., doing ``JacobiSVD<MatrixType, Options> svd(m, options);`` raises a static assert. | |
4. Removes the overload of ``compute`` that could change the computation options: ``svd.compute(m2, newOptions);`` | |
The plugin methods are essentially in the same situation as the constructor. Can use either | |
- ``m.template jacobiSvd<Options>();`` | |
- ``m.jacobiSvd(options);`` | |
### Additional information | |
Worked on this very sporadically, but the majority of this should be essentially the same as the original MR.",Arthur,2022-02-02T00:15:44.550Z,NA,NA | |
835 (https://gitlab.com/libeigen/eigen/-/merge_requests/835),Fix ODR violations.,"We can't use unnamed namespaces (or any internal linkage) in headers, since this will | |
lead to ODR violations and undefined behavior. | |
Fixes #2392",Antonio Sánchez,2022-02-04T19:01:08.324Z,NA,NA | |
832 (https://gitlab.com/libeigen/eigen/-/merge_requests/832),"Fix AVX512 math function consistency, enable for ICC.",Fixes #2419.,Antonio Sánchez,2022-02-04T19:35:19.174Z,NA,NA | |
833 (https://gitlab.com/libeigen/eigen/-/merge_requests/833),Fix 32-bit arm int issue.,"For (some?) 32-bit arm platforms, `int32_t` is actually of type `long int`, | |
not `int`. We have a couple places where we assume extracting a bit | |
pattern from a `float` is type `int`, when we should use `int32_t` | |
instead. The discrepancy in types causes some packet functions to | |
fail. | |
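A sketch of the portable pattern, assuming a 32-bit IEEE `float`. The helper name is illustrative and not from Eigen; naming the fixed-width type avoids depending on whether `int32_t` is an alias for `int` or `long int` on a given platform.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Extract the bit pattern of a float into a fixed-width integer.
inline std::int32_t float_bits(float f) {
  assert(sizeof(float) == sizeof(std::int32_t));
  std::int32_t bits;
  std::memcpy(&bits, &f, sizeof(bits));  // well-defined type punning
  return bits;
}
```

Overloads or template specializations written against `int32_t` then resolve the same way on every target.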
I'm not sure how prevalent this issue is, but these changes at least fix | |
it for #2412. | |
Fixes #2412.",Antonio Sánchez,2022-02-04T21:59:34.504Z,NA,NA | |
840 (https://gitlab.com/libeigen/eigen/-/merge_requests/840),Correct use of EIGEN_CUDACC to respect EIGEN_NO_CUDA.,"Previously the mix of EIGEN_CUDACC and __CUDACC__ led to discrepancies | |
when `EIGEN_NO_CUDA` was defined. | |
Fixes #2290",Antonio Sánchez,2022-02-04T22:24:32.003Z,NA,NA | |
838 (https://gitlab.com/libeigen/eigen/-/merge_requests/838),Define EIGEN_HAS_AVX512_MATH in PacketMath.,"Previously the macro was used before it was defined, so it defaulted to 0. This | |
fixes the order.",Antonio Sánchez,2022-02-04T22:25:52.813Z,NA,NA | |
836 (https://gitlab.com/libeigen/eigen/-/merge_requests/836),Restrict GCC<6.3 maxpd workaround to only gcc.,"Previously this was applied to any gnuc-compatible compiler (e.g. clang). | |
Fixes #2332.",Antonio Sánchez,2022-02-04T22:47:35.454Z,NA,NA | |
841 (https://gitlab.com/libeigen/eigen/-/merge_requests/841),"Add generic fast psqrt and prsqrt impls and make them correct for 0, +Inf, NaN, and negative arguments.","1. Consolidate fast psqrt and prsqrt into generic implementations and avoid duplicating this code for SSE, AVX, and AVX512. TODO: Use these generic implementations for more architectures. | |
1. Make both fast psqrt and prsqrt correct for 0, Inf, NaN and negative arguments. These functions are now fully standard compliant, except that they treat positive subnormal input arguments as zeros. | |
The performance regressions associated with these changes are less than 5% measured for SSE+FMA, AVX, and AVX512 on Skylake.",Rasmus Munk Larsen,2022-02-05T00:20:14.109Z,NA,NA | |
844 (https://gitlab.com/libeigen/eigen/-/merge_requests/844),Update MPL2 with https.,Fixes #2433.,Antonio Sánchez,2022-02-07T17:51:16.891Z,NA,NA | |
843 (https://gitlab.com/libeigen/eigen/-/merge_requests/843),Fix collision with resolve.h.,"`resolve.h` pollutes the global namespace with `_res`, so this clashes | |
with our local parameter name. | |
Rename locals from `_*` to `*_` to address this. | |
Fixes #2435",Antonio Sánchez,2022-02-07T18:17:43.329Z,NA,NA | |
842 (https://gitlab.com/libeigen/eigen/-/merge_requests/842),Typo in COD's doc: matrixR() -> matrixT(),"### What does this implement/fix? | |
Small typo in method documentation of matrixT() | |
### Additional information | |
I think this is just a typo, or am I utterly confused? There's no matrixR() method on CompleteOrthogonalDecomposition, or should the user access m_cpqr.matrixR()?",Björn Dahlgren,2022-02-07T18:49:16.388Z,NA,NA | |
845 (https://gitlab.com/libeigen/eigen/-/merge_requests/845),Provide a definition for numeric_limits static data members,"Authored by [email protected] | |
Provide a definition for numeric_limits static data members | |
Eigen provides specializations which have static data member initializers. However, const non-inline static data members which are ODR used must have a definition at namespace scope. | |
We cannot use C++17 inline variables to solve this so we must come up with another, wordier, solution. | |
Move the implementation into a class template. | |
Provide a definition for the data members. Because they are templated, we are safe from ODR violations. | |
Make the std::numeric_limits specializations inherit from the helper class template. | |
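The three steps above can be sketched as follows. The names `Half16` and `half_limits_impl` are hypothetical, and only two of the many `numeric_limits` members are shown:

```cpp
#include <limits>

namespace sketch {
// Hypothetical user-defined scalar type standing in for an Eigen type.
struct Half16 { unsigned short bits; };

// Step 1: move the implementation into a class template. The template
// parameter is a dummy that only exists to make the members templated.
template <typename Dummy = void>
struct half_limits_impl {
  static const bool is_specialized = true;
  static const int digits = 11;
};

// Step 2: provide namespace-scope definitions for the data members.
// Because they belong to a template, defining them in a header is allowed.
template <typename Dummy>
const bool half_limits_impl<Dummy>::is_specialized;
template <typename Dummy>
const int half_limits_impl<Dummy>::digits;
}  // namespace sketch

// Step 3: the std::numeric_limits specialization inherits the members.
namespace std {
template <>
struct numeric_limits<sketch::Half16> : sketch::half_limits_impl<> {};
}  // namespace std
```

Binding a reference such as `const int& d = std::numeric_limits<sketch::Half16>::digits;` ODR-uses the member, which is the case that needs the out-of-class definition.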
Note that the class template’s template parameter is not meaningful, we only need it because we want to be able to have the static data member emitted into a COMDAT linker section.",Rasmus Munk Larsen,2022-02-08T20:34:53.804Z,NA,NA | |
846 (https://gitlab.com/libeigen/eigen/-/merge_requests/846),Return alphas() and betas() by const reference,"### Reference issue | |
#2436 | |
### What does this implement/fix? | |
Returns a reference instead of copying vectors, to match the docs and reduce allocations.",Matt Keeter,2022-02-08T23:16:10.921Z,NA,NA | |
847 (https://gitlab.com/libeigen/eigen/-/merge_requests/847),"Cleanup compiler warnings, etc from recent changes in GEMM & GEMV for PowerPC","Cleanup compiler warnings, etc from recent changes in GEMM & GEMV for PowerPC.",Chip Kerchner,2022-02-09T18:47:09.227Z,NA,NA | |
849 (https://gitlab.com/libeigen/eigen/-/merge_requests/849),Complete doc with MatrixXNt and MatrixNXt,"The doc in TutorialMatrixClass.dox was missing two matrix patterns `MatrixXNt` and `MatrixNXt`. | |
Also adds some missing namespaces to TutorialLinAlgSVDSolve.cpp, whose absence blocked doc compilation. | |
Partially fix https://gitlab.com/libeigen/eigen/-/issues/2437",Florian Maurin,2022-02-11T21:55:54.927Z,NA,NA | |
852 (https://gitlab.com/libeigen/eigen/-/merge_requests/852),Add convenience method `constexpr std::size_t size() const` to `Eigen::IndexList`,NA,Rasmus Munk Larsen,2022-02-12T04:23:03.915Z,NA,NA | |
853 (https://gitlab.com/libeigen/eigen/-/merge_requests/853),Fix ODR failures in TensorRandom.,NA,Antonio Sánchez,2022-02-12T15:15:11.523Z,NA,NA | |
855 (https://gitlab.com/libeigen/eigen/-/merge_requests/855),remove unused macros,"with the generic implementation of `prsqrt`, these macros are not used anywhere anymore.",Erik Schultheis,2022-02-14T10:34:26.315Z,NA,NA | |
859 (https://gitlab.com/libeigen/eigen/-/merge_requests/859),Fix MSVC+NVCC 9.2 pragma error.,"For NVCC 9.2, we require MSVC 14.16 or earlier. Unfortunately, this does not seem to properly support `_Pragma`, despite claims from the [documentation](https://docs.microsoft.com/en-us/cpp/preprocessor/pragma-directives-and-the-pragma-keyword?view=msvc-140) that it does (or perhaps it's the interaction between MSVC and NVCC?). This does work with the microsoft-specific extension `__pragma` though.",Antonio Sánchez,2022-02-15T19:14:55.239Z,NA,NA | |
858 (https://gitlab.com/libeigen/eigen/-/merge_requests/858),Fix sqrt/rsqrt for NEON.,"Newly added tests must check `HasSqrt`/`HasRsqrt` via `CHECK_CWISE1_IF` | |
to avoid compile errors. | |
Replaced NEON hand-written versions with new generic version that | |
correctly handles 0 and `inf`.",Antonio Sánchez,2022-02-15T21:31:52.203Z,NA,NA | |
850 (https://gitlab.com/libeigen/eigen/-/merge_requests/850),Add descriptions to Matrix typedefs.,"Otherwise, they don't show up in doxygen documentation. Fixes #2437.",Antonio Sánchez,2022-02-15T21:53:28.275Z,NA,NA | |
857 (https://gitlab.com/libeigen/eigen/-/merge_requests/857),"Re-add `svd::compute(Matrix, options)` method to avoid breaking external projects.","Re-add `svd::compute(Matrix, options)` method to avoid breaking external projects. | |
There are too many other projects (open-source, and at Google) that rely | |
on the existing mechanism. For the open-source projects, we at least need a | |
version number increase to check for which API to use. We also likely | |
need some transition time to allow projects to adapt. | |
Adding the method back in seemed trivial. We may want to reconsider whether | |
we truly want to deprecate this behavior. If we do, we can probably | |
remove it after the next major release. | |
/cc @rmlarsen1 @arthurfeeney",Antonio Sánchez,2022-02-16T00:54:03.048Z,NA,NA | |
861 (https://gitlab.com/libeigen/eigen/-/merge_requests/861),"Make FixedInt constexpr, fix ODR of fix<N>","Related to #2392, the use of a static variable in a header file | |
leads to potential ODR violations. Removing `static` should resolve | |
this (without `inline`) since it is a variable template. Also made | |
this `constexpr`, which required making the `FixedInt` class | |
`constexpr`-compatible.",Antonio Sánchez,2022-02-16T17:47:52.616Z,NA,NA | |
862 (https://gitlab.com/libeigen/eigen/-/merge_requests/862),Use fixed-sized U/V for fixed-sized inputs.,"The change in !826 was breaking (it made U/V dynamic by default). | |
Since we don't allow thin U/V for fixed-sized matrices with default | |
options anyway, there is no need for dynamic sizes. | |
This restores the original U/V sizes.",Antonio Sánchez,2022-02-16T18:31:48.531Z,NA,NA | |
866 (https://gitlab.com/libeigen/eigen/-/merge_requests/866),Fix for crash bug in SPQRSupport: Initialize pointers to nullptr to avoid free() calls of invalid pointers.,"### Reference issue | |
No issue filed, should I add one? | |
### What does this implement/fix? | |
If SuiteSparseQR fails, `m_isInitialized` is set to false, but some pointers stay uninitialized. Then in the destructor, they are still free'd, causing a crash. | |
Initializing the pointers to nullptr fixes the issue, as both `std::free` and `cholmod_l_free_*` are no-op for null pointers. | |
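A minimal sketch of the same pattern, using a hypothetical `Workspace` struct rather than the actual SPQR wrapper:

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical resource holder illustrating the fix: the pointer starts out
// as nullptr, so the destructor is safe even if initialization never ran or
// failed part-way through.
struct Workspace {
  double* data = nullptr;
  bool initialized = false;

  Workspace() = default;
  Workspace(const Workspace&) = delete;
  Workspace& operator=(const Workspace&) = delete;

  // `simulate_failure` stands in for a failing factorization call.
  bool init(std::size_t n, bool simulate_failure) {
    if (simulate_failure) return false;  // data stays nullptr
    data = static_cast<double*>(std::malloc(n * sizeof(double)));
    initialized = (data != nullptr);
    return initialized;
  }

  // std::free(nullptr) is a no-op, so no flag check is needed here.
  ~Workspace() { std::free(data); }
};
```

Without the `= nullptr` initializer, the destructor would pass an indeterminate pointer to `std::free` whenever `init` fails.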
### Additional information | |
N/A",Martin Heistermann,2022-02-18T16:13:29.299Z,NA,NA | |
869 (https://gitlab.com/libeigen/eigen/-/merge_requests/869),[SYCL] Fix CMake for SYCL support,"### What does this implement/fix? | |
These fixes are needed to compile and run Eigen tests with SYCL enabled. This only needs a simple CMake command `cmake -Bbuild -DEIGEN_TEST_SYCL=ON`. The SYCL tests can be run with `./check.sh sycl` from the build folder. | |
* There is no need to force a C++ version when compiling Eigen with SYCL | |
anymore. Forcing C++11 would disable some Eigen features. | |
* Remove a CMake workaround that is not needed anymore since the | |
compiler definitions are now defined in the `COMPILE_DEFINITIONS` property instead of | |
`COMPILE_FLAGS`. | |
* Sigmoid tests are temporarily disabled. SYCL needs to support more | |
functions to handle the edge cases of Sigmoid.",Romain Biessy,2022-02-22T16:53:28.242Z,NA,NA | |
870 (https://gitlab.com/libeigen/eigen/-/merge_requests/870),Fix test macro conflicts with STL headers in C++20,"Fix test build with GCC 9, 10, 11 in C++20.",Lingzhu Xiang,2022-02-23T05:14:33.171Z,NA,NA | |
865 (https://gitlab.com/libeigen/eigen/-/merge_requests/865),Add assert for edge case if Thin U Requested at runtime,"### Reference issue | |
Discussion in !862 | |
(Ignore the stuff about changing workspace sizes. That's not necessary since this case doesn't work anyway.) | |
### What does this implement/fix? | |
There's an old issue when requesting thin unitaries at runtime: | |
``` | |
#include <Eigen/Dense> | |
int main() { | |
using namespace Eigen; | |
using MT = Matrix<double, 5, Dynamic>; | |
MT m = MT::Random(5, 4); // fixed rows > dynamic cols | |
// fails to resize stuff because U is 5x5 at compile-time, but needs to be 5x4. | |
BDCSVD<MT> svd1(m, ComputeThinU | ComputeThinV); | |
} | |
``` | |
I don't think this can be ""fixed,"" so this adds an assert for this case and updates tests to look for the assert. | |
This shouldn't change any behavior, basically just a fix for the tests and to improve the failure message.",Arthur,2022-02-23T05:35:20.399Z,NA,NA | |
863 (https://gitlab.com/libeigen/eigen/-/merge_requests/863),Modify test expression to avoid numerical differences (#2402).,"It looks like when comparing slice to block evaluation and aggressive | |
optimizations (`-O3`), we can sometimes get slightly different numerical | |
results due to operation fusing (e.g. fma). | |
Here we simply modify the reference expression to avoid this. | |
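To illustrate why fusing matters (this is a standalone sketch, not the test from this MR), the two hypothetical helpers below compute the same expression with and without an intermediate rounding step:

```cpp
#include <cmath>

// Multiply, round the product to double, then add. The volatile store keeps
// the compiler from contracting the expression into a fused multiply-add.
inline double mul_add_unfused(double a, double b, double c) {
  volatile double product = a * b;
  return product + c;
}

// Fused multiply-add: a * b + c is rounded only once, at the end.
inline double mul_add_fused(double a, double b, double c) {
  return std::fma(a, b, c);
}
```

For `a = 1 + 2^-27`, `b = 1 - 2^-27`, `c = -1` in IEEE-754 double precision, the exact product is `1 - 2^-54`, which rounds to 1, so the unfused version returns 0 while the fused version returns `-2^-54`.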
Fixes #2402.",Antonio Sánchez,2022-02-23T16:37:04.053Z,NA,NA | |
868 (https://gitlab.com/libeigen/eigen/-/merge_requests/868),Changes to fast SQRT/RSQRT,"1. x86 processors from Skylake and Zen2 onwards have significantly higher throughput square root units. Therefore, as determined by our benchmarking, it is counter-productive to use Newton-Raphson iteration for SQRT if only SSE or AVX is available and proper handling of corner cases is required. Therefore, this change removes the corresponding specializations of internal::psqrt. Newton-Raphson is still a win for AVX512 for SQRT and for SSE/AVX/AVX512 for RSQRT. | |
2. Add a function for testing packet math functions on IEEE special values {+,-} x {denorm_min, min, 0, inf, NaN}, and fix the generic SQRT/RSQRT implementations to pass this test. If EIGEN_FAST_MATH is 1 we relax the test for subnormal inputs by allowing the function to return the same result as the reference with the inputs flushed to zero with the same sign.",Rasmus Munk Larsen,2022-02-23T17:32:23.229Z,NA,NA | |
873 (https://gitlab.com/libeigen/eigen/-/merge_requests/873),Disable deprecated warnings in SVD tests.,"Our tests currently purposely check assertions and consistent behavior | |
of legacy SVD methods that accept runtime options. Now that we have | |
marked them deprecated, GCC/clang spew out many | |
`-Wdeprecated-declarations` warnings. Here we disable them for our tests | |
to remove the crud from the build logs.",Antonio Sánchez,2022-02-23T18:32:01.085Z,NA,NA | |
874 (https://gitlab.com/libeigen/eigen/-/merge_requests/874),Fix gcc-5 packetmath_12 bug.,"There seems to be a gcc-5 bug that's causing `data1` within | |
`packetmath_minus_zero_add()` to be filled with | |
``` | |
(-0, 0), (-0, NaN) | |
``` | |
when optimizations `-O2` or higher are on. This is before any packet | |
operations are called. Printing the values causes the bug to disappear. I've | |
double-checked we're not running into any aliasing bugs (the cast from | |
`std::complex<double>*` to `double*` is legal, and using c++ casts | |
doesn't fix the issue). Compiling with `-fno-strict-aliasing` does *not* | |
solve the issue, so seems to be related to something else. The test works | |
with gcc-6 and later, and all other compilers/versions. | |
Initializing the memory to zeroes causes the bug to disappear.",Antonio Sánchez,2022-02-23T21:56:26.227Z,NA,NA | |
875 (https://gitlab.com/libeigen/eigen/-/merge_requests/875),Fix packetmath compilation error.,"Unfortunately we can't pass a pointer to `psqrt<Packet>` as a functor, since such a | |
function might not exist (e.g. if `HasSqrt` is `false`). The only way to pass an overloaded function | |
as a functor is to wrap it in a struct. Here we create a simple wrapper | |
around `psqrt`, `prsqrt`.",Antonio Sánchez,2022-02-23T23:27:08.997Z,NA,NA | |
877 (https://gitlab.com/libeigen/eigen/-/merge_requests/877),Disable deprecated warnings for SVD tests on MSVC.,"We are currently purposely testing the deprecated behavior, and the | |
warnings are cluttering up the build log.",Antonio Sánchez,2022-02-24T21:20:50.841Z,NA,NA | |
878 (https://gitlab.com/libeigen/eigen/-/merge_requests/878),Fix frexp packetmath tests for MSVC.,"Silly MSVC, `frexp` sets the exponent to 1 for non-finite inputs when the docs say the output | |
is ""unspecified"". Our tests assume it remains zero, so we need to reset it for our reference value.",Antonio Sánchez,2022-02-24T22:16:38.443Z,NA,NA | |
876 (https://gitlab.com/libeigen/eigen/-/merge_requests/876),Fix mixingtypes for g++-11.,"The `_mm512_broadcast_f64x2` instruction is no faster than the | |
`_mm512_broadcast_f32x4` instruction, and is warned about in [intel's | |
documentation](https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html#text=_mm512_broadcast_f64x2&ig_expand=554). There's also a funny interaction within our GEBP kernels with g++-11 that | |
causes the real component to be duplicated to the imaginary component | |
when optimizations are turned on (-O1 or higher). Removing this | |
instruction fixes the issue. | |
This fixes both `mixingtypes_6` and `mixingtypes_7` tests for AVX512.",Antonio Sánchez,2022-02-25T19:28:11.129Z,NA,NA | |
880 (https://gitlab.com/libeigen/eigen/-/merge_requests/880),Fix SVD for MSVC.,"There's an odd bug for MSVC where the `Options` template parameter seems | |
to be forgotten (is always treated as zero) unless we store it within | |
the class as an `enum` or `static constexpr` member. This was causing | |
the `ShouldComputeThinU`, ... members to all always evaluate to `false`, | |
breaking all the fixed-sized SVDs. This was happening with all versions | |
of msvc tested. Storing it and using the stored version fixes this. | |
Also fixed some warnings on MSVC, and temporarily disabled the | |
`EIGEN_DEPRECATED` attribute on one constructor until I get a chance to | |
fix another library that uses it and has `-Werror=deprecated-declarations` | |
enabled by default.",Antonio Sánchez,2022-02-28T19:53:15.957Z,NA,NA | |
879 (https://gitlab.com/libeigen/eigen/-/merge_requests/879),Fix any/all reduction in the case of row-major layout,Fix any/all reduction inefficiency in the case of row-major layout.,Yury Gitman,2022-03-01T05:27:51.110Z,NA,NA | |
882 (https://gitlab.com/libeigen/eigen/-/merge_requests/882),Fix SVD for MSVC+CUDA.,"CUDA gets confused about the definition of `Index`, and can't match the | |
out-of-line definitions of a few functions with their declarations. | |
If we just import the `SVDBase::Index` definition, then gcc/clang get confused | |
about the out-of-line definition. To fix all, we import | |
`SVDBase::Index`, and modify all definitions to use the internal `Index` | |
type. | |
Also addressed annoying warnings about not returning anything at the end | |
of a non-void function.",Antonio Sánchez,2022-03-01T21:35:23.386Z,NA,NA | |
883 (https://gitlab.com/libeigen/eigen/-/merge_requests/883),Adjust tolerance of matrix_power test for MSVC.,Was failing pretty consistently for MSVC 19.16.,Antonio Sánchez,2022-03-01T23:34:00.195Z,NA,NA | |
872 (https://gitlab.com/libeigen/eigen/-/merge_requests/872),Modified sqrt/rsqrt for denormal handling.,"This updates the new generic sqrt/rsqrt implementation after !868 | |
to account for the following: | |
- Better handling of `std::numeric_limits<T>::denorm_min()` (the | |
original incorrectly returns `NaN` for AVX512) | |
- Better handling of denormals in general (will often give correct | |
answers rather than flushing to 0/`inf`) | |
- Faster `sqrt` and `rsqrt` for AVX512 (but slightly slower rsqrt for | |
SSE, AVX had no change) | |
Google benchmark numbers (only significant changes shown): | |
``` | |
Comparing ./sqrt_old_sse4.2 to ./sqrt_new_sse4.2 | |
Benchmark Time CPU Time Old Time New CPU Old CPU New | |
---------------------------------------------------------------------------------------------------------------------- | |
BM_Rsqrt<float>/8/1 +0.1165 +0.1165 5 5 5 5 | |
BM_Rsqrt<float>/64/1 +0.1355 +0.1355 25 28 25 28 | |
BM_Rsqrt<float>/512/1 +0.1340 +0.1340 195 221 195 221 | |
BM_Rsqrt<float>/2048/1 +0.0715 +0.0714 1016 1089 1016 1089 | |
Comparing ./sqrt_old_avx512dq to ./sqrt_new_avx512dq | |
Benchmark Time CPU Time Old Time New CPU Old CPU New | |
---------------------------------------------------------------------------------------------------------------------- | |
BM_Sqrt<float>/8/1 -0.0226 -0.0226 9 8 9 8 | |
BM_Sqrt<float>/64/1 -0.3050 -0.3050 14 9 14 9 | |
BM_Sqrt<float>/512/1 -0.3282 -0.3282 104 70 104 70 | |
BM_Sqrt<float>/2048/1 -0.2790 -0.2790 469 338 469 338 | |
BM_Sqrt<double>/8/1 -0.1990 -0.1990 5 4 5 4 | |
BM_Sqrt<double>/64/1 -0.2366 -0.2366 34 26 34 26 | |
BM_Sqrt<double>/512/1 -0.2236 -0.2236 313 243 313 243 | |
BM_Sqrt<double>/2048/1 -0.2237 -0.2237 1287 999 1287 999 | |
BM_Rsqrt<float>/8/1 +0.0166 +0.0165 5 5 5 5 | |
BM_Rsqrt<float>/64/1 -0.0715 -0.0715 11 10 11 10 | |
BM_Rsqrt<float>/512/1 -0.1097 -0.1097 82 73 82 73 | |
BM_Rsqrt<float>/2048/1 -0.1323 -0.1323 387 335 387 335 | |
BM_Rsqrt<double>/8/1 -0.0874 -0.0874 5 5 5 5 | |
BM_Rsqrt<double>/64/1 -0.1198 -0.1198 31 27 31 27 | |
BM_Rsqrt<double>/512/1 -0.1499 -0.1499 287 244 287 244 | |
BM_Rsqrt<double>/2048/1 -0.1728 -0.1727 1181 977 1181 977 | |
OVERALL_GEOMEAN -0.1616 -0.1616 0 0 0 0 | |
```",Antonio Sánchez,2022-03-02T17:20:49.074Z,NA,NA | |
884 (https://gitlab.com/libeigen/eigen/-/merge_requests/884),Remove poor non-convergence checks in NonLinearOptimization.,"Both the `levenberg_marquardt` and `NonLinearOptimization` tests are | |
essentially the same. For some reason, they purposely check for a | |
specific status that indicates non-convergence, even though there are | |
many closely related statuses based on tolerances. They also check | |
for exact (or near exact) number of iterations. This makes no sense | |
in general when dealing with multiple architectures, nor when | |
considering things like FMA and optimization levels (e.g. `-O3`) can | |
slightly change numerical results. | |
Removed a bunch of poor checks to allow these tests to pass. | |
Note: the NonLinearOptimization subpackage seems to be an ""updated"" | |
version of LevenbergMarquardt. They have a bunch of colliding symbols | |
and the test is nearly identical. We may consider removing one (or | |
both) of these.",Antonio Sánchez,2022-03-02T19:31:20.798Z,NA,NA | |
886 (https://gitlab.com/libeigen/eigen/-/merge_requests/886),Skip denormal test if `Cond` is false.,"Minor bug, the test should be skipped if the packet operation doesn't | |
exist.",Antonio Sánchez,2022-03-03T04:32:13.780Z,NA,NA | |
885 (https://gitlab.com/libeigen/eigen/-/merge_requests/885),Fix enum conversion warnings in BooleanRedux.,NA,Antonio Sánchez,2022-03-03T05:04:27.500Z,NA,NA | |
888 (https://gitlab.com/libeigen/eigen/-/merge_requests/888),Speed lscg by using .noalias,"Hi, Eigen developers | |
This MR aims to speed up `least_square_conjugate_gradient()` by adding `.noalias()`.",Zhuo Zhang,2022-03-03T18:42:19.213Z,NA,NA | |
851 (https://gitlab.com/libeigen/eigen/-/merge_requests/851),Fix JacobiSVD_LAPACKE bindings,"### What does this implement/fix? | |
JacobiSVD_LAPACKE.h doesn't reflect current SVD module for runtime options. :upside_down: This just corrects that. | |
Tested these changes with apple Accelerate.",Arthur,2022-03-03T19:24:08.236Z,NA,NA | |
887 (https://gitlab.com/libeigen/eigen/-/merge_requests/887),Update vectorization_logic tests for all platforms.,"A few of the tests depend on packet (and half packet) availability, | |
implementation of specific packet functions (e.g. add, div), and | |
unrolling limits and costs, which can vary by platform. Here we add | |
the ability to ignore the traversal or unrolling when | |
it is not informative. | |
Also, with some half-packets, if the size of the packet is smaller | |
than 16 bytes (e.g. `Packet2f`), then matrices of certain sizes | |
no longer vectorize with `EIGEN_UNALIGNED_VECTORIZE=0`, since they | |
are assumed unaligned. Adjusted matrix sizes in such cases to | |
get the tests to pass. | |
This *should* help tests pass on all platforms. Tested with | |
SSE, AVX, AVX512, NEON, AltiVec.",Antonio Sánchez,2022-03-03T19:54:16.335Z,NA,NA | |
864 (https://gitlab.com/libeigen/eigen/-/merge_requests/864),Removed EIGEN_UNUSED decorations from many functions that are in fact used,"### Reference issue | |
### What does this implement/fix? | |
### Additional information | |
<!--Any additional information you think is important.-->",Sean McBride,2022-03-03T20:19:33.917Z,NA,NA | |
890 (https://gitlab.com/libeigen/eigen/-/merge_requests/890),Remove duplicate IsRowMajor declaration.,"It already exists in DenseBase, and this shadow copy is generating warnings.",Antonio Sánchez,2022-03-04T21:22:03.523Z,NA,NA | |
891 (https://gitlab.com/libeigen/eigen/-/merge_requests/891),Split and reduce SVD test sizes.,"The original tests consume A LOT of memory - causing MSVC to | |
constantly run out of heap space, gcc to sometimes crash without | |
warning, and our CI machines to start using swap space, drastically | |
slowing down compile times. | |
Here we split up and re-number a bunch of the tests. Also | |
reduced some fixed-size matrix sizes - the new stack allocation for U/V | |
gets pretty big, which also seems to be drastically slowing down | |
compile times.",Antonio Sánchez,2022-03-05T00:15:28.613Z,NA,NA | |
893 (https://gitlab.com/libeigen/eigen/-/merge_requests/893),Adds new CMake Options for controlling build components.,"Adds build options `EIGEN_BUILD_BLAS`, `EIGEN_BUILD_LAPACK`, and `EIGEN_BUILD_CMAKE_PACKAGE`. Resurrected from !512.",Antonio Sánchez,2022-03-05T05:49:46.376Z,NA,NA | |
894 (https://gitlab.com/libeigen/eigen/-/merge_requests/894),"Fix broken tensor executor test, allow tensor packets of size 1.","The cxx11_tensor_executor test assumed vectorization was always possible for the given types - though they may not be depending on the platform. Added a check via `packet_traits<T>`. | |
Also modified tensor ops to actually allow `PacketSize == 1`, | |
such as for `Packet1cd`. The README previously said complex | |
was known to be broken, but we've since added some fixes for that.",Antonio Sánchez,2022-03-07T20:30:39.287Z,NA,NA | |
856 (https://gitlab.com/libeigen/eigen/-/merge_requests/856),Add support for Apple's Accelerate sparse matrix solvers,"### What does this implement/fix? | |
This merge request implements support for the sparse matrix solvers found in Apple's Accelerate framework. Wrappers around the following solvers are provided: | |
* AccelerateLLT: Cholesky (LL^T) factorization | |
* AccelerateLDLT: Default LDL^T factorization (Currently an alias to AccelerateLDLTTPP) | |
* AccelerateLDLTUnpivoted: Cholesky-like LDL^T with only 1x1 pivots and no pivoting | |
* AccelerateLDLTSBK: LDL^T with Supernode Bunch-Kaufman and static pivoting | |
* AccelerateLDLTTPP: LDL^T with full threshold partial pivoting | |
* AccelerateQR: QR factorization | |
* AccelerateCholeskyAtA: QR factorization without storing Q (equivalent to A^TA = R^T R) | |
### Performance | |
Notes: | |
* Default solver settings are used in each test. | |
* Eigen QR did not finish in a reasonable amount of time. | |
* Tests run on an Apple M1 Devkit (4e4p cores) | |
Solving system of size: 1946848x1946848 | |
| Solver | Analysis (ms) | Factorization (ms) | Solve (ms) | Total (ms) | | |
|-------------------------|----------|---------------|-------|-------| | |
| AccelerateLDLT | 722 | 9861 | 169 | 10753 | | |
| AccelerateLLT | 744 | 774 | 242 | 1760 | | |
| Eigen LDLT | 1376 | 45685 | 343 | 47405 | | |
| Eigen LLT | 1366 | 45716 | 346 | 47429 | | |
| Cholmod Simplicial LDLT | 12134 | 10171 | 79 | 22385 | | |
| Cholmod Supernodal LLT | 13138 | 864 | 110 | 14114 | | |
Solving system of size: 57504x57504: | |
| Solver | Analysis (ms) | Factorization (ms) | Solve (ms) | Total (ms) | | |
|--------------|----------|---------------|-------|-------| | |
| AccelerateQR | 221 | 2939 | 217 | 3378 | | |
| SPQR | 69 | | 961 | 4371 | | |
| Eigen QR | 64 | > 60000 | | | | |
 | |
### Additional information | |
In order for Accelerate to work, you need to specify which triangle you wish to use (for LDLT and LLT) and the type of matrix that you are factorizing, e.g. symmetric, ordinary, or triangular. I currently determine this by exploiting the `UpLo` template argument. | |
The question I have is: what would be a sensible `UpLo` default? If you try to run Accelerate's LDLT solver and `m_sparseKind` is not `SparseSymmetric`, it will assert and crash the program. With the way that I'm currently handling it, you would therefore need to write: `AccelerateLDLT<SparseMatrix<double>, Symmetric | Upper> LDLT;` This doesn't exactly follow what the other solvers do, however. We could always assume that whatever you're passing into the solver is positive definite, which is what the PARDISO wrapper seems to do. This would work for LDLT and LLT, but not necessarily for QR. | |
QR is a bit tricky, as your matrix may be neither upper nor lower triangular, and may also not be symmetric, i.e. it's ""ordinary"" as Apple calls it. However, `UpLoType` does not have an appropriate entry for this case, so you must pass 0 in order to set `m_sparseKind` to `SparseOrdinary`, e.g.: `AccelerateQR<SparseMatrix<double>, 0> QR;` You can see an example of this in the test program.",John Mather,2022-03-08T00:09:19.119Z,NA,NA | |
897 (https://gitlab.com/libeigen/eigen/-/merge_requests/897),Remove copy_bool workaround for gcc 4.3,"### Reference issue | |
### What does this implement/fix? | |
This is a split-off from MR !881. | |
The testsuite uses a workaround for gcc 4.3, a noughties-era compiler which implements something called C++0x as its highest version of C++. In other words, this compiler can no longer be used to compile Eigen. Torn between turning this workaround into a `constexpr` and outright removing it, I opted for the latter. | |
### Additional information | |
I ran the first few hundred testsuite checks on my PC and found no regressions.",Tobias Schlüter,2022-03-08T17:43:12.165Z,NA,NA | |
895 (https://gitlab.com/libeigen/eigen/-/merge_requests/895),make SparseSolverBase and IterativeSolverBase move constructable,"Hello everyone, | |
Thanks for this powerful library. | |
I witnessed an issue in my code where I wanted to move iterative solvers around. | |
Since they derive from SparseSolverBase they are not copyable. | |
This is fine, but I'm interested in the reason why this is the case. | |
Nevertheless, I propose the change to include move constructors, if C++11 is supported. | |
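The proposed pattern can be sketched without Eigen. The following is a minimal illustration, assuming a hypothetical `SolverBase`/`ToySolver` pair (these are not Eigen's classes): copying stays deleted, while move construction and move assignment are defaulted.

```cpp
#include <cstddef>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical stand-in for a solver base class; not Eigen's SparseSolverBase.
class SolverBase {
 public:
  SolverBase() = default;
  // Copying stays disabled, as described above.
  SolverBase(const SolverBase&) = delete;
  SolverBase& operator=(const SolverBase&) = delete;
  // Moving is explicitly allowed.
  SolverBase(SolverBase&&) = default;
  SolverBase& operator=(SolverBase&&) = default;
};

// Hypothetical derived solver that owns some state.
class ToySolver : public SolverBase {
 public:
  explicit ToySolver(std::vector<double> diag) : diag_(std::move(diag)) {}
  std::size_t size() const { return diag_.size(); }

 private:
  std::vector<double> diag_;
};

// Moves a temporary solver into a container and reports the size it carries.
inline std::size_t moved_solver_size() {
  std::vector<ToySolver> solvers;
  solvers.push_back(ToySolver(std::vector<double>{1.0, 2.0, 3.0}));
  return solvers.front().size();
}
```

With the move operations in place, `ToySolver` can be stored in standard containers or returned from factory functions, while accidental copies are still rejected at compile time.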
### What does this implement/fix? | |
This fixes the issue that IterativeSolvers cannot be moved. I.e., it solves the issue outlined in https://godbolt.org/z/rM8M76bqW",Alex_M,2022-03-08T19:41:30.587Z,NA,NA | |
889 (https://gitlab.com/libeigen/eigen/-/merge_requests/889),"Add construct_at, destroy_at wrappers. Use throughout.","### Reference issue | |
### What does this implement/fix? | |
This is the first independent part of my previous merge request 881 https://gitlab.com/libeigen/eigen/-/merge_requests/881 | |
This implements wrappers for C++17 and C++20's `std::destroy_at` and `std::construct_at` and uses them throughout instead of placement new and explicit destructor calls. Since this is independent of the previous merge request, I chose to take a less targeted approach and change all occurrences. | |
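As a rough illustration of what such wrappers look like, here is a minimal sketch. The `sketch` namespace, the `Tracked` type, and `roundtrip` are hypothetical names used only for this example; the actual signatures in the patch may differ.

```cpp
#include <new>
#include <utility>

namespace sketch {

// Hypothetical wrapper in the spirit of std::construct_at; written with
// placement new so that it also compiles before C++20.
template <class T, class... Args>
T* construct_at(T* p, Args&&... args) {
  return ::new (static_cast<void*>(p)) T(std::forward<Args>(args)...);
}

// Hypothetical wrapper in the spirit of std::destroy_at.
template <class T>
void destroy_at(T* p) {
  p->~T();
}

}  // namespace sketch

// Small type that counts how many instances are currently alive.
struct Tracked {
  static int& alive() {
    static int n = 0;
    return n;
  }
  int value;
  explicit Tracked(int v) : value(v) { ++alive(); }
  ~Tracked() { --alive(); }
};

// Constructs and destroys a Tracked in raw storage; returns the value it held.
inline int roundtrip(int v) {
  alignas(Tracked) unsigned char storage[sizeof(Tracked)];
  Tracked* p = sketch::construct_at(reinterpret_cast<Tracked*>(storage), v);
  const int seen = p->value;
  sketch::destroy_at(p);
  return seen;
}
```

Compared with a bare placement new and an explicit destructor call, the named functions make the intent visible at the call site and give one place to adapt when the standard versions become available.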
### Additional information | |
Except for a trap in JacobiSVD (where the order of arguments is reversed between subsequent invocations, so simple copy and paste would be wrong) and in IterativeLinearSolvers/IterativeSolverBase.h (where type names are used inconsistently before this patch so the change may look wrong), there are no surprises here.",Tobias Schlüter,2022-03-08T20:43:23.217Z,NA,NA | |
898 (https://gitlab.com/libeigen/eigen/-/merge_requests/898),Fix edge-case in zeta for large inputs.,"Reported by MLIR folks, large inputs are triggering NaNs in TF/Eigen. | |
NaNs are being triggered in the tail sum correction term due to an overflow | |
in the `a` parameter. Returning zero for such large inputs (e.g. 2000, 2000) | |
is consistent with scipy.",Antonio Sánchez,2022-03-08T21:21:20.766Z,NA,NA | |
896 (https://gitlab.com/libeigen/eigen/-/merge_requests/896),Remove ComputeCpp-specific code from SYCL Vptr,"### Reference issue | |
### What does this implement/fix? | |
The virtual pointer code used to rely on ComputeCpp-specific details, but can now be implemented using the reinterpret functionality of the SYCL buffer class. | |
### Additional information | |
",Duncan McBain,2022-03-08T22:44:18.751Z,NA,NA | |
901 (https://gitlab.com/libeigen/eigen/-/merge_requests/901),Fix construct_at compilation breakage on ROCm.,"Enable the recently added construct_at and destroy_at functions to build on HIP. | |
/cc @cantonios",Rohit Santhanam,2022-03-09T17:48:18.620Z,NA,NA | |
902 (https://gitlab.com/libeigen/eigen/-/merge_requests/902),Temporarily disable aarch64 CI.,Disable Arm CI as WoA machines are temporarily down.,Everton Constantino,2022-03-10T14:46:41.657Z,NA,NA | |
900 (https://gitlab.com/libeigen/eigen/-/merge_requests/900),Fix swap test for size 1 inputs.,"The swap of a matrix and its first row actually passes in this case, | |
causing the assertion test to fail sporadically.",Antonio Sánchez,2022-03-10T15:05:59.488Z,NA,NA | |
903 (https://gitlab.com/libeigen/eigen/-/merge_requests/903),"Convert bit calculation to constexpr, avoid casts.","### Reference issue | |
### What does this implement/fix? | |
This is a simple patch, which changes `enum`s to `static constexpr int` in evaluation of floating point bit sizes. The advantage of this is that a few casts can be removed. `static constexpr int` is quite a mouthful, so I chose to not repeat it for every single variable. | |
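The effect on casts can be seen in a small sketch. The trait names below (`BitsViaEnum`, `BitsViaConstexpr`) and the helper `smaller` are hypothetical and only stand in for the kind of code the patch touches.

```cpp
#include <limits>

// Generic helper whose two arguments must deduce to the same type.
template <class T>
constexpr T smaller(T a, T b) {
  return b < a ? b : a;
}

// Hypothetical trait using an unnamed enum for a compile-time constant.
template <class T>
struct BitsViaEnum {
  enum { MantissaBits = std::numeric_limits<T>::digits - 1 };
};

// The same constant expressed as a static constexpr int.
template <class T>
struct BitsViaConstexpr {
  static constexpr int MantissaBits = std::numeric_limits<T>::digits - 1;
};

// With the enum, the constant has its own enumeration type, so mixing it with
// an int in a deduced context needs a cast.
template <class T>
int clamp_shift_enum(int shift) {
  return smaller(shift, static_cast<int>(BitsViaEnum<T>::MantissaBits));
}

// With static constexpr int, the constant is already an int.
template <class T>
int clamp_shift_constexpr(int shift) {
  return smaller(shift, BitsViaConstexpr<T>::MantissaBits);
}
```

Both functions return the same values; the second version simply no longer needs the `static_cast`.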
### Additional information | |
I was looking more globally at replacing `enum`s in traits with `constexpr`, mainly to avoid casting to int all the time when comparing the dimensions of matrices, but that is a larger project, of which this is a small spin-off. | |
I ran most of the testsuite, in particular the `packetmath` tests with no errors.",Tobias Schlüter,2022-03-14T18:59:35.958Z,NA,NA | |
907 (https://gitlab.com/libeigen/eigen/-/merge_requests/907),Fix up PowerPC MMA flags so it builds by default.,"Introduces the build option | |
``` | |
EIGEN_ALTIVEC_ENABLE_MMA_DYNAMIC_DISPATCH | |
``` | |
If set and MMA builtins are available, will allow the dynamic dispatch | |
path. If building for power10 and mma (`-mmma -mcpu=power10`), switches | |
to always use MMA. | |
~~Removed the `EIGEN_ALTIVEC_DISABLE_MMA` option.~~ edit: put it back in. | |
Fixes #2457, and partly fixes #2324 in that the LTO issue should now | |
be avoided by default.",Antonio Sánchez,2022-03-15T20:22:23.690Z,NA,NA | |
910 (https://gitlab.com/libeigen/eigen/-/merge_requests/910),"Revert ""Fix up PowerPC MMA flags so it builds by default.""",Premature merge.,Rasmus Munk Larsen,2022-03-15T20:51:04.306Z,NA,NA | |
909 (https://gitlab.com/libeigen/eigen/-/merge_requests/909),Remove workarounds for bad GCC-4 warnings,"### Reference issue | |
No reference, just stumbled on these | |
### What does this implement/fix? | |
Minor changes. This removes a few smelly workarounds that were done to avoid bad warnings in old versions of GCC. | |
### Additional information | |
I ran tests with gcc 5.5 and didn't see any of the warnings mentioned. | |
Also tried making these changes from the 3.4 branch, against gcc 4.9.",Arthur,2022-03-16T00:08:17.649Z,NA,NA | |
829 (https://gitlab.com/libeigen/eigen/-/merge_requests/829),Replace Eigen type metaprogramming with corresponding std types and make use of alias templates,"This MR removes a bunch of metaprogramming facilities in Eigen in favour of their std versions. It also replaces | |
`typename type_trait<X>::type` with `type_trait_t<X>` for these types. (Overall, this reduces the number of such constructs, `typename .*::type`, from about 2000 to 1300). | |
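To illustrate the kind of rewrite involved, here is a minimal sketch using two standard traits. `OldStyle`, `NewStyle`, and `same_result` are hypothetical names for this example only.

```cpp
#include <type_traits>

// Older spelling: every dependent trait needs typename and ::type.
template <class T>
struct OldStyle {
  typedef typename std::remove_const<
      typename std::remove_reference<T>::type>::type type;
};

// C++14 alias templates express the same thing with less noise.
template <class T>
struct NewStyle {
  using type = std::remove_const_t<std::remove_reference_t<T>>;
};

// Checks that both spellings name the same type for a given T.
template <class T>
constexpr bool same_result() {
  return std::is_same<typename OldStyle<T>::type,
                      typename NewStyle<T>::type>::value;
}
```

The two forms are interchangeable, which is why a change of this kind is mechanical but touches many lines.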
I've run the test suite on CPU, but not on other devices. Given the large amount of small changes that look extremely similar, I think it would be good to also run extensive tests on CUDA and SYCL before merging.",Erik Schultheis,2022-03-16T16:43:40.744Z,NA,NA | |
914 (https://gitlab.com/libeigen/eigen/-/merge_requests/914),Disable schur non-convergence test.,"It seems that about half the time, the schur decomposition passes when | |
the maximum number of iterations is set to 1, so checking that it | |
doesn't converge leads to flaky results. Disabling the non-convergence | |
check. | |
Fixes #2458",Antonio Sánchez,2022-03-16T17:33:54.017Z,NA,NA | |
834 (https://gitlab.com/libeigen/eigen/-/merge_requests/834),AVX512 Optimizations for Triangular Solve,"### What does this implement/fix? | |
This MR includes optimized AVX512 kernels to improve fp32/fp64 Triangular Solve performance. These kernels are ""nocopy"" (i.e., matrices are not packed and `GEBP` is not used) and are meant to get better performance for smaller problem sizes (only inner strides of 1 are supported). The existing generic implementation of this functionality was pulled into separate functions `trsmKernelL/trsmKernelR` to allow dropping in the optimized versions when needed in `TriangularSolverMatrix.h`. | |
Changes: | |
- `Eigen/src/Core/products/TriangularSolverMatrix.h`: Replaced the previous inner triangular solve loop with a wrapper to the original implementation (`trsmKernelL`, `trsmKernelR`). | |
- `Eigen/src/Core/arch/AVX512/trsmKernel_impl.hpp`: The optimized kernels are implemented here. Template specializations to fp32/fp64 `trsmKernelL` and `trsmKernelR` are here as well. | |
- The solve kernel, `trisolve`, solves `AX=B` where `A` is `MxM` triangular, and `B` is `MxN`. `A` and `B` can be row/col-major and `A` can be upper/lower triangular. Combinations of these layouts can be used to handle the cases where `A` is on the right. | |
- `gemm_MNK__` is used to update panels of `B` and computes `C -= A*B`. This can be reused for Matrix Multiply optimizations (smaller sizes for certain transpose cases) | |
- Both these kernels use various unrolls which are generated recursively using templates. | |
- For small/medium sizes the solve kernel (**built with clang**) is generally faster and is used directly for the entire problem. TODO: improve heuristics for determining when to use kernels directly (current cutoffs determined from quick benchmarking). | |
- **Note**: we have noticed increases in compile time as a result of these changes. | |
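For reference, the operation these kernels accelerate is a triangular solve. The following is a plain scalar sketch of the left, lower-triangular case with a single right-hand side, assuming column-major storage. The name `solve_lower` is hypothetical, and this is not the AVX512 kernel from the patch.

```cpp
#include <cstddef>
#include <vector>

// Solves L * x = b in place by forward substitution, where L is an m x m
// lower-triangular matrix stored column-major (element (i, j) at L[i + j * m]).
// On return, b holds x. Assumes all diagonal entries are nonzero.
inline void solve_lower(const std::vector<double>& L, std::vector<double>& b,
                        std::size_t m) {
  for (std::size_t j = 0; j < m; ++j) {
    // The j-th unknown is now fully determined.
    b[j] /= L[j + j * m];
    // Eliminate its contribution from the remaining equations.
    for (std::size_t i = j + 1; i < m; ++i) {
      b[i] -= L[i + j * m] * b[j];
    }
  }
}
```

An optimized kernel performs the same arithmetic, but processes several columns of `B` at once in vector registers and unrolls the inner loops.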
### Additional information | |
Here are some performance results of fp32/fp64 triangular solve with the optimized kernels. The charts are for the RUN (right, upper, non-transposed) and LLN (left, lower, non-transposed) trsm cases. The metric is flops/cycle, measured on `Intel(R) Xeon(R) Gold 6336Y` (peak is 64 flops/cycle for fp32, 32 flops/cycle for fp64). Compilers used were `g++`/`clang++` with versions `8.4.1` and `11.0.0` respectively. | |
For the RUN case, the data in the matrices are organized in the optimal way (both A/B are row-major), so this provides the best performance of the 8 cases. For the LLN case, we do intermediate transposes, so performance here is not as great. For large problem sizes, triangular solve performance is entirely dependent on `GEBP` performance. Currently, GNU compilers are generating sub-optimal code for the gemm micro kernel. We are seeing some register spilling not present in clang (this is mentioned in comments in the code). This only impacts performance for smaller sizes; for larger sizes, performance with either compiler was similar. | |
 | |
 | |
 | |
 | |
+@aaraujom",b-shi,2022-03-16T18:04:51.497Z,NA,NA | |
913 (https://gitlab.com/libeigen/eigen/-/merge_requests/913),Fix up PowerPC MMA flags so it builds by default.,"Introduces the build option | |
``` | |
EIGEN_ALTIVEC_ENABLE_MMA_DYNAMIC_DISPATCH | |
``` | |
If set and MMA builtins are available, will allow the dynamic dispatch path. Otherwise, if building for power10 and mma (`-mmma -mcpu=power10`), switches to always use MMA. | |
Fixes #2457, and partly fixes #2324 in that the LTO issue should now be avoided by default.",Antonio Sánchez,2022-03-16T19:16:29.262Z,NA,NA | |
915 (https://gitlab.com/libeigen/eigen/-/merge_requests/915),Fix missing pound,NA,Antonio Sánchez,2022-03-16T19:26:49.477Z,NA,NA | |
917 (https://gitlab.com/libeigen/eigen/-/merge_requests/917),Work around g++-10 docker issue for geo_orthomethods_4.,"There seems to be a weird compiler bug in the g++-10 version of the | |
`ubuntu:20.04` docker image that is optimizing out one of our test vectors | |
in `geo_orthomethods_4` and causing a test failure. The weirder part is that on a machine | |
actually running ubuntu 20.04, with the exact same version of g++-10, | |
the test passes as-is. Reported version on both is | |
``` | |
g++-10 (Ubuntu 10.3.0-1ubuntu1~20.04) 10.3.0 | |
``` | |
To reproduce the original failure in docker, | |
``` | |
docker run -it ubuntu:20.04 | |
export DEBIAN_FRONTEND=noninteractive | |
apt update | |
apt-get install -y --no-install-recommends software-properties-common git | |
add-apt-repository -y ppa:ubuntu-toolchain-r/test | |
apt update | |
apt install g++-10 | |
git clone https://gitlab.com/libeigen/eigen.git | |
g++-10 -Ieigen -DEIGEN_TEST_PART_4=1 -O3 -mfma eigen/test/geo_orthomethods.cpp -o geo_orthomethods_4 | |
./geo_orthomethods_4 | |
``` | |
By storing the cast vector into `v2`, we seem to work around the issue.",Antonio Sánchez,2022-03-16T21:46:05.208Z,NA,NA | |
919 (https://gitlab.com/libeigen/eigen/-/merge_requests/919),Completed a missing parenthesis in tutorial.,Added a missing parenthesis in tutorial code.,Øystein Sørensen,2022-03-17T14:52:08.842Z,NA,NA | |
911 (https://gitlab.com/libeigen/eigen/-/merge_requests/911),Fix RowMajorBit <-> RowMajor mixup.,"### Reference issue | |
### What does this implement/fix? | |
This fixes a bit of code that assumes that `RowMajorBit == RowMajor` and `ColMajor == 0`. Both are true, but both shouldn't be relied on. | |
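The distinction can be sketched with stand-in constants. The definitions below are local to this example and only mirror the values stated above; `storage_order_fragile` and `storage_order` are hypothetical helpers.

```cpp
// Stand-in constants for illustration: storage options and flag bits are
// conceptually separate sets of values, even though they happen to coincide.
enum StorageOptions { ColMajor = 0, RowMajor = 1 };
const unsigned int RowMajorBit = 0x1;

// Fragile: reinterprets a masked flags word directly as a storage option.
// This only works while RowMajorBit == RowMajor and ColMajor == 0.
inline int storage_order_fragile(unsigned int flags) {
  return static_cast<int>(flags & RowMajorBit);
}

// Explicit: tests the bit, then names the option it corresponds to.
inline StorageOptions storage_order(unsigned int flags) {
  return (flags & RowMajorBit) ? RowMajor : ColMajor;
}
```

The explicit version keeps working if either numeric value changes, and it documents which of the two concepts is meant at each point.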
### Additional information | |
",Tobias Schlüter,2022-03-17T15:28:13.578Z,NA,NA | |
922 (https://gitlab.com/libeigen/eigen/-/merge_requests/922),Work around MSVC compiler bug dropping `const`.,"MSVC seems to drop the `const` from the underlying `Const**ReturnType` | |
when trying to match the out-of-line definition of `transpose()` and | |
`diagonal()` to the declaration. When using `is_same` and `is_const` | |
to inspect the types the `const` *is* actually there... it's just | |
ignored when trying to find the corresponding definition. | |
Adding an extra `const` seems to fix this. | |
Fixes #2464",Antonio Sánchez,2022-03-17T20:50:26.983Z,NA,NA | |
916 (https://gitlab.com/libeigen/eigen/-/merge_requests/916),Change EIGEN_ALTIVEC_ENABLE_MMA_DYNAMIC_DISPATCH and EIGEN_ALTIVEC_DISABLE_MMA flags to be like TensorFlow's...,"Change the EIGEN_ALTIVEC_ENABLE_MMA_DYNAMIC_DISPATCH and EIGEN_ALTIVEC_DISABLE_MMA flags to behave like TensorFlow's flags (either 0 or 1, rather than undefined or defined). This will allow TensorFlow to control these flags. | |
Update documentation of new Altivec MMA flags.",Chip Kerchner,2022-03-17T22:35:28.858Z,NA,NA | |
923 (https://gitlab.com/libeigen/eigen/-/merge_requests/923),Fix AVX512 builds with MSVC.,"Fix AVX512 builds with MSVC. | |
MSVC doesn't allow `reinterpret_cast` between types `__m512` and | |
`__m512d` and the like. Requires the use of the explicit cast | |
intrinsic. | |
Also actually add the AVX512 test option for MSVC for our CI... | |
previously was always testing only SSE. | |
Fixes #2466.",Antonio Sánchez,2022-03-18T16:04:53.960Z,NA,NA | |
925 (https://gitlab.com/libeigen/eigen/-/merge_requests/925),Fix ODR violation in trsm.,"Functions need to be marked inline. | |
Fixes #2468.",Antonio Sánchez,2022-03-20T15:56:55.474Z,NA,NA | |
926 (https://gitlab.com/libeigen/eigen/-/merge_requests/926),Fix usages of wrong namespace,"### What does this implement/fix? | |
Fix compilation errors introduced by 421cbf086. | |
Changes were formatted and tested with SYCL tests.",Romain Biessy,2022-03-21T15:07:56.022Z,NA,NA | |
927 (https://gitlab.com/libeigen/eigen/-/merge_requests/927),Update warning suppression to latest.,Fixes #2453.,Antonio Sánchez,2022-03-21T15:56:04.552Z,NA,NA | |
921 (https://gitlab.com/libeigen/eigen/-/merge_requests/921),Optimize visitor traversal in case of RowMajor.,"Non-vectorized paths previously always traversed in col-major order. | |
Here we check for layout and traverse in row-major for RowMajor inputs. | |
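The idea can be sketched with a plain array. This is a simplified illustration, assuming a dense buffer with a known layout; `MaxLoc` and `find_max` are hypothetical names and not part of Eigen's visitor code.

```cpp
#include <cstddef>

// Result of the visit: the largest coefficient and where it was found.
struct MaxLoc {
  double value;
  std::size_t row;
  std::size_t col;
};

// Finds the maximum coefficient of a rows x cols matrix. The outer loop runs
// over rows for row-major data and over columns for column-major data, so the
// inner loop always walks contiguous memory. Assumes rows * cols >= 1.
inline MaxLoc find_max(const double* data, std::size_t rows, std::size_t cols,
                       bool row_major) {
  MaxLoc best{data[0], 0, 0};
  const std::size_t outer = row_major ? rows : cols;
  const std::size_t inner = row_major ? cols : rows;
  for (std::size_t o = 0; o < outer; ++o) {
    for (std::size_t k = 0; k < inner; ++k) {
      const double v = data[o * inner + k];
      if (v > best.value) {
        best.value = v;
        best.row = row_major ? o : k;
        best.col = row_major ? k : o;
      }
    }
  }
  return best;
}
```

Traversing a row-major matrix column by column would instead jump through memory with a stride of `cols`, which is the access pattern this change avoids.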
Fixes #2463.",Antonio Sánchez,2022-03-23T15:27:58.595Z,NA,NA | |
929 (https://gitlab.com/libeigen/eigen/-/merge_requests/929),Split general_matrix_vector_product interface for Power into two macros - one ColMajor and RowMajor.,Fixes TensorFlow compilation issues related to GEMV.,Chip Kerchner,2022-03-23T18:09:34.188Z,NA,NA | |
798 (https://gitlab.com/libeigen/eigen/-/merge_requests/798),Add a NNLS solver to unsupported - issue #655,"### Reference issue | |
#655 | |
### What does this implement/fix? | |
This adds a non-negative least squares (NNLS) solver to unsupported. | |
The algorithm is the standard active-set algorithm from the text ""Solving Least Squares Problems"", Charles L. Lawson and Richard J. Hanson. | |
It's also the same algorithm described on the wikipedia page https://en.wikipedia.org/wiki/Non-negative_least_squares (accessed 2022-01-04). | |
The API is similar to the Sparse Solver concept (i.e. it has `compute`, `solve`, and `info` methods). | |
However, it does not inherit from a Sparse Solver class, because this code applies to dense matrices. | |
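The shape of such an API can be sketched as follows. Everything here is hypothetical: the class `ToyNNLS`, its parameters, and the `SolveStatus` enum are made up for illustration, and the solver uses projected gradient descent as a simpler stand-in, not the active-set algorithm this MR implements.

```cpp
#include <cstddef>
#include <vector>

enum SolveStatus { Converged, NotConverged };

// Hypothetical dense NNLS helper with a compute / solve / info interface.
class ToyNNLS {
 public:
  // Stores a row-major rows x cols matrix A.
  void compute(const std::vector<double>& A, std::size_t rows, std::size_t cols) {
    A_ = A;
    rows_ = rows;
    cols_ = cols;
  }

  // Minimizes ||A x - b||^2 subject to x >= 0 by projected gradient descent.
  // The step must be small enough for the given A (below 2 divided by the
  // largest eigenvalue of A^T A).
  std::vector<double> solve(const std::vector<double>& b, double step,
                            int max_iters, double tol) {
    std::vector<double> x(cols_, 0.0), r(rows_, 0.0), g(cols_, 0.0);
    status_ = NotConverged;
    for (int it = 0; it < max_iters; ++it) {
      // Residual r = A x - b.
      for (std::size_t i = 0; i < rows_; ++i) {
        r[i] = -b[i];
        for (std::size_t j = 0; j < cols_; ++j) r[i] += A_[i * cols_ + j] * x[j];
      }
      // Gradient direction g = A^T r.
      for (std::size_t j = 0; j < cols_; ++j) {
        g[j] = 0.0;
        for (std::size_t i = 0; i < rows_; ++i) g[j] += A_[i * cols_ + j] * r[i];
      }
      // Take a step and project back onto x >= 0; track the largest change.
      double change = 0.0;
      for (std::size_t j = 0; j < cols_; ++j) {
        double next = x[j] - step * g[j];
        if (next < 0.0) next = 0.0;
        const double delta = next > x[j] ? next - x[j] : x[j] - next;
        if (delta > change) change = delta;
        x[j] = next;
      }
      if (change < tol) {
        status_ = Converged;
        break;
      }
    }
    return x;
  }

  // Reports whether the last call to solve() converged.
  SolveStatus info() const { return status_; }

 private:
  std::vector<double> A_;
  std::size_t rows_ = 0, cols_ = 0;
  SolveStatus status_ = NotConverged;
};
```

For the 2x2 identity matrix and `b = (1, -1)`, the unconstrained least-squares solution would be `(1, -1)`, while the non-negative solution clamps the second component to zero. Separating `compute` from `solve` lets one matrix be reused for several right-hand sides.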
### Additional information | |
The code has a bit of history. | |
I believe it was started by Hannes Matuschek (who opened issue #655) in 2014 and that history is visible on his git repo https://github.com/hmatuschek/eigen3-nnls. | |
Based on the contents of that repo, I suspect it began life as an f2c translation from an older FORTRAN program. | |
Between 2018 and 2022, I'm aware of no work on this solver. | |
Then, at the beginning of 2022, I did some significant refactoring and testing to prepare this PR. | |
The hmatuschek/eigen3-nnls version has been in use for many years. | |
The main changes relative to that version are: | |
- lots more tests | |
- refactoring the API to resemble Eigen's other iterative solvers, e.g. the 'compute' and 'solve' and 'info' API. | |
- deleting some non-essential and broken parts",Essex Edwards,2022-03-23T20:20:45.502Z,NA,NA | |
918 (https://gitlab.com/libeigen/eigen/-/merge_requests/918),Add missing explicit reinterprets,"### What does this implement/fix? | |
In https://gitlab.com/libeigen/eigen/-/merge_requests/834, there is a missing explicit reinterpret in `_mm512_shuffle_f32x4`, which causes some build errors when using g++.",b-shi,2022-03-23T21:10:26.752Z,NA,NA | |
930 (https://gitlab.com/libeigen/eigen/-/merge_requests/930),added a missing typename and fixed a unused typedef warning,This adds a missing typename whose absence broke compilation on gcc 9. Also removes an unused typedef to get rid of the associated warning.,Erik Schultheis,2022-03-24T19:48:47.393Z,NA,NA | |
931 (https://gitlab.com/libeigen/eigen/-/merge_requests/931),Enable Aarch64 CI,Aarch64 CI machines are back online so we should reenable those pipelines.,Everton Constantino,2022-03-24T20:10:51.717Z,NA,NA | |
892 (https://gitlab.com/libeigen/eigen/-/merge_requests/892),"Add is_constant_evaluated, update alignment checks","### Reference issue | |
### What does this implement/fix? | |
This is the second of the separate MRs for MR https://gitlab.com/libeigen/eigen/-/merge_requests/881 | |
It adds a wrapper for `std::is_constant_evaluated` which evaluates to false in the case of C++ versions which don't have that function. The alignment check assertions are disabled in constant evaluation using the new wrapper function, and lastly, and in line with the extant comment in the `block_evaluator` constructor, `eigen_assert` there is replaced with `eigen_internal_assert` (this is an unrelated change of course, but it seemed prudent to do it while someone is actually looking at the code).",Tobias Schlüter,2022-03-25T04:00:58.776Z,NA,NA | |
937 (https://gitlab.com/libeigen/eigen/-/merge_requests/937),Eliminate trace unused warning.,NA,Antonio Sánchez,2022-03-29T22:25:20.090Z,NA,NA | |
924 (https://gitlab.com/libeigen/eigen/-/merge_requests/924),Disable f16c scalar conversions for MSVC.,"MSVC seems to be lacking the scalar conversion instructions. | |
It has the vector ones. | |
Bug reported over at TensorFlow: https://github.com/tensorflow/tensorflow/issues/54397 | |
Similar issue found in an Intel repo: https://github.com/intel/tinycbor/pull/193",Antonio Sánchez,2022-03-30T18:35:33.143Z,NA,NA | |
939 (https://gitlab.com/libeigen/eigen/-/merge_requests/939),Don't include .cpp in lapack.,"This is bad practice in general. | |
I didn't rename these as `.h` since they're not really headers | |
either, they are special macro-dependent implementation details | |
that *could* be considered ""textual headers"" from a modules | |
perspective (https://clang.llvm.org/docs/Modules.html).",Antonio Sánchez,2022-03-30T21:41:57.372Z,NA,NA | |
934 (https://gitlab.com/libeigen/eigen/-/merge_requests/934),fixed order of arguments in blas syrk,"During my work on !906 I came across this error: The order of the template arguments is wrong. This becomes noticeable as a compile error once the Row/ColMajor arguments are no longer implicitly convertible to/from bool -- then the template instantiation will be rejected. | |
TODO: Figure out why this was not caught by any automated test. | |
Verify that the new code is actually doing what BLAS specifies.",Erik Schultheis,2022-03-30T22:05:08.191Z,NA,NA | |
941 (https://gitlab.com/libeigen/eigen/-/merge_requests/941),Consider inf/nan in scalar test_isApprox.,"Otherwise we periodically get comparisons of `VERIFY_IS_APPROX(inf, | |
inf)`, which probably should be `true`, but instead fails.",Antonio Sánchez,2022-04-01T17:00:25.259Z,NA,NA | |
940 (https://gitlab.com/libeigen/eigen/-/merge_requests/940),Add back std::remove* aliases - third-party libraries rely on these.,"Removing them broke a bunch of 3P libs that need to be updated. That will likely take some time, so putting aliases back in.",Antonio Sánchez,2022-04-01T17:23:11.380Z,NA,NA | |
854 (https://gitlab.com/libeigen/eigen/-/merge_requests/854),Added Scaling function overload for vector rvalue reference,"### Reference issue | |
#2431 | |
### What does this implement/fix? | |
Provides an overload to `Scaling` that takes an rvalue reference, so that if a user attempts to generate a diagonal (scaling) matrix from one, e.g. a temporary `Vector<Scalar, Size>`, they obtain a valid matrix. The current `Scaling` overload that is used when an Eigen user tries to do this returns a `DiagonalWrapper<Derived>`, unlike all the other `Scaling` functions. | |
The linked issue explains the issue in more detail, and shows an example of a user falling into this trap, as well as a code example demonstrating this overload working in the desired case. | |
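The trap can be sketched without Eigen. This is a minimal illustration, assuming hypothetical stand-in types `RefWrapper` and `OwningDiagonal` and a hypothetical function `make_scaling`; it is not the code from this MR. The lvalue overload returns a view that refers to the caller's vector, while the rvalue overload moves the temporary into an owning object so that nothing is left dangling.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a non-owning wrapper (plays the role of DiagonalWrapper).
struct RefWrapper {
  const std::vector<double>& diag;  // refers to storage owned by the caller
  double coeff(std::size_t i) const { return diag[i]; }
};

// Hypothetical stand-in for an owning diagonal matrix (plays the role of DiagonalMatrix).
struct OwningDiagonal {
  std::vector<double> diag;  // owns its coefficients
  double coeff(std::size_t i) const { return diag[i]; }
};

// Lvalue overload: a view is safe as long as the caller keeps the vector alive.
inline RefWrapper make_scaling(const std::vector<double>& v) { return RefWrapper{v}; }

// Rvalue overload: take ownership of the temporary instead of referring to it.
inline OwningDiagonal make_scaling(std::vector<double>&& v) {
  return OwningDiagonal{std::move(v)};
}
```

With only the first overload, passing a temporary would bind it to the const reference and the returned view would outlive it; the second overload is preferred for rvalues and avoids that.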
### Additional information | |
An alternative fix is to not supply an additional overload, but change the current overload that takes a vector and returns a `DiagonalWrapper<Derived>` and instead return a `DiagonalMatrix`, such as by the method used in the new overload in this MR.",William Talbot,2022-04-04T16:50:09.765Z,NA,NA | |
904 (https://gitlab.com/libeigen/eigen/-/merge_requests/904),static const class members turned into constexpr,"This MR turns `static const` class member variables into `static constexpr`. | |
Based on the discussion in the other MR, I've left the enums as they are. | |
There are also some `static const` variables at function scope; I haven't looked at those.",Erik Schultheis,2022-04-04T17:33:33.782Z,NA,NA | |
943 (https://gitlab.com/libeigen/eigen/-/merge_requests/943),More constexpr helpers,"This MR converts some helper functions in `XprHelper.h` from template metaprogramming to `constexpr` functions. The changed functions are | |
* `compute_default_alignment_helper` | |
* `compute_matrix_flags` | |
* `size_at_compile_time` | |
When replacing usages of the latter, I noticed that in many cases the call site could actually be simplified by using the already existing helper `size_of_xpr_at_compile_time`, so I've updated these. | |
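As a rough illustration of this kind of conversion, the sketch below shows the same compile-time size computation written both ways. The names `kDynamic`, `size_at_compile_time_tmp` and `size_at_compile_time_fn` are hypothetical, and the value -1 for the dynamic marker is an assumption made for the example; this is not the code from `XprHelper.h`.

```cpp
#include <cassert>

// Stand-in for a dynamic-size marker; the value -1 is assumed for illustration.
constexpr int kDynamic = -1;

// Template-metaprogramming style: the result is exposed as a nested enum value.
template <int Rows, int Cols>
struct size_at_compile_time_tmp {
  enum { ret = (Rows == kDynamic || Cols == kDynamic) ? kDynamic : Rows * Cols };
};

// constexpr-function style: the same logic with ordinary call syntax.
constexpr int size_at_compile_time_fn(int rows, int cols) {
  return (rows == kDynamic || cols == kDynamic) ? kDynamic : rows * cols;
}
```

Both forms can be used in constant expressions; the function form is shorter at the call site and reads like regular code.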
In order to get these to compile, I also had to mark `ignore_unused_variable` as a `constexpr` function.",Erik Schultheis,2022-04-04T18:38:35.562Z,NA,NA | |
936 (https://gitlab.com/libeigen/eigen/-/merge_requests/936),Performance improvements in GEMM for Power,"Added vector_pair loads for LHS of GEMM for MMA (10% faster) | |
An extra accumulator for extra_row of GEMM for MMA & VSX (non-vectorized right portion of the matrix executes for essentially free in almost all cases - 1200% / number of columns eliminated) | |
Single pass for extra_col of GEMM for VSX (bottom of the matrix executes in a single pass versus up to 3 passes - 2400% / number of rows faster). | |
Other minor performance changes.",Chip Kerchner,2022-04-05T12:18:53.720Z,NA,NA | |
944 (https://gitlab.com/libeigen/eigen/-/merge_requests/944),constexpr reshape helper,This changes one more metaprogramming utility from template to constexpr function.,Erik Schultheis,2022-04-05T17:32:18.302Z,NA,NA | |
942 (https://gitlab.com/libeigen/eigen/-/merge_requests/942),Fix navbar scroll with toc.,"Looks like doxygen's navtree js has changed, requiring us to change our | |
override hacks to resize the navbar to fit our TOC. Because the `resizeHeight` | |
function is now privately nested in another function, we need to override the | |
entire outer `initResizable()`. | |
Also requires adjusting the position - the absolute position for | |
`div.toc` seems to interfere with scrollbars. | |
Fixes #2467",Antonio Sánchez,2022-04-05T20:14:22.809Z,NA,NA | |
945 (https://gitlab.com/libeigen/eigen/-/merge_requests/945),Fix some max size expressions.,These were accidentally replaced in !943.,Antonio Sánchez,2022-04-06T22:19:58.469Z,NA,NA | |
949 (https://gitlab.com/libeigen/eigen/-/merge_requests/949),Fix ODR issues in lapacke_helpers.,Fixes #2473,Antonio Sánchez,2022-04-08T15:31:30.863Z,NA,NA | |
948 (https://gitlab.com/libeigen/eigen/-/merge_requests/948),Fix MSVC+CUDA issues.,"Darn MSVC+CUDA gets confused for diagonal and transpose again, not able | |
to match out-of-line definitions with the corresponding declarations. | |
~~Removed~~ Modified internal typedefs ~~and just use the true type~~ to get around this. | |
MSVC also complained about not passing enough arguments to function-like | |
macro, and about invalid friend declarations. Removed unused macro | |
argument, and explicitly specified friend classes to get around these.",Antonio Sánchez,2022-04-08T18:05:33.193Z,NA,NA | |
946 (https://gitlab.com/libeigen/eigen/-/merge_requests/946),Remove EIGEN_EMPTY_STRUCT_CTOR,"### Reference issue | |
### What does this implement/fix? | |
This removes the old empty struct workaround for gcc as discussed in !899 | |
### Additional information",Tobias Schlüter,2022-04-08T18:47:57.892Z,NA,NA | |
953 (https://gitlab.com/libeigen/eigen/-/merge_requests/953),Fix ambiguous DiagonalMatrix constructors.,"The following became ambiguous: | |
``` | |
const Eigen::DiagonalMatrix<double, 4> m({1, -1, -1, 1}); | |
``` | |
since the initializer list `{1, -1, -1, 1}` could create either a | |
`DiagonalMatrix` with list of scalars, *or* a `DiagonalVectorType` with list of scalars. | |
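The overload situation can be sketched without Eigen. The types `Vec4` and `Diag4` below are hypothetical stand-ins, not Eigen classes, and the constructor set is an assumption chosen to mirror the description. Without the `std::initializer_list<double>` constructor, the braced list could be converted either to a `Vec4` or to a temporary `Diag4`, both through user-defined conversions, so the call would likely be rejected as ambiguous. With it, the list converts through a standard conversion only, which ranks better.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <utility>
#include <vector>

// Hypothetical stand-in for the diagonal vector type.
struct Vec4 {
  std::vector<double> data;
  Vec4(double a, double b, double c, double d) : data{a, b, c, d} {}
};

// Hypothetical stand-in for a fixed-size diagonal matrix.
struct Diag4 {
  std::vector<double> diag;
  int ctor_used;  // 0: four scalars, 1: vector, 2: initializer list

  Diag4(double a, double b, double c, double d) : diag{a, b, c, d}, ctor_used(0) {}
  explicit Diag4(Vec4&& v) : diag(std::move(v.data)), ctor_used(1) {}
  // Dedicated initializer-list constructor: the best match for a braced list of scalars.
  Diag4(std::initializer_list<double> list) : diag(list), ctor_used(2) {}
};

// Same call shape as in the description: parentheses around a braced list.
inline Diag4 make_diag() { return Diag4({1, -1, -1, 1}); }
```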
Added a single initializer list constructor to avoid this ambiguity.",Antonio Sánchez,2022-04-11T19:13:26.180Z,NA,NA | |
951 (https://gitlab.com/libeigen/eigen/-/merge_requests/951),Fix Power GEMV order of operations in predux for MMA.,A rare unit test failure showed a difference in the order of operations in predux (which adds all the elements of a Packet together) for GEMV MMA. This changes it to match the VSX version and reduces the number of instructions from 20 to 7. It also fixes some inline assembly for GCC that no longer compiles/assembles.,Chip Kerchner,2022-04-11T21:29:06.196Z,NA,NA | |
952 (https://gitlab.com/libeigen/eigen/-/merge_requests/952),Allow all tests to pass with `EIGEN_TEST_NO_EXPLICIT_VECTORIZATION`,"Some tests currently fail due to alignment assumptions. Here we just | |
work around the failing tests. | |
Identified in #2470.",Antonio Sánchez,2022-04-12T14:48:24.054Z,NA,NA | |
959 (https://gitlab.com/libeigen/eigen/-/merge_requests/959),"Restrict new AVX512 trsm to AVX512VL, rename files for consistency.","Some of the newly added AVX512 packetmath functions, including | |
the masked add for AVX require `__AVX512VL__`. | |
Also renamed headers for consistency with the rest of Eigen: | |
- upper-camel case `.h` | |
- non-standard headers are `.inc` (i.e. require being included in odd places)",Antonio Sánchez,2022-04-14T16:58:33.189Z,NA,NA | |
960 (https://gitlab.com/libeigen/eigen/-/merge_requests/960),Remove AVX512VL dependency in trsm,"### Reference issue | |
### What does this implement/fix? | |
This PR addresses the issue mentioned in https://gitlab.com/libeigen/eigen/-/merge_requests/959. The `_mm256_mask*` intrinsics are not supported with `AVX512F` alone (`-mfma -mavx512f`) and require `AVX512F + AVX512VL`. To fix this we switch to the corresponding `_mm512_mask*` intrinsics and reinterpret `zmm <-> ymm` when necessary. | |
### Additional information | |
In https://gitlab.com/libeigen/eigen/-/merge_requests/834 `-march=native` was used for performance testing. With `-march=native` the changes here do not cause any performance regressions. With `-mfma -mavx512f` performance is lower for smaller problem sizes in cases requiring intermediate transposes.",b-shi,2022-04-14T20:55:13.848Z,NA,NA | |
962 (https://gitlab.com/libeigen/eigen/-/merge_requests/962),Update HouseholderSequence.h,"Applying a Householder sequence to the left of a vector (most commonly Q from a QR factorization) results in an avoidable heap allocation. The previous fix did not consider right-hand sides with a fixed number of columns when the argument `inputIsIdentity` is `true`. This fix checks `inputIsIdentity` and, if true, uses a dynamic-size block with `bottomRightCorner()`. Otherwise, a fixed-size column block with `bottomRows()` is used, which implicitly passes `Dest::ColsAtCompileTime`. The same logic is applied to the block case. Passing `Dest::ColsAtCompileTime` to `internal::apply_block_householder_on_the_left` revealed a typo: `TFactorSize` should correspond to `VectorsType::ColsAtCompileTime`.",Charles Schlosser,2022-04-15T16:56:17.831Z,NA,NA | |
963 (https://gitlab.com/libeigen/eigen/-/merge_requests/963),Fix cwise NaN propagation for scalar input.,"Was missing a template parameter. Updated tests. | |
Fixes #2474.",Antonio Sánchez,2022-04-16T05:07:44.501Z,NA,NA | |
964 (https://gitlab.com/libeigen/eigen/-/merge_requests/964),Fix HouseholderSequence.h,"The InnerPanel template parameter is not always false and in those cases where it is true, the assignment won't compile. | |
/cc @cantonios",Rohit Santhanam,2022-04-17T06:08:03.072Z,NA,NA | |
965 (https://gitlab.com/libeigen/eigen/-/merge_requests/965),"Add fused multiply functions for PowerPC - pmsub, pnmadd and pnmsub","Add fused multiply functions for PowerPC - pmsub, pnmadd and pnmsub",Chip Kerchner,2022-04-18T16:16:32.757Z,NA,NA | |
958 (https://gitlab.com/libeigen/eigen/-/merge_requests/958),Fix compiler bugs for GCC 10 & 11 for Power GEMM,Inline assembly for load vector pair broken for GCC 10 & 11. Using multiple vector pairs broken for GCC 10.,Chip Kerchner,2022-04-20T15:59:00.776Z,NA,NA | |
966 (https://gitlab.com/libeigen/eigen/-/merge_requests/966),Removed need to supply the Symmetric flag to UpLo argument for Accelerate LLT and LDLT,"This pull request removes the need to supply the Symmetric flag to the UpLo template argument for Accelerate's LLT and LDLT. | |
The Accelerate LLT and LDLT solvers require the supplied matrices to be symmetric. We therefore make it easier to utilize the support module by implicitly ORing the Symmetric flag with the supplied UpLo argument. | |
The UpLo argument to AccelerateQR and AccelerateCholeskyAtA has also been removed as it was unnecessary.",John Mather,2022-04-21T20:02:10.996Z,NA,NA | |
967 (https://gitlab.com/libeigen/eigen/-/merge_requests/967),Add load vector_pairs for RHS of GEMM MMA. Improved predux GEMV.,"Add load vector_pairs for RHS of GEMM MMA (10% faster in some situations). Improved predux GEMV - use vectors instead of scalars. General cleanup of GEMV - remove unnecessary typename Index from GEMM, etc.",Chip Kerchner,2022-04-25T16:23:02.162Z,NA,NA | |
968 (https://gitlab.com/libeigen/eigen/-/merge_requests/968),make diagonal matrix cols() and rows() methods constexpr,"### What does this implement/fix? | |
This PR adds the Eigen constexpr macro `EIGEN_CONSTEXPR` to the `cols()` and `rows()` methods of `Eigen::DiagonalMatrix`. | |
This also lines it up with `Eigen::Matrix`, which already has these constexpr methods, and is in line with https://gitlab.com/libeigen/eigen/-/issues/2152.",Alex_M,2022-05-05T15:29:21.993Z,NA,NA | |
969 (https://gitlab.com/libeigen/eigen/-/merge_requests/969),Add `uninstall` target only if not already defined.,"### What does this implement/fix? | |
As suggested by https://gitlab.kitware.com/cmake/community/-/wikis/FAQ#can-i-do-make-uninstall-with-cmake it is common to check for the `uninstall` target to exist before adding it. | |
This is not a problem for standalone projects or for projects using `ExternalProject` (https://cmake.org/cmake/help/latest/module/ExternalProject.html), but with FetchContent (https://cmake.org/cmake/help/latest/module/FetchContent.html) all targets are ""imported"" into a single namespace. | |
It is reasonable to let the user know if a dependency already defines a specific target, but the `uninstall` meta target has a common name and for this reason it is a good practice to check for its existence before defining it. | |
This change should not have any impact on users installing the project separately, but will allow the use of Eigen with FetchContent",Francesco Romano,2022-05-05T17:43:10.496Z,NA,NA | |
356 (https://gitlab.com/libeigen/eigen/-/merge_requests/356),Adding PocketFFT support in FFT module since kissfft has some flaw in accuracy and performance,"Eigen currently uses KissFFT as the default FFT implementation, but its performance drops sharply in some situations because kissfft fails to handle ""odd-sized"" inputs (i.e., sizes with large prime factors) efficiently, as reported in #1717. In pocketfft, for lengths with very large prime factors, Bluestein's algorithm is used: instead of an FFT of length n, a convolution of length ```n2 >= 2*n-1``` is performed, where n2 is chosen to be highly composite. | |
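The choice of the padded length can be sketched as follows. This is only an illustration under stated assumptions: the function names `is_smooth` and `bluestein_length` are hypothetical, and the factor set {2, 3, 5} is assumed for simplicity; the set of small primes that pocketfft actually accepts may differ.

```cpp
#include <cassert>
#include <cstddef>

// True if n has no prime factors other than 2, 3 and 5 (assumed factor set).
inline bool is_smooth(std::size_t n) {
  if (n == 0) return false;
  const std::size_t primes[] = {2, 3, 5};
  for (std::size_t p : primes) {
    while (n % p == 0) n /= p;
  }
  return n == 1;
}

// Smallest smooth length n2 with n2 >= 2*n - 1, for n >= 1.
inline std::size_t bluestein_length(std::size_t n) {
  std::size_t n2 = 2 * n - 1;
  while (!is_smooth(n2)) ++n2;
  return n2;
}
```

For a prime length such as n = 7, the convolution length must be at least 13; the search skips 13 and 14 and settles on 15 = 3 * 5, which a mixed-radix FFT handles efficiently.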
In addition, the google/jax project previously used Eigen's FFT as its backend, which also had problems with performance and accuracy; see their discussion [FFT precision/performance #2952](https://github.com/google/jax/issues/2952). They have since switched to pocketfft as their default FFT implementation. | |
Preliminary performance comparison (complex to complex, -O3 optimization): | |
``` | |
------------------------------------------- | |
length kissfft pocketfft | |
------------------------------------------- | |
100000 7.03 ms 4.36 ms | |
100000*2 13.7 ms 9.67 ms | |
100001 14999 ms 24.3 ms | |
``` | |
Updated benchmarks from fft_benchmark.cpp on AArch64 (time unit: ns): | |
``` | |
Benchmark Time CPU Time Old Time New CPU Old CPU New | |
------------------------------------------------------------------------------------------------------------------------- | |
test_scalar_float/100 -0.1687 -0.1688 2555 2124 2554 2123 | |
test_scalar_float/512 -0.4176 -0.4176 14773 8603 14767 8600 | |
test_scalar_float/4096 -0.5413 -0.5413 158755 72818 158684 72781 | |
test_scalar_float/32768 -0.4713 -0.4714 1639528 866753 1638255 866018 | |
test_scalar_float/100000 -0.5012 -0.5012 6823375 3403716 6817207 3400439 | |
test_scalar_float/100001 -0.9988 -0.9988 22048197988 25454072 22029408670 25420215 | |
test_scalar_double/100 -0.1298 -0.1298 2421 2107 2420 2106 | |
test_scalar_double/512 -0.3835 -0.3835 13892 8565 13885 8561 | |
test_scalar_double/4096 -0.5060 -0.5060 153078 75624 153005 75581 | |
test_scalar_double/32768 -0.4349 -0.4350 1912198 1080593 1910733 1079557 | |
test_scalar_double/100000 -0.4256 -0.4256 7089696 4072526 7083264 4068571 | |
test_scalar_double/100001 -0.9985 -0.9985 29955156310 44809112 29930479990 44739807 | |
test_complex_float/100 -0.4638 -0.4638 4454 2388 4452 2387 | |
test_complex_float/512 -0.6285 -0.6285 29882 11102 29869 11097 | |
test_complex_float/4096 -0.6174 -0.6174 285366 109192 285238 109137 | |
test_complex_float/32768 -0.6265 -0.6265 3698800 1381495 3695956 1380318 | |
test_complex_float/100000 -0.6730 -0.6730 13653536 4465063 13642033 4460459 | |
test_complex_float/100001 -0.9990 -0.9990 22388960173 22772174 22370358320 22745811 | |
test_complex_double/100 -0.1601 -0.1602 4239 3560 4237 3558 | |
test_complex_double/512 -0.6085 -0.6085 28463 11143 28452 11138 | |
test_complex_double/4096 -0.5924 -0.5924 294522 120061 294352 119977 | |
test_complex_double/32768 -0.5963 -0.5963 4087268 1650075 4083765 1648615 | |
test_complex_double/100000 -0.5011 -0.5011 15675452 7820870 15659329 7811764 | |
test_complex_double/100001 -0.9985 -0.9985 28986634903 43583605 28961759140 43521986 | |
``` | |
[fft_benchmark.cpp](/uploads/163b7e8dcde4510db63e134e1f3ce218/fft_benchmark.cpp)",Guoqiang QI,2022-05-11T17:44:22.976Z,MR to reopen,NA | |
860 (https://gitlab.com/libeigen/eigen/-/merge_requests/860),Add AVX512 optimizations for matrix multiply,"## Edit | |
I've refactored the original implementation in this merge request to use packet math and avoid inline asm/intrinsics as much as possible. It also supports double precision. | |
The changes optimize the compute kernels, which use 48x8 and 24x8 unrolls for single and double precision, respectively. Tail handling is done with powers of 2 when possible, that is, when the packing routines (`gemm_pack_rhs` and `gemm_pack_lhs`) support it. If not supported, we loop over ones as done before. | |
The new kernels do not support an inner stride other than one for the C matrix; in that case we fall back to Eigen's previously used kernels (with `nr == 4`). The decision has to be made at the `gebp_traits` stage so that all kernels are compatible, which avoids more intrusive changes in other Eigen drivers that use `gebp_kernel`, `gemm_pack_rhs` and `gemm_pack_lhs`. | |
I've also added a couple of macros to reduce register pressure, which is very high: 24 accumulators + 6 registers for loading A and 2 for loading B. Using `EIGEN_ARCH_AVX512_GEMM_KERNEL_USE_LESS_A_REGS` or `EIGEN_ARCH_AVX512_GEMM_KERNEL_USE_LESS_B_REGS` will reduce register use for A and B by half. The performance loss was small (less than 2% for large sizes). For gcc we use 3 registers to load A by default, since it was the only way I was able to avoid zmm register spills. | |
I've built the tests and run them for the following architectures: SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, AVX2, AVX512, and AVX512DQ. Other ones need to be checked. | |
### Performance | |
I've done a simple sweep test for square problem sizes on a Xeon 8180 (Skylake) in sequential mode. gcc11 and clang11 were used to compile the benchmark code. There are speedups for dgemm (~20%) and sgemm (~15%), but for sgemm there can be some slowdowns for small problem sizes. Below are some more details on the performance. I've also measured the other transpose cases (NT, TN and TT), and the results are similar. | |
According to @b-shi he also saw improvements for trsm with this patch. | |
#### `dgemm` | |
Performance improvements for ""`dgemm`"" seem reasonable, around 20% compared to Eigen before the changes (c38f91d). Some small sizes also improved, at least when they are multiples of 2. | |
 | |
 | |
For smaller sizes I didn't see very large regressions. | |
 | |
 | |
#### `sgemm` | |
For ""`sgemm`"", I see some speedups as well (up to 15% for clang and a bit more for gcc), but I've also noticed some slowdowns for small sizes and depending on whether the size is a multiple of 4. | |
 | |
 | |
As mentioned before, non-multiples of 4 have some regressions for smaller problem sizes. Hopefully, the benefits outweigh the regressions enough to make these changes worthwhile. | |
 | |
 | |
For reference here is the performance for multiples of 2 (step = 2): | |
 | |
 | |
These slowdowns can probably be mitigated if we further enable packing to handle the tail with m = 2 for single precision directly instead of looping over ones. For example, for m = 47 tail handling would be 32 + 8 + 4 + **2** + 1 instead of 32 + 8 + 4 + **1 + 1** + 1. | |
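The m = 47 example can be reproduced with a small greedy split. This is only a sketch of the arithmetic, not the kernel's dispatch code: the function name `split_tail` is hypothetical, and the lists of chunk sizes are assumptions taken from the example.

```cpp
#include <cassert>
#include <vector>

// Greedily split a tail of m rows into chunks drawn from `sizes`,
// which must be sorted in decreasing order and end with 1.
inline std::vector<int> split_tail(int m, const std::vector<int>& sizes) {
  std::vector<int> chunks;
  for (int s : sizes) {
    while (m >= s) {
      chunks.push_back(s);
      m -= s;
    }
  }
  return chunks;
}
```

With chunk sizes {32, 16, 8, 4, 2, 1} the tail of 47 splits into 32, 8, 4, 2, 1 (five kernel calls); removing the size 2 gives 32, 8, 4, 1, 1, 1 (six calls), which is the extra looping over ones described in the text.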
## Old stuff before large refactor. | |
### What does this implement/fix? | |
This implements/adds some optimizations for ""sgemm"" compute kernels for AVX512. This is still work in progress since it doesn't use packet math yet. However, it will be useful in getting some early feedback on the changes. | |
Here are some comments/questions worth mentioning: | |
1. It will be useful to know if the inline asm used in the kernel (`Eigen/src/Core/arch/AVX512/sgemm_kern.hpp`) is reasonable/acceptable. In particular, there is some register mapping that was used to avoid gcc register spills. Also, using inline asm for loading A/B elements results in better performance with gcc. | |
2. The kernel is quite verbose, but it should be buildable with c++14. It manually unrolls and handles tails with powers of 2, except for `m/n` equal to 2, where I had to loop over ones so that I could reuse the packing kernels. Does it make sense to rewrite the kernel with c++17 as @b-shi did in !834? | |
3. I'm not really sure if the performance improvements justify the changes. It seems Eigen's ""sgemm"" performance is already quite good. I see about a 10% to 15% performance increase for large sizes with the changes. Maybe we will need to use some threshold to dispatch the kernel for large sizes only, to avoid regressions for smaller sizes. Is this acceptable? | |
4. I tried to re-enable packing with `nr = 8` for `gemm_pack_rhs` by uncommenting it and making small changes. It seems to work for matrix multiplication, but I'm not sure if it was commented out for other reasons. Was there a reason? | |
### Additional information | |
Here are some initial performance measurements on `Intel(R) Xeon(R) Platinum 8180` for A/B non-transpose. For clang we can actually remove the register mapping without having register spills, but gcc performance would be lower. | |
#### NN using gcc11 | |
 | |
#### NN using clang11 | |
 | |
+@b-shi",aaraujom,2022-05-12T23:41:20.087Z,NA,NA | |
908 (https://gitlab.com/libeigen/eigen/-/merge_requests/908),Fix 'Incorrect reference code in STL_interface.hh for ata_product' eigen/isses/2425,"### Reference issue | |
#2425 | |
### What does this implement/fix? | |
This fixes a bug in the `ata_product` reference code.",Rohan Ghige,2022-05-18T14:42:58.314Z,NA,NA | |
974 (https://gitlab.com/libeigen/eigen/-/merge_requests/974),Prevent BDCSVD crash caused by index out of bounds.,"For a large matrix of ones, we end up trying to access `perm(-1)`, which | |
causes a memory access error and crash. | |
Added a basic check for this case, and force `zhat` to zero, in an | |
attempt to continue. Reports a numerical issue. | |
Related to #2491",Antonio Sánchez,2022-05-19T22:29:49.402Z,NA,NA | |
973 (https://gitlab.com/libeigen/eigen/-/merge_requests/973),Add arg() to tensor,"### What does this implement/fix? | |
Adds a .arg() method to Tensors.",Tobias Wood,2022-05-20T03:51:50.545Z,NA,NA | |
977 (https://gitlab.com/libeigen/eigen/-/merge_requests/977),Fix BDCSVD condition for failing with numerical issue.,NA,Antonio Sánchez,2022-05-20T15:40:38.069Z,NA,NA | |
984 (https://gitlab.com/libeigen/eigen/-/merge_requests/984),unset executable flag,"### Reference issue | |
### What does this implement/fix? | |
Subject says it all. | |
### Additional information",Eisuke Kawashima,2022-05-23T04:13:16.289Z,NA,NA | |
985 (https://gitlab.com/libeigen/eigen/-/merge_requests/985),Improve plogical_shift_* implementations and fix typo in SVE/PacketMath.h,"This change was part of work I did earlier; I think it may be useful for Eigen, so I created an MR.",Guoqiang QI,2022-05-23T14:44:42.521Z,NA,NA | |
983 (https://gitlab.com/libeigen/eigen/-/merge_requests/983),[SYCL] Extending SYCL queue interface extension.,This PR extends the `QueueInterface` in the SYCL backend to accept an existing SYCL queue. This will enable us to integrate Eigen SYCL in high-level frameworks that already have SYCL-queue. Reusing the existing SYCL queue will avoid the extra context creation and potential unnecessary memory movement which is expensive.,Mehdi Goli,2022-05-23T14:45:28.119Z,NA,NA | |
980 (https://gitlab.com/libeigen/eigen/-/merge_requests/980),Avoid signed integer overflow in adjoint test.,Sanitizers complain about this.,Antonio Sánchez,2022-05-23T14:46:17.414Z,NA,NA | |
975 (https://gitlab.com/libeigen/eigen/-/merge_requests/975),Add subMappers to Power GEMM packing - simplifies the address calculations (10% faster),Add subMappers to Power GEMM packing - simplifies the address calculations (10% faster). Added missing getSubMapper & getLinearMapper for TensorContractonMapper - fixed compilation issues. Other minor complex packing improvements.,Chip Kerchner,2022-05-23T15:18:30.414Z,NA,NA | |
982 (https://gitlab.com/libeigen/eigen/-/merge_requests/982),Avoid ambiguous Tensor comparison operators for C++20 compatibility,Should be compatible with any C++ version.,Benjamin Kramer,2022-05-23T17:36:03.918Z,NA,NA | |
986 (https://gitlab.com/libeigen/eigen/-/merge_requests/986),[SYCL] SYCL-2020 range does not have default constructor.,According to the [SYCL-2020 spec](https://www.khronos.org/registry/SYCL/specs/sycl-2020/html/sycl-2020.html#table.constructors.range) the range class does not have a default constructor anymore. This PR changes all default constructed ranges to the range of size 1 to make sure at least one thread will be created to run the `parallel_for`.,Mehdi Goli,2022-05-24T03:11:47.315Z,NA,NA | |
976 (https://gitlab.com/libeigen/eigen/-/merge_requests/976),fix: issue 2481: LDLT produce wrong results with AutoDiffScalar,"### Reference issue | |
#2481 | |
### What does this implement/fix? | |
When the value of an `AutoDiffScalar` is 0 (as in the minimal example), some updates in the derivative that should take place in a `triangular_solve_vector<...>::run` are being skipped. | |
See for instance: | |
https://gitlab.com/libeigen/eigen/-/blob/master/Eigen/src/Core/products/TriangularSolverVector.h#L118 | |
The error does not occur when the value of the `AutoDiffScalar` is nonzero. | |
Proposed fix: | |
- The check must take into account the value of the derivatives when checking `AutoDiffScalar` for zeroes | |
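A minimal sketch of why a value-only zero test loses derivative information. `Dual` is a hypothetical stand-in for `AutoDiffScalar` with a single derivative, and `update` imitates one skipped-or-applied step of a triangular solve; none of these names come from Eigen.
```cpp
#include <cassert>

// Hypothetical stand-in for AutoDiffScalar: a value plus one derivative.
struct Dual {
  double v;  // value
  double d;  // derivative
};

inline Dual operator*(Dual a, Dual b) { return {a.v * b.v, a.d * b.v + a.v * b.d}; }
inline Dual operator-(Dual a, Dual b) { return {a.v - b.v, a.d - b.d}; }

// Value-only test: treats {0, d} as zero even when d != 0.
inline bool is_zero_value_only(Dual x) { return x.v == 0.0; }

// Test in the spirit of the proposed fix: the derivative must vanish too.
inline bool is_zero_with_derivative(Dual x) { return x.v == 0.0 && x.d == 0.0; }

// One update step rhs_j -= l_ji * x_i, skipped when x_i is considered zero.
template <typename IsZero>
Dual update(Dual rhs_j, Dual l_ji, Dual x_i, IsZero is_zero) {
  if (is_zero(x_i)) return rhs_j;
  return rhs_j - l_ji * x_i;
}
```
With `x_i = {0, 1}`, `l_ji = {2, 0}` and `rhs_j = {5, 0}`, the value-only test skips the update and leaves the derivative at 0, while the second test applies it and yields a derivative of -2.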
### Additional information | |
<!--Any additional information you think is important.-->",Mario Rincon-Nigro,2022-05-25T15:26:11.707Z,NA,NA | |
971 (https://gitlab.com/libeigen/eigen/-/merge_requests/971),Add R-Bidiagonalization step to BDCSVD,"### What does this implement/fix? | |
This adds an R-Bidiagonalization step to BDCSVD. (mentioned in section 8.6.3 of the Golub and Van Loan ""Matrix Computations"" book.) | |
- When the input matrix ``A`` is sufficiently tall (or wide), this first computes the QR decomposition of ``A`` (or ``A*``). It then just runs the normal bidiagonalization/svd routines on the top rows of R. Finally, it has to apply Q on the left of ``U`` (or ``V``) to get the actual SVD of ``A = Q(R/0) = Q(USV^T/0)``. | |
- Optimization for tall matrices since less work is done in bidiagonalization. | |
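The identity in the first bullet can be written out for the tall case (m >= n). This is only a sketch: the symbols R_1, U_1, Sigma and V are names chosen here and do not come from the MR. The wide case follows by applying the same steps to ``A*``.
```latex
A = Q \begin{bmatrix} R_1 \\ 0 \end{bmatrix}, \qquad
R_1 = U_1 \Sigma V^{*}
\;\Longrightarrow\;
A = \left( Q \begin{bmatrix} U_1 \\ 0 \end{bmatrix} \right) \Sigma V^{*}
```
Bidiagonalization then runs on the n x n block R_1 instead of the m x n matrix ``A``, and the left singular vectors of ``A`` are recovered by applying Q to U_1 padded with zeros.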
As far as I can tell, LAPACK always does this in [dgesdd](http://www.netlib.org/lapack/explore-html/d1/d7e/group__double_g_esing_gad8e0f1c83a78d3d4858eaaa88a1c5ab1.html#gad8e0f1c83a78d3d4858eaaa88a1c5ab1) | |
when there's sufficient workspace. | |
I'm not sure if adding additional workspace here is acceptable, but this mr shouldn't add much more. A lot of the new workspace needed for the QR decomposition should be cancelled out because UpperBidiagonalization uses less. | |
Looked like an easy win, so I thought I'd try it out. (There was also some recent discussion on Discord where people were using BDCSVD on super tall matrices, so I think this meets a real need.) | |
### Additional information | |
I think it's working pretty well. Passes the test suite and I'm pleased with the speedup I'm seeing. | |
These are from some informal benchmarks: | |
- Run on an AMD EPYC 7742, with gcc-10.2.0, -O3 -march=core-avx2 | |
- Results are shown for computing both thin U and V and for computing only thin V, for double and complex<double>. | |
##### Small-ish | |
~~~~ | |
Benchmark    Time    CPU    Time Old    Time New    CPU Old    CPU New | |
----------------------------------------------------------------------------------------------------------------------------------------- | |
bdcsvd_double_computeThinUV/38/38 -0.0150 -0.0146 560730 552343 558549 550367 | |
bdcsvd_double_computeThinUV/38/138 -0.0311 -0.0317 665697 644985 663199 642201 | |
bdcsvd_double_computeThinUV/38/238 -0.0851 -0.0852 783044 716412 780009 713560 | |
bdcsvd_double_computeThinUV/38/338 -0.0897 -0.0905 884546 805231 881206 801484 | |
bdcsvd_double_computeThinUV/38/438 -0.0891 -0.0896 969086 882702 965227 878719 | |
bdcsvd_double_computeThinUV/38/538 -0.1708 -0.1712 1118555 927540 1114049 923328 | |
bdcsvd_double_computeThinUV/38/638 -0.1533 -0.1534 1213303 1027309 1208212 1022824 | |
bdcsvd_double_computeThinUV/38/738 -0.2031 -0.2033 1334411 1063345 1329151 1058981 | |
bdcsvd_double_computeThinUV/38/838 -0.1994 -0.1998 1430272 1145130 1424248 1139691 | |
bdcsvd_double_computeThinV/38/38 +0.0027 +0.0025 495862 497220 494012 495256 | |
bdcsvd_double_computeThinV/38/138 +0.0156 +0.0153 526925 535143 524829 532842 | |
bdcsvd_double_computeThinV/38/238 -0.0104 -0.0108 564038 558148 562120 556069 | |
bdcsvd_double_computeThinV/38/338 -0.0942 -0.0950 654858 593161 652641 590671 | |
bdcsvd_double_computeThinV/38/438 -0.1722 -0.1726 746768 618174 743960 615546 | |
bdcsvd_double_computeThinV/38/538 -0.1859 -0.1867 780695 635544 777907 632657 | |
bdcsvd_double_computeThinV/38/638 -0.2150 -0.2149 842719 661564 839380 658989 | |
bdcsvd_double_computeThinV/38/738 -0.2572 -0.2572 926010 687850 922064 684915 | |
bdcsvd_double_computeThinV/38/838 -0.2878 -0.2876 996698 709868 992685 707202 | |
~~~~ | |
##### Larger | |
~~~~ | |
Benchmark    Time    CPU    Time Old    Time New    CPU Old    CPU New | |
------------------------------------------------------------------------------------------------------------------------------------------- | |
bdcsvd_double_computeThinUV/500/500 -0.0223 -0.0218 120934799 118241344 120206450 117587615 | |
bdcsvd_double_computeThinUV/500/1000 -0.0781 -0.0768 168039073 154912570 166927595 154101147 | |
bdcsvd_double_computeThinUV/500/2000 -0.0841 -0.0839 252251583 231030995 250837282 229786319 | |
bdcsvd_double_computeThinUV/500/3000 -0.2933 -0.2931 354488660 250504511 352466612 249157515 | |
bdcsvd_double_computeThinUV/500/4000 -0.3849 -0.3845 498082335 306358967 495251025 304808607 | |
bdcsvd_double_computeThinUV/500/5000 -0.4102 -0.4101 611187632 360497410 607805405 358562461 | |
bdcsvd_double_computeThinUV/500/6000 -0.5858 -0.5855 993030518 411281976 986559546 408897937 | |
bdcsvd_double_computeThinV/500/500 +0.0023 +0.0025 96144720 96369521 95591914 95829350 | |
bdcsvd_double_computeThinV/500/1000 -0.0108 -0.0104 125544195 124185888 124790609 123487358 | |
bdcsvd_double_computeThinV/500/2000 +0.0516 +0.0515 169231449 177963941 168368397 177041849 | |
bdcsvd_double_computeThinV/500/3000 -0.3211 -0.3211 220459022 149679402 219407709 148955709 | |
bdcsvd_double_computeThinV/500/4000 -0.3434 -0.3435 292084731 191771479 290552164 190751061 | |
bdcsvd_double_computeThinV/500/5000 -0.4782 -0.4776 404396172 211012044 401878571 209961360 | |
bdcsvd_double_computeThinV/500/6000 -0.6294 -0.6291 598237028 221678815 594283542 220415778 | |
bdcsvd_complex_double_computeThinUV/500/500 +0.0916 +0.0897 255520963 278933399 254162600 276962257 | |
bdcsvd_complex_double_computeThinUV/500/1000 -0.0046 -0.0063 470148880 467963802 467394496 464450050 | |
bdcsvd_complex_double_computeThinUV/500/2000 +0.0331 +0.0314 917793280 948137549 912521721 941157706 | |
bdcsvd_complex_double_computeThinUV/500/3000 -0.2198 -0.2211 1360012279 1061090288 1351584140 1052804356 | |
bdcsvd_complex_double_computeThinUV/500/4000 -0.3050 -0.3053 1915540819 1331249650 1903755375 1322459094 | |
bdcsvd_complex_double_computeThinUV/500/5000 -0.2804 -0.2804 2421978472 1742944820 2405996094 1731417102 | |
bdcsvd_complex_double_computeThinUV/500/6000 -0.3471 -0.3468 2871235454 1874670684 2851029992 1862315016 | |
bdcsvd_complex_double_computeThinV/500/500 -0.0028 -0.0027 197499887 196955268 196285795 195757475 | |
bdcsvd_complex_double_computeThinV/500/1000 -0.0236 -0.0235 318554734 311042543 316618729 309192952 | |
bdcsvd_complex_double_computeThinV/500/2000 -0.0431 -0.0423 567077441 542655686 563321471 539468600 | |
bdcsvd_complex_double_computeThinV/500/3000 -0.4819 -0.4812 915616107 474385083 909423431 471792729 | |
bdcsvd_complex_double_computeThinV/500/4000 -0.4868 -0.4862 1144582829 587365450 1137211810 584304876 | |
bdcsvd_complex_double_computeThinV/500/5000 -0.5118 -0.5114 1436150418 701125911 1427595270 697484811 | |
bdcsvd_complex_double_computeThinV/500/6000 -0.5330 -0.5328 1733322684 809429173 1722951672 805023436 | |
OVERALL_GEOMEAN -0.2741 -0.2740 0 0 0 0 | |
~~~~ | |
##### Very Tall | |
~~~~ | |
Benchmark    Time    CPU    Time Old    Time New    CPU Old    CPU New | |
---------------------------------------------------------------------------------------------------------------------------------------------- | |
bdcsvd_double_computeThinUV/32/100000 -0.3170 -0.3175 226529586 154717160 225472766 153888702 | |
bdcsvd_double_computeThinUV/264/100000 -0.6096 -0.6095 5149984761 2010317098 5119070612 1998970140 | |
bdcsvd_double_computeThinUV/1024/100000 -0.6538 -0.6530 59780980101 20698847700 59313139187 20578710233 | |
bdcsvd_double_computeThinV/32/100000 -0.2606 -0.2609 258735994 191316443 257246119 190123143 | |
bdcsvd_double_computeThinV/264/100000 -0.6068 -0.6065 5527338202 2173494907 5493199942 2161352169 | |
bdcsvd_double_computeThinV/1024/100000 -0.6419 -0.6411 60740516522 21750376091 60278909057 21633052616 | |
bdcsvd_complex_double_computeThinUV/32/100000 -0.2993 -0.2989 636792072 446202903 633206937 443951859 | |
bdcsvd_complex_double_computeThinUV/264/100000 -0.4889 -0.4878 18326362603 9366890170 18194148881 9318136331 | |
bdcsvd_complex_double_computeThinUV/1024/100000 -0.4089 -0.4083 195544107745 115591737930 194190302999 114900596242 | |
bdcsvd_complex_double_computeThinV/32/100000 -0.4411 -0.4417 368514577 205950727 366569465 204654500 | |
bdcsvd_complex_double_computeThinV/264/100000 -0.6587 -0.6583 12379953027 4225803248 12292426264 4200267389 | |
bdcsvd_complex_double_computeThinV/1024/100000 -0.6676 -0.6669 125078044965 41579651579 124086840510 41338883708 | |
OVERALL_GEOMEAN -0.5259 -0.5255 7 3 7 3 | |
~~~~",Arthur,2022-05-27T02:00:25.878Z,NA,NA | |
987 (https://gitlab.com/libeigen/eigen/-/merge_requests/987),Fix integer shortening warnings in visitor tests.,NA,Rasmus Munk Larsen,2022-05-27T18:51:38.136Z,NA,NA | |
972 (https://gitlab.com/libeigen/eigen/-/merge_requests/972),Add AVX512 s/dgemm optimizations for compute kernel (2nd try),"This is a follow-up to resolve issues that popped up after [!860](https://gitlab.com/libeigen/eigen/-/merge_requests/860) was merged and then reverted. | |
It addresses the following: | |
* Build issue on NEON64 | |
* Rename/move of transpose for trsm | |
* Rename member in data mapper to something other than `incr` to avoid shadowing | |
I'm keeping the commits separate for now to help with readability. I will squash all of them on final rebase.",aaraujom,2022-05-28T02:00:22.752Z,NA,NA | |
981 (https://gitlab.com/libeigen/eigen/-/merge_requests/981),Adding an MKL adapter in FFT module.,"Fix missing template argument bug in FFT header for mkl adapter. | |
Add adapter implementations for the MKL, KFR and FFTS FFT libraries. oneAPI MKL is now free for both Windows and Linux and gives performance similar to FFTW on Intel CPUs. KFR FFT gives performance similar to FFTW, but it requires either a GPL or a commercial license. FFTS gives good performance, but there has been no developer activity for the last three years. Another drawback is that FFTS has not completed real-to-complex and 2D transforms for double precision. For the 1D real-to-complex and backward transforms I have to convert the real array to a complex one with Im = 0.",Oleg Shirokobrod,2022-06-02T18:10:44.351Z,NA,NA | |
989 (https://gitlab.com/libeigen/eigen/-/merge_requests/989),Fix c++20 ambiguity of comparisons.,"Via Google core library team: usage of comparison operators and | |
templates are now ambiguous in c++20 due to operator reversal. This change | |
resolves the ambiguity.",Antonio Sánchez,2022-06-03T05:11:07.793Z,NA,NA | |
988 (https://gitlab.com/libeigen/eigen/-/merge_requests/988),Fix build issues with MSVC for AVX512,"There are a couple of build issues with MSVC currently. I've seen them in this [pipeline](https://gitlab.com/libeigen/eigen_ci_cross_testing/-/jobs/2516428634). This addresses the ones I've seen so far. Tests are still building. MSVC is being really slow to build and is consuming quite a bit of memory. | |
--- | |
~Edit: I've seen a single process of `cl.exe` taking more than 60GB of memory. I think it will be more prudent to disable the recent optimizations added for AVX512 (GemmKernel and TrsmKernel) to avoid build issues on Windows with MSVC. I think this might be related to the template recursion used in both kernels. Compilation aborts with fatal error below.~ | |
``` | |
fatal error C1002: compiler is out of heap space in pass 2 | |
```",aaraujom,2022-06-03T14:55:41.443Z,NA,NA | |
990 (https://gitlab.com/libeigen/eigen/-/merge_requests/990),Provide DiagonalMatrix Product and Initializers,"### What does this implement/fix? | |
This includes two tiny additions to DiagonalMatrix: 1. diag*diag products and 2. static initializers for zero and identity. | |
~~~~ | |
// dumb example | |
DiagType m = ...; | |
DiagType n = ...; | |
DiagType B = (m * n) + 2 * DiagType::Identity(); | |
~~~~ | |
### Additional information | |
These are just things I've tried before and sort of assumed would work. Just for convenience, so nbd if stuff like this isn't desired! | |
I'm guessing DiagonalMatrix is meant to be lightweight since most stuff is pretty easy to do with vector ops on ``.diagonal()``? Maybe there could be other methods for stuff like operatorNorm, determinant, etc as well that have simple, but not totally obvious, implementations for diagonal matrices :shrug:",Arthur,2022-06-06T21:43:22.741Z,NA,NA | |
991 (https://gitlab.com/libeigen/eigen/-/merge_requests/991),Fix ambiguous comparisons for c++20 (again again),"C++20 introduces a reversibility lookup for comparison operators, | |
which leads to ambiguous comparison warnings in clang. | |
Modify comparisons in `TensorBase` to be symmetric. | |
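A minimal sketch of the symmetric form, not the actual `TensorBase` code: `ExprBase` and `Expr` are hypothetical types. A member operator such as `bool Expr::operator==(const ExprBase&) const` takes a different type on each side, so the reversed candidate that C++20 adds for `a == b` can match as well as the original one. Taking both operands by the same base type avoids that.
```cpp
#include <cassert>

// Hypothetical expression types used only for illustration.
struct ExprBase {
  int id;
  explicit ExprBase(int i) : id(i) {}
};
struct Expr : ExprBase {
  explicit Expr(int i) : ExprBase(i) {}
};

// Symmetric comparison: both parameters have the same type, so the
// original and the C++20 reversed candidate rank equally and the
// non-reversed one is preferred.
inline bool operator==(const ExprBase& a, const ExprBase& b) { return a.id == b.id; }
inline bool operator!=(const ExprBase& a, const ExprBase& b) { return !(a == b); }
```
This form should behave the same under C++11 through C++20, which is the property the MR is after.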
This is a redo of !982 and !989.",Antonio Sánchez,2022-06-07T17:06:18.241Z,NA,NA | |
993 (https://gitlab.com/libeigen/eigen/-/merge_requests/993),Fix row vs column vector typo in Matrix class tutorial,"In the matrix class tutorial, the role of column- and row-vector is swapped in one of the example code snippets. This is just a typo, but it occurs at a very prominent place and might confuse first-time users of Eigen.",sfalmo,2022-06-07T17:28:20.125Z,NA,NA | |
992 (https://gitlab.com/libeigen/eigen/-/merge_requests/992),AVX512 TRSM Kernels respect EIGEN_NO_MALLOC,"### What does this implement/fix? | |
Addresses `malloc` comments in https://gitlab.com/libeigen/eigen/-/merge_requests/988. Switched `malloc` calls to eigen's `handmade` versions. To respect `EIGEN_NO_MALLOC`, the `trsmKernelL` kernels are disabled if `malloc` is not allowed. The previous struct `trsm_kernels` is split into `trsmKernelL`/`trsmKernelR` to make disabling the left-variant kernels simpler. `EIGEN_USE_AVX512_TRSM_KERNELS` and `EIGEN_ENABLE_AVX512_NOCOPY_TRSM_CUTOFFS` macros are split apart similarly. | |
### Additional information | |
It seems that the general triangular solve driver does not fully support `EIGEN_NO_MALLOC` even with the AVX512 optimizations (GEBP and TRSM) disabled. From some quick testing, I saw that for double precision and moderate sized problems (> M=N=~200), eigen's `check_that_malloc_is_allowed()` fails.",b-shi,2022-06-07T18:53:54.510Z,NA,NA | |
994 (https://gitlab.com/libeigen/eigen/-/merge_requests/994),Mark `index_remap` as `EIGEN_DEVICE_FUNC` in `src/Core/Reshaped.h` (Fixes #2493),"### Reference issue | |
Fixes #2493 | |
### What does this implement/fix? | |
Marking `index_remap` as `EIGEN_DEVICE_FUNC` allows usage of expression reshape in GPU code. | |
### Additional information | |
<!--Any additional information you think is important.-->",Binhao Qin,2022-06-07T20:10:48.427Z,NA,NA | |
995 (https://gitlab.com/libeigen/eigen/-/merge_requests/995),Document DiagonalBase,"### What does this implement/fix? | |
DiagonalBase is undocumented and doesn't show up in the [class lists](https://eigen.tuxfamily.org/dox/group__Core__Module.html). It has the math-y diagonal methods so I think it's important that it has some docs. I tried to base these off of other docs I saw in MatrixBase. | |
I also included some cleanup and clang-formatted the updated portions. | |
The type aliases are kind of ugly, but seem necessary because the return types can be very long and easily go past the end of my screen in the doc website. IDK if there's a better way to handle that. | |
Happy to make any changes!",Arthur,2022-06-08T17:46:32.660Z,NA,NA | |
996 (https://gitlab.com/libeigen/eigen/-/merge_requests/996),[SYCL-Spec] According to [SYCL-2020 spec](...,"According to the [SYCL-2020 spec](https://www.khronos.org/registry/SYCL/specs/sycl-2020/html/sycl-2020.html#sec:naming.kernels), the types used for kernel names must be C++ types and must be forward declarable. | |
Since Eigen kernels are based on expression trees, the expression type is used as the | |
name of the kernel. When an enum is used in the kernel's name, the integration header in SYCL must forward declare the enum. [To do this forward declaration](https://docs.microsoft.com/en-us/cpp/cpp/enumerations-cpp?view=msvc-170), either the unscoped enum must have an underlying type specified or a scoped enum (e.g. `enum class`) must be used. However, using a scoped enum requires each enumerator to be qualified by the enum type (e.g. `EnumCLASS::VALUE`), which can be a more intrusive change. This PR specifies `int` as the underlying type of the enum, which is less intrusive and compliant with the SYCL-2020 spec for kernel names.",Mehdi Goli,2022-06-13T15:52:30.230Z,NA,NA | |
998 (https://gitlab.com/libeigen/eigen/-/merge_requests/998),Fix tanh and erf to use vectorized version for EIGEN_FAST_MATH in VSX.,Fix tanh and erf to use vectorized version for EIGEN_FAST_MATH in VSX.,Chip Kerchner,2022-06-15T16:06:44.934Z,NA,NA | |
997 (https://gitlab.com/libeigen/eigen/-/merge_requests/997),AVX512 TRSM kernels use alloca if EIGEN_NO_MALLOC requested,"### What does this implement/fix? | |
Follow-up PR to address comments in https://gitlab.com/libeigen/eigen/-/merge_requests/992. In that PR, LHS variants of TRSM kernels are disabled if `EIGEN_NO_MALLOC` is requested. In particular the use of `alloca` was suggested [here](https://gitlab.com/libeigen/eigen/-/merge_requests/992#note_974476732) instead of completely disabling the LHS variant AVX512 TRSM kernels. | |
This PR changes the behaviour as follows: | |
- If `EIGEN_NO_MALLOC` is requested: | |
- If max temp workspace size using default blocking sizes is less than `EIGEN_STACK_ALLOCATION_LIMIT` then use `alloca`. | |
- Otherwise, reduce blocking size up to the minimum supported then use `alloca` (perf. is still better than generic trsm kernel, see graph below) | |
- If max temp workspace size using minimum blocking sizes is still larger than `EIGEN_STACK_ALLOCATION_LIMIT` then throw assertion. | |
- If `EIGEN_NO_MALLOC` is not requested we use `handmade_aligned_malloc` | |
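The decision list above can be sketched as a small pure function. This is an illustration only: `Workspace` and `choose_workspace` are hypothetical names, the byte counts are plain parameters, and the gradual reduction of the blocking size is collapsed into a single minimum-blocking step.
```cpp
#include <cassert>
#include <cstddef>

// Hypothetical outcome labels; Eigen does not define this enum.
enum class Workspace { Heap, StackDefaultBlocking, StackMinBlocking, AssertTooLarge };

// Sketch of the decision list. In the MR the limit is
// EIGEN_STACK_ALLOCATION_LIMIT; here it is an ordinary argument.
// How ties with the limit are handled is an assumption of this sketch.
inline Workspace choose_workspace(bool no_malloc, std::size_t default_bytes,
                                  std::size_t min_bytes, std::size_t stack_limit) {
  if (!no_malloc) return Workspace::Heap;  // heap allocation is allowed
  if (default_bytes < stack_limit) return Workspace::StackDefaultBlocking;  // alloca
  if (min_bytes <= stack_limit) return Workspace::StackMinBlocking;  // alloca, smaller blocks
  return Workspace::AssertTooLarge;  // even the minimum does not fit
}
```
For example, with a limit of 1000 bytes, a default workspace of 5000 bytes and a minimum workspace of 500 bytes, the sketch selects the minimum-blocking stack path.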
### Additional information | |
There is a noticeable performance hit (see graph below) when using `alloca` vs `malloc`, so `malloc` is still used if allowed. | |
 | |
- Non-optimized: generic trsm kernels, code-path used when `EIGEN_NO_MALLOC` is requested (behaviour as of https://gitlab.com/libeigen/eigen/-/merge_requests/992) | |
- Min-blocking: AVX512 trsm kernels with minimum required blocking sizes + `alloca`. | |
- Default-blocking: AVX512 trsm kernels with default blocking sizes + `alloca`. | |
- Malloc: Default-blocking: AVX512 trsm kernels with default blocking sizes + `malloc`.",b-shi,2022-06-17T18:05:27.791Z,NA,NA | |
999 (https://gitlab.com/libeigen/eigen/-/merge_requests/999),Use numext::sqrt in Householder.h.,"This is to make it easier to apply to custom types - by using the | |
`numext` version, the user can specialize the function more easily. | |
Otherwise we require a `sqrt` function to be defined prior to including | |
the Eigen headers, which can be awkward and lead to header-include-order | |
issues. | |
Related to #2496",Antonio Sánchez,2022-06-21T16:30:00.070Z,NA,NA | |
1003 (https://gitlab.com/libeigen/eigen/-/merge_requests/1003),Eliminate undef warnings when not compiling for AVX512.,"The original code assumes some macros are defined, when they are only | |
ever defined for AVX512. Here we add some guards to eliminate the | |
warnings.",Antonio Sánchez,2022-06-24T15:10:11.268Z,NA,NA | |
1001 (https://gitlab.com/libeigen/eigen/-/merge_requests/1001),Skip f16/bf16 bessel specializations on AVX512 if unavailable.,"The bessel functions are not available for AVX512 on msvc prior to 1923 (VS 2019) | |
or old versions of gcc (prior to 5.3). This causes a build error since | |
`pexp` is not available for these half->float specializations. | |
Fixes #2499.",Antonio Sánchez,2022-06-24T15:10:37.063Z,NA,NA | |
1002 (https://gitlab.com/libeigen/eigen/-/merge_requests/1002),Fix clang-tidy warnings about function definitions in headers.,"Clang-tidy is generating warnings for this. | |
Also clang-formatted, since the weird indenting made the original hard to read.",Antonio Sánchez,2022-06-24T15:19:57.232Z,NA,NA | |
1000 (https://gitlab.com/libeigen/eigen/-/merge_requests/1000),Better performance for Power10 using more load and store vector pairs for GEMV,Better performance for Power10 using more load and store vector pairs,Chip Kerchner,2022-06-27T18:11:56.165Z,NA,NA | |
947 (https://gitlab.com/libeigen/eigen/-/merge_requests/947),"Add pload_partial, pstore_partial (and unaligned versions), pgather_partial, pscatter_partial, loadPacketPartial and storePacketPartial.","Add ploadN, pstoreN (and unaligned versions), pgatherN, pscatterN, loadPacketN and storePacketN. | |
Useful for: | |
1) memory access - prevent reading/writing past end of data (only elements needed), | |
2) performance - eliminates masking, one Packet vs N scalars, less complexity for edge condition functions/templates (better i-cache), etc. | |
3) partial Packet operations - simplified Packet operations instead of read scalars, merge with Packet, operation, get scalar, write scalars. | |
4) consistent results - reduces variations for scalar vs packet operations",Chip Kerchner,2022-06-27T19:18:01.129Z,NA,NA | |
1005 (https://gitlab.com/libeigen/eigen/-/merge_requests/1005),Enable subtests which use device side malloc since this has been fixed in ROCm 5.2.,"Device side malloc functionality has been fixed in the recently released ROCm 5.2 so reenable the unit tests that use it. | |
/cc @cantonios",Rohit Santhanam,2022-06-29T21:52:08.511Z,NA,NA | |
1007 (https://gitlab.com/libeigen/eigen/-/merge_requests/1007),Fix ODR violations.,"This declaration in a header: | |
``` | |
typedef enum { ... } E; | |
``` | |
creates a new unnamed type every time that header is `#include`d, | |
resulting in an (undiagnosed) ODR violation. | |
With header modules such ODR violations cause build failure with a cryptic | |
error message. | |
Fix this by creating a named type instead. | |
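A small sketch of the named form. `Direction` and its enumerators are hypothetical, since the description elides the real ones; the unnamed form is shown only as a comment.
```cpp
#include <cassert>

// Form described above as problematic, shown only as a comment:
//   typedef enum { ... } E;
// Named form in the spirit of the fix.
enum Direction { Forward, Backward };

inline Direction flip(Direction d) { return d == Forward ? Backward : Forward; }
```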
Courtesy of Paul Pluzhnikov.",Antonio Sánchez,2022-07-09T04:56:37.478Z,NA,NA | |
1006 (https://gitlab.com/libeigen/eigen/-/merge_requests/1006),"AutoDiff depends on Core, so include appropriate header.","Our other top-level headers include their dependencies, so this probably | |
should too.",Antonio Sánchez,2022-07-09T23:57:10.476Z,NA,NA | |
1009 (https://gitlab.com/libeigen/eigen/-/merge_requests/1009),Fix wrong doxygen group usage,Fix wrong usage of doxygen groups,Mathieu Westphal,2022-07-12T15:17:58.052Z,NA,NA | |
1013 (https://gitlab.com/libeigen/eigen/-/merge_requests/1013),Add option to disable avx512 GEBP kernels,"### Reference issue | |
Issue mentioned [here](https://gitlab.com/libeigen/eigen/-/merge_requests/972#note_1022907267). | |
### What does this implement/fix? | |
This is a quick fix that allows enabling or disabling the AVX512 GEBP kernels via a compiler flag. | |
### Additional information | |
Fixed some `undef` warnings when avx512 trsm kernels are enabled, but not using `clang`.",b-shi,2022-07-18T17:59:09.960Z,NA,NA | |
1014 (https://gitlab.com/libeigen/eigen/-/merge_requests/1014),Fix aligned_realloc to call check_that_malloc_is_allowed() if ptr == 0,"### What does this implement/fix? | |
The macros `EIGEN_RUNTIME_NO_MALLOC` and `EIGEN_NO_MALLOC` help developers detect dynamic memory allocations requested by Eigen by triggering an assertion. It is possible to circumvent this behavior by declaring an empty object and subsequently resizing it, for example: `VectorXd x; x.conservativeResize(100);`. `conservativeResize()` internally calls `std::realloc(ptr, new_size)`, which is equivalent to `std::malloc(new_size)` if `ptr == 0`. This is fixed by deferring to `aligned_malloc` if `ptr == 0`. | |
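The loophole and the fix can be sketched in plain C++ without Eigen. This is a minimal illustration under stated assumptions, not the code from this MR: `malloc_allowed`, `rejected_requests`, `checked_malloc` and `checked_realloc` are hypothetical names, and the boolean flag stands in for whatever `check_that_malloc_is_allowed()` actually consults.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-in for the runtime switch behind EIGEN_RUNTIME_NO_MALLOC.
static bool malloc_allowed = true;
// Hypothetical counter of allocation requests made while the switch was off.
static int rejected_requests = 0;

// Checked allocation: refuses (and records the attempt) while allocation is off.
void* checked_malloc(std::size_t size) {
  if (!malloc_allowed) {
    ++rejected_requests;
    return nullptr;
  }
  return std::malloc(size);
}

// Checked reallocation. std::realloc(nullptr, n) behaves like std::malloc(n),
// so a null pointer is routed through the checked allocation path instead of
// silently allocating fresh memory.
void* checked_realloc(void* ptr, std::size_t new_size) {
  assert(new_size > 0);  // zero-size requests are outside this sketch
  if (ptr == nullptr) return checked_malloc(new_size);
  return std::realloc(ptr, new_size);
}
```

With the null-pointer branch removed, the first call on an empty object would reach `std::realloc` directly and allocate without ever touching the check, which is the behavior described above.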
### Additional information | |
https://godbolt.org/z/Pb6xzdzbP",Charles Schlosser,2022-07-19T20:59:07.769Z,NA,NA | |
1015 (https://gitlab.com/libeigen/eigen/-/merge_requests/1015),Disable AVX512 GEMM kernels by default.,"They are causing segfaults in application. Bug reproducer to be | |
investigated.",Antonio Sánchez,2022-07-20T21:22:48.653Z,NA,NA | |
1016 (https://gitlab.com/libeigen/eigen/-/merge_requests/1016),Include immintrin.h header for enscripten.,"The other headers seem to fail. | |
Fixes #2514.",Antonio Sánchez,2022-07-22T02:27:43.397Z,NA,NA | |
978 (https://gitlab.com/libeigen/eigen/-/merge_requests/978),Add Sparse Subset of Matrix Inverse,"### What does this implement/fix? | |
Certain problems require access to specific elements of the inverse of a sparse matrix. Calculating the full inverse usually results in a dense matrix, while the alternative of calculating the inverse one column at a time quickly becomes expensive and approaches the cost of the full inverse if, e.g., the block-diagonal elements of the inverse are needed. | |
This MR implements a method to efficiently calculate a sparse subset of the inverse, corresponding to the dense elements in an LU decomposition, plus any additional elements required. | |
### Additional information | |
The Takahashi method (https://dl.acm.org/doi/10.1145/360680.360704) allows the computation of a sparse subset of the inverse, corresponding to the dense elements of a sparse LU factorization. Once this sparse subset is calculated, any additional elements of the inverse can be calculated at the cost of a single sparse dot product. In this implementation, this was achieved by calculating the inverse value for all dense values in the LU. Thus if a user needs specific values of the inverse, they can insert corresponding zeros into the sparse non-inverted matrix before calculating the inverse. | |
Large sparse matrices, and interesting problems in general, easily become close to rank deficient. Combined with the recursive nature of the algorithm, this makes it very sensitive to numerical errors, particularly on the first elements. To reduce this sensitivity, the dot products were modified to use [Kahan summation](https://en.wikipedia.org/wiki/Kahan_summation_algorithm#The_algorithm) for the accumulator. In testing (and in the test case added) this makes many problems go from intractable to easily solvable. However, Kahan summation comes at a cost: roughly 4x the operations of a simple summation. In addition, [Neumaier summation](https://en.wikipedia.org/wiki/Kahan_summation_algorithm#Further_enhancements) was tested, which only merges back the accumulated error at the end of the summation rather than at each iteration. It was found to be less accurate at a similar speed, so Kahan summation was used. | |
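For readers unfamiliar with the technique, a Kahan-compensated dot product can be sketched as follows. This is a generic dense illustration, not the sparse code added by this MR; the function names and the use of `std::vector` are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Plain dot product: each addition may drop low-order bits for good.
double naive_dot(const std::vector<double>& a, const std::vector<double>& b) {
  assert(a.size() == b.size());
  double sum = 0.0;
  for (std::size_t i = 0; i < a.size(); ++i) sum += a[i] * b[i];
  return sum;
}

// Kahan-compensated dot product: comp carries the bits lost by the previous
// addition and feeds them back into the next term.
double kahan_dot(const std::vector<double>& a, const std::vector<double>& b) {
  assert(a.size() == b.size());
  double sum = 0.0;
  double comp = 0.0;
  for (std::size_t i = 0; i < a.size(); ++i) {
    const double y = a[i] * b[i] - comp;  // term corrected by the carried error
    const double t = sum + y;             // low-order bits of y may be lost here
    comp = (t - sum) - y;                 // recover what was lost
    sum = t;
  }
  return sum;
}
```

Each term costs four additions or subtractions instead of one, which is consistent with the roughly 4x figure quoted above. The compensation only survives if the compiler is not allowed to reassociate floating-point operations, so the sketch assumes options such as `-ffast-math` are not in use.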
Despite the addition of the Kahan summation, the Takahashi method is still very fast, particularly for larger, sparser matrices. A potential follow-up to this would be to make the algorithm block-aware, to be more efficient on block sparse matrices.",Julian Kent,2022-07-28T18:04:41.005Z,NA,NA | |
1021 (https://gitlab.com/libeigen/eigen/-/merge_requests/1021),Updated AccelerateSupport documentation after PR 966.,This fixes the documentation after the changes incorporated in PR 966.,John Mather,2022-07-29T17:42:31.839Z,NA,NA | |
1019 (https://gitlab.com/libeigen/eigen/-/merge_requests/1019),Avoid including <sstream> with EIGEN_NO_IO,"### What does this implement/fix? | |
This allows using the Eigen/Dense header in embedded environments using | |
libc++ built with -DLIBCXX_ENABLE_LOCALIZATION=OFF. Without this change, | |
including Eigen/Dense will result in the following compiler error: | |
`error: ""<locale.h> is not supported since libc++ has been configured without support for localization` | |
I did not include a test for this change since we would have to mock | |
headers for an entire C++ standard library without iostream support. | |
### Additional information | |
I used the following test binary to check that I can build against my custom embedded libc++ | |
```c++ | |
#define EIGEN_NO_IO | |
#define EIGEN_NO_MALLOC | |
#include <eigen3/Eigen/Dense> | |
#include <eigen3/Eigen/Sparse> | |
int main() { | |
  Eigen::Vector3f f; | |
  return f.size(); | |
} | |
```",Alexander Richardson,2022-07-29T18:02:51.996Z,NA,NA | |
1004 (https://gitlab.com/libeigen/eigen/-/merge_requests/1004),Add true determinant to QR and its variants,"### What does this implement/fix? | |
Fixes #471. | |
Implements a `determinant()` method, which returns the true determinant, for `HouseholderQR`, `ColPivHouseholderQR`, `FullPivHouseholderQR`, and `CompleteOrthogonalDecomposition`. | |
Documentation and test code are included. | |
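The sign bookkeeping behind a true determinant can be sketched for the real, unpivoted case. This is a minimal sketch and not the implementation in this MR: `determinant_from_real_qr` is a hypothetical helper, and it assumes that Q is a product of reflectors H_k = I - tau_k v_k v_k^T in which a coefficient tau_k of zero marks an identity (skipped) reflector.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: determinant of a square real matrix A = Q * R, given
// the diagonal of R and the Householder coefficients tau_k of the reflectors
// that make up Q.
double determinant_from_real_qr(const std::vector<double>& diag_r,
                                const std::vector<double>& tau) {
  assert(diag_r.size() == tau.size());
  double det = 1.0;
  for (std::size_t k = 0; k < diag_r.size(); ++k) {
    det *= diag_r[k];               // det(R) is the product of its diagonal
    if (tau[k] != 0.0) det = -det;  // a real reflection has determinant -1
  }
  return det;
}
```

For the 2x2 exchange matrix with rows (0, 1) and (1, 0), one valid factorization has diag(R) = (-1, -1) and tau = (1, 0). The product of the diagonal alone gives +1, while the helper returns -1, the true determinant. Column-pivoted variants would additionally need the sign of the permutation.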
### Additional information | |
To calculate the determinant of the Q matrix, `struct householder_determinant` is added to `namespace internal`. | |
This struct is specialized for real scalar types so that it can exploit the fact that, for real matrices, each reflection negates the determinant.",sjusju,2022-07-29T18:24:15.274Z,NA,NA | |
1011 (https://gitlab.com/libeigen/eigen/-/merge_requests/1011),Improve pblend AVX implementation,"blendv only cares about top bit of a mask, so we can use ints. | |
Removes vcvtdq2ps instruction and makes pblend faster: | |
BM_blend 1.31ns ± 1% 0.98ns ±15% -24.84% (p=0.008 n=5+5)",Ilya Tokar,2022-07-29T18:45:33.813Z,NA,NA | |
1020 (https://gitlab.com/libeigen/eigen/-/merge_requests/1020),Use numext::sqrt in ConjugateGradient.,"This allows us to apply the method for types that have custom sqrt | |
functions, e.g. gcc `__float128`. | |
Fixes #2519.",Antonio Sánchez,2022-07-29T20:17:24.802Z,NA,NA | |
1023 (https://gitlab.com/libeigen/eigen/-/merge_requests/1023),Fix flaky packetmath_1 test.,"The pmsub and pnmadd tests often fail due to cancellation of values. | |
Here we adjust the inputs so that they don't.",Antonio Sánchez,2022-08-02T17:42:45.862Z,NA,NA | |
1010 (https://gitlab.com/libeigen/eigen/-/merge_requests/1010),Fix inner iterator for sparse block.,"The original incorrectly ignores the outer index of the block. | |
It looks like the main two-arg constructor for `InnerIterator` | |
does correctly consider the appropriate `outerIndexPtr`, so | |
we simply forward to that constructor. | |
Fixes #2507.",Antonio Sánchez,2022-08-03T17:26:13.579Z,NA,NA | |
1024 (https://gitlab.com/libeigen/eigen/-/merge_requests/1024),Partial Packet support for GEMM real-only (PowerPC). Also fix compilation warnings & errors for some conditions in new API.,"Partial Packet support for GEMM real-only (PowerPC). Also fix compilation warnings & errors for some conditions in new API. | |
Up to 40% reduction in binary size. Minor performance improvements.",Chip Kerchner,2022-08-03T18:15:21.141Z,NA,NA | |
1025 (https://gitlab.com/libeigen/eigen/-/merge_requests/1025),Fix use of Packet2d type for non-VSX.,Fix use of Packet2d type for non-VSX.,Chip Kerchner,2022-08-03T20:48:13.923Z,NA,NA | |
1028 (https://gitlab.com/libeigen/eigen/-/merge_requests/1028),Fix non-VSX PowerPC build,"Fix non-VSX PowerPC build. | |
This should resolve [issue2513](https://gitlab.com/libeigen/eigen/-/issues/2513)",Chip Kerchner,2022-08-08T18:18:18.299Z,NA,NA | |
1027 (https://gitlab.com/libeigen/eigen/-/merge_requests/1027),Fix code and unit test for a few corner cases in vectorized pow(),"Due to a bad test, a few corner cases were not handled correctly by the vectorized implementation of `pow()`. Specifically, the following two specifications would not be satisfied: | |
1. pow(-∞, exp) returns -∞ if exp is a positive odd integer. | |
2. pow(-0, exp), where exp is a negative odd integer, returns -∞. | |
Instead, the erroneous code returned +∞ in these cases. | |
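The two specifications can be checked against the scalar `std::pow`, which is the kind of reference a vectorized implementation is usually compared to. The sketch below assumes an IEEE 754 (IEC 60559) conforming math library; `pow_corner_cases_hold` is a hypothetical name and is not part of Eigen's test suite.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Scalar reference check for the two corner cases listed above.
bool pow_corner_cases_hold() {
  assert(std::numeric_limits<double>::is_iec559);  // assumption of this sketch
  const double inf = std::numeric_limits<double>::infinity();
  // 1. pow(-inf, exp) is -inf when exp is a positive odd integer.
  const double r1 = std::pow(-inf, 3.0);
  // 2. pow(-0, exp) is -inf when exp is a negative odd integer.
  const double r2 = std::pow(-0.0, -3.0);
  // Contrast: with an even exponent both results are +inf.
  const double r3 = std::pow(-inf, 2.0);
  const double r4 = std::pow(-0.0, -2.0);
  return r1 == -inf && r2 == -inf && r3 == inf && r4 == inf;
}
```

The even-exponent cases are included only to show that the sign of the result depends on the parity of the exponent, which is exactly what a faulty test can fail to exercise.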
Thanks to @chuckyschluz for reporting this.",Rasmus Munk Larsen,2022-08-08T18:48:36.771Z,NA,NA | |
1012 (https://gitlab.com/libeigen/eigen/-/merge_requests/1012),Fix vectorized Jacobi Rotation,"### What does this implement/fix? | |
There seems to be a bug in the `apply_rotation_in_the_plane_selector`, so the packet math vectorized version is never used. (Modern clang and gcc seem to vectorize the default version pretty well FWIW.) | |
This also makes some fixes to get the ""fixed-size"" code path to pass the test suite. | |
### Additional information | |
Just for reference, this seems to be the reason that the packet-math version isn't being used atm: | |
~~~~ | |
const bool Vectorizable = (int(VectorX::Flags) & int(VectorY::Flags) & PacketAccessBit) ... | |
~~~~ | |
`Vectorizable` is always false because `VectorX` and `VectorY` are block expressions, which seem to not set the PacketAccessBit at all. Doing something like checking the `Flags` of the block `evaluator` instead seems to work better: | |
~~~~ | |
const bool Vectorizable = (int(evaluator<VectorX>::Flags) & int(evaluator<VectorY>::Flags) & PacketAccessBit) ... | |
~~~~",Arthur,2022-08-08T19:29:57.698Z,NA,NA | |
1026 (https://gitlab.com/libeigen/eigen/-/merge_requests/1026),Vectorize the sign operator in Eigen.,"This fixes an old TODO to vectorize `scalar_sign_op` for real types. | |
Measured speedup on Intel(R) Xeon(R) Gold 6154 CPU @ 3.00GHz: | |
(ctype == `std::complex<float>`, cdtype == `std::complex<double>`) | |
``` | |
--march=nehalem (SSE*) | |
before: | |
BM_eigen_sign_double/8_mean 6.44 6.41 109201388 | |
BM_eigen_sign_double/64_mean 34.0 34.0 20234383 | |
BM_eigen_sign_double/512_mean 260 260 2666898 | |
BM_eigen_sign_double/2k_mean 1016 1016 687992 | |
BM_eigen_sign_float/8_mean 9.71 9.71 72262781 | |
BM_eigen_sign_float/64_mean 21.7 21.7 32433773 | |
BM_eigen_sign_float/512_mean 117 117 5997151 | |
BM_eigen_sign_float/2k_mean 449 448 1200000 | |
BM_eigen_sign_ctype/8_mean 57.2 57.2 12000000 | |
BM_eigen_sign_ctype/64_mean 509 509 1200000 | |
BM_eigen_sign_ctype/512_mean 4095 4086 167682 | |
BM_eigen_sign_ctype/2k_mean 16289 16288 42865 | |
BM_eigen_sign_cdtype/8_mean 75.5 75.2 9199233 | |
BM_eigen_sign_cdtype/64_mean 722 722 982228 | |
BM_eigen_sign_cdtype/512_mean 6704 6682 105444 | |
BM_eigen_sign_cdtype/2k_mean 27827 27832 24978 | |
after: | |
BM_eigen_sign_double/8_mean 5.74 5.74 120000000 | |
BM_eigen_sign_double/64_mean 31.7 31.7 21927020 | |
BM_eigen_sign_double/512_mean 252 252 2706623 | |
BM_eigen_sign_double/2k_mean 982 982 712900 | |
BM_eigen_sign_float/8_mean 9.68 9.68 72326995 *not vectorized | |
BM_eigen_sign_float/64_mean 21.4 21.4 32621349 *not vectorized | |
BM_eigen_sign_float/512_mean 116 116 6013897 *not vectorized | |
BM_eigen_sign_float/2k_mean 447 447 1200000 *not vectorized | |
BM_eigen_sign_ctype/8_mean 15.6 15.6 45293266 | |
BM_eigen_sign_ctype/64_mean 88.6 88.6 7900154 | |
BM_eigen_sign_ctype/512_mean 680 680 1031326 | |
BM_eigen_sign_ctype/2k_mean 2689 2691 257909 | |
BM_eigen_sign_cdtype/8_mean 28.4 28.4 24440221 | |
BM_eigen_sign_cdtype/64_mean 257 257 2711124 | |
BM_eigen_sign_cdtype/512_mean 2077 2077 336557 | |
BM_eigen_sign_cdtype/2k_mean 8312 8313 84323 | |
--march=skylake (AVX2) | |
before: | |
BM_eigen_sign_double/8_mean 12.0 12.0 58310104 | |
BM_eigen_sign_double/64_mean 38.5 38.5 17869356 | |
BM_eigen_sign_double/512_mean 250 249 2775578 | |
BM_eigen_sign_double/2k_mean 996 996 714018 | |
BM_eigen_sign_float/8_mean 12.7 12.7 55495035 | |
BM_eigen_sign_float/64_mean 32.4 32.4 21378522 | |
BM_eigen_sign_float/512_mean 122 122 5698877 | |
BM_eigen_sign_float/2k_mean 414 413 1618011 | |
BM_eigen_sign_ctype/8_mean 58.2 58.2 11965594 | |
BM_eigen_sign_ctype/64_mean 518 518 1200000 | |
BM_eigen_sign_ctype/512_mean 4080 4063 169364 | |
BM_eigen_sign_ctype/2k_mean 16333 16332 42414 | |
BM_eigen_sign_cdtype/8_mean 79.7 79.7 8728939 | |
BM_eigen_sign_cdtype/64_mean 718 717 971102 | |
BM_eigen_sign_cdtype/512_mean 6705 6700 105340 | |
BM_eigen_sign_cdtype/2k_mean 28451 28447 24539 | |
after: | |
BM_eigen_sign_double/8_mean 10.2 10.2 68883377 | |
BM_eigen_sign_double/64_mean 27.8 27.8 25028814 | |
BM_eigen_sign_double/512_mean 169 169 4134488 | |
BM_eigen_sign_double/2k_mean 631 631 1102968 | |
BM_eigen_sign_float/8_mean 16.4 16.4 42694836 | |
BM_eigen_sign_float/64_mean 21.1 21.1 33109093 | |
BM_eigen_sign_float/512_mean 96.9 96.9 7209604 | |
BM_eigen_sign_float/2k_mean 326 326 2110066 | |
BM_eigen_sign_ctype/8_mean 27.7 27.7 25070458 | |
BM_eigen_sign_ctype/64_mean 96.1 96.1 7270548 | |
BM_eigen_sign_ctype/512_mean 634 634 1102494 | |
BM_eigen_sign_ctype/2k_mean 2467 2467 280365 | |
BM_eigen_sign_cdtype/8_mean 28.3 28.3 24573556 | |
BM_eigen_sign_cdtype/64_mean 241 241 2869555 | |
BM_eigen_sign_cdtype/512_mean 1946 1946 358788 | |
BM_eigen_sign_cdtype/2k_mean 7793 7793 89187 | |
--march=skylake-avx512 (AVX512) | |
before: | |
BM_eigen_sign_double/8_mean 11.5 11.5 61014691 | |
BM_eigen_sign_double/64_mean 41.5 41.5 16519411 | |
BM_eigen_sign_double/512_mean 285 285 2438747 | |
BM_eigen_sign_double/2k_mean 1140 1140 598276 | |
BM_eigen_sign_float/8_mean 11.5 11.5 61125484 | |
BM_eigen_sign_float/64_mean 29.4 29.4 23598559 | |
BM_eigen_sign_float/512_mean 103 103 6759891 | |
BM_eigen_sign_float/2k_mean 371 371 1851622 | |
BM_eigen_sign_ctype/8_mean 58.4 58.4 11787688 | |
BM_eigen_sign_ctype/64_mean 509 509 1200000 | |
BM_eigen_sign_ctype/512_mean 4091 4093 168895 | |
BM_eigen_sign_ctype/2k_mean 16314 16295 42802 | |
BM_eigen_sign_cdtype/8_mean 79.6 79.6 8605987 | |
BM_eigen_sign_cdtype/64_mean 717 717 965781 | |
BM_eigen_sign_cdtype/512_mean 6831 6828 101197 | |
BM_eigen_sign_cdtype/2k_mean 28749 28743 24183 | |
after: | |
BM_eigen_sign_double/8_mean 16.4 16.4 42809039 | |
BM_eigen_sign_double/64_mean 23.2 23.2 30221446 | |
BM_eigen_sign_double/512_mean 74.3 74.3 9397251 | |
BM_eigen_sign_double/2k_mean 258 259 2604931 | |
BM_eigen_sign_float/8_mean 16.5 16.5 42515449 | |
BM_eigen_sign_float/64_mean 31.3 31.3 22136770 | |
BM_eigen_sign_float/512_mean 60.9 60.9 11516560 | |
BM_eigen_sign_float/2k_mean 153 153 4570941 | |
BM_eigen_sign_ctype/8_mean 62.7 62.7 10956854 | |
BM_eigen_sign_ctype/64_mean 121 121 5783435 | |
BM_eigen_sign_ctype/512_mean 501 501 1200000 | |
BM_eigen_sign_ctype/2k_mean 1835 1835 379651 | |
BM_eigen_sign_cdtype/8_mean 57.7 57.8 11825262 | |
BM_eigen_sign_cdtype/64_mean 203 203 3418327 | |
BM_eigen_sign_cdtype/512_mean 1392 1392 497932 | |
BM_eigen_sign_cdtype/2k_mean 5406 5406 120000 | |
```",Rasmus Munk Larsen,2022-08-09T19:54:58.036Z,NA,NA | |
1030 (https://gitlab.com/libeigen/eigen/-/merge_requests/1030),Don't double-define Half functions on aarch64,"### What does this implement/fix? | |
This change fixes a compilation error that occurs when compiling for GPU (CUDA) with an aarch64 (aka Arm64) host. Two sets of Half functions are defined, which conflict with each other. To fix this, the change specifically disables the aarch64 versions during the GPU compile phase.",Lexi Bromfield,2022-08-09T20:00:34.784Z,NA,NA | |
1031 (https://gitlab.com/libeigen/eigen/-/merge_requests/1031),Eliminate bool bitwise warnings.,These previously triggered `-Wbitwise-instead-of-logical` warnings.,Antonio Sánchez,2022-08-09T22:42:32.300Z,NA,NA | |
1032 (https://gitlab.com/libeigen/eigen/-/merge_requests/1032),"Disable bad ""deprecated warning"" edge-case in BDCSVD","### What does this implement/fix? | |
This fixes a couple of invalid deprecation warnings in BDCSVD. | |
If no computationOptions are set with the runtime parameters, BDCSVD dispatches to JacobiSVD at runtime. BDCSVD has to pass a ``0`` as the computationOptions to JacobiSVD's deprecated constructor. Unfortunately, I don't think there is a simple way to naturally prevent this in the implementation. So, I just disable the warning when bdcsvd tries to call jacobisvd's deprecated constructor. :shrug: | |
~~~~ | |
#include <Eigen/Core> | |
#include <Eigen/SVD> | |
int main() { | |
  Eigen::MatrixXd m = Eigen::MatrixXd::Random(100, 100); | |
  // Valid input, to get the singular values, but shows deprecation warning for JacobiSVD! | |
  Eigen::BDCSVD<Eigen::MatrixXd> svd1(m); | |
  // Deprecated warning for both BDCSVD and JacobiSVD. | |
  Eigen::BDCSVD<Eigen::MatrixXd> svd2(m, Eigen::ComputeFullU); | |
  return 0; | |
} | |
~~~~ | |
I borrowed the code to disable the warning from [highway](https://github.com/google/highway/blob/33b43877277c438d25247e6624fe6a30616b44ae/hwy/base.h#L48). This should work for the compiler versions Eigen supports.",Arthur,2022-08-11T18:43:32.535Z,NA,NA | |
1033 (https://gitlab.com/libeigen/eigen/-/merge_requests/1033),[SYCL] Fix some SYCL tests,"### What does this implement/fix? | |
Sigmoid failed in tensor_math because of the specializations in PacketMath. | |
The binary logic operators were casting floating types with rounding but they | |
are meant to do bitwise casting. The generic implementations of and, or, xor, | |
andnot are working as expected with SYCL. | |
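The difference between a bitwise cast and a value-converting cast can be illustrated on scalars. This is a hedged sketch, not the SYCL packet code: all function names are hypothetical, and it assumes a 32-bit `float`.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Copy the 32 bits of a float into an integer without changing them.
std::uint32_t to_bits(float x) {
  std::uint32_t bits;
  std::memcpy(&bits, &x, sizeof(bits));
  return bits;
}

// Copy 32 raw bits back into a float.
float from_bits(std::uint32_t bits) {
  float x;
  std::memcpy(&x, &bits, sizeof(x));
  return x;
}

// AND on the raw representation: what a packet-level and is meant to compute.
float bitwise_and(float a, float b) {
  return from_bits(to_bits(a) & to_bits(b));
}

// AND after converting the values to integers, which changes the operands.
// Only meaningful for non-negative inputs that fit in 32 bits.
float value_cast_and(float a, float b) {
  assert(a >= 0.0f && b >= 0.0f);
  return static_cast<float>(static_cast<std::uint32_t>(a) &
                            static_cast<std::uint32_t>(b));
}
```

For example, `bitwise_and(1.5f, 1.5f)` preserves 1.5, whereas `value_cast_and(1.5f, 1.5f)` yields 1.0 because the fractional part is lost before the AND is applied. Masking tricks such as clearing the sign bit only work with the bitwise variant.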
tensor_builtin test could fail for certain seeds because of log | |
producing too small outputs for the test precision. | |
tensor_random could fail for certain seeds. The neighborhood check is | |
removed as it is not a safe way to check for the distribution. | |
This matches the behavior of the CUDA test.",Romain Biessy,2022-08-16T17:37:54.733Z,NA,NA | |
1035 (https://gitlab.com/libeigen/eigen/-/merge_requests/1035),Removed unnecessary checks for FP16C,"### What does this implement/fix? | |
The AVX512 packetmath unnecessarily checks for the presence of FP16C when using the intrinsics `_mm512_cvtps_ph` and `_mm512_cvtph_ps` - these are AVX512F intrinsics, and do not need this flag to be set. Currently, if -mfp16c is not set, a scalar typecast is used for float2half and half2float. | |
### Additional information | |
Checking various versions of GCC, clang, MSVC, all seem to compile the intrinsics fine with only AVX512F enabled. This makes a massive performance difference if someone has set -mavx512f but not -mfp16c, as it avoids very slow scalar typecasts.",Matthew Sterrett,2022-08-16T19:09:47.278Z,NA,NA | |
1029 (https://gitlab.com/libeigen/eigen/-/merge_requests/1029),add fixed power unary operation,"### Reference issue | |
\#1425 | |
### What does this implement/fix? | |
Adds a unary expression for performing a coefficientwise real-valued power operation on an array. If the exponent argument is an integer, the operation is performed using repeated squaring. Otherwise, the operation defers to Eigen's existing vectorized pow routine with a fixed exponent. Both cases are IEEE compliant with respect to error handling. | |
This MR replaces !1022 as the functionality is quite different, though the result is the same. | |
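Repeated squaring for an integer exponent can be sketched on scalars as follows. This is a generic illustration rather than the vectorized code in this MR; `pow_by_squaring` is a hypothetical name, and handling negative exponents through a final reciprocal is a simplification whose rounding and overflow behavior can differ from a correctly rounded `pow`.

```cpp
#include <cassert>

// Raise base to an integer power using O(log |exponent|) multiplications.
double pow_by_squaring(double base, int exponent) {
  // Work on the magnitude as unsigned so that the most negative int is safe.
  unsigned int n = exponent < 0 ? 0u - static_cast<unsigned int>(exponent)
                                : static_cast<unsigned int>(exponent);
  double result = 1.0;
  double square = base;  // holds base^(2^i) at iteration i
  while (n != 0) {
    if (n & 1u) result *= square;  // this bit of the exponent is set
    n >>= 1;
    if (n != 0) square *= square;  // skip the squaring that would go unused
  }
  return exponent < 0 ? 1.0 / result : result;
}

// Small usage check with exactly representable results.
void pow_by_squaring_examples() {
  assert(pow_by_squaring(2.0, 10) == 1024.0);
  assert(pow_by_squaring(3.0, 0) == 1.0);
  assert(pow_by_squaring(2.0, -2) == 0.25);
}
```

An exponent of 4 needs only two multiplications of the running square plus one into the result, which gives an intuition for why integer exponents can be much cheaper than a general `pow`.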
### Additional information | |
Some benchmark measurements are here: https://gitlab.com/libeigen/eigen/-/snippets/2388067. The speedups for integer exponents are very impressive. | |
The benchmarks measure the following simple functions for float and double arrays of varying sizes: | |
``` | |
template<typename T> | |
void eigen_powquarter(const Eigen::Matrix<T, Eigen::Dynamic, 1>& v, | |
                      Eigen::Matrix<T, Eigen::Dynamic, 1>* u) { | |
  *u = v.array().pow(T(0.25)); | |
} | |
template<typename T> | |
void eigen_pow4(const Eigen::Matrix<T, Eigen::Dynamic, 1>& v, | |
                Eigen::Matrix<T, Eigen::Dynamic, 1>* u) { | |
  *u = v.array().pow(int(4)); | |
} | |
template<typename T> | |
void eigen_pow4float(const Eigen::Matrix<T, Eigen::Dynamic, 1>& v, | |
                     Eigen::Matrix<T, Eigen::Dynamic, 1>* u) { | |
  *u = v.array().pow(T(4)); | |
} | |
```",Charles Schlosser,2022-08-16T21:32:36.602Z,NA,NA | |
1017 (https://gitlab.com/libeigen/eigen/-/merge_requests/1017),Add support for AVX512-FP16 for vectorizing half precision math,"### What does this implement/fix? | |
This merge request takes advantage of the AVX512-FP16 instruction set to vectorize half floating-point operations. It implements Packet32h and replaces many packet operations for the pre-existing Packet16h and Packet8h. The pre-existing AVX implementations used typecasting to float and back, so this could improve performance significantly by avoiding these intermediate typecasts. | |
### Additional information | |
I've run all tests against this, and it passes all of them consistently except for packetmath_13, which checks half precision packet math. The specific tests that fail are the fused math operations, particularly pmsub. From my testing, this seems to be caused by the lower precision of the AVX512-FP16 fused intrinsics compared to the reference calculations. I'm not sure what the best thing to do here is; it would be nice to get some feedback on how to handle this. | |
On my experimental setup, I got a consistent 8-9 times improvement in the performance of the bench_gemm benchmark changing from using AVX512F only to using AVX512-FP16, with OpenMP disabled. This performance gain was consistent for matrices varying from size 256x256 - 4096x4096. | |
Specifically, the speedup was about 8.4x for 256x256, 8.3x for 512x512, 8.8x for 1024x1024, 8.7x for 2048x2048, and 9.4x for 4096x4096.",Matthew Sterrett,2022-08-17T18:15:22.710Z,NA,NA | |
1034 (https://gitlab.com/libeigen/eigen/-/merge_requests/1034),Use proper double word division algorithm for pow<double>. Gives 11-15% speedup.,"### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
Fix TODO in `accurate_log2<double>` used by `pow<double>`. This change replaces the division via an approximate reciprocal with Algorithm 15 from: | |
""Tight and rigourous error bounds for basic building blocks of double-word arithmetic"", | |
Joldes, Muller, & Popescu, 2017. https://hal.archives-ouvertes.fr/hal-01351529 | |
This speeds up `pow<double>` by 11-15%. Benchmark measurements: https://gitlab.com/libeigen/eigen/-/snippets/2390173 | |
Comparison against MPFR shows no change in accuracy. The algorithm still returns faithfully rounded results when the result is normal. | |
Thanks to David Majnemer for pointing me to the paper. | |
### Additional information | |
<!--Any additional information you think is important.-->",Rasmus Munk Larsen,2022-08-17T18:36:24.370Z,NA,NA | |
1037 (https://gitlab.com/libeigen/eigen/-/merge_requests/1037),Protect new pblend implementation with EIGEN_VECTORIZE_AVX2,"### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
!1011 broke the AVX build without AVX2. | |
<!--Please explain your changes.--> | |
### Additional information | |
<!--Any additional information you think is important.-->",Rasmus Munk Larsen,2022-08-22T18:28:04.190Z,NA,NA | |
1039 (https://gitlab.com/libeigen/eigen/-/merge_requests/1039),"Fix psign for unsigned integer types, such as bool.","### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
### Additional information | |
<!--Any additional information you think is important.--> | |
Fixes bug in !1026 where `psign<bool>` would try to return bool(-1).",Rasmus Munk Larsen,2022-08-22T20:19:36.046Z,NA,NA | |
1036 (https://gitlab.com/libeigen/eigen/-/merge_requests/1036),Sparse Core: Replace malloc/free with conditional_aligned,"### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
The sparse classes currently use a mix of `std::malloc`, `std::realloc`, and `std::free` for memory management instead of the `aligned_malloc` family of functions defined in `Memory.h`. Other than consistency with the dense classes, this will enable users to track heap allocations with `#define EIGEN_RUNTIME_NO_MALLOC` and related mechanisms. Also, there may be some advantage to ensuring that the sparse matrices use aligned storage for vectorization. | |
### Additional information | |
<!--Any additional information you think is important.--> | |
Benchmark measurement: | |
``` | |
double sse no align Total duration: 21531, per iteration: 2153.1 | |
double sse align Total duration: 21691, per iteration: 2169.1 | |
double avx no align Total duration: 21548, per iteration: 2154.8 | |
double avx align Total duration: 21448, per iteration: 2144.8 | |
double old sse Total duration: 95201, per iteration: 9520.1 | |
double old avx Total duration: 97138, per iteration: 9713.8 | |
``` | |
I ran this test 10 times with `size = 1000`. | |
``` | |
template<typename Scalar, bool Align> | |
void testSparseAlign(Index size) | |
{ | |
SparseMatrix<Scalar, ColMajor, int, Align> A(size, size); | |
for (int i = 0; i < size; i++) | |
for (int j = 0; j <= i; j++) | |
A.coeffRef(size-i-1, j) = i*j; | |
std::cout << A.nonZeros() << ""\n""; | |
std::cout << A.data().allocatedSize() << ""\n""; | |
std::cout << A.sum() << ""\n""; | |
} | |
```",Charles Schlosser,2022-08-23T21:44:23.506Z,NA,NA | |
1044 (https://gitlab.com/libeigen/eigen/-/merge_requests/1044),Add missing ptr in realloc call.,Introduced in !1036,Antonio Sánchez,2022-08-25T05:05:17.064Z,NA,NA | |
1045 (https://gitlab.com/libeigen/eigen/-/merge_requests/1045),Fix GeneralizedEigenSolver::info() and Asserts,"### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
#2524 | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
Small fix for ``GeneralizedEigenSolver::info()`` :-) | |
Basically, `m_valuesOkay` was used to check whether the decomposition was initialized, but it was only set to true when `m_realQZ` was successful, so `info()` would raise an assert when the decomposition failed. IMO `info()` should always be accessible once the decomposition is initialized (and I think that's how all the other decompositions work). | |
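A minimal sketch of the pattern described above, using a hypothetical `ToySolver` class (an illustration only, not Eigen's actual `GeneralizedEigenSolver`): one flag records that `compute()` ran, and `info()` separately reports whether it succeeded.

```cpp
#include <cassert>

// Hypothetical status type standing in for Eigen's ComputationInfo.
enum class Status { Success, NoConvergence };

class ToySolver {
 public:
  // Pretend to run a decomposition; `converges` decides the outcome.
  void compute(bool converges) {
    m_info = converges ? Status::Success : Status::NoConvergence;
    m_isInitialized = true;  // set even when the decomposition fails
  }
  // Accessible as soon as compute() has run, whether or not it succeeded.
  Status info() const {
    assert(m_isInitialized);
    return m_info;
  }
  // Results are only meaningful after a successful decomposition.
  bool hasValues() const { return m_isInitialized && m_info == Status::Success; }

 private:
  bool m_isInitialized = false;
  Status m_info = Status::NoConvergence;
};
```

With this split, a caller can always ask `info()` after `compute()` and only read results when it reports success.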
This just replaces `m_valuesOkay` with `m_isInitialized` and replaces spots where `m_valuesOkay` was used with ``info() == Success``. It also changes some of the error messages to make them more accurate (i.e., they now say the decomposition failed, rather than always just saying ""uninitialized"").",Arthur,2022-08-25T22:05:05.397Z,NA,NA | |
1042 (https://gitlab.com/libeigen/eigen/-/merge_requests/1042),Avoid undefined behavior in array_cwise test due to signed integer overflow,"### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
### Additional information | |
<!--Any additional information you think is important.-->",Rasmus Munk Larsen,2022-08-26T16:19:04.541Z,NA,NA | |
1040 (https://gitlab.com/libeigen/eigen/-/merge_requests/1040),"Specialize psign<Packet8i> for AVX2, don't vectorize psign<bool>.","### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
### Additional information | |
<!--Any additional information you think is important.--> | |
Speedup for `psign<Packet8i>` with AVX2 enabled: | |
``` | |
name old cpu/op new cpu/op delta | |
BM_eigen_sign_int/1 2.73ns ± 0% 0.56ns ± 1% -79.45% (p=0.000 n=52+56) | |
BM_eigen_sign_int/8 6.81ns ± 1% 5.33ns ± 0% -21.75% (p=0.000 n=49+55) | |
BM_eigen_sign_int/64 16.1ns ± 1% 7.9ns ± 0% -50.95% (p=0.000 n=52+57) | |
BM_eigen_sign_int/512 58.0ns ± 0% 28.2ns ± 0% -51.40% (p=0.000 n=58+49) | |
BM_eigen_sign_int/4k 405ns ± 1% 198ns ± 1% -51.05% (p=0.000 n=46+60) | |
BM_eigen_sign_int/32k 3.83µs ± 1% 2.46µs ± 1% -35.76% (p=0.000 n=42+54) | |
BM_eigen_sign_int/256k 78.5µs ± 2% 78.5µs ± 1% ~ (p=0.369 n=59+51) | |
BM_eigen_sign_int/1M 315µs ± 2% 315µs ± 1% ~ (p=0.983 n=32+30) | |
```",Rasmus Munk Larsen,2022-08-26T17:02:38.413Z,NA,NA | |
1046 (https://gitlab.com/libeigen/eigen/-/merge_requests/1046),re-enable pow for complex types,"### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2022-08-29T16:06:08.524Z,NA,NA | |
1043 (https://gitlab.com/libeigen/eigen/-/merge_requests/1043),Vectorize pow for integer base / exponent types,"### Reference issue | |
#2522 | |
### What does this implement/fix? | |
Vectorized `pow` with handling for negative exponents. e.g. `ArrayXi x, y; y = x.pow(3);` | |
In general, integer division by zero and signed integer overflow are undefined behavior. Under these conditions, output may vary from `std::pow` (or Eigen's `square` and `cube` for that matter) depending on the implementation. For example: msvc and clang always (?) return `lowest()` for overflow and underflow for signed types -- while gcc returns `highest()` or `lowest()` depending on the arguments and data types. | |
Unsigned integers do not overflow per standard. | |
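To illustrate the unsigned case, here is a minimal scalar sketch of integer exponentiation by repeated squaring. The helper name `ipow_u32` is hypothetical and this is not Eigen's vectorized implementation; it only shows that unsigned results wrap modulo 2^32 instead of invoking undefined behavior.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of Eigen): integer power by repeated squaring.
// Unsigned arithmetic wraps modulo 2^32, so overflow is well defined here,
// unlike for signed types.
inline std::uint32_t ipow_u32(std::uint32_t base, std::uint32_t exponent) {
  std::uint32_t result = 1;
  while (exponent != 0) {
    if (exponent & 1u) result *= base;  // multiply in the current exponent bit
    base *= base;                       // square for the next bit
    exponent >>= 1;
  }
  return result;
}
```

For example, `ipow_u32(256u, 4u)` evaluates 2^32 modulo 2^32, which is 0, whereas the same computation in `int32_t` would be a signed overflow.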
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2022-08-29T19:23:55.104Z,NA,NA | |
1038 (https://gitlab.com/libeigen/eigen/-/merge_requests/1038),"Vectorize acos, asin, and atan for float.","### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
This change vectorizes the `acos`, `asin`, and `atan` operators in Eigen. | |
### Additional information | |
**Accuracy:** Exhaustive testing for all float arguments in [-1:1] shows that this implementation | |
is accurate to 2.6 ulps for `pacos`, and 3.8 ulps for `pasin`. Maximum relative error for `patan` is 2 ulps. | |
**Speed**: See: libeigen/eigen$2393114 | |
Speedup for 4k element vector: | |
| | SSE | AVX | AVX512 | | |
| ---------- | ----- | ----- | ------ | | |
| `asin()` | 6x | 15.9x | 18.6x | | |
| `acos()` | 11.4x | 29.5x | 29.8x | | |
| `atan()` | 5.5x | 9.3x | 17x | | |
<!--Any additional information you think is important.-->",Rasmus Munk Larsen,2022-08-29T19:49:33.827Z,NA,NA | |
1048 (https://gitlab.com/libeigen/eigen/-/merge_requests/1048),Fix some test build errors in new unary pow.,"For `real^complex`, we need the return type to be the output of | |
`ScalarBinaryOpTraits`. For `complex<real>^int`, we need the exponent to be | |
promoted to real for use in `ScalarBinaryOpTraits`. | |
Also removed some `const` qualifiers on return types, since these are often overly restrictive.",Antonio Sánchez,2022-08-30T17:24:15.511Z,NA,NA | |
1049 (https://gitlab.com/libeigen/eigen/-/merge_requests/1049),2 typos fix in the 3rd table.,"### What does this implement/fix? | |
It seems to me that there are two typos in the third table of the slicing tutorial page. It should read matrix `A`, not vector `v`.",Gilles Aouizerate,2022-08-31T22:45:04.405Z,NA,NA | |
1051 (https://gitlab.com/libeigen/eigen/-/merge_requests/1051),Fix mixingtypes tests.,"The new unary pow op no longer calls the binary op plugin, so some of | |
these tests were failing. Modified the test to account for this... | |
I think this is better than forcing the unary op to call a binary | |
plugin.",Antonio Sánchez,2022-09-02T15:30:13.933Z,NA,NA | |
1052 (https://gitlab.com/libeigen/eigen/-/merge_requests/1052),Fix some cmake issues.,"We shouldn't be building benchmarks by default on people's systems. | |
Also fixed some test dependency issues if sparse libraries are detected. | |
Fixes #2529.",Antonio Sánchez,2022-09-02T16:43:15.460Z,NA,NA | |
1050 (https://gitlab.com/libeigen/eigen/-/merge_requests/1050),Add asserts for index-out-of-bounds in IndexedView.,Fixes #2530.,Antonio Sánchez,2022-09-02T17:28:03.803Z,NA,NA | |
1053 (https://gitlab.com/libeigen/eigen/-/merge_requests/1053),fixed msvc compilation error in GeneralizedEigenSolver.h,"### Reference issue | |
#2532 | |
### What does this implement/fix? | |
Fixed a compilation error happening at least on MSVC due to a missing semi-colon.",Michael Palomas,2022-09-05T15:55:32.954Z,NA,NA | |
1054 (https://gitlab.com/libeigen/eigen/-/merge_requests/1054),fix typo in doc/TutorialSparse.dox,"### What does this implement/fix? | |
fix typo in doc/TutorialSparse.dox (2 columns were inverted)",Gilles Aouizerate,2022-09-06T17:58:00.432Z,NA,NA | |
1055 (https://gitlab.com/libeigen/eigen/-/merge_requests/1055),Call check_that_malloc_is_allowed() in aligned_realloc(),"### Reference issue | |
https://gitlab.com/libeigen/eigen/-/issues/2526 | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
This prevents reallocs when EIGEN_RUNTIME_NO_MALLOC is defined. The check is not needed with EIGEN_NO_MALLOC as no memory could be malloc'd in the first place. We only call the assert if result != ptr which means an allocation happened. | |
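The idea can be sketched roughly as follows. The names `g_malloc_allowed` and `checked_realloc` are hypothetical stand-ins, not Eigen's actual functions, and the sketch reports a violation through a flag where the real code would assert.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical stand-in for Eigen's runtime 'is malloc allowed' switch.
static bool g_malloc_allowed = true;

// Reallocates `ptr` and reports through `violated` whether a new allocation
// happened while allocations were disallowed. The real code asserts instead.
inline void* checked_realloc(void* ptr, std::size_t new_size, bool* violated) {
  // Record the old address as an integer before realloc may invalidate it.
  const std::uintptr_t old_address = reinterpret_cast<std::uintptr_t>(ptr);
  void* result = std::realloc(ptr, new_size);
  // Only a changed address means that fresh memory was handed out.
  const bool moved = reinterpret_cast<std::uintptr_t>(result) != old_address;
  *violated = moved && !g_malloc_allowed;
  return result;
}
```

A resize that the allocator can satisfy in place leaves the address unchanged and is therefore not flagged.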
<!--Please explain your changes.--> | |
### Additional information | |
<!--Any additional information you think is important.-->",Florian Richer,2022-09-06T18:18:57.788Z,NA,NA | |
1056 (https://gitlab.com/libeigen/eigen/-/merge_requests/1056),Reduce compiler warnings for tests.,NA,Antonio Sánchez,2022-09-06T18:21:14.674Z,NA,NA | |
1057 (https://gitlab.com/libeigen/eigen/-/merge_requests/1057),Adjust overflow threshold bound for pow tests.,"The original bound was too weak and resulted in some overflows, causing | |
our CI pipelines to fail (https://gitlab.com/libeigen/eigen_ci_cross_testing/-/pipelines/631839537). | |
In particular, it allowed `int32_t` `pow(256, 4)`, which overflowed to 0 | |
for integers, and to an inexact result for float. Here we adjust the | |
bounds.",Antonio Sánchez,2022-09-06T19:53:29.762Z,NA,NA | |
899 (https://gitlab.com/libeigen/eigen/-/merge_requests/899),"Add constexpr, test for C++14 constexpr.","### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
This is the next split-off from !881. It adds the C++14 parts of the patch, which allow `constexpr` initialization of `Map`s from static `constexpr` memory and some of the basic operations, including (to my surprise; my memory is a sieve) addition and subtraction. | |
I included a testcase which is the testcase in !881 with the C++20 parts reduced to empty placeholders. | |
Edit: this is done. ~~I realize now that I forgot to follow up on @cantonios comment re destructors (""Whatever we do for NVCC, we probably also need to do for HIP as well.""), so I'll have to leave this as a draft until I can review what that implies. Edit: I think `#if defined(EIGEN_GPUCC)` should be the right thing.~~ | |
Compared to the version in !881 this re-instates (I think) all the inline keywords that I had removed when adding `constexpr`. | |
### Additional information | |
<!--Any additional information you think is important.-->",Tobias Schlüter,2022-09-07T04:02:46.824Z,NA,NA | |
1058 (https://gitlab.com/libeigen/eigen/-/merge_requests/1058),Add missing comparison operators for GPU packets.,"### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
Adds missing comparison operators for GPUs. Without these, the recent vectorized version of psign (!1026) does not build with CUDA. | |
### Additional information | |
<!--Any additional information you think is important.-->",Rasmus Munk Larsen,2022-09-07T21:34:16.479Z,NA,NA | |
1061 (https://gitlab.com/libeigen/eigen/-/merge_requests/1061),Tweak bound for pow to account for floating-point types.,"Floating-point types can only represent integers up to 2^digits | |
exactly, so we need to adjust the corrected bound from !1057. | |
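As a small illustration of that limit, assuming IEEE 754 `float` and `double` (the helper `max_exact_integer` is hypothetical and not taken from the Eigen tests):

```cpp
#include <cassert>
#include <limits>

// Hypothetical helper: the bound N = 2^digits such that every integer in
// [0, N] is exactly representable in T. Intended for float and double only,
// where digits (24 and 53) is small enough for the shift to fit in long long.
template <typename T>
constexpr long long max_exact_integer() {
  return 1LL << std::numeric_limits<T>::digits;
}
```

For `float` this gives 2^24 = 16777216; the next integer, 16777217, is not representable and rounds to a neighbouring value.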
This fixes an `array_cwise_3` failure.",Antonio Sánchez,2022-09-08T17:40:45.985Z,NA,NA | |
1060 (https://gitlab.com/libeigen/eigen/-/merge_requests/1060),Fix realloc for non-trivial types.,"Fix realloc for non-trivial types. | |
If a non-trivial type `RequiresInitialization`, then we unfortunately | |
can't simply rely on `realloc` to move the memory to a new buffer. | |
If the data type contains self-referencing pointers (as does | |
`AnnoyingScalar` in tests), then those pointers become invalid. | |
Instead, we need to allocate a new buffer and copy-construct (or | |
move-construct) existing elements. | |
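A minimal sketch of the problem and the fix, using a hypothetical `SelfRef` type (loosely in the spirit of `AnnoyingScalar`, but not the actual test type) and a hypothetical `grow` helper rather than Eigen's real allocation code:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <utility>

// Hypothetical self-referencing type: `self` must always point at this
// object's own `value`, so a raw byte copy would leave it dangling.
struct SelfRef {
  int value;
  int* self;
  explicit SelfRef(int v) : value(v), self(&value) {}
  SelfRef(SelfRef&& other) noexcept : value(other.value), self(&value) {}
  bool consistent() const { return self == &value; }
};

// Grow a buffer by allocating new storage and move-constructing each element,
// rather than letting realloc copy the raw bytes.
inline SelfRef* grow(SelfRef* old_data, std::size_t old_count, std::size_t new_capacity) {
  void* raw = std::malloc(new_capacity * sizeof(SelfRef));
  if (raw == nullptr) throw std::bad_alloc();
  SelfRef* fresh = static_cast<SelfRef*>(raw);
  for (std::size_t i = 0; i < old_count; ++i) {
    new (fresh + i) SelfRef(std::move(old_data[i]));  // re-establishes `self`
    old_data[i].~SelfRef();
  }
  std::free(old_data);
  return fresh;
}
```

After `grow`, each element's `self` pointer refers to its new location, which a plain `realloc` that moves the block could not guarantee.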
This fixes some failing tests in `sparse_block`, which tickled the | |
bug.",Antonio Sánchez,2022-09-08T19:39:37.176Z,NA,NA | |
1047 (https://gitlab.com/libeigen/eigen/-/merge_requests/1047),Feature/skew symmetric matrix3,"### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
https://gitlab.com/libeigen/eigen/-/issues/1474 | |
### Additional information | |
This MR implements a skew symmetric matrix class for Vector3. | |
It is based on DiagonalMatrix and implements its exponential using | |
[Rodrigues' rotation formula](https://mathworld.wolfram.com/RodriguesRotationFormula.html).",Thomas Gloor,2022-09-08T20:44:41.220Z,NA,NA | |
1064 (https://gitlab.com/libeigen/eigen/-/merge_requests/1064),Fix g++-6 constexpr and c++20 constexpr build errors.,"Apparently g++-6 requires that all variables in a constexpr function | |
be initialized on construction. | |
Also added more `constexpr` labels that are required post c++20. | |
And clang has a bug for `consteval` that claims it's not a constexpr (https://stackoverflow.com/questions/63364918/clang-says-call-to-void-consteval-function-is-not-a-constant-expression). Re-wrote the `assert_constexpr` ""function"" using a template parameter and macro. | |
Fixes #2536",Antonio Sánchez,2022-09-09T03:41:45.852Z,NA,NA | |
1065 (https://gitlab.com/libeigen/eigen/-/merge_requests/1065),[ROCm] Fix for sparse matrix related breakage on ROCm.,"The following commit caused a compilation failure on ROCm: https://gitlab.com/rohitsan/eigen/-/commit/ec9c7163a3acd941163dc26aa1bea913a4a5c3a8 | |
This MR fixes this. | |
/cc @cantonios",Rohit Santhanam,2022-09-09T15:48:05.107Z,NA,NA | |
1063 (https://gitlab.com/libeigen/eigen/-/merge_requests/1063),Fix a couple of issues with unary pow():,"<!-- | |
Thanks for contributing a merge request! Please name and fully describe your MR as you would for a commit message. | |
If the MR fixes an issue, please include ""Fixes #issue"" in the commit message and the MR description. | |
In addition, we recommend that first-time contributors read our [contribution guidelines](https://eigen.tuxfamily.org/index.php?title=Contributing_to_Eigen) and [git page](https://eigen.tuxfamily.org/index.php?title=Git), which will help you submit a more standardized MR. | |
Before submitting the MR, you also need to complete the following checks: | |
- Make one PR per feature/bugfix (don't mix multiple changes into one PR). Avoid committing unrelated changes. | |
- Rebase before committing | |
- For code changes, run the test suite (at least the tests that are likely affected by the change). | |
See our [test guidelines](https://eigen.tuxfamily.org/index.php?title=Tests). | |
- If possible, add a test (both for bug-fixes as well as new features) | |
- Make sure new features are documented | |
Note that we are a team of volunteers; we appreciate your patience during the review process. | |
Again, thanks for contributing! --> | |
### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
1. Explicitly cast value returned by std::pow() to result_type. | |
1. Discard const when comparing types. | |
### Additional information | |
<!--Any additional information you think is important.-->",Rasmus Munk Larsen,2022-09-09T17:21:11.889Z,NA,NA | |
1069 (https://gitlab.com/libeigen/eigen/-/merge_requests/1069),Remove bad skew_symmetric_matrix3 test.,"We can't compare an uninitialized matrix since this results in msan | |
errors, and it does actually pass the approx test with compiler | |
optimizations since the old memory address is re-used. | |
Fixes #2537.",Antonio Sánchez,2022-09-10T07:08:38.726Z,NA,NA | |
1066 (https://gitlab.com/libeigen/eigen/-/merge_requests/1066),"Allow mixed types for pow(), as long as the exponent is exactly representable in the base type.","### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
The current implementation of unary_pow breaks a lot of existing code because of the hard constraint that base type == exponent type for floating-point types. Promoting the exponent from float to double is safe. This change is an attempt to write this in a general way, so we can add support for mixed `pow()` for more types in the future, such as `pow(double, bfloat16)`. | |
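The rule that the exponent must be exactly representable in the base type could be expressed as a compile-time check along the following lines. This is only a sketch for standard floating-point types: `exponent_fits_in_base` and `mixed_pow` are hypothetical names, and Eigen's actual trait may be written differently.

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <type_traits>

// Hypothetical check: true when every finite value of Exponent converts to
// Base without rounding (Base has at least as many significand digits and
// at least as wide an exponent range).
template <typename Base, typename Exponent>
constexpr bool exponent_fits_in_base() {
  return std::is_floating_point<Base>::value &&
         std::is_floating_point<Exponent>::value &&
         std::numeric_limits<Exponent>::digits <= std::numeric_limits<Base>::digits &&
         std::numeric_limits<Exponent>::max_exponent <= std::numeric_limits<Base>::max_exponent &&
         std::numeric_limits<Exponent>::min_exponent >= std::numeric_limits<Base>::min_exponent;
}

// Mixed-type pow that only participates in overload resolution when the
// exponent can be promoted to the base type exactly.
template <typename Base, typename Exponent>
typename std::enable_if<exponent_fits_in_base<Base, Exponent>(), Base>::type
mixed_pow(Base base, Exponent exponent) {
  return std::pow(base, static_cast<Base>(exponent));
}
```

With this check, `mixed_pow(2.0, 3.0f)` compiles because `float` promotes exactly to `double`, while `mixed_pow(2.0f, 3.0)` is rejected at compile time.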
### Additional information | |
<!--Any additional information you think is important.-->",Rasmus Munk Larsen,2022-09-12T21:55:31.867Z,NA,NA | |
1070 (https://gitlab.com/libeigen/eigen/-/merge_requests/1070),Fix test for pow with mixed integer types. We do not convert the exponent if it is an integer type.,"### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
### Additional information | |
<!--Any additional information you think is important.-->",Rasmus Munk Larsen,2022-09-12T23:13:38.030Z,NA,NA | |
1073 (https://gitlab.com/libeigen/eigen/-/merge_requests/1073),Add AVX int32_t pdiv,"### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
Enables vectorized int32_t division by internally casting to double and truncating the result. Approximately 2x throughput compared to native integer division. | |
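A scalar sketch of the cast-to-double idea follows; the actual change operates on AVX packets. `div_via_double` is a hypothetical name. The sketch assumes a nonzero divisor and excludes `INT32_MIN / -1`, since neither has a representable int32 result.

```cpp
#include <cassert>
#include <cstdint>

// Integer division computed in double precision. Every int32 value is exactly
// representable in a double (53-bit significand), and converting the quotient
// back to an integer truncates toward zero, matching C++ integer division.
// Preconditions (assumed, not checked): b != 0, and not (a == INT32_MIN && b == -1).
inline std::int32_t div_via_double(std::int32_t a, std::int32_t b) {
  const double quotient = static_cast<double>(a) / static_cast<double>(b);
  return static_cast<std::int32_t>(quotient);
}
```

For example, `div_via_double(-7, 2)` yields `-3`, the same as `-7 / 2` with native integer division.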
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2022-09-16T17:06:30.044Z,NA,NA | |
1074 (https://gitlab.com/libeigen/eigen/-/merge_requests/1074),"Revert ""Add constexpr, test for C++14 constexpr.""","### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
### Additional information | |
<!--Any additional information you think is important.-->",Rasmus Munk Larsen,2022-09-16T21:14:30.500Z,NA,NA | |
1077 (https://gitlab.com/libeigen/eigen/-/merge_requests/1077),[ROCm] fixed gpuGetDevice unused message,"### Reference issue | |
https://gist.github.com/cheshire/f81fd036f07ea9267e3abb8c14c12aec#file-gistfile1-txt-L7-L8 | |
We are fixing unused-result warnings for TensorFlow on all Linux builds; this unused-result warning exists in both ROCm and CUDA. | |
### What does this implement/fix? | |
Add a status check for `gpuGetDevice`, which prints out ""Failed to get the GPU devices "". | |
This is identical to `gpuGetDeviceCount`.",Chao Chen,2022-09-20T21:38:21.447Z,NA,NA | |
1076 (https://gitlab.com/libeigen/eigen/-/merge_requests/1076),"Add vectorized integer division for int32 with AVX512, AVX or SSE.","### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
This implements vectorized integer division for int32 on AVX512, AVX and SSE. | |
The original author is @chuckyschluz. | |
This change: | |
- Sets `HasDiv=1` for `Packet4i`, `Packet8i`, and `Packet16i` | |
- Raises SIGFPE upon division by zero | |
### Additional information | |
<!--Any additional information you think is important.--> | |
Benchmark data for SSE & AVX here: https://gitlab.com/libeigen/eigen/-/snippets/2412525",Rasmus Munk Larsen,2022-09-21T00:27:23.997Z,NA,NA | |
1018 (https://gitlab.com/libeigen/eigen/-/merge_requests/1018),Use 3px8/2px8/1px8/1x8 gebp_kernel on arm64-neon,"### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> -->#2518 | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
I found that the gebp_kernel used by Eigen on NEON is 3px4/2px4/1px4 by default. This is reasonable on x86 (AVX/FMA) and arm32. However, arm64 NEON has 32 registers, so a larger gebp_kernel can be used to get better data reuse and improve performance. | |
Therefore, I implemented the 3px8/2px8/1px8 gebp_kernel in Eigen (for 3px8: 24 registers for the accumulators, 3 for lhs, 1 for rhs). | |
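The register arithmetic above can be written down as a small sketch. The function names are hypothetical, and the accounting (one accumulator per LHS packet and RHS column, one register per LHS packet, one for the RHS) is taken from the description rather than from the kernel source. The 16-register figure used for comparison is an assumption about targets such as arm32 NEON or AVX on x86-64.

```cpp
#include <cassert>

// Vector registers needed by a micro-kernel that keeps `lhs_packets` packets
// of LHS rows against `nr` RHS columns: lhs_packets * nr accumulators,
// lhs_packets registers for the LHS, and 1 register for the RHS.
constexpr int gebp_registers(int lhs_packets, int nr) {
  return lhs_packets * nr + lhs_packets + 1;
}

// True if the kernel shape fits in the given number of vector registers.
constexpr bool gebp_fits(int lhs_packets, int nr, int available_registers) {
  return gebp_registers(lhs_packets, nr) <= available_registers;
}
```

Under this accounting, 3px8 needs 28 registers and fits in the 32 registers of arm64 NEON, while 3px4 needs 16 and already fills a 16-register file.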
### Additional information | |
<!--Any additional information you think is important.--> | |
**benchmark** | |
# clang | |
## dgemm | |
 | |
 | |
## sgemm | |
 | |
 | |
## hgemm | |
 | |
 | |
# gcc | |
## dgemm | |
 | |
 | |
## sgemm | |
 | |
 | |
## hgemm | |
 | |
 | |
**platform** : Ampere® Altra | |
Architecture: aarch64 | |
CPU op-mode(s): 32-bit, 64-bit | |
Byte Order: Little Endian | |
CPU(s): 128 | |
On-line CPU(s) list: 0-127 | |
Vendor ID: ARM | |
Model name: Neoverse-N1 | |
Model: 1 | |
Thread(s) per core: 1 | |
Core(s) per socket: 64 | |
Socket(s): 2 | |
Stepping: r3p1 | |
CPU max MHz: 3000.0000 | |
CPU min MHz: 1000.0000 | |
BogoMIPS: 50.00 | |
Flags: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp ssbs | |
Caches (sum of all): | |
L1d: 8 MiB (128 instances) | |
L1i: 8 MiB (128 instances) | |
L2: 128 MiB (128 instances)",Lianhuang Li,2022-09-21T16:36:41.318Z,NA,NA | |
1078 (https://gitlab.com/libeigen/eigen/-/merge_requests/1078),Add a macro to set the nr trait in the GEBP kernel for NEON.,"### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
### Additional information | |
<!--Any additional information you think is important.-->",Rasmus Munk Larsen,2022-09-22T23:56:34.752Z,NA,NA | |
1080 (https://gitlab.com/libeigen/eigen/-/merge_requests/1080),Remove unused typedef.,"### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
### Additional information | |
<!--Any additional information you think is important.-->",Rasmus Munk Larsen,2022-09-23T19:33:55.589Z,NA,NA | |
1079 (https://gitlab.com/libeigen/eigen/-/merge_requests/1079),Try to reduce compilation time/memory for GEBP kernel using EIGEN_IF_CONSTEXPR,"### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
### Additional information | |
<!--Any additional information you think is important.-->",Rasmus Munk Larsen,2022-09-23T20:09:43.550Z,NA,NA | |
1083 (https://gitlab.com/libeigen/eigen/-/merge_requests/1083),Try to reduce size of GEBP kernel for non-ARM targets.,"### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
This is an experiment to try and prevent MSVC from running out of heap memory when building TensorFlow. | |
### Additional information | |
<!--Any additional information you think is important.-->",Rasmus Munk Larsen,2022-09-28T02:37:18.984Z,NA,NA | |
1082 (https://gitlab.com/libeigen/eigen/-/merge_requests/1082),Add a vectorized implementation of atan2 to Eigen.,"### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
This adds support for the array syntax `z = x.atan2(y)` and the corresponding global function `z = atan2(x,y)`. Since the common case is `atan2(y, x) = atan(y/x)` (in the `std::atan2` argument order) and `atan()` is already vectorized, this MR mostly adds global declarations and vectorized handling of special cases, as specified at https://en.cppreference.com/w/cpp/numeric/math/atan2. | |
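A scalar sketch of how `atan2` can be built on top of `atan` plus quadrant handling, using the `std::atan2(y, x)` argument order. `atan2_sketch` is a hypothetical name. It covers finite inputs only and deliberately ignores the NaN, infinity and signed-zero-denominator cases that a full implementation has to handle.

```cpp
#include <cassert>
#include <cmath>

// atan2 for finite inputs, expressed through atan(y / x).
// Not handled in this sketch: NaN inputs, infinities, and the sign of a zero x.
inline double atan2_sketch(double y, double x) {
  const double pi = 3.14159265358979323846;
  if (x > 0.0) return std::atan(y / x);  // right half-plane: the common case
  if (x < 0.0) {
    // Left half-plane: shift by +pi or -pi depending on the sign of y.
    return std::atan(y / x) + (std::signbit(y) ? -pi : pi);
  }
  // x == 0: the result lies on the vertical axis.
  if (y > 0.0) return pi / 2;
  if (y < 0.0) return -pi / 2;
  return y;  // y == 0 as well: simplified, returns the zero unchanged
}
```

For finite arguments away from the special cases, the result agrees with `std::atan2` up to rounding.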
### Additional information | |
<!--Any additional information you think is important.--> | |
Speedup is about 12.4x for AVX512 on large arrays. See detailed benchmark numbers here: | |
https://gitlab.com/libeigen/eigen/-/snippets/2418433",Rasmus Munk Larsen,2022-09-28T20:46:50.727Z,NA,NA | |
1084 (https://gitlab.com/libeigen/eigen/-/merge_requests/1084),Vectorize atan() for double.,"### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
See benchmarks: https://gitlab.com/libeigen/eigen/-/snippets/2420060 | |
### Additional information | |
<!--Any additional information you think is important.--> | |
Sampling the function for arguments in the interval [0:1] with multiplicative stepsize of 1+1e-7 shows a maximum relative error of 2 ULPs.",Rasmus Munk Larsen,2022-10-01T01:49:30.973Z,NA,NA | |
1086 (https://gitlab.com/libeigen/eigen/-/merge_requests/1086),Only vectorize atan<double> for Altivec if VSX is available.,"### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
### Additional information | |
<!--Any additional information you think is important.-->",Rasmus Munk Larsen,2022-10-03T22:06:59.594Z,NA,NA | |
1085 (https://gitlab.com/libeigen/eigen/-/merge_requests/1085),Fix 4x4 inverse when compiling with -Ofast.,"Fast mode doesn't respect -0, causing sign flips in the inverse. | |
Fixes #2549",Antonio Sánchez,2022-10-04T16:05:49.760Z,NA,NA | |
1088 (https://gitlab.com/libeigen/eigen/-/merge_requests/1088),Replace assert with eigen_assert.,This is for consistency and ability to disable entirely.,Antonio Sánchez,2022-10-04T17:11:23.683Z,NA,NA | |
1089 (https://gitlab.com/libeigen/eigen/-/merge_requests/1089),Unconditionally enable CXX11 math.,It should be supported on all compilers for C++14 and up.,Antonio Sánchez,2022-10-04T17:37:47.753Z,NA,NA | |
1087 (https://gitlab.com/libeigen/eigen/-/merge_requests/1087),Simpler range reduction strategy for atan<float>().,"### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
### Additional information | |
<!--Any additional information you think is important.--> | |
This change saves a division and some `pselect` logic, in exchange for a couple of extra FMAs. The relative error is still <= 2 ulps, while the speedup is 20-40% on x86 (see snippet libeigen/eigen$2421160). | |
Unfortunately, the same change is not viable for `double` without going to a very high polynomial degree, negating the benefit. | |
Also, this change refactors the inner polynomial approximations for `atan<float>()` and `atan<double>()` to separate functions for future use in a more efficient implementation of `atan2()`.",Rasmus Munk Larsen,2022-10-04T18:11:01.153Z,NA,NA | |
1091 (https://gitlab.com/libeigen/eigen/-/merge_requests/1091),[clang-format] Add a few macros to AttributeMacros,"### What does this implement/fix? | |
This improves the automated formatting with clang-format. I used this for !1090 since without it, some lines were formatted rather oddly.",Alexander Richardson,2022-10-10T16:44:47.781Z,NA,NA | |
1092 (https://gitlab.com/libeigen/eigen/-/merge_requests/1092),Remove references to M_PI_2 and M_PI_4.,"### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
Fixes #2553 | |
### Additional information | |
<!--Any additional information you think is important.-->",Rasmus Munk Larsen,2022-10-11T00:27:17.517Z,NA,NA | |
1093 (https://gitlab.com/libeigen/eigen/-/merge_requests/1093),Handle NaN inputs to atan2.,"### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
### Additional information | |
<!--Any additional information you think is important.-->",Rasmus Munk Larsen,2022-10-11T17:35:52.548Z,NA,NA | |
1094 (https://gitlab.com/libeigen/eigen/-/merge_requests/1094),Eigen/Sparse: fix warnings -Wunused-but-set-variable,"### What does this implement/fix? | |
This merge-request fixes the following warnings, reported by clang 16.0.0git (compiled from the `main` branch of LLVM/clang) in `Eigen/Sparse`: | |
``` | |
/usr/include/eigen3/Eigen/src/SparseLU/SparseLU_heap_relax_snode.h:78:9: warning: variable 'nsuper_et_post' set but not used [-Wunused-but-set-variable] | |
Index nsuper_et_post = 0; // Number of relaxed snodes in postordered etree | |
^ | |
/usr/include/eigen3/Eigen/src/SparseLU/SparseLU_heap_relax_snode.h:79:9: warning: variable 'nsuper_et' set but not used [-Wunused-but-set-variable] | |
Index nsuper_et = 0; // Number of relaxed snodes in the original etree | |
^ | |
``` | |
``` | |
/usr/include/eigen3/Eigen/src/SparseCore/TriangularSolver.h:273:13: warning: variable 'count' set but not used [-Wunused-but-set-variable] | |
Index count = 0; | |
^ | |
``` | |
### Additional information: Context | |
Those warnings can be seen in context in the [testsuite page](https://cgal.geometryfactory.com/CGAL/testsuite/) of [CGAL (The Computational Geometry Algorithms Library)](https://www.cgal.org/) and in particular here: https://cgal.geometryfactory.com/CGAL/testsuite/CGAL-5.6-I-86/Heat_method_3/TestReport_gimeno_Debian-testing-clang-main.gz. | |
### Additional information: Testsuite passed locally | |
I have compiled and run tests successfully on my machine (x86_64, with compiler `clang version 14.0.5 (Fedora 14.0.5-1.fc36)`). | |
``` | |
100% tests passed, 0 tests failed out of 1121 | |
Label Time Summary: | |
Official = 8562.54 sec*proc (778 tests) | |
Unsupported = 1867.33 sec*proc (233 tests) | |
smoketest = 176.24 sec*proc (109 tests) | |
Total Test time (real) = 1041.58 sec | |
```",Laurent Rineau,2022-10-11T17:37:05.033Z,NA,NA | |
1075 (https://gitlab.com/libeigen/eigen/-/merge_requests/1075),Don't use generic sign function for sign(complex) unless it is vectorizable,"### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
Don't use the generic psign function for sign(complex) unless the type is vectorizable, which implies that it has a data member `v` that we use to access complex packets as a vector of reals. | |
Bug reported on PowerPC without VSX support by Chip Kerchner. | |
### Additional information | |
<!--Any additional information you think is important.-->",Rasmus Munk Larsen,2022-10-12T16:03:30.585Z,NA,NA | |
1095 (https://gitlab.com/libeigen/eigen/-/merge_requests/1095),"Refactor special values test for pow, and add a similar test for atan2","### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
### Additional information | |
<!--Any additional information you think is important.-->",Rasmus Munk Larsen,2022-10-12T20:12:09.584Z,NA,NA | |
1096 (https://gitlab.com/libeigen/eigen/-/merge_requests/1096),Fix bug atan2,"### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
Using a packet with just a single bit set (in this case the sign bit) as the predicate for `pselect` does not work on some platforms. | |
### Additional information | |
<!--Any additional information you think is important.-->",Rasmus Munk Larsen,2022-10-12T23:49:33.472Z,NA,NA | |
1099 (https://gitlab.com/libeigen/eigen/-/merge_requests/1099),Explicitly state that indices must be sorted.,Fixes #2558.,Antonio Sánchez,2022-10-19T18:15:29.847Z,NA,NA | |
1101 (https://gitlab.com/libeigen/eigen/-/merge_requests/1101),Change handmade_aligned_malloc/realloc/free to store a 1 byte offset instead of absolute address,"### Reference issue | |
Fixes #2554 | |
### What does this implement/fix? | |
The `handmade_aligned_malloc` family of functions assumes that `malloc` will return an address that is aligned to at least `sizeof(void*)`, i.e. `alignof(max_align_t) % sizeof(void*) == 0`. If this is not true, there may be insufficient space to store `original` in the offset between `aligned` and `original`. | |
One solution is to store a 1-byte offset `aligned-original` at `aligned-1` and deduce `original` when needed for `realloc` and `free`. This should work for `alignment<256` and any `alignof(max_align_t)`. | |
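The 1-byte-offset scheme described above can be sketched roughly as follows. This is a minimal illustration, not Eigen's actual code: the names `sketch_aligned_malloc` and `sketch_aligned_free` are hypothetical, and the sketch assumes `alignment` is a power of two smaller than 256.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical sketch of the scheme above; not Eigen's implementation.
// Assumes alignment is a power of two and alignment < 256.
inline void* sketch_aligned_malloc(std::size_t size, std::size_t alignment) {
  assert(alignment > 0 && alignment < 256 && (alignment & (alignment - 1)) == 0);
  // Over-allocate so that an aligned address with at least one byte of
  // headroom in front of it always exists inside the block.
  void* original = std::malloc(size + alignment);
  if (original == nullptr) return nullptr;
  const std::uintptr_t base = reinterpret_cast<std::uintptr_t>(original);
  // offset lies in [1, alignment], so it always fits in one byte.
  const std::size_t offset = alignment - (base & (alignment - 1));
  unsigned char* aligned = static_cast<unsigned char*>(original) + offset;
  // Store aligned - original at aligned - 1.
  aligned[-1] = static_cast<unsigned char>(offset);
  return aligned;
}

inline void sketch_aligned_free(void* ptr) {
  if (ptr == nullptr) return;
  unsigned char* aligned = static_cast<unsigned char*>(ptr);
  // Deduce the original address from the stored offset.
  std::free(aligned - aligned[-1]);
}
```

Because the offset is never zero, the byte at `aligned - 1` always belongs to the allocation, whatever alignment `malloc` itself provides.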
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2022-10-22T22:51:32.357Z,NA,NA | |
1105 (https://gitlab.com/libeigen/eigen/-/merge_requests/1105),Fix pragma check for disabling fastmath.,Fixes #2563.,Antonio Sánchez,2022-10-26T22:50:58.352Z,NA,NA | |
1102 (https://gitlab.com/libeigen/eigen/-/merge_requests/1102),Add assert for invalid outerIndexPtr array in SparseMapBase.,"The outer index array must have size equal to `outerSize + 1`, with the | |
last element being the size of the `valuePtr` array. | |
Fixes #2561.",Antonio Sánchez,2022-10-26T22:51:34.067Z,NA,NA | |
1106 (https://gitlab.com/libeigen/eigen/-/merge_requests/1106),Fix handmade_aligned_malloc offset computation.,"Turns out `-` takes precedence over `&`, which was causing a bunch of | |
compiler warnings. There is also an oss-fuzz bug reported via Chromium | |
about out-of-bounds memory writes, and I *think* this solves | |
that issue.",Antonio Sánchez,2022-10-27T17:33:47.768Z,NA,NA | |
1107 (https://gitlab.com/libeigen/eigen/-/merge_requests/1107),Disable patan for double on PPC.,"It's not defined, leading to build failures.",Antonio Sánchez,2022-10-27T17:56:09.225Z,NA,NA | |
1100 (https://gitlab.com/libeigen/eigen/-/merge_requests/1100),Allow empty matrices to be resized.,"Previously we had the storage fixed to 0-by-0 if the compile-time size | |
of the storage was 0, but this conflicts with the compile-time | |
matrix size. This was preventing dynamic empty matrices from being | |
properly resized, and causing the matrix's number of rows | |
and columns to be misreported. | |
Fixes #2557.",Antonio Sánchez,2022-10-27T20:33:36.040Z,NA,NA | |
1110 (https://gitlab.com/libeigen/eigen/-/merge_requests/1110),Remove unused parameter name.,NA,Antonio Sánchez,2022-11-01T23:13:50.521Z,NA,NA | |
1109 (https://gitlab.com/libeigen/eigen/-/merge_requests/1109),Remove recently added sparse assert in SparseMapBase.,"Turns out we have existing use-cases where we use the map to populate | |
the sparse matrix, so the map may not be valid on construction here.",Antonio Sánchez,2022-11-03T17:29:06.295Z,NA,NA | |
1097 (https://gitlab.com/libeigen/eigen/-/merge_requests/1097),Add signbit function,"### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
Checking the sign of a floating point value appears trivial, except when checking for -0. This is necessary for some frequently used math functions, such as `pow`. A consistent and efficient signbit will simplify implementation and improve performance of many floating point functions. This may also be a faster method of checking for negative values, as it uses shift operations instead of floating point comparisons. | |
AVX arithmetic shift: `_mm256_srai_epi32`, latency 1, CPI: 0.5 | |
AVX floating point compare: `_mm256_cmp_ps`, latency 4, CPI: 0.5 | |
`std::signbit` checks if the leading bit is set and returns a `bool`. These functions return a bitmask (all 1's for true, all 0's for false) of the same type so that it may be used for logical operations (but will evaluate to the same boolean value). | |
Also added an AVX2 packet op to perform arithmetic shift of `int64_t` packets -- `Packet4l`. | |
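As a scalar illustration of the bitmask convention described above, a sketch might look like the following. The helper name `signbit_mask` is hypothetical, and the sketch assumes a 32-bit IEEE-754 `float`. It uses unsigned arithmetic instead of the arithmetic shift mentioned above, so that the result is well defined in portable C++.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical scalar helper (not Eigen's psignbit): returns all ones if the
// sign bit of x is set, and all zeros otherwise.
// Assumes float is a 32-bit IEEE-754 type.
inline std::uint32_t signbit_mask(float x) {
  assert(sizeof(float) == sizeof(std::uint32_t));
  std::uint32_t bits;
  std::memcpy(&bits, &x, sizeof(bits));  // well-defined type pun
  // bits >> 31 is 0 or 1; subtracting from zero wraps 1 to all ones.
  return std::uint32_t(0) - (bits >> 31);
}
```

Unlike the comparison `x < 0`, this distinguishes `-0.0f` from `+0.0f`, which is the case the description singles out.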
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2022-11-04T00:31:21.802Z,NA,NA | |
1111 (https://gitlab.com/libeigen/eigen/-/merge_requests/1111),fix neon,"### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2022-11-08T20:03:02.287Z,NA,NA | |
1112 (https://gitlab.com/libeigen/eigen/-/merge_requests/1112),Fix typo in CholmodSupport,"Whadyaknow, it is pretty easy to edit in the gui. | |
Fixes #2566.",Antonio Sánchez,2022-11-08T23:49:56.815Z,NA,NA | |
1118 (https://gitlab.com/libeigen/eigen/-/merge_requests/1118),Fix ambiguity in PPC for vec_splats call.,"`uint64_t` is `unsigned long` in clang, but the IBM intrinsic is only defined for `unsigned long long`.",Antonio Sánchez,2022-11-14T18:58:16.974Z,NA,NA | |
1119 (https://gitlab.com/libeigen/eigen/-/merge_requests/1119),Put brackets around unsigned type names.,NA,Antonio Sánchez,2022-11-15T17:32:32.188Z,NA,NA | |
1116 (https://gitlab.com/libeigen/eigen/-/merge_requests/1116),Correct pnegate for floating-point zero.,"The original formulation of `0 - x` incorrectly handles +/-0 for | |
floating point numbers. We need to instead flip the sign bit | |
explicitly.",Antonio Sánchez,2022-11-15T18:07:24.310Z,NA,NA | |
1098 (https://gitlab.com/libeigen/eigen/-/merge_requests/1098),Cross product for vectors of size 2. Fixes #1037,"### Reference issue | |
Fixes #1037 | |
### What does this implement/fix? | |
Implements cross product for vectors of size 2. | |
The result is a scalar equal to the signed area of the parallelogram spanned by the input vectors. | |
Or, to put it differently, the cross product of (v1, v2) and (w1, w2) is the 3rd coordinate of the cross product of (v1, v2, v3) and (w1, w2, w3). | |
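For illustration, the definition above amounts to the following plain C++ sketch. The `Vec2` alias and the free function `cross2d` are hypothetical stand-ins; the MR itself exposes the operation through `cross` on size-2 Eigen vectors.

```cpp
#include <array>
#include <cassert>

using Vec2 = std::array<double, 2>;

// Hypothetical free function: third coordinate of the 3D cross product of
// (v1, v2, v3) and (w1, w2, w3), which does not depend on v3 or w3.
inline double cross2d(const Vec2& v, const Vec2& w) {
  return v[0] * w[1] - v[1] * w[0];
}
```

The sign is positive when `w` lies counter-clockwise from `v`, and the result is zero for parallel vectors.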
### Additional information | |
This is my first time contributing to Eigen! I tried to do as best as I could, but I still have some doubts | |
- I think we can consider that this MR fixes #1037 because the issue specifically requests a cross product of 2D vectors. Actually, that issue seems to be a little bit wider, also proposing a more general implementation for arbitrary-size vectors returning skew-symmetric matrices, possibly named `wedge`. This is not implemented by this MR, but since the main focus of that issue seems to be 2D vectors I think it can be closed. A comment in the thread mentions `numpy`, which allows arbitrary combinations of 2D and 3D vectors, where a 2D vector is interpreted as a 3D vector with vanishing z-component. This behavior is not implemented in this MR either; the cross product is only defined between vectors of the same size | |
- A few words about `cross_product_return_type` | |
- As a shift in paradigm, I took its definition out of `MatrixBase`, turning it from a nested struct into an independent templated struct. I thought it more elegant. This change is not backward compatible, but I took this liberty because the struct was undocumented anyways, and also it was used nowhere in the code base. If requested, I can easily revert this change. | |
- I decided to document it as a type rather than hide it behind something like `PlainObject` as it was previously done. It makes it easier to explain the output. Also, I didn't like the idea of obfuscating the doc | |
- I am not sure about naming conventions, especially now that it's a public type. Should it be `CrossProductReturnType::Type` instead of `cross_product_return_type::type`? Maybe it could even be renamed to something completely different such as `CrossProductTraits::ReturnType`? | |
- In passing, I removed a seemingly-orphaned forward declaration of a class called `Cross` which looked like a relic, as it wasn't used anywhere in the code base and indeed I could not even find its full declaration. Again, if requested, I can put it back | |
- A simpler implementation (mentioned in #1037) would have consisted in giving the method for size-2 vectors a different name, like `cross2`. However, I decided to call it `cross` because I liked it better as an interface, which seems to be shared by other important resources such as `numpy`. Also, I actively enjoyed template metaprogramming!",Gabriele Buondonno,2022-11-15T22:39:43.056Z,NA,NA | |
1113 (https://gitlab.com/libeigen/eigen/-/merge_requests/1113),Fix duplicate execution code for Power 8 Altivec in pstore_partial.,Fix duplicate execution code for Power 8 Altivec in pstore_partial.,Chip Kerchner,2022-11-16T13:41:44.026Z,NA,NA | |
1115 (https://gitlab.com/libeigen/eigen/-/merge_requests/1115),Fix AVX2 psignbit,"### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
Fixes #2569 | |
Thank you Ogre Transporter for identifying this | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2022-11-16T13:43:11.755Z,NA,NA | |
1117 (https://gitlab.com/libeigen/eigen/-/merge_requests/1117),Small cleanup of IDRS.h,"Removed a set, but unused variable in IDRS.h, cleaned up the odd line breaks in the comments on lines 94-100.",Chris,2022-11-16T13:51:23.918Z,NA,NA | |
1120 (https://gitlab.com/libeigen/eigen/-/merge_requests/1120),Fix bug in handmade_aligned_realloc,"### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
Addresses two bugs in `handmade_aligned_realloc`: | |
1) `memmove` the correct number of bytes: the count changes from `new_size` (used to be `size`) to `min(old_size,new_size)` so as not to overrun the bounds of the allocated memory. If `new_size > old_size`, then the array was expanded, and we need to copy the entire original array (`old_size`). If `new_size < old_size`, then the array was shrunk, and we need to copy the entire new array (`new_size`). Thus, in general, we need the minimum of these two sizes. We explicitly avoid the case where `new_size == old_size`, as the behavior is possibly undefined (at best, this would result in a no-op anyway). | |
2) If `std::realloc` returns a new address, the reallocation is performed by ""allocating a new memory block of size `new_size` bytes, copying memory area with size equal the **lesser of the new and the old sizes**, and **freeing** the old block."" Therefore, there is no guarantee that the memory still exists at `ptr`/`old_original`, and the data should instead be copied from `original`. | |
https://en.cppreference.com/w/cpp/memory/c/realloc | |
Also added an additional condition that `alignment <= 128`. Although a byte can store a maximum offset of `255`, `128` is the largest such power of two. If this becomes insufficient, we can set aside two bytes for the offset with a little more effort. | |
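The two fixes can be sketched as below. This is a hedged illustration under the 1-byte-offset layout, not Eigen's implementation: the names `sketch_offset` and `sketch_aligned_realloc` are hypothetical, and the sketch simply skips the copy when the offset is unchanged.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Hypothetical helper: distance in [1, alignment] from p to the next aligned
// address that leaves room for the offset byte.
inline std::size_t sketch_offset(const void* p, std::size_t alignment) {
  return alignment - (reinterpret_cast<std::uintptr_t>(p) & (alignment - 1));
}

// Hypothetical sketch; ptr must come from an allocator using the same layout.
inline void* sketch_aligned_realloc(void* ptr, std::size_t new_size,
                                    std::size_t old_size, std::size_t alignment) {
  assert(ptr != nullptr);
  assert(alignment > 0 && alignment <= 128 && (alignment & (alignment - 1)) == 0);
  unsigned char* old_aligned = static_cast<unsigned char*>(ptr);
  const std::size_t old_offset = old_aligned[-1];
  void* original = std::realloc(old_aligned - old_offset, new_size + alignment);
  if (original == nullptr) return nullptr;
  // From here on the old block may be gone: read only through original.
  unsigned char* moved = static_cast<unsigned char*>(original) + old_offset;
  const std::size_t offset = sketch_offset(original, alignment);
  unsigned char* aligned = static_cast<unsigned char*>(original) + offset;
  if (offset != old_offset) {
    // Copy only the bytes valid in both the old and the new array.
    std::memmove(aligned, moved, std::min(old_size, new_size));
  }
  aligned[-1] = static_cast<unsigned char>(offset);
  return aligned;
}
```

Since both offsets are at most `alignment`, the source and destination ranges of the `memmove` stay inside the `new_size + alignment` bytes returned by `std::realloc`.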
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2022-11-18T22:35:32.048Z,NA,NA | |
1121 (https://gitlab.com/libeigen/eigen/-/merge_requests/1121),Add serialization for sparse matrix and sparse vector.,"This was required for another project in order to simplify reproduction | |
of a sparse solver issue.",Antonio Sánchez,2022-11-21T19:43:08.273Z,NA,NA | |
1122 (https://gitlab.com/libeigen/eigen/-/merge_requests/1122),Fix a bunch of annoying compiler warnings in tests,"### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
Should substantially reduce the number of warnings emitted during the MR smoketests, specifically visitor.cpp, adjoint.cpp, and array_cwise.cpp. This is mostly due to narrowing conversions -- nothing material. | |
Job buildsmoketests:x86-64:linux:clang-10:cxx11-on log: | |
Before: 2189 lines | |
After : 1193 lines | |
The only warnings are associated with the deprecation of pow, which is intentional. | |
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2022-11-21T20:07:20.390Z,NA,NA | |
1125 (https://gitlab.com/libeigen/eigen/-/merge_requests/1125),Add synchronize method to all devices.,"This is to simplify writing generic device code. Previously only the GPU | |
and Sycl tensor devices had a `synchronize` method, which is required | |
in testing to ensure all operations are performed. Added a dummy method | |
for threadpool and default devices, which are synchronous by default.",Antonio Sánchez,2022-11-29T19:35:03.227Z,NA,NA | |
1124 (https://gitlab.com/libeigen/eigen/-/merge_requests/1124),Fix sparseLU solver when destination has a non-unit stride.,"The previous code had an implicit assumption that the destination is | |
directly accessible and has unit stride. This is not the case for | |
block expressions or views like `complex_matrix.real()`, which | |
has a non-unit stride. Converting the code to use a block expression | |
addresses this. | |
Fixes #2562.",Antonio Sánchez,2022-11-29T19:37:04.391Z,NA,NA | |
1114 (https://gitlab.com/libeigen/eigen/-/merge_requests/1114),Changing BiCGSTAB parameters initialization so that it works with custom types,"### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
### Additional information | |
<!--Any additional information you think is important.-->",Alexandre Hoffmann,2022-11-29T19:37:46.887Z,NA,NA | |
1123 (https://gitlab.com/libeigen/eigen/-/merge_requests/1123),Fix reshape strides when input has non-zero inner stride.,"Fixes #2560. The outer stride needs to depend on the inner stride | |
if we have direct access.",Antonio Sánchez,2022-11-29T19:39:30.827Z,NA,NA | |
1127 (https://gitlab.com/libeigen/eigen/-/merge_requests/1127),Fix serialization for non-compressed matrices.,"The size of the data buffer was incorrect - it is not the number of | |
non-zeros, that is only true for compressed matrices. | |
It turned out that clang previously always converted the test matrices | |
to compressed mode because of the move-constructor, whereas this move | |
construction was optimized out by gcc. Added an explicit move constructor | |
to fix this.",Antonio Sánchez,2022-11-30T18:16:48.488Z,NA,NA | |
1008 (https://gitlab.com/libeigen/eigen/-/merge_requests/1008),Add support for Power10 (AltiVec) MMA instructions for bfloat16.,"Hello everyone, long time no see! | |
I hope to find everyone safe and sound :smile_cat: | |
## What? | |
This merge request adds MMA support for bfloat16 on Power10 machines. Since Power10 supports bfloat16 natively, this is much faster than what we had before, which used only VSX instructions and fell back to float32 for every computation. | |
## How | |
**Briefly**, the Power10 MMA instructions for 16-bit types include a rank-2 operation, `xvbf16ger2pp`, that performs two rank-1 updates simultaneously using 2 rows/columns. | |
It takes a `4x2` matrix block and a `2x4` matrix block and does a rank-2 update. Below is a scheme of how my MMA registers need to be laid out, followed by the result. One thing worth mentioning is that the result is a `4x4` float32 matrix: there's a ""type upgrade"" in this operation. | |
<table> | |
<tr> | |
<td>A</td> | |
<td>B</td> | |
<td>C</td> | |
<td>D</td> | |
<td>E</td> | |
<td>F</td> | |
<td>G</td> | |
<td>H</td> | |
</tr> | |
</table> | |
<table> | |
<tr> | |
<td>I</td> | |
<td>J</td> | |
<td>K</td> | |
<td>L</td> | |
<td>M</td> | |
<td>N</td> | |
<td>O</td> | |
<td>P</td> | |
</tr> | |
</table> | |
<table> | |
<tr> | |
<td>A*I + B*J</td> | |
<td>A*K + B*L</td> | |
<td>A*M + B*N</td> | |
<td>A*O + B*P</td> | |
</tr> | |
<tr> | |
<td>C*I + D*J</td> | |
<td>C*K + D*L</td> | |
<td>C*M + D*N</td> | |
<td>C*O + D*P</td> | |
</tr> | |
<tr> | |
<td>E*I + F*J</td> | |
<td>E*K + F*L</td> | |
<td>E*M + F*N</td> | |
<td>E*O + F*P</td> | |
</tr> | |
<tr> | |
<td>G*I + H*J</td> | |
<td>G*K + H*L</td> | |
<td>G*M + H*N</td> | |
<td>G*O + H*P</td> | |
</tr> | |
</table> | |
In short, `gemmMMAbfloat16` loads `4x2` and `2x4` blocks from the LHS/RHS, organizes them in registers, and runs a rank-2 update. As the standard packing wasn't created with this situation in mind, how I'm accessing memory can be a little confusing. If you think a detailed explanation is necessary, I don't mind drawing something to make myself as clear as possible :smile: | |
~~Out of curiosity, I did try to change packing to make code more friendly but I couldn't make my custom packing work for triangular so I went back.~~ | |
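To make the layout in the tables above concrete, here is a plain scalar sketch of the same arithmetic: a `4x2` block times a `2x4` block, accumulated into a `4x4` float32 tile. It only models what the tables describe and is not the MMA kernel. The name `rank2_update_reference` is made up for illustration, and `float` stands in for the bfloat16 inputs.

```c++
#include <array>
#include <cassert>
#include <cstddef>

// lhs holds the 4x2 block row by row:       {A,B, C,D, E,F, G,H}.
// rhs holds the 2x4 block column by column: {I,J, K,L, M,N, O,P}.
// acc is the 4x4 float32 tile, row-major, updated in place (acc += lhs * rhs).
inline void rank2_update_reference(const std::array<float, 8>& lhs,
                                   const std::array<float, 8>& rhs,
                                   std::array<float, 16>& acc) {
  for (std::size_t i = 0; i < 4; ++i) {
    for (std::size_t j = 0; j < 4; ++j) {
      // Two rank-1 contributions per output element, e.g. A*I + B*J for (0,0).
      acc[4 * i + j] += lhs[2 * i] * rhs[2 * j] + lhs[2 * i + 1] * rhs[2 * j + 1];
    }
  }
}
```

A reference like this could serve as a correctness check for the vectorized path on small blocks.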
## Code | |
### Temporary float32 result | |
Because the result is a float32 matrix, I created a temporary float32 matrix to hold it. This avoids converting back and forth between float32 and bfloat16 during GEMM. | |
```c++ | |
float** result = new float*[cols]; | |
for(int i = 0; i < cols; i++) result[i] = new float[rows]; | |
``` | |
I didn't see any code using `new` so if that's a problem I'm open to suggestions. :wink: | |
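One possible alternative to the nested `new` calls, sketched under the assumption that a single contiguous column-major buffer is acceptable, is a `std::vector<float>` sized `rows * cols`. The `FloatScratch` class and its accessors are hypothetical and not part of Eigen.

```c++
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical scratch buffer: one allocation, column-major, zero-initialized,
// released automatically when it goes out of scope.
class FloatScratch {
 public:
  FloatScratch(std::size_t rows, std::size_t cols)
      : rows_(rows), cols_(cols), data_(rows * cols, 0.0f) {}

  float& operator()(std::size_t row, std::size_t col) {
    assert(row < rows_ && col < cols_);
    return data_[col * rows_ + row];
  }

  // Pointer to the first element of a column, like result[col] in the snippet above.
  float* column(std::size_t col) {
    assert(col < cols_);
    return data_.data() + col * rows_;
  }

 private:
  std::size_t rows_, cols_;
  std::vector<float> data_;
};
```

Besides avoiding one allocation per column, this removes the need for a matching set of `delete[]` calls on every exit path.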
### Long and ugly switch statement | |
`pgerMMAbfloat16` basically runs the rank-2 update instructions. This set of instructions has a mask feature that lets me ignore some parts of the result matrix. | |
This is useful when processing the last section of a matrix, which may not fill 4 whole elements and/or may not have two rows/columns. Using masks, I can ignore the results for those non-existent values. | |
Now comes the ugly part: I don't know the masks at compile time, so I can't write something like: | |
`__builtin_mma_pmxvbf16ger2pp(acc, reinterpret_cast<Packet16uc>(a.m_val), reinterpret_cast<Packet16uc>(b.m_val), maskX, maskY, 0b11);` | |
I don't have the exact compiler error at the moment, but it mentioned `literals`. I bet it's because, after compilation, these masked rank updates are different instructions (instead of one instruction with masks as arguments). | |
## Testing | |
For testing I've changed the files below (not submitted): | |
* test/product_syrk.cpp | |
* test/product_large.cpp | |
* test/product_symm.cpp | |
* test/triangular.cpp | |
I had a problem creating those tests because `bfloat16` doesn't support int scaling (i.e. k * A), so some tests containing expressions like `2*m1` don't work for `bfloat16`. To make them work, I changed: | |
This: | |
`VERIFY_IS_APPROX(res, 2*(square + m1 * m2.transpose()));` | |
To: | |
`VERIFY_IS_APPROX(res, (square + m1 * m2.transpose()) + ( square + m1 * m2.transpose()) );` | |
There's also this code section that doesn't work for bfloat16, and I have no idea why: | |
```c++ | |
if(!MatrixType::IsRowMajor) | |
{ | |
typedef Matrix<Scalar,Dynamic,Dynamic> MatrixX; | |
MatrixX buffer(2*rows,2*rows); | |
Map<RowSquareMatrixType,0,Stride<Dynamic,2> > map1(buffer.data(),rows,rows,Stride<Dynamic,2>(2*rows,2)); | |
buffer.setZero(); | |
VERIFY_IS_APPROX(map1 = m1 * m2.transpose(), (m1 * m2.transpose()).eval()); | |
buffer.setZero(); | |
VERIFY_IS_APPROX(map1.noalias() = m1 * m2.transpose(), (m1 * m2.transpose()).eval()); | |
buffer.setZero(); | |
VERIFY_IS_APPROX(map1.noalias() += m1 * m2.transpose(), (m1 * m2.transpose()).eval()); | |
} | |
``` | |
If you think it's important to update the test files to include my bfloat16 tests, I don't mind doing it. Honestly, I didn't give much thought to this matter, but maybe I can specialize the `product` function in `product.h` for a Matrix of bfloat16. Suggestions are also appreciated here :smile: | |
## Last considerations | |
As this is a lot of changes, I imagine this will go back and forth. Any suggestion/consideration will be much appreciated! | |
Thanks a lot for your time. | |
PS: | |
This is a collaboration between me, Chip Kerchner and Rafael Souza. (co-authors on commit message)",Pedro Caldeira,2022-11-30T23:33:37.637Z,NA,NA | |
1104 (https://gitlab.com/libeigen/eigen/-/merge_requests/1104),Fix the bug using neon instruction fmla for data type half,"### Reference issue | |
https://gitlab.com/libeigen/eigen/-/merge_requests/1018 | |
### What does this implement/fix? | |
For the half data type, the register at operand 3 of `fmla` must be in v0~v15, so the inline assembly that was used to avoid gcc implementing `vfmaq_lane_f16` through a costly `dup` can't be used here. However, when the gcc compiler is used, using the intrinsics leads to a performance degradation, so I add a restriction here. | |
### Additional information | |
By coincidence, this bug is not triggered when EIGEN_NEON_GEBP_NR=8. If EIGEN_NEON_GEBP_NR is set to 4, the gcc compiler reports the following error: | |
",Lianhuang Li,2022-12-01T17:28:57.765Z,NA,NA | |
1103 (https://gitlab.com/libeigen/eigen/-/merge_requests/1103),add sparse sort inner vectors function,"### Reference issue | |
Maybe #299 , #364 , #2558 | |
### What does this implement/fix? | |
Adds a utility to sort the inner vectors of a sparse matrix / vector with a comparison function (default is `std::less<>`). Some sparse algorithms implicitly assume sorted inner vectors, and will not function properly otherwise. Other algorithms could benefit from sorted inner vectors, such as sparse transpositions, factorizations, and so on. | |
The sort is implemented with `std::sort` using a custom iterator `CompressedStorageIterator` that sorts the inner indices and values in parallel. The usual alternative uses a temporary vector of indices to apply the sorting permutation to the indices and values. The iterator approach requires only one pass and no auxiliary storage. | |
`CompressedStorageIterator` can be used for many STL algorithms to operate on both the indices and values in parallel, such as `std::swap_ranges`, `std::rotate`, which may simplify further improvements to the sparse module. | |
Adapted from https://artificial-mind.net/blog/2020/11/28/std-sort-multiple-ranges | |
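For contrast, the conventional approach mentioned above (sort a temporary permutation, then apply it to both arrays) can be sketched with plain STL containers. This is not the `CompressedStorageIterator` implementation from this MR. The function `sort_inner_vector` and its signature are made up for illustration.

```c++
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Sorts one inner vector, given as parallel arrays of inner indices and values,
// by ascending inner index. Uses O(n) auxiliary storage for the permutation
// and for the reordered copies.
inline void sort_inner_vector(std::vector<int>& indices, std::vector<double>& values) {
  assert(indices.size() == values.size());
  std::vector<std::size_t> perm(indices.size());
  std::iota(perm.begin(), perm.end(), std::size_t{0});
  std::sort(perm.begin(), perm.end(),
            [&](std::size_t a, std::size_t b) { return indices[a] < indices[b]; });

  std::vector<int> sorted_indices(indices.size());
  std::vector<double> sorted_values(values.size());
  for (std::size_t k = 0; k < perm.size(); ++k) {
    sorted_indices[k] = indices[perm[k]];
    sorted_values[k] = values[perm[k]];
  }
  indices.swap(sorted_indices);
  values.swap(sorted_values);
}
```

The extra vectors here are exactly the auxiliary storage that an iterator over both arrays avoids.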
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2022-12-01T19:28:56.698Z,NA,NA | |
1130 (https://gitlab.com/libeigen/eigen/-/merge_requests/1130),Fix index type for sparse index sorting.,Type typo.,Antonio Sánchez,2022-12-06T00:02:32.462Z,NA,NA | |
1131 (https://gitlab.com/libeigen/eigen/-/merge_requests/1131),Increase L2 and L3 cache size for Power10.,"Increasing the L2 and L3 cache sizes for Power10 improves performance by 1.33X in situations that rely on breaking matrices into sub-matrices for packing and GEMM (like triangular matrix solve).",Chip Kerchner,2022-12-07T18:20:33.935Z,NA,NA | |
1128 (https://gitlab.com/libeigen/eigen/-/merge_requests/1128),Enable direct access for NestByValue.,"If the underlying expression has direct access, then so does | |
`NestByValue`. Conditionally adds the appropriate accessors for this case. | |
Fixes #2574.",Antonio Sánchez,2022-12-07T18:21:46.505Z,NA,NA | |
1129 (https://gitlab.com/libeigen/eigen/-/merge_requests/1129),Add BDCSVD_LAPACKE binding,"### What does this implement/fix? | |
When `EIGEN_USE_LAPACKE` is set, this calls ?gesdd for BDCSVD which is the corresponding LAPACK SVD divide & conquer variant. | |
### Additional information | |
Unfortunately, ?gesdd can only calculate fullU/fullV ('A'), thinU/thinV ('S'), or none of them ('N'). | |
So there is slightly more mapping code compared to JacobiSVD (?gesvd) to make all variants work (e.g. when thinU and fullV is set).",Melven Roehrig-Zoellner,2022-12-09T18:50:13.395Z,NA,NA | |
1133 (https://gitlab.com/libeigen/eigen/-/merge_requests/1133),add EqualSpaced / setEqualSpaced,"### Reference issue | |
### What does this implement/fix? | |
This MR introduces a new function: `setEqualSpaced`. It is analogous to `setLinSpaced` when `high-low` is an exact multiple of `size-1`. It is vectorized for all types (with add and multiply), and somewhat more intuitive. The current `setLinSpaced` implementation is not vectorized for integer types, and has somewhat awkward floating point logic. | |
``` | |
Index size = 10; | |
int low = 0; | |
int step = 1; | |
std::cout << VectorXi::EqualSpaced(size, low, step).transpose() << ""\n""; // {0,1,2,3,4,5,6,7,8,9} | |
VectorXi test(size); | |
// basically performs this simple loop | |
for(Index i = 0; i < size; i++) | |
test(i) = low + i * step; | |
``` | |
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2022-12-13T00:54:58.505Z,NA,NA | |
1134 (https://gitlab.com/libeigen/eigen/-/merge_requests/1134),optimize equalspace packetop,"### Reference issue | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2022-12-13T01:22:25.951Z,NA,NA | |
1090 (https://gitlab.com/libeigen/eigen/-/merge_requests/1090),Allow std::initializer_list constructors in constexpr expressions,"### What does this implement/fix? | |
Previously attempting to declare a constexpr Eigen::Matrix/Eigen::Array | |
would result in a compiler error, now it will succeed for fixed-size | |
matrices as well as dynamic-sized ones with fixed storage size if all | |
elements are initialized. | |
This works when targeting C++20 and some basic functionality is also | |
supported when using Clang targeting C++14/17, but GCC rejects declaring a | |
matrix as constexpr until C++20.",Alexander Richardson,2022-12-14T17:05:38.398Z,NA,NA | |
1135 (https://gitlab.com/libeigen/eigen/-/merge_requests/1135),Avoid using std::raise() for divide by zero,"### Reference issue | |
https://gitlab.com/libeigen/eigen/-/merge_requests/1076 | |
### What does this implement/fix? | |
Since commit 7b2901e2 there is a dependency | |
on std::raise(), but the header `<csignal>` might not exist or be usable on | |
embedded targets or operating systems where UNIX signals are not | |
implemented. Instead of calling std::raise(), we can force evaluation of | |
an integer division by zero. This should not result in any functional | |
changes on most systems, as that division will be translated to a SIGFPE | |
and the process exits with exit code 136. | |
We use volatile variables here which forces the compiler to evaluate the | |
expression even though it is unused: see https://godbolt.org/z/qYjdaq94s | |
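A minimal sketch of the idea, assuming a hypothetical helper named `divide_or_trap` (this is not Eigen's actual code): the volatile operands keep the compiler from folding or discarding the division, so the hardware trap still fires without including `<csignal>`.

```c++
#include <cassert>

// Hypothetical helper, not Eigen's actual code. Integer division by zero is
// undefined behavior in C++, so this relies on platform behavior (a SIGFPE on
// typical x86 Linux) rather than on a language guarantee.
inline int divide_or_trap(int numerator, int denominator) {
  if (denominator == 0) {
    volatile int zero = 0;                 // Opaque to the optimizer.
    volatile int trap = numerator / zero;  // Must be evaluated: the result is volatile.
    (void)trap;
    return 0;  // Only reached on platforms where the division does not trap.
  }
  return numerator / denominator;
}
```

On targets whose integer divide instruction does not trap, the zero-divisor branch simply falls through, which is why the sketch still returns a value there.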
### Additional information | |
This is required to use Eigen on embedded operating systems.",Alexander Richardson,2022-12-14T20:06:16.993Z,NA,NA | |
1137 (https://gitlab.com/libeigen/eigen/-/merge_requests/1137),"Use numext::signbit instead of std::signbit, which is not defined for bfloat16.",NA,Rasmus Munk Larsen,2022-12-15T19:33:19.408Z,NA,NA | |
1138 (https://gitlab.com/libeigen/eigen/-/merge_requests/1138),Update test of numext::signbit.,NA,Rasmus Munk Larsen,2022-12-15T20:21:10.477Z,NA,NA | |
1139 (https://gitlab.com/libeigen/eigen/-/merge_requests/1139),Add operators to CompressedStorageIterator,"### Reference issue | |
### What does this implement/fix? | |
Added all the comparison operators (couldn't hurt), `+=` and `-=`. There are many operators required to satisfy `RandomAccessIterator`, and the vast majority appear to be satisfied. There are a few exceptions -- the iterator doesn't satisfy all the requirements of `LegacyForwardIterator` as it is not `DefaultConstructible`. Let me know if something else breaks your tests. | |
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2022-12-16T16:48:51.459Z,NA,NA | |
1141 (https://gitlab.com/libeigen/eigen/-/merge_requests/1141),Enable NEON pabs for unsigned int types,"### What does this implement/fix? | |
The packet_traits for NEON `uint16_t`, `uint32_t`, `uint64_t` set `HasAbs = 0`. But, these types seem to have [pabs implemented](https://gitlab.com/libeigen/eigen/-/blob/master/Eigen/src/Core/arch/NEON/PacketMath.h#L2358) (just as an identity operation). | |
This MR just sets `HasAbs = 1` for these types. This also matches pabs in [GenericPacketMath.h](https://gitlab.com/libeigen/eigen/-/blob/master/Eigen/src/Core/GenericPacketMath.h#L547) since numext::abs is also an identity for unsigned types. | |
I think that ATM, if an expression of uint32_t matrices uses `.cwiseAbs()`, then the entire expression ends up not using Eigen's packet operations...? So HasAbs=1 should be better? | |
I guess this only matters (if at all) for some generic code, since people probably aren't using abs if they know they have an unsigned type :-)",Arthur,2022-12-19T18:12:33.894Z,NA,NA | |
1143 (https://gitlab.com/libeigen/eigen/-/merge_requests/1143),"Revert ""Avoid mixing types in CompressedStorage.h""",NA,Rasmus Munk Larsen,2022-12-19T20:09:38.048Z,NA,NA | |
1142 (https://gitlab.com/libeigen/eigen/-/merge_requests/1142),Fix incorrect NEON native fp16 multiplication.,"TensorFlow's tensor contractions currently fail on ARM hardware with native fp16 | |
support due to a bug in the specialized kernel implementation. | |
All the NEON GEBP specializations are a bit hacky, in that they replace the | |
RHS packet with a single scalar, then use special instructions to | |
perform a `Packet += Packet * Scalar`. In the case of native `__fp16`, | |
where we have a `Packet8h`, this broke an assumption that we can split | |
the left-hand packet into groups of 4 elements, then multiply by a RHS loaded | |
via `ploadquad`. The hack works for floats, since the packet size is | |
4, so `ploadquad` fills the packet with a single value, which we _can_ | |
mimic using multiplication by a single scalar. However, the assumption breaks | |
down when the packet size is 8. | |
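The packet-size dependence can be illustrated with a scalar model of `ploadquad`, which reads `size/4` values and repeats each one four times. This is only a sketch: `ploadquad_model` and `is_single_scalar_broadcast` are hypothetical names, and `float` lanes stand in for `__fp16`.

```c++
#include <array>
#include <cassert>
#include <cstddef>

// Scalar model of ploadquad for a packet of N lanes (N a multiple of 4):
// lane k receives from[k / 4], so each loaded value is repeated four times.
template <std::size_t N>
std::array<float, N> ploadquad_model(const float* from) {
  assert(N % 4 == 0);
  std::array<float, N> packet{};
  for (std::size_t lane = 0; lane < N; ++lane) packet[lane] = from[lane / 4];
  return packet;
}

// True when every lane holds the same value, i.e. when multiplying by the
// packet is equivalent to multiplying by a single scalar.
template <std::size_t N>
bool is_single_scalar_broadcast(const std::array<float, N>& packet) {
  for (std::size_t lane = 1; lane < N; ++lane)
    if (packet[lane] != packet[0]) return false;
  return true;
}
```

With 4 lanes the model yields `{a,a,a,a}`, which a scalar multiply can mimic. With 8 lanes it yields `{a,a,a,a,b,b,b,b}`, which it cannot.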
Put in a fallback in the general GEBP kernel to avoid `ploadquad` when | |
not feasible, and added an assertion to the NEON `__fp16` | |
specialization.",Antonio Sánchez,2022-12-19T20:46:44.891Z,NA,NA | |
1144 (https://gitlab.com/libeigen/eigen/-/merge_requests/1144),Fix up C++ version detection macros and cmake tests.,"Eigen was reporting the wrong c++ version for intermediate versions of | |
`__cplusplus`. Also disabling explicit c++17/c++20 constexpr tests, | |
since these are breaking on our CI. It looks like some versions of | |
clang are reporting `__cplusplus` as 20, but don't support all c++20 | |
features. | |
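One way to handle intermediate `__cplusplus` values, sketched with a hypothetical macro name (Eigen's own macros are not reproduced here), is to use range checks so that a value between two published standards maps to the older one:

```c++
#include <cassert>

// Hypothetical macro, for illustration only. A compiler that reports a value
// between 201703L and 202002L is treated as C++17, not C++20.
#if __cplusplus >= 202002L
#define SKETCH_CXX_VER 20
#elif __cplusplus >= 201703L
#define SKETCH_CXX_VER 17
#elif __cplusplus >= 201402L
#define SKETCH_CXX_VER 14
#elif __cplusplus >= 201103L
#define SKETCH_CXX_VER 11
#else
#define SKETCH_CXX_VER 3
#endif

// Runtime accessor so the detected version can be inspected or logged.
inline int sketch_cxx_version() { return SKETCH_CXX_VER; }
```

A version number alone still says nothing about whether a given feature is implemented, which is consistent with disabling the explicit constexpr tests.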
CI failures: https://gitlab.com/libeigen/eigen_ci_cross_testing/-/pipelines/725775539 | |
Fixes #2584",Antonio Sánchez,2022-12-20T18:06:04.427Z,NA,NA | |
1146 (https://gitlab.com/libeigen/eigen/-/merge_requests/1146),"Enable NEON pcmp, plset, and complex psqrt","### What does this implement/fix? | |
Another small MR: just enables NEON's complex `psqrt`. It works fine as far as I can tell, so I'm guessing it was just missed and that it's safe to enable? | |
Also noticed that `plset` isn't enabled. It seems to work with the nullary.cpp tests, so I'm assuming this is safe as well.",Arthur,2022-12-22T05:38:35.382Z,NA,NA | |
1145 (https://gitlab.com/libeigen/eigen/-/merge_requests/1145),Adjust thresholds for bfloat16 product tests that are currently failing.,"For bfloat16, the default epsilon for `areNotApprox` ends up being | |
relatively large, causing some tests to fail even though practically | |
the sides being compared _are_ significantly different. | |
Also, some of the `VERIFY_IS_APPROX` tests when comparing matrices that | |
are multiplications of 3 matrices are failing for bfloat16, since coefficients can | |
grow quite large, but we don't adjust the threshold. Doubling the | |
threshold seems to allow the test to pass reliably.",Antonio Sánchez,2022-12-28T19:32:26.383Z,NA,NA | |
1140 (https://gitlab.com/libeigen/eigen/-/merge_requests/1140),Patch SparseLU,"### Reference issue | |
#2582 Does not directly address root cause, but the generic dense GEMM kernel yields the expected result. The two kernels produce results that are very similar, and pass the `isApprox` test. However, after hundreds/thousands of iterations, these differences could be enough to influence the detection of a pivot in a badly conditioned matrix. As for the difference between Eigen 3.3 and 3.4? I chalk that up to changes in global behavior (blocking sizes in the triangular solvers, etc) that produce small numerical differences that can eventually influence a very sensitive parameter. | |
### What does this implement/fix? | |
Currently, Sparse LU has a custom dense GEMM kernel that appears fundamentally correct, but references a lot of dated code. For example: `Alignment = PacketSize>1 ? Aligned : 0`. Here, `Aligned` is deprecated and doesn't utilize alignments greater than 16. Also, this precludes Eigen from switching to a user's preferred BLAS backend (such as MKL) for dense GEMM kernels, among other Eigen features (CPU tuned blocking sizes) that have been added in the years since Gael contributed this code. | |
In `SparseLUTransposeView` there is a subtle bug that causes the SparseLU tests to fail. `APIBase()` calls the default constructor, which in turn sets `m_isInitialized = false`, even if `view` is initialized. | |
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2022-12-31T04:52:37.090Z,NA,NA | |
1149 (https://gitlab.com/libeigen/eigen/-/merge_requests/1149),Fixes git add . doesn't include scripts/buildtests.in,"### Reference issue | |
`git add .` doesn't include `scripts/buildtests.in`. | |
### What does this implement/fix? | |
In `.gitignore`, there is a rule that ignores `*build*` ([ref](https://gitlab.com/libeigen/eigen/-/blob/master/.gitignore#L15)). | |
Because of this rule, `scripts/buildtests.in` is excluded when we run `git add .`. | |
To fix this issue, this PR modifies `.gitignore` so that `scripts/buildtests.in` is exempt from that rule.",LAI Bruce,2023-01-03T17:06:36.896Z,NA,NA | |
1151 (https://gitlab.com/libeigen/eigen/-/merge_requests/1151),Fix EIGEN_HAS_CXX17_OVERALIGN for icc," | |
### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
Fixes #2575 | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2023-01-03T17:58:14.027Z,NA,NA | |
1155 (https://gitlab.com/libeigen/eigen/-/merge_requests/1155),Fix overalign check.,"EIGEN_COMP_ICC is always defined, having a value of 0 if we're not using | |
icc. Modified the check.",Antonio Sánchez,2023-01-05T17:10:49.610Z,NA,NA | |
1156 (https://gitlab.com/libeigen/eigen/-/merge_requests/1156),Fix a bunch of minor build and test issues.,"- `SPQRSupport` included a header with the wrong relative path | |
- `minmax` visitor is only vectorized if vectorized comparisons are | |
available | |
- `half_float`/`bfloat16_float` included a superfluous internal header | |
- `gpu_basic`/`gpu_example` didn't properly define `EIGEN_USE_GPU`, which meant | |
GPU packets weren't actually used | |
- `incomplete_cholesky`/`sparselu` included unused ""unsupported"" headers | |
- `sparse_extra` unnecessarily duplicated the original sparse_product test",Antonio Sánchez,2023-01-06T16:37:27.572Z,NA,NA | |
1153 (https://gitlab.com/libeigen/eigen/-/merge_requests/1153),Fix guard macros for emulated FP16 operators on GPU,"### What does this implement/fix? | |
This change fixes the following two issues: | |
- The macro guards for emulating FP16 operations have slightly different conditions for the push vs pop macro. This can result in an error when compiling with `__CUDA__` but `EIGEN_CUDACC` is not defined. These guards originally matched, but only the push_macro guard was updated in a [previous commit](https://gitlab.com/libeigen/eigen/-/commit/b08527b0c1ffdbd44347ca3a7869f10b0cb3cbb6#27a0d506a487d29a0bac9d54459393b0a6d2d673_318_318). | |
- The comment on [line 459](https://gitlab.com/RSenApps/eigen/-/blob/1235ad596cd35d8f4fecad344c9d46ba8f7f785d/Eigen/src/Core/arch/Default/Half.h#L459) claims that these emulated FP16 operations should be available for both HIP and CUDA, but `EIGEN_CUDACC` is used instead of `EIGEN_GPUCC`.",Ryan Senanayake,2023-01-06T22:02:51.826Z,NA,NA | |
1154 (https://gitlab.com/libeigen/eigen/-/merge_requests/1154),Improve performance for Power10 MMA bfloat16 GEMM,"Improve performance for Power10 MMA bfloat16 GEMM. | |
Includes packing for rank-2 friendly data, better indexing variables, elimination of MMA masking, improved edge handling, hardware bfloat16 conversions, fixes slowdown with LLVM, use of LinearMappers, general cleanup, etc. | |
It is now up to 61X faster than the generic GEMM code, and 2.3X (GCC) and 7-12X (LLVM) faster than the previous version.",Chip Kerchner,2023-01-06T23:08:38.173Z,NA,NA | |
1158 (https://gitlab.com/libeigen/eigen/-/merge_requests/1158),Modified spbenchsolver help message because it could be misunderstood," | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
The help message stated that the matrix should be named MatrixName.mtx and MatrixName_b.mtx. In the case of a SPD matrix, you have to name it MatrixName_SPD.mtx. I think it was not clear enough that as a consequence, the rhs should be named MatrixName_SPD_b.mtx. I actually made the mistake when trying to use spbenchsolver.",Robin Miquel,2023-01-07T21:35:46.786Z,NA,NA | |
1147 (https://gitlab.com/libeigen/eigen/-/merge_requests/1147),Overhaul Sparse Core," | |
### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
Various improvements to `SparseMatrix` and related classes. Minor improvements include: | |
- replace legacy code with STL algorithms where possible. This reduces bloat, is easier to maintain, and is probably faster | |
- use aligned maps now that buffers are aligned to make use of packet ops. | |
- replace a bunch of assignment loops with `smart_memmove` (and `moveChunk`, which now exclusively calls `smart_memmove`) | |
More substantial improvements include: | |
- reworked `setFromTriplets` to perform fewer passes over the triplets. Currently, a transposed and unordered copy of the matrix is constructed, and the matrix is ordered using the transposition assignment. This version constructs an unordered matrix, handles the duplicates, and sorts the inner indices after construction. It also performs one allocation (possibly on the stack) to track nonzeros and collapse duplicates instead of two. | |
- add `setFromSortedTriplets` which performs all steps in one pass (after scanning to determine allocation size) and uses no temporary storage to achieve the same outcome as `setFromTriplets`. A container (such as a `std::vector`) of triplets can be easily sorted using `std::sort` (as is demonstrated in the tests), so this function should be used whenever possible. | |
- reworked `conservativeResize` to use binary search and general cleanup. Currently, decreasing the inner size always uncompresses the matrix. In this version, the matrix is uncompressed (if not already) only if inner size is decreased and nonzeros are lost due to inner size change. Additionally, `data().resize()` is appropriately called to minimize subsequent reallocations. | |
- separate search and insertion functions in `insert`. Currently, `coeffRef` performs a binary search to find an element. If it does not exist, it calls `insert`, where the search is performed again. Added `insertAt`, `insertUncompressedAt` and `insertCompressedAt`, which use a previously determined insertion location from `insert` or `coeffRef`. Updated and deprecated `insertUncompressed` and `insertCompressed` in case anyone uses those. Will gladly delete. | |
- `prune` now works on uncompressed matrices, resolving a long standing TODO | |
`setFromTriplets` benchmarks (initialize large sparse matrix 50 times): | |
- setFromTriplets (old), shuffled triplets: 12s | |
- setFromTriplets (old), sorted triplets: 9s | |
- setFromTriplets (new), shuffled triplets: 12s | |
- setFromTriplets (new), sorted triplets: 3s | |
- setFromSortedTriplets, sorted triplets: 1.6s | |
While the new `setFromTriplets` is on par with the previous algorithm when the triplets are randomly shuffled, the performance gap widens when the triplets are sorted. The new `setFromSortedTriplets` performs even better, with no temporary storage, and thus should be the preferred method of initializing a sparse matrix. | |
There are many more fairly simple opportunities for improvement throughout the sparse module, but this MR will focus on code in `SparseMatrix.h`. | |
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2023-01-07T22:09:43.319Z,NA,NA | |
1159 (https://gitlab.com/libeigen/eigen/-/merge_requests/1159),Add missing header for GPU tests.,"Oops, accidentally deleted this earlier.",Antonio Sánchez,2023-01-09T19:45:07.065Z,NA,NA | |
1161 (https://gitlab.com/libeigen/eigen/-/merge_requests/1161),"Fix error: unused parameter 'tmp' [-Werror,-Wunused-parameter] on clang/32-bit arm","Fixes this error, when building on 32-bit arm with clang | |
``` | |
In file included from ../../qt6_local_build/eigen/eigen-3.4.0/Eigen/Dense:1: | |
In file included from ../../qt6_local_build/eigen/eigen-3.4.0/Eigen/Core:352: | |
../../qt6_local_build/eigen/eigen-3.4.0/Eigen/src/Core/arch/NEON/GeneralBlockPanelKernel.h:27:56: error: unused parameter 'tmp' [-Werror,-Wunused-parameter] | |
Packet4f& c, Packet4f& tmp, | |
```",Martin Burchell,2023-01-10T21:15:28.707Z,NA,NA | |
1160 (https://gitlab.com/libeigen/eigen/-/merge_requests/1160),change insert strategy," | |
### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
Previously, it was reported that repeatedly calling `insert` on a large compressed sparse matrix was slow. This is expected as the user is repeatedly performing sorted insertions. However, this appears to be a common usage pattern. | |
This MR changes `insertAtByOuterInner`, which pertains to `insert` and `coeffRef`, so that it now always uncompresses the matrix (a no-op if already uncompressed) and calls `insertUncompressedAtByOuterInner`. Users may still opt to call `insertCompressed()` if they do not want their matrix to be uncompressed, which can still be useful when inserting a few elements or only pushing to the back. | |
If `insertUncompressedAtByOuterInner` fails to find a vector with capacity, each vector's capacity will increase to a minimum of two to avoid future reallocations and reduce insertion times. This strategy can be tuned two ways: | |
- change `kReserveSizePerVector` from `2` to some other number, like `10`, or a complex expression like `reserve(IndexVector::AlignedMapType(outerSize(), innerNonZeroPtr())` (possibly double the reserve size of each vector!) | |
- instead of searching from `outer` to `outerSize()` for insertion capacity, limit the search e.g. `outer + 2` to avoid lengthy searches | |
This may remediate some issues for users calling `insert` in a loop. However, it is still best to call `reserve`, e.g. `reserve(VectorXi::Constant(outerSize(), 10))`, rather than rely on this heuristic. | |
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2023-01-11T06:24:50.362Z,NA,NA | |
1152 (https://gitlab.com/libeigen/eigen/-/merge_requests/1152),"Add template to specify QR permutation index type, Fix ColPivHouseholderQR Lapacke bindings"," | |
### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
Fixes #2586 | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
The permutation index type can now be specified for `ColPivHouseholderQR`, `FullPivHouseholderQR`, and `CompleteOrthogonalDecomposition`. The default type is either `int` or `lapack_int` if the lapacke bindings are used. Fixed the `ColPivHouseholderQR` lapacke bindings so that they are called when the matrix is passed by non-const reference, and when `lapack_int` is `int64_t`. Also fixed `determinant()` to produce the correct sign. Removed many of the macros to make debugging easier. | |
TODO: apply the same changes to `LU` and `SVD`. The changes are pretty simple, mostly copy and paste work. Would be a good project for someone who is starting out. | |
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2023-01-11T15:57:29.492Z,NA,NA | |
1150 (https://gitlab.com/libeigen/eigen/-/merge_requests/1150),Altivec fixes for Darwin: do not use unsupported VSX insns,"The existing macros that check for the `__VSX__` define do not work correctly on macOS: GCC thinks VSX is available, but the ISA used in 7450 and 970 CPUs does not support those insns (they were introduced in ISA v2.06, whereas the 970 supports the ISA only up to v2.03). | |
Example of why it matters (and how I discovered the problem in the first place): `nanoflann` examples use Eigen headers and invoke VSX when the build is for ppc32 on macOS, see: https://trac.macports.org/ticket/66602 | |
Additional PPC-related fix: for macOS ppc64 to be recognized, either `__ppc64__` or `__POWERPC__` (which encompasses both `__ppc__` and `__ppc64__`) should be specified. The latter is perhaps preferable, since it would also cover BeOS. | |
(I am not sure if any non-Apple OS uses `__ppc__`; if not, `__POWERPC__` is sufficient.)",Sergey Fedorov,2023-01-12T16:33:34.323Z,NA,NA | |
1162 (https://gitlab.com/libeigen/eigen/-/merge_requests/1162),"Fix QR, again",This is a rollback of https://gitlab.com/libeigen/eigen/-/commit/6156797016164b87b3e360e02d0e4107f7f66fbc after fixing a build error due to conflicting definitions of `StorageIndex`.,Charles Schlosser,2023-01-13T03:23:18.268Z,NA,NA | |
1167 (https://gitlab.com/libeigen/eigen/-/merge_requests/1167),avoid move assignment in ColPivHouseholderQR," | |
### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
Fussy compilers may not like initializing variables with the move assignment, e.g. `m_colsPermutation = PermutationType(cols)`. This can be circumvented by simply calling `m_colsPermutation.resize(cols);`, which serves the same purpose. | |
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2023-01-15T01:34:10.980Z,NA,NA | |
1165 (https://gitlab.com/libeigen/eigen/-/merge_requests/1165),Add missing EIGEN_DEVICE_FUNC in a few places when called by asserts.,"Also removed an old gcc 4.7 workaround which is UB anyways, silenced some pedantic warnings when internal asserts are enabled, and added a missing `inline` specifier.",Antonio Sánchez,2023-01-15T02:06:17.785Z,NA,NA | |
1164 (https://gitlab.com/libeigen/eigen/-/merge_requests/1164),improve sparse permutations," | |
### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
Currently, sparse permutations are performed in the following manner: | |
1. the right hand side expression is evaluated into a sparse matrix temporary (allocation) | |
2. the result is created in a new sparse matrix temporary (allocation) | |
3. the result is **assigned** to the destination (allocation) | |
If the inverse permutation is requested, then the inverse permutation is computed (allocation) | |
If the inner indices are permuted, the **transpose** of the matrix is constructed so that the transpose assignment implicitly sorts the new inner indices, which requires random access and is considerably slower. | |
This MR seeks to reduce the number of allocations and improve performance. The new sequence of events is: | |
1. if the right hand side is a plain sparse object, just reference it. Otherwise, evaluate it into a temporary (no allocation in many cases) | |
2. the result is created in a new sparse matrix temporary (allocation) | |
3. the result is **moved** to the destination (no allocation if the objects are compatible, i.e. storage orders match) | |
If the inner indices are permuted, and the inverse permutation is requested, then the inverse permutation is computed (allocation in one of the four use cases). | |
If the inner indices are permuted, the elements with their updated indices are inserted into the result in an unsorted fashion. The indices are sorted in place after the matrix is finalized. | |
In summary, the previous strategy always made 3 copies of the matrix, and 50% of the use cases (outer/inner inverse permutation) required a copy of the permutation. | |
Now, permuting a plain matrix involves 1 copy, and 25% of use cases (inner inverse) require a copy of the permutation. In all cases, the permuted matrix is now constructed in a manner that allows contiguous chunks of data to be copied, instead of elements one-at-a-time in random order. | |
| Operation | Before | After | % Change | | |
| ------ | ------ | ----- | ----- | | |
| Outer | 7196 | 3847 | -47% | | |
| Inner | 16125 | 4069 | -75% | | |
| Inverse Outer | 6895 | 4433 | -36% | | |
| Inverse Inner | 15918 | 4257 | -73% | | |
Note that the largest improvement is due to avoiding the construction of the transposed matrix (inner permutations). All cases benefit from two fewer copies of the matrix (reference `xpr`, move `result`). | |
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2023-01-15T03:21:25.855Z,NA,NA | |
1168 (https://gitlab.com/libeigen/eigen/-/merge_requests/1168),Support per-thread is_malloc_allowed() state," | |
### Reference issue | |
#2576 | |
### What does this implement/fix? | |
This merge request makes the state of `is_malloc_allowed()` thread-local, so it can be used in multi-threaded applications without data races. | |
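The idea can be sketched independently of Eigen as follows. All names here (`mallocAllowedFlag`, `isMallocAllowed`, `setMallocAllowed`, `workerSeesOwnState`) are hypothetical stand-ins and not the library's internal functions. The sketch only shows that a `thread_local` flag changed on one thread is not observed on another.

```cpp
#include <thread>

// Hypothetical stand-in for the internal flag: one copy per thread.
inline bool& mallocAllowedFlag() {
  static thread_local bool allowed = true;
  return allowed;
}

inline bool isMallocAllowed() { return mallocAllowedFlag(); }
inline void setMallocAllowed(bool allowed) { mallocAllowedFlag() = allowed; }

// Forbids allocation on a worker thread and reports what that worker saw.
// The calling thread's own flag is left untouched.
inline bool workerSeesOwnState() {
  bool seenByWorker = true;
  std::thread worker([&seenByWorker] {
    setMallocAllowed(false);
    seenByWorker = isMallocAllowed();
  });
  worker.join();
  return seenByWorker;
}
```

With a single global variable instead, the worker's write would also be visible on the calling thread and would race with concurrent readers.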
### Additional information | |
By default, the `is_malloc_allowed()` state is now `thread_local` on systems | |
that support it. To disable this behavior, and to use a single global | |
variable instead (e.g. for single-threaded applications), the compiler | |
flag `-DEIGEN_AVOID_THREAD_LOCAL=1` can be used.",tttapa,2023-01-16T01:34:57.246Z,NA,NA | |
1126 (https://gitlab.com/libeigen/eigen/-/merge_requests/1126),[SYCL-2020 Support] Enabling Intel DPCPP Compiler support to Eigen," | |
### Additional information | |
<!--Any additional information you think is important.--> | |
This PR enables the [Intel open-source DPCPP compiler](https://github.com/intel/llvm) for the Eigen SYCL backend. | |
C++17 features have been enabled, as they are required by SYCL-2020 features. | |
Using this backend, it is possible to compile and run the Eigen SYCL backend on a set of accelerators including (but not limited to) Intel CPUs, Intel GPUs, NVIDIA GPUs, etc.",Mehdi Goli,2023-01-16T07:04:08.926Z,NA,NA | |
1136 (https://gitlab.com/libeigen/eigen/-/merge_requests/1136),issue #2581: review and cleanup of compiler version checks,NA,Sean McBride,2023-01-17T18:58:35.404Z,NA,NA | |
1169 (https://gitlab.com/libeigen/eigen/-/merge_requests/1169),Replace the Deprecated `$<CONFIGURATION>` with `$<CONFIG>`,"`$<CONFIGURATION>` has been deprecated since CMake 3.0. I was going through our codebase and noticed that Eigen uses it as well, so I made an MR for it too! :slight_smile: | |
### What does this implement/fix? | |
Replaces a deprecated cmake generator expression with its replacement, `$<CONFIG>`.",Amir Masoud Abdol,2023-01-17T19:44:33.472Z,NA,NA | |
1166 (https://gitlab.com/libeigen/eigen/-/merge_requests/1166),Add custom ODR-safe assert.,"This is in response to an issue encountered when compiling with C++20 modules. | |
It turns out that the use of `__FILE__` or `assert` in headers can lead | |
to ODR violations, since the filename is expanded to a string token | |
which differs depending on the relative path used to include a given | |
header. Normally, the compiler would ignore this, and select one of the | |
implementations when linking. However, with C++20 modules, these ODR | |
violations become breaking compile errors. | |
The work-around is to write a custom assert that uses `__builtin_FILE()` | |
when available. | |
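A minimal sketch of the mechanism, assuming a compiler that provides `__builtin_FILE()` and `__builtin_LINE()` (recent GCC and Clang): because the location is captured through default arguments, the header text contains no `__FILE__` expansion and is token-identical in every translation unit. The names `sketchAssert` and `callSiteFile` are hypothetical, and this sketch simply calls `std::abort()` rather than `__assert_fail()`.

```cpp
#include <cstdio>
#include <cstdlib>
#include <string>

// Hypothetical assert helper, not Eigen's implementation. The default
// arguments are evaluated at the call site, so they report the caller's
// file and line without expanding __FILE__ inside this header.
inline void sketchAssert(bool condition,
                         const char* file = __builtin_FILE(),
                         int line = __builtin_LINE()) {
  if (condition) return;
  std::fputs(file, stderr);
  std::fputc(':', stderr);
  std::fputs(std::to_string(line).c_str(), stderr);
  std::fputc('\n', stderr);
  std::abort();
}

// Exposes the captured call-site file name so the mechanism can be inspected.
inline const char* callSiteFile(const char* file = __builtin_FILE()) {
  return file;
}
```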
The implementation here tries to use `__builtin_FILE()` and | |
`__assert_fail()` when available. This allows the safe use of asserts | |
for GCC, MSVC, Clang, and NVCC (after CUDA 11).",Antonio Sánchez,2023-01-20T17:38:14.820Z,NA,NA | |
1170 (https://gitlab.com/libeigen/eigen/-/merge_requests/1170),Fix sparse insert," | |
### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
The sparse product test revealed a subtle unintended behavior. First, I thought the issue was limited to `StorageIndex` overflowing, which is always a potential issue. However, the actual issue is that uncompressed matrices may have unused capacity in a) the buffer to the right of `data().size()` and b) the inactive nonzeros to the left of `data().size()`. Previously, I was only checking for a). Now, I also check for b) by searching to the left as well. This simplifies the reserve strategy, as now insert exhausts all options before reserving nonzeros. | |
Added a check for fast push-back to `coeffRef` and `insert`, which helps with certain access patterns. | |
`makeCompressed()` now attempts to make fewer, larger copies of contiguous data, though I haven't been able to design a test that demonstrates any difference in performance. | |
I also addressed a very irritating issue where every empty sparse matrix held allocated memory for the outer indices (which is completely worthless and has to be reallocated for any resize). | |
``` | |
SparseMatrix<float> A; // constructor calls resize(0,0), which allocates memory for m_outerIndex (m_outerSize + 1 == 1) | |
A.resize(1,1); // must reallocate the memory at m_outerIndex | |
``` | |
Diagonal assignment with coeffRef (mostly to prove there was no regression for push-to-back access pattern) | |
``` | |
int diagSize = 500'000'000; | |
SparseMatrix<double> A(diagSize, diagSize); | |
for (int j = 0; j < diagSize; j++) | |
A.coeffRef(j, j) = j; | |
``` | |
Before: 5797 ms | |
After: 4948 ms | |
Random insertion with coeffRef: | |
``` | |
int size = 50'000; | |
SparseMatrix<double> A(size, size); | |
for (int nz = 0; nz < size ; nz++) | |
{ | |
int i = internal::random(0, size - 1); | |
int j = internal::random(0, size - 1); | |
A.coeffRef(i, j) = i*j; | |
} | |
``` | |
Before: 2056 ms | |
After: 25 ms | |
`insert/coeffRef` now appears to scale well with the size of the matrix. | |
`int size = 500'000;` | |
Before: 244,838 ms | |
After: 488 ms | |
Random insertion into a denser matrix using coeffRef: | |
``` | |
int size = 10'000; | |
SparseMatrix<double> A(size, size); | |
for (int nz = 0; nz < size ; nz++) | |
{ | |
for (int nz2 = 0; nz2 < size/100; nz2++) | |
{ | |
int i = internal::random(0, size - 1); | |
int j = internal::random(0, size - 1); | |
A.coeffRef(i, j) = i*j; | |
} | |
} | |
``` | |
Before: 21,239 ms | |
After: 1764 ms | |
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2023-01-20T21:32:33.582Z,NA,NA | |
1172 (https://gitlab.com/libeigen/eigen/-/merge_requests/1172),Refactor sparse,"### Reference issue | |
### What does this implement/fix? | |
Refactor `SparseMatrix.h` to reference class members directly instead of encapsulation wrappers for consistency. E.g. `m_outerIndex` in lieu of `outerIndexPtr()`. | |
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2023-01-23T17:55:51.592Z,NA,NA | |
1173 (https://gitlab.com/libeigen/eigen/-/merge_requests/1173),Revert qr tests,"### Reference issue | |
### What does this implement/fix? | |
`boostmultiprec.cpp` (perhaps others) refers to tests defined in `qr_colpivoting.cpp` and `qr_fullpivoting.cpp`, which were changed to include a permutation index type template parameter. The original tests (without the static assert that the permutation index must be `int`) will work as the default permutation index type is defined in `FwdDeclarations.h`. | |
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2023-01-23T22:23:09.174Z,NA,NA | |
1174 (https://gitlab.com/libeigen/eigen/-/merge_requests/1174),Fix slowdown in bfloat16 MMA when rows is not a multiple of 8 or columns is not a multiple of 4.,"Fixed significant slowdown with bfloat16 MMA when rows is not a multiple of 8 or columns is not a multiple of 4 - 50% slower for columns (RHS) and 5-6% for rows (LHS). Required rewriting of packing and processing (only in extra areas). | |
Packing was RowMajor in extra areas. Change to be ColMajor so that a simple pload_partial can be used instead of element by element creation of the packet.",Chip Kerchner,2023-01-25T18:22:21.219Z,NA,NA | |
1175 (https://gitlab.com/libeigen/eigen/-/merge_requests/1175),Tweak atan2,"### Reference issue | |
Partially addresses #2597 | |
### What does this implement/fix? | |
The corner cases of `atan2` as documented by https://en.cppreference.com/w/cpp/numeric/math/atan2 are numerous. However, most of these can be handled by `atan` if the argument is conditioned properly. Consider bending two rules of arithmetic that otherwise produce indeterminate results: | |
- 0 / 0 = 0 | |
- inf / inf = 1 | |
If `arg = y/x` (with the sign applied), then `atan(arg)` will produce the correct results, which are then shifted to the appropriate quadrant. Strict equality comparisons also preserve `nan` arguments. This simplifies the implementation and avoids a lot of work already being performed by `atan`. | |
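The conditioning described above can be illustrated with a scalar sketch. This is only an assumption-laden illustration, not the packet code from this MR: `atan2_sketch` and its local names are hypothetical, and the quadrant shift via `pi - r` is just one way to do it.

```cpp
#include <cmath>

// Hypothetical scalar sketch of the idea; not Eigen's patan2.
double atan2_sketch(double y, double x) {
  const double pi = 3.14159265358979323846;
  const double ay = std::fabs(y);
  const double ax = std::fabs(x);
  double arg;
  if (ay == 0.0 && ax == 0.0) {
    arg = 0.0;  // bend the rule: 0 / 0 = 0
  } else if (std::isinf(ay) && std::isinf(ax)) {
    arg = 1.0;  // bend the rule: inf / inf = 1
  } else {
    arg = ay / ax;  // NaN inputs fail both comparisons and propagate here
  }
  double r = std::atan(arg);          // first-quadrant angle in [0, pi/2]
  if (std::signbit(x)) r = pi - r;    // negative x (including -0): second quadrant
  return std::copysign(r, y);         // lower half-plane takes the sign of y
}
```

With these two substitutions, cases such as `(0, -0)` and `(inf, -inf)` fall out of the ordinary `atan` path without dedicated branches for each corner case.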
Additionally, `atan2` is now present in `numext` (which uses `std::atan2`) and has a corresponding packet function `patan2`. `scalar_atan2_op` has been configured to reference either `numext::atan2` or `patan2` per Eigen convention. | |
Also, there was a weird bug (at least in MSVC) that prevented `binary_ops_test()` from working correctly. Passing `x` and `y` as `auto` produced random garbage, causing the tests to fail. Changing it to `const auto&` fixed it. | |
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2023-01-26T17:38:21.974Z,NA,NA | |
1179 (https://gitlab.com/libeigen/eigen/-/merge_requests/1179),Turn off vectorize version of rsqrt - doesn't match generic version,Turn off vectorize version of rsqrt - vec_rsqrt doesn't match the generic version.,Chip Kerchner,2023-01-27T18:28:54.800Z,NA,NA | |
1181 (https://gitlab.com/libeigen/eigen/-/merge_requests/1181),Fix bugs exposed by enabling GPU asserts.,The convolution issue is a true bug :S.,Antonio Sánchez,2023-01-27T21:43:01.775Z,NA,NA | |
1178 (https://gitlab.com/libeigen/eigen/-/merge_requests/1178),Fix sparse warnings,"### Reference issue | |
### What does this implement/fix? | |
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2023-01-27T22:47:42.784Z,NA,NA | |
1176 (https://gitlab.com/libeigen/eigen/-/merge_requests/1176),Optimize various mathematical packet ops,"### Reference issue | |
Fixes #2599 | |
### What does this implement/fix? | |
- fixed atan(-0) = -0. | |
- fixed binary pow = -0 issues | |
- optimized atan2 | |
- optimized acos, use new polynomial that is symmetric about 0 | |
- added signbit (ignoring nan) test for binary pow, atan2, and unary pow -- all of which pass now | |
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2023-01-28T01:34:27.375Z,NA,NA | |
1180 (https://gitlab.com/libeigen/eigen/-/merge_requests/1180),Fix stupid sparse bugs with outerSize == 0,"### Reference issue | |
### What does this implement/fix? | |
Calling `nonZeros()` on an empty sparse matrix could segfault since `m_outerIndex` is null. There are many more cases where this could be an issue: `makeCompressed`, the various flavors of `insert`, the list goes on. Reverted change so that a sparse matrix with `outerSize == 0` allocates a size-1 array. | |
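To show why a size-1 outer index array sidesteps the null dereference, here is a toy model of compressed storage. It is a sketch under assumptions: `ToyCompressed` and its members are hypothetical and only mimic the idea of reading the nonzero count from the outer index array; they are not Eigen's actual classes.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical toy model; Eigen's SparseMatrix manages raw arrays instead.
struct ToyCompressed {
  // Always holds outer_size + 1 entries, so an empty matrix still owns one.
  std::vector<int> outer_index;

  explicit ToyCompressed(std::size_t outer_size)
      : outer_index(outer_size + 1, 0) {}

  std::size_t outer_size() const { return outer_index.size() - 1; }

  // Reads the last and first outer index entries; with outer_size == 0
  // both refer to the single allocated element, so this is well defined.
  int non_zeros() const { return outer_index[outer_size()] - outer_index[0]; }
};
```

In this model, every routine that indexes `outer_index[outer_size()]` keeps working for an empty matrix without a special case.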
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2023-01-28T02:03:09.773Z,NA,NA | |
1184 (https://gitlab.com/libeigen/eigen/-/merge_requests/1184),Fix pre-POWER8_VECTOR bugs in pcmp_lt and pnegate and reactivate psqrt.,Fix pre-POWER8_VECTOR bugs in pcmp_lt and pnegate and reactivate psqrt.,Chip Kerchner,2023-01-31T19:40:24.677Z,NA,NA | |
1183 (https://gitlab.com/libeigen/eigen/-/merge_requests/1183),Fix undefined behavior in Block access,"### Reference issue | |
Fixes #2482. | |
### What does this implement/fix? | |
Avoids undefined behavior in Block access. | |
Pointer arithmetic on null pointers is undefined behavior in C++. | |
See the issue for more details. | |
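A minimal sketch of the kind of guard that avoids offsetting a null pointer. The helper name `offset_or_null` is hypothetical and this is not the patch itself, only an illustration of the pattern.

```cpp
#include <cstddef>

// Hypothetical helper: only perform pointer arithmetic on non-null bases.
template <typename T>
T* offset_or_null(T* base, std::ptrdiff_t offset) {
  // Adding a nonzero offset to a null pointer is undefined behavior,
  // so return null unchanged instead of computing base + offset.
  return base == nullptr ? nullptr : base + offset;
}
```

For example, a view over an empty object whose data pointer is null would keep a null pointer rather than a nonsense address.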
### Additional information | |
I've put this through its paces in my own codebase. | |
In [this](https://github.com/RobotLocomotion/drake/pull/18700) merge request I've tested with and without the patch. | |
Without the patch [here](https://drake-jenkins.csail.mit.edu/job/linux-jammy-clang-bazel-experimental-undefined-behavior-sanitizer/3/consoleFull) and [here](https://drake-jenkins.csail.mit.edu/job/linux-focal-clang-bazel-experimental-undefined-behavior-sanitizer/39/consoleFull) you can see the handful of UBSan errors. With the patch, those all go away [here](https://drake-jenkins.csail.mit.edu/job/linux-jammy-clang-bazel-experimental-undefined-behavior-sanitizer/2/) and [here](https://drake-jenkins.csail.mit.edu/job/linux-focal-clang-bazel-experimental-undefined-behavior-sanitizer/38/).",Jeremy Nimmer,2023-02-01T00:40:46.229Z,NA,NA | |
1185 (https://gitlab.com/libeigen/eigen/-/merge_requests/1185),Tweak special case handling in atan2.,This fixes a test failure in TensorFlow with Clang.,Rasmus Munk Larsen,2023-02-01T02:46:15.584Z,NA,NA | |
1186 (https://gitlab.com/libeigen/eigen/-/merge_requests/1186),Update file ForwardDeclarations.h,"### Reference issue | |
### What does this implement/fix? | |
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2023-02-01T17:19:15.760Z,NA,NA | |
1188 (https://gitlab.com/libeigen/eigen/-/merge_requests/1188),"Revert StlIterators edit from ""Fix undefined behavior...""",As discussed here: https://gitlab.com/libeigen/eigen/-/merge_requests/1183#note_1261831078,Jeremy Nimmer,2023-02-01T20:01:36.670Z,NA,NA | |
1190 (https://gitlab.com/libeigen/eigen/-/merge_requests/1190),Use VERIFY_IS_EQUAL to compare to zeros.,NA,Rasmus Munk Larsen,2023-02-01T22:15:11.808Z,NA,NA | |
1191 (https://gitlab.com/libeigen/eigen/-/merge_requests/1191),fix lapacke config,"### Reference issue | |
### What does this implement/fix? | |
Include `lapacke.h` before `ForwardDeclarations.h` so that `lapack_int` can be defined consistently with the external lapack library. If MKL is used, we use their config file. Otherwise, we use the one included with Eigen (has include guard `_LAPACKE_H_`). Users may include an optional `lapacke_config.h` with `HAVE_LAPACK_CONFIG_H` to override the defaults. | |
Change `lapacke.h` to define (default) complex types as `std::complex<float>` , `std::complex<double>`. Add option to define `lapack_int` as `int64_t` with `LAPACK_ILP64` switch (per https://netlib.org/lapack/lapacke_config.h). I specifically placed the macro guards in `Core` below `#include <complex>` so we don't need to worry about the complex headers being included. | |
Fix QR bindings to use `std::complex` instead of `scomplex` as those are specific to MKL and not Eigen in general. | |
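The `LAPACK_ILP64` switch mentioned above can be pictured with the following sketch. The names `lapack_int_sketch` and `kUsesIlp64` are hypothetical, and the fallback to `int` is an assumption; the real header defines `lapack_int` itself.

```cpp
#include <cstdint>

// Hypothetical stand-in for the lapack_int selection.
#ifdef LAPACK_ILP64
using lapack_int_sketch = std::int64_t;  // 64-bit integer interface
constexpr bool kUsesIlp64 = true;
#else
using lapack_int_sketch = int;           // assumed default integer width
constexpr bool kUsesIlp64 = false;
#endif
```

Compiling with `-DLAPACK_ILP64` would then select the 64-bit integer type, which has to match the integer width of the external LAPACK library being linked.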
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2023-02-03T16:40:09.710Z,NA,NA | |
1192 (https://gitlab.com/libeigen/eigen/-/merge_requests/1192),More EIGEN_DEVICE_FUNC fixes for CUDA 10/11/12.,"Silly NVCC, breaking in different ways for different versions. Also cleaned up a few warnings.",Antonio Sánchez,2023-02-03T19:18:46.291Z,NA,NA | |
1189 (https://gitlab.com/libeigen/eigen/-/merge_requests/1189),Fixes #2602,"### Reference issue | |
#2602 | |
### What does this implement/fix? | |
Add missing `EIGEN_DEVICE_FUNC` qualifiers inside the Assignment struct for `SkewSymmetricDense`, to enable usage of SkewSymmetric<?> on CUDA. | |
### Additional information | |
Without this, SkewSymmetric<?> doesn't work inside CUDA kernels.",Gregory Kramida,2023-02-06T22:52:40.489Z,NA,NA | |
1198 (https://gitlab.com/libeigen/eigen/-/merge_requests/1198),Change in Power eigen_asserts to eigen_internal_asserts since it is putting unnecessary error checking and assertions without NDEBUG.,Change in Power eigen_asserts to eigen_internal_asserts since it is putting unnecessary error checking and assertions without NDEBUG.,Chip Kerchner,2023-02-08T00:57:31.649Z,NA,NA | |
1197 (https://gitlab.com/libeigen/eigen/-/merge_requests/1197),Remove LGPL Code and references.,"LGPL is fundamentally incompatible with MPL2, and for a heavily | |
templated header-only library like Eigen, it essentially forces all | |
users to license their work under LGPL (thereby acting more like GPL). | |
Specifically, for LGPLv3, users are only able to relicense their work | |
under a different license if | |
> ... the incorporated material is not limited to numerical parameters, | |
> data structure layouts and accessors, or small macros, inline functions and | |
> templates (ten or fewer lines in length) | |
The LGPL code in Eigen contained the entire logic in large templates and | |
inline functions. | |
The spirit of the LGPL license is that downstream users can swap out | |
the library with their own version if desired. This will clearly not be | |
possible with Eigen in general. The only way to adhere to this would be | |
to write a separate interface library licensed as LGPL. | |
The existing LGPL code is limited in scope, hasn't been modified in | |
a while, and doesn't seem to be heavily used. Therefore we have decided | |
to remove it. Anybody wanting to restore the missing module can use | |
an older version of Eigen, or can easily copy the removed components | |
locally.",Antonio Sánchez,2023-02-08T01:25:06.853Z,NA,NA | |
1200 (https://gitlab.com/libeigen/eigen/-/merge_requests/1200),Get rid of custom implementation of equal_to and not_equal_to. No longer needed with C++14.,Small cleanup.,Rasmus Munk Larsen,2023-02-08T17:13:54.275Z,NA,NA | |
1199 (https://gitlab.com/libeigen/eigen/-/merge_requests/1199),Add IWYU export pragmas to top-level headers.,"A lot of tooling seems to depend on these. In particular, clang-tidy | |
and clangd, which can use IWYU configs to auto-suggest which headers | |
to include for a given symbol, and for removing unnecessary headers.",Antonio Sánchez,2023-02-08T17:40:32.585Z,NA,NA | |
1202 (https://gitlab.com/libeigen/eigen/-/merge_requests/1202),Fix MSVC arm build.,"There are two main issues with MSVC+ARM: | |
1) All intrinsic functions are macros, which complicates calling other | |
functions in arguments (e.g. use of commas, <>) | |
2) All vector types are actually aliases of one of `_n64`, `_n128`, so | |
we need to use `eigen_packet_wrapper` and add some explicit | |
conversions. | |
Fixes #2607.",Antonio Sánchez,2023-02-08T21:46:38.739Z,NA,NA | |
1206 (https://gitlab.com/libeigen/eigen/-/merge_requests/1206),Update file ColPivHouseholderQR_LAPACKE.h,"### Reference issue | |
Fixes #2612 | |
### What does this implement/fix? | |
Use `translate_type_imp` to handle replacement of std::complex with lapacke complex types. Declare lapacke specializations with std::complex. | |
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2023-02-09T14:40:04.497Z,NA,NA | |
1207 (https://gitlab.com/libeigen/eigen/-/merge_requests/1207),Optimize psign,"### Reference issue | |
### What does this implement/fix? | |
Optimizes the generic sign function for floating point types. | |
Previously (float): | |
- 3 constants | |
- 3 comparisons | |
- 3 logicals | |
- 1 blend | |
Now: | |
- 2 constants | |
- 1 comparison | |
- 2 logicals (+1 absolute value) = ~3 logicals | |
- 1 blend | |
AVX2 disassembly for float | |
``` | |
Eigen::internal::test_old(float __vector(8)): | |
vxorps xmm1, xmm1, xmm1 | |
vcmpps ymm3, ymm0, ymm0, 0 | |
vbroadcastss ymm4, DWORD PTR .LC1[rip] | |
vcmpps ymm2, ymm1, ymm0, 17 | |
vcmpps ymm1, ymm0, ymm1, 17 | |
vandps ymm2, ymm2, ymm4 | |
vbroadcastss ymm4, DWORD PTR .LC3[rip] | |
vandps ymm1, ymm1, ymm4 | |
vorps ymm1, ymm2, ymm1 | |
vblendvps ymm0, ymm0, ymm1, ymm3 | |
ret | |
``` | |
``` | |
Eigen::internal::test_new(float __vector(8)): | |
vbroadcastss ymm1, DWORD PTR .LC5[rip] | |
vbroadcastss ymm3, DWORD PTR .LC1[rip] | |
vandps ymm2, ymm0, ymm1 | |
vandnps ymm1, ymm1, ymm0 | |
vxorps xmm0, xmm0, xmm0 | |
vcmpps ymm0, ymm0, ymm2, 17 | |
vorps ymm1, ymm1, ymm3 | |
vblendvps ymm0, ymm2, ymm1, ymm0 | |
ret | |
``` | |
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2023-02-09T22:15:27.577Z,NA,NA | |
1201 (https://gitlab.com/libeigen/eigen/-/merge_requests/1201),Fix ODR violation with `gemm_extra_cols` on PPC,"When `EIGEN_ALTIVEC_MMA_DYNAMIC_DISPATCH` is enabled, `gemmMMA` will | |
instantiate `gemm_extra_cols` again (it is first instantiated in `gemm`). | |
However, as `MatrixProductMMA.h` enables `#pragma GCC target(""cpu=power10,htm"")`, | |
this leads to a different binary version of the function with the same name, | |
which is an ODR violation. | |
This in turn leads to crashes in `gemm->gemm_extra_cols` when it happens | |
to use the MMA instantiation. | |
Solve this by using a new name for the function, similar to | |
`gemmMMA_cols` vs `gemm_cols`. | |
Fixes #2608",Alexander Grund,2023-02-09T22:16:07.646Z,NA,NA | |
1208 (https://gitlab.com/libeigen/eigen/-/merge_requests/1208),Revert ODR changes and make gemm_extra_cols and gemm_complex_extra_cols EIGEN_ALWAYS_INLINE to avoid external functions.,Revert ODR changes and make gemm_extra_cols and gemm_complex_extra_cols EIGEN_ALWAYS_INLINE to avoid external functions.,Chip Kerchner,2023-02-10T17:05:07.963Z,NA,NA | |
1210 (https://gitlab.com/libeigen/eigen/-/merge_requests/1210),Fold extra column calculations into an extra MMA accumulator and other bfloat16 MMA GEMM improvements,"Fold extra column calculations into an extra MMA accumulator - up to 10% faster. | |
Increase number of MMA accumulators from 7 to 8 - up to 5-10% faster. | |
Misc other minor improvements.",Chip Kerchner,2023-02-10T17:32:07.505Z,NA,NA | |
1209 (https://gitlab.com/libeigen/eigen/-/merge_requests/1209),Print diagonal matrix,"### Reference issue | |
Allows users to print diagonal matrix expressions without assigning them to a dense object. Cute utility that may be useful for debugging. Surprisingly, it is remarkably flexible and incurs no allocations (provided the expression itself does not). | |
Example input: | |
``` | |
#define EIGEN_RUNTIME_NO_MALLOC  // enables internal::set_is_malloc_allowed
#include <Eigen/Dense>
#include <iostream>
using namespace Eigen;
int main()
{
  VectorXf a(10);
  // from here on, any heap allocation trips an Eigen assertion
  internal::set_is_malloc_allowed(false);
  a.setRandom();
  // plain vectors
  std::cout << a.asDiagonal() << ""\n\n"";
  // simple expressions
  std::cout << VectorXf::EqualSpaced(10, -2.0f, 1.5f).asDiagonal() << ""\n\n"";
  // slightly more complicated expressions
  std::cout << MatrixXf::Random(10, 10).diagonal(-1).asDiagonal() << ""\n\n"";
}
``` | |
Example output: | |
``` | |
-0.997497 0 0 0 0 0 0 0 0 0 | |
0 0.127171 0 0 0 0 0 0 0 0 | |
0 0 -0.613392 0 0 0 0 0 0 0 | |
0 0 0 0.617481 0 0 0 0 0 0 | |
0 0 0 0 0.170019 0 0 0 0 0 | |
0 0 0 0 0 -0.0402539 0 0 0 0 | |
0 0 0 0 0 0 -0.299417 0 0 0 | |
0 0 0 0 0 0 0 0.791925 0 0 | |
0 0 0 0 0 0 0 0 0.64568 0 | |
0 0 0 0 0 0 0 0 0 0.49321 | |
-2 0 0 0 0 0 0 0 0 0 | |
0 -0.5 0 0 0 0 0 0 0 0 | |
0 0 1 0 0 0 0 0 0 0 | |
0 0 0 2.5 0 0 0 0 0 0 | |
0 0 0 0 4 0 0 0 0 0 | |
0 0 0 0 0 5.5 0 0 0 0 | |
0 0 0 0 0 0 7 0 0 0 | |
0 0 0 0 0 0 0 8.5 0 0 | |
0 0 0 0 0 0 0 0 10 0 | |
0 0 0 0 0 0 0 0 0 11.5 | |
-0.668203 0 0 0 0 0 0 0 0 | |
0 0.97705 0 0 0 0 0 0 0 | |
0 0 -0.108615 0 0 0 0 0 0 | |
0 0 0 -0.761834 0 0 0 0 0 | |
0 0 0 0 -0.990661 0 0 0 0 | |
0 0 0 0 0 -0.982177 0 0 0 | |
0 0 0 0 0 0 -0.24424 0 0 | |
0 0 0 0 0 0 0 0.0633259 0 | |
0 0 0 0 0 0 0 0 0.142369 | |
``` | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2023-02-10T18:07:29.916Z,NA,NA | |
1212 (https://gitlab.com/libeigen/eigen/-/merge_requests/1212),Disable array BF16 to F32 conversions in Power,Disable array BF16 to F32 conversions in Power,Chip Kerchner,2023-02-10T20:06:58.792Z,NA,NA | |
1213 (https://gitlab.com/libeigen/eigen/-/merge_requests/1213),Fix compiler warnings.,"### Reference issue
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
### Additional information | |
<!--Any additional information you think is important.-->",Rasmus Munk Larsen,2023-02-10T21:11:13.182Z,NA,NA | |
1214 (https://gitlab.com/libeigen/eigen/-/merge_requests/1214),Fix problem with array conversions BF16->F32 in Power.,Fix problem with array conversions BF16->F32 in Power. Half as many vector instructions for conversions.,Chip Kerchner,2023-02-13T21:30:45.743Z,NA,NA | |
1215 (https://gitlab.com/libeigen/eigen/-/merge_requests/1215),Fix compiler warnings in tests.,"### Reference issue
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
### Additional information | |
<!--Any additional information you think is important.-->",Rasmus Munk Larsen,2023-02-14T02:29:04.009Z,NA,NA | |
1216 (https://gitlab.com/libeigen/eigen/-/merge_requests/1216),Fix NEON make_packet2f.,Typo on return value.,Antonio Sánchez,2023-02-14T16:52:08.141Z,NA,NA | |
1218 (https://gitlab.com/libeigen/eigen/-/merge_requests/1218),Fix MSVC atan2 test.,"Add a correction when `std::atan2` returns `denorm_min`, returning `y/x` | |
on underflow according to the POSIX spec. | |
We potentially could add such a correction to `numext::atan2`, but | |
I'm not sure it's necessary if the MSVC implementation _does_ satisfy | |
the C++ spec (but not POSIX).",Antonio Sánchez,2023-02-14T18:30:59.079Z,NA,NA | |
1220 (https://gitlab.com/libeigen/eigen/-/merge_requests/1220),More NEON packetmath fixes.,"Address gcc compile issues, and a preinterpret stack overflow.",Antonio Sánchez,2023-02-14T21:45:25.980Z,NA,NA | |
1219 (https://gitlab.com/libeigen/eigen/-/merge_requests/1219),"Tweak pasin_float, fix psqrt_complex","### Reference issue
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
Fixes #2597 | |
`pasin_float`: | |
Swapped out a comparison for some bit flipping and some other minor optimizations. This reduces runtime by ~11% (AVX). | |
|size|before|after|diff| | |
|----|-------|------|--------------| | |
| 32 | 1080 | 949 | -12% | | |
| 64 | 1058 | 992 | -6% | | |
| 128 | 1089 | 912 | -16% | | |
| 256 | 1127 | 889 | -21% | | |
| 512 | 1086 | 882 | -18% | | |
| 1024 | 1014 | 845 | -16% | | |
| 2048 | 1223 | 952 | -22% | | |
| 4096 | 1125 | 856 | -23% | | |
| 8192 | 1270 | 1018 | -19% | | |
| 16384 | 1133 | 841 | -25% | | |
| 32768 | 1129 | 1021 | -9% | | |
| 65536 | 1067 | 880 | -17% | | |
| 131072 | 1145 | 861 | -24% | | |
| 262144 | 1125 | 982 | -12% | | |
| 524288 | 1199 | 937 | -21% | | |
| 1048576 | 1460 | 929 | -36% | | |
| 2097152 | 1220 | 1042 | -14% | | |
| 4194304 | 1431 | 1166 | -18% | | |
| 8388608 | 1885 | 1195 | -36% | | |
| 16777216 | 1798 | 1275 | -29% | | |
| 33554432 | 1485 | 1137 | -23% | | |
| 67108864 | 1373 | 1112 | -19% | | |
| 134217728 | 1338 | 1149 | -14% | | |
`psqrt_complex`: Fixed error handling where, unless otherwise specified, if either the real or imaginary component is `nan`, then the result is `nan`. This is addressed before handling the special infinity cases. Overall, it is slower. | |
https://godbolt.org/z/vneGGcGjc | |
|size|before|after|diff| | |
|----|-------|------|--------------| | |
| 32 | 464 | 527 | 13% | | |
| 64 | 513 | 567 | 10% | | |
| 128 | 498 | 544 | 9% | | |
| 256 | 499 | 563 | 12% | | |
| 512 | 492 | 514 | 4% | | |
| 1024 | 605 | 636 | 5% | | |
| 2048 | 468 | 540 | 15% | | |
| 4096 | 551 | 612 | 11% | | |
| 8192 | 467 | 548 | 17% | | |
| 16384 | 510 | 612 | 20% | | |
| 32768 | 466 | 571 | 22% | | |
| 65536 | 517 | 542 | 4% | | |
| 131072 | 520 | 608 | 16% | | |
| 262144 | 520 | 537 | 3% | | |
| 524288 | 510 | 577 | 13% | | |
| 1048576 | 497 | 589 | 18% | | |
| 2097152 | 505 | 583 | 15% | | |
| 4194304 | 551 | 566 | 2% | | |
| 8388608 | 514 | 600 | 16% | | |
| 16777216 | 514 | 511 | 0% | | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2023-02-15T01:01:16.900Z,NA,NA | |
1211 (https://gitlab.com/libeigen/eigen/-/merge_requests/1211),Add CArg,"### Reference issue
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
Enables vectorized complex argument calculations. Eigen lacks the ""infrastructure"" to easily adapt assignment loops for complex to real conversions. However, it is not difficult to represent those real numbers as complex numbers with a 0 imaginary component. This adds `CArg`, which is simply the real-valued `arg` function returned as a complex number. | |
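The idea can be sketched with the standard library alone. This is a minimal illustration, not Eigen's implementation: the helper names `carg_as_complex` and `carg_demo` are hypothetical. The real-valued `std::arg` result is stored in the real part of a complex number and the imaginary part is set to zero, so the input and the output share one scalar type.

```cpp
#include <cassert>
#include <cmath>
#include <complex>

// Hypothetical helper (not part of Eigen): return arg(z) as a complex
// number whose imaginary component is zero.
template <typename T>
std::complex<T> carg_as_complex(const std::complex<T>& z) {
  return std::complex<T>(std::arg(z), T(0));
}

// Small usage check: a purely imaginary positive input has argument pi/2.
inline void carg_demo() {
  const std::complex<double> r = carg_as_complex(std::complex<double>(0.0, 2.0));
  assert(std::abs(r.real() - std::acos(0.0)) < 1e-12);  // acos(0) == pi/2
  assert(r.imag() == 0.0);
}
```

Because the result type equals the argument type, an elementwise loop over complex data can write its output without a complex-to-real conversion step.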
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2023-02-15T21:33:07.511Z,NA,NA | |
1221 (https://gitlab.com/libeigen/eigen/-/merge_requests/1221),Guard complex sqrt on old MSVC compilers.,"They don't support many AVX512 math functions, which causes complex
sqrt to fail.
Fixes #2499.",Antonio Sánchez,2023-02-16T19:47:00.958Z,NA,NA | |
1222 (https://gitlab.com/libeigen/eigen/-/merge_requests/1222),Fix epsilon value in long double for double doubles. Prevented some algorithms from converging on PPC.,Fix epsilon value in long double for double doubles. Prevented some algorithms from converging on PPC.,Chip Kerchner,2023-02-16T23:35:42.972Z,NA,NA | |
1224 (https://gitlab.com/libeigen/eigen/-/merge_requests/1224),Add and enable Packet int divide for Power10.,Add and enable Packet int divide for Power10.,Chip Kerchner,2023-02-17T19:04:19.439Z,NA,NA | |
1203 (https://gitlab.com/libeigen/eigen/-/merge_requests/1203),Add typed logicals,"### Reference issue
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
Typed (i.e. not necessarily `bool`) logical operators enable full vectorization without casting to `bool`, which may be particularly useful for nested logical expressions, comparisons (once typed comparisons are enabled), and other boolean reductions. | |
This MR generalizes the boolean logical ops: | |
- `scalar_boolean_and_op` / `&&` | |
- `scalar_boolean_or_op` / `||` | |
- `scalar_boolean_xor_op` | |
- `scalar_boolean_not_op` / `!` | |
to operate on any Scalar as if they were boolean. | |
This is implemented in the following manner: for any scalar `T a`, `a` is `false` if `a == T(0)`. For floating types, `-0.0` will also evaluate to `false`. Conversely, `a` is `true` if `a != T(0)`. This allows any type, including complex types and non-standard scalars, to be evaluated as if they were booleans. Vectorized comparison ops typically return bitmasks comprised of all 1's for `true`, which is problematic if the scalar is a floating point type as all 1's corresponds to `nan`. This may lead to non-intuitive results, as `nan != nan`. | |
Note: C++ doesn't have a dedicated boolean xor operator. The functional equivalent is `!=`, which is reserved for generic inequality. Otherwise, `12.182384 != -0.283383` would evaluate to `false`, which would be confusing. | |
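The truthiness rule above can be sketched in plain C++. This is an illustration under stated assumptions, not Eigen's functors: the names `is_true`, `typed_and`, `typed_or`, `typed_xor`, and `typed_not` are hypothetical, and returning `T(1)` / `T(0)` as the typed result is an assumption made for the example.

```cpp
#include <cassert>
#include <complex>

// Hypothetical helper: a scalar counts as true when it differs from T(0).
template <typename T>
bool is_true(const T& a) { return a != T(0); }

// Typed logical ops: the result has the same type as the operands.
template <typename T>
T typed_and(const T& a, const T& b) { return (is_true(a) && is_true(b)) ? T(1) : T(0); }

template <typename T>
T typed_or(const T& a, const T& b) { return (is_true(a) || is_true(b)) ? T(1) : T(0); }

// Boolean xor compares truth values, not the scalars themselves.
template <typename T>
T typed_xor(const T& a, const T& b) { return (is_true(a) != is_true(b)) ? T(1) : T(0); }

template <typename T>
T typed_not(const T& a) { return is_true(a) ? T(0) : T(1); }

inline void typed_logical_demo() {
  assert(typed_and(2.5, -0.0) == 0.0);             // -0.0 evaluates to false
  assert(typed_xor(12.182384, -0.283383) == 0.0);  // both operands are true
  assert(typed_not(std::complex<double>(0.0, 1.0)) == std::complex<double>(0.0));
}
```

The xor case shows why `!=` on the raw scalars is not a substitute: the two values differ, yet both are true, so the boolean xor is false.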
New ops for bitwise logical operations and corresponding overloads for Eigen types are provided: | |
- `scalar_bitwise_and_op` / `&` | |
- `scalar_bitwise_or_op` / `|` | |
- `scalar_bitwise_xor_op` / `^` | |
- `scalar_bitwise_not_op` / `~` | |
Currently, the sparse and tensor binary ops are hard coded to `bool` as I am less familiar with those uses. If someone could point me to a potentially breaking scenario that involves typed logicals, I can look into it. | |
Previously, `^` corresponded to `scalar_boolean_xor_op`, but now it corresponds to `scalar_bitwise_xor_op`. However, the previous implementation effectively computed a bitwise xor, so this shouldn't be a problem.
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2023-02-18T01:23:48.301Z,NA,NA | |
1223 (https://gitlab.com/libeigen/eigen/-/merge_requests/1223),Vectorize atanh & add a missing definition and unit test for atan.,"This change adds | |
1. A vectorized implementation of `atanh`. | |
2. A missing definition for `atan<half>`. | |
3. Unit tests for patan. | |
4. A new helper function for testing unary functors on special IEEE values. Tests are added for most common mathematical functions. | |
The vectorized function is accurate to 2 ULP. | |
Benchmark numbers: | |
``` | |
SSE: | |
name old cpu/op new cpu/op delta | |
BM_eigen_atanh_float/1 0.29ns ± 1% 2.45ns ± 0% +748.86% (p=0.000 n=53+46) | |
BM_eigen_atanh_float/8 36.4ns ± 0% 34.3ns ± 2% -5.92% (p=0.000 n=43+38) | |
BM_eigen_atanh_float/64 334ns ± 1% 205ns ± 4% -38.61% (p=0.000 n=44+60) | |
BM_eigen_atanh_float/512 2.65µs ± 1% 1.60µs ± 1% -39.82% (p=0.000 n=51+38) | |
BM_eigen_atanh_float/4k 21.2µs ± 0% 12.7µs ± 4% -40.00% (p=0.000 n=56+59) | |
BM_eigen_atanh_float/32k 169µs ± 0% 102µs ± 1% -39.80% (p=0.000 n=51+43) | |
BM_eigen_atanh_float/256k 1.36ms ± 0% 0.82ms ± 1% -39.84% (p=0.000 n=56+48) | |
BM_eigen_atanh_float/1M 5.43ms ± 0% 3.26ms ± 2% -39.87% (p=0.000 n=55+43) | |
AVX2: | |
name old cpu/op new cpu/op delta | |
BM_eigen_atanh_float/1 2.12ns ± 1% 1.91ns ± 0% -9.68% (p=0.000 n=54+55) | |
BM_eigen_atanh_float/8 38.1ns ± 0% 39.4ns ± 1% +3.55% (p=0.000 n=45+43) | |
BM_eigen_atanh_float/64 334ns ± 3% 137ns ± 4% -59.10% (p=0.000 n=50+59) | |
BM_eigen_atanh_float/512 2.66µs ± 2% 0.84µs ± 5% -68.51% (p=0.000 n=51+57) | |
BM_eigen_atanh_float/4k 21.3µs ± 1% 6.5µs ± 5% -69.53% (p=0.000 n=58+59) | |
BM_eigen_atanh_float/32k 170µs ± 1% 51µs ± 4% -69.90% (p=0.000 n=57+53) | |
BM_eigen_atanh_float/256k 1.36ms ± 1% 0.41ms ± 4% -69.81% (p=0.000 n=58+47) | |
BM_eigen_atanh_float/1M 5.46ms ± 1% 1.65ms ± 5% -69.71% (p=0.000 n=60+59) | |
AVX512: | |
name old cpu/op new cpu/op delta | |
BM_eigen_atanh_float/1 2.18ns ± 1% 3.28ns ± 1% +49.98% (p=0.000 n=56+47) | |
BM_eigen_atanh_float/8 38.1ns ± 0% 41.4ns ± 1% +8.87% (p=0.000 n=47+44) | |
BM_eigen_atanh_float/64 332ns ± 1% 148ns ± 3% -55.49% (p=0.000 n=43+54) | |
BM_eigen_atanh_float/512 2.66µs ± 1% 0.47µs ± 5% -82.21% (p=0.000 n=52+56) | |
BM_eigen_atanh_float/4k 21.3µs ± 1% 3.1µs ± 3% -85.39% (p=0.000 n=54+50) | |
BM_eigen_atanh_float/32k 170µs ± 0% 24µs ± 3% -85.86% (p=0.000 n=54+55) | |
BM_eigen_atanh_float/256k 1.36ms ± 1% 0.19ms ± 3% -85.90% (p=0.000 n=56+60) | |
BM_eigen_atanh_float/1M 5.46ms ± 1% 0.76ms ± 4% -85.98% (p=0.000 n=57+58) | |
```",Rasmus Munk Larsen,2023-02-21T03:14:05.980Z,NA,NA | |
1226 (https://gitlab.com/libeigen/eigen/-/merge_requests/1226),Use pmsub in twoprod. This speeds up pow() on Skylake by ~1%.,"Benchmark numbers with `-march=skylake`: | |
``` | |
name old cpu/op new cpu/op delta | |
BM_eigen_powquarter_float/1 3.82ns ± 1% 3.82ns ± 1% ~ (p=0.330 n=46+43) | |
BM_eigen_powquarter_float/8 95.0ns ± 0% 95.0ns ± 0% ~ (p=0.779 n=51+56) | |
BM_eigen_powquarter_float/64 1.03µs ± 3% 1.02µs ± 2% -0.85% (p=0.000 n=51+56) | |
BM_eigen_powquarter_float/512 8.48µs ± 2% 8.43µs ± 3% -0.59% (p=0.002 n=52+57) | |
BM_eigen_powquarter_float/4k 68.4µs ± 3% 67.5µs ± 3% -1.27% (p=0.000 n=54+53) | |
BM_eigen_powquarter_float/32k 546µs ± 4% 541µs ± 3% -0.93% (p=0.007 n=57+56) | |
BM_eigen_powquarter_float/256k 4.38ms ± 5% 4.33ms ± 3% -0.99% (p=0.001 n=56+55) | |
BM_eigen_powquarter_float/1M 17.4ms ± 3% 17.3ms ± 3% -0.51% (p=0.032 n=53+54) | |
```",Rasmus Munk Larsen,2023-02-21T20:09:30.324Z,NA,NA | |
1227 (https://gitlab.com/libeigen/eigen/-/merge_requests/1227),[SYCL-2020]- null placeholder accessor issue in Reduction SYCL test,"SYCL 2020 does not allow a null placeholder accessor to be created. As a result, the Reduction test fails with a segmentation fault under the latest DPCPP compiler, which strictly applies this rule. This PR fixes the issue by avoiding a null accessor for the SYCL backend.",Mehdi Goli,2023-02-22T17:44:54.766Z,NA,NA
1229 (https://gitlab.com/libeigen/eigen/-/merge_requests/1229),Fix a number of MSAN failures in SVD tests.,"This fixes a few MSAN errors in SVD tests where we find the maximum absolute entry in an uninitialized matrix, while testing handling of options and asserts. Harmless, but undefined behavior, nonetheless.",Rasmus Munk Larsen,2023-02-23T18:44:54.177Z,NA,NA | |
1230 (https://gitlab.com/libeigen/eigen/-/merge_requests/1230),Get rid of EIGEN_HAS_AVX512_MATH workaround.,"The workarounds using EIGEN_HAS_AVX512_MATH seem to have primarily been present to avoid packet ops depending implicitly on `prsqrt`, which now has a generic implementation in terms of `preciprocal` and `psqrt`. | |
The workaround in `PacketMathFP16.h` is redundant because such old compilers never had support for AVX512FP16, which is required for that file to be included. | |
I verified that the intrinsics needed for psqrt/sqrt are indeed available for the older compilers excluded by the workaround.",Rasmus Munk Larsen,2023-02-23T23:16:42.567Z,NA,NA | |
1228 (https://gitlab.com/libeigen/eigen/-/merge_requests/1228),Fix compiler versions for certain instructions on Power.,"Fix compiler versions for certain instructions on Power. | |
The recently added `vec_div` command for the int type does `NOT` work with gcc 10.4.",Chip Kerchner,2023-02-23T23:24:42.211Z,NA,NA
1232 (https://gitlab.com/libeigen/eigen/-/merge_requests/1232),Guard use of long double on GPU device.,"CUDA/HIP treat it as `double` in device code, leading to warnings and | |
duplicate symbols.",Antonio Sánchez,2023-02-24T21:49:59.806Z,NA,NA | |
1196 (https://gitlab.com/libeigen/eigen/-/merge_requests/1196),vectorize comparisons and select by enabling typed comparisons,"### Reference issue
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
Fixes #2603 | |
### What does this implement/fix? | |
Vectorization of comparison ops is optimal when the return type is the same as the argument type. The return type of the comparison ops is now controlled by a template parameter, whose default value is controlled by a new configuration macro `EIGEN_USE_TYPED_COMPARATORS`. This allows users to opt into this feature, which is faster but may break current applications.
Instead of messing with the Select evaluator, a new ternary op is introduced: `scalar_boolean_select_op`. `scalar_boolean_select_op` is similar to `pselect`, except that it can operate on a ""boolean"" mask (0 or 1) as well as a true bitmask `0xFF`. The overhead of calling the equality comparison and flushing to Scalar(1) appears to be very minimal (or at least, not measurable on this machine). | |
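A scalar sketch of that select rule, assuming the mask is tested with `mask != T(0)`. The names `typed_less`, `boolean_select`, and `elementwise_min` are hypothetical, and this is not the code of `scalar_boolean_select_op`. With this test a 0/1 mask and an all-ones bit pattern behave the same, because an all-ones float is a NaN and a NaN also compares unequal to zero.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical typed comparison: yields T(1) or T(0) instead of bool.
template <typename T>
T typed_less(const T& a, const T& b) { return a < b ? T(1) : T(0); }

// Hypothetical select: any nonzero mask picks the first value.
template <typename T>
T boolean_select(const T& mask, const T& if_true, const T& if_false) {
  return mask != T(0) ? if_true : if_false;
}

// Scalar analogue of a = (a < b).select(a, b); both inputs must have equal size.
inline std::vector<float> elementwise_min(const std::vector<float>& a,
                                          const std::vector<float>& b) {
  std::vector<float> out(a.size());
  for (std::size_t i = 0; i < a.size(); ++i) {
    out[i] = boolean_select(typed_less(a[i], b[i]), a[i], b[i]);
  }
  return out;
}

inline void select_demo() {
  const float nan_mask = std::numeric_limits<float>::quiet_NaN();
  assert(boolean_select(nan_mask, 1.0f, 2.0f) == 1.0f);  // NaN mask counts as true
  assert(boolean_select(0.0f, 1.0f, 2.0f) == 2.0f);
}
```

Keeping the mask in the same scalar type as the data means the comparison and the select can run at the same vector width, which is the property the benchmark in this description measures.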
Benchmarks: size `100'000'000` `ArrayXf` `a = (a < b).select(a, b)` | |
| test | Time (ms) | | |
| ------ | ------ | | |
| Boolean comparisons | 413 , 390 , 393 | | |
| Typed comparison | 47 , 46 , 48 | | |
Future work: not completely related, but often associated -- vectorize `any()` and `all()`. Currently, these ops live in `BooleanRedux.h`, along with a few other odds and ends. I think this file could be deleted entirely and all associated ops be moved to `Redux.h`. | |
Here, the vectorization infrastructure is already in place, including the unrolled loops. For each version of `redux_impl`, we could add something like `runAny` and `runAll`. The question is how often to call the reduction on the packets to break out of the loop. | |
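One possible answer to the early-exit question is sketched below with the standard library only: accumulate hits over a fixed block without branching, then test once per block. The names `kBlock` and `blocked_any` and the block size of 8 are hypothetical choices for illustration, not the design of `redux_impl`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical block size; in a vectorized loop this would span a few packets.
constexpr std::size_t kBlock = 8;

// Returns true if any element is nonzero. The early-exit test runs once per
// block, so the inner loop stays branch-free.
inline bool blocked_any(const std::vector<float>& v) {
  std::size_t i = 0;
  for (; i + kBlock <= v.size(); i += kBlock) {
    bool hit = false;
    for (std::size_t j = 0; j < kBlock; ++j) hit |= (v[i + j] != 0.0f);
    if (hit) return true;
  }
  // Scalar tail for the elements that do not fill a whole block.
  for (; i < v.size(); ++i) {
    if (v[i] != 0.0f) return true;
  }
  return false;
}
```

A larger block lowers the cost of the exit test but delays the exit when an early element is already nonzero, which is the trade-off raised above.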
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2023-02-25T20:52:12.471Z,NA,NA | |
1235 (https://gitlab.com/libeigen/eigen/-/merge_requests/1235),Fix ODR issues with Intel's AVX512 TRSM kernels.,"We can't have `static` free functions (i.e. internal linkage) | |
in header files. Remove the `static` qualifier.",Antonio Sánchez,2023-02-27T07:54:53.261Z,NA,NA | |
1237 (https://gitlab.com/libeigen/eigen/-/merge_requests/1237),Fix gpu conv3d out-of-resources failure.,"It seems the conv3d kernel is highly sensitive to internal variable | |
size, requiring 32-bit int variables to avoid running out of resources. | |
This fixes the `cxx11_tensor_device_2` and `cxx11_tensor_gpu_3` tests. | |
This is a partial reversion of !1192.",Antonio Sánchez,2023-02-28T21:25:00.835Z,NA,NA | |
1239 (https://gitlab.com/libeigen/eigen/-/merge_requests/1239),fix signed shift test,"### Reference issue
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
`vshrq_n_s64` and possibly other NEON integer shift operations accept arguments greater than or equal to 1. Passing zero causes the tests to fail. `error: argument value 0 is outside the valid range [1, 64]` | |
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2023-03-01T14:58:55.380Z,NA,NA | |
1240 (https://gitlab.com/libeigen/eigen/-/merge_requests/1240),Scalarize comps,"### Reference issue
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
The world is not yet ready for typed comparisons. This reverts the default behavior for the comparison overloads to return an array of `bool`. However, an API for typed comparisons is added: `a.cwiseTypedLesser(b)`. This returns a non-boolean array if `a` and `b` have the same type. In this way, expressions like `e = a.cwiseTypedLesser(b).select(c,d)` may be fully vectorized. | |
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2023-03-02T17:06:24.187Z,NA,NA | |
1236 (https://gitlab.com/libeigen/eigen/-/merge_requests/1236),Added partial linear access for LHS & Output - 30% faster for bfloat16 GEMM MMA (Power),Added partial linear access for LHS & Output - 30% faster (1/3 fewer memory loads). Fixed bfloat16 MMA GEMM to follow disable MMA flag.,Chip Kerchner,2023-03-02T19:22:44.028Z,NA,NA
1243 (https://gitlab.com/libeigen/eigen/-/merge_requests/1243),fix tensor comparison test,"### Reference issue
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
Forgot to revert the test in `unsupported/test/cxx11_tensor_comparisons.cpp` | |
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2023-03-06T13:36:22.078Z,NA,NA | |
1242 (https://gitlab.com/libeigen/eigen/-/merge_requests/1242),"Fix 2240, 2620","### Reference issue
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
Fixes #2240 | |
Partially sort-of addresses #2620 | |
### What does this implement/fix? | |
Pre-allocate `cols()` workspace for in-place tridiagonalization. If the eigenvectors are requested, this fix simply allocates the memory a bit sooner. If they are not requested, the memory is allocated but never used. This could be rectified by checking the options parameter, though it is only passed in one variant of the constructor.
### Additional information",Charles Schlosser,2023-03-06T23:11:07.201Z,NA,NA
1233 (https://gitlab.com/libeigen/eigen/-/merge_requests/1233),Vectorize any() / all(),"### Reference issue
`DenseBase::any()` / `DenseBase::all()` are not vectorized, even if the dependent functors are. Instead of building out specialized evaluators for any and all to support this functionality, I modified the visitor evaluator to optionally support short-circuit evaluation. If this is not requested, it compiles to a no-op. The any/all visitors are straightforward, and use `predux` at each step to check whether the appropriate condition has been met to break out of the loop early.
Benchmarks: 20'007 x 20'007 MatrixXf filled with nonzeros except the bottom-right coefficient -- worst-case scenario for `all()`. Odd size was used to force a mix of vector and scalar ops. Numbers are similar for `any()` in an analogous scenario. Time in ms. | |
| Before | SSE | AVX |
| ------ | ------ | ----- |
| 233 | 98 | 82 |
| 228 | 100 | 82 |
| 230 | 103 | 82 |
The speedup is less than expected due to the relatively expensive predux that occurs each iteration. However, the aggregate speedup for ops that use `any()` / `all()` could be much greater, as vectorization is currently disabled for the entire expression chain. I went ahead and deleted the entire `BooleanRedux.h` header, and vectorized `count()` (except for `bool`), `hasNaN()`, and `allFinite()`.
Other improvements to the visitors include: | |
- enable linear access, which is at worst the same speed, and ~5% faster in most cases I have tested. This potentially breaks custom visitors that users have created, though all that is required is `LinearAccess = false` in the functor traits.
- tweaked vectorized loops to call a vectorized initialization function. This will break custom visitors with vectorization, but all the user has to do is add an analogous `initpacket` function, which shouldn't be too different from `packet`.
- vectorized unrolled visitors (both linear and outer-inner traversals) | |
In general, visitors offer an alternative to the usual unary/binary/ternary expressions and allow the functor to be modified. I think this functionality should be explored in other aspects of Eigen where our usual approach may be limited. | |
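The short-circuit idea described above can be sketched outside Eigen. This is a minimal illustration in plain C++, not the visitor code from this MR: `all_nonzero` and `kChunk` are hypothetical names, and the fixed-size inner loop only stands in for a packet load followed by a `predux`-style reduction.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical chunk width standing in for a SIMD packet size.
constexpr std::size_t kChunk = 8;

// Returns true if every coefficient is nonzero. Each full chunk is reduced to
// a single flag (the role a predux-style reduction plays for a packet), and
// the loop exits as soon as one chunk fails. The remainder is handled with
// scalar code, mirroring a mix of vector and scalar ops.
inline bool all_nonzero(const std::vector<float>& data) {
  const std::size_t n = data.size();
  const std::size_t vec_end = n - (n % kChunk);
  for (std::size_t i = 0; i < vec_end; i += kChunk) {
    bool chunk_ok = true;
    for (std::size_t j = 0; j < kChunk; ++j) {
      chunk_ok &= (data[i + j] != 0.0f);  // no early exit inside a chunk
    }
    if (!chunk_ok) return false;  // short-circuit between chunks
  }
  for (std::size_t i = vec_end; i < n; ++i) {
    if (data[i] == 0.0f) return false;
  }
  return true;
}
```

In this sketch the per-chunk check is the cost that the benchmark discussion attributes to `predux`: it buys an early exit but adds one reduction per iteration.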
### What does this implement/fix? | |
### Additional information",Charles Schlosser,2023-03-06T23:54:03.493Z,NA,NA
1241 (https://gitlab.com/libeigen/eigen/-/merge_requests/1241),Set CMAKE_* cache variables only when Eigen is a top-level project,"### What does this implement/fix?
Eigen's main `CMakeLists.txt` currently sets `CMAKE_*` cache variables without checking whether it is built as a top-level project or not. Setting these variables as a sub-project also sets them in the outer project's scope (if they were not previously set), possibly changing the build process of the outer project. Such behavior seems counter-intuitive and unexpected. | |
This MR makes Eigen first check if it is a top-level project and set `CMAKE_*` cache variables only if it is.",Timofey Pushkin,2023-03-07T14:39:46.167Z,NA,NA | |
1245 (https://gitlab.com/libeigen/eigen/-/merge_requests/1245),Modify failing cwise test to get it to pass.,"Random integer matrix squaring leads to a lot of signed int overflows, | |
which doesn't satisfy `m * m >= 0`. Modified the test to use `.abs()`.",Antonio Sánchez,2023-03-07T19:47:42.941Z,NA,NA | |
1244 (https://gitlab.com/libeigen/eigen/-/merge_requests/1244),Specify Permutation Index for PartialPivLU and FullPivLU,"### Reference issue
Fixes #2475 Fixes #2592 | |
### What does this implement/fix? | |
Provides a mechanism to specify the permutation index type for LU decomposition classes. This is primarily useful for LAPACKE ILP64 interfaces, which require a 64-bit integer type for the permutation. Following this patch, all the implemented LAPACKE wrappers should be compatible with ILP64.
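To illustrate what a configurable permutation index type means, here is a self-contained sketch in plain C++. It is not Eigen's implementation: `SimplePartialPivLU`, `LU32` and `LU64` are hypothetical names, and the factorization is a textbook partial-pivoting loop whose only point is that the pivot storage type is a template parameter.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for an LU class whose pivot storage type is a template
// parameter, so that a 64-bit type can be chosen for ILP64-style interfaces.
template <typename PermIndex = int>
struct SimplePartialPivLU {
  std::size_t n;
  std::vector<double> lu;         // row-major n x n, L and U packed together
  std::vector<PermIndex> pivots;  // pivots[k] = row swapped with row k

  SimplePartialPivLU(std::vector<double> a, std::size_t size)
      : n(size), lu(std::move(a)), pivots(size) {
    assert(lu.size() == n * n);
    for (std::size_t k = 0; k < n; ++k) {
      // Pick the row with the largest magnitude in column k.
      std::size_t p = k;
      for (std::size_t i = k + 1; i < n; ++i) {
        if (std::fabs(lu[i * n + k]) > std::fabs(lu[p * n + k])) p = i;
      }
      pivots[k] = static_cast<PermIndex>(p);
      if (p != k) {
        for (std::size_t j = 0; j < n; ++j) std::swap(lu[k * n + j], lu[p * n + j]);
      }
      if (lu[k * n + k] == 0.0) continue;  // singular column, nothing to eliminate
      for (std::size_t i = k + 1; i < n; ++i) {
        lu[i * n + k] /= lu[k * n + k];
        for (std::size_t j = k + 1; j < n; ++j) {
          lu[i * n + j] -= lu[i * n + k] * lu[k * n + j];
        }
      }
    }
  }
};

// The two instantiations differ only in how the pivots are stored.
using LU32 = SimplePartialPivLU<std::int32_t>;
using LU64 = SimplePartialPivLU<std::int64_t>;
```

With the index type exposed this way, a pivot array can be handed to a 64-bit integer interface without a conversion copy, which is the motivation stated for ILP64.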
### Additional information",Charles Schlosser,2023-03-07T20:28:06.629Z,NA,NA
1248 (https://gitlab.com/libeigen/eigen/-/merge_requests/1248),Fix LinAlgSVD example code,"A typo in the LinAlgSVD example code made the example not compile. | |
With this fix, the program compiles and runs as expected: | |
``` | |
Here is the matrix A: | |
0.680375 0.59688 | |
-0.211234 0.823295 | |
0.566198 -0.604897 | |
Here is the right hand side b: | |
-0.329554 | |
0.536459 | |
-0.444451 | |
The least-squares solution is: | |
-0.669626 | |
0.314253 | |
```",Zach Davis,2023-03-09T05:57:44.922Z,NA,NA | |
1250 (https://gitlab.com/libeigen/eigen/-/merge_requests/1250),s/Lesser/Less/,"### Reference issue
### What does this implement/fix?
### Additional information",Rasmus Munk Larsen,2023-03-10T00:28:32.525Z,NA,NA
1251 (https://gitlab.com/libeigen/eigen/-/merge_requests/1251),Add newline to end of file.,NA,Rasmus Munk Larsen,2023-03-10T17:20:58.853Z,NA,NA | |
1252 (https://gitlab.com/libeigen/eigen/-/merge_requests/1252),Work around compiler bug in Tridiagonalization.h,NA,Rasmus Munk Larsen,2023-03-10T21:21:07.850Z,NA,NA | |
1253 (https://gitlab.com/libeigen/eigen/-/merge_requests/1253),Clean up generic packetmath specializations for various backends with the help of a macro.,test,Rasmus Munk Larsen,2023-03-10T22:02:24.512Z,NA,NA | |
1249 (https://gitlab.com/libeigen/eigen/-/merge_requests/1249),Fix failing MSVC tests due to compiler bugs.,"Pretty scary, but it looks like there's something wrong with _some_ | |
calls to `*_set1_*` intrinsics with MSVC, where results end up being | |
somewhat random, depending on what's in other registers. In particular, | |
`ldexp` and `pow` tests were failing because `pnegate` produced bogus | |
values (unless you actually tried to print it, in which case
everything would work fine). Similarly, a masked store test was failing
because of a `set1` intrinsic used within `pstoreu` (again, unless you | |
printed the mask, in which case it would work fine). Replacing the | |
`set1` calls in two places with just `set` (and manually duplicating the | |
values) seems to be enough to get our tests to pass. | |
Should we be avoiding `set1` altogether? It's used in _many_ other | |
places throughout the codebase. It's unclear why we're seeing the issue | |
in these two places over others. | |
And, of course, isolating the calls to try to create a small reproducer | |
fails to reproduce the issue. | |
Original failing tests: | |
- AVX2: https://gitlab.com/libeigen/eigen_ci_cross_testing/-/jobs/3892597115 | |
- AVX512: https://gitlab.com/libeigen/eigen_ci_cross_testing/-/jobs/3892597132",Antonio Sánchez,2023-03-10T22:36:58.181Z,NA,NA | |
1254 (https://gitlab.com/libeigen/eigen/-/merge_requests/1254),Make new Select implementation backwards compatible.,"Swap template argument order when implementing DenseBase::select as a CwiseTernaryOp. Swapping the arguments assures backwards compatibility with the old Select implementation, which took storage traits from the second ""then"" argument, while CwiseTernaryOp takes it from the first argument. This avoids a number of breakages in legacy code mixing arrays and matrices.",Rasmus Munk Larsen,2023-03-10T23:07:48.456Z,NA,NA | |
1256 (https://gitlab.com/libeigen/eigen/-/merge_requests/1256),Fix bug in minmax_coeff_visitor for matrix of all NaNs.,NA,Rasmus Munk Larsen,2023-03-13T18:25:22.770Z,NA,NA | |
1255 (https://gitlab.com/libeigen/eigen/-/merge_requests/1255),Add MMA to BF16 GEMV - 5.0-6.3X faster (for Power),"Instead of converting BF16->F32, performing the operation (madd), and converting back F32->BF16 for each instruction, use MMA.
RowMajor is 6.3X faster. ColMajor is 5.0X faster. Both are between 1.3-1.9X faster than F32 GEMV.",Chip Kerchner,2023-03-13T19:37:13.691Z,NA,NA | |
1257 (https://gitlab.com/libeigen/eigen/-/merge_requests/1257),Handle PropagateFast the same way as PropagateNaN in minmax visitor,This change is to avoid returning out-of-bounds indices for matrices of all NaNs.,Rasmus Munk Larsen,2023-03-13T20:47:12.715Z,NA,NA | |
1258 (https://gitlab.com/libeigen/eigen/-/merge_requests/1258),Revert changes that made BF16 GEMM to cause bad register spillage for LLVM (Power),Revert changes that caused bad register spillage in BF16 GEMM with LLVM (Power) - 20% slowdown.,Chip Kerchner,2023-03-13T23:36:06.870Z,NA,NA
1259 (https://gitlab.com/libeigen/eigen/-/merge_requests/1259),Put deadcode checks back in from previous change.,Put deadcode checks back in from previous change. Added a new one too.,Chip Kerchner,2023-03-14T00:57:16.706Z,NA,NA | |
1262 (https://gitlab.com/libeigen/eigen/-/merge_requests/1262),Limit the number of build jobs to 8 and link jobs to 4 for PowerPC. This should help reduce the OOM build problems.,Limit the number of build jobs to 8 and link jobs to 4 for PowerPC. This should help reduce the OOM build problems.,Chip Kerchner,2023-03-15T16:29:42.662Z,NA,NA | |
1263 (https://gitlab.com/libeigen/eigen/-/merge_requests/1263),Fix recent PowerPC warnings and clang warning,Fix recent PowerPC warnings and clang warning.,Chip Kerchner,2023-03-15T16:50:47.463Z,NA,NA | |
1260 (https://gitlab.com/libeigen/eigen/-/merge_requests/1260),Use C++11 standard features for detecting presence of Inf and NaN,"We now require C++14 support from the compiler, and can rely on `std::numeric_limits<T>::{has_infinity,has_signaling_NaN,has_quiet_NaN}`.",Rasmus Munk Larsen,2023-03-15T16:52:45.788Z,NA,NA
1264 (https://gitlab.com/libeigen/eigen/-/merge_requests/1264),Use EIGEN_NOT_A_MACRO macro (oh the irony!) to avoid build issue in TensorFlow.,NA,Rasmus Munk Larsen,2023-03-15T19:13:28.230Z,NA,NA | |
1265 (https://gitlab.com/libeigen/eigen/-/merge_requests/1265),Vectorize tensor.isnan() by using typed predicates.,"Getting this to run fast with AVX512 required vectorizing casting between Packet16f and Packet16b. | |
Casting could be improved for other backends later. | |
Benchmark measurements: | |
``` | |
AVX512F: | |
name old cpu/op new cpu/op delta | |
BM_isNaN_1T/3 [using 1 threads] 10.9ns ± 4% 10.9ns ± 5% ~ (p=0.083 n=57+54) | |
BM_isNaN_1T/4 [using 1 threads] 8.07ns ±12% 7.69ns ± 4% -4.81% (p=0.000 n=55+54) | |
BM_isNaN_1T/7 [using 1 threads] 12.0ns ± 6% 9.9ns ± 5% -17.60% (p=0.000 n=57+44) | |
BM_isNaN_1T/8 [using 1 threads] 12.8ns ± 3% 8.7ns ± 6% -31.61% (p=0.000 n=53+54) | |
BM_isNaN_1T/10 [using 1 threads] 19.2ns ± 6% 13.2ns ± 6% -31.20% (p=0.000 n=54+47) | |
BM_isNaN_1T/15 [using 1 threads] 30.9ns ± 8% 15.3ns ± 7% -50.51% (p=0.000 n=50+55) | |
BM_isNaN_1T/16 [using 1 threads] 31.8ns ± 4% 15.4ns ± 6% -51.42% (p=0.000 n=50+53) | |
BM_isNaN_1T/31 [using 1 threads] 104ns ± 5% 53ns ± 6% -49.41% (p=0.000 n=55+59) | |
BM_isNaN_1T/32 [using 1 threads] 109ns ± 3% 56ns ± 6% -48.48% (p=0.000 n=54+59) | |
BM_isNaN_1T/64 [using 1 threads] 420ns ± 4% 221ns ± 4% -47.45% (p=0.000 n=57+54) | |
BM_isNaN_1T/128 [using 1 threads] 1.70µs ± 5% 0.90µs ± 4% -47.18% (p=0.000 n=59+55) | |
BM_isNaN_1T/256 [using 1 threads] 6.79µs ± 6% 3.57µs ± 4% -47.45% (p=0.000 n=60+49) | |
BM_isNaN_1T/512 [using 1 threads] 40.7µs ± 4% 33.1µs ± 6% -18.71% (p=0.000 n=48+50) | |
BM_isNaN_1T/1k [using 1 threads] 192µs ± 4% 198µs ± 3% +3.18% (p=0.000 n=55+54) | |
BM_isNaN_1T/2k [using 1 threads] 887µs ±24% 912µs ±24% ~ (p=0.054 n=43+45) | |
BM_isNaN_1T/4k [using 1 threads] 7.37ms ±11% 6.47ms ± 5% -12.26% (p=0.000 n=33+32) | |
BM_isNaN_1T/10k [using 1 threads] 46.3ms ± 7% 40.6ms ± 3% -12.19% (p=0.000 n=15+11) | |
AVX2: | |
name old cpu/op new cpu/op delta | |
BM_isNaN_1T/3 [using 1 threads] 10.9ns ± 4% 17.3ns ± 7% +58.34% (p=0.000 n=58+52) | |
BM_isNaN_1T/4 [using 1 threads] 8.51ns ± 3% 14.58ns ± 4% +71.25% (p=0.000 n=49+53) | |
BM_isNaN_1T/7 [using 1 threads] 15.4ns ± 5% 19.9ns ± 5% +29.65% (p=0.000 n=58+52) | |
BM_isNaN_1T/8 [using 1 threads] 18.1ns ± 8% 19.8ns ± 4% +9.67% (p=0.000 n=54+57) | |
BM_isNaN_1T/10 [using 1 threads] 27.7ns ± 4% 26.0ns ± 5% -6.28% (p=0.000 n=51+51) | |
BM_isNaN_1T/15 [using 1 threads] 50.2ns ± 5% 37.8ns ± 6% -24.69% (p=0.000 n=60+40) | |
BM_isNaN_1T/16 [using 1 threads] 55.6ns ± 4% 39.5ns ±10% -28.87% (p=0.000 n=59+47) | |
BM_isNaN_1T/31 [using 1 threads] 196ns ± 3% 121ns ± 4% -38.27% (p=0.000 n=56+54) | |
BM_isNaN_1T/32 [using 1 threads] 208ns ± 4% 128ns ± 5% -38.61% (p=0.000 n=55+59) | |
BM_isNaN_1T/64 [using 1 threads] 822ns ± 4% 464ns ± 5% -43.61% (p=0.000 n=57+60) | |
BM_isNaN_1T/128 [using 1 threads] 3.27µs ± 4% 2.09µs ± 6% -36.14% (p=0.000 n=50+58) | |
BM_isNaN_1T/256 [using 1 threads] 13.0µs ± 4% 8.3µs ± 4% -36.45% (p=0.000 n=55+57) | |
BM_isNaN_1T/512 [using 1 threads] 54.4µs ± 6% 43.6µs ± 7% -19.89% (p=0.000 n=60+58) | |
BM_isNaN_1T/1k [using 1 threads] 226µs ± 5% 198µs ± 4% -12.26% (p=0.000 n=52+52) | |
BM_isNaN_1T/2k [using 1 threads] 1.09ms ±33% 0.97ms ±26% -10.63% (p=0.000 n=41+47) | |
BM_isNaN_1T/4k [using 1 threads] 8.22ms ± 7% 7.50ms ±14% -8.79% (p=0.000 n=39+36) | |
BM_isNaN_1T/10k [using 1 threads] 50.9ms ± 8% 47.3ms ± 6% -7.16% (p=0.000 n=15+14) | |
SSE: | |
name old cpu/op new cpu/op delta | |
BM_isNaN_1T/3 [using 1 threads] 10.6ns ± 4% 17.0ns ± 6% +60.36% (p=0.000 n=47+58) | |
BM_isNaN_1T/4 [using 1 threads] 8.65ns ± 7% 12.44ns ± 5% +43.82% (p=0.000 n=54+54) | |
BM_isNaN_1T/7 [using 1 threads] 16.4ns ± 5% 16.8ns ± 6% +2.44% (p=0.000 n=57+53) | |
BM_isNaN_1T/8 [using 1 threads] 17.5ns ± 7% 17.5ns ± 3% ~ (p=0.551 n=55+53) | |
BM_isNaN_1T/10 [using 1 threads] 23.5ns ± 5% 23.3ns ± 6% ~ (p=0.080 n=50+48) | |
BM_isNaN_1T/15 [using 1 threads] 39.9ns ± 4% 34.9ns ± 9% -12.56% (p=0.000 n=45+46) | |
BM_isNaN_1T/16 [using 1 threads] 42.8ns ± 4% 36.5ns ± 8% -14.56% (p=0.000 n=54+47) | |
BM_isNaN_1T/31 [using 1 threads] 142ns ± 3% 120ns ± 4% -15.39% (p=0.000 n=59+46) | |
BM_isNaN_1T/32 [using 1 threads] 149ns ± 4% 126ns ± 6% -15.50% (p=0.000 n=60+54) | |
BM_isNaN_1T/64 [using 1 threads] 558ns ± 4% 457ns ± 8% -18.18% (p=0.000 n=60+59) | |
BM_isNaN_1T/128 [using 1 threads] 2.47µs ± 5% 1.89µs ± 5% -23.34% (p=0.000 n=54+52) | |
BM_isNaN_1T/256 [using 1 threads] 9.82µs ± 4% 7.47µs ± 4% -23.93% (p=0.000 n=60+59) | |
BM_isNaN_1T/512 [using 1 threads] 46.8µs ± 7% 42.2µs ± 7% -9.68% (p=0.000 n=60+56) | |
BM_isNaN_1T/1k [using 1 threads] 203µs ± 6% 195µs ± 6% -3.66% (p=0.000 n=53+54) | |
BM_isNaN_1T/2k [using 1 threads] 1.01ms ±38% 1.01ms ±43% ~ (p=0.804 n=49+46) | |
BM_isNaN_1T/4k [using 1 threads] 7.55ms ±10% 7.21ms ± 9% -4.43% (p=0.001 n=39+29) | |
BM_isNaN_1T/10k [using 1 threads] 46.6ms ± 5% 44.6ms ± 6% -4.31% (p=0.002 n=14+13)
```",Rasmus Munk Larsen,2023-03-16T04:04:23.852Z,NA,NA
1266 (https://gitlab.com/libeigen/eigen/-/merge_requests/1266),Remove pools if cmake is less than 3.11,Remove pools if cmake is less than 3.11,Chip Kerchner,2023-03-16T16:54:46.423Z,NA,NA | |
1268 (https://gitlab.com/libeigen/eigen/-/merge_requests/1268),Fix parsing of command-line arguments when already specified as a cmake list.,"The built-in `separate_arguments` chokes when given a cmake list variable - | |
it _needs_ a space-separated list of arguments. However, this leads to | |
an incompatibility with our CI, in which the YAML configuration doesn't
support forwarding space-separated variables along to build scripts | |
(it either escapes the spaces, or separates spaces into different arguments). | |
Allowing flags to be specified as semicolon-separated lists avoids this, since
cmake separates such lists automatically. This also allows us to set
these lists directly via the cmake command-line.",Antonio Sánchez,2023-03-16T22:47:39.723Z,NA,NA
1267 (https://gitlab.com/libeigen/eigen/-/merge_requests/1267),Fix some typos,NA,Jonas Schulze,2023-03-16T23:11:44.283Z,NA,NA | |
1269 (https://gitlab.com/libeigen/eigen/-/merge_requests/1269),Undo cmake pools changes,"Since the cmake pool changes cause errors, it is better to undo them.",Chip Kerchner,2023-03-17T16:06:26.878Z,NA,NA | |
1271 (https://gitlab.com/libeigen/eigen/-/merge_requests/1271),Fix 2624 2625,"### Reference issue
Fixes #2624 | |
Fixes #2625 | |
### What does this implement/fix? | |
Changes the `SparseMatrix::Map` typedef to use `Options_` instead of `Flags`, and checks for `StorageIndex` overflow in the first pass of `setFromTriplets`.
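The kind of overflow check meant here can be sketched as follows. This is an assumption-laden illustration in plain C++, not the code added to `setFromTriplets`: `Triplet` and `count_per_column` are hypothetical, and the check simply refuses to continue once the number of entries no longer fits in the chosen `StorageIndex`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical triplet type: (row, col, value), indices stored as 64-bit.
struct Triplet {
  std::int64_t row;
  std::int64_t col;
  double value;
};

// First pass of a triplet-based fill: count entries per column and verify that
// the running total still fits in StorageIndex. Returns false on overflow
// instead of silently wrapping.
template <typename StorageIndex>
bool count_per_column(const std::vector<Triplet>& triplets, std::size_t cols,
                      std::vector<StorageIndex>& counts) {
  const std::uint64_t max_index =
      static_cast<std::uint64_t>(std::numeric_limits<StorageIndex>::max());
  counts.assign(cols, StorageIndex(0));
  std::uint64_t total = 0;
  for (const Triplet& t : triplets) {
    assert(t.col >= 0 && static_cast<std::size_t>(t.col) < cols);
    if (++total > max_index) return false;  // entry count exceeds StorageIndex
    ++counts[static_cast<std::size_t>(t.col)];
  }
  return true;
}
```

Doing the check in the counting pass means the failure is detected before any storage is sized from a wrapped value.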
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2023-03-20T16:30:05.895Z,NA,NA | |
1270 (https://gitlab.com/libeigen/eigen/-/merge_requests/1270),Fix arm builds.,"A missing cast, MSVC packet conversion issue, and missing macro definitions for 32-bit arm. | |
Fixes #2623.",Antonio Sánchez,2023-03-20T16:59:39.439Z,NA,NA | |
1273 (https://gitlab.com/libeigen/eigen/-/merge_requests/1273),Replaced all instances of internal::(U)IntPtr with std::(u)intptr_t. Remove ICC workaround.,"### Reference issue | |
Closes issue #2596 | |
### What does this implement/fix? | |
Removes an old typedef (regarding use of `std::intptr_t` vs. `std::size_t`, see issue #2596) used to work around an issue in ICC. The workaround is no longer necessary in C++14.
### Additional information | |
Removal of this typedef improves compatibility with the novel CHERI/Morello architecture and should have no ill effects elsewhere. The architecture has strict rules about the use of pointers, in particular using int types as if they were pointers. Breaking these rules can lead to a hardware exception at runtime as the validity of the pointer cannot be established. | |
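As a small illustration of using the standard integer-pointer type directly, the sketch below checks pointer alignment through `std::uintptr_t`. The helper name `is_aligned` is hypothetical and this is not code from the MR; it only shows a pointer being converted to an integer to inspect its address bits, with the integer never being turned back into a pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Returns true if ptr is aligned to the given number of bytes, which must be
// a power of two. std::uintptr_t is the standard unsigned integer type able
// to hold a pointer value, so no project-specific typedef is needed.
inline bool is_aligned(const void* ptr, std::size_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  return (reinterpret_cast<std::uintptr_t>(ptr) & (alignment - 1)) == 0;
}
```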
All tests (`make check`) passing before and after changes. | |
This PR replaces an earlier PR which could not be merged due to failure to rebase properly.",Colin Broderick,2023-03-21T16:50:24.423Z,NA,NA | |
1272 (https://gitlab.com/libeigen/eigen/-/merge_requests/1272),Optimize casting for x86_64.,"This MR optimizes several `pcast` operators for the x86 backends, in particular cast to bool, which will become more important in light of the recent improvements by @chuckyschluz to typed comparison and the Select operator. Currently, this mainly benefits users of the Tensor library, but hopefully we can also find a way to make casting at least partially vectorized in Eigen Core. | |
Speedup of casting is measured on a Skylake (Intel(R) Xeon(R) Gold 6154 CPU @ 3.00GHz) using the following code: | |
``` | |
template <typename IN, typename OUT>
static void BM_cast(benchmark::State& state) {
  int n = state.range(0);
  const Eigen::array<TensorIndex, 1> sizes{n,};
  const Tensor<IN, 1, 0, TensorIndex> A(sizes);
  Tensor<OUT, 1, 0, TensorIndex> B(sizes);
  for (auto s : state) {
    benchmark::DoNotOptimize(B = A.template cast<OUT>());
  }
  state.SetItemsProcessed(static_cast<int64>(state.iterations()) * state.range(0));
}
``` | |
Benchmarks numbers for affected cast operations: | |
``` | |
SSE: | |
BM_cast<float,bool>/8 18.2ns ± 0% 16.6ns ± 0% -8.65% (p=0.000 n=52+52) | |
BM_cast<float,bool>/64 17.9ns ± 1% 15.2ns ± 0% -15.11% (p=0.000 n=55+52) | |
BM_cast<float,bool>/512 67.2ns ± 9% 41.1ns ± 2% -38.89% (p=0.000 n=57+45) | |
BM_cast<float,bool>/4k 465ns ±11% 249ns ± 1% -46.37% (p=0.000 n=57+49) | |
BM_cast<float,bool>/32k 3.84µs ± 1% 2.31µs ± 1% -39.96% (p=0.000 n=40+51) | |
BM_cast<float,bool>/256k 42.5µs ± 6% 36.2µs ± 8% -14.79% (p=0.000 n=60+49) | |
BM_cast<float,bool>/1M 199µs ± 3% 196µs ± 3% -1.64% (p=0.000 n=60+60) | |
AVX: | |
BM_cast<float,bool>/8 18.3ns ± 1% 14.9ns ± 0% -18.26% (p=0.000 n=47+56) | |
BM_cast<float,bool>/64 17.9ns ± 4% 15.0ns ± 0% -16.55% (p=0.000 n=57+51) | |
BM_cast<float,bool>/512 69.6ns ± 6% 59.3ns ± 1% -14.85% (p=0.000 n=60+60) | |
BM_cast<float,bool>/4k 495ns ± 7% 430ns ± 0% -13.07% (p=0.000 n=58+49) | |
BM_cast<float,bool>/32k 4.25µs ± 4% 3.44µs ± 3% -19.08% (p=0.000 n=57+50) | |
BM_cast<float,bool>/256k 44.7µs ± 2% 41.2µs ± 5% -7.88% (p=0.000 n=49+54) | |
BM_cast<float,bool>/1M 202µs ± 1% 201µs ± 2% -0.60% (p=0.000 n=50+60) | |
BM_cast<double,float>/8 14.1ns ± 0% 12.2ns ± 0% -13.91% (p=0.000 n=44+53) | |
BM_cast<double,float>/64 18.3ns ± 1% 15.9ns ± 0% -12.99% (p=0.000 n=57+49) | |
BM_cast<double,float>/512 55.7ns ± 3% 53.8ns ± 8% -3.41% (p=0.000 n=54+56) | |
BM_cast<double,float>/4k 576ns ± 3% 577ns ± 4% ~ (p=0.934 n=54+60) | |
BM_cast<double,float>/32k 4.52µs ± 4% 4.59µs ± 4% +1.35% (p=0.000 n=57+60) | |
BM_cast<double,float>/256k 119µs ± 1% 119µs ± 2% ~ (p=0.577 n=56+51) | |
BM_cast<double,float>/1M 479µs ± 2% 478µs ± 2% ~ (p=0.547 n=45+37) | |
AVX2: | |
BM_cast<float,bool>/8 18.3ns ± 0% 17.8ns ± 0% -2.44% (p=0.000 n=46+48) | |
BM_cast<float,bool>/64 17.8ns ± 2% 18.2ns ± 0% +2.61% (p=0.000 n=55+48) | |
BM_cast<float,bool>/512 66.6ns ± 8% 52.0ns ± 9% -21.84% (p=0.000 n=53+60) | |
BM_cast<float,bool>/4k 466ns ± 1% 322ns ± 3% -30.94% (p=0.000 n=53+51) | |
BM_cast<float,bool>/32k 4.18µs ± 4% 2.71µs ± 5% -35.20% (p=0.000 n=49+55) | |
BM_cast<float,bool>/256k 44.8µs ± 6% 39.6µs ± 6% -11.63% (p=0.000 n=58+49) | |
BM_cast<float,bool>/1M 204µs ± 3% 200µs ± 2% -1.71% (p=0.000 n=60+59) | |
BM_cast<double,float>/8 14.1ns ± 1% 12.2ns ± 0% -13.95% (p=0.000 n=53+47) | |
BM_cast<double,float>/64 18.2ns ± 0% 15.9ns ± 0% -12.85% (p=0.000 n=54+46) | |
BM_cast<double,float>/512 56.1ns ± 6% 56.5ns ± 9% ~ (p=0.281 n=59+60) | |
BM_cast<double,float>/4k 579ns ± 4% 578ns ± 4% ~ (p=0.385 n=60+60) | |
BM_cast<double,float>/32k 4.55µs ± 4% 4.56µs ± 5% ~ (p=0.421 n=60+56) | |
BM_cast<double,float>/256k 120µs ± 2% 120µs ± 2% ~ (p=0.069 n=57+58) | |
BM_cast<double,float>/1M 482µs ± 3% 480µs ± 2% ~ (p=0.155 n=38+46) | |
AVX512: | |
BM_cast<bool,float>/8 20.3ns ± 4% 14.9ns ± 1% -26.36% (p=0.000 n=50+60) | |
BM_cast<bool,float>/64 21.0ns ± 4% 14.0ns ± 3% -33.12% (p=0.000 n=58+55) | |
BM_cast<bool,float>/512 72.9ns ± 7% 26.8ns ± 9% -63.27% (p=0.000 n=60+55) | |
BM_cast<bool,float>/4k 464ns ± 4% 134ns ± 5% -71.11% (p=0.000 n=53+60) | |
BM_cast<bool,float>/32k 4.09µs ±10% 2.40µs ± 3% -41.35% (p=0.000 n=52+52) | |
BM_cast<bool,float>/256k 46.5µs ± 4% 40.6µs ± 3% -12.66% (p=0.000 n=58+50) | |
BM_cast<bool,float>/1M 224µs ± 6% 201µs ± 2% -10.47% (p=0.000 n=55+59) | |
BM_cast<float,bool>/8 18.0ns ± 0% 15.7ns ± 4% -12.81% (p=0.000 n=53+60) | |
BM_cast<float,bool>/64 18.0ns ± 4% 12.6ns ± 4% -30.23% (p=0.000 n=53+60) | |
BM_cast<float,bool>/512 66.0ns ± 6% 29.5ns ± 8% -55.30% (p=0.000 n=57+52) | |
BM_cast<float,bool>/4k 452ns ± 5% 227ns ± 4% -49.73% (p=0.000 n=59+59) | |
BM_cast<float,bool>/32k 3.85µs ± 6% 1.82µs ± 4% -52.79% (p=0.000 n=50+60) | |
BM_cast<float,bool>/256k 44.3µs ± 8% 33.9µs ± 5% -23.44% (p=0.000 n=60+50) | |
BM_cast<float,bool>/1M 202µs ± 3% 199µs ± 1% -1.38% (p=0.000 n=59+58) | |
BM_cast<double,float>/8 20.7ns ± 2% 13.3ns ± 1% -36.02% (p=0.000 n=57+54) | |
BM_cast<double,float>/64 21.0ns ± 8% 15.1ns ± 4% -28.17% (p=0.000 n=60+59) | |
BM_cast<double,float>/512 72.9ns ± 8% 38.0ns ± 7% -47.88% (p=0.000 n=59+48) | |
BM_cast<double,float>/4k 845ns ± 7% 475ns ± 8% -43.74% (p=0.000 n=60+56) | |
BM_cast<double,float>/32k 6.74µs ± 7% 3.78µs ± 5% -43.96% (p=0.000 n=57+47) | |
BM_cast<double,float>/256k 169µs ± 3% 121µs ± 3% -28.47% (p=0.000 n=57+60) | |
BM_cast<double,float>/1M 681µs ± 5% 486µs ± 5% -28.64% (p=0.000 n=41+40) | |
```",Rasmus Munk Larsen,2023-03-21T18:24:16.887Z,NA,NA | |
1274 (https://gitlab.com/libeigen/eigen/-/merge_requests/1274),"Optimize float->bool cast for AVX2, based on Charles Schlosser's comments.","Thanks for the catch, @chuckyschluz. This nets a nice little speedup for `pcast<Packet8f,Packet16b>` with AVX2 enabled.
Both versions are compared to the same baseline prior to !1272:
``` | |
Before: | |
name old cpu/op new cpu/op delta | |
BM_cast<float,bool>/8 18.3ns ± 0% 17.8ns ± 0% -2.44% (p=0.000 n=46+48) | |
BM_cast<float,bool>/64 17.8ns ± 2% 18.2ns ± 0% +2.61% (p=0.000 n=55+48) | |
BM_cast<float,bool>/512 66.6ns ± 8% 52.0ns ± 9% -21.84% (p=0.000 n=53+60) | |
BM_cast<float,bool>/4k 466ns ± 1% 322ns ± 3% -30.94% (p=0.000 n=53+51) | |
BM_cast<float,bool>/32k 4.18µs ± 4% 2.71µs ± 5% -35.20% (p=0.000 n=49+55) | |
BM_cast<float,bool>/256k 44.8µs ± 6% 39.6µs ± 6% -11.63% (p=0.000 n=58+49) | |
BM_cast<float,bool>/1M 204µs ± 3% 200µs ± 2% -1.71% (p=0.000 n=60+59) | |
After: | |
name old cpu/op new cpu/op delta | |
BM_cast<float,bool>/8 18.3ns ± 0% 17.9ns ± 0% -2.03% (p=0.000 n=45+53) | |
BM_cast<float,bool>/64 17.6ns ± 1% 16.8ns ± 0% -4.71% (p=0.000 n=55+45) | |
BM_cast<float,bool>/512 65.5ns ± 1% 46.1ns ± 5% -29.61% (p=0.000 n=51+59) | |
BM_cast<float,bool>/4k 470ns ± 2% 260ns ± 3% -44.53% (p=0.000 n=47+53) | |
BM_cast<float,bool>/32k 4.21µs ± 6% 2.42µs ± 4% -42.49% (p=0.000 n=55+51) | |
BM_cast<float,bool>/256k 40.2µs ±27% 32.3µs ±52% -19.69% (p=0.000 n=60+50) | |
BM_cast<float,bool>/1M 190µs ±21% 196µs ± 6% ~ (p=0.885 n=59+52) | |
```",Rasmus Munk Larsen,2023-03-22T14:52:56.815Z,NA,NA | |
1234 (https://gitlab.com/libeigen/eigen/-/merge_requests/1234),Remove unused declarations of BLAS/LAPACK routines,"<!-- | |
Thanks for contributing a merge request! Please name and fully describe your MR as you would for a commit message. | |
If the MR fixes an issue, please include ""Fixes #issue"" in the commit message and the MR description. | |
In addition, we recommend that first-time contributors read our [contribution guidelines](https://eigen.tuxfamily.org/index.php?title=Contributing_to_Eigen) and [git page](https://eigen.tuxfamily.org/index.php?title=Git), which will help you submit a more standardized MR. | |
Before submitting the MR, you also need to complete the following checks: | |
- Make one PR per feature/bugfix (don't mix multiple changes into one PR). Avoid committing unrelated changes. | |
- Rebase before committing | |
- For code changes, run the test suite (at least the tests that are likely affected by the change). | |
See our [test guidelines](https://eigen.tuxfamily.org/index.php?title=Tests). | |
- If possible, add a test (both for bug-fixes as well as new features) | |
- Make sure new features are documented | |
Note that we are a team of volunteers; we appreciate your patience during the review process. | |
Again, thanks for contributing! --> | |
### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
#2420 | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
This MR does two things:
- Copy `Eigen/src/misc/{blas,lapack}.h` to `{blas,lapack}/` respectively, and update the relevant `#include`s.
- Remove unused declarations of BLAS/LAPACK routines from the original headers so that fixing signature conflicts with third-party headers will be much easier.
  - `Eigen/src/misc/lapack.h` is removed as it is no longer referenced.
### Additional information | |
<!--Any additional information you think is important.-->",unageek,2023-03-23T21:54:06.654Z,NA,NA | |
1275 (https://gitlab.com/libeigen/eigen/-/merge_requests/1275),"Add more missing vectorized casts for int on x86, and remove redundant unit tests","The removed unit tests were redundant, and worse, some were invoking undefined behavior. | |
Benchmark measurements for affected operations: | |
``` | |
Measured on Intel(R) Xeon(R) Gold 6154 CPU @ 3.00GHz | |
SSE | |
BM_cast<double,int>/8 14.0ns ± 0% 10.8ns ± 3% -22.97% (p=0.000 n=16+20) | |
BM_cast<double,int>/64 21.4ns ± 0% 18.7ns ± 0% -12.55% (p=0.000 n=14+19) | |
BM_cast<double,int>/512 113ns ± 0% 108ns ± 0% -3.81% (p=0.000 n=16+19) | |
BM_cast<double,int>/4k 854ns ± 0% 851ns ± 0% -0.41% (p=0.000 n=18+18) | |
BM_cast<double,int>/32k 6.70µs ± 0% 6.70µs ± 0% ~ (p=0.751 n=19+19) | |
BM_cast<double,int>/256k 117µs ± 1% 118µs ± 1% +0.59% (p=0.001 n=20+19) | |
vectorize, AVX | |
name old cpu/op new cpu/op delta | |
BM_cast<float,int>/8 13.2ns ± 0% 11.9ns ± 0% -10.13% (p=0.000 n=56+59) | |
BM_cast<float,int>/64 15.7ns ± 1% 13.4ns ± 0% -14.66% (p=0.000 n=43+58) | |
BM_cast<float,int>/512 32.6ns ± 2% 31.7ns ±19% -2.75% (p=0.000 n=50+54) | |
BM_cast<float,int>/4k 203ns ±11% 222ns ± 6% +9.03% (p=0.000 n=60+60) | |
BM_cast<float,int>/32k 2.65µs ± 2% 2.65µs ± 3% ~ (p=0.150 n=53+53) | |
BM_cast<float,int>/256k 79.9µs ± 2% 79.8µs ± 1% -0.15% (p=0.040 n=57+60) | |
BM_cast<int,float>/8 13.2ns ± 0% 11.9ns ± 0% -10.16% (p=0.000 n=55+59) | |
BM_cast<int,float>/64 15.7ns ± 1% 13.4ns ± 1% -14.64% (p=0.000 n=41+59) | |
BM_cast<int,float>/512 32.6ns ± 2% 29.8ns ± 2% -8.36% (p=0.000 n=45+52) | |
BM_cast<int,float>/4k 203ns ±11% 221ns ± 6% +8.87% (p=0.000 n=60+59) | |
BM_cast<int,float>/32k 2.65µs ± 2% 2.64µs ± 3% ~ (p=0.658 n=55+55) | |
BM_cast<int,float>/256k 79.8µs ± 2% 79.8µs ± 1% ~ (p=0.650 n=60+59) | |
BM_cast<double,int>/8 14.0ns ± 1% 10.4ns ± 0% -25.50% (p=0.000 n=25+23) | |
BM_cast<double,int>/64 18.2ns ± 0% 15.9ns ± 0% -12.76% (p=0.000 n=23+25) | |
BM_cast<double,int>/512 56.6ns ± 3% 54.7ns ±11% -3.21% (p=0.006 n=25+25) | |
BM_cast<double,int>/4k 582ns ± 2% 581ns ± 2% ~ (p=0.617 n=24+24) | |
BM_cast<double,int>/32k 4.57µs ± 3% 4.58µs ± 2% ~ (p=0.466 n=25+22) | |
BM_cast<double,int>/256k 120µs ± 2% 120µs ± 2% ~ (p=0.476 n=25+25) | |
AVX2: | |
name old cpu/op new cpu/op delta | |
BM_cast<float,int>/8 13.3ns ± 1% 11.9ns ± 0% -10.49% (p=0.000 n=58+59) | |
BM_cast<float,int>/64 15.7ns ± 0% 13.4ns ± 0% -14.66% (p=0.000 n=43+56) | |
BM_cast<float,int>/512 32.8ns ± 2% 30.2ns ± 3% -8.13% (p=0.000 n=49+52) | |
BM_cast<float,int>/4k 216ns ± 8% 228ns ±10% +5.98% (p=0.000 n=49+48) | |
BM_cast<float,int>/32k 2.67µs ± 3% 2.66µs ± 2% ~ (p=0.057 n=54+53) | |
BM_cast<float,int>/256k 79.7µs ± 3% 80.2µs ± 1% +0.67% (p=0.004 n=60+60) | |
BM_cast<int,float>/8 13.2ns ± 0% 11.9ns ± 0% -10.11% (p=0.000 n=53+59) | |
BM_cast<int,float>/64 15.7ns ± 0% 13.4ns ± 0% -14.72% (p=0.000 n=47+55) | |
BM_cast<int,float>/512 32.8ns ± 2% 30.1ns ± 2% -8.08% (p=0.000 n=49+55) | |
BM_cast<int,float>/4k 210ns ±10% 223ns ±12% +6.23% (p=0.000 n=59+58) | |
BM_cast<int,float>/32k 2.68µs ± 2% 2.66µs ± 3% -0.70% (p=0.004 n=55+55) | |
BM_cast<int,float>/256k 79.7µs ± 2% 80.2µs ± 1% +0.63% (p=0.006 n=59+59) | |
BM_cast<double,int>/8 14.0ns ± 0% 10.4ns ± 0% -25.90% (p=0.000 n=20+22) | |
BM_cast<double,int>/64 18.2ns ± 0% 15.9ns ± 0% -12.81% (p=0.000 n=23+18) | |
BM_cast<double,int>/512 56.8ns ± 1% 53.8ns ± 1% -5.31% (p=0.000 n=20+24) | |
BM_cast<double,int>/4k 586ns ± 1% 588ns ± 3% ~ (p=0.486 n=23+23) | |
BM_cast<double,int>/32k 4.62µs ± 3% 4.66µs ± 1% +0.89% (p=0.006 n=24+23) | |
BM_cast<double,int>/256k 121µs ± 1% 121µs ± 1% ~ (p=0.358 n=23+24) | |
AVX512F | |
name old cpu/op new cpu/op delta | |
BM_cast<float,int>/8 18.5ns ± 0% 14.0ns ± 0% -24.31% (p=0.000 n=56+59) | |
BM_cast<float,int>/64 19.4ns ± 3% 14.0ns ± 3% -27.81% (p=0.000 n=59+60) | |
BM_cast<float,int>/512 64.2ns ±10% 21.3ns ± 5% -66.77% (p=0.000 n=58+59) | |
BM_cast<float,int>/4k 405ns ± 3% 108ns ± 7% -73.40% (p=0.000 n=50+60) | |
BM_cast<float,int>/32k 4.95µs ± 3% 2.69µs ± 3% -45.55% (p=0.000 n=57+53) | |
BM_cast<float,int>/256k 92.4µs ± 5% 81.0µs ± 2% -12.35% (p=0.000 n=60+60) | |
BM_cast<int,float>/8 18.3ns ± 1% 13.8ns ± 1% -24.69% (p=0.000 n=57+60) | |
BM_cast<int,float>/64 19.4ns ± 3% 13.9ns ± 3% -28.14% (p=0.000 n=57+60) | |
BM_cast<int,float>/512 65.1ns ± 7% 21.3ns ± 5% -67.29% (p=0.000 n=55+59) | |
BM_cast<int,float>/4k 414ns ± 3% 108ns ± 7% -73.98% (p=0.000 n=52+60) | |
BM_cast<int,float>/32k 4.94µs ± 3% 2.69µs ± 4% -45.49% (p=0.000 n=57+54) | |
BM_cast<int,float>/256k 91.2µs ± 5% 81.0µs ± 2% -11.16% (p=0.000 n=60+59) | |
BM_cast<double,int>/8 19.9ns ± 2% 13.0ns ± 0% -34.89% (p=0.000 n=21+23) | |
BM_cast<double,int>/64 20.5ns ± 3% 15.2ns ± 4% -25.76% (p=0.000 n=23+25) | |
BM_cast<double,int>/512 72.8ns ±14% 37.9ns ± 6% -47.97% (p=0.000 n=24+20) | |
BM_cast<double,int>/4k 844ns ± 5% 477ns ± 4% -43.44% (p=0.000 n=25+23) | |
BM_cast<double,int>/32k 6.82µs ± 5% 3.82µs ± 3% -43.95% (p=0.000 n=25+20) | |
BM_cast<double,int>/256k 167µs ± 4% 122µs ± 2% -26.94% (p=0.000 n=25+25) | |
```",Rasmus Munk Larsen,2023-03-24T16:02:01.192Z,NA,NA | |
1276 (https://gitlab.com/libeigen/eigen/-/merge_requests/1276),Optimize generic_rsqrt_newton_step,"### Reference issue
### What does this implement/fix?
Tweaks `generic_rsqrt_newton_step` in a few ways: | |
- change order of operations to improve accuracy | |
- eliminate a floating point comparison and constant | |
These tests use `scalar_rsqrt_op<float>`. The AVX path uses the intrinsic `_mm256_rsqrt_ps` to compute an initial guess, followed by a single Newton iteration.
Accuracy: for every `float`, compare the output to that calculated by MPFR's `rec_sqrt`. A negative ULP value indicates that the result is less than the reference.
- Before: worst ulps = `-2.28266` | |
- After: worst ulps = `-1.97993` | |
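For readers unfamiliar with the refinement step mentioned above, here is a minimal scalar sketch of one Newton iteration for the reciprocal square root. It is plain C++ rather than Eigen's packet code; `rsqrt_newton_step` and `rsqrt_abs_error` are hypothetical names chosen for illustration, and the order of operations shown is the textbook one, not necessarily the one this MR adopts.

```cpp
#include <cassert>
#include <cmath>

// One Newton iteration for f(y) = 1/y^2 - x, whose positive root is 1/sqrt(x):
//   y_next = y * (1.5 - 0.5 * x * y * y)
// Hypothetical helper for illustration; not Eigen's generic_rsqrt_newton_step.
inline float rsqrt_newton_step(float x, float y) {
  return y * (1.5f - 0.5f * x * y * y);
}

// Absolute error of an estimate y against the reference 1/sqrt(x).
inline float rsqrt_abs_error(float x, float y) {
  return std::fabs(y - 1.0f / std::sqrt(x));
}
```

Starting from a rough guess such as `0.45f` for `x = 4.0f`, each call to `rsqrt_newton_step` should move the estimate closer to `0.5f`.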
Speed: (ms, repeats = 1 << 30 / size) | |
| Size | Old | New | Diff | | |
|------------|---------|---------|--------| | |
| 32000 | 283.879 | 194.964 | -31.3% | | |
| 64000 | 216.68 | 180.354 | -16.8% | | |
| 128000 | 216.34 | 208.688 | -3.5% | | |
| 256000 | 212.886 | 193.805 | -9.0% | | |
| 512000 | 212.307 | 197.84 | -6.8% | | |
| 1024000 | 217.387 | 193.233 | -11.1% | | |
| 2048000 | 268.367 | 248.117 | -7.5% | | |
| 4096000 | 331.165 | 375.633 | 13.4% | | |
| 8192000 | 370.818 | 355.267 | -4.2% | | |
| 16384000 | 361.733 | 359.107 | -0.7% | | |
| 32768000 | 365.886 | 350.384 | -4.2% | | |
| 65536000 | 365.256 | 353.878 | -3.1% | | |
| 131072000 | 360.055 | 348.825 | -3.1% | | |
| 262144000 | 363.505 | 346.75 | -4.6% | | |
| 524288000 | 362.604 | 360.668 | -0.5% | | |
| 1048576000 | 365.782 | 349.517 | -4.4% | | |
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2023-03-24T22:42:58.452Z,NA,NA | |
1277 (https://gitlab.com/libeigen/eigen/-/merge_requests/1277),Fix incorrect casting in AVX512DQ path.,NA,Rasmus Munk Larsen,2023-03-27T16:55:17.600Z,NA,NA | |
1279 (https://gitlab.com/libeigen/eigen/-/merge_requests/1279),"refactor indexedviewmethods, enable non-const ref access with symbolic indices","### Reference issue
Fixes #2630
### What does this implement/fix?
Indexed view expressions that accept a symbolic index (e.g. `placeholders::last`) currently return by const reference or by value only (`CoeffReturnType`). This patch allows appropriate l-value expressions to return scalars by non-const reference.
Also refactors `IndexedViewMethods.h` so that it is not included twice, and instead defines a const and non-const version of each function. | |
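The const/non-const split described above can be illustrated with a toy container. This is a sketch under stated assumptions: `ToyVector`, `last_t` and `last` are hypothetical stand-ins, not Eigen types, and the real overloads are constrained templates rather than plain member functions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a symbolic index such as placeholders::last.
struct last_t {};
constexpr last_t last{};

// Toy 1-D container (not an Eigen type) with a const and a non-const overload.
class ToyVector {
 public:
  explicit ToyVector(std::size_t n) : data_(n, 0.0) {}

  // Const access: returns the coefficient by value (read-only).
  double operator()(last_t) const { return data_.back(); }

  // Non-const access: returns a reference, so the coefficient is assignable.
  double& operator()(last_t) { return data_.back(); }

  double operator()(std::size_t i) const { return data_[i]; }
  double& operator()(std::size_t i) { return data_[i]; }

 private:
  std::vector<double> data_;
};
```

With the non-const overload in place, `v(last) = 7.0;` compiles for a mutable `ToyVector v`, while a `const ToyVector&` still only allows reads.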
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2023-03-29T01:35:27.618Z,NA,NA | |
1280 (https://gitlab.com/libeigen/eigen/-/merge_requests/1280),disable raw array indexed view access for 1d arrays,"### Reference issue
### What does this implement/fix?
Follow-up to !1279. Raw array access was also enabled for 1D arrays (and was likewise buggy).
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2023-03-29T02:39:46.305Z,NA,NA | |
1283 (https://gitlab.com/libeigen/eigen/-/merge_requests/1283),Use the correct truncating intrinsic for double->int casting.,NA,Rasmus Munk Larsen,2023-04-03T21:38:50.123Z,NA,NA | |
1148 (https://gitlab.com/libeigen/eigen/-/merge_requests/1148),"Guard all malloc, realloc and free() functions with check_that_malloc_is_allowed()","# Context
Checking dynamic allocation at runtime for a specific section of code, using `Eigen::internal::set_is_malloc_allowed(true|false)`.
Changing `eigen_assert` to throw instead of aborting the program (most C++ unit testing frameworks do not support `std::abort()`):
```cpp | |
#include <stdexcept> | |
#define eigen_assert(x) if(!(x)) throw(std::runtime_error(""Assertion failed"")); | |
#include <Eigen/Core> | |
``` | |
ref: https://eigen.tuxfamily.org/dox/TopicAssertions.html | |
After !1055, the following code (using Catch2) crashes on Windows because the memory has been reallocated successfully but is never returned to the caller, due to the throw.
```cpp | |
Eigen::VectorXd q(50); | |
REQUIRE_FALSE(Eigen::internal::set_is_malloc_allowed(false)); | |
CHECK_THROWS(q.conservativeResizeLike(Eigen::VectorXd::Zero(350))); /// calls aligned_realloc(), next line crashes while trying to free the q variable. | |
REQUIRE(Eigen::internal::set_is_malloc_allowed(true)); /// next instruction is a crash as it is trying to free the reallocated-not-really q variable. | |
``` | |
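The ordering problem described above can be sketched without Eigen. In the hedged example below, `malloc_allowed_flag`, `guarded_realloc` and `MallocNotAllowed` are hypothetical names, not Eigen's internals; the point is only that the permission check runs before `std::realloc`, so a throwing check leaves the caller's pointer valid.

```cpp
#include <cassert>
#include <cstdlib>
#include <exception>
#include <new>

// Hypothetical names for illustration; Eigen's own helpers live in Eigen::internal.
struct MallocNotAllowed : std::exception {};

inline bool& malloc_allowed_flag() {
  static bool allowed = true;
  return allowed;
}

// Check first, then reallocate: if the check throws, ptr has not been touched,
// so the caller still owns a valid block and can free it safely.
inline void* guarded_realloc(void* ptr, std::size_t new_size) {
  if (!malloc_allowed_flag()) throw MallocNotAllowed{};
  void* result = std::realloc(ptr, new_size);
  if (result == nullptr) throw std::bad_alloc{};
  return result;
}
```

If the check ran after `std::realloc` instead, the old block could already have been released while the caller still held its address, which matches the crash reported here.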
### What does this implement/fix? | |
* The change introduced in https://gitlab.com/libeigen/eigen/-/merge_requests/1055 only checks *after* the memory was actually allocated, causing the free() function to behave unexpectedly (heap crashes on Windows)
* Add guards for all malloc, realloc and free functions, so users can detect *ANY* unwanted call in their application",Antoine Hoarau,2023-04-04T04:24:23.462Z,NA,NA
1284 (https://gitlab.com/libeigen/eigen/-/merge_requests/1284),Small packet math cleanup.,"This cleans up a few omissions in packet math: | |
1) Remove HasHalfPacket trait, which is unused. | |
2) Add a few missing specializations of pselect and pblend. | |
3) Set HasBlend for the packet types that support it.",Rasmus Munk Larsen,2023-04-04T16:14:33.460Z,NA,NA | |
1286 (https://gitlab.com/libeigen/eigen/-/merge_requests/1286),qualify non-const symbolic indexed view with is_lvalue,"### Reference issue
### What does this implement/fix?
Isn't it patently obvious from the title? | |
In some situations, the compiler will use the non-const symbolic indexed view overload when the expression is not an l-value. This adds an explicit qualifier to check for l-value-ness during template substitution. | |
``` | |
Eigen::VectorXd vec(10); | |
auto mapExpr = Eigen::Map<const Eigen::VectorXd>(vec.data(), vec.size()); // oops, the type should be const Map<const VectorXd>! | |
const Eigen::Index start_idx = mapExpr(0) < mapExpr(Eigen::indexing::last) ? 0 : mapExpr.size() - 1; | |
``` | |
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2023-04-04T19:06:33.141Z,NA,NA | |
1282 (https://gitlab.com/libeigen/eigen/-/merge_requests/1282),ASAN fixes for AVX512 GEMM/TRSM,"### Reference issue
Apologies for the long delay! This is a follow-up to https://gitlab.com/libeigen/eigen/-/merge_requests/1067. | |
This MR addresses some memory related issues in the AVX512 GEMM/TRSM kernels detected via address sanitizer. | |
### What does this implement/fix? | |
For GEMM the fix implemented is mentioned [here](https://gitlab.com/libeigen/eigen/-/merge_requests/1067#note_1111452507). The buffer overrun comes from the `A` matrix pre-loads in the `kloop`. The fix is to split the `k` loop into two sections (`k = k_ + kRem`). Pre-loads are disabled when handling `kRem`. | |
For TRSM, masked loads were added to `aux_loadB`. For certain remainder cases we were loading out-of-bounds data.
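The loop split described for GEMM can be sketched on a toy dot product. This is a simplified illustration, not the AVX512 kernel: `dot_with_preload` is a hypothetical function, the look-ahead is a single element, and the remainder is fixed at one iteration.

```cpp
#include <cassert>
#include <cstddef>

// Toy dot product with a one-element look-ahead (a stand-in for the kernel's
// pre-loads of A). All names here are hypothetical.
inline double dot_with_preload(const double* a, const double* b, std::size_t k) {
  if (k == 0) return 0.0;
  const std::size_t k_rem = 1;           // iterations that must not pre-load
  const std::size_t k_main = k - k_rem;  // k = k_main + k_rem
  double acc = 0.0;
  double a_cur = a[0];
  // Main section: each iteration pre-loads a[i + 1], which is always in bounds
  // because i + 1 <= k_main < k.
  for (std::size_t i = 0; i < k_main; ++i) {
    const double a_next = a[i + 1];
    acc += a_cur * b[i];
    a_cur = a_next;
  }
  // Remainder section: no pre-load, so nothing past a[k - 1] is read.
  acc += a_cur * b[k - 1];
  return acc;
}
```

A single loop over all `k` iterations with the same pre-load would read `a[k]` on its last pass, which is the kind of overrun an address sanitizer reports.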
### Additional information | |
As far as I can tell, there is no significant performance impact from the changes in the GEMM kernels.",b-shi,2023-04-05T03:26:31.103Z,NA,NA
1287 (https://gitlab.com/libeigen/eigen/-/merge_requests/1287),Don't crash on empty tensor contraction.,Since https://gitlab.com/libeigen/eigen/-/commit/9b48d1021569d5d6b87d6cc452ca18d777e34250 we return nullptr for size 0 allocations. This tripped an assert in TensorContraction.h. Omitting the assert and returning nullptr suffices.,Rasmus Munk Larsen,2023-04-05T17:06:15.417Z,NA,NA | |
1288 (https://gitlab.com/libeigen/eigen/-/merge_requests/1288),DOC: Update documentation for 3.4.x,"### Reference issue
### What does this implement/fix?
Locally checking out `master` and trying to build the documentation ran into these errors. | |
### Additional information | |
<!--Any additional information you think is important.-->",Rohit Goswami,2023-04-06T19:20:42.418Z,NA,NA | |
1291 (https://gitlab.com/libeigen/eigen/-/merge_requests/1291),exclude `Eigen/Core` and `Eigen/src/Core` from being ignored due to `core` ignore rule,"### Reference issue | |
#2643 | |
### What does this implement/fix? | |
Fixes the fact that `Eigen/Core` and `Eigen/src/Core` are ignored by `.gitignore` on Windows.",rconde01,2023-04-12T19:14:37.122Z,NA,NA
1281 (https://gitlab.com/libeigen/eigen/-/merge_requests/1281),Insert from triplets,"### Reference issue
More tools for sparse matrix users per discussion in #2631 | |
### What does this implement/fix? | |
Add `insertFromTriplets`/`insertFromSortedTriplets`, which allow users to add a batch of triplets to an existing sparse matrix, with functionality similar to that of `setFromTriplets`.
If the duplicate functor is addition (the most common case), this is not much different from the following operation (which is basically a merge of sorted arrays):
``` | |
SparseMatrixType tmp(mat.rows(),mat.cols()); | |
tmp.setFromTriplets(begin,end); | |
mat += tmp; // mat = mat.binaryExpr(tmp,internal::scalar_sum_op<SparseMatrixType::Scalar>()); | |
``` | |
However, this only works by happenstance, as the binary evaluator handles non-duplicate entries by assuming a neutral element of zero: `m_value = m_functor(m_lhsIter.value(), Scalar(0));`. For addition: `a = a + 0 == a`. This is not valid in general, as is the case when the functor is multiplication: `a = a * 0 != a`. This is addressed by specializing the sparse-sparse evaluator when the duplicate functor is wrapped with `scalar_dup_op`. In this case, non-duplicate entries are returned without modification.
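To make the neutral-element issue concrete, here is a small sketch of a sorted merge that applies the duplicate functor only where both operands store an entry. It uses plain `std::vector` pairs instead of Eigen's sparse evaluators; `Entry` and `merge_sparse` are hypothetical names, and this is not the specialization added by the MR.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using Entry = std::pair<int, double>;  // (index, value), sorted by index

// Merge two sorted sparse vectors. dup is applied only where both operands
// store an entry; entries present in just one operand are copied unchanged,
// so no neutral element is assumed. Hypothetical helper, not Eigen's evaluator.
template <typename DupFunctor>
std::vector<Entry> merge_sparse(const std::vector<Entry>& lhs,
                                const std::vector<Entry>& rhs, DupFunctor dup) {
  std::vector<Entry> out;
  std::size_t i = 0, j = 0;
  while (i < lhs.size() && j < rhs.size()) {
    if (lhs[i].first < rhs[j].first) {
      out.push_back(lhs[i++]);
    } else if (rhs[j].first < lhs[i].first) {
      out.push_back(rhs[j++]);
    } else {
      out.emplace_back(lhs[i].first, dup(lhs[i].second, rhs[j].second));
      ++i;
      ++j;
    }
  }
  while (i < lhs.size()) out.push_back(lhs[i++]);
  while (j < rhs.size()) out.push_back(rhs[j++]);
  return out;
}
```

With a multiplication functor, an entry that appears in only one operand keeps its value here, whereas combining it with an implicit zero would wipe it out.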
Also, this MR reverts `setFromTriplets` to perform out-of-place sorting with transposed assignment (credits to the original author Gael) with a few minor optimizations. | |
`setFromTriplets` benchmarks: | |
The performance of `setFromTriplets` is sensitive to the ordering of the triplets as well as the number of duplicate entries. This benchmark sets 50 sparse matrices with pre-allocated memory from a container of randomly shuffled triplets with the following characteristics: | |
Rows: `1,140,149` | |
Cols: `1,140,149` | |
Unique nonzeros: `3,309,592` | |
Duplicates: `827,398` | |
Time in ms | |
| Eigen 3.4 | This MR| | |
| ------ | ------ | | |
| 15,282 | 14,469 | | |
| 15,824 | 14,167 | | |
| 15,233 | 13,844 | | |
| 15,120 | 14,261 | | |
This is roughly 7% faster than the implementation in 3.4. The improvement is attributed to not using `reserveInnerVectors`, which does a bit more work than is required when creating a brand new sparse matrix. Instead, this MR allocates a temporary array for the non-zero entries and re-uses the same buffer for the duplicate removal. It also attempts to allocate this buffer using `ei_declare_aligned_stack_constructed_variable`, which can avoid a heap allocation if the matrix inner/outer sizes are not astoundingly gigantic.
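As a rough illustration of why triplet order and duplicate count matter, the sketch below assembles entries by sorting a copy of the triplets and collapsing runs with equal coordinates. It deliberately swaps in `std::stable_sort` for the transposed-assignment approach this MR describes, and `Triplet` and `assemble` are hypothetical names, so treat it as a conceptual model only.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct Triplet {
  int row;
  int col;
  double value;
};

// Sort a copy of the input in column-major order, then combine runs of equal
// (row, col) with dup. Hypothetical sketch; not how setFromTriplets is written.
template <typename DupFunctor>
std::vector<Triplet> assemble(std::vector<Triplet> triplets, DupFunctor dup) {
  std::stable_sort(triplets.begin(), triplets.end(),
                   [](const Triplet& a, const Triplet& b) {
                     return a.col != b.col ? a.col < b.col : a.row < b.row;
                   });
  std::vector<Triplet> out;
  for (const Triplet& t : triplets) {
    if (!out.empty() && out.back().row == t.row && out.back().col == t.col) {
      out.back().value = dup(out.back().value, t.value);
    } else {
      out.push_back(t);
    }
  }
  return out;
}
```

Shuffled input makes the sort do more work, and every duplicate costs one extra functor call, which is consistent with the sensitivity noted in the benchmark description.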
Previously the following fixes were related to this MR, but I decided to go in another direction. Still, I think these are worthwhile changes/improvements: | |
- Generalized `CompressedStorageIterator` so that its value type is not `std::pair` but a thin pair-like class with comparators (the default comparator for `std::pair` containing a complex scalar is problematic). | |
- Allow `CompressedStorageIterator`'s value and reference types to be compared directly to `StorageIndex` (convenient for some STL algorithms) | |
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2023-04-12T20:01:49.704Z,NA,NA | |
1293 (https://gitlab.com/libeigen/eigen/-/merge_requests/1293),Enable new AVX512 GEMM kernel by default.,"This enables by default the new AVX512 GEMM kernel contributed by Intel, following recent ASAN fixes.",Rasmus Munk Larsen,2023-04-14T02:55:43.264Z,NA,NA | |
1294 (https://gitlab.com/libeigen/eigen/-/merge_requests/1294),Improve accuracy of erf().,"This implements an improved rational approximation and more careful clamping for the error function erf(). Speed is unchanged. | |
Table of maximum relative errors in ULPs before and after. | |
| Range | Before | After | | |
| ----------------- | ------- | ------ | | |
| Subnormal floats | 250,000 | 630 | | |
| Normalized floats | 32 | 3 | | |
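For context on the unit used in the table, one common way to count ULPs between two single-precision values is to compare their bit patterns. The helper below is a simplified sketch: `ulp_distance` is a hypothetical name, it only handles finite values of the same sign, and it yields whole ULPs between two floats, whereas the figures above are measured against a higher-precision reference.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Number of representable floats between two finite values of the same sign.
// Hypothetical helper for illustration only.
inline std::int64_t ulp_distance(float a, float b) {
  std::int32_t ia;
  std::int32_t ib;
  std::memcpy(&ia, &a, sizeof(float));  // reinterpret the bits without UB
  std::memcpy(&ib, &b, sizeof(float));
  return std::llabs(static_cast<long long>(ia) - static_cast<long long>(ib));
}
```

Adjacent floats differ by exactly one under this measure, so an error of 3 ULPs means the result is at most three representable values away from the correctly rounded one.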
Thanks to my colleague James Lottes for deriving the rational approximant.",Rasmus Munk Larsen,2023-04-14T16:57:57.464Z,NA,NA | |
1296 (https://gitlab.com/libeigen/eigen/-/merge_requests/1296),Add dynamic dispatch to BF16 GEMM (Power) and new VSX version,"Add dynamic dispatch to BF16 GEMM (Power) and new VSX version - 13.4X faster than original generic code, 1.36X faster than F32 GEMM (non-MMA). | |
``` | |
Lots of other fixes and improvements - | |
- Many conversions from BF16 <-> F32. | |
- Improve dynamic dispatch code for all cases. | |
- Hardware conversion for P10 in vector F32->BF16. | |
- Simplify partial packet loads and stores for early processors. | |
- Improve software conversion in vector F32->BF16 - up to 40% faster. | |
- Disabled subnormal calculations since none of the other architectures have it. | |
- Fix compilation issues and make code consistent. | |
- Generic code is 1.84X faster due to improved vector conversions. | |
```",Chip Kerchner,2023-04-14T22:20:43.063Z,NA,NA | |
1295 (https://gitlab.com/libeigen/eigen/-/merge_requests/1295),Refactor IndexedView,"### Reference issue
### What does this implement/fix?
Refactor code to minimize verbosity of SFINAE in public API and make it easier to debug and maintain. Re-enable raw, fixed-size array access. | |
Typical operator() overload before: | |
``` | |
template <typename RowIndices, typename ColIndices> | |
std::enable_if_t<internal::valid_indexed_view_overload<RowIndices, ColIndices>::value && | |
internal::traits<IndexedViewType<RowIndices, ColIndices>>::ReturnAsBlock, | |
typename internal::traits<IndexedViewType<RowIndices, ColIndices>>::BlockType> | |
operator()(const RowIndices& rowIndices, const ColIndices& colIndices) { | |
typedef typename internal::traits<IndexedViewType<RowIndices, ColIndices>>::BlockType BlockType; | |
IvcRowType<RowIndices> actualRowIndices = ivcRow(rowIndices); | |
IvcColType<ColIndices> actualColIndices = ivcCol(colIndices); | |
return BlockType(derived(), internal::first(actualRowIndices), internal::first(actualColIndices), | |
internal::index_list_size(actualRowIndices), internal::index_list_size(actualColIndices)); | |
} | |
``` | |
After: | |
``` | |
template <typename RowIndices, typename ColIndices, EnableOverload<RowIndices, ColIndices> = true> | |
IndexedViewType<RowIndices, ColIndices> operator()(const RowIndices& rowIndices, const ColIndices& colIndices) { | |
return IndexedViewSelector<RowIndices, ColIndices>::run(derived(), rowIndices, colIndices); | |
} | |
``` | |
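The shorter form works because the constraint is carried by a defaulted non-type template parameter instead of the return type. The block below is a minimal, self-contained sketch of that idiom; `EnableIfIntegral`, `EnableIfFloating`, and `describe` are hypothetical names for illustration and are not part of Eigen.

```cpp
#include <type_traits>

// Hypothetical alias (not Eigen's EnableOverload): names the type bool only
// when T is integral; otherwise substitution fails and the overload is dropped.
template <typename T>
using EnableIfIntegral = std::enable_if_t<std::is_integral<T>::value, bool>;

// Same idea for floating-point types.
template <typename T>
using EnableIfFloating = std::enable_if_t<std::is_floating_point<T>::value, bool>;

// The constraint lives in a defaulted template parameter, so the return type
// and the function body stay free of enable_if noise.
template <typename T, EnableIfIntegral<T> = true>
int describe(const T&) {
  return 1;  // selected for integral arguments
}

template <typename T, EnableIfFloating<T> = true>
int describe(const T&) {
  return 2;  // selected for floating-point arguments
}
```

Because the two overloads differ in the type of their second template parameter, they are distinct templates and can coexist, which is not the case when the condition is placed in a defaulted type parameter.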
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2023-04-17T12:32:51.797Z,NA,NA | |
1297 (https://gitlab.com/libeigen/eigen/-/merge_requests/1297),"Add `Packet4ui`, `Packet8ui`, and `Packet4ul` to the `SSE`/`AVX` `PacketMath.h` headers","### Reference issue | |
None. | |
### What does this implement/fix? | |
Adds `Packet`s for `uint32_t` and `uint64_t` on x86_64. | |
### Additional information | |
The decimal digits of $\pi$ contain the sequence `314159` as of the 176'452nd decimal digit.",Pedro Gonnet,2023-04-17T23:33:59.892Z,NA,NA | |
1299 (https://gitlab.com/libeigen/eigen/-/merge_requests/1299),New BF16 pcast functions and move type casting to TypeCasting.h,New BF16 pcast functions and move type casting to TypeCasting.h,Chip Kerchner,2023-04-18T02:38:39.214Z,NA,NA | |
1302 (https://gitlab.com/libeigen/eigen/-/merge_requests/1302),fix typo in sse packetmath,"### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2023-04-18T18:45:39.514Z,NA,NA | |
1298 (https://gitlab.com/libeigen/eigen/-/merge_requests/1298),Use select ternary op in tensor select evaulator,"### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
If the select expression is compatible with `scalar_boolean_select_op`, use that instead. Otherwise, fall back to the current (partially) vectorized implementation. This only affects the `packet` function -- everything else is unmodified. | |
``` | |
#include <unsupported/Eigen/CXX11/Tensor> | |
using namespace Eigen; | |
int main() | |
{ | |
  // Typed comparison functor: returns a float mask instead of bool. | |
  using TypedGTOp = internal::scalar_cmp_op<float, float, internal::cmp_GT, true>; | |
  Index i_size = 2000, j_size = 300, k_size = 1000; | |
  Tensor<float, 3> selector(i_size, j_size, k_size); | |
  Tensor<float, 3> mat1(i_size, j_size, k_size); | |
  Tensor<float, 3> mat2(i_size, j_size, k_size); | |
  Tensor<float, 3> result(i_size, j_size, k_size); | |
  selector.setRandom(); | |
  mat1.setRandom(); | |
  mat2.setRandom(); | |
  result = (selector > mat1).select(mat1, mat2); // current boolean path | |
  result = selector.binaryExpr(mat1, TypedGTOp()).select(mat1, mat2); // new typed path | |
} | |
``` | |
Benchmarks (ms): | |
| Boolean Comparison| Typed Comparison| Diff| | |
| ------ | ------ | --- | | |
|3,664 | 3,176 |-13%| | |
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2023-04-18T20:52:17.345Z,NA,NA | |
1303 (https://gitlab.com/libeigen/eigen/-/merge_requests/1303),Make sure we return +/-1 above the clamping point for Erf().,"This also gives a tiny speedup in some cases, here measured for AVX2 on Skylake compiled with -march=skylake. | |
``` | |
name old cpu/op new cpu/op delta | |
BM_eigen_erf_float/1 1.10ns ± 0% 1.09ns ± 0% -0.43% (p=0.000 n=55+57) | |
BM_eigen_erf_float/8 13.9ns ± 1% 12.5ns ± 6% -10.05% (p=0.000 n=48+60) | |
BM_eigen_erf_float/64 38.9ns ± 6% 36.4ns ± 3% -6.31% (p=0.000 n=46+42) | |
BM_eigen_erf_float/512 231ns ± 3% 221ns ± 4% -4.17% (p=0.000 n=52+47) | |
BM_eigen_erf_float/4k 1.80µs ± 3% 1.73µs ± 5% -3.55% (p=0.000 n=58+53) | |
BM_eigen_erf_float/32k 14.2µs ± 3% 13.8µs ± 7% -3.33% (p=0.000 n=51+54) | |
BM_eigen_erf_float/256k 117µs ± 5% 115µs ± 5% -1.76% (p=0.000 n=59+57) | |
BM_eigen_erf_float/1M 470µs ± 3% 463µs ± 6% -1.47% (p=0.000 n=58+60) | |
```",Rasmus Munk Larsen,2023-04-18T21:22:21.788Z,NA,NA | |
1306 (https://gitlab.com/libeigen/eigen/-/merge_requests/1306),Delete last few occurences of HasHalfPacket.,"### Reference issue | |
None. | |
### What does this implement/fix? | |
When removing the unused enum HasHalfPacket, I missed a few instances. This change removes the last few. | |
### Additional information | |
<!--Any additional information you think is important.--> | |
People are not wearing enough hats.",Rasmus Munk Larsen,2023-04-19T18:06:07.888Z,NA,NA | |
1301 (https://gitlab.com/libeigen/eigen/-/merge_requests/1301),Geometry/EulerAngles: make sure that returned solution has canonical ranges,"## NB: this is possibly a breaking change because, compared to legacy code, it _may_ return a _different_ dual solution, which is equally valid, but mapped to the respective canonical (standard) Euler angle ranges. | |
### Reference issue | |
#2617 (initially started out describing an unrelated issue in `unsupported/EulerAngles`; this issue **does not** close #2617) | |
### What does this implement/fix? | |
Per detailed discussion in #2617, this is an initial implementation of the formulas derived by @evbernardes given in [this comment](https://gitlab.com/libeigen/eigen/-/issues/2617#note_1298729055) for Tait-Bryan angle sequences and in [this comment](https://gitlab.com/libeigen/eigen/-/issues/2617#note_1313934905) for proper Euler sequences. | |
Prior to this fix, Eigen returned angles in the non-standard ranges `[0, pi] × [-pi, pi] × [-pi, pi]`, which is inappropriate for, e.g., yaw-pitch-roll computations, probably the most common application of `.eulerAngles()`. | |
Without an angle range restriction, for any given rotation (matrix) and Euler sequence, there are two valid solution sets of Euler angles in the general, non-degenerate (non-gimbal-lock) case. This MR applies solution-flipping formulas derived by @evbernardes to flip the solution to the one inside the canonical range for the respective kind of Euler angles: | |
- `[-pi, pi] × [-pi/2, pi/2] × [-pi, pi]` for Tait-Bryan angles (a0 != a2, e.g. ZXY) | |
- `[-pi, pi] × [0, pi] × [-pi, pi]` for proper Euler angles (a0 == a2, e.g. XYX) | |
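The flip between the two solution sets can be sketched with the standard dual-solution identities for Euler angles. This is an illustration only: it does not use Eigen, the function names are hypothetical, and the exact formulas used by the MR are the ones in the linked comments, which may be written differently. It assumes the middle angle is already within `[-pi, pi]`.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Angles = std::array<double, 3>;
using Mat3 = std::array<std::array<double, 3>, 3>;

const double kPi = 3.14159265358979323846;

// Wrap an angle into [-pi, pi).
double wrapAngle(double a) {
  a = std::fmod(a + kPi, 2.0 * kPi);
  if (a < 0.0) a += 2.0 * kPi;
  return a - kPi;
}

// Tait-Bryan: (a, b, c) and (a + pi, pi - b, c + pi) give the same rotation.
// Flip only when the middle angle lies outside [-pi/2, pi/2].
Angles canonicalTaitBryan(const Angles& e) {
  assert(std::abs(e[1]) <= kPi);  // assumed input range for the middle angle
  if (std::abs(e[1]) <= kPi / 2.0) return e;
  return Angles{{wrapAngle(e[0] + kPi), wrapAngle(kPi - e[1]), wrapAngle(e[2] + kPi)}};
}

// Proper Euler: (a, b, c) and (a + pi, -b, c + pi) give the same rotation.
// Flip only when the middle angle is negative.
Angles canonicalProperEuler(const Angles& e) {
  assert(std::abs(e[1]) <= kPi);  // assumed input range for the middle angle
  if (e[1] >= 0.0) return e;
  return Angles{{wrapAngle(e[0] + kPi), -e[1], wrapAngle(e[2] + kPi)}};
}

// R = Rz(a) * Ry(b) * Rx(c); used only to check that a flip keeps the rotation.
Mat3 rotationZYX(const Angles& e) {
  const double ca = std::cos(e[0]), sa = std::sin(e[0]);
  const double cb = std::cos(e[1]), sb = std::sin(e[1]);
  const double cc = std::cos(e[2]), sc = std::sin(e[2]);
  Mat3 r;
  r[0] = {{ca * cb, ca * sb * sc - sa * cc, ca * sb * cc + sa * sc}};
  r[1] = {{sa * cb, sa * sb * sc + ca * cc, sa * sb * cc - ca * sc}};
  r[2] = {{-sb, cb * sc, cb * cc}};
  return r;
}
```

Building the rotation matrix from the angles before and after `canonicalTaitBryan` should give the same matrix up to rounding, which is a quick way to confirm that only the representation changed.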
This MR introduces a default parameter `bool canonical = true`, which switches all clients to the new (_more_ correct) behaviour. The existing code for computing angles, which has seen long-time battle-testing and seems to be robust to degenerate cases, has deliberately not been touched; instead, we apply a single solution-flip step at the very end when needed, which brings the solution angles into the respective canonical ranges. | |
If a client wants to keep using legacy Eigen behaviour, they can pass `canonical = false` to `.eulerAngles()`.",Juraj Oršulić,2023-04-19T19:12:25.747Z,NA,NA | |
1308 (https://gitlab.com/libeigen/eigen/-/merge_requests/1308),"fix pow for uint32_t, disable pmul<Packet4ul>","### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
Now that `uint32_t` is vectorized for SSE/AVX, error handling for `int_pow` has to be specialized for the unsigned case to avoid undefined packet ops. Also, there is no pmul op for Packet4ul, so I tagged it as such to avoid compilation errors. | |
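As a scalar illustration of why the unsigned case needs its own path, the sketch below separates the negative-exponent handling (signed only) from the shared exponentiation loop. The names `pow_by_squaring` and `int_pow_sketch` are hypothetical, and the policy for negative exponents is an assumption of this sketch, not a statement about what Eigen's `int_pow` returns.

```cpp
#include <cstdint>
#include <type_traits>

// Exponentiation by squaring for a non-negative exponent.
// For unsigned types the result wraps modulo 2^N.
template <typename T>
T pow_by_squaring(T base, T exp) {
  T result = 1;
  while (true) {
    if (exp & 1) result *= base;
    exp >>= 1;
    if (exp == 0) break;
    base *= base;  // squared only while more exponent bits remain
  }
  return result;
}

// Signed path: a negative exponent has an integer result only for bases 1 and -1.
// Returning 0 otherwise is this sketch's policy (an assumption).
template <typename T, std::enable_if_t<std::is_signed<T>::value, bool> = true>
T int_pow_sketch(T base, T exp) {
  if (exp < 0) {
    if (base == 1) return 1;
    if (base == -1) return (exp % 2 == 0) ? 1 : -1;
    return 0;
  }
  return pow_by_squaring(base, exp);
}

// Unsigned path: exp < 0 can never hold and negation is not meaningful,
// so the error handling is left out entirely.
template <typename T, std::enable_if_t<std::is_unsigned<T>::value, bool> = true>
T int_pow_sketch(T base, T exp) {
  return pow_by_squaring(base, exp);
}
```

In a vectorized version the signed branch would need packet-level sign tests and negation, which is where an unsigned packet type can run into operations that are not defined for it.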
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2023-04-21T05:47:57.020Z,NA,NA | |
1307 (https://gitlab.com/libeigen/eigen/-/merge_requests/1307),New VSX version of BF16 GEMV (Power) - up to 6.7X faster,"New VSX version of BF16 GEMV (Power) | |
``` | |
GEMV RowMajor 6.7X faster than generic code | |
GEMV ColMajor 5.9X faster than generic code | |
GEMV RowMajor 2.0X faster than F32 GEMV RowMajor | |
GEMV ColMajor 1.39X faster than F32 GEMV ColMajor | |
```",Chip Kerchner,2023-04-21T17:07:00.580Z,NA,NA | |
1309 (https://gitlab.com/libeigen/eigen/-/merge_requests/1309),Packet4ul does not have Abs2,"### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2023-04-21T20:16:18.625Z,NA,NA | |
1312 (https://gitlab.com/libeigen/eigen/-/merge_requests/1312),Fix boolean bitwise and warning.,Eliminate simple warning in test.,Antonio Sánchez,2023-04-25T15:52:20.397Z,NA,NA | |
1311 (https://gitlab.com/libeigen/eigen/-/merge_requests/1311),Fix sparse iterator and tests.,"On macos+clang, the `StorageRef` type needs to be move-able for use in | |
`std::sort` and friends. | |
Also fixed warnings related to deprecation and removal of | |
`std::random_shuffle`.",Antonio Sánchez,2023-04-25T19:05:49.764Z,NA,NA | |
1304 (https://gitlab.com/libeigen/eigen/-/merge_requests/1304),Vectorize cast,"### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
Specialize the evaluator of `scalar_cast_op` to handle different input and output packet types, multiple input packets per output packet, etc. Works as long as `type_casting_traits` is correctly defined and `pcast` returns a single packet. | |
If the new packet size is less than the old packet size, then we are performing multiple loads for the same data (in fact, the entire expression). For example: | |
``` | |
template<> EIGEN_STRONG_INLINE Packet2d pcast<Packet4f, Packet2d>(const Packet4f& a) { | |
// Simply discard the second half of the input | |
return _mm_cvtps_pd(a); | |
} | |
``` | |
We would load elements `{0,1,2,3}`, increment the assignment loop by two, load elements `{2,3,4,5}`, and so on. We also reduce the alignment by half, and probably perform unaligned loads. This isn't great, but it is probably preferable to the scalar path. I recommend investigating a remedy to optimize casts to a smaller packet (fewer elements). | |
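The overlapping-load pattern can be made concrete with scalar stand-ins for the packets. `Float4`, `Double2`, and `castFloatToDouble` are hypothetical names for this sketch and are not Eigen types; the function returns the number of 4-wide loads so the redundancy is visible.

```cpp
#include <cstddef>
#include <vector>

// Scalar stand-ins for SIMD registers (not Eigen's packet types).
struct Float4 { float v[4]; };
struct Double2 { double v[2]; };

Float4 load4(const float* p) { return Float4{{p[0], p[1], p[2], p[3]}}; }

// Mirrors the pcast above: only the lower two lanes are converted.
Double2 castLowerHalf(const Float4& a) {
  return Double2{{static_cast<double>(a.v[0]), static_cast<double>(a.v[1])}};
}

// Converts src to dst and returns how many 4-wide loads were issued.
std::size_t castFloatToDouble(const std::vector<float>& src, std::vector<double>& dst) {
  const std::size_t n = src.size();
  dst.resize(n);
  std::size_t loads = 0;
  std::size_t i = 0;
  // Each step reads 4 source elements but advances by only 2.
  for (; i + 4 <= n; i += 2) {
    const Double2 d = castLowerHalf(load4(&src[i]));
    dst[i] = d.v[0];
    dst[i + 1] = d.v[1];
    ++loads;
  }
  // Scalar tail: a 4-wide load here would read past the end of src.
  for (; i < n; ++i) dst[i] = static_cast<double>(src[i]);
  return loads;
}
```

For 8 input elements this issues 3 loads (12 elements read) to produce 6 packet-converted outputs, with the last 2 handled by the scalar tail.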
Benchmarks: | |
For a pure cast, `dst = src.cast<DstType>()` I saw very little difference in performance from the scalar path, which is understandable as there is very little work being done to justify the overhead of the loads and stores. However, I also saw no decrease in performance, which is good. | |
The story is different for a more complex expression like `dst = src.abs2().sqrt().log().cast<DstType>();` | |
AVX (double->float): -55% | |
AVX (float->double): -59% | |
From this example, we see that the meat and potatoes of the expression -- the arithmetic operations -- vastly outweigh the cost of the cast, even if we effectively evaluate the expression twice, as is the case for `float->double`. The similarity of the numbers is also explainable. In the `double->float` case, we invoke **1** `Packet4d` op to increment the loop by **4** elements. In `float->double`, we invoke **2** `Packet8f` ops to increment the loop by **8** elements. Overall, the number of packet ops per element is the same. | |
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2023-04-26T02:50:14.749Z,NA,NA | |
1313 (https://gitlab.com/libeigen/eigen/-/merge_requests/1313),"AVX2: Packet4ul has pmul, abs2","### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
Presumably, we can use the int64_t multiplication routine for uint64_t. This will resolve a few compilation issues and enable those features. | |
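The presumption holds for the low 64 bits of a product in two's complement: the result bit pattern does not depend on whether the operands are read as signed or unsigned. The sketch below shows one common way to assemble a 64-bit product from 32-bit partial products; the function names are hypothetical, and whether Eigen's routine has exactly this shape is an assumption.

```cpp
#include <cstdint>

// Low 64 bits of a * b built from 32-bit partial products.
// The same bit pattern results for signed and unsigned interpretations.
std::uint64_t mul64_via_32bit_parts(std::uint64_t a, std::uint64_t b) {
  const std::uint64_t mask = 0xFFFFFFFFull;
  const std::uint64_t a_lo = a & mask;
  const std::uint64_t a_hi = a >> 32;
  const std::uint64_t b_lo = b & mask;
  const std::uint64_t b_hi = b >> 32;
  const std::uint64_t lo_lo = a_lo * b_lo;                // full 64-bit product
  const std::uint64_t cross = a_lo * b_hi + a_hi * b_lo;  // only its low 32 bits survive the shift
  return lo_lo + (cross << 32);                           // a_hi * b_hi falls outside 64 bits
}

// Signed wrapper: reinterpret, multiply, reinterpret back.
// The final conversion is modular on two's-complement targets.
std::int64_t mul64_signed(std::int64_t a, std::int64_t b) {
  return static_cast<std::int64_t>(
      mul64_via_32bit_parts(static_cast<std::uint64_t>(a), static_cast<std::uint64_t>(b)));
}
```

Since the unsigned routine never inspects a sign bit, the same code path can serve both `int64_t` and `uint64_t` lanes.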
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2023-04-26T16:22:17.647Z,NA,NA | |
1316 (https://gitlab.com/libeigen/eigen/-/merge_requests/1316),"SSE Packet4ui has pcmp, pmin, pmax","### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
Eigen implicitly requires that any vectorizable type have pcmp, pmin, pmax (among others) or the packet math test will fail. `Packet4ui` actually has these functions defined, but the enums were conditionally set by `#ifdef EIGEN_VECTORIZE_SSE4_1`. | |
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2023-04-28T20:36:08.855Z,NA,NA | |
1305 (https://gitlab.com/libeigen/eigen/-/merge_requests/1305),Add half-`Packet` operations to `StridedLinearBufferCopy`.,"### Reference issue | |
None. | |
### What does this implement/fix? | |
The operations in `StridedLinearBufferCopy::Run` operate either on groups of `Packet`, single `Packet`s, or on `Scalar`. | |
In some cases, though, e.g. when using `AVX`, which provides `Packet8f`, we don't have enough data to fill a single `Packet8f`, but still enough to make the `Scalar` operations slow. | |
This MR checks whether the `Packet` implementation provides a half-`Packet`, and if so, provides the operations using half-`Packet`s where appropriate. | |
Note that this code avoids checking `packet_traits<Scalar>::HasHalfPacket`, since this appears to be set only in `Eigen/src/Core/arch/AVX512/PacketMathFP16.h`, even though almost all `Packet`s provide a distinct `packet_traits<Scalar>::half`. | |
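The effect of the extra half-`Packet` step can be sketched with plain block copies. The block sizes below stand in for `Packet8f` and its half, and `copyWithHalfPacket` and `CopyStats` are hypothetical names; this is not the code in `StridedLinearBufferCopy::Run`.

```cpp
#include <cstddef>
#include <cstring>

// Block sizes standing in for a full packet (8 floats) and its half packet.
constexpr std::size_t kPacket = 8;
constexpr std::size_t kHalfPacket = 4;

// Counts how many operations of each width were issued.
struct CopyStats {
  std::size_t full = 0;
  std::size_t half = 0;
  std::size_t scalar = 0;
};

CopyStats copyWithHalfPacket(const float* src, float* dst, std::size_t count) {
  CopyStats stats;
  std::size_t i = 0;
  // Full packets while at least kPacket elements remain.
  for (; i + kPacket <= count; i += kPacket) {
    std::memcpy(dst + i, src + i, kPacket * sizeof(float));
    ++stats.full;
  }
  // At most one half packet for the remainder.
  if (i + kHalfPacket <= count) {
    std::memcpy(dst + i, src + i, kHalfPacket * sizeof(float));
    i += kHalfPacket;
    ++stats.half;
  }
  // Scalar tail: at most kHalfPacket - 1 elements.
  for (; i < count; ++i) {
    dst[i] = src[i];
    ++stats.scalar;
  }
  return stats;
}
```

With 10 full packets plus 7 extra elements, the tail shrinks from 7 scalar copies to one half-packet copy and 3 scalar copies.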
### Additional information | |
Whales are mammals, and as such, they breastfeed their young. | |
### Benchmarks | |
To evaluate the change, I created the following benchmark suite based on github.com/google/benchmark: | |
```c++ | |
template <typename Scalar> | |
void BM_StridedLinearBufferCopyLinear(benchmark::State& state) { | |
// Get the parameters of this test. | |
const int packet_size = Eigen::internal::packet_traits<Scalar>::size; | |
const int num_packets = state.range(0); | |
const int num_extra = state.range(1); | |
// Create 1D source and destination Tensor of the requested shape. | |
const int N = num_packets * packet_size; | |
Eigen::Tensor<Scalar, 1> src(N + num_extra); | |
Eigen::Tensor<Scalar, 1> dest(N + num_extra); | |
// Initialize the source and destination Tensors. | |
src.setRandom(); | |
dest.setZero(); | |
for (auto s : state) { | |
using StridedLinearBufferCopy = | |
Eigen::internal::StridedLinearBufferCopy<Scalar, int>; | |
StridedLinearBufferCopy::template Run< | |
StridedLinearBufferCopy::Kind::Linear>( | |
{/*offset=*/0, /*stride=*/1, /*data=*/dest.data()}, | |
{/*offset=*/0, /*stride=*/1, /*data=*/src.data()}, | |
/*count=*/N + num_extra); | |
CHECK_EQ(dest(0), src(0)); | |
} | |
} | |
template <typename Scalar> | |
void BM_StridedLinearBufferCopyScatter(benchmark::State& state) { | |
// Get the parameters of this test. | |
const int packet_size = Eigen::internal::packet_traits<Scalar>::size; | |
const int num_packets = state.range(0); | |
const int num_extra = state.range(1); | |
// Create 1D source and destination Tensor of the requested shape. | |
const int N = num_packets * packet_size; | |
Eigen::Tensor<Scalar, 1> src(N + num_extra); | |
Eigen::Tensor<Scalar, 1> dest(2 * (N + num_extra)); | |
// Initialize the source and destination Tensors. | |
src.setRandom(); | |
dest.setZero(); | |
for (auto s : state) { | |
using StridedLinearBufferCopy = | |
Eigen::internal::StridedLinearBufferCopy<Scalar, int>; | |
StridedLinearBufferCopy::template Run< | |
StridedLinearBufferCopy::Kind::Scatter>( | |
{/*offset=*/0, /*stride=*/2, /*data=*/dest.data()}, | |
{/*offset=*/0, /*stride=*/1, /*data=*/src.data()}, | |
/*count=*/N + num_extra); | |
CHECK_EQ(dest(0), src(0)); | |
} | |
} | |
template <typename Scalar> | |
void BM_StridedLinearBufferCopyFillLinear(benchmark::State& state) { | |
// Get the parameters of this test. | |
const int packet_size = Eigen::internal::packet_traits<Scalar>::size; | |
const int num_packets = state.range(0); | |
const int num_extra = state.range(1); | |
// Create 1D source and destination Tensor of the requested shape. | |
const int N = num_packets * packet_size; | |
Eigen::Tensor<Scalar, 1> src(1); | |
Eigen::Tensor<Scalar, 1> dest(N + num_extra); | |
// Initialize the source and destination Tensors. | |
src.setRandom(); | |
dest.setZero(); | |
for (auto s : state) { | |
using StridedLinearBufferCopy = | |
Eigen::internal::StridedLinearBufferCopy<Scalar, int>; | |
StridedLinearBufferCopy::template Run< | |
StridedLinearBufferCopy::Kind::FillLinear>( | |
{/*offset=*/0, /*stride=*/1, /*data=*/dest.data()}, | |
{/*offset=*/0, /*stride=*/0, /*data=*/src.data()}, | |
/*count=*/N + num_extra); | |
CHECK_EQ(dest(0), src(0)); | |
} | |
} | |
template <typename Scalar> | |
void BM_StridedLinearBufferCopyFillScatter(benchmark::State& state) { | |
// Get the parameters of this test. | |
const int packet_size = Eigen::internal::packet_traits<Scalar>::size; | |
const int num_packets = state.range(0); | |
const int num_extra = state.range(1); | |
// Create 1D source and destination Tensor of the requested shape. | |
const int N = num_packets * packet_size; | |
Eigen::Tensor<Scalar, 1> src(1); | |
Eigen::Tensor<Scalar, 1> dest(2 * (N + num_extra)); | |
// Initialize the source and destination Tensors. | |
src.setRandom(); | |
dest.setZero(); | |
for (auto s : state) { | |
using StridedLinearBufferCopy = | |
Eigen::internal::StridedLinearBufferCopy<Scalar, int>; | |
StridedLinearBufferCopy::template Run< | |
StridedLinearBufferCopy::Kind::FillScatter>( | |
{/*offset=*/0, /*stride=*/2, /*data=*/dest.data()}, | |
{/*offset=*/0, /*stride=*/0, /*data=*/src.data()}, | |
/*count=*/N + num_extra); | |
CHECK_EQ(dest(0), src(0)); | |
} | |
} | |
template <typename Scalar> | |
void BM_StridedLinearBufferCopyGather(benchmark::State& state) { | |
// Get the parameters of this test. | |
const int packet_size = Eigen::internal::packet_traits<Scalar>::size; | |
const int num_packets = state.range(0); | |
const int num_extra = state.range(1); | |
// Create 1D source and destination Tensor of the requested shape. | |
const int N = num_packets * packet_size; | |
Eigen::Tensor<Scalar, 1> src(2 * (N + num_extra)); | |
Eigen::Tensor<Scalar, 1> dest(N + num_extra); | |
// Initialize the source and destination Tensors. | |
src.setRandom(); | |
dest.setZero(); | |
for (auto s : state) { | |
using StridedLinearBufferCopy = | |
Eigen::internal::StridedLinearBufferCopy<Scalar, int>; | |
StridedLinearBufferCopy::template Run< | |
StridedLinearBufferCopy::Kind::Gather>( | |
{/*offset=*/0, /*stride=*/1, /*data=*/dest.data()}, | |
{/*offset=*/0, /*stride=*/2, /*data=*/src.data()}, | |
/*count=*/N + num_extra); | |
CHECK_EQ(dest(0), src(0)); | |
} | |
} | |
#define CREATE_BENCHMARK_FLOAT(benchmark_function) \ | |
BENCHMARK(benchmark_function<float>) \ | |
->ArgPair(10, 0) \ | |
->ArgPair(10, 1) \ | |
->ArgPair(10, 2) \ | |
->ArgPair(10, 3) \ | |
->ArgPair(10, 4) \ | |
->ArgPair(10, 5) \ | |
->ArgPair(10, 6) \ | |
->ArgPair(10, 7) | |
#define CREATE_BENCHMARK_DOUBLE(benchmark_function) \ | |
BENCHMARK(benchmark_function<double>) \ | |
->ArgPair(10, 0) \ | |
->ArgPair(10, 1) \ | |
->ArgPair(10, 2) \ | |
->ArgPair(10, 3) | |
#define CREATE_BENCHMARK(benchmark_function) \ | |
CREATE_BENCHMARK_FLOAT(benchmark_function); \ | |
CREATE_BENCHMARK_DOUBLE(benchmark_function) | |
CREATE_BENCHMARK(BM_StridedLinearBufferCopyLinear); | |
CREATE_BENCHMARK(BM_StridedLinearBufferCopyScatter); | |
CREATE_BENCHMARK(BM_StridedLinearBufferCopyFillLinear); | |
CREATE_BENCHMARK(BM_StridedLinearBufferCopyFillScatter); | |
CREATE_BENCHMARK(BM_StridedLinearBufferCopyGather); | |
``` | |
These are the results before the change: | |
``` | |
Run on gonnet.zrh (48 X 2594 MHz CPUs); 2023-04-20T02:00:41.041679817-07:00 | |
CPU: Intel Haswell with HyperThreading (24 cores) dL1:32KB dL2:256KB dL3:30MB | |
Benchmark Time(ns) CPU(ns) Iterations | |
-------------------------------------------------------------------------------------------------- | |
BM_StridedLinearBufferCopyLinear<float>/10/0_mean 4.61 4.60 148657641 | |
BM_StridedLinearBufferCopyLinear<float>/10/1_mean 5.60 5.55 122490604 | |
BM_StridedLinearBufferCopyLinear<float>/10/2_mean 5.92 5.95 90081675 | |
BM_StridedLinearBufferCopyLinear<float>/10/3_mean 6.21 6.20 103781757 | |
BM_StridedLinearBufferCopyLinear<float>/10/4_mean 6.67 6.67 98333440 | |
BM_StridedLinearBufferCopyLinear<float>/10/5_mean 7.27 7.19 92921701 | |
BM_StridedLinearBufferCopyLinear<float>/10/6_mean 7.17 7.19 95629489 | |
BM_StridedLinearBufferCopyLinear<float>/10/7_mean 8.44 8.41 86564025 | |
BM_StridedLinearBufferCopyLinear<double>/10/0_mean 4.65 4.65 129877156 | |
BM_StridedLinearBufferCopyLinear<double>/10/1_mean 5.56 5.62 123804223 | |
BM_StridedLinearBufferCopyLinear<double>/10/2_mean 5.92 6.01 117764227 | |
BM_StridedLinearBufferCopyLinear<double>/10/3_mean 6.22 6.23 109170634 | |
BM_StridedLinearBufferCopyScatter<float>/10/0_mean 25.8 26.3 27789649 | |
BM_StridedLinearBufferCopyScatter<float>/10/1_mean 26.7 26.5 26698954 | |
BM_StridedLinearBufferCopyScatter<float>/10/2_mean 26.3 26.3 27055923 | |
BM_StridedLinearBufferCopyScatter<float>/10/3_mean 26.9 26.7 25942966 | |
BM_StridedLinearBufferCopyScatter<float>/10/4_mean 27.3 26.9 25579153 | |
BM_StridedLinearBufferCopyScatter<float>/10/5_mean 28.1 27.9 24336030 | |
BM_StridedLinearBufferCopyScatter<float>/10/6_mean 28.9 29.3 24055241 | |
BM_StridedLinearBufferCopyScatter<float>/10/7_mean 29.0 29.1 24201064 | |
BM_StridedLinearBufferCopyScatter<double>/10/0_mean 14.2 14.3 51049048 | |
BM_StridedLinearBufferCopyScatter<double>/10/1_mean 13.3 13.4 54687735 | |
BM_StridedLinearBufferCopyScatter<double>/10/2_mean 13.6 13.7 52260623 | |
BM_StridedLinearBufferCopyScatter<double>/10/3_mean 14.4 14.4 47349048 | |
BM_StridedLinearBufferCopyFillLinear<float>/10/0_mean 3.53 3.54 196561355 | |
BM_StridedLinearBufferCopyFillLinear<float>/10/1_mean 4.70 4.71 126585448 | |
BM_StridedLinearBufferCopyFillLinear<float>/10/2_mean 5.10 5.18 123645598 | |
BM_StridedLinearBufferCopyFillLinear<float>/10/3_mean 5.43 5.43 126781294 | |
BM_StridedLinearBufferCopyFillLinear<float>/10/4_mean 5.97 6.06 111594166 | |
BM_StridedLinearBufferCopyFillLinear<float>/10/5_mean 6.39 6.37 88540177 | |
BM_StridedLinearBufferCopyFillLinear<float>/10/6_mean 6.78 6.84 95933919 | |
BM_StridedLinearBufferCopyFillLinear<float>/10/7_mean 7.24 7.23 92916630 | |
BM_StridedLinearBufferCopyFillLinear<double>/10/0_mean 3.35 3.39 205481242 | |
BM_StridedLinearBufferCopyFillLinear<double>/10/1_mean 4.56 4.61 126582456 | |
BM_StridedLinearBufferCopyFillLinear<double>/10/2_mean 5.02 5.09 129027005 | |
BM_StridedLinearBufferCopyFillLinear<double>/10/3_mean 5.50 5.55 120377372 | |
BM_StridedLinearBufferCopyFillScatter<float>/10/0_mean 24.7 24.9 27841243 | |
BM_StridedLinearBufferCopyFillScatter<float>/10/1_mean 25.1 24.7 26702424 | |
BM_StridedLinearBufferCopyFillScatter<float>/10/2_mean 25.3 25.6 27833392 | |
BM_StridedLinearBufferCopyFillScatter<float>/10/3_mean 25.6 25.3 25918611 | |
BM_StridedLinearBufferCopyFillScatter<float>/10/4_mean 25.7 25.4 26326241 | |
BM_StridedLinearBufferCopyFillScatter<float>/10/5_mean 26.6 26.8 25666331 | |
BM_StridedLinearBufferCopyFillScatter<float>/10/6_mean 27.1 27.3 26045742 | |
BM_StridedLinearBufferCopyFillScatter<float>/10/7_mean 27.8 27.5 24600928 | |
BM_StridedLinearBufferCopyFillScatter<double>/10/0_mean 12.6 12.5 55896468 | |
BM_StridedLinearBufferCopyFillScatter<double>/10/1_mean 13.0 12.9 53489777 | |
BM_StridedLinearBufferCopyFillScatter<double>/10/2_mean 13.7 13.7 52233089 | |
BM_StridedLinearBufferCopyFillScatter<double>/10/3_mean 13.7 13.7 53468665 | |
BM_StridedLinearBufferCopyGather<float>/10/0_mean 19.4 19.6 36365543 | |
BM_StridedLinearBufferCopyGather<float>/10/1_mean 18.4 18.1 35718133 | |
BM_StridedLinearBufferCopyGather<float>/10/2_mean 18.6 18.4 37917347 | |
BM_StridedLinearBufferCopyGather<float>/10/3_mean 18.8 18.7 35731989 | |
BM_StridedLinearBufferCopyGather<float>/10/4_mean 19.2 19.0 37200322 | |
BM_StridedLinearBufferCopyGather<float>/10/5_mean 19.7 19.5 34028747 | |
BM_StridedLinearBufferCopyGather<float>/10/6_mean 20.3 20.2 33193722 | |
BM_StridedLinearBufferCopyGather<float>/10/7_mean 20.4 20.3 34032379 | |
BM_StridedLinearBufferCopyGather<double>/10/0_mean 12.0 11.8 52511524 | |
BM_StridedLinearBufferCopyGather<double>/10/1_mean 11.5 11.5 60765169 | |
BM_StridedLinearBufferCopyGather<double>/10/2_mean 11.8 11.8 58332651 | |
BM_StridedLinearBufferCopyGather<double>/10/3_mean 12.5 12.7 54688558 | |
``` | |
And the result of `pprof --list=StridedLinearBufferCopy /tmp/tensor_block_benchmark_test.prof`: | |
``` | |
. . 1030: template <typename StridedLinearBufferCopy::Kind kind> | |
. . 1031: static EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE void Run(const Dst& dst, | |
. . 1032: const Src& src, | |
. . 1033: const size_t count) { | |
. 49.51s 1034: Run<kind>(count, dst.offset, dst.stride, dst.data, src.offset, src.stride, | |
. . 1035: src.data); | |
. . 1036: } | |
. . 1037: | |
. . 1038: private: | |
. . 1039: template <typename StridedLinearBufferCopy::Kind kind> | |
. . 1040: static EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE void Run( | |
. . 1041: const IndexType count, const IndexType dst_offset, | |
. . 1042: const IndexType dst_stride, Scalar* EIGEN_RESTRICT dst_data, | |
. . 1043: const IndexType src_offset, const IndexType src_stride, | |
. . 1044: const Scalar* EIGEN_RESTRICT src_data) { | |
. . 1045: const Scalar* src = &src_data[src_offset]; | |
. . 1046: Scalar* dst = &dst_data[dst_offset]; | |
. . 1047: | |
. . 1048: if (!Vectorizable) { | |
. . 1049: for (Index i = 0; i < count; ++i) { | |
. . 1050: dst[i * dst_stride] = src[i * src_stride]; | |
. . 1051: } | |
. . 1052: return; | |
. . 1053: } | |
. . 1054: | |
. . 1055: const IndexType vectorized_size = count - PacketSize; | |
. . 1056: IndexType i = 0; | |
. . 1057: | |
. . 1058: if (kind == StridedLinearBufferCopy::Kind::Linear) { | |
. . 1059: // ******************************************************************** // | |
. . 1060: // Linear copy from `src` to `dst`. | |
. . 1061: const IndexType unrolled_size = count - 4 * PacketSize; | |
. . 1062: eigen_assert(src_stride == 1 && dst_stride == 1); | |
710ms 710ms 1063: for (; i <= unrolled_size; i += 4 * PacketSize) { | |
. . 1064: for (int j = 0; j < 4; ++j) { | |
. 2.06s 1065: Packet p = ploadu<Packet>(src + i + j * PacketSize); | |
. 590ms 1066: pstoreu<Scalar, Packet>(dst + i + j * PacketSize, p); | |
. . 1067: } | |
. . 1068: } | |
1.28s 1.28s 1069: for (; i <= vectorized_size; i += PacketSize) { | |
. 60ms 1070: Packet p = ploadu<Packet>(src + i); | |
. 420ms 1071: pstoreu<Scalar, Packet>(dst + i, p); | |
. . 1072: } | |
1.36s 1.36s 1073: for (; i < count; ++i) { | |
1.10s 1.10s 1074: dst[i] = src[i]; | |
. . 1075: } | |
. . 1076: // ******************************************************************** // | |
. . 1077: } else if (kind == StridedLinearBufferCopy::Kind::Scatter) { | |
. . 1078: // Scatter from `src` to `dst`. | |
. . 1079: eigen_assert(src_stride == 1 && dst_stride != 1); | |
1.04s 1.04s 1080: for (; i <= vectorized_size; i += PacketSize) { | |
. . 1081: Packet p = ploadu<Packet>(src + i); | |
. 9.37s 1082: pscatter<Scalar, Packet>(dst + i * dst_stride, p, dst_stride); | |
. . 1083: } | |
700ms 700ms 1084: for (; i < count; ++i) { | |
60ms 60ms 1085: dst[i * dst_stride] = src[i]; | |
. . 1086: } | |
. . 1087: // ******************************************************************** // | |
. . 1088: } else if (kind == StridedLinearBufferCopy::Kind::FillLinear) { | |
. . 1089: // Fill `dst` with value at `*src`. | |
. . 1090: eigen_assert(src_stride == 0 && dst_stride == 1); | |
. . 1091: const IndexType unrolled_size = count - 4 * PacketSize; | |
. 720ms 1092: Packet p = pload1<Packet>(src); | |
780ms 780ms 1093: for (; i <= unrolled_size; i += 4 * PacketSize) { | |
. . 1094: for (int j = 0; j < 4; ++j) { | |
. 2.34s 1095: pstoreu<Scalar, Packet>(dst + i + j * PacketSize, p); | |
. . 1096: } | |
. . 1097: } | |
750ms 750ms 1098: for (; i <= vectorized_size; i += PacketSize) { | |
. 430ms 1099: pstoreu<Scalar, Packet>(dst + i, p); | |
. . 1100: } | |
1.58s 1.58s 1101: for (; i < count; ++i) { | |
1.38s 1.38s 1102: dst[i] = *src; | |
. . 1103: } | |
. . 1104: // ******************************************************************** // | |
. . 1105: } else if (kind == StridedLinearBufferCopy::Kind::FillScatter) { | |
. . 1106: // Scatter `*src` into `dst`. | |
. . 1107: eigen_assert(src_stride == 0 && dst_stride != 1); | |
. 150ms 1108: Packet p = pload1<Packet>(src); | |
930ms 930ms 1109: for (; i <= vectorized_size; i += PacketSize) { | |
. 8.87s 1110: pscatter<Scalar, Packet>(dst + i * dst_stride, p, dst_stride); | |
. . 1111: } | |
590ms 590ms 1112: for (; i < count; ++i) { | |
450ms 450ms 1113: dst[i * dst_stride] = *src; | |
. . 1114: } | |
. . 1115: // ******************************************************************** // | |
. . 1116: } else if (kind == StridedLinearBufferCopy::Kind::Gather) { | |
. . 1117: // Gather from `src` into `dst`. | |
. . 1118: eigen_assert(dst_stride == 1); | |
690ms 690ms 1119: for (; i <= vectorized_size; i += PacketSize) { | |
. 5.24s 1120: Packet p = pgather<Scalar, Packet>(src + i * src_stride, src_stride); | |
. 3.11s 1121: pstoreu<Scalar, Packet>(dst + i, p); | |
. . 1122: } | |
1.15s 1.15s 1123: for (; i < count; ++i) { | |
10ms 10ms 1124: dst[i] = src[i * src_stride]; | |
. . 1125: } | |
. . 1126: // ******************************************************************** // | |
. . 1127: } else if (kind == StridedLinearBufferCopy::Kind::Random) { | |
. . 1128: // Random. | |
. . 1129: for (; i < count; ++i) { | |
``` | |
And these are the results with this MR: | |
``` | |
Run on gonnet.zrh (48 X 2594 MHz CPUs); 2023-04-20T01:48:05.813475259-07:00 | |
CPU: Intel Haswell with HyperThreading (24 cores) dL1:32KB dL2:256KB dL3:30MB | |
Benchmark Time(ns) CPU(ns) Iterations | |
-------------------------------------------------------------------------------------------------- | |
BM_StridedLinearBufferCopyLinear<float>/10/0_mean 4.70 4.80 123291440 | |
BM_StridedLinearBufferCopyLinear<float>/10/1_mean 6.28 6.33 107396720 | |
BM_StridedLinearBufferCopyLinear<float>/10/2_mean 7.24 7.27 87505633 | |
BM_StridedLinearBufferCopyLinear<float>/10/3_mean 7.33 7.27 87495105 | |
BM_StridedLinearBufferCopyLinear<float>/10/4_mean 6.16 6.15 107382784 | |
BM_StridedLinearBufferCopyLinear<float>/10/5_mean 5.60 5.62 121796808 | |
BM_StridedLinearBufferCopyLinear<float>/10/6_mean 6.39 6.36 93073315 | |
BM_StridedLinearBufferCopyLinear<float>/10/7_mean 6.76 6.72 92914680 | |
BM_StridedLinearBufferCopyLinear<double>/10/0_mean 4.73 4.80 129885590 | |
BM_StridedLinearBufferCopyLinear<double>/10/1_mean 6.51 6.56 102967312 | |
BM_StridedLinearBufferCopyLinear<double>/10/2_mean 6.12 6.12 109194717 | |
BM_StridedLinearBufferCopyLinear<double>/10/3_mean 5.74 5.71 119825492 | |
BM_StridedLinearBufferCopyScatter<float>/10/0_mean 25.0 25.2 27810621 | |
BM_StridedLinearBufferCopyScatter<float>/10/1_mean 25.1 25.4 28188447 | |
BM_StridedLinearBufferCopyScatter<float>/10/2_mean 25.3 25.1 27058978 | |
BM_StridedLinearBufferCopyScatter<float>/10/3_mean 25.6 25.5 27466771 | |
BM_StridedLinearBufferCopyScatter<float>/10/4_mean 25.9 26.0 27431745 | |
BM_StridedLinearBufferCopyScatter<float>/10/5_mean 26.7 26.7 26044859 | |
BM_StridedLinearBufferCopyScatter<float>/10/6_mean 27.2 27.1 25664379 | |
BM_StridedLinearBufferCopyScatter<float>/10/7_mean 27.4 27.5 25758283 | |
BM_StridedLinearBufferCopyScatter<double>/10/0_mean 12.8 12.8 53473872 | |
BM_StridedLinearBufferCopyScatter<double>/10/1_mean 13.0 13.0 57118077 | |
BM_StridedLinearBufferCopyScatter<double>/10/2_mean 13.1 13.2 55880641 | |
BM_StridedLinearBufferCopyScatter<double>/10/3_mean 13.9 13.9 52546924 | |
BM_StridedLinearBufferCopyFillLinear<float>/10/0_mean 3.89 3.90 172241583 | |
BM_StridedLinearBufferCopyFillLinear<float>/10/1_mean 4.77 4.77 120000000 | |
BM_StridedLinearBufferCopyFillLinear<float>/10/2_mean 5.10 5.10 130905186 | |
BM_StridedLinearBufferCopyFillLinear<float>/10/3_mean 5.39 5.40 123085108 | |
BM_StridedLinearBufferCopyFillLinear<float>/10/4_mean 3.95 3.97 174488581 | |
BM_StridedLinearBufferCopyFillLinear<float>/10/5_mean 6.53 6.53 94337606 | |
BM_StridedLinearBufferCopyFillLinear<float>/10/6_mean 5.85 5.84 114341069 | |
BM_StridedLinearBufferCopyFillLinear<float>/10/7_mean 6.18 6.20 90360074 | |
BM_StridedLinearBufferCopyFillLinear<double>/10/0_mean 3.69 3.69 187667735 | |
BM_StridedLinearBufferCopyFillLinear<double>/10/1_mean 4.75 4.75 144057221 | |
BM_StridedLinearBufferCopyFillLinear<double>/10/2_mean 3.94 3.96 175955939 | |
BM_StridedLinearBufferCopyFillLinear<double>/10/3_mean 6.10 6.09 103702513 | |
BM_StridedLinearBufferCopyFillScatter<float>/10/0_mean 24.8 24.9 28220207 | |
BM_StridedLinearBufferCopyFillScatter<float>/10/1_mean 25.1 25.0 27079599 | |
BM_StridedLinearBufferCopyFillScatter<float>/10/2_mean 25.2 25.2 27018279 | |
BM_StridedLinearBufferCopyFillScatter<float>/10/3_mean 25.9 26.1 27855702 | |
BM_StridedLinearBufferCopyFillScatter<float>/10/4_mean 26.1 25.8 26322048 | |
BM_StridedLinearBufferCopyFillScatter<float>/10/5_mean 26.3 26.2 27013995 | |
BM_StridedLinearBufferCopyFillScatter<float>/10/6_mean 26.7 27.0 27437384 | |
BM_StridedLinearBufferCopyFillScatter<float>/10/7_mean 27.0 26.9 26318163 | |
BM_StridedLinearBufferCopyFillScatter<double>/10/0_mean 12.4 12.4 54689179 | |
BM_StridedLinearBufferCopyFillScatter<double>/10/1_mean 14.7 14.7 49826154 | |
BM_StridedLinearBufferCopyFillScatter<double>/10/2_mean 14.7 14.7 47616820 | |
BM_StridedLinearBufferCopyFillScatter<double>/10/3_mean 13.5 13.4 54684977 | |
BM_StridedLinearBufferCopyGather<float>/10/0_mean 18.6 18.5 38647747 | |
BM_StridedLinearBufferCopyGather<float>/10/1_mean 18.8 18.8 37839340 | |
BM_StridedLinearBufferCopyGather<float>/10/2_mean 18.9 18.9 37054690 | |
BM_StridedLinearBufferCopyGather<float>/10/3_mean 19.6 19.1 35355901 | |
BM_StridedLinearBufferCopyGather<float>/10/4_mean 19.4 19.4 37187867 | |
BM_StridedLinearBufferCopyGather<float>/10/5_mean 19.7 19.8 35021213 | |
BM_StridedLinearBufferCopyGather<float>/10/6_mean 19.9 20.0 34520624 | |
BM_StridedLinearBufferCopyGather<float>/10/7_mean 20.2 20.5 35247986 | |
BM_StridedLinearBufferCopyGather<double>/10/0_mean 11.0 11.0 59792320 | |
BM_StridedLinearBufferCopyGather<double>/10/1_mean 12.0 12.1 58329290 | |
BM_StridedLinearBufferCopyGather<double>/10/2_mean 11.7 11.7 63199313 | |
BM_StridedLinearBufferCopyGather<double>/10/3_mean 13.0 13.2 55901534 | |
``` | |
And the result of `pprof --list=StridedLinearBufferCopy /tmp/tensor_block_benchmark_test.prof`: | |
``` | |
. . 1034: template <typename StridedLinearBufferCopy::Kind kind> | |
. . 1035: static EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE void Run(const Dst& dst, | |
. . 1036: const Src& src, | |
. . 1037: const size_t count) { | |
. 47.43s 1038: Run<kind>(count, dst.offset, dst.stride, dst.data, src.offset, src.stride, | |
. . 1039: src.data); | |
. . 1040: } | |
. . 1041: | |
. . 1042: private: | |
. . 1043: template <typename StridedLinearBufferCopy::Kind kind> | |
. . 1044: static EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE void Run( | |
. . 1045: const IndexType count, const IndexType dst_offset, | |
. . 1046: const IndexType dst_stride, Scalar* EIGEN_RESTRICT dst_data, | |
. . 1047: const IndexType src_offset, const IndexType src_stride, | |
. . 1048: const Scalar* EIGEN_RESTRICT src_data) { | |
. . 1049: const Scalar* src = &src_data[src_offset]; | |
. . 1050: Scalar* dst = &dst_data[dst_offset]; | |
. . 1051: | |
. . 1052: if (!Vectorizable) { | |
. . 1053: for (Index i = 0; i < count; ++i) { | |
. . 1054: dst[i * dst_stride] = src[i * src_stride]; | |
. . 1055: } | |
. . 1056: return; | |
. . 1057: } | |
. . 1058: | |
. . 1059: const IndexType vectorized_size = count - PacketSize; | |
. . 1060: IndexType i = 0; | |
. . 1061: | |
. . 1062: if (kind == StridedLinearBufferCopy::Kind::Linear) { | |
. . 1063: // ******************************************************************** // | |
. . 1064: // Linear copy from `src` to `dst`. | |
. . 1065: const IndexType unrolled_size = count - 4 * PacketSize; | |
. . 1066: eigen_assert(src_stride == 1 && dst_stride == 1); | |
570ms 570ms 1067: for (; i <= unrolled_size; i += 4 * PacketSize) { | |
. . 1068: for (int j = 0; j < 4; ++j) { | |
. 1.72s 1069: Packet p = ploadu<Packet>(src + i + j * PacketSize); | |
. 880ms 1070: pstoreu<Scalar, Packet>(dst + i + j * PacketSize, p); | |
. . 1071: } | |
. . 1072: } | |
1.21s 1.21s 1073: for (; i <= vectorized_size; i += PacketSize) { | |
. 90ms 1074: Packet p = ploadu<Packet>(src + i); | |
. 390ms 1075: pstoreu<Scalar, Packet>(dst + i, p); | |
. . 1076: } | |
. . 1077: if (HasHalfPacket) { | |
. . 1078: const IndexType vectorized_half_size = count - HalfPacketSize; | |
380ms 380ms 1079: for (; i <= vectorized_half_size; i += HalfPacketSize) { | |
. 90ms 1080: HalfPacket p = ploadu<HalfPacket>(src + i); | |
. 10ms 1081: pstoreu<Scalar, HalfPacket>(dst + i, p); | |
. . 1082: } | |
. . 1083: } | |
690ms 690ms 1084: for (; i < count; ++i) { | |
270ms 270ms 1085: dst[i] = src[i]; | |
. . 1086: } | |
. . 1087: // ******************************************************************** // | |
. . 1088: } else if (kind == StridedLinearBufferCopy::Kind::Scatter) { | |
. . 1089: // Scatter from `src` to `dst`. | |
. . 1090: eigen_assert(src_stride == 1 && dst_stride != 1); | |
950ms 950ms 1091: for (; i <= vectorized_size; i += PacketSize) { | |
. . 1092: Packet p = ploadu<Packet>(src + i); | |
. 9.12s 1093: pscatter<Scalar, Packet>(dst + i * dst_stride, p, dst_stride); | |
. . 1094: } | |
. . 1095: if (HasHalfPacket) { | |
. . 1096: const IndexType vectorized_half_size = count - HalfPacketSize; | |
300ms 300ms 1097: for (; i <= vectorized_half_size; i += HalfPacketSize) { | |
. . 1098: HalfPacket p = ploadu<HalfPacket>(src + i); | |
. 100ms 1099: pscatter<Scalar, HalfPacket>(dst + i * dst_stride, p, dst_stride); | |
. . 1100: } | |
. . 1101: } | |
570ms 570ms 1102: for (; i < count; ++i) { | |
60ms 60ms 1103: dst[i * dst_stride] = src[i]; | |
. . 1104: } | |
. . 1105: // ******************************************************************** // | |
. . 1106: } else if (kind == StridedLinearBufferCopy::Kind::FillLinear) { | |
. . 1107: // Fill `dst` with value at `*src`. | |
. . 1108: eigen_assert(src_stride == 0 && dst_stride == 1); | |
. . 1109: const IndexType unrolled_size = count - 4 * PacketSize; | |
. . 1110: Scalar s = *src; | |
. 770ms 1111: Packet p = pset1<Packet>(s); | |
820ms 820ms 1112: for (; i <= unrolled_size; i += 4 * PacketSize) { | |
. . 1113: for (int j = 0; j < 4; ++j) { | |
. 2.38s 1114: pstoreu<Scalar, Packet>(dst + i + j * PacketSize, p); | |
. . 1115: } | |
. . 1116: } | |
900ms 900ms 1117: for (; i <= vectorized_size; i += PacketSize) { | |
. 650ms 1118: pstoreu<Scalar, Packet>(dst + i, p); | |
. . 1119: } | |
. . 1120: if (HasHalfPacket) { | |
. . 1121: const IndexType vectorized_half_size = count - HalfPacketSize; | |
. 80ms 1122: HalfPacket hp = pset1<HalfPacket>(s); | |
680ms 680ms 1123: for (; i <= vectorized_half_size; i += HalfPacketSize) { | |
. 30ms 1124: pstoreu<Scalar, HalfPacket>(dst + i, hp); | |
. . 1125: } | |
. . 1126: } | |
630ms 630ms 1127: for (; i < count; ++i) { | |
470ms 470ms 1128: dst[i] = s; | |
. . 1129: } | |
. . 1130: // ******************************************************************** // | |
. . 1131: } else if (kind == StridedLinearBufferCopy::Kind::FillScatter) { | |
. . 1132: // Scatter `*src` into `dst`. | |
. . 1133: eigen_assert(src_stride == 0 && dst_stride != 1); | |
90ms 90ms 1134: Scalar s = *src; | |
. . 1135: Packet p = pset1<Packet>(s); | |
1.09s 1.09s 1136: for (; i <= vectorized_size; i += PacketSize) { | |
. 8.89s 1137: pscatter<Scalar, Packet>(dst + i * dst_stride, p, dst_stride); | |
. . 1138: } | |
. . 1139: if (HasHalfPacket) { | |
. . 1140: const IndexType vectorized_half_size = count - HalfPacketSize; | |
. . 1141: HalfPacket hp = pset1<HalfPacket>(s); | |
200ms 200ms 1142: for (; i <= vectorized_half_size; i += HalfPacketSize) { | |
. 170ms 1143: pscatter<Scalar, HalfPacket>(dst + i * dst_stride, hp, dst_stride); | |
. . 1144: } | |
. . 1145: } | |
480ms 480ms 1146: for (; i < count; ++i) { | |
60ms 60ms 1147: dst[i * dst_stride] = s; | |
. . 1148: } | |
. . 1149: // ******************************************************************** // | |
. . 1150: } else if (kind == StridedLinearBufferCopy::Kind::Gather) { | |
. . 1151: // Gather from `src` into `dst`. | |
. . 1152: eigen_assert(dst_stride == 1); | |
750ms 750ms 1153: for (; i <= vectorized_size; i += PacketSize) { | |
. 5.68s 1154: Packet p = pgather<Scalar, Packet>(src + i * src_stride, src_stride); | |
. 2.66s 1155: pstoreu<Scalar, Packet>(dst + i, p); | |
. . 1156: } | |
. . 1157: if (HasHalfPacket) { | |
. . 1158: const IndexType vectorized_half_size = count - HalfPacketSize; | |
150ms 150ms 1159: for (; i <= vectorized_half_size; i += HalfPacketSize) { | |
. . 1160: HalfPacket p = | |
. 90ms 1161: pgather<Scalar, HalfPacket>(src + i * src_stride, src_stride); | |
. 50ms 1162: pstoreu<Scalar, HalfPacket>(dst + i, p); | |
. . 1163: } | |
. . 1164: } | |
660ms 660ms 1165: for (; i < count; ++i) { | |
170ms 170ms 1166: dst[i] = src[i * src_stride]; | |
. . 1167: } | |
. . 1168: // ******************************************************************** // | |
. . 1169: } else if (kind == StridedLinearBufferCopy::Kind::Random) { | |
. . 1170: // Random. | |
. . 1171: for (; i < count; ++i) { | |
``` | |
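The main difference between the two listings is the `HasHalfPacket` branch that now sits between the full-packet loop and the scalar tail in each copy kind. The following is a minimal standalone sketch of that loop tiering, not Eigen code: the `CountTiers` helper and `TierCounts` struct are hypothetical, and the helper only counts how many elements each tier would handle instead of issuing SIMD loads and stores. The packet sizes used in the note afterwards are assumed for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper types (not part of Eigen): element counts per loop tier.
struct TierCounts {
  std::ptrdiff_t unrolled = 0;     // handled by the 4x-unrolled packet loop
  std::ptrdiff_t packet = 0;       // handled by the single-packet loop
  std::ptrdiff_t half_packet = 0;  // handled by the half-packet loop
  std::ptrdiff_t scalar = 0;       // handled by the scalar tail loop
};

// Mirrors the loop bounds shown in the listing. A half_packet_size of 0 is
// this sketch's way of saying that no half packet is available.
inline TierCounts CountTiers(std::ptrdiff_t count, std::ptrdiff_t packet_size,
                             std::ptrdiff_t half_packet_size) {
  assert(count >= 0 && packet_size > 0 && half_packet_size >= 0);
  TierCounts t;
  std::ptrdiff_t i = 0;

  // Signed arithmetic on purpose: the bounds go negative for small counts,
  // which makes the corresponding loop run zero times.
  const std::ptrdiff_t unrolled_size = count - 4 * packet_size;
  for (; i <= unrolled_size; i += 4 * packet_size) t.unrolled += 4 * packet_size;

  const std::ptrdiff_t vectorized_size = count - packet_size;
  for (; i <= vectorized_size; i += packet_size) t.packet += packet_size;

  if (half_packet_size > 0) {
    const std::ptrdiff_t vectorized_half_size = count - half_packet_size;
    for (; i <= vectorized_half_size; i += half_packet_size) {
      t.half_packet += half_packet_size;
    }
  }

  // Whatever is left is copied one element at a time.
  for (; i < count; ++i) ++t.scalar;
  return t;
}
```

With an assumed packet size of 8 and half-packet size of 4, `CountTiers(7, 8, 4)` puts 4 elements in the half-packet tier and 3 in the scalar tier, whereas `CountTiers(7, 8, 0)` leaves all 7 to the scalar loop.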
Note that the absolute times in the two `pprof` listings are not directly comparable: the benchmarking suite runs each test until a minimum running time has been reached, not for a fixed number of iterations.",Pedro Gonnet,2023-05-01T16:09:33.279Z,NA,NA
1317 (https://gitlab.com/libeigen/eigen/-/merge_requests/1317),Unroll F32 to BF16 loop - 1.8X faster conversions for LLVM. Use vector pairs for GCC.,Unroll F32 to BF16 loop - 1.8X faster conversions for LLVM. Use vector pairs for GCC. Other minor improvements.,Chip Kerchner,2023-05-01T16:54:17.404Z,NA,NA | |
1318 (https://gitlab.com/libeigen/eigen/-/merge_requests/1318),JacobiSVD: set m_nonzeroSingularValues to zero if not finite,"<!-- | |
Thanks for contributing a merge request! Please name and fully describe your MR as you would for a commit message. | |
If the MR fixes an issue, please include ""Fixes #issue"" in the commit message and the MR description. | |
In addition, we recommend that first-time contributors read our [contribution guidelines](https://eigen.tuxfamily.org/index.php?title=Contributing_to_Eigen) and [git page](https://eigen.tuxfamily.org/index.php?title=Git), which will help you submit a more standardized MR. | |
Before submitting the MR, you also need to complete the following checks: | |
- Make one PR per feature/bugfix (don't mix multiple changes into one PR). Avoid committing unrelated changes. | |
- Rebase before committing | |
- For code changes, run the test suite (at least the tests that are likely affected by the change). | |
See our [test guidelines](https://eigen.tuxfamily.org/index.php?title=Tests). | |
- If possible, add a test (both for bug-fixes as well as new features) | |
- Make sure new features are documented | |
Note that we are a team of volunteers; we appreciate your patience during the review process. | |
Again, thanks for contributing! --> | |
### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
Fixes #2650 | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
Sets `m_nonzeroSingularValues = 0` when invalid input is detected. This will cause `rank()` to return 0 and prevent crashes. | |
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2023-05-02T17:48:21.950Z,NA,NA | |
1319 (https://gitlab.com/libeigen/eigen/-/merge_requests/1319),Fix ColMajor BF16 GEMV for when vector is RowMajor,Fix ColMajor BF16 GEMV for when vector is RowMajor.,Chip Kerchner,2023-05-03T20:12:51.510Z,NA,NA | |
1321 (https://gitlab.com/libeigen/eigen/-/merge_requests/1321),clean up array_cwise test,"### Reference issue
### What does this implement/fix? | |
Suppresses annoying compiler warnings on MSVC when negating an unsigned integer, fixes some ambiguous operator precedence issues, and deletes a completely redundant shift test.
### Additional information",Charles Schlosser,2023-05-04T16:02:09.482Z,NA,NA
1322 (https://gitlab.com/libeigen/eigen/-/merge_requests/1322),Specialized loadColData correctly - fix previous BF16 GEMV MR,"Specialized loadColData correctly - fix previous BF16 GEMV MR. | |
LLVM didn't like SFINAE solution.",Chip Kerchner,2023-05-04T16:38:17.889Z,NA,NA | |
1323 (https://gitlab.com/libeigen/eigen/-/merge_requests/1323),Visitor: fix modulo by zero compiler warning,"### Reference issue
### What does this implement/fix?
### Additional information",Charles Schlosser,2023-05-04T18:21:10.893Z,NA,NA
1325 (https://gitlab.com/libeigen/eigen/-/merge_requests/1325),Change array_cwise test name,"### Reference issue
### What does this implement/fix? | |
Suppresses a few more compiler warnings and renames the test so that it does not conflict with the tensor array().
### Additional information",Charles Schlosser,2023-05-05T03:08:44.054Z,NA,NA
1289 (https://gitlab.com/libeigen/eigen/-/merge_requests/1289),Thread pool,"### What does this implement/fix? | |
Moves the thread pool code from Tensor into Core | |
### Additional information | |
This is a first step towards making the thread pool available for the whole of Eigen. It does not implement `.device()` for matrices.",Tobias Wood,2023-05-05T16:23:34.954Z,NA,NA | |
1324 (https://gitlab.com/libeigen/eigen/-/merge_requests/1324),Return NaN in ndtri for values outside valid input range.,This is consistent with scipy/matlab.,Antonio Sánchez,2023-05-05T16:27:27.572Z,NA,NA | |
1320 (https://gitlab.com/libeigen/eigen/-/merge_requests/1320),Use std::shared_ptr for FFTW/IMKL FFT plan implementation; Fixes #2651,"### What does this implement/fix? | |
This MR fixes possible undefined behavior caused by the copying of FFT plan objects happening with FFTW and IMKL FFT backends, please see #2651 for details. | |
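As a rough illustration of why shared ownership helps here, the sketch below wraps a plan-like handle in a `std::shared_ptr` with a custom deleter, so that copies of the wrapper share one handle and it is released exactly once. This is a minimal sketch, not the FFTW or MKL API and not the code in this MR: `RawPlan`, `CreateRawPlan`, `DestroyRawPlan`, `LivePlans` and `SharedPlan` are all hypothetical names, and the assumption that the hazard is a handle being released once per copy is mine; see #2651 for the actual failure.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for a C-style FFT plan handle.
struct RawPlan {
  int size;
};

// Counts plans that are currently alive, so the sketch can be checked.
inline int& LivePlans() {
  static int live = 0;
  return live;
}

inline RawPlan* CreateRawPlan(int size) {
  ++LivePlans();
  return new RawPlan{size};
}

inline void DestroyRawPlan(RawPlan* plan) {
  --LivePlans();
  delete plan;
}

// Copies of SharedPlan share one underlying RawPlan. The custom deleter runs
// once, when the last copy is destroyed, instead of once per copy.
class SharedPlan {
 public:
  SharedPlan() = default;
  explicit SharedPlan(int size)
      : plan_(CreateRawPlan(size), &DestroyRawPlan) {}

  bool valid() const { return plan_ != nullptr; }

  int size() const {
    assert(valid());
    return plan_->size;
  }

  long use_count() const { return plan_.use_count(); }

 private:
  std::shared_ptr<RawPlan> plan_;
};
```

Copying a `SharedPlan` only bumps the reference count, so holding several copies in a container or cache does not create or destroy additional plans.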
### Additional information | |
I've verified that the fix works using the existing test suite. Only for the MKL backend did I have to enable the `mklfft.cpp` test, using the following snippet:
```cmake | |
set(MKL_THREADING gnu_thread) | |
set(MKL_INTERFACE lp64) | |
find_package(MKL) | |
if(MKL_FOUND) | |
ei_add_test( mklfft ""-DEIGEN_MKLFFT_DEFAULT"" ""MKL::MKL"" ) | |
endif() | |
``` | |
I'm not merging this snippet though, as this is an ad-hoc solution which doesn't take into account all the possible complexities of integrating MKL into the build system.",Andrzej Ciarkowski,2023-05-05T16:58:24.600Z,NA,NA | |
1285 (https://gitlab.com/libeigen/eigen/-/merge_requests/1285),[SYCL-2020] Enabling USM support for SYCL. SYCL-1.2.1 did not have support for USM.,"[SYCL-2020] Enabling USM support for SYCL. SYCL-1.2.1 did not have support for USM. As a result, the Eigen SYCL backend had to mimic USM-style pointers via device pointer simulation. The virtual pointer was an 8-byte host pointer used as a key to access the SYCL device buffer. This was needed because the device buffer could not be used as a pointer to represent `m_data` in Eigen leaf node expressions. Hence, the virtual pointer (the key in the buffer map) was used in Eigen expression construction, which required looking up the memory in the buffer map to construct the Evaluator class. Using SYCL-2020 USM pointers allows us to remove all the unnecessary steps added to mimic pointers for device code in the SYCL backend.",Mehdi Goli,2023-05-05T17:30:37.003Z,NA,NA
1327 (https://gitlab.com/libeigen/eigen/-/merge_requests/1327),Fix cuda compilation,"### Reference issue
Fixes #2652 | |
### What does this implement/fix? | |
The issue doesn't specifically pertain to compiling with CUDA so much as `EIGEN_AVOID_STL_ARRAY`. This is fixed by moving the new `Meta.h` additions and `EmulateArray.h` after the Core Eigen stuff has been included. | |
I also had some issues compiling without `#include <vector>` so I added that to `Core` as well. I think that plugs all the holes. | |
### Additional information",Charles Schlosser,2023-05-08T16:15:48.290Z,NA,NA
1329 (https://gitlab.com/libeigen/eigen/-/merge_requests/1329),Make it possible to override the synchronization primitives used by the threadpool using macros.,"Many production or embedded systems use custom synchronization primitives for performance, instrumentation, etc.
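One common way to make such primitives overridable is a default macro guarded by `#ifndef`, so that a build can supply its own type before the header is included. The sketch below illustrates the idea with hypothetical `DEMO_MUTEX` and `DEMO_MUTEX_LOCK` macros and a hypothetical `Counter` class. These names are assumptions for illustration only, since the description does not name the macros that were added.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical macro names. A build can define them before this point to
// substitute its own mutex and lock types; otherwise the standard ones apply.
#ifndef DEMO_MUTEX
#define DEMO_MUTEX std::mutex
#endif

#ifndef DEMO_MUTEX_LOCK
#define DEMO_MUTEX_LOCK std::unique_lock<std::mutex>
#endif

// Library-side code refers only to the macros, never to std::mutex directly.
class Counter {
 public:
  void Increment() {
    DEMO_MUTEX_LOCK lock(mu_);
    ++value_;
  }

  int Get() {
    DEMO_MUTEX_LOCK lock(mu_);
    return value_;
  }

 private:
  DEMO_MUTEX mu_;
  int value_ = 0;
};
```

A build that defines `DEMO_MUTEX` and a matching `DEMO_MUTEX_LOCK` before this code is compiled would swap in its own types without any change to `Counter`.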
Add macros to make it possible to override the defaults used by the Eigen ThreadPool.",Rasmus Munk Larsen,2023-05-09T19:36:17.940Z,NA,NA | |
1333 (https://gitlab.com/libeigen/eigen/-/merge_requests/1333),SVD: fix numerous compiler warnings / failures,"### Reference issue
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
Eigen's JacobiSVD and BDCSVD share allocation routines. In some use cases, small fixed size matrix members are not initialized, generating a heap of compiler warnings and occasionally compilation failures in the CI smoketests. | |
Example warning on GCC 10 with AVX2: | |
``` | |
[1959/2008] Building CXX object test/CMakeFiles/bdcsvd_23.dir/bdcsvd.cpp.o | |
In file included from ../Eigen/Core:313, | |
from ../Eigen/QR:11, | |
from ../test/main.h:347, | |
from ../test/bdcsvd.cpp:26: | |
../Eigen/src/Core/PlainObjectBase.h: In member function 'void Eigen::BDCSVD<MatrixType, Options>::allocate(Eigen::BDCSVD<MatrixType, Options>::Index, Eigen::BDCSVD<MatrixType, Options>::Index, unsigned int) [with MatrixType_ = Eigen::Matrix<float, 3, 3>; int Options_ = 0]': | |
../Eigen/src/Core/PlainObjectBase.h:487:7: warning: '<anonymous>' may be used uninitialized in this function [-Wmaybe-uninitialized] | |
487 | m_storage = std::move(other.m_storage); | |
| ^~~~~~~~~ | |
``` | |
And occasionally: | |
``` | |
[490/2008] Building CXX object test/CMakeFiles/bdcsvd_25.dir/bdcsvd.cpp.o | |
FAILED: test/CMakeFiles/bdcsvd_25.dir/bdcsvd.cpp.o | |
/usr/bin/x86_64-linux-gnu-g++-10 -DEIGEN_TEST_MAX_SIZE=320 -DEIGEN_TEST_PART_25=1 -I../ -pedantic -Wall -Wextra -Wundef -Wcast-align -Wchar-subscripts -Wnon-virtual-dtor -Wunused-local-typedefs -Wpointer-arith -Wwrite-strings -Wformat-security -Wlogical-op -Wdouble-promotion -Wshadow -Wno-psabi -Wno-variadic-macros -Wno-long-long -fno-check-new -fno-common -fstrict-aliasing -mavx2 -mfma -O3 -DNDEBUG -std=c++14 -MD -MT test/CMakeFiles/bdcsvd_25.dir/bdcsvd.cpp.o -MF test/CMakeFiles/bdcsvd_25.dir/bdcsvd.cpp.o.d -o test/CMakeFiles/bdcsvd_25.dir/bdcsvd.cpp.o -c ../test/bdcsvd.cpp | |
x86_64-linux-gnu-g++-10: fatal error: Killed signal terminated program cc1plus | |
compilation terminated. | |
``` | |
These issues appear to be eliminated with this fix. | |
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2023-05-15T16:56:47.919Z,NA,NA | |
1314 (https://gitlab.com/libeigen/eigen/-/merge_requests/1314),Geometry/EulerAngles: introduce canonicalEulerAngles,"This MR is a continuation of the reverted !1301. It implements the same correction of the returned solutions from Eigen's non-standard angle ranges to standard canonical ranges, but as a new `canonicalEulerAngles` method of `MatrixBase` instead of a change to the behaviour of `eulerAngles`, which is now marked as deprecated. | |
~~The implementation has been factored out to a protected method `MatrixBase::eulerAnglesImpl`, which is utilized both by the deprecated `eulerAngles`, and by the new `canonicalEulerAngles` which makes sure that the solution is in the respective canonical Tait-Bryan/proper Euler angle ranges.~~ | |
`canonicalEulerAngles` has been implemented by taking and tweaking the code for `eulerAngles`, and using the same battle-tested expressions for calculating angles. The code for tweaking the first angle to be in Eigen's non-standard [0, pi] range has been removed. We instead take care when passing arguments to `atan2` to ensure that the results are in the correct quadrants for the solution to be canonical. The level of documentation has also been significantly increased. The method of indexing archetype matrices from Graphics Gems IV should be apparent from the comments, and the formulas used should be easier to understand and verify.",Juraj Oršulić,2023-05-19T15:42:23.614Z,NA,NA | |
1334 (https://gitlab.com/libeigen/eigen/-/merge_requests/1334),Fix unrolled assignment evaluator,"### Reference issue | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
Small fixed size arrays and matrices utilize a specialized unrolled evaluation loop for assignment ops, e.g. `A = B.cwiseAbs2()`. Currently, the unrolled loop will always use the 2d interface `coeff(row,col)` and `packet(row,col)` instead of the linear interface `coeff(index)` and `packet(index)`, even if linear access is requested. This is probably not an issue for most cases but could lead to unpredictable data access patterns. At the very least, it was creating confusing debugging scenarios for this contributor. | |
This patch adds an analogous unrolling strategy that uses the linear access `coeff` and `packet` functions. Changed a few template parameter names that were identical to a local typedef. | |
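The difference between the two access patterns can be sketched outside of Eigen. The following is a minimal illustration under stated assumptions, not Eigen's evaluator: `FixedMat`, `abs2`, `assign_abs2_2d`, and `assign_abs2_linear` are hypothetical names, and the unrolling uses a C++14 `std::index_sequence` pack expansion rather than Eigen's own template machinery.

```cpp
#include <array>
#include <cstddef>
#include <utility>

// Hypothetical stand-in for a small fixed-size, column-major matrix.
template <std::size_t Rows, std::size_t Cols>
struct FixedMat {
  std::array<double, Rows * Cols> data{};
  // 2d interface: coeff(row, col)
  double& coeff(std::size_t row, std::size_t col) { return data[col * Rows + row]; }
  double coeff(std::size_t row, std::size_t col) const { return data[col * Rows + row]; }
  // Linear interface: coeff(index)
  double& coeff(std::size_t index) { return data[index]; }
  double coeff(std::size_t index) const { return data[index]; }
};

inline double abs2(double x) { return x * x; }

// Unrolled assignment through the 2d interface: every flat counter I is
// converted to a (row, col) pair at compile time.
template <std::size_t Rows, std::size_t Cols, std::size_t... I>
void assign_abs2_2d_impl(FixedMat<Rows, Cols>& dst, const FixedMat<Rows, Cols>& src,
                         std::index_sequence<I...>) {
  int unused[] = {0, ((dst.coeff(I % Rows, I / Rows) = abs2(src.coeff(I % Rows, I / Rows))), 0)...};
  (void)unused;
}

// Unrolled assignment through the linear interface: I is used directly.
template <std::size_t Rows, std::size_t Cols, std::size_t... I>
void assign_abs2_linear_impl(FixedMat<Rows, Cols>& dst, const FixedMat<Rows, Cols>& src,
                             std::index_sequence<I...>) {
  int unused[] = {0, ((dst.coeff(I) = abs2(src.coeff(I))), 0)...};
  (void)unused;
}

template <std::size_t Rows, std::size_t Cols>
void assign_abs2_2d(FixedMat<Rows, Cols>& dst, const FixedMat<Rows, Cols>& src) {
  assign_abs2_2d_impl(dst, src, std::make_index_sequence<Rows * Cols>{});
}

template <std::size_t Rows, std::size_t Cols>
void assign_abs2_linear(FixedMat<Rows, Cols>& dst, const FixedMat<Rows, Cols>& src) {
  assign_abs2_linear_impl(dst, src, std::make_index_sequence<Rows * Cols>{});
}
```

Both functions produce the same result for plain contiguous storage. The point of the sketch is only that the linear variant never computes a (row, col) pair.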
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2023-05-22T16:39:25.702Z,NA,NA | |
1335 (https://gitlab.com/libeigen/eigen/-/merge_requests/1335),Sparse matrix column/row removal,"### Reference issue | |
Fixes #2659 | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
Adds two functions `SparseMatrix::removeOuterVectors(start, num)` and `SparseMatrix::insertEmptyOuterVectors(start, num)` to remove/add `num` contiguous outer vectors starting at `start`. If the first vector is not empty and it is removed, then `m_outerIndex[0] != 0`, which in theory shouldn't be an issue. However, I am not confident that every sparse matrix function can handle `m_outerIndex[0] != 0`. For this reason, and because it's relatively inexpensive to address, I explicitly handle this case and shift the data so that `m_outerIndex[0] == 0`. | |
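To make the index bookkeeping concrete, here is a minimal sketch on a simplified compressed-column structure. It is not Eigen's implementation: `CompressedCols` and `remove_outer_vectors` are hypothetical names, and the sketch simply erases the stored entries of the removed columns, which also keeps the first outer index at zero.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified compressed column storage (not Eigen's SparseMatrix).
struct CompressedCols {
  std::vector<std::size_t> outerIndex;  // size = number of columns + 1
  std::vector<int> innerIndex;          // row index of each stored entry
  std::vector<double> values;           // value of each stored entry
};

// Remove num contiguous columns starting at start. The entries of the removed
// columns are erased, so outerIndex[0] stays 0 even when column 0 is removed.
void remove_outer_vectors(CompressedCols& m, std::size_t start, std::size_t num) {
  assert(start + num < m.outerIndex.size());
  using Diff = std::ptrdiff_t;
  const Diff first = static_cast<Diff>(m.outerIndex[start]);
  const Diff last = static_cast<Diff>(m.outerIndex[start + num]);
  // Drop the stored entries that belong to the removed columns.
  m.innerIndex.erase(m.innerIndex.begin() + first, m.innerIndex.begin() + last);
  m.values.erase(m.values.begin() + first, m.values.begin() + last);
  // Drop the outer pointers of the removed columns.
  const Diff s = static_cast<Diff>(start);
  m.outerIndex.erase(m.outerIndex.begin() + s, m.outerIndex.begin() + s + static_cast<Diff>(num));
  // Shift the remaining outer pointers down by the number of erased entries.
  const std::size_t removed = static_cast<std::size_t>(last - first);
  for (std::size_t j = start; j < m.outerIndex.size(); ++j) m.outerIndex[j] -= removed;
}
```

For a three-column matrix with outer indices `{0, 2, 3, 4}`, removing column 0 leaves `{0, 1, 2}`, so the first outer index is still zero.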
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2023-05-24T17:04:46.848Z,NA,NA | |
1336 (https://gitlab.com/libeigen/eigen/-/merge_requests/1336),Add linear redux evaluators,"### Reference issue | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
Linear access `coeff(index)` allows an expression to be evaluated as if it were a 1d array. This conveys many benefits, including simplified traversal, simple alignment logic, etc. At worst, it has the same performance as 2d access `coeff(row,col)`, but is usually faster. The redux evaluators do not implement linear access for the unrolled scalar, unrolled vectorized (currently pseudo linear access), and ""rolled"" scalar traversals. This patch adds those traversals. | |
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2023-05-24T17:07:26.104Z,NA,NA | |
1337 (https://gitlab.com/libeigen/eigen/-/merge_requests/1337),Clean up Redux.h and fix vectorization_logic test after changes to traversal order in Redux.,Fixes a few issues triggered by https://gitlab.com/libeigen/eigen/-/merge_requests/1336,Rasmus Munk Larsen,2023-05-24T20:26:53.557Z,NA,NA | |
1341 (https://gitlab.com/libeigen/eigen/-/merge_requests/1341),Replace usage of CudaStreamDevice with GpuStreamDevice in tensor benchmarks GPU,This MR addresses an issue in the tensor gpu benchmarks where the code was incorrectly utilizing CudaStreamDevice instead of GpuStreamDevice.,Alejandro Acosta,2023-05-30T15:44:32.247Z,NA,NA | |
1339 (https://gitlab.com/libeigen/eigen/-/merge_requests/1339),Do not set EIGEN_HAS_ARM64_FP16_SCALAR_ARITHMETIC for cuda compilation,"The previous version assumed that ARM and CUDA would never mix. | |
For code that is shared between host and device, this leads to miscompilation | |
(reference to __host__ function '__builtin_neon_vabsh_f16' in __host__ __device__ function).",Alexander Shaposhnikov,2023-05-31T15:15:07.107Z,NA,NA | |
1338 (https://gitlab.com/libeigen/eigen/-/merge_requests/1338),Optimize scalar_unary_pow_op error handling,"### Reference issue | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
I only recently realized that the generic `pnot` is not vectorization-friendly and is not specialized on most platforms. This MR addresses that and improves the error handling with lessons learned from other contributions. | |
- No error handling is required for floating base, integer exponent pow! | |
- Fixed an issue when the base type is an unsigned integer and the exponent is a negative integer. | |
- For integer base, integer exponent operations, the full loop is now performed even if the exponent exceeds the number of digits of the scalar type. Previously, this was a shortcut as overflow is guaranteed unless the base is 0 or 1. However, this doesn't work with unsigned base types as they do not overflow. The number of operations in repeated squaring is logarithmic with respect to the value of the exponent, so the execution time isn't too bad even if an absurdly large exponent is used. This makes the int/int error handling routines simpler as they only handle negative exponents. | |
Difference in assembly as generated by x86 Clang 12 with AVX2: | |
- `handle_nonint_int_errors`: eliminated | |
- `handle_nonint_nonint_errors`: 40 fewer lines (branchless) | |
- `handle_int_int` (signed): 21 fewer lines | |
- `handle_int_int` (unsigned): 19 fewer lines | |
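The remark about repeated squaring can be illustrated with a short standalone sketch. This is not Eigen's code: `int_pow_sketch` is a hypothetical name, and the example is restricted to a 32-bit unsigned base, where multiplication wraps modulo 2^32 and the full loop is therefore well-defined for any exponent.

```cpp
#include <cstdint>

// Hypothetical helper: integer power by repeated squaring.
// The loop runs once per bit of the exponent, so at most 32 iterations here.
std::uint32_t int_pow_sketch(std::uint32_t base, std::uint32_t exponent) {
  std::uint32_t result = 1;
  while (exponent != 0) {
    if (exponent & 1u) result *= base;  // fold in the current exponent bit
    base *= base;                       // square the base for the next bit
    exponent >>= 1;
  }
  return result;
}
```

Because the iteration count depends only on the bit width of the exponent, running the full loop stays cheap even for very large exponents.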
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2023-06-02T18:53:06.893Z,NA,NA | |
1342 (https://gitlab.com/libeigen/eigen/-/merge_requests/1342),Reduce max relative error of prsqrt from 3 to 2 ulps.,"Use a different formulation of the Newton-Raphson step for rsqrt. This was measured exhaustively for all floats using AVX in comparison with the exact value computed by MPFR. | |
Thanks to Solomon Boulos for the suggestion.",Rasmus Munk Larsen,2023-06-04T22:25:34.528Z,NA,NA | |
1343 (https://gitlab.com/libeigen/eigen/-/merge_requests/1343),Fix unary pow error handling and test,"### Reference issue | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
Previous MR was developed using a faulty unit test. This patch fixes the test and the error handling routine for pow(float,float). | |
I applied this better testing methodology to pow(float,int) and discovered other mis-handled edge cases: | |
- if the exponent is signed and equal to its lowest value, i.e. `NumTraits<int>::lowest()`, then std/numext::abs will return a negative number and cause all sorts of mayhem. Introduced `safe_abs` to cast a signed integer to its unsigned counterpart and apply the absolute value. This is a lot of work to address one edge case per signed integer type, but the actual computational work is still fairly minimal and only needs to be performed once. | |
- `int_pow` may not correctly handle underflow if the exponent is negative. Fixed this by moving the reciprocation to the start of the routine. Originally, it was at the end, as that was found to be marginally more accurate. | |
- if the exponent is astoundingly gigantic such that it cannot be stored as a double, then std::pow may return incorrect values. This is addressed in the testing. Likewise with an underflow issue specific to MSVC. | |
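The first two points can be sketched in a few lines. This is an illustration under stated assumptions rather than the code from this MR: `safe_abs_sketch` and `pow_int_sketch` are hypothetical names, and the absolute value relies on the well-defined modular conversion from a signed integer to its unsigned counterpart.

```cpp
#include <limits>
#include <type_traits>

// Hypothetical overflow-free absolute value: convert to the unsigned
// counterpart first, then negate in unsigned (modular) arithmetic.
template <typename T>
typename std::make_unsigned<T>::type safe_abs_sketch(T x) {
  using U = typename std::make_unsigned<T>::type;
  const U u = static_cast<U>(x);  // signed-to-unsigned conversion is modular
  return x < 0 ? static_cast<U>(U(0) - u) : u;
}

// Hypothetical pow(double, int) by repeated squaring. The base is reciprocated
// up front for negative exponents, so the loop underflows toward zero instead
// of overflowing before a final division.
double pow_int_sketch(double base, int exponent) {
  unsigned int n = safe_abs_sketch(exponent);
  if (exponent < 0) base = 1.0 / base;
  double result = 1.0;
  while (n != 0) {
    if (n & 1u) result *= base;
    base *= base;
    n >>= 1;
  }
  return result;
}
```

With this shape, the lowest value of `int` maps to a valid unsigned magnitude, and a call such as `pow_int_sketch(2.0, std::numeric_limits<int>::min())` underflows to zero.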
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2023-06-06T18:46:55.727Z,NA,NA | |
1344 (https://gitlab.com/libeigen/eigen/-/merge_requests/1344),Avoid underflow in prsqrt.,NA,Rasmus Munk Larsen,2023-06-06T21:22:42.552Z,NA,NA | |
1328 (https://gitlab.com/libeigen/eigen/-/merge_requests/1328),Partially Vectorize Cast,"### Reference issue | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
Specialize the evaluator of `scalar_cast_op` to handle different input and output packet types, multiple input packets per output packet, etc. Works as long as `type_casting_traits` is correctly defined and `pcast` returns a single packet. | |
Assuming `type_casting_traits::VectorizedCast` is `1` and the `pcast` is defined, there are 3 scenarios: | |
1. The destination packet is the same size as or larger than the default source packet: fully vectorized | |
2. The destination packet is smaller than the default source packet: a suitable half (or quarter) packet is selected to satisfy 1) | |
3. The destination packet is smaller than the default source packet and no suitable half packet is available. A run-time check verifies if the packet load would not result in an out-of-bounds data access. Otherwise, the packet op is synthesized from scalar operations. | |
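The run-time check in scenario 3 can be sketched roughly as follows. This is a scalar emulation with assumed packet sizes, not Eigen's evaluator: `cast_packet_sketch`, `kSrcPacketSize`, and `kDstPacketSize` are hypothetical, and a real implementation would use SIMD loads and `pcast` on the in-bounds path.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>

constexpr std::size_t kSrcPacketSize = 8;  // assumed: floats per source packet
constexpr std::size_t kDstPacketSize = 4;  // assumed: doubles per destination packet

// Produces kDstPacketSize converted values starting at index.
// Precondition: index + kDstPacketSize <= size.
std::array<double, kDstPacketSize> cast_packet_sketch(const float* src, std::size_t size,
                                                      std::size_t index) {
  std::array<double, kDstPacketSize> dst{};
  if (index + kSrcPacketSize <= size) {
    // The full source packet is in bounds: load it whole, then convert the
    // leading lanes that the destination packet needs.
    std::array<float, kSrcPacketSize> packet{};
    std::copy(src + index, src + index + kSrcPacketSize, packet.begin());
    for (std::size_t i = 0; i < kDstPacketSize; ++i) dst[i] = static_cast<double>(packet[i]);
  } else {
    // A full packet load would read past the end of the buffer: synthesize
    // the packet from scalar casts instead.
    for (std::size_t i = 0; i < kDstPacketSize; ++i) dst[i] = static_cast<double>(src[index + i]);
  }
  return dst;
}
```

Both paths return the same values. The check only decides whether the wider load is safe near the end of the buffer.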
Benchmarks: | |
For a pure cast, `dst = src.cast<DstType>()` I saw very little difference in performance from the scalar path, which is understandable as there is very little work being done to justify the overhead of the loads and stores. However, I also saw no decrease in performance, which is good. | |
The story is different for a more complex expression like `dst = src.abs2().sqrt().log().cast<DstType>();` | |
AVX (double->float): -55% | |
From this example, we see that the meat and potatoes of the expression -- the arithmetic operations -- vastly outweigh the cost of the cast. | |
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2023-06-09T16:54:32.486Z,NA,NA | |
1347 (https://gitlab.com/libeigen/eigen/-/merge_requests/1347),Compile- and run-time assertions for the construction of Ref<const>.,"### Reference issue | |
libeigen/eigen#2667 | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
Ref<const...> now asserts the success of its construction - which fails at run-time if it has copied the given expression into its member variable m_object, but the StrideType (possibly in combination with the expression's shape) does not support mapping m_object's contiguous memory layout. Whenever this is foreseeable, an assertion is triggered already at compile-time in this case, resulting in a compile error. | |
### Additional information | |
<!--Any additional information you think is important.-->",wilfried.karel,2023-06-14T15:49:59.666Z,NA,NA | |
1346 (https://gitlab.com/libeigen/eigen/-/merge_requests/1346),define a move constructor for Ref<const...>,"### Reference issue | |
libeigen/eigen#2668 | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
Ref<const...> now defines a move constructor. If the storage of Ref<const...>'s member variable m_object is dynamic, then this avoids copying it. In any case, the moved-from Ref<const...> not only remains in a valid state, but it continues to map the same data - which follows the unusual, existing definition of Ref<const...>'s copy constructor. | |
### Additional information | |
<!--Any additional information you think is important.-->",wilfried.karel,2023-06-14T20:10:52.443Z,NA,NA | |
1349 (https://gitlab.com/libeigen/eigen/-/merge_requests/1349),Fix AVX pstore,"### Reference issue | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
Eigen's abstract `pstore` function for calling AVX aligned store intrinsics was incorrectly calling the unaligned store for integer types. | |
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2023-06-15T01:47:38.804Z,NA,NA | |
1351 (https://gitlab.com/libeigen/eigen/-/merge_requests/1351),Fix svd test,"### Reference issue | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
The CI often fails due to excessive resource consumption during the svd tests. This removes the testing of deprecated behavior. Not sure if this will solve all the problems, but it is a start. | |
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2023-06-22T17:37:25.094Z,NA,NA | |
1350 (https://gitlab.com/libeigen/eigen/-/merge_requests/1350),Fix safe_abs in int_pow,"### Reference issue | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
Previous ""safe"" absolute value function did not work on clang as static_cast is undefined if the input is outside the range of the result type. | |
https://godbolt.org/z/r9zrPjhz8 | |
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2023-06-23T04:12:42.011Z,NA,NA | |
1353 (https://gitlab.com/libeigen/eigen/-/merge_requests/1353),delete deprecated function call in svd test,"### Reference issue | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2023-06-23T14:17:27.925Z,NA,NA | |
1352 (https://gitlab.com/libeigen/eigen/-/merge_requests/1352),rint round floor ceil,"### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
As suggested by https://gitlab.com/pkgoogle in !1340 | |
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2023-06-23T16:29:17.534Z,NA,NA | |
1354 (https://gitlab.com/libeigen/eigen/-/merge_requests/1354),Add optional offset parameter to ploadu_partial and pstoreu_partial,Add optional offset parameter to ploadu_partial and pstoreu_partial - to be consistent with aligned versions.,Chip Kerchner,2023-06-23T19:53:06.139Z,NA,NA | |
1355 (https://gitlab.com/libeigen/eigen/-/merge_requests/1355),Disable FP16 arithmetic for arm32.,"Clang currently only supports fp16 for aarch64 for _all_ fp16 functions, | |
even if they are listed as supporting A32 in Arm's developer guide. | |
But even in the guide, some intrinsics are only available for A64, | |
such as [comparisons](https://developer.arm.com/architectures/instruction-sets/intrinsics/#q=vceqh_f16), [div](https://developer.arm.com/architectures/instruction-sets/intrinsics/#q=vdivq_f16), [sqrt](https://developer.arm.com/architectures/instruction-sets/intrinsics/#q=vsqrt_f16). Therefore, we simply disable fp16 on arm32 for now.",Antonio Sánchez,2023-06-26T18:39:43.688Z,NA,NA | |
1356 (https://gitlab.com/libeigen/eigen/-/merge_requests/1356),Ensure EIGEN_HAS_ARM64_FP16_VECTOR_ARITHMETIC is always defined on arm.,Eliminates a compilation warning on clang.,Antonio Sánchez,2023-06-26T19:21:55.328Z,NA,NA | |
1345 (https://gitlab.com/libeigen/eigen/-/merge_requests/1345),Add Quaternion constructor from real scalar and imaginary vector,"Adds a constructor for Quaternions from a real part as a scalar and an imaginary part as a 3-vector. This new ctor makes it much easier to write common expressions like the angular velocity formula | |
```math | |
\dot{\mathbf{q}} = \frac{1}{2} \mathbf{q} \otimes \begin{bmatrix} \boldsymbol{\omega} \\ 0 \end{bmatrix} | |
``` | |
perturbation quaternions | |
```math | |
\delta\mathbf{q} = \begin{bmatrix}\frac{1}{2}\delta\boldsymbol{\phi} \\ 1 \end{bmatrix} | |
``` | |
and even the basic quaternion action on point formula (which is implemented as the test case that covers this new ctor) | |
```math | |
\begin{bmatrix}\mathbf{v}_{new} \\ 0\end{bmatrix} = \mathbf{q} \otimes \begin{bmatrix}\mathbf{v} \\ 0 \end{bmatrix} \otimes \mathbf{q}^* | |
``` | |
whereas when current users need to construct the scalar-vector quaternions in the expressions above, they must either write `Eigen::Quaternion(re, im.x(), im.y(), im.z())`, or default construct a mutable quaternion and set it using `w()` and `vec()`.",H S Helson Go,2023-06-27T05:52:41.488Z,NA,NA | |
1357 (https://gitlab.com/libeigen/eigen/-/merge_requests/1357),Fix supportsMMA to obey EIGEN_ALTIVEC_MMA_DYNAMIC_DISPATCH compilation flag and compiler support.,Fix supportsMMA to obey EIGEN_ALTIVEC_MMA_DYNAMIC_DISPATCH compilation flag and compiler support.,Chip Kerchner,2023-06-28T17:57:22.427Z,NA,NA | |
1360 (https://gitlab.com/libeigen/eigen/-/merge_requests/1360),Fix ivcSize return type in IndexedViewMethods.h,"### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
Fixes #1881 -- only took 3 years! | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2023-07-03T03:49:37.630Z,NA,NA | |
1362 (https://gitlab.com/libeigen/eigen/-/merge_requests/1362),Fix argument for _mm256_cvtps_ph imm parameter,"### What does this implement/fix? | |
<!--Please explain your changes.--> | |
During compilation, MSVC generates a C4556 warning due to ""_MM_FROUND_NO_EXC"" being OR'ed as argument for the imm parameter of the ""_mm256_cvtps_ph"" intrinsic. The value for ""_MM_FROUND_NO_EXC"" (0b1000) is out of bounds for imm, since only imm[1:0] and imm[2] are used and imm[7:3] is ignored by the processor according to Intel's Software Developer Manual. | |
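A minimal sketch of the bit arithmetic described above (not part of the MR). The constant values mirror the usual x86 intrinsics headers and are restated here as an assumption so the snippet builds without `<immintrin.h>` or F16C support; `fits_cvtps_ph_imm` is a hypothetical helper.

```cpp
#include <cassert>

// Assumed values of the corresponding header macros.
constexpr int kFroundToNearestInt = 0x00;  // _MM_FROUND_TO_NEAREST_INT
constexpr int kFroundNoExc = 0x08;         // _MM_FROUND_NO_EXC (0b1000)

// True if the immediate only touches imm[2:0], the bits the instruction uses.
constexpr bool fits_cvtps_ph_imm(int imm) { return (imm & ~0x7) == 0; }
```

OR-ing `kFroundNoExc` into the rounding mode sets bit 3, which is outside the used range, and that matches the out-of-bounds value the warning reports.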
### Additional information | |
<!--Any additional information you think is important.--> | |
https://www.intel.com/content/www/us/en/developer/articles/technical/intel-sdm.html | |
Table 5-13. from Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 2C | |
A similar problem (albeit with the 128-bit instruction) is discussed here: | |
https://developercommunity.visualstudio.com/t/-mm-cvtps-ph-doesnt-accept-mm-fround-no-exc/1343857",kevle,2023-07-04T17:12:23.204Z,NA,NA | |
1361 (https://gitlab.com/libeigen/eigen/-/merge_requests/1361),Altivec: fix compilation with C++20 and higher,"### What does this implement/fix? | |
<!--Please explain your changes.--> | |
This fixes compilation of the Altivec code with `-std=c++20` and `-std=c++23`. | |
Use of a `simple-template-id` as the name of the constructor seems to have been allowed in C++17 and earlier, | |
but does not really add anything. The simple class name works with `-std=c++98` as well. | |
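A minimal sketch of the pattern in question (not part of the MR). `Accumulator` is a hypothetical class template and is not taken from the Altivec sources.

```cpp
#include <cassert>

// Hypothetical class template used only for illustration.
template <typename Scalar>
struct Accumulator {
  // Spelling that the description reports as failing with -std=c++20:
  //   Accumulator<Scalar>() : value(0) {}
  // Plain class name, accepted by every standard mode:
  Accumulator() : value(0) {}
  Scalar value;
};
```

Inside the class, the plain name already refers to the current specialization, so dropping the template arguments changes nothing about which constructor is declared.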
### Additional information | |
<!--Any additional information you think is important.--> | |
Problem discovered when compiling `tflite` as a dependency of `chromium`, which uses `-std=gnu++2a`.",Marcus Comstedt,2023-07-05T13:29:46.249Z,NA,NA | |
1363 (https://gitlab.com/libeigen/eigen/-/merge_requests/1363),Fix use of arg function in CUDA.,"A global `::arg` function does not officially exist for CUDA, and fails with MSVC+C++20. | |
Replacing it with `std::arg` seems to work on device.",Antonio Sánchez,2023-07-07T18:37:14.801Z,NA,NA | |
1358 (https://gitlab.com/libeigen/eigen/-/merge_requests/1358),Fix annoying warnings,"### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
This resolves all of the compiler warnings in clang default/avx2. There are many outstanding warnings in gcc. Summary of changes: | |
1) `equal_strict` no longer complains about comparing unsigned/signed integers. To address this, I check if the signed argument is negative, return false if so, otherwise cast it to its unsigned counterpart and do the comparison. | |
2) dot no longer complains about an unused typedef | |
3) pcarg no longer complains about an unused typedef | |
4) plainobjectbase::_init2 no longer complains about an unused typedef | |
5) ref.h enum comparisons | |
6) SVD uninitialized variables | |
7) arraycwise underflow issues with arm | |
8) deleted a bunch of stupid tests and split up the remaining tests in arraycwise to help with CI failures due to lack of resources | |
9) random_cast_without_overflow -- don't negate unsigned arguments | |
10) std_vector test doesn't complain about uninitialized memory (for a test that's not run!) | |
11) tensorblock: enum comparisons | |
12) FFT: change enum to constexpr variables to avoid shadowing existing `Default` enum in Core | |
13) disable deprecation warning in euler angles | |
14) NNLS no longer complains about unused typedef | |
15) tensor casts: use cast_impl instead of static_cast to correctly handle real->complex casts | |
16) tensor reduction: judiciously cast to double | |
17) make convert_index fancier by handling signed/unsigned integer range check | |
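A minimal sketch of the approach described in item 1 (not Eigen's actual `equal_strict`). The helper name `equal_signed_unsigned` is hypothetical.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical helper: compare a signed value with an unsigned one
// without relying on the implicit signed-to-unsigned conversion.
template <typename Signed, typename Unsigned>
bool equal_signed_unsigned(Signed s, Unsigned u) {
  using UnsignedOfSigned = typename std::make_unsigned<Signed>::type;
  // A negative value can never equal an unsigned value.
  if (s < 0) return false;
  // s is non-negative here, so the cast preserves its value.
  return static_cast<UnsignedOfSigned>(s) == u;
}
```

A direct `-1 == 4294967295u` comparison evaluates to true on platforms with 32-bit `int`, because the signed operand is converted first; the early return avoids that.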
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2023-07-07T20:19:58.929Z,NA,NA | |
1359 (https://gitlab.com/libeigen/eigen/-/merge_requests/1359),Fix AVX512 nomalloc issues in trsm.,"The AVX512 trsm kernels end up always trying to allocate via malloc. | |
Notably, `EIGEN_STACK_ALLOCATION_LIMIT` is 0 for AVX512 by default | |
(and within tests), so we cannot use | |
`ei_declare_aligned_stack_constructed_variable` either. The only | |
solution is to disable the AVX512 kernels if malloc is disabled. | |
This fixes the nomalloc tests.",Antonio Sánchez,2023-07-10T16:42:14.016Z,NA,NA | |
1364 (https://gitlab.com/libeigen/eigen/-/merge_requests/1364),Optimize check_rows_cols_for_overflow,"### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
Fixes #2694 | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
`check_rows_cols_for_overflow` performs a sanity check to ensure that the number of rows and columns does not exceed the maximum permitted by the index type. @imyxh noted that this check could be optimized when the rows are known at compile time. Currently, we always divide by columns, which the compiler is able to optimize for objects such as `Matrix<double,Dynamic,1>` but not `Matrix<double,1,Dynamic>`. This fix uses partial template specialization to use compile-time information when it's available. | |
Also fixes a nit where matrices such as `Matrix<double,Dynamic,0>` were tagged with a `Dynamic` size at compile time. This probably had no effect on runtime performance, but certainly made the template metaprogramming more annoying for matrices with a zero dimension at compile time. | |
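A minimal sketch of the idea (not Eigen's implementation). `Index`, `size_overflows` and `size_overflows_fixed_rows` are hypothetical names, non-negative sizes are assumed, and a function template stands in for the partial specialization mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

using Index = std::ptrdiff_t;  // stand-in for the library index type

// Runtime check: would rows * cols exceed the largest Index?
// Dividing avoids computing the overflowing product itself.
inline bool size_overflows(Index rows, Index cols) {
  if (rows == 0 || cols == 0) return false;
  return rows > std::numeric_limits<Index>::max() / cols;
}

// Rows known at compile time (assumes Rows > 0): the bound is a
// constant, so no division is left to do at run time.
template <Index Rows>
bool size_overflows_fixed_rows(Index cols) {
  constexpr Index max_cols = std::numeric_limits<Index>::max() / Rows;
  return cols > max_cols;
}
```

For a fixed row count the check reduces to a single comparison against a constant.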
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2023-07-10T17:40:18.670Z,NA,NA | |
1367 (https://gitlab.com/libeigen/eigen/-/merge_requests/1367),Fix more gcc compiler warnings / sort-of bugs,"### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
Fixes several subtle issues that were triggering lots of verbose compiler warnings in gcc: | |
1) Blocks with zero columns are permitted in the comma initializer (and perhaps other situations). These zero-size blocks are harmless (probably), but may hold a pointer that is incremented beyond the bounds of the allocated memory. | |
``` | |
Matrix2d A(2, 2); | |
MatrixXd A1(1, 2); | |
Matrix<double,1,0> A2(1, 0); | |
(A << A1, A2, | |
A1, A2).finished(); | |
``` | |
Here, the bottom right block of `A` is constructed to have pointer `A.data() + 5`, which is clearly beyond the bounds of `A.data()`. This pointer is never dereferenced, as the evaluator loop won't execute anything, but gcc can't deduce that. This patch always nulls the pointer if the block size is 0, which suppresses the warning. In the block test, we must check if the block object is zero-sized before calling coeff. This is usually implicit in a loop `for(int i = 0; i < rows; i++)`, but in this particular case we (unconditionally) check a single coefficient's value. | |
2) VectorBlock explicitly declared the `=` assignment operator, but no copy constructor. This patch uses the built-in macro `EIGEN_INHERIT_ASSIGNMENT_OPERATORS` to do that. | |
3) The triangular solver does a runtime check if the result and rhs are in fact the same object. I suspect this is causing the `-Wmaybe-uninitialized` warning which only seems to happen for fixed size objects. Perhaps this is also the reason for the SVD warnings? | |
``` | |
if(!is_same_dense(dst,m_rhs)) | |
dst = m_rhs; | |
``` | |
If `dst` is an implicitly declared fixed-size matrix, the storage is allocated on the stack and is not initialized, and gcc can't deduce whether `dst = m_rhs` will be run. This patch changes the test so that `dst` is unambiguously initialized. I'm not sure if this is actually the cause of the warning, but it works. | |
4) explicitly initialize matrix in `random_matrix` test | |
build:linux:cross:x86-64:gcc-10:avx2 log files: | |
Before: `5220` lines | |
After: `2363` lines | |
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2023-07-14T21:12:45.793Z,NA,NA | |
1369 (https://gitlab.com/libeigen/eigen/-/merge_requests/1369),fix arm build warnings,"### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
1. Fix `-Wcast-align` in `TensorContraction.h` by changing `reinterpret_cast<T*>(x)` to `static_cast<T*>(static_cast<void*>(x))` | |
2. Fix signed/unsigned integer comparison in `TensorDimensions.h` | |
3. Rename some shadowed variables | |
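A minimal sketch of the cast pattern from item 1 (not the code in `TensorContraction.h`). `cast_through_void` is a hypothetical helper.

```cpp
#include <cassert>

// Hypothetical helper: yields the same address as
// reinterpret_cast<T*>(bytes), but goes through void* so that the
// direct pointer cast flagged by -Wcast-align does not appear.
template <typename T>
T* cast_through_void(unsigned char* bytes) {
  return static_cast<T*>(static_cast<void*>(bytes));
}
```

The two-step cast only changes how the conversion is spelled; the caller is still responsible for passing a buffer that is suitably aligned for `T`.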
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2023-07-17T20:37:27.964Z,NA,NA | |
1370 (https://gitlab.com/libeigen/eigen/-/merge_requests/1370),Fix -Waggressive-loop-optimizations,"### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
x86-64 gcc version 10+ emits a `-Waggressive-loop-optimizations` warning for certain matrix-vector products. The warning appears when the dimensions of the matrix are a fixed-size multiple of the packet type (?). `-mavx2` silences these warnings for `Matrix<float,32,32>`, but not `64`. | |
Regardless of the reason, the warnings are probably false positives, as the code block in question is never executed. Explicitly defining the loop bounds fixes the issue. All additional integer arithmetic involves compile time powers of 2 so calculating the bounds is fairly trivial. | |
Minimum reproducer: | |
``` | |
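#include <Eigen/Core> | |
using namespace Eigen; | |
int main() | |
{ | |
  Matrix<float,32,32> A; | |
  Matrix<float,1,32> b, c; | |
  // Fixed-size row-vector times matrix product that triggers the warning. | |
  c.noalias() = b * A; | |
} | |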
``` | |
https://godbolt.org/z/bsjYcE8bT | |
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2023-07-21T03:47:41.689Z,NA,NA | |
1371 (https://gitlab.com/libeigen/eigen/-/merge_requests/1371),Fix -Wmaybe-uninitialized in SVD,"### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
Fixes a very subtle issue for fixed-size JacobiSVD/BDCSVD. Upon construction, the size of the solver is initialized as such: | |
``` | |
m_rows(-1), | |
m_cols(-1), | |
m_diagSize(0), | |
``` | |
regardless of whether these values are known at compile time. Apparently, the SVD code is not tight enough to preclude the possibility that `m_workspace` (possibly other members) is accessed even when `RowsAtCompileTime == ColsAtCompileTime`. (TODO: do we really need a copy of the matrix just to scale it? The QR preconditioner just makes a copy of the matrix anyway!) If we instead initialize these dimensions to the compile-time values (and treat them as const members), the warnings go away. Also saves a few bytes in the fixed-size case. | |
With this fix, the GCC 10 default full build is warning free! I also split up the SVD tests to reduce the number of failed tests due to lack of resources. | |
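A minimal sketch of the general idea of keeping a dimension in the type when it is known at compile time (not the SVD code itself). `kDynamic` and `Dimension` are hypothetical names.

```cpp
#include <cassert>

constexpr int kDynamic = -1;  // hypothetical marker for a runtime-sized dimension

// Fixed size: the value lives in the type, so no member is stored
// and nothing is left to initialize.
template <int SizeAtCompileTime>
struct Dimension {
  explicit Dimension(int /*runtime_size*/) {}
  constexpr int value() const { return SizeAtCompileTime; }
};

// Dynamic size: the value is only known at run time and must be stored.
template <>
struct Dimension<kDynamic> {
  explicit Dimension(int runtime_size) : m_value(runtime_size) {}
  int value() const { return m_value; }
  int m_value;
};
```

With the fixed-size variant the compiler always sees a constant, so there is no placeholder such as `-1` for it to reason about.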
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2023-07-25T22:22:18.653Z,NA,NA | |
1373 (https://gitlab.com/libeigen/eigen/-/merge_requests/1373),Fixes #2703 by adding max_digits10 function,"### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
#2703 | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
Eigen used [digits10](https://en.cppreference.com/w/cpp/types/numeric_limits/digits10) as the number of full precision decimal digits. However, in my use cases like serialization to text files, digits10 does not provide enough decimal digits to uniquely represent a double. To fix this issue, I introduced max_digits10 in Eigen::NumTraits. Also, the current implementation of digits10 produced a different result from std::numeric_limits [demo](https://godbolt.org/z/qe766WrME). I fixed digits10 as well. | |
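A minimal sketch of the round-trip problem using only the standard library (not Eigen's `NumTraits`). `round_trip` is a hypothetical helper and IEEE 754 doubles are assumed.

```cpp
#include <cassert>
#include <iomanip>
#include <limits>
#include <sstream>
#include <string>

// Hypothetical helper: write x as decimal text with the given number
// of significant digits, then parse the text back into a double.
inline double round_trip(double x, int digits) {
  std::ostringstream os;
  os << std::setprecision(digits) << x;
  return std::stod(os.str());
}
```

For `x = 0.1 + 0.2`, writing with `std::numeric_limits<double>::digits10` significant digits loses the last bits and the parsed value differs from `x`, while `max_digits10` digits recover `x` exactly.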
### Additional information | |
<!--Any additional information you think is important.-->",Yingnan Wu,2023-07-26T16:02:53.140Z,NA,NA | |
1372 (https://gitlab.com/libeigen/eigen/-/merge_requests/1372),Fix problems with recent changes and Tensorflow in Power,"Fix problems with recent changes and Tensorflow in Power. | |
This includes issues with partial packets, cpu support, DataMappers not having stride() and bfloat16.",Chip Kerchner,2023-07-26T16:24:59.098Z,NA,NA | |
1376 (https://gitlab.com/libeigen/eigen/-/merge_requests/1376),Fix nullptr dereference issue in triangular product.,"Found by UBSAN testing, if matrix sizes are zero, we previously dereferenced a nullptr.",Antonio Sánchez,2023-07-27T22:10:22.465Z,NA,NA | |
1331 (https://gitlab.com/libeigen/eigen/-/merge_requests/1331),[SYCL-2020] Add test to validate SYCL in Eigen core.,This MR adds a test to validate some of the basic SYCL functionalities in Eigen core.,Alejandro Acosta,2023-07-28T15:45:09.713Z,NA,NA | |
1365 (https://gitlab.com/libeigen/eigen/-/merge_requests/1365),Add missing x86 pcasts,"### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
Add some missing pcasts, namely: | |
- float->bool (SSE, AVX) | |
- int->double (SSE, AVX, AVX512) | |
- float->double (AVX512) | |
Also introduce simplified method for enabling pcasts with a default inheritable struct `vectorized_type_casting_traits`. Remove `_MM_FROUND_NO_EXC` bit from `_mm256_cvtps_ph` per https://gitlab.com/libeigen/eigen/-/merge_requests/1362. Clean up array_cwise (including very annoying ""condition always true"" warning in MSVC). | |
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2023-07-28T23:41:39.566Z,NA,NA | |
1377 (https://gitlab.com/libeigen/eigen/-/merge_requests/1377),Fix another UB access.,"This time for triangular solves. If the system is empty, we can't access element (0, 0).",Antonio Sánchez,2023-07-31T19:18:46.209Z,NA,NA | |
1378 (https://gitlab.com/libeigen/eigen/-/merge_requests/1378),Fix clang-tidy warning,"The existing code triggers the following warning: | |
""Forwarding reference passed to std::move(), which may unexpectedly cause lvalues to be moved; use std::forward() instead.""",Rasmus Munk Larsen,2023-07-31T21:26:29.178Z,NA,NA | |
1379 (https://gitlab.com/libeigen/eigen/-/merge_requests/1379),Fix nullptr dereference in SVD.,"If the bidiagonal only has size 1, then the upper-diagonal may be
empty, resulting in a nullptr dereference when calling `&coeffRef(0)`.
Here we explicitly check before doing so, returning `nullptr`
if the upper-diagonal is empty.",Antonio Sánchez,2023-08-01T16:33:17.633Z,NA,NA
1375 (https://gitlab.com/libeigen/eigen/-/merge_requests/1375),Add architecture definition files for Qualcomm Hexagon Vector Extension (HVX),"### Reference issue
N/A | |
### What does this implement/fix?
This merge request adds architecture-specific template instantiations for the Qualcomm Hexagon Vector Extension (HVX).
1. Define EIGEN_VECTORIZE and EIGEN_VECTORIZE_HVX when the compiler build flag __HVX__ is set.
2. Add a new HVX directory under src/Core/arch containing two architecture definition files. These files are used with the EIGEN_VECTORIZE_HVX flag.
3. Add a new Architecture::HVX target for use with the EIGEN_VECTORIZE_HVX flag.
### Additional information
The change is guarded by a compiler build flag, so it should only impact builds for the Qualcomm Hexagon DSP. No build or test support is included, since that would require the Qualcomm Hexagon SDK.",cheng wang,2023-08-01T17:47:58.251Z,NA,NA
1380 (https://gitlab.com/libeigen/eigen/-/merge_requests/1380),Fix unaligned scalar alignment UB.,"It is undefined behavior to try to bind a scalar to a misaligned memory | |
address. We need to disable a test to avoid this, and put explicit | |
assertions in `MapBase` to ensure proper alignment.",Antonio Sánchez,2023-08-01T19:39:08.706Z,NA,NA | |
1381 (https://gitlab.com/libeigen/eigen/-/merge_requests/1381),fix boost mp test to refer to new svd tests,"### Reference issue
### What does this implement/fix?
### Additional information",Charles Schlosser,2023-08-02T13:38:13.700Z,NA,NA
1382 (https://gitlab.com/libeigen/eigen/-/merge_requests/1382),Fix tensor stridedlinearbuffercopy,"### Reference issue
Fixes #2706 | |
### What does this implement/fix?
Avoid negative indices when calculating loop bounds by using division and multiplication (of sizes that are compile-time powers of two). Unsigned integers wrap around instead of going negative, which is defined behavior and probably why no compiler warnings were emitted. Good catch @rsimsek71! | |
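As a rough illustration of the idea above (not the actual Eigen code), the sketch below computes the end of a chunked main loop as `(count / step) * step` instead of `count - step`. With `count = 3` and `step = 4`, the subtraction gives `-1` for signed types or wraps to a huge value for unsigned ones, while the division form gives `0`. The names `round_down` and `chunked_sum` are hypothetical.

```cpp
#include <cstddef>

// Hypothetical helper: largest multiple of `step` that is <= `count`.
// For count >= 0 and step > 0 the result is never negative.
constexpr std::ptrdiff_t round_down(std::ptrdiff_t count, std::ptrdiff_t step) {
  return (count / step) * step;
}

// Sums `count` values, handling `kStep` values per iteration in the main
// loop and the leftovers in a scalar tail loop.
template <std::ptrdiff_t kStep>
float chunked_sum(const float* data, std::ptrdiff_t count) {
  const std::ptrdiff_t main_end = round_down(count, kStep);
  float total = 0.0f;
  std::ptrdiff_t i = 0;
  for (; i < main_end; i += kStep) {
    for (std::ptrdiff_t j = 0; j < kStep; ++j) total += data[i + j];
  }
  for (; i < count; ++i) total += data[i];  // tail
  return total;
}
```

When `kStep` is a compile-time power of two, compilers can typically lower the division and multiplication to a shift and a mask, so the safer bound should cost little.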
### Additional information",Charles Schlosser,2023-08-03T20:36:43.000Z,NA,NA
1383 (https://gitlab.com/libeigen/eigen/-/merge_requests/1383),Add temporary macro to allow unaligned scalar UB.,To give us time to fix some TFLite failures that currently block the next Eigen update.,Antonio Sánchez,2023-08-15T15:58:41.907Z,NA,NA | |
1384 (https://gitlab.com/libeigen/eigen/-/merge_requests/1384),Add IWYU private pragmas to internal headers.,"Generated via recursive sed based on looking for ""InternalHeaderCheck.h"":
``` | |
$ find Eigen -type f -exec sed -E -i 's|(#include.*InternalHeaderCheck.h"")|// IWYU pragma: private\n\1|' {} + | |
$ find unsupported -type f -exec sed -E -i 's|(#include.*InternalHeaderCheck.h"")|// IWYU pragma: private\n\1|' {} + | |
``` | |
This is to help tooling better determine which headers to suggest in auto-completion.",Antonio Sánchez,2023-08-21T16:25:23.255Z,NA,NA | |
1385 (https://gitlab.com/libeigen/eigen/-/merge_requests/1385),Rename plugin headers to .inc.,"These are non-standard headers, included in specific places within an existing Eigen namespace. | |
Changing the extension prevents tools from treating them as standard headers, preventing them | |
from being auto-suggested.",Antonio Sánchez,2023-08-21T16:26:13.486Z,NA,NA | |
1386 (https://gitlab.com/libeigen/eigen/-/merge_requests/1386),Fix arm32 float division and related bugs,"### Reference issue
### What does this implement/fix?
ARM32 NEON intrinsics flush to zero. This is problematic for denormal input, and also for some very large input whose reciprocal is denormal (among other issues). This patch fixes the following: | |
- ARM32 has no vectorized float32 division, but it has reciprocal intrinsics. Currently, division is computed as `a / b = a * recip(b)`. If `b` is very large, then `recip(b)` is denormal and is flushed to zero. This patch uses the following procedure: `a / b = f * a * recip(f * b)` where `f = 0.25`. `f` is only used when `b` is very large, thus maintaining support for very small (normal) values of `b`.
- Increase reciprocal refinement iterations to 2. Currently, there is only 1 refinement step, which is insufficient for many applications (a particularly egregious example is `1.0 / 1.0 != 1.0f`). This fixes several floating point functions that rely on reasonably accurate `pdiv`. | |
- ARM32 has no vectorized sqrt, but has reciprocal sqrt intrinsics. Use these intrinsics instead of the generic implementation. Use two refinement steps. Minimize needless error handling while still handling edge cases correctly. | |
- Change the tests so that ARM32 doesn't attempt computations on denormal numbers (these will always fail), and don't check for correct results if the reference solution is denormal. | |
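The scaling trick in the first bullet can be sketched in scalar code. This is only an illustration under stated assumptions: `recip_ftz` emulates a reciprocal that flushes denormal results to zero (it is not a NEON intrinsic), and the `2^124` cutoff for a very large `b` is a hypothetical choice, not the threshold used in the patch.

```cpp
#include <cmath>
#include <limits>

// Stand-in for a hardware reciprocal that flushes denormal results to zero.
// This emulation is an assumption for illustration only.
inline float recip_ftz(float b) {
  const float r = 1.0f / b;
  return std::fabs(r) < std::numeric_limits<float>::min() ? 0.0f : r;
}

// Naive division: the quotient is lost whenever 1 / b is denormal.
inline float naive_div(float a, float b) { return a * recip_ftz(b); }

// Scaled division: a / b = f * a * recip(f * b), used only for very large |b|
// so that small (normal) values of b keep their full range.
inline float scaled_div(float a, float b) {
  const float f = 0.25f;
  const float large = std::ldexp(1.0f, 124);  // hypothetical cutoff, 2^124
  if (std::fabs(b) > large) return (f * a) * recip_ftz(f * b);
  return a * recip_ftz(b);
}
```

For `a = b = 3.0e38f`, the reciprocal of `b` is denormal, so `naive_div` returns zero, while `scaled_div` works with `0.25 * b`, whose reciprocal is still a normal float.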
Fixes the following tests in cross ci testing: | |
- 35 - packetmath_1 (Child aborted) | |
- 49 - packetmath_15 (Child aborted) | |
- 247 - array_cwise_11 (Child aborted) | |
- 249 - array_cwise_12 (Child aborted) | |
- 251 - array_cwise_14 (Child aborted) | |
- 253 - array_cwise_16 (Child aborted) | |
- 258 - array_cwise_21 (Child aborted) | |
- 449 - qr_colpivoting_1 (Child aborted) | |
- 493 - eigensolver_selfadjoint_3 (Child aborted) | |
- 550 - jacobisvd_26 (Child aborted) | |
- 551 - jacobisvd_27 (Child aborted) | |
- 606 - bdcsvd_27 (Child aborted) | |
- 607 - bdcsvd_28 (Child aborted) | |
- 643 - geo_quaternion_1 (Child aborted) | |
https://gitlab.com/libeigen/eigen_ci_cross_testing/-/pipelines/944835667 | |
Also, I got rid of the sparse permutation test that counted the number of allocations for `P * alpha * M`. This only fails on arm32. I figure the test is bad, but I really have no idea why. | |
### Additional information",Charles Schlosser,2023-08-29T00:36:08.196Z,NA,NA
1387 (https://gitlab.com/libeigen/eigen/-/merge_requests/1387),Unwind Block of Blocks,"### Reference issue
### What does this implement/fix?
A backwards compatible version of https://gitlab.com/libeigen/eigen/-/merge_requests/1290 without the bulk of the improvements from @cantonios | |
Adds an explicit method to convert block of block expressions to a simple block. The implicit conversion operator is problematic as the unwinding of other block types, e.g. `Block<SparseMatrix>` is not yet supported. | |
### Additional information",Charles Schlosser,2023-08-29T17:21:42.320Z,NA,NA
1388 (https://gitlab.com/libeigen/eigen/-/merge_requests/1388),Stage will not be ok if pardiso returned error,"### Reference issue | |
Fixes #2704 | |
### What does this implement/fix? | |
The solution changes the code that determines whether a stage is ok. Now, instead of simply assigning a `true` value once the stage has executed, the stage is marked ok if and only if `m_info == Eigen::Success`.",Daniel Benedí García,2023-09-06T16:17:15.309Z,NA,NA
1389 (https://gitlab.com/libeigen/eigen/-/merge_requests/1389),New panel modes for GEMM MMA (real & complex).,"New panel modes for GEMM MMA (real & complex). Better register usage and pipeline. | |
Up to 2.84X faster for small matrices. | |
34% faster for F32 MMA real-only, 75% for F64 MMA real-only - large matrices. | |
48% faster for F32 MMA complex, 32% for F64 MMA complex - large matrices. | |
Up to 20% better performance for packing. | |
Some other fixes for various compilers.",Chip Kerchner,2023-09-06T20:03:46.921Z,NA,NA | |
1391 (https://gitlab.com/libeigen/eigen/-/merge_requests/1391),Export ThreadPool symbols from legacy header.,This silences some new clang include-cleaner warnings.,Antonio Sánchez,2023-09-10T20:56:20.839Z,NA,NA | |
1392 (https://gitlab.com/libeigen/eigen/-/merge_requests/1392),Fix call to static functions from device by adding EIGEN_DEVICE_FUNC attribute to run methods,"I noticed `operator * ` was not working as expected on cuda device functions. | |
The reason was the missing `EIGEN_DEVICE_FUNC ` in `static EIGEN_DEVICE_FUNC ResultType run`.",François Girinon,2023-09-13T04:16:53.585Z,NA,NA | |
1330 (https://gitlab.com/libeigen/eigen/-/merge_requests/1330),[SYCL-2020] Enabling half precision support for SYCL.,This MR adds support for the half precision type in SYCL-2020. The implementation uses `Eigen::half` as an abstraction for `cl::sycl::half`. Math operations and packets know how to handle the conversions between `Eigen::half` and `cl::sycl::half` whenever necessary.,Alejandro Acosta,2023-09-13T16:30:18.014Z,NA,NA
1394 (https://gitlab.com/libeigen/eigen/-/merge_requests/1394),Fix extra semicolon in XprHelper,"### Reference issue
N/A | |
### What does this implement/fix? | |
Fixes an extra semicolon in XprHelper, which causes compilation errors when compiled using `-Wextra-semi` compiler flag. | |
``` | |
In file included from ../../third_party/eigen/Eigen/Dense:1: | |
In file included from ../../third_party/eigen/Eigen/Core:173: | |
../../third_party/eigen/Eigen/src/Core/util/XprHelper.h:828:81: error: extra ';' after member function definition [-Werror,-Wextra-semi] | |
static constexpr bool is_inner_panel(bool inner_panel) { return inner_panel; }; | |
^ | |
1 error generated | |
``` | |
### Additional information",Kevin,2023-09-14T18:12:10.565Z,NA,NA
1396 (https://gitlab.com/libeigen/eigen/-/merge_requests/1396),Fix sparse triangular view iterator,"### Reference issue
### What does this implement/fix?
The sparse triangular iterator had its `row()` and `col()` functions commented out (over 8 years ago), seemingly by accident. This causes bizarre results, and even segfaults. This bug is as old as `3.3.4`!!! | |
Thanks to sarah8389 on the discord channel for pointing this out. | |
```
#include <Eigen/SparseCore>
#include <iostream>
using namespace Eigen;
auto main() -> int {
  using Mat = SparseMatrix<double, ColMajor, int>;
  Mat m(4, 4);
  m.setIdentity();
  std::cout << m.triangularView<UnitLower>().toDense();
}
// prints
// 0 0 0 1
// 1 0 0 0
// 0 1 0 0
// 0 0 1 0
```
### Additional information
https://godbolt.org/z/je5q6P6aY",Charles Schlosser,2023-10-05T17:13:38.632Z,NA,NA
1397 (https://gitlab.com/libeigen/eigen/-/merge_requests/1397),Consolidate multiple implementations of divup/div_up/div_ceil.,Consolidate multiple implementations of divup/div_up/div_ceil.,Rasmus Munk Larsen,2023-10-10T17:17:00.519Z,NA,NA | |
1398 (https://gitlab.com/libeigen/eigen/-/merge_requests/1398),Eliminate use of _res.,"It conflicts with a leaked macro from resolv.h, leading to compile errors. | |
Fixes #2725.",Antonio Sánchez,2023-10-16T19:56:55.156Z,NA,NA | |
1393 (https://gitlab.com/libeigen/eigen/-/merge_requests/1393),[ROCm] Replace HIP_PATH with ROCM_PATH for rocm 6.0,"In preparation for the upcoming 6.0 release, this commit replaces the `HIP_PATH` cmake cache variable with a new cache variable `ROCM_PATH` to reflect changes in the ROCm directory structure:
https://rocm.docs.amd.com/en/docs-5.3.3/understand/file_reorg.html | |
By default, `ROCM_PATH` points at the root of the ROCm installation (e.g. /opt/rocm).",Ioannis Assiouras,2023-10-16T20:56:37.368Z,NA,NA | |
1400 (https://gitlab.com/libeigen/eigen/-/merge_requests/1400),Pass div_ceil arguments by value.,"Otherwise, it leads to odr-usage errors, since we use this function | |
with internal constants like | |
``` | |
static const Index l0_size = 4; | |
``` | |
which don't currently have separate definitions. | |
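A minimal sketch of the difference, assuming a simplified `div_ceil_sketch` (a hypothetical name, not Eigen's implementation) and a hypothetical `Sizes` struct: taking the arguments by value only reads the constant's value, so no out-of-class definition is needed, whereas binding the constant to a `const T&` parameter would odr-use it.

```cpp
// Hypothetical holder of an in-class-initialized constant with no
// out-of-class definition, mirroring the `l0_size` pattern above.
struct Sizes {
  static const long l0_size = 4;
};

// By-value parameters: the argument undergoes lvalue-to-rvalue conversion at
// the call site, so `Sizes::l0_size` is not odr-used.
template <typename T>
constexpr T div_ceil_sketch(T a, T b) {
  return (a + b - 1) / b;  // assumes a >= 0 and b > 0
}

// A by-reference signature such as
//   template <typename T> T div_ceil_ref(const T& a, const T& b);
// would bind a reference to `Sizes::l0_size`, which odr-uses it and can fail
// at link time when no separate definition exists.

// Usage: number of blocks of size l0_size needed to cover n items.
inline long blocks_needed(long n) { return div_ceil_sketch(n, Sizes::l0_size); }
```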
Passing by value should be okay since we always call with built-in | |
integer inputs.",Antonio Sánchez,2023-10-17T18:46:20.091Z,NA,NA | |
1401 (https://gitlab.com/libeigen/eigen/-/merge_requests/1401),fix typo in comment,"### Reference issue
### What does this implement/fix?
### Additional information",Anatoly Borisov,2023-10-18T18:32:32.904Z,NA,NA
1402 (https://gitlab.com/libeigen/eigen/-/merge_requests/1402),Work around MSVC issue in Block XprType.,"MSVC seems to get confused and mixes up the attributes, particularly RowMajor, | |
in some odd and complex circumstances. Removing the dependent XprType typedef
seems to fix this.",Antonio Sánchez,2023-10-19T22:02:04.228Z,NA,NA | |
1404 (https://gitlab.com/libeigen/eigen/-/merge_requests/1404),Avoid building docs if cross-compiling or not top level.,Fixes #2736.,Antonio Sánchez,2023-10-23T17:55:02.293Z,NA,NA | |
1399 (https://gitlab.com/libeigen/eigen/-/merge_requests/1399),Disable denorm deprecation warnings in MSVC C++23.,Fixes #2713.,Antonio Sánchez,2023-10-23T17:56:05.007Z,NA,NA | |
1403 (https://gitlab.com/libeigen/eigen/-/merge_requests/1403),Fixes #2735: Component-wise cbrt,"### Reference issue | |
The MR implements the feature request in #2735, to add cbrt to the supported component-wise operations on arrays and matrices.
### What does this implement/fix? | |
Following the current implementation of the sqrt/cWiseSqrt, I added the functions and structs to enable cbrt calculations using std::cbrt, as well as the mkl vml implementation of cbrt. | |
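The functor pattern described above can be sketched with only the standard library. This is an illustration, not the Eigen implementation: `cbrt_op_sketch` and `cwise_cbrt` are hypothetical names, and a `std::vector` stands in for an Eigen array. One reason a dedicated cbrt is useful is that `std::cbrt` handles negative inputs, while `std::pow(x, 1.0 / 3.0)` returns NaN for them.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Hypothetical stand-in for a scalar functor built on std::cbrt.
struct cbrt_op_sketch {
  double operator()(double a) const { return std::cbrt(a); }
};

// Applies the functor to every coefficient, the way a component-wise
// operation would.
inline std::vector<double> cwise_cbrt(const std::vector<double>& in) {
  std::vector<double> out(in.size());
  std::transform(in.begin(), in.end(), out.begin(), cbrt_op_sketch{});
  return out;
}
```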
### Additional information | |
I added tests for cbrt to the existing cWise test cases, as well as adding some documentation. The main thing that I am unsure about is the functor_traits specialization for the scalar_cbrt_op:
```
template <typename Scalar>
struct scalar_cbrt_op {
  EIGEN_DEVICE_FUNC inline const Scalar operator()(const Scalar& a) const { return numext::cbrt(a); }
};
template <typename Scalar>
struct functor_traits<scalar_cbrt_op<Scalar> > {
  enum { Cost = 5 * NumTraits<Scalar>::MulCost, PacketAccess = false };
};
```
For this I copied from the acosh functor_traits since it also doesn't have built-in architecture support (I think...). I kept the Cost as `5 * NumTraits<Scalar>::MulCost`, but I don't know if this is correct (or if it even matters).
Lastly, I am having trouble running all of the tests and building all the docs on my local (windows) machine, so I'm hoping that the pipelines will point out anything I missed.",Kyle Macfarlan,2023-10-25T03:06:15.211Z,NA,NA | |
1406 (https://gitlab.com/libeigen/eigen/-/merge_requests/1406),TensorReduction: replace divup with div_ceil,"### Reference issue
### What does this implement/fix?
So many deprecation warnings. | |
### Additional information",Charles Schlosser,2023-10-25T16:44:38.696Z,NA,NA
1407 (https://gitlab.com/libeigen/eigen/-/merge_requests/1407),fix Wshorten-64-to-32 warnings in div_ceil,"### Reference issue
### What does this implement/fix?
Lots of warnings. https://gitlab.com/libeigen/eigen_ci_cross_testing/-/jobs/5387131494 | |
Is the implicit widening conversion safe? | |
### Additional information",Charles Schlosser,2023-10-27T15:52:04.715Z,NA,NA
1410 (https://gitlab.com/libeigen/eigen/-/merge_requests/1410),Fix int overflow causing cxx11_tensor_gpu_1 to fail.,"It's failing in div_ceil, where the first argument overflows an `int` to become negative, | |
triggering an assertion and causing the test to crash. By explicitly using `DenseIndex` | |
and adding appropriate casts when necessary to avoid implicit conversions, we can avoid | |
the error.",Antonio Sánchez,2023-11-06T17:10:17.267Z,NA,NA | |
1411 (https://gitlab.com/libeigen/eigen/-/merge_requests/1411),Fix typo to allow nomalloc test to pass on AVX512.,Silly typo: `EIGEN_NO_RUNTIME_MALLOC` -> `EIGEN_RUNTIME_NO_MALLOC`,Antonio Sánchez,2023-11-06T18:58:45.208Z,NA,NA | |
1412 (https://gitlab.com/libeigen/eigen/-/merge_requests/1412),"Backport ""disambiguate overloads for empty index list"" to 3.4 branch","### Reference issue
Fix #2745 by backporting https://gitlab.com/libeigen/eigen/-/merge_requests/765 to 3.4 branch. | |
### What does this implement/fix? | |
It fixes the compilation error documented in https://gitlab.com/libeigen/eigen/-/issues/2745 . | |
### Additional information | |
OpenCV is a widely used project and package, so it would be great to get this merged before a new release of Eigen is done and is packaged by distributions and package managers.",Silvio Traversaro,2023-11-10T04:03:13.826Z,NA,NA | |
1408 (https://gitlab.com/libeigen/eigen/-/merge_requests/1408),Generalize parallel GEMM implementation in Core to work with ThreadPool in addition to OpenMP.,"This generalizes the implementation of parallel dense matrix multiplication in Eigen Core to work with Eigen::ThreadPool, in addition to OpenMP. | |
Example code: | |
``` | |
#define EIGEN_GEMM_THREADPOOL | |
#include <Eigen/Core> | |
int num_threads = 8; | |
Eigen::ThreadPool pool(num_threads); | |
Eigen::setGemmThreadPool(&pool); | |
Eigen::MatrixXf u, v, x; | |
v.setOnes(n, n); u.setOnes(n, n); x.setOnes(n, n); | |
x.noalias() = v * u; | |
``` | |
Initial measurements are in https://gitlab.com/libeigen/eigen/-/snippets/3618686 | |
Eventually, we want to tie this into the device framework in https://gitlab.com/libeigen/eigen/-/merge_requests/1395, such that you could achieve the same effect with | |
``` | |
ThreadPool pool(num_threads); | |
SimpleThreadPoolDevice device(pool); | |
x.device(device).noalias() = u * v; | |
``` | |
Just to make it clear: The purpose of this MR is not to _improve_ the parallel GEMM implementation in Core, which is still inferior to the parallel tensor contraction. The purpose is to make it available on platforms without OpenMP. Below is a strong scaling plot for `n=m=k=4096`. This was measured on my Lenovo P920 workstation, which sports 2 sockets x 18 physical cores x 2 threads (`Intel(R) Xeon(R) Gold 6154 CPU @ 3.00GHz`) running Linux. | |
",Rasmus Munk Larsen,2023-11-10T17:42:31.115Z,NA,NA | |
1413 (https://gitlab.com/libeigen/eigen/-/merge_requests/1413),traits<Ref>::match: use correct strides,"### What does this implement/fix? | |
The construction of Ref<T, Options, Stride<0, 0>> from an instance with contiguous memory layout (plain object T, Map<T, Options>) | |
- does not compile for mutable T, and | |
- copies T into Ref::m_object, although this is unnecessary. | |
The reason for both is that internal::traits<Ref<PlainObjectType_, Options_, StrideType_>>::match simply uses StrideType_::InnerStrideAtCompileTime and StrideType_::OuterStrideAtCompileTime instead of the adapted values already computed in its base class. | |
With this fix, Ref<T, Options, Stride<0, 0>> can be created from an object with contiguous memory layout, and without copying it.",wilfried.karel,2023-11-11T14:25:16.455Z,NA,NA | |
1415 (https://gitlab.com/libeigen/eigen/-/merge_requests/1415),Link pthread for product_threaded test,Tests are currently broken at HEAD.,Antonio Sánchez,2023-11-13T19:49:44.621Z,NA,NA | |
1416 (https://gitlab.com/libeigen/eigen/-/merge_requests/1416),Fix Wshorten-64-to-32 warning in gemm parallelizer,NA,Charles Schlosser,2023-11-14T13:51:40.919Z,NA,NA | |
1417 (https://gitlab.com/libeigen/eigen/-/merge_requests/1417),Fix a bug in commit 76e8c0455396446f8166c798da5efe879e010bdc:,"When not parallelizing, `getNbThreads()` must return 1.",Rasmus Munk Larsen,2023-11-15T21:45:38.173Z,NA,NA | |
1421 (https://gitlab.com/libeigen/eigen/-/merge_requests/1421),Gemv microoptimization,"### What does this implement/fix? | |
Explicitly defining the loop bounds for the unrolled stages that increment by `PacketSize` fixes aggressive loop optimization compiler warnings. I learned this trick to minimize the overhead of rounding down to the nearest multiple of a power of two. Dividing and multiplying by a compile-time power of two entails a right and a left shift. This can be further optimized to a single bitwise AND. | |
Normally this optimization is automatically applied by the compiler -- if the type is an unsigned integer. `Index` is a signed integer, so the compiler plays it safe. Our indices are always non-negative, so we can skip this check. | |
https://godbolt.org/z/a6drKb6W8 | |
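Independently of the link above, the rounding trick can be sketched in a few lines. This is a minimal illustration, not the code in the MR: the function names and the `kPacketSize` template parameter are hypothetical, and the mask version assumes a non-negative `n` and a power-of-two `kPacketSize`.

```cpp
#include <cassert>
#include <cstdint>

// Round n down to a multiple of kPacketSize with a divide followed by a multiply.
// For a signed n the compiler must also handle negative values correctly.
template <std::int64_t kPacketSize>
constexpr std::int64_t round_down_div_mul(std::int64_t n) {
  return (n / kPacketSize) * kPacketSize;
}

// Same result for non-negative n when kPacketSize is a power of two:
// clear the low bits with a single bitwise AND.
template <std::int64_t kPacketSize>
constexpr std::int64_t round_down_mask(std::int64_t n) {
  return n & ~(kPacketSize - 1);
}
```

For example, with a packet size of 8 both functions map 37 to 32. They differ for negative inputs, which is why a compiler will not make this substitution on its own for a signed type.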
I wanted to address this fix before cherry picking it to 3.4.",Charles Schlosser,2023-11-20T17:26:41.807Z,NA,NA | |
1422 (https://gitlab.com/libeigen/eigen/-/merge_requests/1422),Fix (u)int64_t->float conversion on arm,"Fix (u)int64_t->float conversion on arm | |
The neon version of the conversion code was narrowing the 64 bit integer | |
to 32 bits before converting to float. This meant that any value larger | |
than 32 bits was truncated, even though both the source and destination | |
types are perfectly capable of representing the value. | |
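A scalar sketch of the problem, assuming plain C++ casts in place of the NEON intrinsics (the function names are hypothetical): narrowing to 32 bits first discards the high bits, while going through `double` keeps the magnitude.

```cpp
#include <cassert>
#include <cstdint>

// Scalar analogue of the truncating path: keep only the low 32 bits, then convert.
inline float narrow_then_convert(std::uint64_t v) {
  return static_cast<float>(static_cast<std::uint32_t>(v));
}

// Scalar analogue of the widening path: convert to double first, then narrow to float.
inline float convert_via_double(std::uint64_t v) {
  return static_cast<float>(static_cast<double>(v));
}
```

For the input `2^40`, the first function yields `0.0f` because the low 32 bits are all zero, while the second yields `2^40`, which a `float` represents exactly.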
The fix consists of first converting the (64 bit) integer value to a | |
64-bit float value (double), and only then narrowing to 32-bit (single | |
precision) float.",Pavel Labath,2023-11-21T16:09:13.228Z,NA,NA | |
1424 (https://gitlab.com/libeigen/eigen/-/merge_requests/1424),Update file GeneralMatrixVector.h,"### What does this implement/fix? | |
https://godbolt.org/z/W9ahfcfoa | |
This should convey all performance benefits if `PacketSize` is a power of two, and still be optimal in the unlikely event it's not.",Charles Schlosser,2023-11-21T19:50:48.412Z,NA,NA | |
1423 (https://gitlab.com/libeigen/eigen/-/merge_requests/1423),Static asserts to check for matching NumDimensions,"### What does this implement/fix? | |
The two Tensor constructors: | |
``` | |
template<typename OtherDerived> | |
EIGEN_DEVICE_FUNC | |
EIGEN_STRONG_INLINE Tensor(const TensorBase<OtherDerived, ReadOnlyAccessors>& other) | |
... | |
template<typename OtherDerived> | |
EIGEN_DEVICE_FUNC | |
EIGEN_STRONG_INLINE Tensor(const TensorBase<OtherDerived, WriteAccessors>& other) | |
``` | |
do not currently make any checks about the dimensions of `OtherDerived`. This means that code such as: | |
``` | |
void a_function(Eigen::Tensor<float, 2> const &a); | |
int main() { | |
Eigen::Tensor<float, 3> b; | |
a_function(b); | |
} | |
``` | |
compiles - see: https://godbolt.org/z/TT71bxzbh. However, this code will almost certainly crash at runtime due to mismatched dimensions. | |
This MR adds `EIGEN_STATIC_ASSERT()` to both constructors to check the number of dimensions. | |
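The idea can be sketched without Eigen. `ToyTensor` below is a hypothetical stand-in, not `Eigen::Tensor`, and it uses a plain `static_assert` (message omitted, C++17 form) where the MR uses `EIGEN_STATIC_ASSERT()`.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a fixed-rank tensor; only the shape is stored.
template <typename T, std::size_t Rank>
struct ToyTensor {
  std::array<std::size_t, Rank> dims{};

  ToyTensor() = default;

  // Converting constructor: a rank mismatch is rejected at compile time
  // instead of being discovered at runtime.
  template <typename U, std::size_t OtherRank>
  ToyTensor(const ToyTensor<U, OtherRank>& other) {
    static_assert(OtherRank == Rank);
    for (std::size_t i = 0; i < Rank; ++i) dims[i] = other.dims[i];
  }
};
```

With this check in place, passing a `ToyTensor<float, 3>` where a `ToyTensor<float, 2>` is expected no longer compiles, while same-rank conversions keep working.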
I am not sure if `operator=` in Tensor.h should have the same checks. In my own codebase, adding the above was sufficient to catch dimension errors even in assignment expressions.",Tobias Wood,2023-11-22T23:29:29.598Z,NA,NA | |
1425 (https://gitlab.com/libeigen/eigen/-/merge_requests/1425),Fix typecasting for arm32,Was broken by !1422.,Antonio Sánchez,2023-11-23T00:47:51.775Z,NA,NA | |
1419 (https://gitlab.com/libeigen/eigen/-/merge_requests/1419),Ensure that mc is not smaller than Traits::nr,"This fixes #2747 for the test case in that issue. | |
### Reference issue | |
Fixes #2747 | |
### What does this implement/fix? | |
Ensures that mc >= Traits::nr before checking that mc is a multiple of Traits::nr",Drew Lewis,2023-11-28T22:50:18.109Z,NA,NA | |
1429 (https://gitlab.com/libeigen/eigen/-/merge_requests/1429),Apply clang-format,"### What does this implement/fix? | |
Apply clang-format to the whole code base in one commit. I ran this precise command, and I had to run it several times `find Eigen unsupported/Eigen -name ""*"" ! -name ""*.txt"" -type f -print0 | xargs -0 -n 1 clang-format -i --verbose`. | |
This is because some of the files have long macro definitions that need to be split over multiple lines, and clang-format will only split one line on each run.",Tobias Wood,2023-12-01T00:56:36.010Z,NA,NA | |
1430 (https://gitlab.com/libeigen/eigen/-/merge_requests/1430),Add .git-blame-ignore-revs file.,"Contains the large clang-format change in !1429. | |
To apply locally, use | |
``` | |
git config blame.ignoreRevsFile .git-blame-ignore-revs | |
```",Antonio Sánchez,2023-12-01T01:05:51.577Z,NA,NA | |
1431 (https://gitlab.com/libeigen/eigen/-/merge_requests/1431),Fix scalar_logistic_function overflow for complex inputs.,"For complex inputs, direct comparison to `inf` often fails for large | |
inputs because `(inf + inf*j) != (inf + 0*j)`. We instead need to | |
extract the real value and compare _that_ to `inf`. | |
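A small standard-library sketch of the comparison issue, using `std::complex<double>` rather than Eigen's functor; the helper name `real_part_is_inf` is hypothetical.

```cpp
#include <cassert>
#include <complex>
#include <limits>

// True when the real part of z is +infinity, regardless of the imaginary part.
inline bool real_part_is_inf(const std::complex<double>& z) {
  return z.real() == std::numeric_limits<double>::infinity();
}
```

Comparing `(inf, inf)` with `(inf, 0)` via `operator==` reports them as different because the imaginary parts differ, whereas `real_part_is_inf` returns true for both.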
Updated the array_cwise test to catch this issue.",Antonio Sánchez,2023-12-05T18:21:05.414Z,NA,NA | |
1432 (https://gitlab.com/libeigen/eigen/-/merge_requests/1432),"Clang-format tests, examples, libraries, benchmarks, etc.","Did a large clang-format-17 on all files, then manually reverted any weird changes and the following: | |
- cmake-files | |
- doxygen files | |
- text files | |
- bash scripts | |
- misc non-source files",Antonio Sánchez,2023-12-05T21:22:56.342Z,NA,NA | |
1433 (https://gitlab.com/libeigen/eigen/-/merge_requests/1433),Add formatting change to .git-blame-ignore-revs,From !1432,Antonio Sánchez,2023-12-05T21:56:32.668Z,NA,NA | |
1434 (https://gitlab.com/libeigen/eigen/-/merge_requests/1434),Fix CUDA syntax error introduced by clang-format.,NA,Rasmus Munk Larsen,2023-12-05T22:21:47.658Z,NA,NA | |
1435 (https://gitlab.com/libeigen/eigen/-/merge_requests/1435),Protect kernel launch syntax from clang-format,"Clang-format (at least versions 13-18) introduces spaces in the second set of kernel launch brackets: | |
``` | |
run_on_gpu_meta_kernel<<<Grids, Blocks> > >(ker, n, d_in, d_out); | |
``` | |
which causes a syntax error. We need to protect this. All other instances in the codebase seem to be unaffected.",Antonio Sánchez,2023-12-05T22:42:53.178Z,NA,NA | |
1436 (https://gitlab.com/libeigen/eigen/-/merge_requests/1436),Add internal ctz/clz implementation.,"This will be useful for generating random numbers, and for detecting | |
pointer alignment.",Antonio Sánchez,2023-12-11T21:03:10.699Z,NA,NA | |
1439 (https://gitlab.com/libeigen/eigen/-/merge_requests/1439),fix msvc clz,"### What does this implement/fix? | |
`_BitScanReverse` does not return the leading zeros. It returns the **index** of the first (MSB to LSB) set bit! `_BitScanForward/ctz` does the same thing. | |
For example: `0001000000010000`. The number of leading zeros is `3`. The index of the first set (MSB to LSB) bit is `12`. `16-1-12 == 3`. By contrast, the number of trailing zeros is `4`. The index of the first set bit (LSB to MSB) is `4`. | |
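The arithmetic in this example can be checked with a portable sketch. The loops below only mimic what the bit-scan intrinsics report (an index, not a count); they are not Eigen's implementation, the function names are hypothetical, and a nonzero 16-bit input is assumed.

```cpp
#include <cassert>
#include <cstdint>

// Index (bit 0 = LSB) of the highest set bit; x must be nonzero.
inline int highest_set_bit_index(std::uint16_t x) {
  int idx = 0;
  while (x >>= 1) ++idx;
  return idx;
}

// Index (bit 0 = LSB) of the lowest set bit; x must be nonzero.
inline int lowest_set_bit_index(std::uint16_t x) {
  int idx = 0;
  while ((x & 1u) == 0) {
    x >>= 1;
    ++idx;
  }
  return idx;
}

// Leading zeros follow from the index of the highest set bit: width - 1 - index.
inline int count_leading_zeros16(std::uint16_t x) {
  return 16 - 1 - highest_set_bit_index(x);
}

// Trailing zeros equal the index of the lowest set bit.
inline int count_trailing_zeros16(std::uint16_t x) {
  return lowest_set_bit_index(x);
}
```

For `0x1010` (binary `0001000000010000`) the highest set bit has index 12, so there are 3 leading zeros, and the lowest set bit has index 4, so there are 4 trailing zeros.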
Clear as mud?",Charles Schlosser,2023-12-13T18:51:16.208Z,NA,NA | |
1428 (https://gitlab.com/libeigen/eigen/-/merge_requests/1428),Set up clang-format in CI,"### What does this implement/fix? | |
Add a stage to the CI to check the formatting is correct. !1429 needs to be merged first.",Tobias Wood,2023-12-13T21:08:08.225Z,NA,NA | |
1441 (https://gitlab.com/libeigen/eigen/-/merge_requests/1441),Fix up clang-format CI.,"We want non-interactive mode, and seem to need `clang-format` | |
installed as well.",Antonio Sánchez,2023-12-14T00:15:12.358Z,NA,NA | |
1446 (https://gitlab.com/libeigen/eigen/-/merge_requests/1446),Remove c++11 from ctz/clz,Accidentally patched c++11 code.,Antonio Sánchez,2023-12-21T00:45:36.822Z,NA,NA | |
1409 (https://gitlab.com/libeigen/eigen/-/merge_requests/1409),Fix compiler warnings in 3.4,"### What does this implement/fix? | |
Suppresses numerous compiler warnings that had easy fixes and includes major bug fixes in `Memory.h`",Charles Schlosser,2023-12-21T00:57:22.514Z,NA,NA | |
1448 (https://gitlab.com/libeigen/eigen/-/merge_requests/1448),Fix MSAN failures.,Two instances of use of uninitialized memory - matrices weren't initialized.,Antonio Sánchez,2023-12-22T03:18:47.518Z,NA,NA | |
1449 (https://gitlab.com/libeigen/eigen/-/merge_requests/1449),Fix GPU+clang+asan.,"Function pointers can result in illegal memory accesses on device, | |
but lambdas work.",Antonio Sánchez,2024-01-04T17:29:37.964Z,NA,NA | |
1447 (https://gitlab.com/libeigen/eigen/-/merge_requests/1447),Fix various asan errors.,"Fix various asan errors. | |
- `ComplexSchur`: in some edge cases, `iu` can have a value of 1, leading to an index-out-of-bounds error (e.g. `eigensolver_complex_2 s1703113117`) | |
- `thread_non_blocking_thread_pool`: order of destruction left destroyed local references in the thread-pool causing use-after-scope errors | |
- `TensorForcedEval`: internal temporary buffer wasn't cleaned up (e.g. `cxx11_tensor_block_evaluator_6` ) | |
Also silenced an unused variable warning in `MarketIO`. | |
After these changes, all our CPU tests seem to pass with clang asan/ubsan.",Antonio Sánchez,2024-01-08T00:13:17.838Z,NA,NA | |
1450 (https://gitlab.com/libeigen/eigen/-/merge_requests/1450),Clean up stableNorm,"### What does this implement/fix? | |
Suppresses irritating maybe-uninitialized warning on gcc. | |
`../Eigen/src/Core/CoreEvaluators.h:1004:48: warning: '*((void*)&<anonymous> +32)' may be used uninitialized in this function [-Wmaybe-uninitialized]` | |
Breaking the computation into blocks doesn't appear to add value, and appears to make an extra copy of the entire expression. The bulk of the work is this line `Scalar maxCoeff = bl.cwiseAbs().maxCoeff();` which is already lazily evaluated, vectorized, etc. The warning is probably caused by the object in `const Ref` which is not initialized.",Charles Schlosser,2024-01-08T23:28:41.973Z,NA,NA | |
1445 (https://gitlab.com/libeigen/eigen/-/merge_requests/1445),Add factor getters to Cholmod LLT/LDLT,"The Cholmod LLT and LDLT system solvers don't expose the L, Lᵀ, or D factors. This MR fixes that.",Tyler Veness,2024-01-09T02:14:05.756Z,NA,NA | |
1452 (https://gitlab.com/libeigen/eigen/-/merge_requests/1452),Doc: Fix Basic slicing examples minor issues,"### What does this implement/fix? | |
Minor issues in `doc/TutorialSlicingIndexing.dox` basic slicing examples",Arnaud Billon,2024-01-09T18:17:08.207Z,NA,NA | |
1438 (https://gitlab.com/libeigen/eigen/-/merge_requests/1438),Improve documentation of SparseLU,"The documentation of SparseLU was quite sparse. | |
The relation between `compute`, `analyzePattern` and `factorize` was not really clear; this MR clarifies it. | |
It also includes cosmetic changes and some fixes, such as the doc of `factorize` referring to an internal `info` that is not reachable by the user.",Nicolas Cornu,2024-01-09T18:18:06.486Z,NA,NA | |
1453 (https://gitlab.com/libeigen/eigen/-/merge_requests/1453),Fix TensorForcedEval in the case of the evaluator being copied.,"Copying the evaluator ends up copying the temporary buffer, which means | |
we can't actually deallocate/reallocate it, or it will lead to double-free | |
or memory access issues. If the buffer is to be shared between evaluators, | |
it must be a `shared_ptr`. | |
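A rough sketch of the ownership idea, assuming the buffer is filled only after the evaluator has been copied. `ToyEvaluator` is a hypothetical stand-in and does not mirror the real `TensorForcedEval` evaluator.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for an evaluator that owns a temporary buffer.
struct ToyEvaluator {
  // All copies share one buffer object, so storage allocated through any
  // copy is visible to the others and is released exactly once.
  std::shared_ptr<std::vector<float>> buffer =
      std::make_shared<std::vector<float>>();

  void allocate(std::size_t n) { buffer->assign(n, 0.0f); }
  std::size_t size() const { return buffer->size(); }
};
```

Copying a `ToyEvaluator` and then calling `allocate` on the copy leaves both objects looking at the same storage, and the storage is freed when the last copy goes away.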
Existing usages assume the temporary buffer is populated _after_ copying, | |
so we can't just restrict the temporary buffer per instance.",Antonio Sánchez,2024-01-10T00:45:40.503Z,NA,NA | |
1455 (https://gitlab.com/libeigen/eigen/-/merge_requests/1455),[ROCm] MI300 related test support,"### What does this implement/fix? | |
MI300 related test support for Eigen | |
### Additional information | |
MI300 series contain gfx940, gfx941 and gfx942",Chao Chen,2024-01-11T23:46:26.269Z,NA,NA | |
1456 (https://gitlab.com/libeigen/eigen/-/merge_requests/1456),check pointers before freeing,"### Reference issue | |
fixes #2758 | |
### What does this implement/fix? | |
This adds a probably negligible cost to free'ing allocated memory, with the advantages described in the referenced issue.",Charles Schlosser,2024-01-12T06:09:46.991Z,NA,NA | |
1458 (https://gitlab.com/libeigen/eigen/-/merge_requests/1458),Fix stableNorm when input is zero-sized.,Edge-case breakage was accidentally introduced with the change in !1450.,Antonio Sánchez,2024-01-16T18:31:07.544Z,NA,NA | |
1443 (https://gitlab.com/libeigen/eigen/-/merge_requests/1443),Update CI with testing framework from eigen_ci_cross_testing.,"Merging in test CI from https://gitlab.com/libeigen/eigen_ci_cross_testing. | |
Now that all tests are green, we should move it to this main repo. | |
Tested on the main Eigen repo here: https://gitlab.com/libeigen/eigen/-/pipelines/1106209054",Antonio Sánchez,2024-01-19T17:55:10.610Z,NA,NA | |
1459 (https://gitlab.com/libeigen/eigen/-/merge_requests/1459),add missing constexpr qualifier,NA,Nuno Gonçalves,2024-01-19T18:49:55.011Z,NA,NA | |
1457 (https://gitlab.com/libeigen/eigen/-/merge_requests/1457),Add asserts for .chip,"### What does this implement/fix? | |
This adds static and dynamic asserts that the chipping dimension and offset are valid. | |
### Additional information | |
I'm very open to comments about whether this is the best approach, and what tests (if any) should be added.",Tobias Wood,2024-01-19T19:18:19.813Z,NA,NA | |
1460 (https://gitlab.com/libeigen/eigen/-/merge_requests/1460),"Revert ""Clean up stableNorm""","Revert ""Clean up stableNorm"" | |
This reverts commit a1a96fafde4d8d6e30a2677b669a31526b3da78d. | |
Leads to performance regression on large vectors which do not fit in cache. | |
```cpp | |
#include <Eigen/Geometry> | |
#include <benchmark/benchmark.h> | |
void BM_StableNormVector(benchmark::State& state) { | |
size_t offset = state.range(0); | |
size_t n = state.range(1); | |
Eigen::VectorXf vec = Eigen::VectorXf::Random(n + offset); | |
float norm; | |
for (auto _ : state) { | |
benchmark::DoNotOptimize(norm = vec.segment(offset, n).stableNorm()); | |
} | |
} | |
void BM_StableNormMatrix(benchmark::State& state) { | |
size_t offset = state.range(0); | |
size_t n = state.range(1); | |
Eigen::MatrixXf vec = Eigen::MatrixXf::Random(n + offset, n + offset); | |
float norm; | |
for (auto _ : state) { | |
benchmark::DoNotOptimize(norm = vec.block(offset, offset, n, n).stableNorm()); | |
} | |
} | |
BENCHMARK(BM_StableNormVector)->ArgsProduct({ | |
benchmark::CreateDenseRange(0, 7, /*step=*/1), | |
benchmark::CreateRange(1, 1<<26, /*multi=*/2)}); | |
BENCHMARK(BM_StableNormMatrix)->ArgsProduct({ | |
benchmark::CreateDenseRange(0, 7, /*step=*/1), | |
benchmark::CreateRange(1, 1<<15, /*multi=*/2)}); | |
```",Antonio Sánchez,2024-01-19T20:22:48.716Z,NA,NA | |
1461 (https://gitlab.com/libeigen/eigen/-/merge_requests/1461),Fix unused warnings in failtest.,NA,Antonio Sánchez,2024-01-19T23:57:00.415Z,NA,NA | |
1462 (https://gitlab.com/libeigen/eigen/-/merge_requests/1462),Allow specifying a temporary directory for fileio outputs.,"This allows us to run these tests on systems that cannot write to the current directory. | |
Tested on both windows and linux.",Antonio Sánchez,2024-01-20T00:55:15.897Z,NA,NA | |
1463 (https://gitlab.com/libeigen/eigen/-/merge_requests/1463),"Revert ""Add asserts for .chip""",Broken tests,Antonio Sánchez,2024-01-20T05:14:42.834Z,NA,NA | |
1444 (https://gitlab.com/libeigen/eigen/-/merge_requests/1444),[Compressed Storage] Use smaller type of Index & StorageIndex for determining maximum size during resize.,"### What does this implement/fix? | |
Eigen::SPQR sets the `StorageIndex` type always to [`SuiteSparse_long`][1]. | |
When `Eigen::Index` is set to a smaller integer type (eg `int32_t` via `-DEIGEN_DEFAULT_DENSE_INDEX_TYPE=int32_t`), | |
then the call to `std::min<Index>(NumTraits<StorageIndex>::highest())` will overflow | |
to a negative value which then results in `internal::throw_std_bad_alloc()` being called. | |
[1]: https://gitlab.com/libeigen/eigen/-/blob/master/Eigen/src/SPQRSupport/SuiteSparseQRSupport.h?ref_type=heads#L75 | |
The proposed change uses the smaller index type to query `highest()`. | |
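A minimal sketch of that idea using only the standard library: `std::numeric_limits` stands in for `NumTraits<...>::highest()`, and the names `SmallerIndex` and `max_representable_size` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <type_traits>

// The narrower of two integer types (ties go to A).
template <typename A, typename B>
using SmallerIndex =
    typename std::conditional<(sizeof(A) <= sizeof(B)), A, B>::type;

// Largest size representable in both index types. The wider maximum is never
// converted into the narrower type, so no overflow can occur.
template <typename Index, typename StorageIndex>
constexpr SmallerIndex<Index, StorageIndex> max_representable_size() {
  return std::numeric_limits<SmallerIndex<Index, StorageIndex>>::max();
}
```

With a 32-bit `Index` and a 64-bit `StorageIndex`, the result is the 32-bit maximum, so the value stays positive in either type.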
### Additional information | |
Built and ran Eigen unit tests. Additionally, this fix also works in our codebase.",Andreas Forster,2024-01-22T00:35:32.548Z,NA,NA | |
1451 (https://gitlab.com/libeigen/eigen/-/merge_requests/1451),"SPQR: Fix build error, Index/StorageIndex mismatch.","### What does this implement/fix? | |
This fixes a build error (Apple clang version 15.0.0 (clang-1500.1.0.2.5)) | |
in `SPQR::compute()` with `_MatrixType=SparseMatrix<double>`. | |
Before this change, the call to SuiteSparseQR() passed a mix of StorageIndex
and Index to function parameters of the same templated type (`Int`). Template
argument deduction then fails even if `long` and `long long` have the same width, because they are still distinct types.
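The deduction failure can be reproduced without SuiteSparse. The sketch below uses a hypothetical function `sum_same_type` in place of the real `SuiteSparseQR` signature. The only point is that both parameters must deduce to the same `Int`.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for a signature whose parameters share one
// deduced template type.
template <typename Int>
Int sum_same_type(Int a, Int b) {
  return a + b;
}

// long and long long are distinct types even where they have the same width.
constexpr bool kDistinctTypes = !std::is_same<long, long long>::value;

inline long long demo_deduction() {
  long index = 2;               // plays the role of Index
  long long storage_index = 3;  // plays the role of StorageIndex
  // sum_same_type(index, storage_index);  // error: conflicting types for Int
  // Converting one argument so both share a type lets deduction succeed.
  return sum_same_type(static_cast<long long>(index), storage_index);
}
```

Converting at the call site is one way to resolve the mismatch. Naming the template argument explicitly would be another.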
### Additional information | |
**Note:** This works in my project, but I'm having trouble building/running the Eigen SPQR tests locally, hope they're enabled in CI :smile: | |
Abbreviated compiler diagnostic for reference: | |
``` | |
In file included from /.../Eigen/SPQRSupport:36: | |
/.../Eigen/src/SPQRSupport/SuiteSparseQRSupport.h:146:14: error: no matching function for call to 'SuiteSparseQR' | |
m_rank = SuiteSparseQR<Scalar>(m_ordering, pivotThreshold, col, &A, &m_cR, &m_E, &m_H, &m_HPinv, &m_HTau, &m_cc); | |
^~~~~~~~~~~~~~~~~~~~~ | |
/.../CoMISo/NSolver/NewtonSolver.cc:879:20: note: in instantiation of member function 'Eigen::SPQR<Eigen::SparseMatrix<double>>::compute' requested here | |
spqr_solver_.compute(_KKT); | |
^ | |
/opt/homebrew/include/SuiteSparseQR.hpp:501:55: note: candidate template ignored: deduced conflicting types for parameter 'Int' ('Index' (aka 'long') vs. 'StorageIndex' (aka 'long long')) | |
template <typename Entry, typename Int = int64_t> Int SuiteSparseQR | |
^ | |
/opt/homebrew/include/SuiteSparseQR.hpp:485:55: note: candidate function template not viable: requires 9 arguments, but 10 were provided | |
template <typename Entry, typename Int = int64_t> Int SuiteSparseQR | |
[...] | |
```",Martin Heistermann,2024-01-22T17:37:37.738Z,NA,NA
1466 (https://gitlab.com/libeigen/eigen/-/merge_requests/1466),Chipping Asserts v2,"### What does this implement/fix? | |
Asserts on the dimension index for chipping operations. | |
I removed the check on the tensor dimensions as I understand expressions do not store their dimensions, and hence I don't think it is currently possible to check in the general case.",Tobias Wood,2024-01-22T18:08:24.523Z,NA,NA | |
1454 (https://gitlab.com/libeigen/eigen/-/merge_requests/1454),Add half and quarter vector support to HVX architecture,"<!-- | |
Thanks for contributing a merge request! Please name and fully describe your MR as you would for a commit message. | |
If the MR fixes an issue, please include ""Fixes #issue"" in the commit message and the MR description. | |
In addition, we recommend that first-time contributors read our [contribution guidelines](https://eigen.tuxfamily.org/index.php?title=Contributing_to_Eigen) and [git page](https://eigen.tuxfamily.org/index.php?title=Git), which will help you submit a more standardized MR. | |
Before submitting the MR, you also need to complete the following checks: | |
- Make one PR per feature/bugfix (don't mix multiple changes into one PR). Avoid committing unrelated changes. | |
- Rebase before committing | |
- For code changes, run the test suite (at least the tests that are likely affected by the change). | |
See our [test guidelines](https://eigen.tuxfamily.org/index.php?title=Tests). | |
- If possible, add a test (both for bug-fixes as well as new features) | |
- Make sure new features are documented | |
Note that we are a team of volunteers; we appreciate your patience during the review process. | |
Again, thanks for contributing! --> | |
### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
Since HVX uses 128-byte (1024-bit) vector registers, full-size vectorization cannot benefit data smaller than 128 bytes. This change allows using half and quarter of the HVX vector for vectorization through the ""half"" type in ""packet_traits"" and ""unpacket_traits"". For small matrix multiplications (matrix sizes ranging from 8 to 31 single-precision float elements), this change yields a 1.37X-3.1X speedup on Snapdragon XR2 Gen 2.
### Additional information | |
<!--Any additional information you think is important.--> | |
The change touches only the HVX-specific packet math file, so it should not impact other architectures.",Cheng Wang,2024-01-22T21:23:21.810Z,NA,NA
1467 (https://gitlab.com/libeigen/eigen/-/merge_requests/1467),Fix compile-time error caused by chip static asserts,"Introduced with the recent compile-time checks in !1466. The test previously had a runtime check on the dimension, but with static assertions, this now fails at compile time.
/cc @spinicist",Antonio Sánchez,2024-01-22T21:40:57.238Z,NA,NA | |
1469 (https://gitlab.com/libeigen/eigen/-/merge_requests/1469),Remove explicit specialization of member function.,"Even though clang allows it, it's against the standard and breaks gcc/msvc.",Antonio Sánchez,2024-01-23T16:40:44.491Z,NA,NA | |
1470 (https://gitlab.com/libeigen/eigen/-/merge_requests/1470),Formatting.,NA,Antonio Sánchez,2024-01-23T16:56:28.486Z,NA,NA | |
1468 (https://gitlab.com/libeigen/eigen/-/merge_requests/1468),Fix arm32 issues.,"1) `fpclassify` is only properly defined for float/double, not Eigen::half - so replace with explicit check for values in subnormal range. | |
2) arm32's mlaq is not a true FMA, so accuracy is lost in range reduction - use alternate path.",Antonio Sánchez,2024-01-23T22:04:56.766Z,NA,NA | |
1471 (https://gitlab.com/libeigen/eigen/-/merge_requests/1471),LAPACK CPU time functions.,Followed the naming convention of LAPACK for these configurable files.,Antonio Sánchez,2024-01-23T23:58:58.761Z,NA,NA | |
1477 (https://gitlab.com/libeigen/eigen/-/merge_requests/1477),Remove simple relicense script.,"It's no longer necessary, and nothing more than a search-and-replace call.",Antonio Sánchez,2024-01-25T05:50:37.720Z,NA,NA | |
1473 (https://gitlab.com/libeigen/eigen/-/merge_requests/1473),Update documentation of lapack second/dsecnd.,NA,Antonio Sánchez,2024-01-25T17:51:39.899Z,NA,NA | |
1478 (https://gitlab.com/libeigen/eigen/-/merge_requests/1478),Fix bug in checking subnormals.,Silly comparison mistake.,Antonio Sánchez,2024-01-25T17:52:09.395Z,NA,NA | |
1481 (https://gitlab.com/libeigen/eigen/-/merge_requests/1481),Fix CI for clang-6 when cross-compiled.,"We need consistent versions of GLIBC, otherwise tests fail with: | |
``` | |
lib/x86_64-linux-gnu/libm.so.6: version `GLIBC_2.29' not found | |
```",Antonio Sánchez,2024-01-27T05:13:21.804Z,NA,NA | |
1479 (https://gitlab.com/libeigen/eigen/-/merge_requests/1479),Fix busted formatting in Eigen::Tensor README.md.,It appears the documentation markdown for Eigen::Tensor was unintentionally autoformatted in some way that broke a lot of the markdown. This MR fixes it.,Pascal Getreuer,2024-01-29T08:12:44.321Z,NA,NA | |
1475 (https://gitlab.com/libeigen/eigen/-/merge_requests/1475),Remove MoreVectorization.,"The `pasin` implementation has already been generalized and moved to GenericPacketMath, | |
so including this just leads to duplicate definitions and ODR violations.",Antonio Sánchez,2024-01-29T18:48:29.054Z,NA,NA | |
1483 (https://gitlab.com/libeigen/eigen/-/merge_requests/1483),Use stableNorm in ComplexEigenSolver.,"This doesn't severely impact runtime, and leads to more stable results.
We _might_ want to _always_ use `stableNorm()` generally for algorithm | |
internals in Eigen. | |
Fixes #2773.",Antonio Sánchez,2024-01-29T23:46:24.199Z,NA,NA | |
1474 (https://gitlab.com/libeigen/eigen/-/merge_requests/1474),Remove Skyline.,"It doesn't build, hasn't for years, and would need a complete rewrite to get it to build again. | |
It's also not tested at all.",Antonio Sánchez,2024-01-30T00:13:18.357Z,NA,NA | |
1482 (https://gitlab.com/libeigen/eigen/-/merge_requests/1482),Fix preshear transformation.,"Fixes #2777. The `preshear` function seems to have always used an invalid constructor | |
internally, and has been broken for a while. Fixed the implementation and added a test.",Antonio Sánchez,2024-01-30T06:37:34.491Z,NA,NA | |
1414 (https://gitlab.com/libeigen/eigen/-/merge_requests/1414),Implement plog_complex,"This is the first attempt to deal with vectorized complex functions described in issue #2635 | |
In this commit, I implemented `plog_complex`. I tested it on my Windows laptop and on my MacBook M1. All the edge cases work for `std::complex<double>`, but for `std::complex<float>` I get some differences in the last digits of pi (or of a fraction of pi) for some corner cases.
P.S. This is my first merge request ever in Eigen, any comments are welcome.",Damiano Franzò,2024-01-30T19:16:27.548Z,NA,NA | |
1476 (https://gitlab.com/libeigen/eigen/-/merge_requests/1476),Fix a bunch of ODR violations.,"- `indexed_view.cpp`: | |
- `IndexPair` conflicts with the version in `TensorMeta`. | |
- `<valarray>` has limited use here, and contains a `min()` call that conflicts with the macro in `main.h` | |
- `packetmath.cpp`/`packetmath_test_shared.h`: | |
- explicitly defining `pxor`, `pandnot` and `por` for these float types leads | |
to explicit specializations after instantiations, since they are used elsewhere first. | |
Not sure what the purpose is, since tests seem to pass without these. | |
- `TensorGlobalFunctions.h`: | |
- `betainc` clashes with the one in `SpecialFunctionsArrayAPI.h`, we need to explicitly specialize for `TensorBase`.",Antonio Sánchez,2024-01-30T22:38:44.567Z,NA,NA | |
1437 (https://gitlab.com/libeigen/eigen/-/merge_requests/1437),improve random,"### Reference issue
Fixes #2749 | |
### What does this implement/fix? | |
Currently, one call of `std::rand` is used to provide the target scalar with random bits. This appears to work well enough for 32 bit scalars if `RAND_MAX` is equal to `INT_MAX`, but this is not the case on MSVC. This is insufficient on most platforms if the target scalar is 64 bits. | |
This implementation calls `std::rand` as many times as necessary to fill the target scalar with the required entropy. For integers, this is the entire scalar. For floating points, this is equal to the mantissa bits. | |
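The approach can be sketched as follows. This is not the Eigen implementation: the function names are hypothetical, and the sketch uses only the low 15 bits of each `std::rand` call, because `RAND_MAX >= 32767` is all the standard guarantees.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>

// Collect num_bits (0..64) random bits by calling std::rand() repeatedly.
inline std::uint64_t random_bits(int num_bits) {
  assert(num_bits >= 0 && num_bits <= 64);
  const int kBitsPerCall = 15;
  std::uint64_t result = 0;
  for (int filled = 0; filled < num_bits; filled += kBitsPerCall) {
    result = (result << kBitsPerCall) |
             static_cast<std::uint64_t>(std::rand() & 0x7FFF);
  }
  // Drop surplus bits; skip the shift when all 64 bits are wanted.
  if (num_bits < 64) result &= (std::uint64_t(1) << num_bits) - 1;
  return result;
}

// Double in [0, 1) built from 53 random mantissa bits.
inline double random_unit_double() {
  return static_cast<double>(random_bits(53)) * (1.0 / 9007199254740992.0);  // 2^-53
}
```

An integer scalar would request its full width, while a floating-point scalar requests only as many bits as its mantissa holds.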
### Additional information",Charles Schlosser,2024-01-31T08:16:30.285Z,NA,NA
1486 (https://gitlab.com/libeigen/eigen/-/merge_requests/1486),Fix gcc-6 bug in the rand test.,"There seems to be a silly optimization bug in gcc-6 that completely elides `r`, causing the check `r >= x` to compare garbage values and fail. Marking the function `noinline` forces the compiler to keep the return value and lets the test pass.",Antonio Sánchez,2024-02-03T02:37:56.708Z,NA,NA | |
1487 (https://gitlab.com/libeigen/eigen/-/merge_requests/1487),fix skew symmetric test,"### Reference issue
### What does this implement/fix?
Since we changed the random number generation facilities in Eigen, we've been getting failures in the skew-symmetric test. | |
``` | |
2/2 Test #206: skew_symmetric_matrix3_2 .........Child aborted***Exception: 0.19 sec | |
Initializing random number generator with seed 1707029388 | |
Repeating each test 10 times | |
Difference too large wrt tolerance 0.001, relative error is: 2.33333 | |
Test skewSymmetricMultiplication(MatrixXf(3, internal::random<int>(1, 320))) failed in ../test/skew_symmetric_matrix3.cpp (120) | |
verifyIsApprox(m1.transpose() * (sk * m1), (m1.transpose() * sk) * m1) | |
``` | |
This test randomly fails on several versions of clang, with various vectorization options. I think this has nothing to do with the compiler or the architecture -- the test fails when we get unlucky with the random size. I can reproduce the failure on MSVC. | |
The diagonal of the product `m^T * S * m` where `m` is a `3 x k` matrix and `S` is a `3 x 3` skew-symmetric matrix is zero. This is problematic when `k == 1` (the product is a scalar) and we are testing for equality, as catastrophic cancellation renders the comparison tricky. | |
We can ""fix"" the test by not testing for the not-so-trivial case where `k == 1`. | |
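The zero diagonal follows from a one-line identity. For any vector `x` and skew-symmetric `S` (so `S^T = -S`), the scalar `x^T S x` equals its own transpose:

```latex
x^T S x = (x^T S x)^T = x^T S^T x = -\,x^T S x
\quad\Longrightarrow\quad x^T S x = 0
```

For `k == 1` both sides of the comparison are therefore zero in exact arithmetic, and a relative-error check ends up dividing rounding noise by rounding noise. That is consistent with the large relative error in the log above.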
### Additional information",Charles Schlosser,2024-02-05T04:27:27.710Z,NA,NA
1485 (https://gitlab.com/libeigen/eigen/-/merge_requests/1485),Fix PPC rand and other failures.,"The tests previously indirectly forced all random integer values to be in the range [-10, 10] | |
for tests on PPC by setting a macro. This broke the new `rand` tests for checking the full | |
range of values. | |
Removing that hack, however, caused other failures for tests involving signed integer overflows | |
(e.g. matrix multiplication of random integer matrices). Signed int overflow in computations like | |
this is technically UB. The results are predictable/consistent on other platforms, but apparently | |
not on ppc64le. | |
Fixed the failing PPC tests by limiting the random integer inputs in a handful of select places. | |
We may still be overflowing signed ints in places, but at least the tests pass. | |
Also needed to add `ploadquad` for `Packet16(u)c`. Not sure how the packetmath tests are passing | |
in the CI, but it's failing when running locally on qemu due to the missing function.",Antonio Sánchez,2024-02-05T20:07:16.177Z,NA,NA | |
1488 (https://gitlab.com/libeigen/eigen/-/merge_requests/1488),"fix tests when scalar is bfloat16, half","### Reference issue
### What does this implement/fix?
Our funny 16 bit float types don't support constexpr. | |
### Additional information",Charles Schlosser,2024-02-07T04:50:11.921Z,NA,NA
1489 (https://gitlab.com/libeigen/eigen/-/merge_requests/1489),Fix the fuzz,"### Reference issue
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=66419 | |
### What does this implement/fix? | |
If `getRandomBits(/*int numRandomBits*/ 0)` is called (as is the case for 80-bit `random<long double>`), then `const BitsType mask = BitsType(-1) >> (ScalarBits - numRandomBits);` invokes undefined behavior as we are shifting an integer by its width. Possible remedies: | |
a) `if(numRandomBits == 0) return 0` Easy to read, but adds a branch. | |
b) `const BitsType mask = BitsType(-1) >> ((ScalarBits - numRandomBits) & (ScalarBits - 1));` If the shift is equal to ScalarBits, mask it out. Otherwise, do nothing. This results in much cleaner assembly. In fact, this changes nothing for Clang and gcc. My guess is that the major compilers account for this theoretical UB. Hopefully the fuzzer respects this. | |
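The two remedies can be compared in a small standalone sketch. `BitsType` and the shift expressions are taken from the description above. The fixed 32-bit width, the constant `kScalarBits`, and the function names are assumptions for illustration, not the actual Eigen code.

```cpp
#include <cassert>
#include <cstdint>

using BitsType = std::uint32_t;
constexpr int kScalarBits = 32;

// Remedy (a): branch on the zero-bit case, so the shift count is never 32.
inline BitsType mask_with_branch(int numRandomBits) {
  if (numRandomBits == 0) return 0;
  return BitsType(-1) >> (kScalarBits - numRandomBits);
}

// Remedy (b): wrap the shift count into [0, kScalarBits - 1].
inline BitsType mask_with_wrapped_shift(int numRandomBits) {
  return BitsType(-1) >> ((kScalarBits - numRandomBits) & (kScalarBits - 1));
}

inline void demo_masks() {
  // Both variants agree for every non-zero bit count.
  for (int n = 1; n <= kScalarBits; ++n) {
    assert(mask_with_branch(n) == mask_with_wrapped_shift(n));
  }
}
```

In this sketch the variants differ only for `numRandomBits == 0`: the wrapped shift becomes a shift by zero and produces an all-ones mask, whereas the branch returns zero. Whether that difference matters depends on how the caller uses the mask.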
### Additional information",Charles Schlosser,2024-02-07T04:52:19.694Z,NA,NA
1490 (https://gitlab.com/libeigen/eigen/-/merge_requests/1490),Fix UB in bool packetmath test.,"It's UB to explicitly load 0xFF into a bool for the select mask. Changing the mask to use valid boolean values causes `pselect` to fail, since the intel blend intrinsic checks the high bit instead of the low bit.",Antonio Sánchez,2024-02-09T19:46:46.679Z,NA,NA | |
1494 (https://gitlab.com/libeigen/eigen/-/merge_requests/1494),Fix segfault in CholmodBase::factorize() for zero matrix,NA,Tyler Veness,2024-02-12T03:27:57.352Z,NA,NA | |
1492 (https://gitlab.com/libeigen/eigen/-/merge_requests/1492),"Fix C++20 error, Arithmetic between different enumeration types","### Reference issue
### What does this implement/fix?
While compiling Eigen inside my project with C++20, I saw the following error:
``` | |
Arithmetic between different enumeration types ('Eigen::internal::gebp_traits<double, double, false, false, 0>::(unnamed enum at /Users/gjha/repos/eigen/Eigen/src/Core/products/GeneralBlockPanelKernel.h:407:3)' and 'Eigen::internal::gebp_traits<double, double>::(unnamed enum at /Users/gjha/repos/eigen/Eigen/src/Core/arch/NEON/GeneralBlockPanelKernel.h:106:3)') is deprecated | |
Arithmetic between different enumeration types ('Eigen::internal::gebp_traits<float, float, false, false, 0>::(unnamed enum at /Users/gjha/repos/eigen/Eigen/src/Core/products/GeneralBlockPanelKernel.h:407:3)' and 'Eigen::internal::gebp_traits<float, float>::(unnamed enum at /Users/gjha/repos/eigen/Eigen/src/Core/arch/NEON/GeneralBlockPanelKernel.h:46:3)') is deprecated | |
``` | |
By grouping the multiply operations, type promotion happens automatically, and the compiler no longer complains about `Arithmetic between different enumeration types`.
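A minimal sketch of the grouping idea, using hypothetical traits structs and enumerator values rather than the real `gebp_traits` members. The parentheses make each enumerator meet an integer operand first, so no operation has two different enumeration types as operands.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for unnamed enums living in two different classes.
struct GenericTraits { enum { nr = 4 }; };
struct NeonTraits { enum { mr = 3 }; };

// Deprecated in C++20, because both operands are different enumeration types:
//   NeonTraits::mr * GenericTraits::nr * sizeof(float)

// Grouped: (enum * size_t) yields size_t, then (enum * size_t) yields size_t.
inline std::size_t block_bytes() {
  return NeonTraits::mr * (GenericTraits::nr * sizeof(float));
}

inline void demo_block_bytes() {
  assert(block_bytes() == 12 * sizeof(float));
}
```

An explicit `static_cast<int>` on one enumerator would silence the warning as well. Grouping simply avoids the cast.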
### Additional information",Gautam Kumar,2024-02-12T08:14:46.320Z,NA,NA
1491 (https://gitlab.com/libeigen/eigen/-/merge_requests/1491),Apply clang-format to lapack/blas directories,NA,Antonio Sánchez,2024-02-12T19:43:31.116Z,NA,NA | |
1496 (https://gitlab.com/libeigen/eigen/-/merge_requests/1496),Fix division by zero UB in packet size logic.,"We never actually _use_ the UB result, but internal sanitizer tests fail because of this.",Antonio Sánchez,2024-02-12T21:01:21.071Z,NA,NA | |
1498 (https://gitlab.com/libeigen/eigen/-/merge_requests/1498),Remove r_cnjg due to conflicts with f2c.,"Remove r_cnjg due to conflicts with f2c. | |
The `r_cnjg` and `d_cnjg` functions come from libf2c. If linked with | |
a project that also links f2c, we run into duplicate symbol errors. | |
The functions are short and simple enough that we can inline them | |
wherever they are used.",Antonio Sánchez,2024-02-12T23:16:04.254Z,NA,NA | |
1499 (https://gitlab.com/libeigen/eigen/-/merge_requests/1499),Eliminate warning about writing bytes directly to non-trivial type.,"We're using the type as a bitmask anyway, so casting to `void*` should be fine.",Antonio Sánchez,2024-02-12T23:27:49.305Z,NA,NA
1500 (https://gitlab.com/libeigen/eigen/-/merge_requests/1500),Fixes 2780,Explicit Scalar conversion in the ternary expression (also fixes #2780),Alec Jacobson,2024-02-14T01:19:20.486Z,NA,NA
1497 (https://gitlab.com/libeigen/eigen/-/merge_requests/1497),Remove return int types from BLAS/LAPACK functions.,"The int return types are non-standard, mainly introduced by f2c. This causes | |
conflicts when loading Eigen BLAS in packages like SuiteSparse, which | |
may also have their own blas declarations. | |
Removed extraneous `ftlen` arguments introduced by f2c, which also | |
cause conflicts due to symbol mismatches compared to standard BLAS.",Antonio Sánchez,2024-02-14T19:51:38.047Z,NA,NA | |
1505 (https://gitlab.com/libeigen/eigen/-/merge_requests/1505),Disable float16 packet casting if native AVX512 f16 is available.,The packet ops aren't defined for the native type.,Antonio Sánchez,2024-02-14T20:05:01.552Z,NA,NA | |
1504 (https://gitlab.com/libeigen/eigen/-/merge_requests/1504),Fix failure on ARM with latest compilers.,"There's UB in pabsdiff if the difference overflows, so we limit the values.",Antonio Sánchez,2024-02-14T23:00:57.076Z,NA,NA | |
1507 (https://gitlab.com/libeigen/eigen/-/merge_requests/1507),Fix deflation in BDCSVD.,"Fixes #2491. | |
There were a couple issues with deflation44. Went back to the original paper to fix: | |
- swapped role of i, j to be more consistent with the paper | |
- use `hypot` instead of `sqrt(c^2 + r^2)` for better numeric stability | |
- original code copied over the wrong diagonal element, which led to incorrect comparisons in future deflation44 steps, and a non-strictly-increasing diagonal. | |
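The `hypot` item in the list above can be illustrated with a small standalone sketch. The function names are hypothetical and this is not the BDCSVD code. Squaring large inputs overflows to infinity, while `std::hypot` keeps the result finite.

```cpp
#include <cassert>
#include <cmath>

// Naive form: the intermediate squares can overflow (or underflow).
inline double naive_norm(double c, double r) {
  return std::sqrt(c * c + r * r);
}

// std::hypot computes the same quantity without undue overflow or underflow.
inline double stable_norm(double c, double r) {
  return std::hypot(c, r);
}

inline void demo_norms() {
  const double big = 1e200;  // big * big exceeds the range of double
  assert(std::isinf(naive_norm(big, big)));
  assert(std::isfinite(stable_norm(big, big)));
}
```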
With these changes, we now converge for the edge case of a large constant matrix - the diagonal | |
is now guaranteed to be strictly increasing as described in the paper, so we no longer need the | |
hack to avoid singularities in the perturbation step.",Antonio Sánchez,2024-02-15T23:53:59.609Z,NA,NA | |
1506 (https://gitlab.com/libeigen/eigen/-/merge_requests/1506),Use traits<Matrix>::Options instead of Matrix::Options.,"Not all objects have an `Options` attribute (e.g. Ref), but do store this
information in `traits<...>`. Fixes #2335.",Antonio Sánchez,2024-02-16T00:11:58.209Z,NA,NA
1503 (https://gitlab.com/libeigen/eigen/-/merge_requests/1503),Fix random for custom scalars that don't have constexpr digits().,"The `NumTraits<Scalar>::digits()` function isn't actually `constexpr` | |
in some cases. In particular, our default implementation uses `std::log2`, | |
which isn't `constexpr`. Unfortunately, this means we can't actually precompute | |
the number of mantissa bits in general.",Antonio Sánchez,2024-02-16T02:30:55.606Z,NA,NA | |
1509 (https://gitlab.com/libeigen/eigen/-/merge_requests/1509),Rename generic_fast_tanh_float to ptanh_float and move it to...,Fixes #2622,Rasmus Munk Larsen,2024-02-16T21:27:23.490Z,NA,NA | |
1501 (https://gitlab.com/libeigen/eigen/-/merge_requests/1501),Implement float pexp_complex,"In this MR I propose the SIMD complex function `pexp_complex` for `float`, which is the vectorized counterpart of `std::exp(std::complex<float>)`, as previously described in issue #2635 | |
Rather than creating a brand new `sincos` function, I slightly modified the current implementation of `sincos_float`. The idea is that the implementation of `pexp_complex` will remain the same after the implementation of `sincos_double`, as they will share the same interface. I may have performed too many comparisons for the IEEE edge cases, so I would appreciate it if you spot any inefficiencies.",Damiano Franzò,2024-02-17T00:27:18.708Z,NA,NA
1513 (https://gitlab.com/libeigen/eigen/-/merge_requests/1513),fix pexp_complex_test,"### Reference issue
### What does this implement/fix?
Check your c++ language standard privilege, bro! | |
### Additional information",Charles Schlosser,2024-02-17T03:08:24.579Z,NA,NA
1495 (https://gitlab.com/libeigen/eigen/-/merge_requests/1495),"JacobiSVD: get rid of m_scaledMatrix, m_adjoint, hopefully fix some compiler warnings","### Reference issue
### What does this implement/fix?
`m_scaledMatrix` served no purpose. It is not used at all when `rows == cols`. Even when `rows != cols`, it is not used when there is no preconditioner. Even _then_, it is immediately copied into the preconditioner. We can avoid all these pointless scenarios by passing `matrix/scale` as an expression into the preconditioner object and handling it if/when it is needed. | |
Similarly with `m_adjoint`. Since the constructor of `m_qr` copies the entire input matrix, this serves no purpose -- and costs a lot of memory! | |
Finally, we should initialize the runtime computation options with their compile time counterparts. Currently, we initialize `m_computeFullU` to false, and then assign it to its compile time value. I think this may prevent some compiler optimizations. | |
There are many easy improvements like this in the SVD module. Hopefully breaking it down into small chunks can make it more manageable. | |
### Additional information",Charles Schlosser,2024-02-17T03:41:55.682Z,NA,NA | |
1514 (https://gitlab.com/libeigen/eigen/-/merge_requests/1514),fix exp complex test: use int instead of index,"### Reference issue | |
### What does this implement/fix? | |
Nothing to see here. | |
### Additional information",Charles Schlosser,2024-02-17T03:55:33.391Z,NA,NA | |
1510 (https://gitlab.com/libeigen/eigen/-/merge_requests/1510),Fix real schur and polynomial solver.,"For certain inputs, the real schur decomposition might get stuck in a cycle. | |
Exceptional shifts are supposed to knock us out of that - but previously | |
they were only ever applied at iteration 10 and 30, which doesn't help if | |
the cycle starts after iteration 30. Modified to apply a shift every 16 iterations | |
(for reference, LAPACK seems to do it every 6 iterations). | |
Also added an assert in polynomial solver to verify that the schur decomposition | |
was successful. | |
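
The difference between the two shift schedules can be sketched as follows. This is an illustration only: the helper names are hypothetical and are not taken from the RealSchur code, and skipping iteration 0 is an assumption.

```cpp
// Sketch only: hypothetical helpers, not the actual RealSchur implementation.

// Old schedule: an exceptional shift only at iterations 10 and 30.
inline bool old_schedule_applies_shift(int iter) { return iter == 10 || iter == 30; }

// New schedule: an exceptional shift every 16 iterations.
// Skipping iteration 0 is an assumption made for this sketch.
inline bool new_schedule_applies_shift(int iter) { return iter > 0 && iter % 16 == 0; }

// Counts how many exceptional shifts fall in the half-open range [first, last).
template <typename Schedule>
int count_shifts(Schedule schedule, int first, int last) {
  int count = 0;
  for (int iter = first; iter < last; ++iter) {
    if (schedule(iter)) ++count;
  }
  return count;
}
```

With the old schedule, a cycle that begins after iteration 30 never sees another exceptional shift; with the periodic schedule it sees one within at most 16 iterations.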
Fixes #2633.",Antonio Sánchez,2024-02-17T15:22:12.026Z,NA,NA | |
1516 (https://gitlab.com/libeigen/eigen/-/merge_requests/1516),Fix GPU build for ptanh_float.,Make declarations consistent and enable on device.,Antonio Sánchez,2024-02-20T16:08:51.330Z,NA,NA | |
1517 (https://gitlab.com/libeigen/eigen/-/merge_requests/1517),Fix use of uninitialized memory in kronecker_product test.,We were multiplying uninitialized matrices.,Antonio Sánchez,2024-02-20T16:53:51.740Z,NA,NA | |
1511 (https://gitlab.com/libeigen/eigen/-/merge_requests/1511),Enable direct access for IndexedView.,"Added a .data() method, strides, and required helper methods when it does have direct access, as well as coeffRef methods for use in matmul. | |
Fixes #2358.",Antonio Sánchez,2024-02-20T18:21:46.463Z,NA,NA | |
1518 (https://gitlab.com/libeigen/eigen/-/merge_requests/1518),Make header guards in GeneralMatrixMatrix.h and Parallelizer.h consistent:...,Fix build error reported for commit 76e8c0455396446f8166c798da5efe879e010bdc.,Rasmus Munk Larsen,2024-02-20T20:03:18.889Z,NA,NA | |
1512 (https://gitlab.com/libeigen/eigen/-/merge_requests/1512),Add method signDeterminant() to QR and related decompositions.,Fixes #2520,Rasmus Munk Larsen,2024-02-20T23:44:29.104Z,NA,NA | |
1521 (https://gitlab.com/libeigen/eigen/-/merge_requests/1521),Fix crash in IncompleteCholesky when the input has zeros on the diagonal.,"The incomplete Cholesky algorithm adds a positive shift to the diagonal of the original matrix until the factorization succeeds. If the original matrix has an implicit zero on the diagonal, there is no space for such a value in the sparse data-structure and the code ends up asserting, trying to modify the `nnz+1`'st element of the value array. This change inspects the diagonal of the input and inserts explicit zero values before computing the factorization. | |
Fixes #2548",Rasmus Munk Larsen,2024-02-22T22:22:22.300Z,NA,NA | |
1520 (https://gitlab.com/libeigen/eigen/-/merge_requests/1520),"Remove ""using namespace Eigen"" from blas/common.h.","It causes havoc when you try to include the blas headers in other code, introducing symbol | |
collisions with `real`.",Antonio Sánchez,2024-02-22T22:51:43.100Z,NA,NA | |
1519 (https://gitlab.com/libeigen/eigen/-/merge_requests/1519),Change array_size result from enum to constexpr.,"This fixes comparisons, allowing us to compare values of type `Index` without compiler warnings. | |
Fixes #2782.",Antonio Sánchez,2024-02-22T22:52:26.407Z,NA,NA | |
1523 (https://gitlab.com/libeigen/eigen/-/merge_requests/1523),Speed up SparseQR.,"### Reference issue #2583 | |
### What does this implement/fix? | |
Speeds up SparseQR on the matrix in #2583 from 256s to 200s.",Rasmus Munk Larsen,2024-02-23T00:59:55.403Z,NA,NA | |
1524 (https://gitlab.com/libeigen/eigen/-/merge_requests/1524),Fix signed integer UB in random.,"There were a few remaining signed overflows. Required some re-writing of the | |
test to compute uniform bin probabilities.",Antonio Sánchez,2024-02-24T13:16:24.387Z,NA,NA | |
1525 (https://gitlab.com/libeigen/eigen/-/merge_requests/1525),Speed up sparse x dense dot product.,"This applies a small trick used in https://gitlab.com/libeigen/eigen/-/blame/master/Eigen/src/SparseCore/SparseDenseProduct.h#L69 to speed up sparse x dense dot products. Also applies the ""inline"" keyword to the methods in SparseDot.h for a small improvement. | |
This reduces the time for SparseQR applied to the test matrix in #2583 from ~200s to ~165s. | |
Profile before: https://gitlab.com/libeigen/eigen/-/snippets/3678759 | |
Profile after: https://gitlab.com/libeigen/eigen/-/snippets/3679225 | |
FYI: I also tried to manually vectorize the dot product, but with no success - it was slower, despite trying multiple variations.",Rasmus Munk Larsen,2024-02-24T19:13:34.295Z,NA,NA | |
1526 (https://gitlab.com/libeigen/eigen/-/merge_requests/1526),Fix MSVC GPU build.,"It doesn't like the out-of-line definition of `allocate()` because of the difference | |
between `Eigen::Index` and `JacobiSVD::Index` (which is equal to `Eigen::Index`). Silly | |
MSVC + NVCC.",Antonio Sánchez,2024-02-27T23:26:07.751Z,NA,NA | |
1527 (https://gitlab.com/libeigen/eigen/-/merge_requests/1527),delete shadowed typedefs,"### Reference issue | |
### What does this implement/fix? | |
**pedant:** noun ped·ant ˈpe-dᵊnt | |
**Synonyms** of **pedant:** | |
**1**. one who is unimaginative or who unduly emphasizes minutiae in the presentation or use of knowledge | |
**2**. one who makes a show of knowledge | |
**3**. a formalist or precisionist in teaching | |
**4**. a compiler or compiler engineer implementing the --pedantic option | |
\[**obsolete** : a male schoolteacher\] | |
**Antonyms** of **pedant:** | |
**1**. one who is crackin' or slappin' | |
**2**. one who is not doing too much, bruh | |
### Additional information",Charles Schlosser,2024-02-28T02:40:46.467Z,NA,NA | |
1528 (https://gitlab.com/libeigen/eigen/-/merge_requests/1528),Fix QR colpivoting warnings and test failure.,"The test was failing because a raw `abs` call uses the integer `abs` for | |
floating-point types, causing a mismatch and generating a compile warning. Changing to `numext::abs` | |
addresses this.",Antonio Sánchez,2024-02-28T15:00:14.733Z,NA,NA | |
1531 (https://gitlab.com/libeigen/eigen/-/merge_requests/1531),Add degenerate checks before calling BLAS routines.,"Fixes #2754. Any multiplication or solve involving a zero-sized matrix or vector | |
should be a noop instead of a crash.",Antonio Sánchez,2024-02-29T18:56:37.276Z,NA,NA | |
1532 (https://gitlab.com/libeigen/eigen/-/merge_requests/1532),Update error about c++14 requirement.,Fixes #2741.,Antonio Sánchez,2024-02-29T20:45:15.418Z,NA,NA | |
1530 (https://gitlab.com/libeigen/eigen/-/merge_requests/1530),Eliminate FindCUDA cmake warning.,Fixes #2768,Antonio Sánchez,2024-02-29T20:49:42.431Z,NA,NA | |
1529 (https://gitlab.com/libeigen/eigen/-/merge_requests/1529),Fix triangular matrix-vector multiply uninitialized warning.,"The warning seems to be caused by a mix of | |
- `const_cast` abuse, to convert `actualRhs.data()` from `const` to non-`const`, for no other reason than we _might_ copy into `actualRhsPtr` if we end up allocating a temporary buffer. | |
- potential use of alloca to allocate a temporary buffer | |
- the confusing mix and match of a bunch of conditionals. | |
Explicitly spelling everything out and skipping the const_cast abuse, we get the same behavior, but without the warning. | |
Fixes #2787.",Antonio Sánchez,2024-02-29T21:00:59.485Z,NA,NA | |
932 (https://gitlab.com/libeigen/eigen/-/merge_requests/932),"Rip out make_coherent, add CoherentPadOp.","Rip out make_coherent, add CoherentPadOp. | |
The `make_coherent` approach really irks me, since it modifies `const` | |
inputs, and ends up requiring re-sizing expressions like | |
`CwiseNullaryOp` to get working. | |
Here we replace with a `CoherentPadOp`, that artificially pads | |
an input with zeros if the derivative sizes do not match. This only | |
affects cases where the derivative vectors are dynamic-sized, or | |
if both fixed sized and not equal. | |
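
The padding idea can be sketched with plain `std::vector`. This is a minimal illustration of reading past the end of the shorter operand as zero, not the `CoherentPadOp` expression itself; the function names are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Reads element i of v, or 0 if i is past the end (the padding).
inline double padded_at(const std::vector<double>& v, std::size_t i) {
  return i < v.size() ? v[i] : 0.0;
}

// Adds two derivative vectors of possibly different lengths.
// Neither input is modified or resized, unlike a make_coherent-style fix-up.
inline std::vector<double> padded_sum(const std::vector<double>& a,
                                      const std::vector<double>& b) {
  std::vector<double> out(std::max(a.size(), b.size()));
  for (std::size_t i = 0; i < out.size(); ++i) {
    out[i] = padded_at(a, i) + padded_at(b, i);
  }
  return out;
}
```

Because the inputs are only read, they can stay `const`, which is the property the `make_coherent` approach lacked.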
Ran benchmarks, and this seems to be about 20% faster on average than the | |
existing `make_coherent` approach. | |
Benchmark source: | |
```cpp | |
#include <benchmark/benchmark.h> | |
#include <unsupported/Eigen/AutoDiff> | |
template <typename DerivativeType> | |
void BM_AutoDiffScalar(benchmark::State& state) { | |
  int nder = state.range(0); | |
  using AD = Eigen::AutoDiffScalar<DerivativeType>; | |
  AD a(1, nder, 1); | |
  AD b(1, nder, 2); | |
  AD c(1, nder, 3); | |
  AD d(1, nder, nder); | |
  AD e = 1.0;  // No initialized derivatives. | |
  e.derivatives().resize(nder); | |
  e.derivatives().setZero(); | |
  for (auto s : state) { | |
    AD r; | |
    benchmark::DoNotOptimize(r = sin(2 * a + AD(3) * b + AD(4) * cos(c))); | |
    // #2235. | |
    benchmark::DoNotOptimize(r = 2.0 * e - d); | |
  } | |
} | |
BENCHMARK_TEMPLATE(BM_AutoDiffScalar, Eigen::Vector4d)->Arg(4); | |
BENCHMARK_TEMPLATE(BM_AutoDiffScalar, Eigen::VectorXd)->Arg(4)->Arg(8)->Arg(16)->Arg(32)->Arg(64)->Arg(128); | |
``` | |
``` | |
Comparing ./autodiff_master to ./autodiff_pad | |
Benchmark Time CPU Time Old Time New CPU Old CPU New | |
------------------------------------------------------------------------------------------------------------------------------------- | |
BM_AutoDiffScalar<Eigen::Vector4d>/4 -0.0022 -0.0023 1 1 1 1 | |
BM_AutoDiffScalar<Eigen::VectorXd>/4 -0.2929 -0.2929 95 67 95 67 | |
BM_AutoDiffScalar<Eigen::VectorXd>/8 -0.2811 -0.2811 97 69 97 69 | |
BM_AutoDiffScalar<Eigen::VectorXd>/16 -0.3013 -0.3014 101 70 101 70 | |
BM_AutoDiffScalar<Eigen::VectorXd>/32 -0.2763 -0.2763 111 80 111 80 | |
BM_AutoDiffScalar<Eigen::VectorXd>/64 -0.1948 -0.1949 131 105 131 105 | |
BM_AutoDiffScalar<Eigen::VectorXd>/128 -0.2250 -0.2250 212 165 212 165 | |
OVERALL_GEOMEAN -0.2303 -0.2303 0 0 0 0 | |
```",Antonio Sánchez,2024-02-29T23:15:02.954Z,NA,NA | |
1536 (https://gitlab.com/libeigen/eigen/-/merge_requests/1536),fix unaligned access in trmv,"### Reference issue | |
### What does this implement/fix? | |
Fixes failing test `nomalloc_3`. I guess it's not aligned :disappointed: | |
### Additional information",Charles Schlosser,2024-03-03T04:20:10.720Z,NA,NA | |
1537 (https://gitlab.com/libeigen/eigen/-/merge_requests/1537),Fix static_assert for c++14.,NA,Antonio Sánchez,2024-03-03T08:49:29.528Z,NA,NA | |
1538 (https://gitlab.com/libeigen/eigen/-/merge_requests/1538),Return 0 volume for empty AlignedBox,"### What does this implement/fix? | |
Returns 0 volume for empty `AlignedBox`. The previous implementation would return a negative volume for empty `AlignedBox` which could be quite surprising. | |
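
A minimal sketch of the behavior, using a hypothetical 2-D box type rather than Eigen's `AlignedBox`:

```cpp
#include <array>

// Hypothetical 2-D axis-aligned box for illustration; not Eigen's AlignedBox.
struct Box2 {
  std::array<double, 2> min;
  std::array<double, 2> max;

  // A box is empty when min exceeds max along any axis.
  bool is_empty() const { return min[0] > max[0] || min[1] > max[1]; }

  // Product of the side lengths, or exactly 0 for an empty box.
  double volume() const {
    if (is_empty()) return 0.0;
    return (max[0] - min[0]) * (max[1] - min[1]);
  }
};
```

Without the `is_empty()` guard, a box with `min = (1, 1)` and `max = (0, 3)` would report `(0 - 1) * (3 - 1) = -2`.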
Used `==` even though the comparison might involve floating-point numbers in the unit test because in this case the volume should be exactly 0.",Ghost User,2024-03-04T17:32:45.473Z,NA,NA | |
1533 (https://gitlab.com/libeigen/eigen/-/merge_requests/1533),Fix pexp complex test edge-cases.,"There were some missing edge-cases in the tests, and some failures. | |
Tested on clang, msvc.",Antonio Sánchez,2024-03-04T17:44:39.830Z,NA,NA | |
1535 (https://gitlab.com/libeigen/eigen/-/merge_requests/1535),Fix deprecated anonymous enum-enum conversion warnings,NA,Tyler Veness,2024-03-06T21:30:19.375Z,NA,NA | |
1541 (https://gitlab.com/libeigen/eigen/-/merge_requests/1541),Fix packetmath plog test on Windows.,"MSVC gets some edge-cases wrong, but these are already fixed in | |
`numext::log`. Switching comparison to use `numext::log`.",Antonio Sánchez,2024-03-06T23:51:47.913Z,NA,NA | |
1539 (https://gitlab.com/libeigen/eigen/-/merge_requests/1539),Allow aligned assignment in TRMV.,"Always align static vector allocations. Otherwise, we run into some weird edge-cases that need special handling. This shouldn't really affect performance, since it only applies to fixed-sized vectors.",Antonio Sánchez,2024-03-06T23:59:33.271Z,NA,NA | |
1540 (https://gitlab.com/libeigen/eigen/-/merge_requests/1540),Fix pexp test for ARM.,"32-bit arm flushes subnormals to zero, causing the test to fail. We need to flush them in the comparison.",Antonio Sánchez,2024-03-07T00:19:58.524Z,NA,NA | |
1542 (https://gitlab.com/libeigen/eigen/-/merge_requests/1542),Split up cxx11_tensor_gpu to reduce timeouts.,"The cxx11_tensor_gpu_3 test seems to time out a lot on Windows. Split | |
up the tests to reduce this.",Antonio Sánchez,2024-03-07T17:21:38.974Z,NA,NA | |
1543 (https://gitlab.com/libeigen/eigen/-/merge_requests/1543),Fix incomplete cholesky.,"We can't always insert an element on the diagonal if zero, since | |
it might be an explicit zero, in which case the element already exists. | |
Instead, added a method to `SparseMatrix`: `findOrInsertCoeff`, that | |
sets an output parameter to indicate if the entry did not previously exist. | |
Also allow returning the shift parameter from `IncompleteCholesky`, so | |
we can verify the decomposition. | |
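
The find-or-insert idea can be sketched with a `std::map` standing in for the sparse storage. The names and signature below are hypothetical and are not the actual `SparseMatrix::findOrInsertCoeff` interface.

```cpp
#include <map>
#include <utility>

// Hypothetical stand-in for sparse storage: only stored entries have a key,
// so an explicit zero (key present, value 0) differs from an implicit one.
using SparseEntries = std::map<std::pair<int, int>, double>;

// Returns a reference to the stored entry (row, col), creating it with value 0
// if it is missing. `inserted` reports whether a new entry had to be created.
inline double& find_or_insert(SparseEntries& m, int row, int col, bool& inserted) {
  auto result = m.emplace(std::make_pair(row, col), 0.0);
  inserted = result.second;
  return result.first->second;
}
```

An explicit zero on the diagonal is found and reused, while an implicit zero triggers an insertion, so no entry is ever created twice.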
Fixes #2791.",Antonio Sánchez,2024-03-08T19:18:11.453Z,NA,NA | |
1545 (https://gitlab.com/libeigen/eigen/-/merge_requests/1545),Fix CwiseUnaryView.,"The main difference between a CwiseUnaryView and a CwiseUnaryOp is that the | |
former is supposed to allow direct access and modification to the object | |
being viewed. The current main use-case is `.real()` and `.imag()` to access | |
and manipulate the real/imaginary components of a complex array. | |
Previously `CwiseUnaryView` relied on a const-cast hack, and prevented | |
anyone from creating their own view expression via `unaryViewExpr()`. | |
Direct access was also broken for the `const` case, since it would try | |
to access the address of a temporary returned by `coeff()` (since typically | |
const expressions in Eigen return coefficients by value). | |
Here, we modify `CwiseUnaryView` to properly allow non-const access without | |
const_cast, fixed the direct-access issue for const objects, and added checks | |
to ensure the view is being used correctly. | |
Fixes #2348.",Antonio Sánchez,2024-03-11T19:08:31.130Z,NA,NA | |
1544 (https://gitlab.com/libeigen/eigen/-/merge_requests/1544),Add Packet2l for SSE.,"### What does this implement/fix? | |
This adds support for vectorizing int64_t operations with SSE. | |
### Additional information | |
Support for 64-bit integers is also needed in certain double math functions, as in e.g. https://gitlab.com/libeigen/eigen/-/merge_requests/1522. Casting and half packets for `Packet4l` will be added in a followup.",Rasmus Munk Larsen,2024-03-11T19:54:56.149Z,NA,NA | |
1547 (https://gitlab.com/libeigen/eigen/-/merge_requests/1547),Fix const input and c++20 compatibility in unary view.,"- Preserve const-ness of the Scalar. | |
- Replace call of `std::result_of` with the Eigen-internal version that preserves const/ref-ness. | |
Fixes #2795.",Antonio Sánchez,2024-03-13T16:59:44.939Z,NA,NA | |
1549 (https://gitlab.com/libeigen/eigen/-/merge_requests/1549),Fix CwiseUnaryView const access (Attempt 2).,"The previous attempt (!1547) broke a bunch of builds. | |
Took a different approach here: test for matrix mutability, | |
and only enable mutable access functions if true.",Antonio Sánchez,2024-03-14T21:04:50.469Z,NA,NA | |
1550 (https://gitlab.com/libeigen/eigen/-/merge_requests/1550),Don't hide rbegin/rend for GPU.,"Although they don't work on GPU, guarding them out leads to compile errors. | |
We'll get a compile error if anyone tries to use `rbegin`/`rend` on device, | |
but that was never supported anyways. | |
Fixes #2797.",Antonio Sánchez,2024-03-14T21:11:45.229Z,NA,NA | |
1553 (https://gitlab.com/libeigen/eigen/-/merge_requests/1553),Restore C++03 compatibility.,"2x2 linear matrix construction cannot use initializer lists prior to c++11, so we | |
manually construct a 2x2 matrix.",Antonio Sánchez,2024-03-15T17:55:05.026Z,NA,NA | |
1551 (https://gitlab.com/libeigen/eigen/-/merge_requests/1551),Work around VS2015 compile bug.,"The compiler is treating `bool(...)` as a function declaration rather than a | |
cast, then fails. Explicitly adding `static_cast` allows it to pass. | |
Related to #2798.",Antonio Sánchez,2024-03-15T18:07:03.810Z,NA,NA | |
1552 (https://gitlab.com/libeigen/eigen/-/merge_requests/1552),Fix CwiseUnaryView for MSVC.,"MSVC doesn't seem to understand the default parameter on the original | |
`CwiseUnaryViewImpl` declaration, but does understand it on the | |
definition. Re-arrange the code to make it happy.",Antonio Sánchez,2024-03-17T16:28:18.031Z,NA,NA | |
1557 (https://gitlab.com/libeigen/eigen/-/merge_requests/1557),Fix Jacobi module doc.,"Modified tag on `applyOnTheRight` to be consistent with the others, so it appears | |
in the right place in the docs. | |
Fixes #2591.",Antonio Sánchez,2024-03-17T23:08:05.484Z,NA,NA | |
1558 (https://gitlab.com/libeigen/eigen/-/merge_requests/1558),Remove slow index check in Tensor::resize from release mode.,"This change also gets rid of pre-C++11 code guarded by `EIGEN_EMULATE_CXX11_META_H`, and changes `static const` to `static constexpr` in a few places.",Rasmus Munk Larsen,2024-03-18T23:43:26.521Z,NA,NA | |
1555 (https://gitlab.com/libeigen/eigen/-/merge_requests/1555),Make more Matrix functions constexpr,This makes the default constructor and assignment operators constexpr.,Tyler Veness,2024-03-21T22:32:36.187Z,NA,NA | |
1559 (https://gitlab.com/libeigen/eigen/-/merge_requests/1559),Fix Packet*l for 32-bit builds.,"Apparently `_mm_cvtsi128_si64` and `_mm_extract_epi64` are only available | |
on x86_64 targets. Added work-arounds that extract values via bit-casts | |
to double. | |
I have no idea why only those two instructions don't seem to be available, but all the other `*epi64` functions are fine. But with this change, it now builds on both linux with gcc/clang and on windows with msvc when compiling for 32-bit.",Antonio Sánchez,2024-03-22T17:16:43.500Z,NA,NA | |
1546 (https://gitlab.com/libeigen/eigen/-/merge_requests/1546),Add support for casting between double and int64_t for SSE and AVX2.,"Improves casting double->int64. Measurement of tensor cast expression `B = A.template cast<OUT>()`: | |
``` | |
SSE4.2: | |
name old cpu/op new cpu/op delta | |
BM_cast<double,int64_t>/8 6.95ns ± 4% 4.09ns ± 0% -41.20% (p=0.000 n=59+46) | |
BM_cast<double,int64_t>/64 29.4ns ± 2% 26.1ns ± 1% -11.39% (p=0.000 n=44+51) | |
BM_cast<double,int64_t>/512 212ns ± 0% 209ns ± 0% -1.47% (p=0.000 n=51+55) | |
BM_cast<double,int64_t>/4k 1.79µs ± 3% 1.80µs ± 1% ~ (p=0.748 n=60+54) | |
BM_cast<double,int64_t>/32k 14.2µs ± 2% 14.3µs ± 1% +0.95% (p=0.000 n=58+54) | |
BM_cast<double,int64_t>/256k 171µs ± 3% 171µs ± 3% ~ (p=0.767 n=60+60) | |
BM_cast<double,int64_t>/1M 731µs ±12% 742µs ±16% ~ (p=0.275 n=50+49) | |
BM_cast<int64_t,double>/8 5.17ns ± 1% 5.18ns ± 1% ~ (p=0.072 n=52+54) | |
BM_cast<int64_t,double>/64 19.9ns ± 1% 19.9ns ± 2% ~ (p=0.362 n=41+57) | |
BM_cast<int64_t,double>/512 119ns ± 0% 119ns ± 0% ~ (p=0.771 n=54+55) | |
BM_cast<int64_t,double>/4k 1.35µs ± 0% 1.35µs ± 1% -0.12% (p=0.002 n=44+52) | |
BM_cast<int64_t,double>/32k 10.8µs ± 1% 10.7µs ± 1% -0.19% (p=0.016 n=51+51) | |
BM_cast<int64_t,double>/256k 158µs ± 3% 157µs ± 2% -0.33% (p=0.019 n=60+60) | |
BM_cast<int64_t,double>/1M 684µs ±16% 690µs ±20% ~ (p=0.913 n=53+54) | |
```",Rasmus Munk Larsen,2024-03-22T22:32:30.586Z,NA,NA | |
1562 (https://gitlab.com/libeigen/eigen/-/merge_requests/1562),Protect use of alloca.,Caused breakages on some 32-bit arm systems.,Antonio Sánchez,2024-03-23T16:43:49.255Z,NA,NA | |
1561 (https://gitlab.com/libeigen/eigen/-/merge_requests/1561),"Remove ""extern C"" in CholmodSupport.","cholmod.h itself does this when required. | |
Fixes #2580.",Antonio Sánchez,2024-03-25T00:03:29.348Z,NA,NA | |
1564 (https://gitlab.com/libeigen/eigen/-/merge_requests/1564),cross3_product vectorization,"### Reference issue | |
Fixes #2779 | |
### What does this implement/fix? | |
### Additional information | |
The changes to `Eigen/src/Core/arch/AVX/TypeCasting.h` were needed to fix compilation on MSVC, even if the function is not referenced. `_mm256_set_m128d` is a macro that tripped up the parser.",Charles Schlosser,2024-03-25T00:06:34.396Z,NA,NA | |
1563 (https://gitlab.com/libeigen/eigen/-/merge_requests/1563),Add custom formatting of complex numbers for Numpy/Native.,"Numpy formatting in python prints complex numbers as `1.+2.j`, | |
rather than `(1, 2)`. Here we simplify as just `1+2j`, which | |
is at least copy-paste-able into numpy. | |
Similarly, for `Native`, we format complex numbers as `{1, 2}`, | |
so that it can be copy-pasted and treated as a complex constructor. | |
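
The two output styles can be sketched with the standard library alone. The function names are hypothetical, and folding a negative imaginary part into the sign is an assumption of this sketch, not a statement about what the Eigen formatter does.

```cpp
#include <complex>
#include <sstream>
#include <string>

// Numpy-style output, e.g. 1+2j. For a negative imaginary part the sign
// replaces the '+', giving e.g. 1-2j (assumption made for this sketch).
inline std::string format_numpy(const std::complex<double>& z) {
  std::ostringstream os;
  const bool negative = z.imag() < 0;
  os << z.real() << (negative ? '-' : '+') << (negative ? -z.imag() : z.imag()) << 'j';
  return os.str();
}

// Brace-style output, e.g. {1, 2}, usable as a braced complex initializer.
inline std::string format_native(const std::complex<double>& z) {
  std::ostringstream os;
  os << '{' << z.real() << ',' << ' ' << z.imag() << '}';
  return os.str();
}
```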
Fixes #2545.",Antonio Sánchez,2024-03-25T17:41:45.608Z,NA,NA | |
1566 (https://gitlab.com/libeigen/eigen/-/merge_requests/1566),Fix another instance of Packet2l on win32.,Fixes #2800,Antonio Sánchez,2024-03-26T15:48:45.453Z,NA,NA | |
1568 (https://gitlab.com/libeigen/eigen/-/merge_requests/1568),Fix using ScalarPrinter redefinition for gcc.,Fixes CI.,Antonio Sánchez,2024-03-26T22:41:31.758Z,NA,NA | |
1567 (https://gitlab.com/libeigen/eigen/-/merge_requests/1567),More fixes for 32-bit.,"SSE 32-bit doesn't support conversions from double->int64 either, | |
so we need to do it in two steps. | |
Also added 32-bit/64-bit windows build smoketests. | |
Fixes #2800 (for good, hopefully).",Antonio Sánchez,2024-03-26T22:53:38.951Z,NA,NA | |
1569 (https://gitlab.com/libeigen/eigen/-/merge_requests/1569),Sparse move,"### Reference issue | |
Fixes #2238 | |
### What does this implement/fix? | |
Optimized the `SparseMatrix` move constructor and move assignment operator to call `swap` directly. Adds `SparseMatrix` move constructors and move assignment operators for arguments that inherit from `SparseCompressedBase`. These don't always result in a cheap copy, but they use the existing `markAsRValue()` logic to enable optimizations if conditions are right. | |
And vectors. | |
### Additional information",Charles Schlosser,2024-03-27T17:44:51.099Z,NA,NA | |
1570 (https://gitlab.com/libeigen/eigen/-/merge_requests/1570),Use truncation rather than rounding when casting Packet2d to Packet2l.,Fix small bug in https://gitlab.com/libeigen/eigen/-/merge_requests/1546,Rasmus Munk Larsen,2024-03-27T22:01:05.804Z,NA,NA | |
1554 (https://gitlab.com/libeigen/eigen/-/merge_requests/1554),Add SimplicialNonHermitianLLT and SimplicialNonHermitianLDLT,"This PR adds two new sparse matrix solvers: `SimplicialNonHermitianLLT` and `SimplicialNonHermitianLDLT`. These are similar to the existing LLT and LDLT solvers, but work with complex symmetric rather than complex hermitian matrices. Such matrices arise in some physics problems (e.g. electromagnetic FEM simulations). | |
These solvers are just a minor variation of the existing solvers. The only differences are that (1) a regular transpose is used instead of a conjugate transpose, and (2) it can't be assumed that the main diagonal is real. | |
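
The distinction between complex symmetric and complex Hermitian matrices can be checked on a 2x2 example. This is a standalone sketch with hypothetical helper names and does not use the Eigen solvers.

```cpp
#include <array>
#include <complex>

using C = std::complex<double>;
using Mat2 = std::array<std::array<C, 2>, 2>;

// True if A equals its plain transpose: A(i,j) == A(j,i).
inline bool is_symmetric(const Mat2& a) { return a[0][1] == a[1][0]; }

// True if A equals its conjugate transpose: A(i,j) == conj(A(j,i)).
// For i == j this forces the diagonal to be real.
inline bool is_hermitian(const Mat2& a) {
  for (int i = 0; i < 2; ++i) {
    for (int j = 0; j < 2; ++j) {
      if (a[i][j] != std::conj(a[j][i])) return false;
    }
  }
  return true;
}
```

A matrix with equal complex off-diagonal entries and a complex diagonal passes `is_symmetric` but fails `is_hermitian`, which is the case the new solvers target.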
I *think* I managed to implement this in a way that doesn't break any public API. The only user-visible API change is the fact that `SparseSolverBase::setShift` now takes arguments of type `DiagonalScalar` rather than `RealScalar`, but for the existing solvers this is merely a typedef for `RealScalar` so this shouldn't break anything. | |
I had to make some changes to the internals though, in particular I didn't want to make far-reaching changes to the `SelfAdjointView` class to support non-conjugate transpose views, so I changed the code in `SparseSolverBase` to use the lower-level internal functions `permute_symm_to_symm` and `permute_symm_to_fullsymm` instead. | |
I've also added tests for the new solvers to `test/simplicial_cholesky.cpp`.",Maarten Baert,2024-03-28T00:22:28.345Z,NA,NA | |
1560 (https://gitlab.com/libeigen/eigen/-/merge_requests/1560),"Add missing cwiseSquare, tests for cwise matrix ops.","Fixes #2516. | |
In added tests, caught that there was a typo for the `cwiseCbrt` op.",Antonio Sánchez,2024-03-28T04:26:55.932Z,NA,NA | |
1565 (https://gitlab.com/libeigen/eigen/-/merge_requests/1565),Allow symbols to be used in compile-time expressions.,"This enables us to determine compile-time sizes in expressions involving `Eigen::indexing::last`. | |
Required some bigger changes: | |
- Refactored SymbolicIndex to allow compile-time evaluation. | |
- Renamed UndefinedIncr to Undefined to use for first/size expressions as well | |
- Refactored IndexedViewHelper to simplify handling of first/size/incr, reducing the number of free functions | |
With these, indexed expressions with compile-time constants now behave exactly like the block versions | |
in the slicing tutorial. | |
Fixes #2535.",Antonio Sánchez,2024-03-28T18:43:51.486Z,NA,NA | |
1571 (https://gitlab.com/libeigen/eigen/-/merge_requests/1571),Fix usages of Eigen::array to be compatible with std::array.,"There is currently a difference in their constructors - `Eigen::array` allows `array(a, b, c)` construction, but `std::array` requires an initializer list. | |
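
The constructor difference on the `std::array` side can be sketched as follows. The helper name `make_dims` is hypothetical and only illustrates the braced form that works for `std::array`.

```cpp
#include <array>
#include <cstddef>

// std::array is an aggregate with no (a, b, c) constructor,
// so it has to be built from a braced list.
inline std::array<std::ptrdiff_t, 3> make_dims(std::ptrdiff_t a, std::ptrdiff_t b,
                                               std::ptrdiff_t c) {
  return {{a, b, c}};
}
```

Code written in the braced style compiles with either array type, which is what makes a later switch to `std::array` possible.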
Once we move to C++17, we can drop the custom Eigen::array implementation, since `std::array` would work on GPU due to expanded set of `constexpr` functions in the standard library.",Antonio Sánchez,2024-03-29T15:51:15.934Z,NA,NA | |
1572 (https://gitlab.com/libeigen/eigen/-/merge_requests/1572),AVX2 - double->int64_t casting,"### Reference issue | |
### What does this implement/fix? | |
Fully vectorizes the cast `double` -> `int64_t` when AVX2 is available. Unfortunately, this approach cannot be used for SSE, as vector shift intrinsics (where the shift count can be different for each element), e.g. `_mm256_srlv_epi64`, are not available until AVX2. | |
For a pure casting operation, this appears to improve throughput by 70%. | |
Also took the liberty to clean up some AVX2 code, applying lessons learned. | |
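To illustrate why per-element shift counts matter, here is a scalar sketch of a truncating `double` -> `int64_t` conversion done with integer bit manipulation. It is a model written for this note, not the code from the MR: the function name `cast_double_to_int64_sketch` is hypothetical, and it assumes a finite input with magnitude below 2^63. Each input has its own exponent and therefore its own shift count, which is the role a variable-shift intrinsic such as `_mm256_srlv_epi64` would play in a vectorized version.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical scalar model (not Eigen's implementation): truncate a finite
// double with |x| < 2^63 toward zero using only integer operations.
inline std::int64_t cast_double_to_int64_sketch(double x) {
  std::uint64_t bits;
  std::memcpy(&bits, &x, sizeof(bits));  // reinterpret the IEEE-754 encoding

  const bool negative = (bits >> 63) != 0;
  const int exponent = static_cast<int>((bits >> 52) & 0x7FF) - 1023;  // unbiased
  if (exponent < 0) return 0;  // |x| < 1, including zero and subnormals

  // 52 stored fraction bits plus the implicit leading one.
  const std::uint64_t mantissa =
      (bits & ((std::uint64_t{1} << 52) - 1)) | (std::uint64_t{1} << 52);

  // The shift count depends on the exponent of this particular value.
  const std::uint64_t magnitude =
      exponent >= 52 ? mantissa << (exponent - 52) : mantissa >> (52 - exponent);

  const std::int64_t result = static_cast<std::int64_t>(magnitude);
  return negative ? -result : result;
}
```

In a packet of four doubles, the four exponents generally differ, so a single shared shift count would not be enough.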
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2024-03-29T21:35:10.513Z,NA,NA | |
1515 (https://gitlab.com/libeigen/eigen/-/merge_requests/1515),Fix random again,"### Reference issue | |
### What does this implement/fix? | |
Provides a default random generator for custom floats. The generator draws a `double` whose effective mantissa width is the smaller of the target scalar's mantissa width and that of `double`. As I learned when testing `bfloat16` and `half`, this is necessary to prevent rounding bias when casting to the target scalar (if it's smaller than `double`). | |
I went ahead and deleted the explicit specializations for `AnnoyingScalar`, `MoveableScalar` and `SafeScalar` as the resulting bit pattern should be exactly the same. | |
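A minimal sketch of the idea, assuming hypothetical helpers `random_unit_with_bits` and `random_unit_for` that are not part of Eigen: if the `double` carries no more random bits than the target type can hold, the later cast to the target is exact and cannot introduce rounding bias. The use of `std::mt19937_64` as the bit source and the [0, 1) output range are choices made for this illustration only.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <limits>
#include <random>

// Hypothetical helper: uniform value in [0, 1) built from exactly `bits`
// random bits (1 <= bits <= 53), so it is a multiple of 2^-bits.
inline double random_unit_with_bits(std::mt19937_64& rng, int bits) {
  const std::uint64_t raw = rng() >> (64 - bits);  // keep the top `bits` bits
  return std::ldexp(static_cast<double>(raw), -bits);
}

// Draw a double whose precision is capped by the target type's significand
// (assumed: Target is a binary floating-point type such as float).
template <typename Target>
double random_unit_for(std::mt19937_64& rng) {
  const int target_digits = std::numeric_limits<Target>::digits;
  const int double_digits = std::numeric_limits<double>::digits;
  return random_unit_with_bits(rng, std::min(target_digits, double_digits));
}
```

With `Target = float`, every generated value is a multiple of 2^-24 below 1, so converting it to `float` does not round.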
### Additional information",Charles Schlosser,2024-03-29T21:49:28.898Z,NA,NA | |
1574 (https://gitlab.com/libeigen/eigen/-/merge_requests/1574),AVX: guard Packet4l definition,"### Reference issue | |
Fixes #2569 | |
### What does this implement/fix? | |
### Additional information",Charles Schlosser,2024-04-01T00:31:47.276Z,NA,NA | |
1575 (https://gitlab.com/libeigen/eigen/-/merge_requests/1575),Fix long double random,"### Reference issue | |
https://oss-fuzz-build-logs.storage.googleapis.com/log-7399495d-a3c8-4ae9-887f-45268bc96304.txt | |
### What does this implement/fix? | |
When we fall back to `double` for unsupported `long double` configurations (e.g. double-double), we need to use `std::numeric_limits<double>::digits-1` for the mantissa bits. | |
Also got rid of redundant static asserts that will never be triggered, as pointed out by @ChipKerchner. | |
### Additional information",Charles Schlosser,2024-04-02T12:05:40.767Z,NA,NA | |
1577 (https://gitlab.com/libeigen/eigen/-/merge_requests/1577),Fix preverse for PowerPC.,Fix preverse for PowerPC.,Chip Kerchner,2024-04-03T20:09:06.975Z,NA,NA | |
1576 (https://gitlab.com/libeigen/eigen/-/merge_requests/1576),Fix preprocessor condition on when to use fast float logistic implementation.,"### Reference issue | |
### What does this implement/fix? | |
This fixes an issue introduced by a30ecb7221a46824b85cad5f9016efe6e5871d69, which effectively disabled the fast float implementation for the logistic function *for all build configurations* (including those using a compiler targeting GPU). This commit probably got through unnoticed because of two cancelling issues: `EIGEN_CPU_CC` should really be `EIGEN_CPUCC`, whereas `#ifdef` should have been `#ifndef`. | |
As a result, recent Tensorflow Lite libraries suffer from this, in the sense that the generic float implementation is always chosen over the preferred fast implementation. As of today, the Tensorflow repository seems completely oblivious to this issue, as can be deduced from this comment: https://github.com/tensorflow/tensorflow/blob/b8867cbc656c3b65998b42b907b8d00515f8f681/tensorflow/lite/kernels/internal/reference/logistic.h#L42. EDIT: I also opened an issue [there](https://github.com/tensorflow/tensorflow/issues/64981). | |
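A small sketch of why a misspelled macro name goes unnoticed. The macro names `DEMO_GPUCC` and `DEMO_GPU_CC` are made up for this note and say nothing about how Eigen defines its own macros; the point is only that `#ifdef` on a name that is never defined silently evaluates to false.

```cpp
// Hypothetical macro standing in for a compiler-detection macro.
#define DEMO_GPUCC 1

// Misspelled name: DEMO_GPU_CC is never defined, so this test is always
// false and the #else branch is taken without any diagnostic.
#ifdef DEMO_GPU_CC
constexpr bool kMisspelledGuardTaken = true;
#else
constexpr bool kMisspelledGuardTaken = false;
#endif

// Correctly spelled name: the guarded branch is taken.
#ifdef DEMO_GPUCC
constexpr bool kCorrectGuardTaken = true;
#else
constexpr bool kCorrectGuardTaken = false;
#endif
```

Because the preprocessor treats an unknown identifier as simply undefined, a guarded code path can stay disabled in every build configuration without any warning.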
### Additional information",Dieter Dobbelaere,2024-04-03T22:02:27.707Z,NA,NA | |
1578 (https://gitlab.com/libeigen/eigen/-/merge_requests/1578),Update file Geometry_SIMD.h,"### Reference issue | |
### What does this implement/fix? | |
@rmlarsen1 does this fix the issue? | |
### Additional information",Charles Schlosser,2024-04-05T18:30:40.724Z,NA,NA | |
1580 (https://gitlab.com/libeigen/eigen/-/merge_requests/1580),Add support for Packet8l to AVX512.,NA,Rasmus Munk Larsen,2024-04-09T22:58:45.485Z,NA,NA | |
1581 (https://gitlab.com/libeigen/eigen/-/merge_requests/1581),"Add constexpr to accessors in DenseBase, Quaternions and Translations","Tiny changes which add constexpr to some accessors to DenseBase, Quaternions and Translations. | |
### Reference issue | |
None: discussed on Discord with chuckyschluz. | |
### What does this implement/fix? | |
It makes these classes more usable at compile time, which allows precomputing values instead of relying on ""magic numbers"", so that no time is wasted at run time. | |
### Additional information | |
Nothing really.",Stéphane T.,2024-04-11T14:46:50.053Z,NA,NA | |
1583 (https://gitlab.com/libeigen/eigen/-/merge_requests/1583),Speed up pldexp_generic.,"Speeds up `pexp` by up to 6%. | |
Measurements on SkylakeX: | |
``` | |
SSE4.2: | |
name old cpu/op new cpu/op delta | |
BM_eigen_exp_float/1 1.88ns ± 1% 1.68ns ± 1% -10.88% (p=0.000 n=54+47) | |
BM_eigen_exp_float/8 28.9ns ± 1% 28.5ns ± 0% -1.37% (p=0.000 n=51+47) | |
BM_eigen_exp_float/64 145ns ± 1% 139ns ± 0% -4.09% (p=0.000 n=49+43) | |
BM_eigen_exp_float/512 1.11µs ± 1% 1.06µs ± 0% -4.42% (p=0.000 n=42+46) | |
BM_eigen_exp_float/4k 8.80µs ± 0% 8.40µs ± 0% -4.54% (p=0.000 n=42+42) | |
BM_eigen_exp_float/32k 70.2µs ± 0% 67.6µs ± 3% -3.74% (p=0.000 n=46+59) | |
BM_eigen_exp_float/256k 561µs ± 0% 537µs ± 1% -4.27% (p=0.000 n=45+45) | |
BM_eigen_exp_float/1M 2.24ms ± 0% 2.15ms ± 1% -4.15% (p=0.000 n=39+43) | |
AVX2: | |
name old cpu/op new cpu/op delta | |
BM_eigen_exp_float/1 1.70ns ± 6% 1.70ns ± 5% ~ (p=0.488 n=60+60) | |
BM_eigen_exp_float/8 30.9ns ± 0% 30.9ns ± 0% ~ (p=0.352 n=49+50) | |
BM_eigen_exp_float/64 84.1ns ± 4% 81.0ns ± 4% -3.71% (p=0.000 n=59+58) | |
BM_eigen_exp_float/512 520ns ± 4% 489ns ± 3% -5.96% (p=0.000 n=57+58) | |
BM_eigen_exp_float/4k 3.99µs ± 4% 3.77µs ± 4% -5.45% (p=0.000 n=48+46) | |
BM_eigen_exp_float/32k 31.8µs ± 5% 29.9µs ± 5% -5.87% (p=0.000 n=50+53) | |
BM_eigen_exp_float/256k 253µs ± 4% 239µs ± 4% -5.65% (p=0.000 n=50+53) | |
BM_eigen_exp_float/1M 1.01ms ± 4% 0.95ms ± 4% -6.04% (p=0.000 n=60+56) | |
AVX512: | |
name old cpu/op new cpu/op delta | |
BM_eigen_exp_float/1 2.64ns ± 1% 2.65ns ± 2% ~ (p=0.061 n=51+54) | |
BM_eigen_exp_float/8 33.9ns ± 2% 33.9ns ± 2% ~ (p=0.546 n=49+46) | |
BM_eigen_exp_float/64 88.5ns ± 3% 88.7ns ± 4% ~ (p=0.703 n=57+59) | |
BM_eigen_exp_float/512 275ns ± 3% 274ns ± 3% -0.60% (p=0.009 n=52+54) | |
BM_eigen_exp_float/4k 1.77µs ± 3% 1.76µs ± 3% -0.62% (p=0.006 n=59+59) | |
BM_eigen_exp_float/32k 13.7µs ± 3% 13.7µs ± 4% ~ (p=0.153 n=58+60) | |
BM_eigen_exp_float/256k 119µs ± 5% 118µs ± 4% ~ (p=0.453 n=60+58) | |
BM_eigen_exp_float/1M 475µs ± 6% 475µs ± 5% ~ (p=0.723 n=60+60) | |
```",Rasmus Munk Larsen,2024-04-12T01:32:18.385Z,NA,NA | |
1582 (https://gitlab.com/libeigen/eigen/-/merge_requests/1582),Refactor indexed view to appease MSVC 14.16.,"We're getting all kinds of warnings about redefining default template | |
parameters, and the MSVC 14.16 build is breaking (likely due to a | |
compiler bug) because it can't find the right template definitions | |
from inside DenseBase. Moving all the template class definitions | |
out of the IndexedViewMethods.inc plugin fixes this.",Antonio Sánchez,2024-04-12T17:05:21.791Z,NA,NA | |
1573 (https://gitlab.com/libeigen/eigen/-/merge_requests/1573),"Fix ""unary minus operator applied to unsigned type, result still unsigned"" on MSVC and other stupid warnings","### Reference issue | |
### What does this implement/fix? | |
I fixed a few irritating compiler warnings that only seem to be important to MSVC, including: | |
- negating unsigned integers. On gcc/llvm, the negation operator `-x` seems to be synonymous with `0 - x` (for integers). I make that explicit in Eigen. | |
- various warnings about implicit casts from real to `std::complex`, which are suppressed by casting to `RealScalar` instead of `Scalar` | |
- avx512 pblend warnings | |
- warnings in unit tests | |
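For the first item, a small sketch of the equivalence; the helper name `negate_unsigned` is made up for this note. Unsigned arithmetic in C++ wraps modulo 2^N, so `0 - x` is well defined and yields the same value as `-x`, without applying unary minus to an unsigned operand (the construct behind MSVC warning C4146).

```cpp
#include <cstdint>

// Hypothetical helper: negation of an unsigned value written as a
// subtraction, so no unary minus touches an unsigned operand.
inline std::uint32_t negate_unsigned(std::uint32_t x) {
  return std::uint32_t{0} - x;  // wraps modulo 2^32
}
```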
### Additional information",Charles Schlosser,2024-04-12T19:35:05.980Z,NA,NA | |
1585 (https://gitlab.com/libeigen/eigen/-/merge_requests/1585),Handle missing AVX512 intrinsic,"### Reference issue | |
### What does this implement/fix? | |
`pfirst<Packet16i>` added here https://gitlab.com/libeigen/eigen/-/merge_requests/1580 is affected by a gcc bug. | |
### Additional information | |
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95483",Charles Schlosser,2024-04-14T16:41:24.269Z,NA,NA | |
1584 (https://gitlab.com/libeigen/eigen/-/merge_requests/1584),Eigen pblend,"### Reference issue | |
### What does this implement/fix? | |
While addressing some observations by @cantonios, I took this opportunity to apply some lessons learned to clean up the Intel `pblend` implementations. | |
- converting an array of `bool` to a mask is efficiently accomplished by subtracting the bool (in integer form) from zero. | |
- there is no reason to use floating point comparisons to create bit masks; integer math and comparisons are much faster | |
- `_mm_movemask_epi8` efficiently converts a byte mask into a true bit mask. https://godbolt.org/z/eMsqrz4rM | |
- defer to `pselect` instead of spamming macro guards | |
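The first two items can be sketched in scalar form. The names `mask_from_bool` and `blend_bits` are hypothetical, and the code is an illustration rather than the Intel `pblend` code from the MR; it also assumes a 32-bit `float`. A `bool` widened to an integer and subtracted from zero gives an all-zeros or all-ones mask, which then selects between two bit patterns using integer operations only.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: false -> 0x00000000, true -> 0xFFFFFFFF.
inline std::uint32_t mask_from_bool(bool b) {
  return std::uint32_t{0} - static_cast<std::uint32_t>(b);
}

// Hypothetical helper: return then_value where the mask is set, otherwise
// else_value, operating on the raw bits of the floats.
inline float blend_bits(bool select_then, float then_value, float else_value) {
  std::uint32_t t, e;
  std::memcpy(&t, &then_value, sizeof(t));
  std::memcpy(&e, &else_value, sizeof(e));
  const std::uint32_t m = mask_from_bool(select_then);
  const std::uint32_t r = (t & m) | (e & ~m);  // bitwise select, no float compare
  float out;
  std::memcpy(&out, &r, sizeof(out));
  return out;
}
```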
### Additional information",Charles Schlosser,2024-04-15T16:19:54.323Z,NA,NA | |
1522 (https://gitlab.com/libeigen/eigen/-/merge_requests/1522),Simd sincos double,"### Reference issue | |
This is another step in the SIMD implementation campaign #2635. This MR involves the vectorized implementation of sine and cosine. | |
### What does this implement/fix? | |
This is a draft of the `sincos` implementation with `double` precision. I used the Veltkamp method to split pi/2 into four values for the argument reduction. For the polynomial approximation, I used a Padé approximant with an order that guarantees double precision over the whole \[-pi/4, pi/4\] domain. | |
Regarding numerical results, the implemented approach reverts to using `std::sin` and `std::cos` for inputs larger than `1e14`. | |
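For readers unfamiliar with it, Veltkamp splitting writes a `double` as the exact sum of a high part and a low part that each fit in roughly half of the significand. The sketch below splits into two parts only, whereas the MR splits pi/2 into four values; `SplitDouble` and `veltkamp_split` are names invented for this note, and the code assumes strict IEEE-754 double arithmetic (no fast-math, no excess precision) and an input small enough that the multiplication does not overflow.

```cpp
// Hypothetical type for illustration only.
struct SplitDouble {
  double hi;  // leading bits of x
  double lo;  // remainder, so that hi + lo == x exactly
};

// Hypothetical helper: classic Veltkamp split with the constant 2^27 + 1.
inline SplitDouble veltkamp_split(double x) {
  const double splitter = 134217729.0;  // 2^27 + 1
  const double c = splitter * x;
  const double hi = c - (c - x);  // x rounded to its leading bits
  const double lo = x - hi;       // exact remainder
  return {hi, lo};
}
```

Short pieces of this kind are what make it possible, in this style of argument reduction, to multiply a piece of the constant by a moderate integer with little or no rounding error.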
Any comments or ideas are welcome! | |
### Benchmark results | |
For benchmarking, I used this code | |
``` | |
void BM_vec_f_cos(benchmark::State& state) { | |
  Eigen::VectorXd x; | |
  x.setRandom(state.range(0)); | |
  for (auto s : state) { | |
    x = x.array().cos(); | |
  } | |
} | |
// Register the function as a benchmark | |
BENCHMARK(BM_vec_f_cos)->Range(1, 1 << 20); | |
``` | |
For reference, I computed the timing for the float version (already in the code) and the double version (implemented in the current MR). Here's the result | |
``` | |
##DOUBLE | |
# NOT Vectorized | |
BM_vec_f_cos/1 13.7 ns 13.7 ns 53268784 | |
BM_vec_f_cos/8 56.8 ns 56.7 ns 12128308 | |
BM_vec_f_cos/64 444 ns 443 ns 1563514 | |
BM_vec_f_cos/512 3553 ns 3545 ns 195555 | |
BM_vec_f_cos/4096 28355 ns 28283 ns 24766 | |
BM_vec_f_cos/32768 228040 ns 227619 ns 3046 | |
BM_vec_f_cos/262144 1811000 ns 1807908 ns 371 | |
BM_vec_f_cos/1048576 7370421 ns 7357243 ns 84 | |
# SSE4.1 | |
BM_vec_f_cos/1 14.5 ns 14.5 ns 50010744 | |
BM_vec_f_cos/8 65.1 ns 64.9 ns 10544386 | |
BM_vec_f_cos/64 515 ns 513 ns 1360944 | |
BM_vec_f_cos/512 4049 ns 4038 ns 173452 | |
BM_vec_f_cos/4096 32180 ns 32097 ns 21695 | |
BM_vec_f_cos/32768 257746 ns 257044 ns 2714 | |
BM_vec_f_cos/262144 2011084 ns 2005974 ns 336 | |
BM_vec_f_cos/1048576 8049604 ns 8028964 ns 89 | |
# AVX2 | |
BM_vec_f_cos/1 15.0 ns 15.0 ns 37314589 | |
BM_vec_f_cos/8 47.3 ns 47.1 ns 14208618 | |
BM_vec_f_cos/64 361 ns 360 ns 1895361 | |
BM_vec_f_cos/512 2897 ns 2889 ns 243382 | |
BM_vec_f_cos/4096 22173 ns 22118 ns 31485 | |
BM_vec_f_cos/32768 178418 ns 177959 ns 3932 | |
BM_vec_f_cos/262144 1391234 ns 1387836 ns 487 | |
BM_vec_f_cos/1048576 5677948 ns 5663670 ns 127 | |
# AVX512 | |
BM_vec_f_cos/1 14.0 ns 14.0 ns 49550612 | |
BM_vec_f_cos/8 27.2 ns 27.1 ns 25774820 | |
BM_vec_f_cos/64 101 ns 101 ns 6903473 | |
BM_vec_f_cos/512 813 ns 812 ns 876030 | |
BM_vec_f_cos/4096 6535 ns 6523 ns 109188 | |
BM_vec_f_cos/32768 53296 ns 53206 ns 13487 | |
BM_vec_f_cos/262144 440815 ns 439662 ns 1609 | |
BM_vec_f_cos/1048576 1740265 ns 1735814 ns 391 | |
##FLOAT | |
#NON-VECTORIZED | |
BM_vec_f_cos/1 10.3 ns 10.3 ns 62599112 | |
BM_vec_f_cos/8 23.9 ns 23.8 ns 28803165 | |
BM_vec_f_cos/64 191 ns 191 ns 3608446 | |
BM_vec_f_cos/512 1500 ns 1495 ns 469985 | |
BM_vec_f_cos/4096 11913 ns 11879 ns 59246 | |
BM_vec_f_cos/32768 100809 ns 100460 ns 7526 | |
BM_vec_f_cos/262144 891223 ns 887243 ns 709 | |
BM_vec_f_cos/1048576 2912187 ns 2905854 ns 219 | |
#AVX2 | |
BM_vec_f_cos/1 11.3 ns 11.3 ns 61839107 | |
BM_vec_f_cos/8 23.4 ns 23.3 ns 29502403 | |
BM_vec_f_cos/64 90.1 ns 89.8 ns 8276673 | |
BM_vec_f_cos/512 766 ns 764 ns 947448 | |
BM_vec_f_cos/4096 5781 ns 5764 ns 120739 | |
BM_vec_f_cos/32768 57926 ns 57683 ns 15030 | |
BM_vec_f_cos/262144 326851 ns 325917 ns 2104 | |
BM_vec_f_cos/1048576 1262296 ns 1258357 ns 536 | |
#SSE4.1 | |
BM_vec_f_cos/1 10.9 ns 10.8 ns 64594625 | |
BM_vec_f_cos/8 22.7 ns 22.6 ns 31031063 | |
BM_vec_f_cos/64 163 ns 162 ns 3969898 | |
BM_vec_f_cos/512 1327 ns 1322 ns 476229 | |
BM_vec_f_cos/4096 10240 ns 10204 ns 69161 | |
BM_vec_f_cos/32768 82552 ns 82252 ns 8256 | |
BM_vec_f_cos/262144 619372 ns 617440 ns 1025 | |
BM_vec_f_cos/1048576 2422650 ns 2416255 ns 288 | |
``` | |
For the new double implementation, the SSE version barely keeps pace with the non-vectorized code for large vectors; the situation is not much different for float, although it is slightly better. AVX is around 1.5x faster. | |
Indeed, there is room for improvement.",Damiano Franzò,2024-04-15T21:12:34.283Z,NA,NA | |
1588 (https://gitlab.com/libeigen/eigen/-/merge_requests/1588),"Fix build for pblend and psin_double, pcos_double when AVX but not AVX2 is supported.",NA,Rasmus Munk Larsen,2024-04-16T16:12:42.905Z,NA,NA | |
1591 (https://gitlab.com/libeigen/eigen/-/merge_requests/1591),Fix compilation problems with PacketI on PowerPC.,Fix compilation problems with PacketI on PowerPC.,Chip Kerchner,2024-04-18T14:55:16.785Z,NA,NA | |
1592 (https://gitlab.com/libeigen/eigen/-/merge_requests/1592),Fix new psincos for ppc and arm32.,"Enables vectorized psincos for double on PPC. | |
Fixes a failing test for 32-bit arm due to lack of integer_packet | |
for scalars.",Antonio Sánchez,2024-04-19T00:31:10.345Z,NA,NA | |
1590 (https://gitlab.com/libeigen/eigen/-/merge_requests/1590),more pblend optimizations,"### Reference issue | |
### What does this implement/fix? | |
A few more optimizations to `pblend`. This provides a utility `blend_mask_helper` for generating a bitmask from an array of bool. It uses explicit loop unrolling to convert the bool array (0/1) to a bitmask (0x00/0xff). The loop unrolling does not appear to be necessary for clang but doesn't hurt either. It does goad gcc into auto-vectorizing the loop. The auto-vectorization is useful so we don't have to specialize the helper for every scalar size and account for subtle differences in available instructions (especially AVX vs. AVX2). Some observations: | |
- Even Clang is not smart enough to convert a floating point comparison to an integer comparison when original data is boolean | |
- unrolling the loop helps gcc apply auto vectorization. | |
- both gcc and clang know how to piece together SSE intrinsics to emulate AVX2 functionality once the loops are unrolled. clang is MUCH better at this, and I can't figure out how to nudge gcc to generate the same assembly | |
- using the **signed** integer type produces better assembly in gcc, probably because the integer intrinsics mostly apply to signed types (even though we are only concerned with 0 and 1) | |
- separating the zero extension (static cast to the integer type) and the negation helps gcc produce better assembly | |
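A rough sketch of such a helper, under assumptions: the name `bool_array_to_byte_mask` is hypothetical (the MR's utility is `blend_mask_helper`, whose actual signature is not shown here), the element type is fixed to one byte, and any unrolling is left to the compiler. It follows two of the observations above: a signed integer type is used, and the zero extension is kept separate from the negation.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical sketch: map each bool (0/1) to a byte mask (0x00/0xff).
template <std::size_t N>
void bool_array_to_byte_mask(const bool (&in)[N], std::int8_t (&out)[N]) {
  std::int8_t widened[N];
  // Step 1: zero-extend each bool into the signed integer type.
  for (std::size_t i = 0; i < N; ++i) {
    widened[i] = static_cast<std::int8_t>(in[i]);
  }
  // Step 2: negate, turning 1 into -1 (all bits set) and leaving 0 as 0.
  for (std::size_t i = 0; i < N; ++i) {
    out[i] = static_cast<std::int8_t>(-widened[i]);
  }
}
```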
I declare this bike shed to be fully painted. | |
``` | |
Tensor<bool, 3> selector(sizea, sizeb, sizec); | |
Tensor<float, 3> mat1(sizea, sizeb, sizec); | |
Tensor<float, 3> mat2(sizea, sizeb, sizec); | |
Tensor<float, 3> result(sizea, sizeb, sizec); | |
int repeats = 1000; | |
for (int i = 0; i < repeats; i++) | |
{ | |
result += selector.select(mat1, mat2); | |
} | |
``` | |
LLVM on Windows: | |
SSE: -0.13% | |
AVX: -4.40% | |
AVX2: -0.43% | |
With MSVC on Windows (besides run times being 3x longer), the current AVX code is actually faster than AVX2. Could it be that MSVC does not like to mix n' match integer and float code? | |
### Additional information",Charles Schlosser,2024-04-19T02:02:28.563Z,NA,NA | |
1593 (https://gitlab.com/libeigen/eigen/-/merge_requests/1593),Eigen select,"### Reference issue | |
Fixes #2734 | |
### What does this implement/fix? | |
Specializes the ternary evaluator to intercept common expressions like `(a < b).select(c, d)` and use typed, fully vectorized comparisons. The template specialization kicks in if: | |
- `TernaryOp` is `scalar_boolean_select_op<Scalar, Scalar, bool>` | |
- `Arg3Type` is `CwiseBinaryOp<scalar_cmp_op<Scalar, Scalar, cmp, false>, CmpLhsType, CmpRhsType>` | |
and replaces the implementation with: | |
- `scalar_boolean_select_op<Scalar, Scalar, Scalar>` | |
- `CwiseBinaryOp<scalar_cmp_op<Scalar, Scalar, cmp, true>, CmpLhsType, CmpRhsType>` | |
Since the scalar type of the output has nothing to do with the internal comparison, it should be safe. And the usual vectorization logic should process the various expressions that are passed to it like normal. | |
### Additional information",Charles Schlosser,2024-04-19T17:52:35.571Z,NA,NA | |
1594 (https://gitlab.com/libeigen/eigen/-/merge_requests/1594),Fix `tridiagonalization_inplace_selector::run()` when called from CUDA,"### Reference issue | |
Fix #2810 | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
Adding `EIGEN_DEVICE_FUNC` to `tridiagonalization_inplace_selector::run()` | |
### Additional information | |
<!--Any additional information you think is important.--> | |
This code [here](https://github.com/Ahdhn/EigenCUDA) tests this bugfix",Ahmed Mahmoud,2024-04-19T21:06:59.991Z,NA,NA | |
1595 (https://gitlab.com/libeigen/eigen/-/merge_requests/1595),Update CI scripts.,"- Fix some windows issues with cache and folder creation | |
- Add AVX tests | |
- Disable MSVC+CUDA 9.2 due to compiler bug | |
- Add some scripts to replicate CI environment locally for Windows",Antonio Sánchez,2024-04-20T01:08:20.882Z,NA,NA | |
1597 (https://gitlab.com/libeigen/eigen/-/merge_requests/1597),fix autodiff enum comparison warnings,"<!-- | |
Thanks for contributing a merge request! Please name and fully describe your MR as you would for a commit message. | |
If the MR fixes an issue, please include ""Fixes #issue"" in the commit message and the MR description. | |
In addition, we recommend that first-time contributors read our [contribution guidelines](https://eigen.tuxfamily.org/index.php?title=Contributing_to_Eigen) and [git page](https://eigen.tuxfamily.org/index.php?title=Git), which will help you submit a more standardized MR. | |
Before submitting the MR, you also need to complete the following checks: | |
- Make one PR per feature/bugfix (don't mix multiple changes into one PR). Avoid committing unrelated changes. | |
- Rebase before committing | |
- For code changes, run the test suite (at least the tests that are likely affected by the change). | |
See our [test guidelines](https://eigen.tuxfamily.org/index.php?title=Tests). | |
- If possible, add a test (both for bug-fixes as well as new features) | |
- Make sure new features are documented | |
Note that we are a team of volunteers; we appreciate your patience during the review process. | |
Again, thanks for contributing! --> | |
### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2024-04-22T18:14:21.431Z,NA,NA | |
1596 (https://gitlab.com/libeigen/eigen/-/merge_requests/1596),Fix unused variable warnings in TensorIO,"<!-- | |
Thanks for contributing a merge request! Please name and fully describe your MR as you would for a commit message. | |
If the MR fixes an issue, please include ""Fixes #issue"" in the commit message and the MR description. | |
In addition, we recommend that first-time contributors read our [contribution guidelines](https://eigen.tuxfamily.org/index.php?title=Contributing_to_Eigen) and [git page](https://eigen.tuxfamily.org/index.php?title=Git), which will help you submit a more standardized MR. | |
Before submitting the MR, you also need to complete the following checks: | |
- Make one PR per feature/bugfix (don't mix multiple changes into one PR). Avoid committing unrelated changes. | |
- Rebase before committing | |
- For code changes, run the test suite (at least the tests that are likely affected by the change). | |
See our [test guidelines](https://eigen.tuxfamily.org/index.php?title=Tests). | |
- If possible, add a test (both for bug-fixes as well as new features) | |
- Make sure new features are documented | |
Note that we are a team of volunteers; we appreciate your patience during the review process. | |
Again, thanks for contributing! --> | |
### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2024-04-22T18:14:55.609Z,NA,NA | |
1598 (https://gitlab.com/libeigen/eigen/-/merge_requests/1598),fix transposed matrix product bug,"<!-- | |
Thanks for contributing a merge request! Please name and fully describe your MR as you would for a commit message. | |
If the MR fixes an issue, please include ""Fixes #issue"" in the commit message and the MR description. | |
In addition, we recommend that first-time contributors read our [contribution guidelines](https://eigen.tuxfamily.org/index.php?title=Contributing_to_Eigen) and [git page](https://eigen.tuxfamily.org/index.php?title=Git), which will help you submit a more standardized MR. | |
Before submitting the MR, you also need to complete the following checks: | |
- Make one PR per feature/bugfix (don't mix multiple changes into one PR). Avoid committing unrelated changes. | |
- Rebase before committing | |
- For code changes, run the test suite (at least the tests that are likely affected by the change). | |
See our [test guidelines](https://eigen.tuxfamily.org/index.php?title=Tests). | |
- If possible, add a test (both for bug-fixes as well as new features) | |
- Make sure new features are documented | |
Note that we are a team of volunteers; we appreciate your patience during the review process. | |
Again, thanks for contributing! --> | |
### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
Fixes #2266 | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
The transpose / adjoint of matrix products is being evaluated correctly, but incurs unnecessary allocations even if `noalias` is called. For example, the snippet below will allocate a useless temporary matrix: | |
```cpp | |
#define EIGEN_RUNTIME_NO_MALLOC // must be defined before including Eigen | |
#include <Eigen/Core> | |
using Eigen::MatrixXd; | |
int main() | |
{ | |
  int n = 10; | |
  MatrixXd A = MatrixXd::Random(n, n); | |
  MatrixXd B = MatrixXd::Random(n, n); | |
  MatrixXd C = MatrixXd::Zero(n, n); | |
  // From here on, any heap allocation inside Eigen triggers an assertion. | |
  Eigen::internal::set_is_malloc_allowed(false); | |
  C.noalias() = (A * B).transpose(); // triggers malloc despite noalias() | |
  Eigen::internal::set_is_malloc_allowed(true); | |
} | |
``` | |
This is because `evaluator<Transpose<Product<Lhs,Rhs>>>` will first create the evaluator for `Product<Lhs,Rhs>`, which by default creates a temporary. An easy fix is to write out the expanded matrix product `C.noalias() = B.transpose() * A.transpose();` which is handled correctly. This patch specializes `transpose()` and `adjoint()` to return an expanded expression of the transposed product. This is arguably syntactic sugar but the issue is so subtle it could be causing unnecessary allocations in many codebases. | |
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2024-04-23T03:25:58.676Z,NA,NA | |
1599 (https://gitlab.com/libeigen/eigen/-/merge_requests/1599),Don't let the PPC runner try to cross-compile.,"It seems to always fail with ""error: unknown emulation: elf64lppc"" | |
when trying to build non-ppc targets. | |
A new tag ""cross-compiler"" was added to the arm/intel runners.",Antonio Sánchez,2024-04-23T03:40:41.505Z,NA,NA | |
1601 (https://gitlab.com/libeigen/eigen/-/merge_requests/1601),Fix sin/cos on PPC.,The missing comparison function was causing the wrong sin/cos to sometimes be selected.,Antonio Sánchez,2024-04-24T23:09:56.402Z,NA,NA | |
1602 (https://gitlab.com/libeigen/eigen/-/merge_requests/1602),Slightly adjust error bound for nonlinear tests.,"With AVX and without FMA, we have a slightly increased error, but the algorithm | |
has still clearly converged and the results are correct.",Antonio Sánchez,2024-04-25T18:04:49.983Z,NA,NA | |
1604 (https://gitlab.com/libeigen/eigen/-/merge_requests/1604),Unbork avx512 preduce_mul on MSVC.,"MSVC seems to have a buggy implementation of `_mm512_reduce_mul_epi64` | |
when the output is negative, instead producing what seems like garbage. | |
We need to use a different implementation.",Antonio Sánchez,2024-04-26T15:28:04.891Z,NA,NA | |
1606 (https://gitlab.com/libeigen/eigen/-/merge_requests/1606),Fix undefined behavior for generating inputs to the predux_mul test.,"For a packet size of 1, we had a signed int overflow cast that was | |
causing some tests to fail. Fixing the input allows the test to pass.",Antonio Sánchez,2024-04-29T20:32:10.473Z,NA,NA | |
1607 (https://gitlab.com/libeigen/eigen/-/merge_requests/1607),Fix more hard-coded magic bounds.,"Depending on platform, the hard-coded error bound for the nonlinear | |
tests is too tight. Ever-so-slightly relaxing it allows the tests to | |
pass.",Antonio Sánchez,2024-04-29T21:21:12.216Z,NA,NA | |
1605 (https://gitlab.com/libeigen/eigen/-/merge_requests/1605),Remove unnecessary semicolons.,"<!-- | |
Thanks for contributing a merge request! Please name and fully describe your MR as you would for a commit message. | |
If the MR fixes an issue, please include ""Fixes #issue"" in the commit message and the MR description. | |
In addition, we recommend that first-time contributors read our [contribution guidelines](https://eigen.tuxfamily.org/index.php?title=Contributing_to_Eigen) and [git page](https://eigen.tuxfamily.org/index.php?title=Git), which will help you submit a more standardized MR. | |
Before submitting the MR, you also need to complete the following checks: | |
- Make one PR per feature/bugfix (don't mix multiple changes into one PR). Avoid committing unrelated changes. | |
- Rebase before committing | |
- For code changes, run the test suite (at least the tests that are likely affected by the change). | |
See our [test guidelines](https://eigen.tuxfamily.org/index.php?title=Tests). | |
- If possible, add a test (both for bug-fixes as well as new features) | |
- Make sure new features are documented | |
Note that we are a team of volunteers; we appreciate your patience during the review process. | |
Again, thanks for contributing! --> | |
### Reference issue | |
N/A - fix for downstream builds that depend on eigen. | |
### What does this implement/fix? | |
Removes unnecessary semicolons in two places at the end of function declarations. | |
### Additional information | |
<!--Any additional information you think is important.-->",Jonathan Freed,2024-04-29T21:31:27.622Z,NA,NA | |
1493 (https://gitlab.com/libeigen/eigen/-/merge_requests/1493),Add truncation op,"<!-- | |
Thanks for contributing a merge request! Please name and fully describe your MR as you would for a commit message. | |
If the MR fixes an issue, please include ""Fixes #issue"" in the commit message and the MR description. | |
In addition, we recommend that first-time contributors read our [contribution guidelines](https://eigen.tuxfamily.org/index.php?title=Contributing_to_Eigen) and [git page](https://eigen.tuxfamily.org/index.php?title=Git), which will help you submit a more standardized MR. | |
Before submitting the MR, you also need to complete the following checks: | |
- Make one PR per feature/bugfix (don't mix multiple changes into one PR). Avoid committing unrelated changes. | |
- Rebase before committing | |
- For code changes, run the test suite (at least the tests that are likely affected by the change). | |
See our [test guidelines](https://eigen.tuxfamily.org/index.php?title=Tests). | |
- If possible, add a test (both for bug-fixes as well as new features) | |
- Make sure new features are documented | |
Note that we are a team of volunteers; we appreciate your patience during the review process. | |
Again, thanks for contributing! --> | |
### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
While we do have the other std:: nearest integer functions: | |
- `floor` (to negative infinity), | |
- `ceil` (to positive infinity), | |
- `round` (to nearest with ties away from zero), and | |
- `rint` (whatever the current rounding mode is) | |
We were inexplicably missing `trunc` (to zero). I needed this for something I was working on. | |
In addition to adding the various packet ops, I cleaned and consolidated the generic implementations used by SSE2 and NEON, including adding a generic implementation for `pround`. | |
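For reference, the difference between these rounding functions is easiest to see on a negative half-way value. The sketch below uses only the standard `<cmath>` functions, not Eigen's packet ops; the `NearestInts` struct and `nearest_ints` helper are hypothetical names introduced for illustration.

```cpp
#include <cmath>

// Hypothetical helper for illustration only; not part of Eigen.
struct NearestInts {
  double floor_v;  // toward negative infinity
  double ceil_v;   // toward positive infinity
  double round_v;  // to nearest, ties away from zero
  double rint_v;   // per the current rounding mode (to-nearest-even by default)
  double trunc_v;  // toward zero
};

inline NearestInts nearest_ints(double x) {
  return NearestInts{std::floor(x), std::ceil(x), std::round(x), std::rint(x), std::trunc(x)};
}
```

With the default rounding mode, `nearest_ints(-2.5)` gives -3 for `floor`, -2 for `ceil`, -3 for `round`, -2 for `rint`, and -2 for `trunc`.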
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2024-04-29T23:45:50.698Z,NA,NA | |
1600 (https://gitlab.com/libeigen/eigen/-/merge_requests/1600),Eigen transpose product,"<!-- | |
Thanks for contributing a merge request! Please name and fully describe your MR as you would for a commit message. | |
If the MR fixes an issue, please include ""Fixes #issue"" in the commit message and the MR description. | |
In addition, we recommend that first-time contributors read our [contribution guidelines](https://eigen.tuxfamily.org/index.php?title=Contributing_to_Eigen) and [git page](https://eigen.tuxfamily.org/index.php?title=Git), which will help you submit a more standardized MR. | |
Before submitting the MR, you also need to complete the following checks: | |
- Make one PR per feature/bugfix (don't mix multiple changes into one PR). Avoid committing unrelated changes. | |
- Rebase before committing | |
- For code changes, run the test suite (at least the tests that are likely affected by the change). | |
See our [test guidelines](https://eigen.tuxfamily.org/index.php?title=Tests). | |
- If possible, add a test (both for bug-fixes as well as new features) | |
- Make sure new features are documented | |
Note that we are a team of volunteers; we appreciate your patience during the review process. | |
Again, thanks for contributing! --> | |
### Reference issue | |
<!-- You can link to a specific issue using the gitlab syntax #<issue number> --> | |
### What does this implement/fix? | |
<!--Please explain your changes.--> | |
Follow up on https://gitlab.com/libeigen/eigen/-/merge_requests/1598, which I reverted due to a few loose ends. | |
Essentially, products written like `C.noalias() = (A * B).transpose()` trigger an avoidable memory allocation, which is never desirable for time and memory complexity, especially for users who have strict memory constraints. This can easily be avoided by writing the product with an algebraically equivalent expression `C.noalias() = B.transpose() * A.transpose()`. This MR does that by specializing `Product::transpose()` and `Product::adjoint()` to do just that. | |
The tricky part is that `transpose()` may not be consistently defined for each expression. `DenseBase<Derived>::transpose()` returns `Transpose<Derived>`. `PermutationBase<Derived>::transpose()` returns `Inverse<Derived>`. It is still worth handling permutation matrices, as `(A * P).transpose()` still triggers an avoidable allocation that is avoided by writing `P.inverse() * A.transpose()`. | |
To cover all the scenarios that I have not considered, I am defaulting the transposed product to the current implementation where we just transpose the result. If other expression combinations are identified that could benefit from this strategy, we can add an additional specialization. | |
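The rewrites above rest on standard identities, restated here for reference using the same symbols as the text ($A$, $B$ dense matrices, $P$ a permutation matrix):

$$
(AB)^{T} = B^{T}A^{T}, \qquad (AB)^{*} = B^{*}A^{*}
$$

A permutation matrix is orthogonal, so $P^{T} = P^{-1}$, and therefore

$$
(AP)^{T} = P^{T}A^{T} = P^{-1}A^{T}
$$

which is why `P.inverse() * A.transpose()` is an algebraically equivalent way to write `(A * P).transpose()`.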
### Additional information | |
<!--Any additional information you think is important.-->",Charles Schlosser,2024-04-30T13:32:54.067Z,NA,NA | |
1610 (https://gitlab.com/libeigen/eigen/-/merge_requests/1610),Fix new generic nearest integer ops on GPU.,NA,Rasmus Munk Larsen,2024-04-30T22:18:26.600Z,NA,NA | |
1609 (https://gitlab.com/libeigen/eigen/-/merge_requests/1609),Judge unitary-ness relative to scaling.,"We have some flaky tests when checking if eigenvectors are all orthonormal, | |
since our error increases with the scaling (or largest eigenvalue) of | |
the original matrix. Modify the test to increase the error tolerance.",Antonio Sánchez,2024-04-30T22:28:47.600Z,NA,NA | |
1556 (https://gitlab.com/libeigen/eigen/-/merge_requests/1556),Reorganize CMake and minimize configuration for non-top-level builds.,"Main changes: | |
- Re-arranged main CMakeLists.txt to group target types together | |
- ~~Only add install targets if top-level~~ edit: re-enabled to support dependency installations. See comments in the CMakeLists.txt file. | |
- Only check for math library + other build settings if we're actually building _something_ | |
- Only configure tests and add test build flags if `EIGEN_BUILD_TESTING` | |
- Don't exclude BLAS/LAPACK from the `all` target, so the install target will work properly if `EIGEN_BUILD_BLAS`/`EIGEN_BUILD_LAPACK` are on | |
- Remove clang-format 9 target (we didn't use it and don't need it now with `git clang-format`) | |
- Explicitly add `Eigen3::Eigen |