- OpenBLAS can be built with `NOFORTRAN`; in that case it provides CBLAS plus an f2c'd LAPACK.
- OpenBLAS can be built without LAPACK support; Arch Linux currently does this (see scipy/scipy#17465).
- The OpenBLAS library can be renamed with an option in its Makefile.
- Build options:
  - conda-forge: https://github.com/conda-forge/openblas-feedstock/blob/49ca08fc9d1ff220804aa9b894b9a6fe5db45057/recipe/conda_build_config.yaml
  - Spack: https://github.com/spack/spack/blob/develop/var/spack/repos/builtin/packages/openblas/package.py#L52
  - vendored into NumPy/SciPy wheels: https://github.com/MacPython/openblas-libs/blob/master/tools/build_openblas.sh#L53
(lp64) $ ls lib/libopenblas*
lib/libopenblas.a lib/libopenblasp-r0.3.21.a lib/libopenblasp-r0.3.21.so lib/libopenblas.so lib/libopenblas.so.0
(lp64) $ ls include/
cblas.h f77blas.h lapacke_config.h lapacke.h lapacke_mangling.h lapacke_utils.h lapack.h openblas_config.h
(ilp64) $ ls lib/libopenblas*
lib/libopenblas64_.a lib/libopenblas64_p-r0.3.21.a lib/libopenblas64_p-r0.3.21.so lib/libopenblas64_.so lib/libopenblas64_.so.0
(ilp64) $ ls include/
cblas.h f77blas.h lapacke_config.h lapacke.h lapacke_mangling.h lapacke_utils.h lapack.h openblas_config.h
(lp64) $ pkg-config --cflags openblas
-I/path/to/env/include
(lp64) $ pkg-config --libs openblas
-L/path/to/env/lib -lopenblas
(ilp64) $ pkg-config --cflags openblas
-I/path/to/env/include
(ilp64) $ pkg-config --libs openblas
-L/path/to/env/lib -lopenblas
Relevant discussions:
- For OpenBLAS, see OpenMathLib/OpenBLAS#646 for standardized agreement on shared library name and symbol suffix.
- PRs that added support for ILP64 to `numpy.distutils`:
  - with `64_` symbol suffix: numpy/numpy#15012
  - generalized to non-suffixed build: numpy/numpy#15069
- PR that added support for ILP64 to SciPy: scipy/scipy#11302
- Note to self: when SciPy uses ILP64, it also still requires LP64, because not all submodules support ILP64. Not for the same extensions though.
$ objdump -t lp64/lib/libopenblas.so | grep -E "scopy*"  # output cleaned up
cblas_scopy scopy_
$ objdump -t ilp64/lib/libopenblas64_.so | grep -E "scopy*"
cblas_scopy64_ scopy_64_
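The symbol listings above can be checked mechanically. The following is a minimal sketch, assuming the cleaned-up symbol names are available as whitespace-separated text; the function name `detect_symbol_suffix` is hypothetical and not part of Meson, NumPy, or OpenBLAS.

```python
import re


def detect_symbol_suffix(symbol_text, base="scopy"):
    """Return the suffix found after the Fortran symbol `<base>_`.

    Returns "" for an unsuffixed symbol (`scopy_`), "64_" for the suffixed
    OpenBLAS ILP64 convention (`scopy_64_`), or None if no `<base>_...`
    symbol is present in the text at all.
    """
    pattern = re.compile(r"^" + re.escape(base) + r"_(\w*)$")
    suffixes = set()
    for token in symbol_text.split():
        match = pattern.match(token)
        if match:
            suffixes.add(match.group(1))
    if not suffixes:
        return None
    # A library may export several aliases; report the longest suffix seen.
    return max(suffixes, key=len)


lp64_symbols = "cblas_scopy scopy_"
ilp64_symbols = "cblas_scopy64_ scopy_64_"
print(repr(detect_symbol_suffix(lp64_symbols)))
print(repr(detect_symbol_suffix(ilp64_symbols)))
```

Note that an empty suffix does not by itself prove that a library is LP64: a non-suffixed ILP64 build exports the same names, so this check only distinguishes suffixed from unsuffixed builds.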
What should be implemented in Meson, and what should be left to users? Thoughts:
- Meson should support a keyword to select the desired interface (`interface : 'ilp64'`), defaulting to `'lp64'` because that is what reference BLAS/LAPACK provide and what is typically expected.
- The dependency object returned by `dependency('openblas')` or similar should be queryable for which interface it provides.
- For OpenBLAS, Meson should look for `libopenblas64_.so` for ILP64.
- For Fortran, should Meson automatically set the required compile flag `-fdefault-integer-8`?
  - Note: this flag is specific to gfortran and Clang; for Intel compilers it is `-integer-size 64` (Linux/macOS) or `/integer-size:64` (Windows).
  - This is almost always the right thing to do; however, integer variables that reference an external non-BLAS/LAPACK interface and must not be changed to 64-bit should then be declared explicitly as `integer*4` in the user code.
- Users are responsible for implementing name mangling, i.e., appending `64_` to BLAS/LAPACK symbols when they are requesting ILP64, and also for using portable integer types in their code if they want to be able to build with both LP64 and ILP64. This typically looks something like:

      #ifdef HAVE_CBLAS64_
      #define CBLAS_FUNC(name) name ## 64_
      #else
      #define CBLAS_FUNC(name) name
      #endif

      CBLAS_FUNC(cblas_dgemm)(...);
- Users are responsible for implementing a build option (e.g., in `meson_options.txt`) if they want to allow switching between LP64 and ILP64.
- Meson doesn't know anything about f2py; users have to instruct f2py to use 64-bit integers with `--f2cmap` or similar. See `get_f2py_int64_options` in SciPy for details.
- When mixing C and Fortran code, the C code typically needs mangling because Fortran expects a trailing underscore. This is up to the user to implement.
- TBD: `numpy.distutils` does a symbol prefix/suffix check and provides the result to its users; it could be helpful if Meson did this too. See https://github.com/numpy/numpy/blob/6094eff9/numpy/distutils/system_info.py#L2271-L2278.
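To make the mangling rules in this list concrete, here is a small sketch of the naming convention for a suffixed OpenBLAS build, written in Python for illustration only. The helper names `openblas_library_name` and `blas_symbol` are hypothetical; the only facts assumed are the library and symbol names shown in the OpenBLAS listings earlier in these notes. MKL and Accelerate follow different conventions.

```python
def openblas_library_name(interface="lp64"):
    """Library name to link against (as passed to `-l`) for OpenBLAS."""
    if interface not in ("lp64", "ilp64"):
        raise ValueError(f"unknown interface: {interface!r}")
    return "openblas64_" if interface == "ilp64" else "openblas"


def blas_symbol(name, interface="lp64", fortran=True):
    """Mangle a BLAS/LAPACK routine name for a suffixed OpenBLAS build.

    Fortran symbols get a trailing underscore first; ILP64 builds then
    append the `64_` suffix. CBLAS symbols only get the `64_` suffix.
    """
    if interface not in ("lp64", "ilp64"):
        raise ValueError(f"unknown interface: {interface!r}")
    suffix = "64_" if interface == "ilp64" else ""
    if fortran:
        return f"{name}_{suffix}"
    return f"{name}{suffix}"


print(openblas_library_name("ilp64"), blas_symbol("scopy", "ilp64"))
print(openblas_library_name("lp64"), blas_symbol("cblas_scopy", "lp64", fortran=False))
```

This mirrors what the `CBLAS_FUNC` macro does in C; a project supporting both interfaces would typically derive the suffix from its own build option rather than hard-coding it.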
Initial rough notes:
- Not all implementations provide CBLAS.
- The header is typically named `cblas.h`; however, MKL calls it `mkl_cblas.h`.
- OpenBLAS can be built without a Fortran compiler; in that case it's CBLAS + f2c'd LAPACK.
- It would be useful if the dependency objects that Meson returns could be queried for whether CBLAS is present or not.
- `numpy.distutils` detects CBLAS and defines `HAVE_CBLAS` if it's found.
- BLIS doesn't build the CBLAS interface by default. To build it, define `BLIS_ENABLE_CBLAS`.
- NumPy requires CBLAS; it's not optional.
- MKL docs: https://www.intel.com/content/www/us/en/develop/documentation/onemkl-windows-developer-guide/top/linking-your-application-with-onemkl/linking-in-detail/linking-with-interface-libraries/using-the-ilp64-interface-vs-lp64-interface.html
- MKL link line advisor: https://www.intel.com/content/www/us/en/developer/tools/oneapi/onemkl-link-line-advisor.html
$ cd /opt/intel/oneapi/mkl/latest/lib # from recommended Intel installer
$ ls pkgconfig/
mkl-dynamic-ilp64-iomp.pc mkl-dynamic-lp64-iomp.pc mkl-static-ilp64-iomp.pc mkl-static-lp64-iomp.pc
mkl-dynamic-ilp64-seq.pc mkl-dynamic-lp64-seq.pc mkl-static-ilp64-seq.pc mkl-static-lp64-seq.pc
$ pkg-config --libs mkl-dynamic-ilp64-seq
-L/opt/intel/oneapi/mkl/latest/lib/pkgconfig/../../lib/intel64 -lmkl_intel_ilp64 -lmkl_sequential -lmkl_core -lpthread -lm -ldl
$ pkg-config --cflags mkl-dynamic-ilp64-seq
-DMKL_ILP64 -I/opt/intel/oneapi/mkl/latest/lib/pkgconfig/../../include
$ ls intel64/libmkl*
intel64/libmkl_avx2.so.2 intel64/libmkl_cdft_core.so.2 intel64/libmkl_intel_lp64.so.2 intel64/libmkl_sequential.a
intel64/libmkl_avx512.so.2 intel64/libmkl_core.a intel64/libmkl_intel_thread.a intel64/libmkl_sequential.so
intel64/libmkl_avx.so.2 intel64/libmkl_core.so intel64/libmkl_intel_thread.so intel64/libmkl_sequential.so.2
intel64/libmkl_blacs_intelmpi_ilp64.a intel64/libmkl_core.so.2 intel64/libmkl_intel_thread.so.2 intel64/libmkl_sycl.a
intel64/libmkl_blacs_intelmpi_ilp64.so intel64/libmkl_def.so.2 intel64/libmkl_lapack95_ilp64.a intel64/libmkl_sycl.so
intel64/libmkl_blacs_intelmpi_ilp64.so.2 intel64/libmkl_gf_ilp64.a intel64/libmkl_lapack95_lp64.a intel64/libmkl_sycl.so.2
intel64/libmkl_blacs_intelmpi_lp64.a intel64/libmkl_gf_ilp64.so intel64/libmkl_mc3.so.2 intel64/libmkl_tbb_thread.a
intel64/libmkl_blacs_intelmpi_lp64.so intel64/libmkl_gf_ilp64.so.2 intel64/libmkl_mc.so.2 intel64/libmkl_tbb_thread.so
intel64/libmkl_blacs_intelmpi_lp64.so.2 intel64/libmkl_gf_lp64.a intel64/libmkl_pgi_thread.a intel64/libmkl_tbb_thread.so.2
intel64/libmkl_blacs_openmpi_ilp64.a intel64/libmkl_gf_lp64.so intel64/libmkl_pgi_thread.so intel64/libmkl_vml_avx2.so.2
intel64/libmkl_blacs_openmpi_ilp64.so intel64/libmkl_gf_lp64.so.2 intel64/libmkl_pgi_thread.so.2 intel64/libmkl_vml_avx512.so.2
intel64/libmkl_blacs_openmpi_ilp64.so.2 intel64/libmkl_gnu_thread.a intel64/libmkl_rt.so intel64/libmkl_vml_avx.so.2
intel64/libmkl_blacs_openmpi_lp64.a intel64/libmkl_gnu_thread.so intel64/libmkl_rt.so.2 intel64/libmkl_vml_cmpt.so.2
intel64/libmkl_blacs_openmpi_lp64.so intel64/libmkl_gnu_thread.so.2 intel64/libmkl_scalapack_ilp64.a intel64/libmkl_vml_def.so.2
intel64/libmkl_blacs_openmpi_lp64.so.2 intel64/libmkl_intel_ilp64.a intel64/libmkl_scalapack_ilp64.so intel64/libmkl_vml_mc2.so.2
intel64/libmkl_blas95_ilp64.a intel64/libmkl_intel_ilp64.so intel64/libmkl_scalapack_ilp64.so.2 intel64/libmkl_vml_mc3.so.2
intel64/libmkl_blas95_lp64.a intel64/libmkl_intel_ilp64.so.2 intel64/libmkl_scalapack_lp64.a intel64/libmkl_vml_mc.so.2
intel64/libmkl_cdft_core.a intel64/libmkl_intel_lp64.a intel64/libmkl_scalapack_lp64.so
intel64/libmkl_cdft_core.so intel64/libmkl_intel_lp64.so intel64/libmkl_scalapack_lp64.so.2
$ objdump -t intel64/libmkl_intel_ilp64.so | grep scopy # cleaned up output:
0000000000000000 *UND* 0000000000000000 mkl_blas_scopy
0000000000323000 g F .text 0000000000000030 cblas_scopy_64
00000000002cecf0 g F .text 0000000000000030 cblas_scopy
000000000025aca0 g F .text 00000000000001d0 mkl_blas__scopy
000000000025aca0 g F .text 00000000000001d0 scopy_64_
000000000025aca0 g F .text 00000000000001d0 scopy_64
000000000025aca0 g F .text 00000000000001d0 scopy_
000000000025aca0 g F .text 00000000000001d0 scopy
tl;dr: for MKL there are 8 pkg-config files, so you choose lp64/ilp64, dynamic/static, and sequential/OpenMP threading (`seq`/`iomp`); after that you can pick whatever symbols you like, since they all exist and are aliases.
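The three choices map directly onto the pkg-config file names listed above. A minimal sketch of that mapping follows; the function names are hypothetical, and the naming scheme is inferred only from the eight files in the `pkgconfig/` listing.

```python
from itertools import product

# The three independent choices, in the order they appear in the file name.
MKL_CHOICES = {
    "linkage": ("dynamic", "static"),
    "interface": ("lp64", "ilp64"),
    "threading": ("seq", "iomp"),
}


def mkl_pkgconfig_name(interface="lp64", linkage="dynamic", threading="seq"):
    """Build the pkg-config module name for one MKL configuration."""
    selected = {"linkage": linkage, "interface": interface, "threading": threading}
    for key, value in selected.items():
        if value not in MKL_CHOICES[key]:
            raise ValueError(f"unknown {key}: {value!r}")
    return f"mkl-{linkage}-{interface}-{threading}"


def all_mkl_pkgconfig_names():
    """Enumerate every combination (2 x 2 x 2 = 8 names)."""
    return sorted(
        mkl_pkgconfig_name(interface, linkage, threading)
        for linkage, interface, threading in product(*MKL_CHOICES.values())
    )


print(mkl_pkgconfig_name("ilp64"))
print(len(all_mkl_pkgconfig_names()))
```

The resulting name is what gets passed to `pkg-config` or, in the SciPy test in these notes, to the `-Dblas=` and `-Dlapack=` options.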
A test with SciPy & MKL:
$ # No pkg-config files for MKL in conda-forge yet, so use the Intel installer:
$ export PKG_CONFIG_PATH=/opt/intel/oneapi/mkl/latest/lib/pkgconfig/
$ meson setup build --prefix=$PWD/build-install -Dblas=mkl-dynamic-ilp64-seq -Dlapack=mkl-dynamic-ilp64-seq
$ python dev.py build
$ ldd build/scipy/linalg/_flapack.cpython-310-x86_64-linux-gnu.so
linux-vdso.so.1 (0x00007ffe2cd14000)
libmkl_intel_ilp64.so.2 => /opt/intel/oneapi/mkl/latest/lib/intel64/libmkl_intel_ilp64.so.2 (0x00007f75a5000000)
libmkl_sequential.so.2 => /opt/intel/oneapi/mkl/latest/lib/intel64/libmkl_sequential.so.2 (0x00007f75a3400000)
libmkl_core.so.2 => /opt/intel/oneapi/mkl/latest/lib/intel64/libmkl_core.so.2 (0x00007f759f000000)
libm.so.6 => /usr/lib/libm.so.6 (0x00007f75a62e2000)
libc.so.6 => /usr/lib/libc.so.6 (0x00007f759ee19000)
/usr/lib64/ld-linux-x86-64.so.2 (0x00007f75a63e8000)
libdl.so.2 => /usr/lib/libdl.so.2 (0x00007f75a62db000)
libpthread.so.0 => /usr/lib/libpthread.so.0 (0x00007f75a62d6000)
$ # Due to RPATH stripping on install, this doesn't actually work unless we put MKL into our conda env:
$ ldd build-install/lib/python3.10/site-packages/scipy/linalg/_flapack.cpython-310-x86_64-linux-gnu.so
linux-vdso.so.1 (0x00007ffdf6bc6000)
libmkl_intel_ilp64.so.2 => not found
libmkl_sequential.so.2 => not found
libmkl_core.so.2 => not found
libm.so.6 => /usr/lib/libm.so.6 (0x00007f7e35318000)
libc.so.6 => /usr/lib/libc.so.6 (0x00007f7e35131000)
/usr/lib64/ld-linux-x86-64.so.2 (0x00007f7e356e5000)
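A check like the one above is easy to automate after an install step. The sketch below parses `ldd` output that has already been captured as text and reports the unresolved libraries; the function name `missing_libraries` is hypothetical, and the parsing assumes the `name => not found` format shown in the transcript.

```python
def missing_libraries(ldd_output):
    """Return the names of shared libraries that ldd reports as not found."""
    missing = []
    for line in ldd_output.splitlines():
        # Resolved entries look like "name => /path (0x...)"; unresolved
        # ones look like "name => not found". Other lines have no " => ".
        name, separator, target = line.strip().partition(" => ")
        if separator and target.strip() == "not found":
            missing.append(name)
    return missing


sample = """\
linux-vdso.so.1 (0x00007ffdf6bc6000)
libmkl_intel_ilp64.so.2 => not found
libmkl_sequential.so.2 => not found
libmkl_core.so.2 => not found
libm.so.6 => /usr/lib/libm.so.6 (0x00007f7e35318000)
"""
print(missing_libraries(sample))
```

In practice the text would come from running `ldd` on the installed extension module, for example via `subprocess.run(..., capture_output=True, text=True)`.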
Also need to remember that MKL uses a g77 ABI (as does Accelerate), while OpenBLAS and most other BLAS libraries use the gfortran ABI. See the `use-g77-abi` option in SciPy's `meson_options.txt`.
Pkg-config file names for ArmPL to use (from spack/spack#34979 (comment)):
armpl-dynamic-ilp64-omp armpl-Fortran-static-ilp64-omp
armpl-dynamic-ilp64-omp.pc armpl-Fortran-static-ilp64-omp.pc
armpl-dynamic-ilp64-seq armpl-Fortran-static-ilp64-seq
armpl-dynamic-ilp64-seq.pc armpl-Fortran-static-ilp64-seq.pc
armpl-dynamic-lp64-omp armpl-Fortran-static-lp64-omp
armpl-dynamic-lp64-omp.pc armpl-Fortran-static-lp64-omp.pc
armpl-dynamic-lp64-seq armpl-Fortran-static-lp64-seq
armpl-dynamic-lp64-seq.pc armpl-Fortran-static-lp64-seq.pc
armpl-Fortran-dynamic-ilp64-omp armpl-static-ilp64-omp
armpl-Fortran-dynamic-ilp64-omp.pc armpl-static-ilp64-omp.pc
armpl-Fortran-dynamic-ilp64-seq armpl-static-ilp64-seq
armpl-Fortran-dynamic-ilp64-seq.pc armpl-static-ilp64-seq.pc
armpl-Fortran-dynamic-lp64-omp armpl-static-lp64-omp
armpl-Fortran-dynamic-lp64-omp.pc armpl-static-lp64-omp.pc
armpl-Fortran-dynamic-lp64-seq armpl-static-lp64-seq
armpl-Fortran-dynamic-lp64-seq.pc armpl-static-lp64-seq.pc
Note that as of Jan '23, ArmPL ships these files without a `.pc` extension (which will hopefully be fixed); Spack therefore adds `.pc` copies of the original files.
In macOS >=13.3, two LP64 builds and one ILP64 build of vecLib are shipped. For compatibility, the legacy interfaces (providing LAPACK 3.2.1) are used by default. To use the new interfaces (providing LAPACK 3.9.1), including ILP64, it is necessary to set some `#define`s before including the Accelerate / vecLib headers:
- `-DACCELERATE_NEW_LAPACK`: use the new interfaces
- `-DACCELERATE_LAPACK_ILP64`: use the new ILP64 interfaces (note that this requires `-DACCELERATE_NEW_LAPACK` to be set as well)

The normal F77 symbols will remain as the legacy implementation. The newer interfaces have separate symbols with the suffixes `$NEWLAPACK` or `$NEWLAPACK$ILP64`.
Example binary symbols:
- `_dgemm_`: this is the legacy implementation
- `_dgemm$NEWLAPACK`: this is the new implementation
- `_dgemm$NEWLAPACK$ILP64`: this is the new ILP64 implementation
If you use Accelerate / vecLib headers with the above defines, you don't need to worry about the symbol names. They'll get aliased correctly.
For headers and linker flags, check if these directories exist before using them:
- `-I/System/Library/Frameworks/vecLib.framework/Headers`, flags: `['-Wl,-framework', '-Wl,Accelerate']`
- `-I/System/Library/Frameworks/vecLib.framework/Headers`, flags: `['-Wl,-framework', '-Wl,vecLib']`
Note that the dylibs are no longer physically present on disk; they're provided by the dyld shared cache.
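The relationship between the defines and the binary symbol names described above can be summarized in a short sketch. The helper names are hypothetical; the flags and symbol patterns are exactly the ones listed in these notes, and nothing here queries the system.

```python
def accelerate_defines(interface="lp64", new_lapack=True):
    """Preprocessor flags to pass before including Accelerate / vecLib headers."""
    if interface not in ("lp64", "ilp64"):
        raise ValueError(f"unknown interface: {interface!r}")
    if interface == "ilp64":
        # ILP64 is only available through the new interfaces, so both
        # defines are required together.
        return ["-DACCELERATE_NEW_LAPACK", "-DACCELERATE_LAPACK_ILP64"]
    return ["-DACCELERATE_NEW_LAPACK"] if new_lapack else []


def accelerate_symbol(name, interface="lp64", new_lapack=True):
    """Binary symbol name for a Fortran-style routine such as `dgemm`."""
    if interface not in ("lp64", "ilp64"):
        raise ValueError(f"unknown interface: {interface!r}")
    if interface == "ilp64":
        return f"_{name}$NEWLAPACK$ILP64"
    if new_lapack:
        return f"_{name}$NEWLAPACK"
    return f"_{name}_"  # legacy implementation


print(accelerate_defines("ilp64"), accelerate_symbol("dgemm", "ilp64"))
```

When the headers are used with these defines, the aliasing happens automatically; a helper like `accelerate_symbol` would only matter for code that declares the routines itself or inspects binaries.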