- Double-check the assertions.
- Merge ArrayCallInstruction with CallInstruction (or: bye-bye, ArrayCall ;)
- Remove the _underscores
- Coefficient collector for the strides
- Handle temporaries
- Stride for all types
- Negotiation for inames??? I.e. don't have multiple l.0 along the interface - read about it
- Type inference: check from user
- Learned about callable
- One line after the mvim folding marker, and title in lower case
- After the discussion with the Firedrake group on 8th February, we started working towards function calls on arrays in Loopy.
- To tackle the generalized problem, we are working towards calling kernels from kernels within Loopy, so that the intended operation on arrays can be packed into a separate callee kernel.
- WIP Merge Request: https://gitlab.tiker.net/inducer/loopy/merge_requests/232
The expected behavior is:
callee_knl = lp.make_kernel(
"{[i, j]: 0<=i, j<16}",
- Context:
argmin/argmax are among the last pieces whose tests still need to be handled. I am having the following trouble integrating these reductions into the new function interface.
Outline of how ArgReductionOp works:
- During the creation phase, it gets identified and tagged as a reduction operation.
- During type inference, it gets interpreted as a single instruction (which is a good thing). Type inference redirects to the result_dtypes method of the class ArgReductionOp.
- At the realize_reduction step in the preprocess part of the pipeline, it gets converted to 3 instructions, one of which is a call of the form Call(<class ArgExtOp>, (*parameters)).
- In the end, we append the inline dtype1 loopy_argmin_dtype1_dtype2_op(...) function to the set of preambles of the device code.
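For orientation, here is a plain-Python sketch of the pairwise combine step that a function like loopy_argmin_dtype1_dtype2_op(...) presumably performs. This is not Loopy's code: the names argmin_op and argmin, the neutral element, and the tie-breaking rule are all assumptions made for illustration.

```python
from functools import reduce


def argmin_op(acc, item):
    """Combine two (value, index) pairs, keeping the pair with the smaller value.

    Hypothetical stand-in for the generated device-side combine function.
    """
    acc_val, acc_idx = acc
    val, idx = item
    # Strict comparison keeps the earliest index on ties (assumption of this sketch).
    if val < acc_val:
        return (val, idx)
    return (acc_val, acc_idx)


def argmin(values):
    """Reduce a sequence to (min_value, index_of_min) using argmin_op."""
    # Neutral element: +infinity with a placeholder index of -1 (assumption).
    neutral = (float("inf"), -1)
    return reduce(argmin_op, ((v, i) for i, v in enumerate(values)), neutral)


min_val, min_idx = argmin([3.0, 1.5, 4.0, 1.5])
```

The point of the sketch is that an arg-reduction carries two accumulators (value and index) through a single binary operation, which is why it needs a two-result call rather than an ordinary scalar reduction.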
#include <math.h>
#include <petsc.h>
static double const form_t0[3 * 3] = { 0.6666666666666669, 0.16666666666666663, 0.16666666666666666, 0.16666666666666674, 0.16666666666666663, 0.6666666666666665, 0.16666666666666669, 0.6666666666666666, 0.16666666666666663 };
static double const form_t4[3] = { 0.16666666666666666, 0.16666666666666666, 0.16666666666666666 };
void wrap_form00_cell_integral_otherwise(int const start, int const end, Mat const mat0, double const *__restrict__ dat0, double const *__restrict__ glob0, int const *__restrict__ map0)
{
  double form_t1;
import loopy as lp
from loopy.isl_helpers import simplify_via_aff
from pymbolic.primitives import CallWithKwargs
from loopy.kernel.function_interface import (get_kw_pos_association,
        register_pymbolic_calls_to_knl_callables)
from loopy.symbolic import IdentityMapper
class DimChanger(IdentityMapper):
    def __init__(self, caller_arg_dict, callee_arg_dict, callee_to_caller_args):
import loopy as lp
import pyopencl as cl
import pyopencl.clrandom as cl_random
import numpy as np
import numpy.linalg as la
from loopy.transform.register_callable import _match_caller_callee_argument_dimension
ctx = cl.create_some_context()
queue = cl.CommandQueue(ctx)
from petsc4py import PETSc
import numpy as np
import coffee.system
from pyop2 import compilation
import ctypes
n = 10
a_petsc = PETSc.Vec().create(PETSc.COMM_WORLD)
from petsc4py import PETSc
import loopy as lp
import numpy as np
import pyopencl as cl
import coffee.system
from pyop2 import compilation
import ctypes
from mako.template import Template
import re
- Do a local install of PETSc.
- The configuration options of PETSc are:
  - CUDA:
    ./configure --download-eigen=/home/kgk2/pack/eigen-3.3.3.tgz --with-fortran-bindings=0 --download-chaco --download-metis --download-parmetis --download-scalapack --download-hypre --download-mumps --download-netcdf --download-hdf5=https://support.hdfgroup.org/ftp/HDF5/releases/hdf5-1.10/hdf5-1.10.6/src/hdf5-1.10.6.tar.bz2 --download-pnetcdf --download-exodusii --download-fblaslapack --with-cuda=1 --with-cuda-dir=/home/kgk2/pack/cuda --download-zlib --with-cudac=nvcc
  - ViennaCL:
    ./configure --download-eigen=/home/kgk2/pack/eigen-3.3.3.tgz --with-fortran-bindings=0 --download-chaco --download-metis --download-parmetis --download-scalapack --download-hypre --download-mumps --download-netcdf --download-hdf5=https://support.hdfgroup.org/ftp/HDF5/releases/hdf5-1.10/hdf5-1.10.6/src/hdf5-1.10.6.tar.bz2 --download-pnetcdf --download-exodusii --download-fblaslapack --download-zlib --with-viennacl=1 --download-viennacl