Kaushik Kulkarni kaushikcfd

  • Make sure about the assertions.
  • Merge ArrayCallInstruction with CallInstruction (or: bye bye, ArrayCall)
  • Remove the _underscores
  • Coefficient collector for the strides
  • Handle temporaries
  • Stride for all types
  • Negotiation for inames? I.e., don't have multiple l.0 along the interface
  • Read about type inference: check from user
  • Learned about callable
  • One line after the mvim folding marker and title with lower case

Introduction

  • After the discussion with the Firedrake group on 8th February, we started working towards function calls on arrays in Loopy.
  • To tackle the general problem, we are working towards calling kernels from kernels within Loopy, so that the intended operation on arrays can be packed into a separate callee kernel.
  • WIP Merge Request: https://gitlab.tiker.net/inducer/loopy/merge_requests/232

The expected behavior is:

callee_knl = lp.make_kernel(
 "{i, j}: 0<=i, j<16",
  • Context: argmin/argmax are among the last pieces whose tests still need to be handled. I am having the following trouble integrating these reductions into the new function interface.

Problems about argmin/argmax

Outline about how ArgReductionOp works:

  • During the creation phase, it gets identified and tagged as a reduction operation.
  • During type inference, it gets interpreted as a single instruction (which is a good thing). Type inference redirects to the result_dtypes method of the class ArgReductionOp.
  • At the realize_reduction step in the preprocess part of the pipeline, it gets converted to 3 instructions, one of which is a call of the form Call(<class ArgExtOp>, (*parameters))
  • In the end, we append the inline dtype1 loopy_argmin_dtype1_dtype2_op(...) function to the set of preambles of the device code.
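To make the outline above concrete, here is a minimal pure-Python sketch of what an argmin reduction computes once it has been split into an initialization, a per-element update, and a final write-back. It is not Loopy's generated code: the names argmin_init, argmin_update and argmin_reduce are hypothetical, argmin_update only stands in for the loopy_argmin_..._op preamble function, and the tie-breaking rule (keep the first minimum) is an assumption made for illustration.

```python
import math


def argmin_init():
    # Neutral element of the reduction: no value seen yet.
    # (Hypothetical helper; -1 marks "no index".)
    return (math.inf, -1)


def argmin_update(acc, value, index):
    # Stand-in for the generated update function: compare the
    # accumulated (value, index) pair against the new candidate.
    # Strict "<" keeps the first minimum; this is an assumption.
    acc_value, acc_index = acc
    if value < acc_value:
        return (value, index)
    return (acc_value, acc_index)


def argmin_reduce(values):
    acc = argmin_init()                 # step 1: initialize the accumulator
    for i, v in enumerate(values):
        acc = argmin_update(acc, v, i)  # step 2: update through the call
    min_value, min_index = acc          # step 3: write back both results
    return min_value, min_index


result = argmin_reduce([3.0, 1.5, 4.0, 1.5])
```

The point of the sketch is that the reduction carries a pair (value, index) rather than a single scalar, which is why the update step is expressed as a function call with two results instead of a plain binary operator.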
#include <math.h>
#include <petsc.h>
static double const form_t0[3 * 3] = { 0.6666666666666669, 0.16666666666666663, 0.16666666666666666, 0.16666666666666674, 0.16666666666666663, 0.6666666666666665, 0.16666666666666669, 0.6666666666666666, 0.16666666666666663 };
static double const form_t4[3] = { 0.16666666666666666, 0.16666666666666666, 0.16666666666666666 };
void wrap_form00_cell_integral_otherwise(int const start, int const end, Mat const mat0, double const *__restrict__ dat0, double const *__restrict__ glob0, int const *__restrict__ map0)
{
double form_t1;
import loopy as lp
from loopy.isl_helpers import simplify_via_aff
from pymbolic.primitives import CallWithKwargs
from loopy.kernel.function_interface import (get_kw_pos_association,
        register_pymbolic_calls_to_knl_callables)
from loopy.symbolic import IdentityMapper
class DimChanger(IdentityMapper):
    def __init__(self, caller_arg_dict, callee_arg_dict, callee_to_caller_args):
import loopy as lp
import pyopencl as cl
import pyopencl.clrandom as cl_random
import numpy as np
import numpy.linalg as la
from loopy.transform.register_callable import _match_caller_callee_argument_dimension
ctx = cl.create_some_context()
queue = cl.CommandQueue(ctx)
from petsc4py import PETSc
import numpy as np
import coffee.system
from pyop2 import compilation
import ctypes
n = 10
a_petsc = PETSc.Vec().create(PETSc.COMM_WORLD)
from petsc4py import PETSc
import loopy as lp
import numpy as np
import pyopencl as cl
import coffee.system
from pyop2 import compilation
import ctypes
from mako.template import Template
import re
  • Do a local install of PETSc.

  • The PETSc configure options are:

  • CUDA:

    ./configure --download-eigen=/home/kgk2/pack/eigen-3.3.3.tgz --with-fortran-bindings=0 --download-chaco --download-metis --download-parmetis --download-scalapack --download-hypre --download-mumps --download-netcdf --download-hdf5=https://support.hdfgroup.org/ftp/HDF5/releases/hdf5-1.10/hdf5-1.10.6/src/hdf5-1.10.6.tar.bz2 --download-pnetcdf --download-exodusii --download-fblaslapack --with-cuda=1 --with-cuda-dir=/home/kgk2/pack/cuda --download-zlib --with-cudac=nvcc
    
  • ViennaCL:

    ./configure --download-eigen=/home/kgk2/pack/eigen-3.3.3.tgz --with-fortran-bindings=0 --download-chaco --download-metis --download-parmetis --download-scalapack --download-hypre --download-mumps --download-netcdf --download-hdf5=https://support.hdfgroup.org/ftp/HDF5/releases/hdf5-1.10/hdf5-1.10.6/src/hdf5-1.10.6.tar.bz2 --download-pnetcdf --download-exodusii --download-fblaslapack --download-zlib --with-viennacl=1 --download-viennacl