stellaraccident / tie_shape_isses.md
Created June 2, 2020 00:46
Issues with tie_shape

IR for a multi-layer perceptron with a dynamic batch size:

func @predict(%arg0: tensor<?x784xf32>) -> tensor<?x10xf32> attributes {iree.module.export, iree.reflection = {abi = "sip", abiv = 1 : i32, sip = "I8!S5!k0_0R3!_0"}, tf._input_shapes = [#tf.shape<?x784>, #tf.shape<*>, #tf.shape<*>, #tf.shape<*>, #tf.shape<*>, #tf.shape<*>, #tf.shape<*>], tf.signature.is_stateful} {
    %0 = flow.variable.address @__iree_flow___sm_node1__h1_weights : !iree.ptr<tensor<784x256xf32>>
    %1 = flow.variable.address @__iree_flow___sm_node4__h1_bias : !iree.ptr<tensor<256xf32>>
    %2 = flow.variable.address @__iree_flow___sm_node2__h2_weights : !iree.ptr<tensor<256x256xf32>>
    %3 = flow.variable.address @__iree_flow___sm_node5__h2_bias : !iree.ptr<tensor<256xf32>>
    %4 = flow.variable.address @__iree_flow___sm_node3__out_weights : !iree.ptr<tensor<256x10xf32>>
    %5 = flow.variable.address @__iree_flow___sm_node6__out_bias : !iree.ptr<tensor<10xf32>>
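
For context, the module behind this IR is an ordinary two-hidden-layer MLP. Below is a minimal TensorFlow sketch of the kind of model that exports to the above; the layer sizes are taken from the variables in the IR, while the activation choice and initializers are assumptions for illustration:

import tensorflow as tf

class MLP(tf.Module):
    # Layer sizes inferred from the flow.variable declarations above.
    def __init__(self):
        super().__init__()
        self.h1_weights = tf.Variable(tf.random.normal([784, 256]))
        self.h1_bias = tf.Variable(tf.zeros([256]))
        self.h2_weights = tf.Variable(tf.random.normal([256, 256]))
        self.h2_bias = tf.Variable(tf.zeros([256]))
        self.out_weights = tf.Variable(tf.random.normal([256, 10]))
        self.out_bias = tf.Variable(tf.zeros([10]))

    # The dynamic batch dimension (None) is what becomes tensor<?x784xf32>.
    @tf.function(input_signature=[tf.TensorSpec([None, 784], tf.float32)])
    def predict(self, x):
        h1 = tf.sigmoid(tf.matmul(x, self.h1_weights) + self.h1_bias)
        h2 = tf.sigmoid(tf.matmul(h1, self.h2_weights) + self.h2_bias)
        return tf.matmul(h2, self.out_weights) + self.out_bias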

I plumbed through support today for the __array_function__ hook. This is defined in NEP-18, and there is a newer proposal for __array_module__ in NEP-37. For a from-scratch implementation, __array_function__ is quite nice. I believe that __array_module__ may be of more help to existing implementations or those needing to layer multiple hooks together.
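As a rough illustration of how the hook works (a from-scratch sketch, not npcomp's actual tracer; the TracedArray class here is hypothetical):

import numpy as np

class TracedArray:
    """Minimal stand-in for a traced tensor. Captures numpy API calls as
    symbolic expressions instead of executing them."""

    def __init__(self, expr):
        self.expr = expr  # a string for illustration; a real tracer builds IR

    def __array_function__(self, func, types, args, kwargs):
        # NEP-18 passes the public numpy function plus the original call
        # arguments. Record the call as a new node (kwargs elided for brevity).
        formatted = ", ".join(
            a.expr if isinstance(a, TracedArray) else repr(a) for a in args)
        return TracedArray(f"{func.__name__}({formatted})")

    def __repr__(self):
        return f"TracedArray({self.expr})"

x = TracedArray("x")
y = TracedArray("y")
print(np.dot(x, y))     # TracedArray(dot(x, y))
print(np.transpose(x))  # TracedArray(transpose(x))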

This let me get tracing working for numpy.dot and numpy.transpose ops. With the latter, we run for the first time into the limits of what can be extracted by a local transformation: the axes parameter can be either None or a list of ints. In more static op sets it is often represented as an attribute (as opposed to an SSA value/operand), and in general, the more that is known about it statically, the better any code generation strategy can do.
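To make the axes problem concrete (plain numpy, just demonstrating the forms the parameter can take):

import numpy as np

a = np.ones((2, 3, 4))

# axes=None reverses all dimensions: result shape (4, 3, 2).
print(np.transpose(a).shape)

# A literal permutation at the call site is easy for a tracer to capture
# as an attribute.
print(np.transpose(a, axes=(1, 0, 2)).shape)  # (3, 2, 4)

# But axes can also arrive as a runtime value, in which case a local
# transformation can only record it as an operand.
perm = list(reversed(range(a.ndim)))
print(np.transpose(a, axes=perm).shape)       # (4, 3, 2)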

I've been fiddling with the modeling of the numpy ufunc abstractions, because getting that right covers a large swath of the standard operations.

Mapped to MLIR, they become several ops: a definition (which defines a module-level symbol) and various operations on the ufunc (such as ufunc_call, ufunc_reduce, ufunc_accumulate, ufunc_reduceat, ufunc_outer, ufunc_at). (The at variant performs in-place updates.)
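These correspond one-for-one with the methods numpy exposes on a ufunc object. For example (plain numpy, annotated with the op each call would map to):

import numpy as np

x = np.arange(6.0).reshape(2, 3)
buf = np.zeros(2)
idx = np.array([0, 0, 1])

np.add(x, 1.0)                             # ufunc_call
np.add.reduce(x, axis=0)                   # ufunc_reduce
np.add.accumulate(x, axis=1)               # ufunc_accumulate
np.add.reduceat(np.arange(8.0), [0, 4])    # ufunc_reduceat -> [6., 22.]
np.multiply.outer(np.arange(3.0), np.arange(4.0))  # ufunc_outer
np.add.at(buf, idx, 1.0)                   # ufunc_at: in-place scatter-add
print(buf)                                 # [2. 1.]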

The ufunc definition itself carries some metadata but is primarily an overloaded set of FunctionType signatures that operate on scalars. I've modeled this as an op that combines an array of FunctionTypes with a variadic list of regions:

def Numpy_GenericUfuncOp : Numpy_Op<"generic_ufunc", [
    IsolatedFromAbove,
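For reference, numpy exposes the overload set directly on the ufunc object, which is essentially the data the FunctionType array captures (plain numpy; the exact entries depend on the numpy build):

import numpy as np

# Each entry is a scalar signature like 'ff->f' (float32, float32 -> float32).
print(np.add.types[:6])         # e.g. ['??->?', 'bb->b', 'BB->B', ...]
print(np.add.nin, np.add.nout)  # 2 inputs, 1 output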

Just committed some updates to npcomp and am getting excited... Full disclosure: it doesn't do anything yet, but the ideas are crystallizing for me.

It feels really good to get this idea out of my head; it's been bumping around for a while... A lot of the compilers I see for Python-ey things have a really big hurdle you have to jump over to invoke compilation, or they lack a way to get enough constraints in place for the optimizations that matter in a lot of cases (usually relying on fat annotations, fixed class hierarchies, etc.). My take is that we aren't really using the interactive power of Python to do the program extraction... There shouldn't be one big "compile" method (or one trace annotation like @tf.function, etc.). There should be a conversation with the system, just like there is in normal Python programming. In that conversation, if I give it more, it should give me more.

So most of this stuff falls

*** IR Dump After Canonicalizer ***
module {
  func @dynamic_tensor(%arg0: tensor<?x?xf32>) -> tensor<?x?xf32> attributes {iree.module.export} {
    %0 = "xla_hlo.abs"(%arg0) : (tensor<?x?xf32>) -> tensor<?x?xf32>
    return %0 : tensor<?x?xf32>
  }
}
*** IR Dump Before Canonicalizer ***
module {
  func @simple_mul(%arg0: tensor<?x?xf32>) -> tensor<?x?xf32> attributes {iree.module.export} {
    %0 = "xla_hlo.abs"(%arg0) : (tensor<?x?xf32>) -> tensor<?x?xf32>
    return %0 : tensor<?x?xf32>
  }
}
module {
  func @simple_mul(%arg0: tensor<?xf32>) -> tensor<?xf32> attributes {iree.module.export} {
    %0 = "xla_hlo.abs"(%arg0) : (tensor<?xf32>) -> tensor<?xf32>
    return %0 : tensor<?xf32>
  }
}
*** IR Dump After xla_hlo::(anonymous namespace)::LegalizeControlFlow ***
func @simple_mul(%arg0: tensor<?xf32>) -> tensor<?xf32> attributes {iree.module.export} {
  %0 = "xla_hlo.abs"(%arg0) : (tensor<?xf32>) -> tensor<?xf32>
  return %0 : tensor<?xf32>
}
module attributes {tf.versions = {bad_consumers = [], min_consumer = 12 : i32, producer = 293 : i32}} {
  flow.variable @h1_bias mutable dense<1.51671076> : tensor<16xf32>
  flow.variable @h1_weights mutable dense<-1.32382154> : tensor<16x16xf32>
  flow.variable @h2_bias mutable dense<-0.967021465> : tensor<16xf32>
  flow.variable @h2_weights mutable dense<-2.13222814> : tensor<16x16xf32>
  flow.variable @out_bias mutable dense<0.437576413> : tensor<10xf32>
  flow.variable @out_weights mutable dense<-0.216886863> : tensor<16x10xf32>
  func @predict(%arg0: tensor<?x16xf32>) -> tensor<?x10xf32> attributes {iree.module.export, iree.reflection = {abi = "sip", abiv = 1 : i32, sip = "I8!S5!k0_0R3!_0"}, tf._input_shapes = ["tfshape$dim { size: -1 } dim { size: 16 }", "tfshape$unknown_rank: true", "tfshape$unknown_rank: true", "tfshape$unknown_rank: true", "tfshape$unknown_rank: true", "tfshape$unknown_rank: true", "tfshape$unknown_rank: true"], tf.signature.is_stateful} {
    %0 = xla_hlo.constant dense<0xFF800000>
stellaraccident / full multi-layer-perceptron.mlir
Created January 27, 2020 19:43
examples of hlo granularity
module attributes {tf.versions = {bad_consumers = [], min_consumer = 12 : i32, producer = 293 : i32}} {
  flow.variable @h1_bias mutable dense<[1.51671076, -1.03060472, 0.786281049, -0.111620337, 1.81119263, -0.489863962, -1.35557854, 1.12750614, -2.68010569, -1.31835032, 1.32360709, -0.169878066, -2.02759194, -1.08895075, 0.00321231596, -0.31182602]> : tensor<16xf32>
  flow.variable @h1_weights mutable dense<[[-1.32382154, -0.549432516, 0.367527097, 0.727583051, -0.200104922, 0.803734958, 0.12167716, 1.32091141, -0.532794356, -0.784785628, -0.228998855, 0.517136097, -0.699431359, -0.73973155, 0.743836284, -0.946993887], [0.179826155, -0.49125874, -1.25974214, 1.15823603, -0.431264639, 0.251312494, -0.443345934, -2.01710749, 1.14093256, 0.719460964, -0.530746937, -2.79470325, -1.53676498, -1.249620e+00, 0.0132142669, -0.21497789], [1.0148958, -0.07205116, 0.406493574, -1.13559055, 0.363096684, 0.349495828, -1.05846632, 0.435198575, -0.0271317028, -0.122605741, -0.127589807, -0.243348598, -2.24777889, 0.65326
flow.variable @h1_weights mutable dense<[[-1.32382154, -0.549432516, 0.367527097, 0.727583051, -0.200104922, 0.803734958, 0.12167716, 1.32091141, -0.532794356, -0.784785628, -0.228998855, 0.517136097, -0.699431359, -0.73973155, 0.743836284, -0.946993887], [0.179826155, -0.49125874, -1.25974214, 1.15823603, -0.431264639, 0.251312494, -0.443345934, -2.01710749, 1.14093256, 0.719460964, -0.530746937, -2.79470325, -1.53676498, -1.249620e+00, 0.0132142669, -0.21497789], [1.0148958, -0.07205116, 0.406493574, -1.13559055, 0.363096684, 0.349495828, -1.05846632, 0.435198575, -0.0271317028, -0.122605741, -0.127589807, -0.243348598, -2.24777889, 0.65326