@mppf
Last active July 7, 2017 18:34
order independent vs loop carried
proc test() {
  var sum: int;
  for i in vectorizeOnly(1..n) {
    sum += i; // Is sum "order independent"? Is there a loop-carried dependency to respect?
    // Say it turns into:
    //   %1 = load %sum
    //   %2 = add %1, i
    //   store %2, %sum
    // Would it be wrong to assert there is no loop-carried dependency?
  }
  global = sum;
}
// Here is a working example that shows the issue
var global = 0;
config const n = 100;
class MyClass {
  var x:int;
}
proc test_class() {
  var sum = new MyClass(0);
  for i in vectorizeOnly(1..n) {
    sum.x += i; // turns into load, add, store
  }
  global = sum.x;
  delete sum;
}
test_class();
writeln(global);
// here is a sample program that actually shows a different output when
// LLVM vectorization is enabled after PR #6533.
var histogram:[0..255] int;
config const n = 100;
var A = [i in 1..n] 1;
proc update_histo() {
  for i in vectorizeOnly(1..n) {
    histogram[A[i]] += 1;
  }
}
update_histo();
writeln(histogram);
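One way the histogram loop can change its output: if the iterations are treated as independent, a vectorized loop may load several bins, add to them, and store them back as a group. The C sketch below only simulates that pattern. The vector width VL of 4 and the function names are assumptions made for illustration, and this is not the code LLVM actually emits. When every index in a group hits the same bin (as happens when A is filled with 1), later lanes overwrite earlier ones and updates are lost.

```c
#include <assert.h>

#define BINS 256
#define VL 4 /* hypothetical vector width, chosen for illustration */

/* Scalar reference: every iteration sees the store made by the previous one. */
int scalar_histo(const int *a, int n, int bin) {
    int histogram[BINS] = {0};
    assert(bin >= 0 && bin < BINS);
    for (int i = 0; i < n; i++)
        histogram[a[i]] += 1;
    return histogram[bin];
}

/* Simulates a VL-wide gather / add / scatter that assumes lanes never alias. */
int naive_vector_histo(const int *a, int n, int bin) {
    int histogram[BINS] = {0};
    int i = 0;
    assert(bin >= 0 && bin < BINS);
    for (; i + VL <= n; i += VL) {
        int lanes[VL];
        for (int l = 0; l < VL; l++) /* gather: all loads happen first */
            lanes[l] = histogram[a[i + l]];
        for (int l = 0; l < VL; l++) /* add 1 in every lane */
            lanes[l] += 1;
        for (int l = 0; l < VL; l++) /* scatter: duplicate bins overwrite each other */
            histogram[a[i + l]] = lanes[l];
    }
    for (; i < n; i++) /* scalar tail for leftover iterations */
        histogram[a[i]] += 1;
    return histogram[bin];
}
```

With eight indices all equal to 1, scalar_histo reports 8 in bin 1 while naive_vector_histo reports 2, one surviving update per group of four. When the indices within each group are distinct, the two functions agree.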
@coodie

coodie commented Jul 7, 2017

There are two terms to be precise about:

  1. Order independence
  2. Loop-carried dependency

Order independence means that any rearrangement of the loop's iterations yields exactly the same result.
A loop-carried dependency means that some statement in the loop uses a value computed in a previous iteration of the loop.

Note that:

sum += i;

and

sum = sum + i;

are equivalent operations. According to the second definition there is a loop-carried dependency, but the loop is still order independent.
I think that this loop gets optimized out before the vectorizer even gets a chance.

According to http://llvm.org/docs/Vectorizers.html#reductions:

Normally, this would prevent vectorization, but the vectorizer can detect that ‘sum’ is a reduction variable.

So this kind of loop can be vectorized when the vectorizer recognizes the reduction, and cannot when it does not.
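To make the two terms concrete, here is a small C sketch (the function names are made up for illustration). Both sum loops read the value written by the previous iteration, so both carry a dependency, yet running the iterations forward or backward gives the same integer result. The chain loops also carry a dependency, but there the order matters, because doubling does not commute with adding i.

```c
#include <assert.h>

/* Integer sum, iterations run in increasing order. */
int sum_forward(int n) {
    int sum = 0;
    for (int i = 1; i <= n; i++)
        sum = sum + i; /* reads the sum written by the previous iteration */
    return sum;
}

/* Same loop body, iterations run in decreasing order. */
int sum_backward(int n) {
    int sum = 0;
    for (int i = n; i >= 1; i--)
        sum = sum + i;
    return sum;
}

/* Loop-carried and order dependent: x = 2*x + i. */
int chain_forward(int n) {
    int x = 0;
    for (int i = 1; i <= n; i++)
        x = 2 * x + i;
    return x;
}

int chain_backward(int n) {
    int x = 0;
    for (int i = n; i >= 1; i--)
        x = 2 * x + i;
    return x;
}

/* Small self-check of the claims above. */
void check_order_demo(void) {
    assert(sum_forward(100) == sum_backward(100));   /* order independent */
    assert(chain_forward(3) != chain_backward(3));   /* order dependent */
}
```

This holds for integer addition because it is associative and commutative. Floating-point addition is not associative, so reordering a floating-point sum can change the rounding of the result.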

@coodie

coodie commented Jul 7, 2017

For the code:

int loop1(int n)
{
  int sum = 1;
  for(int i = 1; i <= n; i++)
  {
    sum *= i;
    sum += 1;
  }
  return sum;
}

int loop2(int n)
{
  int sum = 1;
  for(int i = 1; i <= n; i++)
  {
    sum += i;
  }
  return sum;
}

Using this clang command:

clang test.c -emit-llvm -S -c -Rpass-analysis=loop-vectorize -Rpass-missed=loop-vectorize -Rpass=loop-vectorize -O3

I get these remarks:

test.c:11:1: remark: loop not vectorized: value that could not be identified as reduction is used outside the loop [-Rpass-analysis=loop-vectorize]
}
^
test.c:7:9: remark: loop not vectorized: use -Rpass-analysis=loop-vectorize for more info [-Rpass-missed=loop-vectorize]
    sum *= i;

Note that there is no remark for the second loop, which suggests it was optimized out before the loop vectorizer pass ran.
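One plausible way for loop2 to disappear before vectorization is for the optimizer to replace it with a closed-form expression. The sketch below shows what such a replacement could look like. The name loop2_closed_form is hypothetical, and this is an assumption about the optimizer's behavior, not something verified against the generated IR.

```c
#include <assert.h>

/* loop2 as written in the comment above. */
int loop2(int n) {
    int sum = 1;
    for (int i = 1; i <= n; i++) {
        sum += i;
    }
    return sum;
}

/* Hypothetical closed form an optimizer could substitute for loop2. */
int loop2_closed_form(int n) {
    if (n < 1)
        return 1; /* the loop body never runs */
    return 1 + n * (n + 1) / 2;
}

/* Checks that both versions agree for small n (no overflow in this range). */
void check_closed_form(void) {
    for (int n = -5; n <= 1000; n++)
        assert(loop2(n) == loop2_closed_form(n));
}
```

Once the loop is gone there is nothing left for the vectorizer to analyze, which would explain the missing remark. loop1 has no such simple closed form, so it reaches the vectorizer and is reported.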

@mppf
Author

mppf commented Jul 7, 2017

Only -Rpass-missed=loop-vectorize -O3 seems to be necessary.

@mppf
Author

mppf commented Jul 7, 2017

I can get similar reporting from Chapel with --mllvm -pass-remarks-missed=loop-vectorize --mllvm -pass-remarks-analysis=loop-vectorize

@mppf
Author

mppf commented Jul 7, 2017

Adding -g gets file/line info:

chpl vectorize-remarks.chpl --llvm --fast --mllvm -pass-remarks-missed=loop-vectorize --mllvm -pass-remarks-analysis=loop-vectorize -g
