Last active
July 7, 2017 18:34
order independent vs loop carried
  var sum: int;
  for i in vectorizeOnly(1..n) {
    sum += i; // is sum "order independent"? Is there a loop-carried dependency to respect?
              // say it turns into
              //   %1 = load %sum
              //   %2 = add %1, i
              //   store %2, %sum
              // would it be wrong to assert there is no loop-carried dependency?
  }
  global = sum;
}
// Here is a working example that shows the issue
var global = 0;
config const n = 100;

class MyClass {
  var x:int;
}

proc test_class() {
  var sum = new MyClass(0);
  for i in vectorizeOnly(1..n) {
    sum.x += i; // turns into load, add, store
  }
  global = sum.x;
  delete sum;
}

test_class();
writeln(global);
// here is a sample program that actually shows a different output when
// LLVM vectorization is enabled after PR #6533.
var histogram:[0..255] int;
config const n = 100;
var A = [i in 1..n] 1;

proc update_histo() {
  for i in vectorizeOnly(1..n) {
    histogram[A[i]] += 1;
  }
}

update_histo();
writeln(histogram);
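One way the histogram output could differ is sketched below in C. This is an illustration under stated assumptions, not a description of what LLVM actually emits: it assumes a naive vectorization that gathers LANES bins, adds one to each lane, and scatters the results back, on the belief that no two iterations in a batch touch the same bin. The names histo_scalar, histo_naive_simd, and LANES are hypothetical, and the arrays are 0-based rather than 1-based as in the Chapel program.

```c
#include <stddef.h>

#define LANES 4  /* hypothetical vector width, chosen for illustration */

/* Scalar reference: one load, add, store per iteration, in order. */
void histo_scalar(int *histogram, const int *a, size_t n) {
    for (size_t i = 0; i < n; i++) {
        histogram[a[i]] += 1;
    }
}

/* Simulates a naive vectorization that assumes the iterations in a
   batch never touch the same bin: gather all lanes, add, scatter. */
void histo_naive_simd(int *histogram, const int *a, size_t n) {
    size_t i = 0;
    for (; i + LANES <= n; i += LANES) {
        int lane[LANES];
        for (int k = 0; k < LANES; k++)   /* gather the current counts */
            lane[k] = histogram[a[i + k]];
        for (int k = 0; k < LANES; k++)   /* add one in every lane */
            lane[k] += 1;
        for (int k = 0; k < LANES; k++)   /* scatter the counts back */
            histogram[a[i + k]] = lane[k];
    }
    for (; i < n; i++)                    /* scalar remainder */
        histogram[a[i]] += 1;
}
```

When every element of a is 1, as in the Chapel program, all lanes of a batch read the same old count and write back the same new one, so each batch of LANES iterations adds 1 instead of LANES. The loop-carried dependency through the bin is real, and asserting it away changes the result.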
For the code:
int loop1(int n)
{
    int sum = 1;
    for(int i = 1; i <= n; i++)
    {
        sum *= i;
        sum += 1;
    }
    return sum;
}
int loop2(int n)
{
    int sum = 1;
    for(int i = 1; i <= n; i++)
    {
        sum += i;
    }
    return sum;
}
Using this clang command:
clang test.c -emit-llvm -S -c -Rpass-analysis=loop-vectorize -Rpass-missed=loop-vectorize -Rpass=loop-vectorize -O3
I get the following remarks:
test.c:11:1: remark: loop not vectorized: value that could not be identified as reduction is used outside the loop [-Rpass-analysis=loop-vectorize]
}
^
test.c:7:9: remark: loop not vectorized: use -Rpass-analysis=loop-vectorize for more info [-Rpass-missed=loop-vectorize]
sum *= i;
Note that no remark is emitted for the second loop, which suggests that it was optimized away before the loop vectorizer pass ran.
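One plausible reason the second loop never reaches the vectorizer, stated here as an assumption rather than something the remarks confirm, is that the optimizer can replace it with closed-form arithmetic. The sketch below shows the equivalence in C; loop2_closed_form is a hypothetical name, and the formula is only valid while the result fits in an int.

```c
/* The second loop from the example above: 1 + (1 + 2 + ... + n). */
int loop2(int n)
{
    int sum = 1;
    for (int i = 1; i <= n; i++)
    {
        sum += i;
    }
    return sum;
}

/* Hypothetical closed form: no loop remains, so there is nothing left
   to vectorize. For n <= 0 the loop body never runs and sum stays 1. */
int loop2_closed_form(int n)
{
    if (n <= 0)
        return 1;
    return 1 + n * (n + 1) / 2;
}
```

For n = 100 both functions return 5051.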
Only -Rpass-missed=loop-vectorize -O3 seems to be necessary.
I can get similar reporting from Chapel with --mllvm -pass-remarks-missed=loop-vectorize --mllvm -pass-remarks-analysis=loop-vectorize
Adding -g gets file/line info:
chpl vectorize-remarks.chpl --llvm --fast --mllvm -pass-remarks-missed=loop-vectorize --mllvm -pass-remarks-analysis=loop-vectorize -g
There are two terms to define precisely:
Order independence means that any rearrangement of the loop's iterations yields exactly the same result.
A loop-carried dependency means that some statement in the loop uses a value computed in a previous iteration of the loop.
Note that sum += i; and the load, add, store sequence it turns into are equivalent operations. By the second definition there is a loop-carried dependency, yet the loop is still order independent.
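The difference between the two definitions can be checked with a small C sketch. The function names are hypothetical. Both loops below carry a dependency through sum, but only the plain addition gives the same answer when the iterations run in reverse; the recurrence sum = sum * i + 1 (the body of loop1 in the clang example) does not.

```c
/* Sum 1..n, visiting iterations in ascending order. */
int sum_forward(int n) {
    int sum = 0;
    for (int i = 1; i <= n; i++)
        sum += i;
    return sum;
}

/* Same iterations in descending order: the result is unchanged. */
int sum_reverse(int n) {
    int sum = 0;
    for (int i = n; i >= 1; i--)
        sum += i;
    return sum;
}

/* Recurrence sum = sum * i + 1, ascending order. */
int recur_forward(int n) {
    int sum = 1;
    for (int i = 1; i <= n; i++) {
        sum *= i;
        sum += 1;
    }
    return sum;
}

/* Same recurrence, descending order: the result changes. */
int recur_reverse(int n) {
    int sum = 1;
    for (int i = n; i >= 1; i--) {
        sum *= i;
        sum += 1;
    }
    return sum;
}
```

For n = 4, sum_forward and sum_reverse both return 10, while recur_forward returns 65 and recur_reverse returns 34.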
I think that this loop gets optimized out before the vectorizer even gets a chance.
According to http://llvm.org/docs/Vectorizers.html#reductions:
Sometimes this kind of loop can be vectorized, but sometimes it cannot.
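When a reduction like this is vectorized, the usual idea is to keep one partial result per vector lane and combine the lanes after the loop. The C sketch below simulates that with plain arrays; WIDTH and sum_lanes are hypothetical names, and the code illustrates the general technique rather than the code LLVM generates. It relies on integer addition being associative and commutative, which is exactly the order independence discussed above.

```c
#define WIDTH 4  /* hypothetical vector width, chosen for illustration */

/* Sum 1..n using WIDTH independent partial sums. */
int sum_lanes(int n) {
    int partial[WIDTH] = {0};
    int i = 1;

    /* Main loop: each lane accumulates every WIDTH-th value, so no
       lane depends on another lane's result. */
    for (; i + WIDTH - 1 <= n; i += WIDTH)
        for (int k = 0; k < WIDTH; k++)
            partial[k] += i + k;

    /* Horizontal reduction: combine the lanes once, after the loop. */
    int sum = 0;
    for (int k = 0; k < WIDTH; k++)
        sum += partial[k];

    /* Scalar remainder for the iterations that did not fill a batch. */
    for (; i <= n; i++)
        sum += i;

    return sum;
}
```

The loop-carried dependency is still present within each lane, but the lanes are independent of one another, which is what lets them run in parallel.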