
60 Cray T3E User’s Guide
The compiler can be directed to attempt to unroll all loops generated
for the program with the command-line option -hunroll.
The amount of unrolling specified on the unroll directive overrides
the amount chosen by the compiler when the command-line option -hunroll
is specified.
In the following example, assume that the outer loop of the nest will
be unrolled by two:
#pragma _CRI unroll 2
for (i=0; i<10; i++) {
    for (j=0; j<100; j++) {
        a[i][j] = b[i][j] + 1;
    }
}
With outer loop unrolling, the compiler produces the following nest, in
which the two bodies of the inner loop are adjacent to each other:
for (i=0; i<10; i+=2) {
    for (j=0; j<100; j++) {
        a[i][j] = b[i][j] + 1;
    }
    for (j=0; j<100; j++) {
        a[i+1][j] = b[i+1][j] + 1;
    }
}
The compiler then fuses the two inner loops, producing the following
nest:
for (i=0; i<10; i+=2) {
    for (j=0; j<100; j++) {
        a[i][j] = b[i][j] + 1;
        a[i+1][j] = b[i+1][j] + 1;
    }
}
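The equivalence of the original nest and the fused nest can be checked with a small test. The following is a minimal sketch in standard C, not Cray-specific code: the function names, the ROWS and COLS macros, and the test data are hypothetical, and the unrolled form is written by hand rather than produced by the compiler.

```c
#include <assert.h>
#include <string.h>

#define ROWS 10   /* outer trip count from the example */
#define COLS 100  /* inner trip count from the example */

/* Original nest: one row of a is written per outer iteration. */
static void add_one_original(int a[ROWS][COLS], int b[ROWS][COLS])
{
    int i, j;
    for (i = 0; i < ROWS; i++) {
        for (j = 0; j < COLS; j++) {
            a[i][j] = b[i][j] + 1;
        }
    }
}

/* Hand-written form of the unrolled and fused nest. */
static void add_one_unrolled(int a[ROWS][COLS], int b[ROWS][COLS])
{
    int i, j;
    /* Unrolling by two without a cleanup loop needs an even trip count. */
    assert(ROWS % 2 == 0);
    for (i = 0; i < ROWS; i += 2) {
        for (j = 0; j < COLS; j++) {
            a[i][j] = b[i][j] + 1;
            a[i+1][j] = b[i+1][j] + 1;
        }
    }
}

/* Returns 1 if both versions fill a identically, 0 otherwise. */
int unrolled_matches_original(void)
{
    static int b[ROWS][COLS], a1[ROWS][COLS], a2[ROWS][COLS];
    int i, j;
    for (i = 0; i < ROWS; i++) {
        for (j = 0; j < COLS; j++) {
            b[i][j] = i * COLS + j;   /* arbitrary test data */
        }
    }
    add_one_original(a1, b);
    add_one_unrolled(a2, b);
    return memcmp(a1, a2, sizeof a1) == 0;
}
```

Because each element of a depends only on the matching element of b, the order in which rows are processed does not matter, and unrolled_matches_original() returns 1.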
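Outer loop unrolling is not always legal because the transformation can
change the semantics of the original program. For example, unrolling
the following loop nest on the outer loop would change the program
semantics because of the dependency between array elements a[i][...]
and a[i+1][...]. In the original nest, outer iteration i reads
a[i+1][j-1] before outer iteration i+1 overwrites it; in the unrolled
and fused nest, that element is overwritten one inner iteration before
it is read: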
/* Directive will cause incorrect code due to dependencies */
#pragma _CRI unroll 2
for (i=0; i<10; i++) {
    for (j=1; j<100; j++) {
        a[i][j] = a[i+1][j-1] + 1;
    }
}
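The effect of this dependency can be observed by running the original nest and a hand-unrolled, fused version on the same data. The following is a minimal sketch in standard C under stated assumptions: the array a is given ROWS + 1 rows because the nest reads a[i+1] (the declaration is not shown above), and the function names, macros, and test data are hypothetical.

```c
#include <assert.h>

#define ROWS 10   /* outer trip count from the example */
#define COLS 100  /* inner trip count from the example */

/* Fill a with distinct values so that any reordering is visible. */
static void fill(int a[ROWS + 1][COLS])
{
    int i, j;
    for (i = 0; i < ROWS + 1; i++) {
        for (j = 0; j < COLS; j++) {
            a[i][j] = i * COLS + j;
        }
    }
}

/* Original nest: row i+1 is read before it is overwritten. */
static void shift_original(int a[ROWS + 1][COLS])
{
    int i, j;
    for (i = 0; i < ROWS; i++) {
        for (j = 1; j < COLS; j++) {
            a[i][j] = a[i+1][j-1] + 1;
        }
    }
}

/* Unrolled by two and fused by hand: a[i+1][j-1] was already
   overwritten during the previous inner iteration when it is read. */
static void shift_unrolled(int a[ROWS + 1][COLS])
{
    int i, j;
    assert(ROWS % 2 == 0);   /* no cleanup loop in this sketch */
    for (i = 0; i < ROWS; i += 2) {
        for (j = 1; j < COLS; j++) {
            a[i][j] = a[i+1][j-1] + 1;
            a[i+1][j] = a[i+2][j-1] + 1;
        }
    }
}

/* Returns the number of elements on which the two versions disagree. */
int count_mismatches(void)
{
    static int ref[ROWS + 1][COLS], jam[ROWS + 1][COLS];
    int i, j, diff = 0;
    fill(ref);
    fill(jam);
    shift_original(ref);
    shift_unrolled(jam);
    for (i = 0; i < ROWS + 1; i++) {
        for (j = 0; j < COLS; j++) {
            if (ref[i][j] != jam[i][j]) {
                diff++;
            }
        }
    }
    return diff;
}
```

With this test data, count_mismatches() returns a nonzero count, which shows that the transformed nest no longer computes the same result as the original.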