GIMPLE to GRAPHITE transformation
Google Summer of Code 2008
Tobias Grosser (grosser at gcc dot gnu dot org)
27.03.2008
Abstract
Since about 20 month people are working on polyhedral loop representation in the gcc called GRAPHITE. The idea is to abstract from normal control flow structures and generate a more mathematical representation using polyhedrons. Traditional structures like the CFG and conditional statements are left behind. This new representation allows normal loop optimization to be simple mathematical operations, which can easily be applied.
At the moment the general shape of the GRAPHITE framework is quite clear and people have started to work on different areas. While they reached new areas of development, it became quite clear, that the GRAPHITE frontend, the translation from GIMPLE to GRAPHITE has some serious drawbacks, which stop people to continue their development fast and efficiently.
During Summer of Code I would like to solve these problems and build a solid basis for further graphite development.
The first part of my work will be to collect example code to cover all code the gcc may generate and discuss the expected representation with the graphite developers. There seem to be several important edge cases, where the representation is not easy. These cases have to be discussed early to get a solid polyhedral representation even for edge cases, before much code depends on the current representation.
The next part of my work is to fix the SCoP detection, the extraction of the code, which we can be represented in our polyhedral model. At the moment this code works for some simple cases, but is incorrect for most testcases.
After we can detect the code we can represent, I will add conditional statements to the GIMPLE to GRAPHITE transformation. Now GRAPHITE should be able to transform every affine natural loop nest.
To verify my work, I will make cloog regenerate the original code structure out of the polyhedral model.
After I finished my work, the GCC has a solid polyhedral loop representation for future GRAPHITE work.
Complete Application (without personal part)
Additional Information:
Blog
There is a Weblog available:
GRAPHITE Summer of Code news
Version Control
Most of my GRAPHITE work took place in the gcc/graphite branch. During
Google SOC I will continue to commit regulary to this branch, after my work was
reviewed/tested by other graphite developers.
Incomplete work and features waiting for review are available in my git
repository:
#> git clone http://www.fmi.uni-passau.de/~grosser/gcc/.git
Webaccess:
CGIT
TODO
- SOC application (27.03.2008)
- Install SOC blog (27.03.2008)
- Complete the SOC Website (WIP)
- SOC 2008 ;-)
- SCoP detection
- GIMPLE to GRAPHITE translation
- Cloog output
Project Details
I would like to show some examples to give people some insight in GRAPHITE, the
current state of GRAPHITE and what I will add during Google Summer of Code.
SCoP detection
Before we can transform code from GIMPLE to GRAPHITE, we have to extract the
parts of the CFG, that can be represented in the polyhedral model. These parts
are called SCoPs. SCoPs are not allowed to contain bbs,
which may be dangerous. For example bbs, which contain function calls,
or non affine functions in array accesses or loop borders. In this cases we do
not know what happens, because it is unknown (function call) or too difficult
to calculate (non affine function). So a polyhedral representation is not possible.
In the following examples, difficult bbs are painted with black background. If a
bb is part of a SCoP, we paint the background around the bb number in a different
color. All examples contain the current GRAPHITE results and the SCoP detection,
I would like to implement.
Simple loop
The easiest example available, one single loop. This loop should be in one SCoP,
as there are no difficult basic blocks.
1 int toto()
2 {
3 int i;
4 int a[100];
5
6 for (i = 1; i < 100; i++)
7 {
8
9 a[i] = a[i-1] + 2;
10 }
11
12 return a[3];
13 }
| Now |
Correct |
|
|
Comments
- The SCoP detection is correct
- In BB0 and BB1 the bb number is shown two times. A bug in
the graphviz output.
Two loops in a row
Now we add another loop after the first loop. Both are in the same SCoP,
as we do not have difficult bbs right now. So there is no reason for a cut.
1 int toto()
2 {
3 int i, j;
4 int a[100];
5
6 for (i = 1; i < 100; i++)
7 {
8
9 a[i] = a[i-1] + 2;
10 }
11
12 for (j = 1; j < 100; j++)
13 {
14
15 a[j+3] = a[j-1] + 2;
16 }
17
18 return a[3];
19 }
| Now |
Correct |
|
|
Comments:
- The SCoP detection is correct
- In BB0 and BB1 the bb number is shown two times. A bug in the
graphviz output.
Two loops in a row, with difficult bb in between
We add a function call in between the two loops. This difficult bb cuts
the SCoP in two parts.
1 int bar(void);
2
3 int toto()
4 {
5 int i, j;
6 int a[100];
7
8 for (i = 1; i < 100; i++)
9 {
10
11 a[i] = a[i-1] + 2;
12 }
13
14 bar();
15
16 for (j = 1; j < 100; j++)
17 {
18
19 a[j+3] = a[j-1] + 2;
20 }
21
22 return a[3];
23 }
| Now |
Correct |
|
|
Comments:
- In BB0 and BB1 the bb number is shown two times. A bug in the
graphviz output.
- BB3 and BB5 are part of two SCoPs. This is invalid, one bb can only be
part of one SCoP. It is also wrong, that BB5, a difficult bb, is part
of a SCoP at all. A difficult bb can never be part of a SCoP.
This bug may be introduced by earlier changes. In the last weeks, we
discussed the SCoP borders. Before the exit and entry bbs were not part
of the SCoP, just borders. But this made it impossible to represent
some SCoPs with conditional control flow.
Right now I am not sure if our current representation is correct for
every possible case. It seems that this needs more thoughts.
- The red and blues SCoPs are divided by BB3 without any reason.
Nested loops with difficult bb
Now we add a difficult bb in between a nested loop set.
This example is a little bit more difficult, as we have
to create several SCoPs on different loop depths.
1 void bar (void);
2
3 int toto()
4 {
5 int i, j, k;
6 int a[100][100];
7 int b[100];
8
9 for (i = 1; i < 100; i++)
10 {
11 for (j = 1; j < 100; j++)
12 a[j][i] = a[j+1][i-1] + 2;
13
14 b[i] = b[i-1] + 2;
15
16 bar ();
17
18 for (j = 1; j < 100; j++)
19 a[2*j][i] = a[j+1][i+30] + 2;
20
21 b[5*i+8] = a[i-1][i+15] + 2;
22
23 for (j = 1; j < 100; j++)
24 a[j*100][i] = a[j+1][i-12] + 2;
25 }
26
27 return a[3][5] + b[1];
28 }
| Now |
Correct |
|
|
Comments:
- In BB0 and BB1 the number is shown two times. A bug in
graphviz output.
- The red SCoP reaches with BB13 in the loop. A loop has to be
in a SCoP complete or not at all. (Except the outermost loop)
- Blue and green SCoP. No need to divide these two SCoPs.
- BB5 same as above, difficult bb part of a SCoP.
- BB11, BB12 can not be part of the orange SCoP. A loop has to be
in a SCoP complete or not at all.
- BB11 can not be part of the purple SCoP. A SCoP has to have one exit
Where to go?
These are the easiest and best discussed examples. But to get the SCoP detection
and the polyhedral representation right, we also have to support conditional
statements and a mix of conditions and loop nests. As GIMPLE does not distinguish
between loops and GOTOs, we also have to support irregular controlflow with
GOTOs.
The aim of this phase is to be able to analyze a complete application (e.g. some
SPEC tests) without SEGFAULT or invalid SCoPs. It may be that we don't get all
SCoPs, but it should never happen that we create invalid results.
For further GRAPHITE development, it is important to extend the space of
programs we can handle as far as possible. Otherwise development in the backend
may depend on assumptions, which are only valid for these simple testcases.
To find these later will be very hard. It may even be possible, that we have to
change the internal representation. (For example should the SCoP entry and exit be
part of the SCoP?)
Difficult example
To show where to go a "difficult example".
It contains conditional control flow in a nested loop set.
The current GRAPHITE SCoP detection does not support conditional controlflow.
Sometimes it produces incorrect output (like in this example) or it just
SEGFAULTS.
1 void bar (void);
2
3 int toto()
4 {
5 int i, j, k;
6 int a[100][100];
7 int b[100];
8
9 for (i = 1; i < 100; i++)
10 {
11 for (j = 1; j < 100; j++)
12 b[i+j] = b[i+j-1] + 2;
13
14 if (i * 2 == i + 8)
15 b[i+k] = b[i+k-1] + 2;
16 else
17 {
18 for (k = 1; k < 100; k++)
19 bar ();
20 }
21
22 for (k = 1; k < 100; k++)
23 b[i+k] = b[i+k-5] + 2;
24 }
25
26 return a[3][5] + b[1];
27 }
| Now |
Correct |
|
|
GIMPLE to GRAPHITE translation
In the traditional CFG representation, bbs are connected using edges, which
describe the control flow information.
The idea of the polyhedral model is to move all control flow information in
a separate data structure. So bbs do not contain any IF-Statements at the end
and do not have predecessors or successors. They are completely independent and
just contain a linear control flow.
The control flow is described by two data structures.
The domain
It describes the iteration space of the bbs.
For example:
1 for (i = 1; i < 30; i++)
2 for (j = 20; j < 50; j++)
3 A;
4 for (k = 80; k < 100; k++)
5 B;
| Statement |
Domain |
| A |
1 <= i
i <= 30
20 <= j
j <= 50
|
| B |
80 <= k
k <= 100
|
Statement A is executed for all i in between 1 and 30 and for all j in
between 20 and 50. Statement B is executed for all k in between 80 and 100.
The Schedule
At the moment we do not know, if A is executed before B or B
before A. This information will be contained in the Schedule.
| Statement |
Schedule |
| A |
1 1 1
|
| B |
2 1 0
|
The schedule of A is smaller than the schedule of B. So we know
A has to be executed before B.
Conditional Statements
To add conditional statements we have to add these conditions to the domain.
For example:
1 for (i = 1; i < 100; i++)
2 if (i >= 80)
3 A;
4 else
5 B;
| Statement |
Domain |
| A |
1 <= i
i <= 100
80 <= i
|
| B |
1 <= i
i <= 100
i <= 79
|
Nested boolean expressions in conditional statements are possible as well. To
handle them is more difficult. They have to be simplified and we have to think
about, how to represent the boolean OR. We could use a union of polyhedrals or
we copy the bb and add two bbs. One for every part of the union.
GRAPHITE translation (GIMPLE to GRAPHITE) status
GRAPHITE supports this translation. But the support is incomplete.
Code that is missing:
- Support for conditional statements
- Support for code in between loop exit and loop latch
- Support for different domains for bbs in the same loop (Necessary to
support the other two points)
As we are also missing working cloog
output, the current code base is only tested with very easy examples.
Cloog
The first step from GRAPHITE back to GIMPLE is Cloog. Cloog is used to calculate
the new control flow structure, from the polyhedral representation. Right now, we are able to
create the cloog data structures, but they are incomplete and wrong in several
cases. So cloog is not able to use them to calculate the new loop structure.
Known problems are:
- The scattering function can not be applied (cloog will die).
So the schedule can not be applied and the order of the loops is wrong.
- At the moment we create one domain for every loop. This is wrong, we need
one domain for every bb. This is necessary for nested loop sets,
as the current representation is not powerful enough to represent nested loop
sets and many other simple examples.
Cloog support is essential for further testing as well as for the backend
output. Only if we know, that we can recreate the original control flow
structure of our examples, we are able verify the transformation.
Simple example
Code
1 int toto()
2 {
3 int i, j;
4 int a[100];
5
6 for (i = 1; i < 100; i++)
7 {
8
9 a[i] = a[i-1] + 2;
10 }
11
12 for (j = 1; j < 100; j++)
13 {
14
15 a[j+3] = a[j-1] + 2;
16 }
17
18 return a[3];
19 }
Current cloog output
for (s_0=0;s_0<=98;s_0++) {
S1 ;
S0 ;
}
CFG
Expected cloog output
S2 ;
for (s_0=0;s_0<=98;s_0++) {
S3 ;
S4 ;
}
S5 ;
for (s_0=0;s_0<=98;s_0++) {
S6 ;
S7 ;
}
S8 ;
Remove unused variables and control flow structures
As we move all the information contained in the control flow structures of loops
and conditions and the information of the induction variables to the polyhedral
model, we have to remove this information from the bbs.
After this step they contain only the code, executed e.g. in the body of a loop.
So we can just move around these loop bodies (if the dependencies allow) and
execute them in any order we like, without thinking about old loop structures.
They will be replaced in the backend with new loop structures.