
Wrote more

Matthias Veigel 2025-07-05 11:40:27 +02:00
parent 32cc559d4c
commit 721bb62a20
Signed by: root
GPG Key ID: 2437494E09F13876

main.typ

@@ -8,6 +8,10 @@
#import "@preview/cetz:0.3.2"
#import "@preview/cetz-plot:0.1.1": chart as cetz_chart
#import "@preview/tblr:0.3.1": tblr, rows as tblr_rows, hline as tblr_hline
#import "@preview/codly:1.3.0": codly-init, codly
#show: codly-init.with()
#codly(zebra-fill: none, display-icon: false, display-name: false, stroke: none, radius: 0mm, inset: 0.2em)
#let booktbl = tblr.with(
stroke: none,
@@ -104,7 +108,7 @@ This work is divided into the following sections: in @background_c the backgroun
caption: [C code and respective SSA in control flow graph form, adapted from Fig.1 in [@slr-13]],
kind: "raw",
grid(
columns: (1fr, 1.25fr),
```C
int x = 2 * 2 + 4;
x = x - 2;
@@ -118,15 +122,31 @@
)
) <ssa_form_example>
Many modern compilers and analysis tools operate on a Static Single-Assignment (SSA) form @cooper_keith_d_engineering_2011 @cytron_efficiently_1991. The SSA form works by assigning each variable only once. This is done by creating multiple sub-variables $x_1, x_2, ...$ for each variable $x$. After a branch in the program, a #{sym.Phi}-node is used to select the new value of the variable based on the branch that was executed.
An example of the SSA form can be seen in @ssa_form_example. On the left is a simple C code in a function body and on the right is the respective SSA form of the C code. The intermediate representation of LLVM is closely modeled after the SSA form.
== Dataflow analysis (DFA)
A compiler can perform dataflow analysis either on the original source code or on an intermediate representation. When performing the analysis on the source code, the original structure and flow constructs of the program are available. Performing the analysis on an intermediate representation has the advantage of being usable for many different languages, but in the translation step from source code to intermediate representation a lot of information about control flow and similar properties may already have been lost. LLVM, for example, already has many generic optimization steps for its own intermediate representation, which allows language developers to focus on designing their language and a compiler from their language to the LLVM IR with language-specific optimizations, instead of having to implement a full compiler and all optimizations themselves. A big problem with DFA is its long runtime, and because it is a sequential algorithm it is complicated to parallelize. This makes it harder to use DFA in a Just-In-Time (JIT) compiler, since a JIT compiler has a limited timeframe for compilation.
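To make the iterative nature concrete, the following is a minimal sketch of a classic worklist-style fixpoint solver for a forward gen/kill bit-vector analysis. The data structures and names (`Block`, `solve_forward`, the 64-fact limit) are illustrative assumptions and are not taken from any of the surveyed papers:
```C
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint64_t gen, kill; // facts generated / killed by this block
    int succ[2];        // successor block indices
    int nsucc;
} Block;

// Iterate until a fixpoint is reached. A block's input depends on the
// outputs of its predecessors, so facts flow step by step through the
// CFG, which is why the algorithm is hard to parallelize.
void solve_forward(Block *blocks, int n, uint64_t *in, uint64_t *out) {
    bool changed = true;
    while (changed) {
        changed = false;
        for (int b = 0; b < n; b++) {
            uint64_t new_out = (in[b] & ~blocks[b].kill) | blocks[b].gen;
            if (new_out != out[b]) {
                out[b] = new_out;
                changed = true;
                for (int s = 0; s < blocks[b].nsucc; s++)
                    in[blocks[b].succ[s]] |= new_out; // meet: set union
            }
        }
    }
}
```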
=== Forward and backward analysis
Dataflow analysis can be performed in two directions: forward and backward.
Forward analysis is done by walking from the entry of the program to the exit. This makes it possible to determine which conditions hold before a statement is executed and which statements are reachable by the program.
Backward analysis goes from the exit of the program to the entry, thereby making it possible to calculate which variables are still required at a specific point in the program.
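As a small, purely illustrative C example, the comments below show the kind of facts each direction computes at each program point:
```C
int example(int a) {
    int x = 5;     // forward: from here on, x is known to be 5
    int y = x + a; // forward: y is known to be a + 5
                   // backward: x and a are live here (still needed)
    return y;      // backward: only y is live; x and a are dead
}
```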
=== Must and may analysis
#figure( // must_may_example
caption: [Must/may analysis],
kind: "raw",
```C
int x, y, z;
if (...) { x = y = 2; z = 1; }
else { x = z = 2; }
printf("%d %d %d", x, y, z);
```
) <must_may_example>
The facts which the algorithm knows about a variable either must be true or may be true. When a fact must be true, every path leading to the current point ensures that it is true. The facts in @must_may_example on line 4 are: `x` and `z` must be initialized, since that is done in both branches of the if, while `y` only may be initialized. `x` also must be `2`, since it gets assigned `2` in both branches; `z` may be `2` or may be `1`. The must constraints are mostly used for optimization, while the may constraints are mostly used for showing warnings.
=== Constant folding and propagation @optimizing_compiler_wikipedia
An example based on @ssa_form_example would be the compiler calculating $x_1$ to be $8$. This is called constant folding and is done by replacing all calculations which can be evaluated at compile time with their result. Constant propagation then replaces the $x_1$ in the calculation of $x_2$ with its value. When constant folding is then applied again, $x_2$ would be $6$.
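Expressed in plain C rather than SSA form, the effect of the two passes on the first statements of @ssa_form_example would look roughly like this (a sketch of the transformation, not actual compiler output):
```C
// before: as written in the source
int before(void) {
    int x = 2 * 2 + 4; // constant folding evaluates 2 * 2 + 4 to 8
    return x - 2;      // propagation substitutes 8; folding yields 6
}

// after: what the optimizer effectively produces
int after(void) {
    return 6;
}
```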
=== Dead branch elimination @optimizing_compiler_wikipedia
Continuing from the last optimization, it would be possible to conclude that the branch condition $x_2 < 4$ would always evaluate to $0$. This would result in the elimination of the $1$ branch, with $x_5$ always being $12$ and $y_1 = 24$. These two optimizations together would already allow replacing the whole code from @ssa_form_example with `int x = 12; int y = 24;`. In this simple example this seems obvious, but if $x$ were a function parameter, there could still be instances where this branch could be eliminated because of the function argument supplied elsewhere.
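The same idea in a minimal C sketch (the branch body is made up and does not correspond to the elided parts of @ssa_form_example):
```C
int f(void) {
    int x = 6;    // known constant after folding and propagation
    if (x < 4) {  // provably false: the whole branch is dead
        x = 0;
    }
    return x * 2; // the function reduces to: return 12;
}
```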
=== Common subexpression elimination @optimizing_compiler_wikipedia
Common subexpression elimination is done by finding cases where a calculation or a function call without side effects is performed multiple times with the same variables and values. An example for this would be the expression `d = (a * b) - c * (a * b)`, which can be rewritten to `tmp = a * b; d = tmp - c * tmp` as long as `a` and `b` remain the same between the two calculations of `a * b`.
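As a compilable sketch of this rewrite:
```C
// before: a * b is computed twice
int before(int a, int b, int c) {
    return (a * b) - c * (a * b);
}

// after CSE: the common subexpression is computed once
int after(int a, int b, int c) {
    int tmp = a * b;
    return tmp - c * tmp;
}
```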
= Methodology <methodology_c>
This work is created following the process described in @process_fig. The protocol for the review is divided into the following sections: for the objective of the research see @research_questions_s, for the search strategy see @sas_s, for the selection criteria see @selection_criteria_s and for the data extraction strategy see @data_extraction_s.
@@ -152,34 +172,6 @@ This goal has been defined in two research questions:
This question aims to identify how DFA is already used in current compilers, what optimizations are done with it, and whether it is used during normal compilation or has to be explicitly enabled.
== Search and selection strategy <sas_s>
#figure( // sas_search_string
caption: [Search string used in electronic databases],
kind: "raw",
align(left)[
// ("dataflow analysis" OR "data flow analysis") AND (compiler OR compilers OR compilation) AND (optimization OR optimizations) AND (advantages OR disadvantages OR strengths OR limitations OR trade-offs) AND (implementation OR usage OR used OR applied)
// ("Full Text .AND. Metadata":"dataflow analysis" OR "Full Text .AND. Metadata":"data flow analysis") AND ("Full Text .AND. Metadata":compiler OR "Full Text .AND. Metadata":compilers OR "Full Text .AND. Metadata":compilation) AND ("Full Text .AND. Metadata":optimization OR "Full Text .AND. Metadata":optimizations) AND ("Full Text .AND. Metadata":advantages OR "Full Text .AND. Metadata":disadvantages OR "Full Text .AND. Metadata":strengths OR "Full Text .AND. Metadata":limitations OR "Full Text .AND. Metadata":trade-offs) AND ("Full Text .AND. Metadata":implementation OR "Full Text .AND. Metadata":usage OR "Full Text .AND. Metadata":used OR "Full Text .AND. Metadata":applied)
#set raw(syntaxes: "search-string.sublime-syntax", theme: "search-string.tmTheme")
```SearchString
("dataflow analysis" OR "data flow analysis")
AND
(compiler OR compilers OR compilation)
AND
(optimization OR optimizations)
AND
(advantages OR disadvantages OR strengths OR limitations OR trade-offs)
AND
(implementation OR usage OR used OR applied)
```
]
) <sas_search_string>
Our search strategy consists of five steps, as seen in @sas_fig. \
The papers from the first step are collected from the electronic databases ACM Digital Library, IEEE Xplore and Springer Link with the search string seen in @sas_search_string.
The search string in @sas_search_string was created using the research questions in @research_questions_s and was always applied to the full text of the papers. The search string is divided into keywords for dataflow analysis, keywords for compiler optimization, keywords for advantages or disadvantages to help find papers relevant for answering RQ1, and keywords signaling an implementation in the paper to help answer RQ2. The different keywords were derived from the abstracts, titles and keywords of our source papers.
In the second step all duplicates which were returned from multiple databases were removed from the results and the number of papers was limited to fit the scope of this paper.
In the third step the selection was filtered by applying all selection criteria from @selection_criteria_s.
In the fourth step we snowballed the previously acquired results. This was done to find relevant papers which were not included because of either the search string or the selection criteria.
Afterwards all papers found via the snowballing were filtered again by applying the selection criteria in @selection_criteria_s.
In the end all papers from the third step and the papers from the snowballing were evaluated based on the data extraction items mentioned in @data_extraction_s.
#[ // sas_fig
#set text(size: 8pt)
#figure(
@@ -224,6 +216,34 @@ In the end all papers from the third step and the papers of the snowballing wher
})
) <sas_fig>
]
#figure( // sas_search_string
caption: [Search string used in electronic databases],
kind: "raw",
align(left)[
// ("dataflow analysis" OR "data flow analysis") AND (compiler OR compilers OR compilation) AND (optimization OR optimizations) AND (advantages OR disadvantages OR strengths OR limitations OR trade-offs) AND (implementation OR usage OR used OR applied)
// ("Full Text .AND. Metadata":"dataflow analysis" OR "Full Text .AND. Metadata":"data flow analysis") AND ("Full Text .AND. Metadata":compiler OR "Full Text .AND. Metadata":compilers OR "Full Text .AND. Metadata":compilation) AND ("Full Text .AND. Metadata":optimization OR "Full Text .AND. Metadata":optimizations) AND ("Full Text .AND. Metadata":advantages OR "Full Text .AND. Metadata":disadvantages OR "Full Text .AND. Metadata":strengths OR "Full Text .AND. Metadata":limitations OR "Full Text .AND. Metadata":trade-offs) AND ("Full Text .AND. Metadata":implementation OR "Full Text .AND. Metadata":usage OR "Full Text .AND. Metadata":used OR "Full Text .AND. Metadata":applied)
#set raw(syntaxes: "search-string.sublime-syntax", theme: "search-string.tmTheme")
```SearchString
("dataflow analysis" OR "data flow analysis")
AND
(compiler OR compilers OR compilation)
AND
(optimization OR optimizations)
AND
(advantages OR disadvantages OR strengths OR limitations OR trade-offs)
AND
(implementation OR usage OR used OR applied)
```
]
) <sas_search_string>
Our search strategy consists of five steps, as seen in @sas_fig. \
The papers from the first step are collected from the electronic databases ACM Digital Library, IEEE Xplore and Springer Link with the search string seen in @sas_search_string.
The search string in @sas_search_string was created using the research questions in @research_questions_s and was always applied to the full text of the papers. The search string is divided into keywords for dataflow analysis, keywords for compiler optimization, keywords for advantages or disadvantages to help find papers relevant for answering RQ1, and keywords signaling an implementation in the paper to help answer RQ2. The different keywords were derived from the abstracts, titles and keywords of our source papers.
In the second step all duplicates which were returned from multiple databases were removed from the results and the number of papers was limited to fit the scope of this paper.
In the third step the selection was filtered by applying all selection criteria from @selection_criteria_s.
In the fourth step we snowballed the previously acquired results. This was done to find relevant papers which were not included because of either the search string or the selection criteria.
Afterwards all papers found via the snowballing were filtered again by applying the selection criteria in @selection_criteria_s.
In the end all papers from the third step and the papers from the snowballing were evaluated based on the data extraction items mentioned in @data_extraction_s.
== Selection criteria <selection_criteria_s>
For a publication to be relevant it has to satisfy at least one inclusion criterion and no exclusion criterion. The criteria were chosen to include as many publications as possible while still filtering out irrelevant ones.
@@ -322,6 +342,42 @@ In this chapter we list our findings from the conducted systematic literature an
}
) <demographic_pub_year>
As seen in @demographic_pub_year most of the analyzed publications are from the last 15 years, which indicates that this field is still actively being researched and explored, although research already started back in 1983. Since research started over 40 years ago, the field is by now well-established. There are certainly more publications which are not represented in this figure, but that is because the scope of this paper was very limited. \
=== Research focus
#figure( // demographic_research_focus
caption: "Research focus of the publications",
{
let data = (
("Algorithms and Techniques", 5), // 1, 2, 5, 7, 12
("Implementation and Reusability", 2), // 3, 8
("Analysis speed improvement", 4), // 4, 6, 14, 15
("Custom IR for analysis", 3), // 9, 10, 13
("Tools for implementation of DFA", 1), // 11
)
cetz.canvas({
//let colors = (red, eastern, green, blue, navy, purple, maroon, orange)
let colors = gradient.linear(..color.map.rainbow.map(v => v.darken(20%).saturate(20%)))
// cspell:disable-next-line
cetz_chart.piechart(
data,
value-key: 1,
label-key: 0,
radius: 3,
slice-style: colors,
inner-radius: 0,
inner-label: (content: (value, _) => [#text(white, str(value))], radius: 150%),
outer-label: (content: (value, _) => [], radius: 0%),
legend: (
position: "south",
anchor: "north",
orientation: ttb
)
)
})
}
) <demographic_research_focus>
The focus of the different papers can be seen in @demographic_research_focus. Most of the included papers [@slr-1, @slr-2, @slr-5, @slr-7, @slr-12] focus on creating and implementing new algorithms and techniques. Another big focus of the included papers is speeding up the analysis, which also makes it more viable for use in JIT compilers. While [@slr-4, @slr-6] try to do this by simply running parts of the analysis on different threads, [@slr-14] tries to pipeline the analysis of functions and [@slr-15] tries to skip parts by only lazily iterating over nodes of the IR. [@slr-9, @slr-10, @slr-13] implement a custom IR to make it easier to run parts of the DFA or to have a better structure than the previous code or IR. The focus of [@slr-3, @slr-8] is to provide a generic library for implementing and using DFA, together with an example implementation to show how it works. [@slr-11] creates a Domain-Specific Language (DSL) for implementing DFA algorithms in the LLVM framework to make it easier for researchers to try out new ideas and implement them. \
=== Target languages
#figure( // demographic_target_lang
caption: "Target languages of the publications",
@@ -361,42 +417,6 @@ As seen in @demographic_pub_year most of the analyzed publication are from the l
}
) <demographic_target_lang>
@demographic_target_lang shows a 33% trend towards implementing DFA optimizations either with LLVM directly or by operating on the LLVM IR, while Java is used either directly as bytecode or as an SSA representation of Java. This shows that LLVM is a good platform for implementing optimizations and that it has a low barrier of entry for developing optimizations. \
=== Research focus
#figure( // demographic_research_focus
caption: "Research focus of the publications",
{
let data = (
("Algorithms and Techniques", 5), // 1, 2, 5, 7, 12
("Implementation and Reusability", 2), // 3, 8
("Analysis speed improvement", 4), // 4, 6, 14, 15
("Custom IR for analysis", 3), // 9, 10, 13
("Tools for implementation of DFA", 1), // 11
)
cetz.canvas({
//let colors = (red, eastern, green, blue, navy, purple, maroon, orange)
let colors = gradient.linear(..color.map.rainbow.map(v => v.darken(20%).saturate(20%)))
// cspell:disable-next-line
cetz_chart.piechart(
data,
value-key: 1,
label-key: 0,
radius: 3,
slice-style: colors,
inner-radius: 0,
inner-label: (content: (value, _) => [#text(white, str(value))], radius: 150%),
outer-label: (content: (value, _) => [], radius: 0%),
legend: (
position: "south",
anchor: "north",
orientation: ttb
)
)
})
}
) <demographic_research_focus>
The focus of the different papers can be seen in @demographic_research_focus. Most of the included papers [@slr-1, @slr-2, @slr-5, @slr-7, @slr-12] focus on creating and implementing new algorithms and techniques. Another big focus of the included papers is speeding up the analysis, which also makes it more viable for use in JIT compilers. While [@slr-4, @slr-6] try to do this by simply running parts of the analysis on different threads, [@slr-14] tries to pipeline the analysis of functions and [@slr-15] tries to skip parts by only lazily iterating over nodes of the IR. [@slr-9, @slr-10, @slr-13] implement a custom IR to make it easier to run parts of the DFA or to have a better structure than the previous code or IR. The focus of [@slr-3, @slr-8] is to provide a generic library for implementing and using DFA, together with an example implementation to show how it works. [@slr-11] creates a Domain-Specific Language (DSL) for implementing DFA algorithms in the LLVM framework to make it easier for researchers to try out new ideas and implement them. \
== RQ1: Advantages and disadvantages of using dataflow analysis for compiler optimization
DFA makes many big compiler optimizations possible, but it also brings many trade-offs, not just in performance.
@@ -413,7 +433,9 @@ One solutions to this problem is described in [@slr-3] by implementing a library
[@slr-11] describes a domain-specific language to implement LLVM optimization passes. This is done by having a simple language for directly implementing the logic of the optimization, while a custom transpiler then converts it into an LLVM pass written in C++. Since the LLVM pass is implemented in a more generic way to fit this purpose, it leads to a moderate compile-time increase. There is no formal verification done on the implemented optimization pass. Because of these disadvantages it is a great tool to quickly implement, test and iterate on optimizations, but for more permanent passes, hand-written C++ code should be used.
=== Limitations
DFA is hard to parallelize because variables are often dependent on other variables or function arguments. While it is possible to analyze multiple functions at the surface level, they still depend on the context of the functions calling them. As already mentioned, [@slr-14] shows how it is possible to run the analysis in parallel while still waiting for the results of other threads. \
Global variables also make the analysis more complicated, since they can be accessed and modified by all functions and therefore either need to be treated as an unknown value every time, or all functions which work with such a variable become analytically dependent on each other and have to be looked at when checking the value of the variable. A similar problem exists for variables shared across threads, because the analysis has to look at all functions which could modify the variable. Alternatively the variable should be synchronized so that either one thread can write it or multiple threads can read it, but not both at the same time [@slr-5]. \
Another thing that complicates DFA in languages like C is the use of pointers, because they allow the program to modify variables in unpredictable ways, which invalidates all facts and assumptions made up to that point about all variables. \
Since inlining is required to perform such rewrites across function boundaries, it can lead to bloating the executable.
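As an illustration of the pointer problem (a made-up example; `update` is a hypothetical external function the analysis cannot see into):
```C
void update(int *p); // hypothetical: defined in another translation unit

int g(void) {
    int a = 1, b = 2;
    update(&a);   // may modify a through the pointer: the fact "a == 1"
                  // is invalidated, a must be treated as unknown
    return a + b; // "b == 2" survives, since b's address never escaped
}
```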
== RQ2: Usage of dataflow analysis in current compilers
The Glasgow Haskell Compiler (GHC), LLVM, and GCC are good examples of compilers which already use DFA extensively to implement optimizations.