
// cSpell:ignoreRegExp @[a-zA-Z0-9_]+
// cSpell:ignoreRegExp #\w+\(
// cSpell:ignore cetz booktbl instcombine
// cSpell:disable
#import "clean-acmart.typ": acmart
#import "@preview/cetz:0.3.4"
#import "@preview/lilaq:0.3.0" as lq
#import "@preview/cetz-plot:0.1.1": chart as cetz_chart
#import "@preview/tblr:0.3.1": tblr, rows as tblr_rows, hline as tblr_hline
#import "@preview/codly:1.3.0": codly-init, codly
#show: codly-init.with()
#codly(zebra-fill: none, display-icon: false, display-name: false, stroke: none, radius: 0mm, inset: 0.2em)
#let booktbl = tblr.with(
stroke: none,
column-gutter: 0.6em,
// booktabs style rules
tblr_rows(within: "header", auto, inset: (y: 0.5em)),
tblr_rows(within: "header", auto, align: center),
tblr_hline(within: "header", y: 0, stroke: 0.08em),
tblr_hline(within: "header", y: end, position: bottom, stroke: 0.05em),
tblr_rows(within: "body", 0, inset: (top: 0.5em)),
tblr_hline(y: end, position: bottom, stroke: 0.08em),
tblr_rows(end, inset: (bottom: 0.5em)),
)
#let title = [Dataflow Analysis for Compiler Optimization]
#let authors = (
(
name: "Matthias Veigel",
email: "matthias.veigel@uni-ulm.de",
department: [Institute of Software Engineering and Programming Languages],
institute: [University Ulm]
),
)
#show: acmart.with(
title: title,
authors: authors,
copyright: none
// Set review to submission ID for the review process or to "none" for the final version.
// review: [\#001],
)
#set heading(supplement: "Sect.")
#set figure(supplement: [Fig.])
#show figure.caption: it => [
#set text(size: 8pt)
*#it.supplement #context it.counter.display(it.numbering)*
#it.body
]
#show figure.where(kind: "raw"): set figure(supplement: [Listing])
#show figure.where(kind: "raw"): it => align(left)[
#v(8pt, weak: true)
#it.body
#v(4pt, weak: true)
#it.caption
#v(8pt, weak: true)
]
#show figure: it => [
#v(1.25em, weak: true)
#it
#v(1.25em, weak: true)
]
#show ref: it => {
let el = it.element
if el != none {
if el.func() == figure and el.kind == "slr" {
return link(el.location(), text(el.body, weight: "bold"))
}
if it.supplement == auto {
let counter = if el.func() == figure { el.counter } else { counter(heading) }
let numb = numbering(el.numbering, ..counter.at(el.location()))
return link(el.location(), box(el.supplement + " " + numb))
}
}
return it
}
#show heading.where(level: 1): it => [
#v(4mm, weak: true)
#it
]
#show heading.where(level: 2): it => [
#v(2mm, weak: true)
#it
]
#set heading(numbering: "1.1.1")
// cSpell:enable
= Abstract
Dataflow analysis is an important part of compiler optimization since it allows the compiler to eliminate or rewrite parts of the code with various techniques such as constant propagation, dead code elimination, and branch elimination. This work looks at the advantages and disadvantages of using dataflow analysis, how it is already used in current compilers, on which programming languages or intermediate representations it operates, and what limitations still exist. \
For this purpose we conducted a systematic literature review in which we analyzed 15 publications selected from 571 entries. The following conclusions were drawn: dataflow analysis is used in many of today's popular compilers and the field is actively being researched. The performance advantages of dataflow analysis are substantial, but its implementations are complex and care must be taken that an implementation does not change the program in an unwanted way.
= Introduction
Program performance remains a major concern in modern computing and programming, since it has a direct impact on user and developer experience. As software becomes more complex, manual optimization becomes increasingly difficult for developers to implement.
Another consequence of this growing complexity is that large codebases are spread across more files, which also makes it harder for developers to keep an overview and to implement optimizations. For these reasons, automatic optimization in compilers is needed. \
Dataflow analysis is a technique used to gather information about the state of variables throughout the flow of a program. It plays an important role in many compilers: by analyzing how, where, and with what values variables are assigned and how these variables are used, many complex optimizations that require context from the surrounding code can be implemented. \
Dataflow analysis is a well-established field in which new techniques are regularly created and older techniques improved. Different compilers and analysis frameworks implement different methods and optimizations with dataflow analysis. This work aims to summarize the current state and past achievements of this technology. \
This work is divided into the following sections: @background_c gives the background required to understand this work, @methodology_c describes the methodology used to create this work, @findings_c analyzes and evaluates the contents of the papers, and @conclusion_c summarizes the results of this work.
= Background <background_c>
== Static Single Assignment form (SSA / SSA form)
#figure( // ssa_form_example
caption: [C code and respective SSA in control flow graph form, adapted from Fig.1 in [@slr-13]],
kind: "raw",
grid(
columns: (1fr, 1.25fr),
```C
int x = 2 * 2 + 4;
x = x - 2;
if (x < 4)
x = 10;
else
x = 12;
int y = x * 2;
```,
image("ssa-example.svg", height: 16em)
)
) <ssa_form_example>
Many modern compilers and analysis tools operate on a Static Single-Assignment (SSA) form @cooper_keith_d_engineering_2011 @cytron_efficiently_1991. In SSA form each variable is assigned exactly once. This is achieved by creating multiple sub-variables $x_1, x_2, ...$ for each variable $x$. After a branch in the program a #{sym.Phi}-node is used to select the new value of the variable based on the branch that was executed.
An example of the SSA form can be seen in @ssa_form_example. On the left is a simple piece of C code in a function body and on the right is the corresponding SSA form. The intermediate representation of LLVM is closely modeled after the SSA form.
== Dataflow analysis (DFA)
A compiler can perform dataflow analysis either on the original source code or on an intermediate representation. When the analysis is performed on the source code, the original structure and flow constructs of the program are available. Performing the analysis on an intermediate representation has the advantage of being usable for many different languages, but in the translation step from source code to intermediate representation a lot of information about control flow may already have been lost. LLVM, for example, already provides many generic optimization passes for its own intermediate representation, which allows language developers to focus on designing their language and a compiler from their language to LLVM IR with language-specific optimizations, instead of having to implement a full compiler and all optimizations themselves. A big problem with DFA is its long runtime, and because it is a sequential algorithm it is complicated to parallelize. This makes it harder to use DFA in a Just-In-Time (JIT) compiler, which only has a limited timeframe for compilation.
=== Forward and backward analysis
Dataflow analysis can be performed in two directions: forward and backward.
Forward analysis walks from the entry of the program to the exit. This makes it possible to determine which conditions hold before a statement is executed and which statements are reachable by the program.
Backward analysis goes from the exit of the program to the entry, thereby making it possible to calculate which variables are still required at a specific point in the program.
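As a small C sketch (our own illustrative example, not taken from any of the cited papers), the facts computed by the two directions can be seen in a function containing a dead store, which only a backward liveness analysis detects:

```c
// Backward (liveness) analysis discovers that the first assignment to
// `x` is dead: its value is never read before being overwritten.
// Forward (reaching definitions) analysis shows that only the second
// definition of `x` reaches the return statement.
int with_dead_store(int a) {
    int x = a * 2;  // dead store: overwritten before any use
    x = a + 1;      // the only definition of x that reaches the use below
    return x;       // x is live here
}

// After dead-store elimination the function is equivalent to:
int without_dead_store(int a) {
    return a + 1;
}
```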
=== Must and may analysis
#figure( // must_may_example
caption: [Must/may analysis],
kind: "raw",
```C
int x, y, z;
if (...) { x = y = 2; z = 1; }
else { x = z = 2; }
printf("%d %d %d", x, y, z);
```
) <must_may_example>
The facts which the algorithm knows about a variable either must be true or may be true. For a fact to be a must-fact, every path leading to the current point has to ensure that the fact is true. The facts in @must_may_example on line 4 are: `x` and `z` must be initialized, since that is done in both branches of the if, while `y` only may be initialized. `x` also must be `2`, since it gets assigned `2` in both branches; `z` may be `2` or may be `1`. Must-facts are mostly used for optimization, while may-facts are mostly used for showing warnings.
=== Constant folding and propagation @optimizing_compiler_wikipedia
An example based on @ssa_form_example would be the compiler calculating $x_1$ to be $8$. This is called constant folding and is done by replacing all calculations which can be performed at compile time with their result. Constant propagation then replaces the $x_1$ in the calculation of $x_2$ with its value. When constant folding is then applied again, $x_2$ evaluates to $6$.
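These two steps can be sketched in plain C on the first two lines of the example (function names are illustrative):

```c
// Before optimization (first two lines of the SSA example):
int folded_input(void) {
    int x = 2 * 2 + 4;  // constant folding: x1 = 8
    x = x - 2;          // propagation of x1, then folding: x2 = 8 - 2 = 6
    return x;
}

// Equivalent result after folding and propagation:
int folded_result(void) {
    return 6;
}
```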
=== Dead branch elimination @optimizing_compiler_wikipedia
Continuing from the last optimization, it is possible to conclude that the branch condition $x_2 < 4$ always evaluates to $0$ (false). This allows the true branch to be eliminated, so $x_5$ is always $12$ and $y_1 = 24$. These two optimizations alone already allow the whole code from @ssa_form_example to be replaced with `int x = 12; int y = 24;`. In this simple example this seems obvious, but if $x$ were a function parameter there could still be instances where this branch could be eliminated based on the arguments supplied at particular call sites.
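Putting folding, propagation, and dead branch elimination together, the whole example collapses to constants, as this sketch shows (function names are illustrative):

```c
// The original code from the SSA example:
int before_elimination(void) {
    int x = 2 * 2 + 4;
    x = x - 2;          // x is now 6, so the condition below is always false
    if (x < 4)
        x = 10;         // dead branch: never taken
    else
        x = 12;
    int y = x * 2;
    return y;
}

// After dead branch elimination and constant propagation:
int after_elimination(void) {
    int y = 24;
    return y;
}
```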
=== Common subexpression elimination @optimizing_compiler_wikipedia
Common subexpression elimination finds cases where a calculation or a side-effect-free function call is performed multiple times with the same variables and values. An example would be the expression `d = (a * b) - c * (a * b)`, which can be rewritten to `tmp = a * b; d = tmp - c * tmp` as long as `a` and `b` remain the same between the two evaluations of `a * b`.
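The rewrite above can be sketched as two equivalent C functions (names are illustrative):

```c
// a * b is evaluated twice:
int without_cse(int a, int b, int c) {
    return (a * b) - c * (a * b);
}

// The common subexpression is computed once and reused:
int with_cse(int a, int b, int c) {
    int tmp = a * b;
    return tmp - c * tmp;
}
```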
= Methodology <methodology_c>
This work was created following the process described in @process_fig. The protocol for the review is divided into the following sections: for the objective of the research see @research_questions_s, for the search strategy see @sas_s, for the selection criteria see @selection_criteria_s, and for the data extraction strategy see @data_extraction_s.
#place( // process_fig
bottom + center,
scope: "parent",
float: true,
[
#figure(
caption: [Overview of the review process. Adapted from @federico_ciccozzi_execution_2019 and @gotz_claimed_2021.],
image("review_process.png")
) <process_fig>
]
)
== Objective and research questions <research_questions_s>
The goal of this research paper is to find claims about the advantages and disadvantages of using dataflow analysis for compiler optimization and where DFA is already implemented in compilers.
This goal has been defined in two research questions:
- RQ1 --- What are the advantages and disadvantages of using dataflow analysis for compiler optimization? \
This question aims to identify which advantages DFA has over other optimization techniques and which disadvantages it has when used.
- RQ2 --- How is dataflow analysis used in current compilers? \
This question aims to identify how DFA is already used in current compilers, which optimizations are done with it, and whether it is used during normal compilation or has to be explicitly enabled.
== Search and selection strategy <sas_s>
#[ // sas_fig
#set text(size: 8pt)
#figure(
caption: [Search and selection process],
cetz.canvas({
import cetz.draw: *
let bs = (2.5, 1.3)
let bm = (0.6, 0.5)
let bx(px, py, name, inner) = {
content(((bs.at(0)+bm.at(0))*px, -(bs.at(1)+bm.at(1))*py), (rel: bs), name: name, box(
align(center + horizon, par(justify: false, leading: 0.425em, inner)),
stroke: 0.5pt,
width: 100%,
height: 100%,
inset: 0.3em
))
}
set-style(stroke: (thickness: 0.5pt))
bx(0, 0, "acm")[ACM \ Digital Library \ n = 3594]
bx(0, 1, "ieee")[IEEE Xplore \ n = 1720]
bx(0, 2, "springer")[Springer Link \ n = 786]
bx(1, 0, "dup")[Duplicate removal \ and preliminary \ filtering \ n = 471]
bx(2, 0, "sel")[Application of \ selection criteria \ n = 10]
bx(2, 1, "snow")[Snowballing \ n = 110]
bx(2, 2, "reap")[Reapplication \ of selection \ criteria \ n = 15]
bx(1, 2, "inc")[Publications \ included \ n = 15]
line("acm.east", (rel: (0.25, 0)), name: "dlu")
line("ieee.east", (rel: (0.25, 0)))
line("springer.east", (rel: (0.25, 0)), name: "dld")
line("dlu.end", "dld.end", name: "dl")
set-style(mark: (end: "triangle", fill: black))
line("dl.start", "dup.west")
line("dup.east", "sel.west")
line("sel.south", "snow.north")
line("snow.south", "reap.north")
line("reap.west", "inc.east")
})
) <sas_fig>
]
#figure( // sas_search_string
caption: [Search string used in electronic databases],
kind: "raw",
align(left)[
// ("dataflow analysis" OR "data flow analysis") AND (compiler OR compilers OR compilation) AND (optimization OR optimizations) AND (advantages OR disadvantages OR strengths OR limitations OR trade-offs) AND (implementation OR usage OR used OR applied)
// ("Full Text .AND. Metadata":"dataflow analysis" OR "Full Text .AND. Metadata":"data flow analysis") AND ("Full Text .AND. Metadata":compiler OR "Full Text .AND. Metadata":compilers OR "Full Text .AND. Metadata":compilation) AND ("Full Text .AND. Metadata":optimization OR "Full Text .AND. Metadata":optimizations) AND ("Full Text .AND. Metadata":advantages OR "Full Text .AND. Metadata":disadvantages OR "Full Text .AND. Metadata":strengths OR "Full Text .AND. Metadata":limitations OR "Full Text .AND. Metadata":trade-offs) AND ("Full Text .AND. Metadata":implementation OR "Full Text .AND. Metadata":usage OR "Full Text .AND. Metadata":used OR "Full Text .AND. Metadata":applied)
#set raw(syntaxes: "search-string.sublime-syntax", theme: "search-string.tmTheme")
```SearchString
("dataflow analysis" OR "data flow analysis")
AND
(compiler OR compilers OR compilation)
AND
(optimization OR optimizations)
AND
(advantages OR disadvantages OR strengths OR limitations OR trade-offs)
AND
(implementation OR usage OR used OR applied)
```
]
) <sas_search_string>
Our search strategy consists of 5 steps as seen in @sas_fig. \
The papers from the first step are collected from the electronic databases ACM Digital Library, IEEE Xplore, and Springer Link with the search string seen in @sas_search_string.
The search string in @sas_search_string was created from the research questions in @research_questions_s and was always applied to the full text of the papers. The search string is divided into keywords for dataflow analysis, keywords for compiler optimization, keywords for advantages or disadvantages to help find papers relevant for answering RQ1, and keywords signaling an implementation in the paper to help answer RQ2. The individual keywords were derived from the abstracts, titles, and keywords of our source papers.
In the second step all duplicates which were returned from multiple databases were removed from the results and the amount was limited to fit the scope of this paper.
In the third step the selection was filtered by applying all selection criteria from @selection_criteria_s.
In the fourth step we snowballed the previously acquired results. This was done to find relevant papers which were not included because of either the search string or the selection criteria.
Afterwards all papers found via snowballing were filtered again by applying the selection criteria in @selection_criteria_s.
In the end all papers from the third step and the papers from the snowballing were evaluated based on the data extraction items mentioned in @data_extraction_s.
== Selection criteria <selection_criteria_s>
For a publication to be relevant it has to satisfy at least one inclusion criterion and no exclusion criterion. The criteria were chosen to include as many publications as possible while still filtering out irrelevant ones.
#[
#v(4pt)
#set enum(numbering: (.., i) => "IC" + str(i))
+ Publications discussing advantages and disadvantages of DFA compared to other optimization techniques.
+ Publications focusing on one or more compilers (e.g., LLVM, Java JIT, C\# JIT).
+ Publications providing an implementation for a DFA optimization.
#v(4pt)
]
We chose _IC1_ to help answer _RQ1_.
_IC2_ is to include publications which talk about a compiler and how DFA is implemented in it.
_IC3_ is to further include publications which directly provide an implementation or talk about creating an implementation. This is to allow analyzing how DFA is used in compilers.
#[
#v(4pt)
#set enum(numbering: (.., i) => "EC" + str(i))
+ Publications which discuss DFA in a non-compiler context.
+ Publications written in a language other than English.
+ Secondary and tertiary publications (e.g., systematic literature reviews, surveys).
+ Publications in the form of tutorial papers, short papers, poster papers, editorials.
+ Publications for which the full text is not available.
#v(4pt)
]
_EC1_ is to exclude publications which talk about DFA in other contexts which are not relevant to compiler optimization.
_EC2--EC5_ are to exclude publications which do not provide enough information to include them in this publication.
== Data extraction <data_extraction_s>
Based on the research questions, we collected 9 data items to extract from all included publications. @data_extraction_table lists all data items. \
Data items _D1--D3_ are to document the source of the publication. \
_D4_ and _D5_ are to explicitly list the advantages and disadvantages for answering _RQ1_. \
_D6_ and _D7_ show in which compiler DFA was implemented and whether it runs directly on a programming language like C++ or on an intermediate representation like LLVM IR. \
_D8_ lists which optimizations were performed based on the results of the DFA and _D9_ lists the limitations of the executed DFA (e.g., only run on function scope). \
All data items were extracted from the full text of all included publications.
#[ // data_extraction_table
#set par(leading: 0.4em)
#set text(size: 9pt)
#figure(
caption: [Data items],
supplement: "Table",
booktbl(
columns: (1fr, 8fr, 3.9fr),
align: left,
inset: (x: 6pt, y: 2pt),
[ID], [Data], [Purpose],
..(
([Author(s)], [Documentation]),
([Publication year], [Documentation]),
([Title], [Documentation]),
([Named advantage(s) of DFA for CO], [RQ1]),
([Named disadvantage(s) of DFA for CO], [RQ1]),
([Analyzed compiler(s)], [RQ2]),
([Targeted language(s) of the optimization], [RQ2]),
([What optimizations are implemented with DFA], [RQ2]),
([Limitations of the analysis], [RQ2])
).enumerate(start: 1).map(((i, arr)) => ([D#i], ..arr)).flatten()
)
) <data_extraction_table>
]
= Findings <findings_c>
In this chapter we list our findings from the conducted systematic literature analysis.
== Demographic
[@slr-10] shows that dataflow analysis is not only used to optimize software for conventional processors, but also to optimize hardware description languages like Verilog or VHDL, which are then turned into hardware via a Field Programmable Gate Array (FPGA).
=== Publication year
#figure( // demographic_pub_year
caption: "Publication years of the publications",
{
let data = (
(1973, 1),
(1997, 1),
(2010, 2),
(2011, 2),
(2012, 1),
(2013, 2),
(2015, 1),
(2018, 1),
(2019, 1),
(2020, 2),
(2024, 1)
)
// cSpell:disable
lq.diagram(
width: 8.5cm,
xlim: (1972, 2026),
ylim: (0, 2.5),
yaxis: (subticks: none, ticks: range(0, 3)),
xaxis: (ticks: range(1975, 2026, step: 5)),
lq.bar(
data.map(v => v.at(0)),
data.map(v => v.at(1))
)
)
// cSpell:enable
}
) <demographic_pub_year>
As seen in @demographic_pub_year most of the analyzed publications are from the last 15 years, which indicates that this field is still actively being researched and explored, although research had already started back in 1973. Since research started over 50 years ago, the field is by now well-established. There are certainly more publications which are not listed here and not represented in this figure, but that is because the scope of this paper was very limited. \
=== Research focus
#figure( // demographic_research_focus
caption: "Research focus of the publications",
{
let data = (
("Algorithms and Techniques", 5), // 1, 2, 5, 7, 12
("Implementation and Reusability", 2), // 3, 8
("Analysis speed improvement", 4), // 4, 6, 14, 15
("Custom IR for analysis", 3), // 9, 10, 13
("Tools for implementation of DFA", 1), // 11
)
cetz.canvas({
//let colors = (red, eastern, green, blue, navy, purple, maroon, orange)
let colors = gradient.linear(..color.map.rainbow.map(v => v.darken(20%).saturate(20%)))
// cspell:disable-next-line
cetz_chart.piechart(
data,
value-key: 1,
label-key: 0,
radius: 3,
slice-style: colors,
inner-radius: 0,
inner-label: (content: (value, _) => [#text(white, str(value))], radius: 150%),
outer-label: (content: (value, _) => [], radius: 0%),
legend: (
position: "south",
anchor: "north",
orientation: ttb
)
)
})
}
) <demographic_research_focus>
The focus of the different papers can be seen in @demographic_research_focus. Most of the included papers [@slr-1, @slr-2, @slr-5, @slr-7, @slr-12] focus on creating and implementing new algorithms and techniques. Another big focus of the included papers is speeding up the analysis, which also makes it more viable for use in JIT compilers. While [@slr-4, @slr-6] try to do this by simply running parts of the analysis on different threads, [@slr-14] tries to pipeline the analysis of functions and [@slr-15] tries to skip parts by only lazily iterating over nodes of the IR. [@slr-9, @slr-10, @slr-13] implement a custom IR to make it easier to run parts of the DFA or to have a better structure than the previous code or IR. The focus of [@slr-3, @slr-8] is to provide a generic library for implementing and using DFA, together with an example implementation to show how it works. [@slr-11] creates a Domain-Specific Language (DSL) for implementing DFA algorithms in the LLVM framework to make it easier for researchers to try out new ideas and implement them. \
=== Target languages
#figure( // demographic_target_lang
caption: "Target languages of the publications",
{
let data = (
("None", 1),
("Custom", 1),
("C", 3),
("LLVM IR", 5),
("Java Bytecode", 2),
("Graal IR", 1),
("SSA of Java", 2)
)
cetz.canvas({
//let colors = (red, eastern, green, blue, navy, purple, maroon, orange)
let colors = gradient.linear(..color.map.rainbow.map(v => v.darken(20%).saturate(20%)))
// cspell:disable-next-line
cetz_chart.piechart(
data,
value-key: 1,
label-key: 0,
radius: 3,
slice-style: colors,
inner-radius: 0,
inner-label: (content: (value, _) => [#text(white, str(value))], radius: 150%),
outer-label: (content: (value, _) => [], radius: 0%),
legend: (
position: "east",
anchor: "south",
orientation: ttb,
offset: (1.7cm, -2.5cm)
)
)
})
}
) <demographic_target_lang>
@demographic_target_lang shows a trend (33% of publications) towards implementing DFA optimizations either with LLVM directly or by operating on the LLVM IR, while Java is targeted either directly as bytecode or via an SSA representation of Java. This suggests that LLVM is a good platform for implementing optimizations and that it has a lower barrier of entry for developing them. \
== RQ1: Advantages and disadvantages of using Dataflow analysis for compiler optimization
DFA makes many big compiler optimizations possible, but it also brings many trade-offs, and not just in performance.
These optimizations eliminate unused code and simplify expressions, which reduces execution time and memory footprint during runtime.
[@slr-1] is one of the first publications on DFA and describes how it allows previously existing optimizations, which could only be applied to code sections without branches, to be used in the presence of branches by checking how data flows through them.
Later publications [@slr-2, @slr-5] describe ways to apply these optimizations interprocedurally and across thread synchronization boundaries. [@slr-2] does this by inlining the called procedure and then performing dataflow analysis. This optimizes every procedure call for its specific call location, but brings the disadvantage of rapidly increasing the size of the optimized program. An important requirement described by [@slr-5] is that programs must be well synchronized, since otherwise DFA cannot be used because of possible data races. \
=== Analysis performance
While performance is not the biggest concern for DFA, since it runs at compile time and accuracy is more important [@slr-4], many publications [@slr-4, @slr-6, @slr-14, @slr-15] have investigated how to improve the performance of DFA. This is done with several techniques: in [@slr-4, @slr-6] different function calls are analyzed on different threads, but this has the problem of creating and queuing a task for each function, which can lead to a big overhead. In [@slr-6] independent branches are also run on separate threads. A big problem with both approaches is avoiding that a function is queued for analysis by more than one thread, which leads to unnecessary redundancy. \
Another approach [@slr-14] is to pipeline the analysis of function calls. First, all variables which do not depend on any function call are analyzed. When the analysis of a called function has finished, the variables which depend on that function call are analyzed. Thereby more work can happen in parallel.
=== Implementation complexity
Another problem with DFA is the difficulty of implementing optimizations with it [@slr-3, @slr-11]. DFA is often deeply entangled with compiler internals, which makes it difficult to reuse existing optimizations in other compilers or to implement new optimizations quickly, as seen in LLVM: "simple peephole optimizations in the LLVM instcombine pass contain approximately 30000 lines of complex C++ code, despite the transformations being simple" [@slr-11]. \
One solution to this problem is described in [@slr-3]: a library in Haskell which performs the dataflow analysis and provides an interface, which "is made possible by sophisticated aspects of Haskell's type system, such as higher-rank polymorphism, GADTs, and type functions" [@slr-3], to implement various optimizations which can then be reused in other compilers. The biggest drawback of this library is that it is limited to compilers implemented in Haskell. \
[@slr-11] describes a domain-specific language for implementing LLVM optimization passes. A simple language is used to directly express the logic of the optimization, while a custom transpiler converts it into an LLVM pass written in C++. Since the generated LLVM pass is implemented in a more generic way to fit this purpose, it leads to a moderate compile-time increase. No formal verification is done on the implemented optimization pass. Because of these disadvantages it is a great tool to quickly implement, test, and iterate on optimizations, but for more permanent passes, hand-written C++ code should be used.
=== Limitations
DFA is hard to parallelize because variables often depend on other variables or function arguments. While it is possible to analyze multiple functions at the surface level, they still depend on the context of the functions calling them. As mentioned above, [@slr-14] shows how parallel analysis is still possible while waiting for the results of other threads. \
Global variables also make the analysis more complicated, since they can be accessed and modified by all functions and either need to be treated as an unknown value every time, or all functions which work with such a variable become analytically dependent on each other and must be looked at when checking the value of the variable. A similar problem exists for variables shared across threads, because the analysis has to look at all functions which could modify the variable. Alternatively, the variable should be well synchronized, so that either one thread can write it or multiple threads can read it, but not both at the same time [@slr-5]. \
Another thing that complicates DFA in languages like C is the use of pointers, because they allow the program to modify variables in unpredictable ways, which invalidates the facts and assumptions made about all variables up to that point. \
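A small sketch (our own example, not from the cited papers) of how a single store through a pointer invalidates a previously known constant fact:

```c
// Without alias analysis the compiler cannot keep assuming x == 1 after
// the store through p, because p might point to x (here it does).
int pointer_invalidates_facts(void) {
    int x = 1;     // fact: x must be 1
    int *p = &x;   // p aliases x
    *p = 42;       // this store invalidates the fact about x
    return x;      // propagating the constant 1 here would be wrong
}
```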
Since inlining is required to perform interprocedural rewrites, it can bloat the executable considerably.
== RQ2: Usage of dataflow analysis in current compilers
The Glasgow Haskell Compiler (GHC), LLVM, and GCC are good examples of compilers which already make extensive use of DFA to implement optimizations.
These optimizations include common sub-expression elimination [@slr-1, @slr-7, @slr-13], copy propagation [@slr-5, @slr-7], constant propagation [@slr-1], conditional branch elimination [@slr-2] and dead code elimination [@slr-13].
= Conclusion <conclusion_c>
Our findings show that DFA is already used extensively in current compilers and brings big advantages for runtime speed. The cost of this is a longer compilation time, which makes it difficult to use in JIT compilation. Furthermore, DFA allows complex optimizations across branches and function boundaries which would not be possible with traditional straight-line optimizations. \
The high implementation complexity and the deep entanglement with compiler internals also pose a big problem for advancing this field further.
The recent release of new publications on this topic indicates that researchers are continuously searching for better and faster ways to implement DFA and to make better use of the analysis results. \
The adaptability of LLVM and its associated intermediate representation makes it an invaluable platform for testing and research with DFA.
#pagebreak(weak: true)
#set heading(numbering: none)
#bibliography("refs.bib", title: "References", style: "association-for-computing-machinery")
#pagebreak(weak: true)
#set heading(numbering: "A.a.a")
#counter(heading).update(0)
#{ // slr results table
set page(flipped: true, columns: 1, margin: 2em)
[= SLR Results]
v(1em)
counter(heading).update(0)
set table(stroke: (x, _) => if x in (1, 4, 6) { (x: 2pt, y: 1pt) } else { 1pt })
show heading: set text(weight: "regular")
table(
columns: (auto, auto, auto, auto, auto, auto, 6em, 4.05em, auto, auto),
inset: (x: 5pt, y: 3pt),
..csv("pubs.csv")
.map(v => {
if v.at(0) != "ID" {
let id = v.at(0).slice(1)
v.at(0) = [#figure([P#id], kind: "slr", supplement: none) #label("slr-" + id)]
}
return v
})
.flatten()
)
}