//#import "@preview/clean-acmart:0.0.1": acmart, acmart-ccs, acmart-keywords, acmart-ref, to-string
#import "clean-acmart.typ": acmart
#import "@preview/cetz:0.3.4"

#let title = [Dataflow Analysis for Compiler Optimization]
#let authors = (
  (
    name: "Matthias Veigel",
    email: "matthias.veigel@uni-ulm.de",
    department: [Institute of Software Engineering and Programming Languages],
    institute: [University Ulm]
  ),
)

#show: acmart.with(
  title: title,
  authors: authors,
  copyright: none
  // Set review to submission ID for the review process or to "none" for the final version.
  // review: [\#001],
)

#set figure(supplement: [Fig.])
#show figure.caption: it => [
  #set text(size: 8pt)
  *#it.supplement #context it.counter.display(it.numbering)*
  #it.body
]
#show figure.where(kind: "raw"): set figure(supplement: [Listing])
#show figure.where(kind: "raw"): it => align(left)[
  #v(8pt, weak: true)
  #it.body
  #v(4pt, weak: true)
  #it.caption
  #v(8pt, weak: true)
]

#set heading(numbering: "1.1.1")

= Abstract
// define DFA and CO here or in introduction
todo

= Introduction
todo

= Methodology
This publication was created following the process shown in @process_fig.
The review protocol is divided into the objective of the research (see @research_questions_s), the search strategy (see @sas_s), the selection criteria (see @selection_criteria_s), and the data extraction strategy (see @data_extraction_s).

#place(
  bottom + center,
  scope: "parent",
  float: true,
  [
    #figure(
      caption: [Overview of the review process. Adapted from @ciccozzi_execution_2019 and @gotz_claimed_2021.],
      image("review_process.png")
    ) <process_fig>
  ]
)

== Objective and research questions <research_questions_s>
The goal of this paper is to find claims about the advantages and disadvantages of using dataflow analysis for compiler optimization and to determine where DFA is already implemented in compilers.
This goal is refined into two research questions:
- RQ1 --- What are the advantages and disadvantages of using dataflow analysis for compiler optimization?
  \ This question aims to identify which advantages DFA has over other optimization techniques and which disadvantages arise from its use.
- RQ2 --- How is dataflow analysis used in current compilers?
  \ This question aims to identify how DFA is already used in current compilers, which optimizations are performed with it, and whether it runs during normal compilation or has to be explicitly enabled.

== Search and selection strategy <sas_s>
Our search strategy consisted of four steps, as shown in @sas_fig. \
#figure(
  caption: [Search string used in electronic databases],
  kind: "raw",
  align(left)[
    // ("dataflow analysis" OR "data flow analysis") AND (compiler OR compilers OR compilation) AND (optimization OR optimizations) AND (advantages OR disadvantages OR strengths OR limitations OR trade-offs) AND (implementation OR usage OR used OR applied)
    // ("Full Text .AND. Metadata":"dataflow analysis" OR "Full Text .AND. Metadata":"data flow analysis") AND ("Full Text .AND. Metadata":compiler OR "Full Text .AND. Metadata":compilers OR "Full Text .AND. Metadata":compilation) AND ("Full Text .AND. Metadata":optimization OR "Full Text .AND. Metadata":optimizations) AND ("Full Text .AND. Metadata":advantages OR "Full Text .AND. Metadata":disadvantages OR "Full Text .AND. Metadata":strengths OR "Full Text .AND. Metadata":limitations OR "Full Text .AND. Metadata":trade-offs) AND ("Full Text .AND. Metadata":implementation OR "Full Text .AND. Metadata":usage OR "Full Text .AND. Metadata":used OR "Full Text .AND. Metadata":applied)
    #set raw(syntaxes: "search-string.sublime-syntax", theme: "search-string.tmTheme")
    // AND ("compiler optimization" OR "compilation optimization" OR "compiler optimizations" OR "compilation optimizations" OR "optimizing compiler" OR "optimizing compilers")
    ```SearchString
    ("dataflow analysis" OR "data flow analysis") AND (compiler OR compilers OR compilation) AND (optimization OR optimizations) AND (advantages OR disadvantages OR strengths OR limitations OR trade-offs) AND (implementation OR usage OR used OR applied)
    ```
  ]
) <sas_search_string>

In the first step, papers were collected from the electronic databases ACM Digital Library, IEEE Xplore, and Springer Link using the search string shown in @sas_search_string.
The search string was derived from the research questions in @research_questions_s and was always applied to the full text of the papers. \
In the second step, all duplicates returned by multiple databases were removed from the results. \
In the third step, the remaining results were filtered by applying all selection criteria from @selection_criteria_s. \
In the fourth step, I snowballed the previously acquired results to find relevant papers that were not included because of either the search string or the search criteria. \
Afterwards, all papers remaining after snowballing were evaluated based on the data extraction items described in @data_extraction_s.
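
Steps two and three are mechanical enough to sketch in code. The following Python snippet is a minimal illustration and not the tooling used for this review: the `Paper` record, the title-based duplicate key, and all function names are assumptions made for the example. It removes duplicates returned by several databases and then keeps a publication only if it satisfies at least one inclusion criterion and no exclusion criterion.

```python
from dataclasses import dataclass, field


@dataclass
class Paper:
    """Hypothetical search result; the field names are illustrative only."""
    title: str
    source: str                                          # e.g. "ACM", "IEEE", "Springer"
    inclusion: set[str] = field(default_factory=set)     # criteria met, e.g. {"IC1"}
    exclusion: set[str] = field(default_factory=set)     # criteria hit, e.g. {"EC6"}


def normalize(title: str) -> str:
    # Assumption: two records are duplicates if their titles match after
    # lowercasing and dropping every non-alphanumeric character.
    return "".join(ch for ch in title.lower() if ch.isalnum())


def remove_duplicates(papers: list[Paper]) -> list[Paper]:
    """Step 2: keep the first occurrence of each normalized title."""
    seen: set[str] = set()
    unique: list[Paper] = []
    for paper in papers:
        key = normalize(paper.title)
        if key not in seen:
            seen.add(key)
            unique.append(paper)
    return unique


def is_relevant(paper: Paper) -> bool:
    """Step 3: at least one inclusion criterion and no exclusion criterion."""
    return bool(paper.inclusion) and not paper.exclusion


def select(papers: list[Paper]) -> list[Paper]:
    """Run duplicate removal followed by the selection criteria."""
    return [p for p in remove_duplicates(papers) if is_relevant(p)]
```

Calling `select` on the merged database results yields the candidate set that serves as the starting point for snowballing. Snowballing itself is a manual step and is not modeled here.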
#place(
  auto,
  scope: "parent",
  float: true,
  [
    #set par(leading: 0.3em)
    #set text(size: 8pt)
    #figure(
      caption: [Search and selection process],
      cetz.canvas({
        import cetz.draw: *
        let bs = (2.8, 1)
        set-style(stroke: (thickness: 0.5pt))
        rect((0, 0), (rel: bs), name: "acm")
        rect((0, -(bs.at(1)+0.3)*1), (rel: bs), name: "ieee")
        rect((0, -(bs.at(1)+0.3)*2), (rel: bs), name: "springer")
        rect((bs.at(0)+1.5, -(bs.at(1)+0.3)), (rel: bs), name: "dup")
        rect((bs.at(0)*2+2.25, -(bs.at(1)+0.3)), (rel: bs), name: "sel")
        rect((bs.at(0)*3+3, -(bs.at(1)+0.3)), (rel: bs), name: "snow")
        rect((bs.at(0)*4+3.75, -(bs.at(1)+0.3)), (rel: bs), name: "inc")
        line("acm.east", (rel: (0.75, 0)), name: "dlu")
        line("ieee.east", (rel: (0.75, 0)))
        line("springer.east", (rel: (0.75, 0)), name: "dld")
        line("dlu.end", "dld.end", name: "dl")
        set-style(mark: (end: "straight"))
        line("dl.50%", "dup.west")
        line("dup.east", "sel.west")
        line("sel.east", "snow.west")
        line("snow.east", "inc.west")
        content("acm", align(center)[ACM Digital Library \ n = ])
        content("ieee", align(center)[IEEE Xplore \ n = ])
        content("springer", align(center)[Springer Link \ n = ])
        content("dup", align(center)[Duplicate removal \ n = ])
        content("sel", align(center)[Application of \ selection criteria \ n = ])
        content("snow", align(center)[Snowballing \ n = ])
        content("inc", align(center)[Publications included \ n = ])
      })
    ) <sas_fig>
  ]
)

== Selection criteria <selection_criteria_s>
For a publication to be relevant, it has to satisfy at least one inclusion criterion and no exclusion criterion.
The criteria were chosen to include as many publications as possible while still filtering out irrelevant ones.
#[
  #v(10pt)
  #set enum(numbering: (.., i) => "IC" + str(i))
  + Publications discussing advantages and disadvantages of DFA compared to other optimization techniques.
  + Publications focusing on one or more compilers (e.g., LLVM, Java JIT, C\# JIT).
  + Publications providing an implementation for a DFA optimization.
  #v(10pt)
]
We chose _IC1_ to help answer _RQ1_.
\ _IC2_ includes publications that discuss a compiler and how DFA is implemented in it. \
_IC3_ additionally includes publications that directly provide an implementation.
#[
  #v(10pt)
  #set enum(numbering: (.., i) => "EC" + str(i))
  + Publications which discuss DFA in a non-compiler context.
  + Publications written in a language other than English.
  + Secondary and tertiary publications (e.g., systematic literature reviews, surveys).
  + Publications in the form of tutorial papers, short papers, poster papers, editorials.
  + Publications for which the full text is not available.
  + Publications published before 2010.
  #v(10pt)
]
_EC1_ excludes publications that discuss DFA in contexts not relevant to compiler optimization. \
_EC2-EC5_ exclude publications that do not provide enough information to be included in this publication. \
_EC6_ ensures that the publications are still relevant.

== Data extraction <data_extraction_s>
Based on the research questions, I collected nine data items to extract from all included publications. @data_extraction_table lists all data items. \
Data items _D1-D3_ document the source of the publication. \
_D4_ and _D5_ explicitly list the advantages and disadvantages for answering _RQ1_. \
_D6_ and _D7_ show in which compiler DFA was implemented and whether it runs directly on a programming language like C++ or on an intermediate language like LLVM IR. \
_D8_ lists which optimizations were performed based on the results of DFA, and _D9_ lists the limitations of the executed DFA (e.g., only run on function scope). \
All data items were extracted from the full text of all included publications.
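
The nine data items map naturally onto a flat record. The sketch below shows one possible representation in Python. The class name `ExtractionRecord`, the field names, and the helper `by_item_id` are illustrative assumptions; only the order of the fields follows the numbering D1 to D9.

```python
from dataclasses import dataclass, fields


@dataclass
class ExtractionRecord:
    """One row of extracted data per included publication (hypothetical layout)."""
    authors: list[str]            # D1, documentation
    year: int                     # D2, documentation
    title: str                    # D3, documentation
    advantages: list[str]         # D4, named advantages of DFA (RQ1)
    disadvantages: list[str]      # D5, named disadvantages of DFA (RQ1)
    compilers: list[str]          # D6, analyzed compilers (RQ2)
    target_languages: list[str]   # D7, e.g. a source language or an IR (RQ2)
    optimizations: list[str]      # D8, optimizations implemented with DFA (RQ2)
    limitations: list[str]        # D9, limitations of the analysis (RQ2)


def by_item_id(record: ExtractionRecord) -> dict[str, object]:
    """Map each field to its data item ID ("D1" to "D9") in declaration order."""
    return {
        f"D{i}": getattr(record, f.name)
        for i, f in enumerate(fields(record), start=1)
    }
```

Keeping the fields in the same order as the data items lets the ID be derived from the position, so a tabular export can use the same IDs without a separate lookup table.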
#place(
  auto,
  scope: "parent",
  float: true,
  [
    #set par(leading: 0.3em)
    #set text(size: 9pt)
    #figure(
      caption: [Data items],
      supplement: "Table",
      table(
        columns: (1fr, 8fr, 2fr),
        stroke: (x, y) => if y == 0 { (bottom: 0.7pt + black) },
        align: left,
        inset: (x: 6pt, y: 2pt),
        [ID], [Data], [Purpose],
        ..(
          ([Author(s)], [Documentation]),
          ([Publication year], [Documentation]),
          ([Title], [Documentation]),
          ([Named advantage(s) of DFA for CO], [RQ1]),
          ([Named disadvantage(s) of DFA for CO], [RQ1]),
          ([Analyzed compiler(s)], [RQ2]),
          ([Targeted language(s) of the optimization], [RQ2]),
          ([What optimizations are implemented with DFA], [RQ2]),
          ([Limitations of the analysis], [RQ2])
        ).enumerate(start: 1).map(((i, arr)) => ([D#i], ..arr)).flatten()
      )
    ) <data_extraction_table>
  ]
)

#colbreak()
#set heading(numbering: none)
#bibliography("refs.bib", title: "References", style: "association-for-computing-machinery")

/*
#colbreak(weak: true)
#set heading(numbering: "A.a.a")

= Artifact Appendix
In this section we show how to reproduce our findings.
*/