README.md 7.85 KB
Newer Older
Dr. Michael Petter's avatar
Dr. Michael Petter committed
1
# SimpleC
Dr. Michael Petter's avatar
Dr. Michael Petter committed
2

Dr. Michael Petter's avatar
Dr. Michael Petter committed
3
This is a simplified C-like frontend, which produces Compilation Units, which
Dr. Michael Petter's avatar
Dr. Michael Petter committed
4 5
hold Control Flow Graphs.

Dr. Michael Petter's avatar
Dr. Michael Petter committed
6 7 8 9 10 11
## binary distribution

You can get a bleeding edge built from the git sources from https://versioncontrolseidl.in.tum.de/petter/simplec/-/jobs/artifacts/master/download?job=build

## sources

Dr. Michael Petter's avatar
Dr. Michael Petter committed
12 13 14 15
Both projects come with a Netbeans Project file -- but You can of course just
use the command line build tool ANT, whose build.xml files come with the
projects.

Dr. Michael Petter's avatar
Dr. Michael Petter committed
16
```bash
Dr. Michael Petter's avatar
Dr. Michael Petter committed
17
git clone git@versioncontrolseidl.in.tum.de:petter/simpleC.git
Dr. Michael Petter's avatar
Dr. Michael Petter committed
18
cd simplec
Dr. Michael Petter's avatar
Dr. Michael Petter committed
19
ant
Dr. Michael Petter's avatar
Dr. Michael Petter committed
20
```
Dr. Michael Petter's avatar
Dr. Michael Petter committed
21

Dr. Michael Petter's avatar
Dr. Michael Petter committed
22
After compilation, you find the usable Frontend in simpleC/simplec/dist/Compiler.jar
Dr. Michael Petter's avatar
Dr. Michael Petter committed
23

24 25 26
It works on a simpleC file like:

```c
Dr. Michael Petter's avatar
Dr. Michael Petter committed
27
void* malloc(int i);
28 29 30 31 32
int foo(){
  int i = 42;
  return i;
}

33 34
int (*g)();

Dr. Michael Petter's avatar
Dr. Michael Petter committed
35
void main(){
36
  int i;
Dr. Michael Petter's avatar
Dr. Michael Petter committed
37 38 39 40 41 42 43
  int* j;
  j = malloc(10);
  for (i=0;i<10;i++){
      if ((i/2)*2==i) continue;
      j[i]=i+1;
  }
  i = j[9];
44 45
  j = &i;
  i = *j;
46
  if (i==5) 
Dr. Michael Petter's avatar
Dr. Michael Petter committed
47
    foo(); // triggers a MethodCall Transition
48 49 50 51 52 53 54 55
  else    
    i=foo(); // triggers an Assignment with an embedded MethodCall Expression!
  while(i==4711) {
    i=42-42;
  }
}
```

Dr. Michael Petter's avatar
Dr. Michael Petter committed
56
## simplec - subproject
Dr. Michael Petter's avatar
Dr. Michael Petter committed
57

Dr. Michael Petter's avatar
Dr. Michael Petter committed
58 59 60 61
The project is split in two subprojects, the SimpleC Frontend and the
Intermediate Representation CFGStructure. SimpleC contains the complete
Frontend of the SimpleC language; You can obtain a SimpleC Controlflowgraph via

Dr. Michael Petter's avatar
Dr. Michael Petter committed
62
```java
Dr. Michael Petter's avatar
Dr. Michael Petter committed
63
CompilationUnit cu = petter.simplec.Compiler.parse(File f);
Dr. Michael Petter's avatar
Dr. Michael Petter committed
64
```
Dr. Michael Petter's avatar
Dr. Michael Petter committed
65

Dr. Michael Petter's avatar
Dr. Michael Petter committed
66
SimpleC supports the following features:
67
- Definitions of functions/procedures with parameters and local variables
68
- Datatypes int, void, function and pointer to a type; arrays are atm. just pointers
Dr. Michael Petter's avatar
Dr. Michael Petter committed
69
- Prototypes for functions
70
- Types for (sub-) expressions
71
- Function pointers
72
- TODO: Weird initializers for arrays
Dr. Michael Petter's avatar
Dr. Michael Petter committed
73
- TODO: initializers/memoryallocation for structs/arrays
74
- TODO: Sizes for types; 
75
- TODO: type checking
Dr. Michael Petter's avatar
Dr. Michael Petter committed
76
- TODO: Casts
77
- basic arithmetics like +/-/*//
Dr. Michael Petter's avatar
Dr. Michael Petter committed
78
- address operators &/* 
79
- Prefix/Postfix operators work, just not for array expresssions
Dr. Michael Petter's avatar
Dr. Michael Petter committed
80
- control flow structures: if/while/for/goto/switch-case/do-while incl. break/continue
81
- Procedure call as (sub-) expressions
Dr. Michael Petter's avatar
Dr. Michael Petter committed
82
- TODO: break/continue-labels
83

Dr. Michael Petter's avatar
Dr. Michael Petter committed
84

Dr. Michael Petter's avatar
Dr. Michael Petter committed
85
## cfgstructure - subproject
Dr. Michael Petter's avatar
Dr. Michael Petter committed
86

Dr. Michael Petter's avatar
Dr. Michael Petter committed
87 88 89 90 91 92 93 94 95
The data structures for the IR can be found in the cfgstructure subproject. The
root for all IR related data is petter.cfg.CompilationUnit . You can obtain
all procedures and fields from a compilation unit, as well as general information
like all states and translations of variable-ids (integers) to 
source-level-variable-names. A petter.cfg.Procedure is a model of a particular CFG.
It offers informations like programstates, beginstate, endstate, and transitions
between the states. Normally, You do not have to touch this one directly, instead
rely on the fixpoint engine:

Dr. Michael Petter's avatar
Dr. Michael Petter committed
96 97 98 99 100 101 102
## Program Transformations

The CFG structure package is designed based on edges, whose static data (i.e. label,
e.g. lhs/rhs of an assignment) is immutable after construction. In case of manipulation
of the CFG, the programmer is supposed to introduce new edges and delete old ones.


Dr. Michael Petter's avatar
Dr. Michael Petter committed
103
## Fixpoint engine
Dr. Michael Petter's avatar
Dr. Michael Petter committed
104

Dr. Michael Petter's avatar
Dr. Michael Petter committed
105
For the fixpoint-engine, you need to create your own class:
Dr. Michael Petter's avatar
Dr. Michael Petter committed
106

Dr. Michael Petter's avatar
Dr. Michael Petter committed
107
```java
Dr. Michael Petter's avatar
Dr. Michael Petter committed
108 109 110 111
import petter.cfg.*;
import petter.cfg.edges.*;
import java.io.*;

Dr. Michael Petter's avatar
Dr. Michael Petter committed
112

Dr. Michael Petter's avatar
Dr. Michael Petter committed
113
public class ReachabilityAnalysis extends AbstractPropagatingVisitor<Boolean>{
Dr. Michael Petter's avatar
Dr. Michael Petter committed
114 115 116 117

    static Boolean lub(Boolean b1,Boolean b2){
	    if (b1==null) return b2;
	    if (b2==null) return b1;
118
	    return b1||b2;
Dr. Michael Petter's avatar
Dr. Michael Petter committed
119 120 121 122 123 124 125
    }
    static boolean lessoreq(Boolean b1,Boolean b2){
	    if (b1==null) return true;
	    if (b2==null) return false;
	    return ((!b1) || b2);
    }

Dr. Michael Petter's avatar
Dr. Michael Petter committed
126 127
    CompilationUnit cu;
    public ReachabilityAnalysis(CompilationUnit cu){
Dr. Michael Petter's avatar
Dr. Michael Petter committed
128
        super(true); // forward reachability
Dr. Michael Petter's avatar
Dr. Michael Petter committed
129
	    this.cu=cu;
Dr. Michael Petter's avatar
Dr. Michael Petter committed
130
    }
Dr. Michael Petter's avatar
Dr. Michael Petter committed
131
    public Boolean visit(ProcedureCall m,Boolean b){
Dr. Michael Petter's avatar
Dr. Michael Petter committed
132 133 134
        // method calls need special attention; in this case, we just 
        // continue with analysing the next state and triggering the analysis
        // of the callee
Dr. Michael Petter's avatar
Dr. Michael Petter committed
135
	    enter(cu.getProcedure(m.getCallExpression().getName()),true);
Dr. Michael Petter's avatar
Dr. Michael Petter committed
136
	    return b;
Dr. Michael Petter's avatar
Dr. Michael Petter committed
137
    }
Dr. Michael Petter's avatar
Dr. Michael Petter committed
138 139 140 141 142 143 144 145
    public Boolean visit(State s,Boolean newflow){
	    Boolean oldflow = dataflowOf(s);
	    if (!lessoreq(newflow,oldflow)){
	        Boolean newval = lub(oldflow,newflow);
	        dataflowOf(s,newval);
	        return newval;
	    }
	    return null;
Dr. Michael Petter's avatar
Dr. Michael Petter committed
146 147 148
    }
   
    public static void main(String[] args) throws Exception {
Dr. Michael Petter's avatar
Dr. Michael Petter committed
149
        CompilationUnit cu = petter.simplec.Compiler.parse(new File(args[0]));
Dr. Michael Petter's avatar
Dr. Michael Petter committed
150 151
        ReachabilityAnalysis ra = new ReachabilityAnalysis(cu);
        Procedure foo = cu.getProcedure("main");
Dr. Michael Petter's avatar
Dr. Michael Petter committed
152
	    DotLayout layout = new DotLayout("jpg","main.jpg");
Dr. Michael Petter's avatar
Dr. Michael Petter committed
153
        ra.enter(foo,true);
Dr. Michael Petter's avatar
Dr. Michael Petter committed
154
        ra.fullAnalysis();
Dr. Michael Petter's avatar
Dr. Michael Petter committed
155
	    for (State s: foo.getStates()){
Dr. Michael Petter's avatar
Dr. Michael Petter committed
156
	        layout.highlight(s,(ra.dataflowOf(s))+"");
Dr. Michael Petter's avatar
Dr. Michael Petter committed
157 158
	    }
	    layout.callDot(foo);
Dr. Michael Petter's avatar
Dr. Michael Petter committed
159 160
    }
}
161
```
Dr. Michael Petter's avatar
Dr. Michael Petter committed
162
#### Abstract Data Flow Control
163 164 165 166 167 168 169 170
The AbstractPropagatingVisitor class already has a queue and evaluates
every node it finds within this queue by calling the particular visit(a,d)
method on the visitor itself. The value, returned by the visitor is given
to the next graph node, according to the Analysis' direction, either forward
along the transition or backward. This goes on until the a visit method 
returns null, which means, that vor this call, nothing will be entered in the
queue.

Dr. Michael Petter's avatar
Dr. Michael Petter committed
171
#### Abstract transitions via overwritten visits
Dr. Michael Petter's avatar
Dr. Michael Petter committed
172 173 174 175
It is thus common in particular analyses to overwrite the appropriate visit(_,d)
methods in order to manipulate the dataflow value d according to the CFG 
transition, that is visited at that time. All not overwritten methods call by
default the defaultBehaviour(a,d) method, in which you can implement generic 
Dr. Michael Petter's avatar
Dr. Michael Petter committed
176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192
transitions behaviour if desired. 

#### Abstractly interpreting expressions
In most cases, you will probably also want to
inspect expressions, that occur at a certain transition. For that, you will have
to either manually traverse the expression tree with lots of instanceofs or
use the petter.cfg.expression.AbstractExpressionVisitor . 

```java
Expression e =..
e.accept(new AbstractExpressionVisitor(){// silly visitor printing all constants
    public void postVisit(IntegerConstant s){
        System.out.println("Found constant "+s.getIntegerConst());
    }
});
```
#### Accumulating dataflow values
Dr. Michael Petter's avatar
Dr. Michael Petter committed
193

194 195 196 197 198 199
In order to implement a Fixpoint iteration, all that remains, is to store
the current dataflowvalue d in the CFG's state s via dataflowOf(s,d) and update
it in case that the newly arriving value in the visit(State,newval) method
is not already captured by it. You can see this in the example's implementation
of visit(State,Boolean).

Dr. Michael Petter's avatar
Dr. Michael Petter committed
200 201

## Graphical output
Dr. Michael Petter's avatar
Dr. Michael Petter committed
202 203 204 205 206 207 208

You can use the petter.cfg.DotLayout class to generate nice output of your
Analysis project; you create it with a file-format-specifier and a file name, and
it creates an OS-process of dot, from the graph layout software Graphviz. If you
then call callDot(p) with a cfgstructure-Procedure, then the tool will feed dot
with a graphical output of the CFG. If you call highlight(a,s) on the DotLayout
object with an Analyzable from the CFG and a custom string, then a speech bubble
209 210
with the string s pointing to a will be added to the graph.

211 212
![CFG example](/cfg.png "example output")

Dr. Michael Petter's avatar
Dr. Michael Petter committed
213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238
The upper graph is the graphical output for
```c
int b=42;
int a[42];
int **d;
int f();
int main(int a,int **f()){
  a=42;
  int c[42];
  while (a!=0 ){
    for(a=0;a<b;a++)
      a--;
    f();
    
  }
  switch(b){
  case 42: return 0;
  case 0: d[a]=*f();
  case 4711: f();
  defaut: break;
  }
  return a[0];
}
```


239 240 241
You can even visualize your fixpoint iteration, by overwriting the 
AbstractPropagatingVisitor.enter(a,d) method, to highlight the currently propagated
dataflow value at the appropriate graph component, and draw each step into a
Dr. Michael Petter's avatar
Dr. Michael Petter committed
242 243 244 245 246 247 248 249
different file, creating a sequence which visualizes your whole fixpoint computation.

## See also

For a simplified C Frontend for Ocaml, make sure to check out the following two projects:

* copt https://versioncontrolseidl.in.tum.de/teaching/copt
* copt2 https://versioncontrolseidl.in.tum.de/teaching/copt2