Chapter 11. Control Flow – Andrew Shitov

This is a chapter from
Creating a compiler with Raku

We started Chapter 8 with an attempt to implement the if keyword. It was tricky to do so in the interpreter that executes the code line by line. Introducing the AST allowed to parse the program without executing it, so we can first build the node and only run the code behind it if the condition is true.

Implementing `if`

The language construction we are going to implement first is the prefix if clause, demonstrated on the following example:

my flag = 42;
if flag say "Depends";
say "Always";

The if keyword is followed by a value (which is a condition) and a statement. A value is a value in the sense of our grammar, thus it is either a variable as in the example, or a string, or an expression. A statement can be any construct recognised by the statement rule.

Embedding the if clause to the grammar is simple, as we have all statement definitions in one rule:

rule condition {
    'if' <value>
}

rule statement {
    <condition> ? [
        | <statement=variable-declaration>
        | <statement=assignment>
        | <statement=function-call>
    ]
}

We have seen earlier that we should not execute the statement before we know if the condition is true. In the AST world, we can pick the statement node out of the tree and place it to an attribute of the condition node. Later, the evaluator decides whether to execute the statement node. So, we need a new kind of an AST node:

class AST::Condition is ASTNode {
    has ASTNode $.value;
    has ASTNode $.statement;
}

In the actions class, the method trigged in response to a parsed statement should now be split into two branches. When there is no if clause, everything remains as is. When there is such a clause, the statement node is enters the tree via a proxy AST::Condition node:

multi method statement($/ where !$<condition>) {
    $/.make($<statement>.made);
}

multi method statement($/ where $<condition>) {
    $/.make(AST::Condition.new(
        value => $/<condition><value>.made,
        statement => $/<statement>.made,
    ));
}

The whole AST for the test program is drawn below.

Take a look at the two calls of say. The first call is a child of the AST::Condition node. The second one (in the right side of the diagram) is located at the top level inside the TOP’s @.statements array.

The tree is ready and we can evaluate it. The LinguaEvaluator class needs a way to react to the AST::Condition type of AST nodes. If should check the condition value and pass execution further if needed.

multi method eval-node(AST::Condition $node) {
    if $node.value.value {
        self.eval-node($node.statement);
    }
}

That’s all! Test the program with different values of the flag variable and see how the program reacts.

To complete the implementation, add a test in the t directory:

my flag = 1;
if flag say "Printed"; ## Printed

flag = 0;
if flag say "Ignored";

say "Done"; ## Done

Implementing `else`

A logical next step is implementing the else clause. Let us expand the test program:

my flag = 0;
if flag say "Depends" else say "Otherwise";
say "Always";

The else keyword can only appear if there was an if before. In our current grammar, adding the else block after the statement is tricky. It is better to re-organise the grammar so that the statement rule contains different variants of statements only.

Let us allow the if–else construct on the top level:

rule TOP {
    [
        | <one-line-comment>
        | <statement=conditional-statement> ';'
        | <statement> ';'
    ]*
}

rule conditional-statement {
    'if' <value> <statement>
    [
        'else' <statement>
    ]?
}

rule statement {
    | <statement=variable-declaration>
    | <statement=assignment>
    | <statement=function-call>
}

Notice that in the TOP rule, the new alternative is given an alias, statement, which let us keep the same loop in the corresponding action:

method TOP($/) {
    . . .
    for $<statement> -> $statement {
        $top.statements.push($statement.made);
    }
    . . .
}

OK, we can have one or two statements in the conditional construct now. The AST::Conditional node gets the second attribute:

class AST::Condition is ASTNode {
    has ASTNode $.value;
    has ASTNode $.statement;
    has ASTNode $.antistatement;
}

The new attribute is filled with either AST::Null (if there is no else clause) or with a AST::Statement node:

method conditional-statement($/) {
    $/.make(AST::Condition.new(
        value => $/<value>.made,
        statement => $/<statement>[0].made,
        antistatement => $/<statement>[1] ?? 
                         $/<statement>[1].made !! AST::Null,
    ));
}

The statement action can now be reverted to a single method with a trivial body:

method statement($/) {
    $/.make($<statement>.made);
}

Finally, the evaluator must deside which statement to execute:

multi method eval-node(AST::Condition $node) {
    if $node.value.value {
        self.eval-node($node.statement);
    }
    else {
        self.eval-node($node.antistatement);
    }
}

We have to implement the no-operation behaviour of the AST::Null node too:

multi method eval-node(AST::Null) {
}

The else keyword is ready and functioning, and a test t/else.t must be added to the repository:

if 1 say "A" else say "B"; ## A
if 0 say "A" else say "B"; ## B

my x;
if 1 x = 10 else x = 20;
say x; ## 10

if 0 x = 30 else x = 40;
say x; ## 40

Implementing a loop

Let us implement a loop in our Lingua language. At the moment, let us make it simple and allow only the loop keyword with the counter variable decremented by the compiler.

my n = 5;
loop n say n;

This program prints the values from 5 to 1 and stops.

The syntax of the loop clause is similar to the if condition we saw in this chapter, and it can be embedded on the top level of the grammar too:

rule TOP {
    [
        | <one-line-comment>
        | <statement=conditional-statement> ';'
        | <statement=loopped-statement> ';'
        | <statement> ';'
    ]*
}

rule loopped-statement {
    'loop' <variable-name> <statement>
}

Next, we need a new type of AST node:

class AST::Loop is ASTNode {
    has ASTNode $.variable;
    has ASTNode $.statement;
}

The syntax is chosen in such a way so that you have to provide the loop with an already existing variable. In this case, we do not need to create a temporary loop variable. In the action method, a reference to the variable is kept in an AST::Variable node:

method loopped-statement($/) {
    $/.make(AST::Loop.new(
        variable => AST::Variable.new(
            variable-name => ~$/<variable-name>
        ),
        statement => $/<statement>.made,
    ));
}

Decreasing the counter is done in the evaluator:

multi method eval-node(AST::Loop $node) {
    while %!var{$node.variable.variable-name} {
        self.eval-node($node.statement);
        %!var{$node.variable.variable-name}--;
    }
}

It may be safer to re-write the stop condition a bit:

while %!var{$node.variable.variable-name} > 0 {
    . . .

This allows to avoid infinite loops when the initial value is not an integer number, for example:

my n = 5.5;
loop n say n;

Let’s save our achievements in the test suite. As the loop prints a few lines, you should add them all to the test file:

my n = 3;
loop n say n;
## 3
## 2
## 1

Statement blocks

Until now, both if and else constructs could contain only a single statement. Of course, if you need more than one, you can always add an if clause to each of them, but it would be much nicer to implement support of code blocks, which is present in many other languages. The goal of this section is to execute the following program with the blocks directed by both if–else and loop:

my c = 0;
if c {
    say "c = $c";
    say "IF block";
}
else {
    say "c = $c";
    say "ELSE block";
}

my n = 5;
loop n {
    say n;
    say 2 * n;
}

Let us make a first attempt to replace a statement with a newly defined block. If the block contains a single instruction, curly braces are optional:

rule block {
    | '{' ~ '}' <statement>* %% ';'
    | <statement> ';'
}

The '{' ~ '}' <X> construct in Raku is equivalent to '{' <X> '}', but it often looks cleaner, especially when X is a represented by a big piece of code.

On a conceptual level, we have to allow code blocks where a statement is allowed in the grammar. From a practical perspective, we have to think about semicolons at the end of the statements.

For example, a semicolon is not allowed before the else keyword:

if 1 say "A"; else say "B";

The problem is that it still stands after a statement in the grammar. So, we should distinguish between a statement in the if clause with and without the else clause.

Also, we should not insist on a semicolon after the closing curly brace:

loop n {
    say n;
    say 2 * n;
};

And finally, a semicolon before the closing brace can be declared optional:

loop n {
    say n;
    say 2 * n
}

To satisfy all the conditions listed above, let us tell the grammar where we expect a semicolon and where it is not needed. We can introduce two multi-alternatives of the block rule (remember how we were adding arguments to the methods in our grammar for the calculator).

multi rule block() {
    | '{' ~ '}' <statement>* %% ';'
    | <statement>
}

multi rule block(';') {
    | '{' ~ '}' <statement>* %% ';'
    | <statement> ';'
}

By doing so, we do not add a new rule name. Now, we can rewrite the rules for conditions and preliminary introduce the while loop (we’ll return to it at the end of the chapter):

rule conditional-statement {
    | 'if' <value> <block()> 'else' <block(';')>
    | 'if' <value> <block(';')>
}

rule loopped-statement {
    'while' <variable-name> <block(';')>
}

Notice that in the conditional-statement rule, it was also possible to use square brackets to extract the common part:

rule conditional-statement {
    'if' <value> [
        | <block()> 'else' <block(';')>
        | <block(';')>
    ]
}

You may prefer either; my own personal preference is using the form where you see both branches one after another:

    | 'if' <value> <block()> 'else' <block(';')>
    | 'if' <value> <block(';')>

Finally, adjust the TOP rule to remove extra semicolons from there:

rule TOP {
    [
        | <one-line-comment>
        | <statement=conditional-statement>
        | <statement=loopped-statement>
        | <statement> ';'
    ]*
}

Now to the actions part. First, lets us change the actions for the conditional constructs and use blocks there:

method conditional-statement($/) {
    $/.make(AST::Condition.new(
        value => $/<value>.made,
        statement => $/<block>[0]<statement>[0].made,
        antistatement => $/<block>[1] ??
            $/<block>[1]<statement>[0].made !! AST::Null,
    ));
}

method loopped-statement($/) {
    $/.make(AST::Loop.new(
        variable => AST::Variable.new(
            variable-name => ~$/<variable-name>
        ),
        statement => $/<block><statement>[0].made,
    ));
}

Although the result contains long chains of keys and indices of the match object, such as $/<block>[1]<statement>[0], the transformation is simple and easy to understand. You can test the program, and it will execute the first statement from each curly block.

The next step is to modify the AST so that it keeps more than one statement where a block is allowed. We have to keep all of them, so let us use arrays instead of scalars:

class AST::Condition is ASTNode {
    has ASTNode $.value;
    has ASTNode @.statements;
    has ASTNode @.antistatements;
}

class AST::Loop is ASTNode {
    has ASTNode $.variable;
    has ASTNode @.statements;
}

The corresponding actions should create the new AST nodes and pass all the statements from the code blocks. In the next fragment, we are using maps to loop over the statements:

method conditional-statement($/) {
    $/.make(AST::Condition.new(
        value => $/<value>.made,
        statements => ($/<block>[0]<statement>.map: *.made),
        antistatements => $/<block>[1] ?? 
            ($/<block>[1]<statement>.map: *.made) !! (AST::Null),
    ));
}

method loopped-statement($/) {
    $/.make(AST::Loop.new(
        variable => AST::Variable.new(
            variable-name => ~$/<variable-name>
        ),
        statements => ($/<block><statement>.map: *.made),
    ));
}

Blocks in the AST

OK, we can return to the AST and let the nodes keep arrays of statements when the block in curly braces was seen.

Let us take a closer look at the loop from the example above:

loop n {
    say n;
    say 2 * n;
}

In the form of AST, this fragment takes the following shape:

AST::Loop.new(
    variable => AST::Variable.new(
        . . .
    ),
    statements => Array[ASTNode].new(
        AST::FunctionCall.new(
            . . .
        ),
        AST::FunctionCall.new(
            . . .
        )
    )
)

Again, the abstract nature of the abstract syntax tree is demonstrated here. Although we added the block rule to the grammar, there is no trace of it in the tree. The two say calls appeared immediately as the elements of the statements array attribute within the AST::Loop node.

What we need is to loop over the statements in the evaluator:

multi method eval-node(AST::Condition $node) {
    if $node.value.value {
        self.eval-node($_) for $node.statements;
    }
    else {
        self.eval-node($_) for $node.antistatements;
    }
}

multi method eval-node(AST::Loop $node) {
    while %!var{$node.variable.variable-name} > 0 {
        self.eval-node($_) for $node.statements;
        %!var{$node.variable.variable-name}--;
    }
}

That’s it! Introducing the { } blocks was a relatively simple task.

A real-life test

If you run the loop, you’ll see an error very soon:

5
10
4
Cannot shift from an empty Array[ASTNode]
  in method value at /Users/ash/lingua/LinguaAST.rakumod    
  (LinguaAST) line 112

The first three lines display the expected beginning of the result, but then, everything suddenly stopped. The error message helps to understand that this happened because we were using shift to compute the expressions. Within a loop, expressions can be reused, so we should not change them. Let us change all shifts to loops. There are two such places in our code. The first one in the AST::MathOperations class:

method value() {
    my $result = @.operands[0].value;

    for 1 .. @.operands.elems - 1 -> $i {
        $result = operation(@.operators[$i - 1], 
                  $result, @.operands[$i].value);
    }

    return $result;
}

The second in one of the multi-methods of the evaluator:

multi method eval-node(AST::HashAssignment $node) {
    %!var{$node.variable-name} = Hash.new;
    for 0 .. $node.keys.elems - 1 -> $i {
        %!var{$node.variable-name}.{$node.keys[$i]} =
            $node.values[$i].value;
    }
}

The test code is working now. Let us fix the behaviour in the test with similar instructions:

my n = 5;
loop n {
    my n2 = n * n;
    say "n = $n, n2 = $n2";
}
## n = 5, n2 = 25
## n = 4, n2 = 16
## n = 3, n2 = 9
## n = 2, n2 = 4
## n = 1, n2 = 1

Don’t forget to test the if and else blocks, too. The program should be able to execute all the statements from each of them.

Conditional operators

So far, the if clause can only check if the value is zero or not. Let us make it more powerful and allow conditional and Boolean operators:

|  &  <  <=  >  >=  ==  !=

We are not going to introduce the Boolean data type, but instead, let’s add the operators to existing expressions. It makes both 3 + n and x < y valid expressions, which can be used as a condition in the if test.

The above listed operators are binary operators and have precedence lower than arithmetic operators + and -. Among themselves, & has a higher precedence. So, we have to update the op tokens in the grammar and add the new operators:

multi token op(1) {
    | '|'
    | '<'  | '>'
    | '<=' | '>='
    | '==' | '!='
}

multi token op(2) {
    '&'
}

Update the rest to explicitly give the precedence:

multi token op(3) {
    '+' | '-'
}

multi token op(4) {
    '*' | '/'
}

multi token op(5) {
    '**'
}

multi rule expr(6) {
    | <number>
    | <variable-name> <index>?
    | '(' <expression> ')'
}

Add the functions to the AST::MathOperations class:

multi sub operation('|', $a, $b) {
    $a || $b
}

multi sub operation('&', $a, $b) {
    $a && $b
}

multi sub operation('<', $a, $b) {
    $a < $b
}

multi sub operation('<=', $a, $b) {
    $a <= $b
}

multi sub operation('>', $a, $b) {
    $a > $b
}

multi sub operation('>=', $a, $b) {
    $a >= $b
}

multi sub operation('!=', $a, $b) {
    $a != $b
}

multi sub operation('==', $a, $b) {
    $a == $b
}

We rely on Raku’s operators here. Another alternative would be to only generate 1 or 0, for example:

multi sub operation('<', $a, $b) {
    $a < $b ?? 1 !! 0
}

In the test, try different operators and their combinations:

my x = 10;
my y = 20;

if x > y say ">" else say "<"; ## <
if x < y say ">" else say "<"; ## >

if x != y say "!=" else say "=="; ## !=
if x != x say "!=" else say "=="; ## ==

if x == y say "==" else say "!="; ## !=
if x == x say "==" else say "!="; ## ==

if 5 <= 5 say "5 <= 5"; ## 5 <= 5
if 5 <= 6 say "5 <= 6"; ## 5 <= 6

if 5 >= 5 say "5 >= 5"; ## 5 >= 5
if 6 >= 5 say "6 >= 5"; ## 6 >= 5

if (10 > 1) & (5 < 50) say "OK 1"; ## OK 1
if (10 < 1) | (5 < 50) say "OK 2"; ## OK 2

Implementing the `while` loop

Before the end of the chapter, let us implement the while keyword, which is illustrated by the following code:

my n = 1;
while n <= 5 {
    say n;
    n = n + 1
}

The loop body is run until the condition stops being true.

Syntactically and semantically, the while loop is very close to the loop cycle we worked on before. Implementing it is mostly a copy-and-paste work.

In the grammar:

rule TOP {
    [
        . . .
        | <statement=loopped-statement>
        | <statement=while-statement>
        . . .
    ]*
}

rule while-statement {
    'while' <value> <block(';')>
}

Define a new AST node:

class AST::While is ASTNode {
    has ASTNode $.value;
    has ASTNode @.statements;
}

In the actions:

method while-statement($/) {
    $/.make(AST::While.new(
        value => $/<value>.made,
        statements => ($/<block><statement>.map: *.made),
    ));
}

In the evaluator:

multi method eval-node(AST::While $node) {
    while $node.value.value {
        self.eval-node($_) for $node.statements;
    }
}

The while construct is implemented. Add a test and move on to the next chapter.

my n = 1;
while n <= 5 {
    say n;
    n = n + 1
}
## 1
## 2
## 3
## 4
## 5

my k = 1;
while k < 10 k = k + 1;
say k; ## 10

2 thoughts on “Chapter 11. Control Flow”

Pingback: Chapter 10. Test suite – Andrew Shitov’s Blog
Nikhil Bhamidipati says:

November 11, 2023 at 2:42 pm

What a masterpiece this entire series of articles has been!!
Until now i have only heard Raku is a power tool, but these articles make me see what a magnificient super-tool it is!!

Implementing if

Implementing else