Adding support for if, while, and do while, to a simple compiler and virtual machine written in Php

1. Download the simple compiler and virtual machine in Php

 

2. Previous blog posts

Let's continue to build our little bytecode compiler in Php, along with a small virtual stack machine to execute the resulting bytecode.

  • In the previous blog post, we built a simple bytecode compiler and a virtual stack machine in Php, that can execute arbitrary expressions and function calls.
  • In the blog post before that, we used the Bison parser generator to generate parser tables in XML for use in Php.
  • In the first blog post on the subject, we demonstrated how to use the PCRE library, which is built into Php (preg_match()), as an effective lexer engine.

In order to obtain a turing-complete programming language, that is, a language in which we can implement and execute any arbitrary algorithm, that is, that other turing-complete machines can execute too, we now need to add a way of (conditionally) branching, that is, the IF statement, and at least one way of repeating statements, that is the WHILE or the DO WHILE statements.

If you carefully look at real CPU machine code, besides storing and retrieving values in memory, looping and branching is pretty much all that a contemporary Von Neumann CPU can do. Therefore, a programming language is turing-complete, if it can do the same.

 

3. The example TL program

The following example program is written in Turing-complete Language (TL). By the end of this blog post, our compiler will be able to compile it, and our virtual machine will be able to execute it:

Show/Hidden php code
View source
 
//This is an turing-complete-language program
a=0;
while(a<15)
{
        if(a<5) println('a is smaller than 5');
        else println('a is larger than 5');
        if(a>=9 && a<=10) println('a is between 9 and 10');
        if(a==1 || a==9) println('a is now 1 or 9');
        println('a:'.a);
        a=a+1;
}
x=1;
do
{
        y=x^2;
        println('x is:'.x.' and y is:'.y);
        if(x==4 || (y>4 && y<30)) 
        {
                println('x is 4 or y is larger than 4, but smaller than 30');
        }
        else
        {
                println('nothing special to mention about x or y');
        }
        x=x+1;
}
while(x<10);
 

As you can see, the TL language allows for expressing IF, WHILE, and DO WHILE statements. In the remainder of this blog post, we will add support for these statements to the simple expression language (EL) compiler and virtual stack machine, implemented in previous blog posts.

 

4. The IF statement

4.1. Analysis of the IF statement

A typical IF statement looks like this:

Show/Hidden php code
View source
 
if_statement:
        if ( condition ) if_statements else else_statements;
 

If the condition is true, we execute the if_statements; otherwise, we execute the else_statements. Therefore, in terms of a stack machine, we first evaluate the condition. The condition's result will be available at the top of the stack. If it is false, we jump over the if_statements, at the point at which the else_statements start; otherwise, we start executing the if_statements. After executing the if_statements, we jump until after the else_statements:

Show/Hidden php code
View source
 
--condition--       
                        IF_FALSE <LABEL_ELSE>
--if_statements --
                        JUMP <LABEL_IF_END>
                        LABEL <LABEL_ELSE>
--else_statements --
                        LABEL <LABEL_IF_END>
 

Our stack machine will, therefore, need the ability to JUMP unconditionally, and conditionally, IF_FALSE, to a LABEL.

4.2. Adding support to the lexer for the IF statement

Our lexer now needs to the ability to recognize the IF and ELSE keywords, as well as the left and right curly brackets ({ }). We extend the lexer rules for this, in the file TLParser.php, in the method prepareLexer():

Show/Hidden php code
View source
 
$lexerPattern->addRule('IF','if');
$lexerPattern->addRule('ELSE','else');
//brackets: CBRACKET_LEFT CBRACKET_RIGHT
$lexerPattern->addRule('CBRACKET_LEFT','\{');
$lexerPattern->addRule('CBRACKET_RIGHT','\}');
 

4.3. Adding support to the parser for the IF statement

First, we need to add support for a statement block. Anywhere, where we can write statement, we can also write { statement1; statement2; ... }. The Bison grammar rules for a statement block are:

Show/Hidden php code
View source
 
block: statement;
block: CBRACKET_LEFT block_statements CBRACKET_RIGHT;
block: CBRACKET_LEFT CBRACKET_RIGHT;
block_statements: block_statements block_statement;
block_statements: block_statement;
block_statement: statement;
 

Watch out for the empty block statement: { /* empty */ }. There are many alternative ways to specify empty blocks, but Bison does not accept all alternatives. For quite a few alternatives, Bison will generate shift/reduce conflict warnings. It may not be easy to understand, at first glance, why exactly Bison generates these conflict warnings. This is a general problem with optional (=empty) clauses in a grammar. When a rule is optional, it must be possible to unambiguously know from the lookahead character at the point of the empty rule, that it is the empty rule that must be reduced, and not another rule. This is sometimes difficult to phrase correctly in Bison grammar. In this case, we avoid the whole shift/reduce conflict, by stating explictly that the block can be empty. There is no confusion possible, by phrasing the rule as following:

Show/Hidden php code
View source
 
block: CBRACKET_LEFT CBRACKET_RIGHT;
 

The rules for the IF statement, initially look like this:

Show/Hidden php code
View source
 
if_statement: IF ( condition ) block else_clause; 
else_clause: ELSE block;
else_clause: ;
 

The bison parser only stops parsing to call our custom functions, each time he reduces a new rule. It is our custom functions that write out bytecode. So, we must make sure that the bison parser calls back our functions every time we need to add virtual machine instructions to the output. Therefore, we need more callback points <<CB>>, than available in the initial rules. In order to generate the required JUMP, IF_FALSE, and LABEL instructions, we need the following callback points:

Show/Hidden php code
View source
 
if_statement: IF ( condition ) <<CB>> block <<CB>> else_clause; <<CB>>
else_clause: ELSE block;
else_clause: ;
 

The parser will, for example, also call us back after recognizing the else_clause. We do not really need it, however. So, we do not attach any callback to the else_clause rules. The parser will also not call us back, however, after recognizing the IF ( condition ) part of the if_statement, or after the block part, where we actually do need it. In order to be able to add the callback points needed, we decompose the rule for the if_statement:

Show/Hidden php code
View source
 
if_statement: if_clause else_clause; <<CB>>
if_clause: if_condition block; <<CB>>
if_condition: IF ( expression ); <<CB>>
else_clause: ELSE block;
else_clause: ;
 

Since the parser calls back automatically at the end of each rule, we have now enough callback points to generate our instructions:

Show/Hidden php code
View source
 
                                     AFTER REDUCING THE RULE, WRITE:
if_condition: IF ( expression );     IF_FALSE <ELSE>
if_clause: if_condition block;       JUMP <IF_END>
                                     LABEL <ELSE>
if_statement: if_clause else_clause; LABEL <IF_END>                                                        
else_clause: ELSE block ;            
else_clause: ;                
 

4.4. Dealing with embedded IF statements

An IF statement can be embedded inside another IF statement. For example:

Show/Hidden php code
View source
 
if(x>5)
{
        if(x>12 && x<15)
        {
                y=x+2;
                println('x='.x.' y='.y);
        }
}
else
{
        print('x is just greater than 5');
}
 

The JUMP statements must not jump to the label for another IF statement, than the one they have been generated for. Therefore, we distinguish between the different IF statements, by numbering their labels. In order to keep track of label numbers, we will assign the label numbers inside the stack states that the parser returns to us, when he reduces a rule. That will allow us to retrieve the number assigned to the IF statement, and number the labels correctly:

Show/Hidden php code
View source
 
if_condition: IF ( expression );     IF_FALSE <ELSE>-<number>      assign <number> inside if_condition stack state
if_clause: if_condition block;       JUMP <IF_END>-<number>        retrieve <number> from if keyword stack state
                                     LABEL <ELSE>-<number>         store <number> inside if_clause stack state
if_statement: if_clause else_clause; LABEL <IF_END>-<number>       retrieve <number> from if_condition stack state
 
else_clause: ELSE block ;            
else_clause: ;                
 

4.5. Adding custom callback functions to the parser for the IF statement

First, we need to recognize the correct rules to which to associate callback functions. We do this in the TLParser.php file in the associateRuleNumbers() method:

Show/Hidden php code
View source
 
$this->associateRuleNumberForLHSSymbol('IF_STATEMENT','if_statement');
$this->associateRuleNumberForLHSSymbol('IF_CLAUSE','if_clause');
$this->associateRuleNumberForLHSSymbol('IF_CONDITION','if_condition');
 

Next, we associate functions to the rule names in the associateReduceCallBacks() method:

Show/Hidden php code
View source
 
$this->registerNamedReduceCallback(
        'IF_CONDITION',
        function($me,$stackStates)
        {
                $me->newLabelNumber($stackStates);
                $me->emitIfFalse($stackStates,'ELSE');
        }
);
$this->registerNamedReduceCallback(
        'IF_CLAUSE',
        function($me,$stackStates)
        {
                $me->propagateLabelNumber($stackStates);
                $me->emitJump($stackStates,'IF_END');
                $me->emitLabel($stackStates,'ELSE');
        }
);
$this->registerNamedReduceCallback(
        'IF_STATEMENT',
        function($me,$stackStates)
        {
                $me->propagateLabelNumber($stackStates);
                $me->emitLabel($stackStates,'IF_END');
        }
);
 

As you can see, the label numbering is done by a set of label numbering methods. These methods inspect the $stackStates supplied, and make sure that. during the next callback, the callback functions are able to retrieve the current label number:

Show/Hidden php code
View source
 
function findLabelNumber($stackStates)
{
        //retrieve first stack state
        $stackState=$stackStates[0];
        //try to find it in the first token
        if($stackState->labelNumber!=null)
                return $stackState->labelNumber;
        //try to find it in the token of the first child stack state
        else if($stackState->token->value[0]->labelNumber!=null)
                return $stackState->token->value[0]->labelNumber;
        else $this->error('Cannot find label number in stack states for RHS');
}
function newLabelNumber($stackStates)
{
        //retrieve first stack state
        $stackState=$stackStates[0];
        //increment label number
        $this->lastLabelNumber++;
        $labelNumber=$this->lastLabelNumber;
        //store label number in IF stack state
        $stackState->labelNumber=$labelNumber;
}
function propagateLabelNumber($stackStates)
{
        //find label number
        $labelNumber=$this->findLabelNumber($stackStates);
        //retrieve first stack state
        $stackState=$stackStates[0];
        //store it in the first stack state
        $stackState->labelNumber=$labelNumber;
}
 

The emission of bytecode, is handled by the emit() series of functions. These are all simple variations on the emitLabeledInstruction($stackStates,$instructionCode,$labelPrefix) method:

Show/Hidden php code
View source
 
function emitLabeledInstruction($stackStates,$instructionCode,$labelPrefix)
{
        $labelNumber=$this->findLabelNumber($stackStates);
        $instruction=Instruction::fromToken(
                        $stackStates[0]->token,
                        $instructionCode,
                        '');
        $instruction->symbol=''; //no symbol
        $instruction->value="<{$labelPrefix}_{$labelNumber}>";
        $this->emit($instruction);                                
}
function emit($instruction)
{
        $this->instructions[]=$instruction;
}
 

After successfully parsing an entire TL program text, we will simply write all instructions in plain text, one by one, to the output:

Show/Hidden php code
View source
 
function printInstructions()
{
        Instruction::debugPrintColumnHeadings();
        foreach($this->instructions as $instruction)
        {
                $instruction->printInstruction();
        }
}
 

4.6. Adding support to the virtual machine for the IF statement

Now we can compile IF statements. However, we still need the virtual machine to gain the capability to execute the new bytecode instructions, that is, LABEL, JUMP, and IF_FALSE. For a starters, the virtual machine needs the capability to continue execution of the bytecode from a different instruction number. To that end, we reform the loop that runs through the vm instructions, and add a $currentInstructionNumber variable to the virtual machine:

Show/Hidden php code
View source
 
class VirtualMachine
{
...
        var $currentInstructionNumber=null; //Current instruction number
...
        function execVMInstructions()
        {
                $this->resetStack();
                //start with the first instruction at the beginning of the array of instructions
                $this->currentInstructionNumber=0;
                //count the total number of instructions
                $countInstructions=count($this->vmInstructions);
                //loop until the last instruction has been reached
                while($this->currentInstructionNumber<$countInstructions)
                {
                        //take the next instruction
                        $vmInstruction=$this->vmInstructions[$this->currentInstructionNumber];
                        //look at its code
                        $code=$vmInstruction->code;
                        //the code must exist. Otherwise, it's an error
                        if(array_key_exists($code,$this->instructionCallbacks))
                        {
                                //retrieve the function associated with this vm instruction code
                                $function=$this->instructionCallbacks[$code];
                                //call the function
                                $function($this,$vmInstruction);
                        }
                        else
                        {
                                $this->error($vmInstruction,
                                        "Unknown IL instruction code: '$code'");
                        }
                        //move to the next instruction
                        $this->currentInstructionNumber++;
                }
        }
 

By reforming the instruction loop, we are now able to manipulate its $currentInstructionNumber variable. First, we register the callbacks for the LABEL, JUMP, and IF_FALSE instructions:

Show/Hidden php code
View source
 
$this->registerInstructionCallback('LABEL',
        function($vm,$vmInstruction)
        {
                /* ignore labels */
        }
);
$this->registerInstructionCallback('JUMP',
        function($vm,$vmInstruction)
        {
                $vm->execJump($vmInstruction);
        }
);
$this->registerInstructionCallback('IF_FALSE',
        function($vm,$vmInstruction)
        {
                $vm->branchConditionally($vmInstruction,false);
        }
);                
 

When reading and parsing the file with vm instructions, we create an array with the instruction (index) numbers for the labels. Now, we can implement the JUMP and IF_FALSE instructions, simply by retrieving the instruction index in the instruction array, and assigning it to the $currentInstructionNumber variable:

Show/Hidden php code
View source
 
function branchConditionally($vmInstruction,$branchCondition)
{
        //get condition result off the stack
        $stackCondition=(bool)$this->popItem($vmInstruction)->value;
        $this->resetStack();
        //if condition fulfilled, then jump
        if(($branchCondition==$stackCondition))
                $this->execJump($vmInstruction);
}
function execJump($vmInstruction)
{
        if(array_key_exists($vmInstruction->value,$this->labelInstructionNumbers))
        {
                $instructionNumber=$this->labelInstructionNumbers[$vmInstruction->value];
                $this->currentInstructionNumber=$instructionNumber;
        }
        else
        {
                $vm->error($vmInstruction,
                        "Undefined IL label '{$vmInstruction->value}'");                                        
        }
}
 

The IF statement is now fully implemented both in the compiler as in the virtual machine. We still need to do the same for the WHILE and the DO WHILE statements.

 

5. The WHILE statement

5.1. Analysis of the WHILE statement

Show/Hidden php code
View source
 
while_statement: while ( condition ) while_statements;
 

The WHILE statement keeps repeating the execution of the while_statements for as long as the condition is true. Therefore, in terms of the stack machine, we first evaluate the condition. The condition's result will be available at the top of the stack. If it is false, we jump over the while_statements; otherwise, we simply start executing the while_statements. After executing the while_statements, we jump back in front of the condition, and start again by evaluating it again:

Show/Hidden php code
View source
 
                        LABEL <WHILE_START>
--condition--       
                        IF_FALSE <WHILE_END>
--while_statements --
                        JUMP <WHILE_START>
                        LABEL <WHILE_END>
 

As you can see, we can simply re-use the LABEL, JUMP, and IF_FALSE instructions already implemented for the IF statement.

5.2. Adding support to the lexer for the WHILE statement

Our lexer will now need to the ability to recognize the WHILE keyword. We extend the lexer rules for this, in the file TLParser.php, in the method prepareLexer():

Show/Hidden php code
View source
 
                $lexerPattern->addRule('WHILE','while');
 

5.3. Adding support to the parser for the WHILE statement

The rules for the WHILE statement, look initially like this:

Show/Hidden bash code
View source
 
while_statement: 
       while ( condition ) block ;
 

We need additional callback points, however:

Show/Hidden bash code
View source
 
while_statement: while <<CB>> ( condition ) <<CB>> block <<CB>>;
 

We decompose the rule for the WHILE statement, in order to add the callback points needed:

Show/Hidden php code
View source
 
                                                   AFTER REDUCING THE RULE, WRITE:
while_prefix: WHILE;      <<CB>>                   LABEL <WHILE_START>
while_clause: while_prefix ( expression ) <<CB>>   IF_FALSE <WHILE_END>   
 
while_statement: while_clause block ;   <<CB>>     JUMP <WHILE_START>
                                                   LABEL <WHILE_END>                         
 

Since we automatically obtain a callback point at the end of each rule, we have now enough callback points to generate our instructions.

5.4. Dealing with embedded WHILE statements

A WHILE statement can be embedded inside another WHILE statement. For example:

Show/Hidden php code
View source
 
y=0;
x=5;
while(x<25)
{
        while(x>12) y=y+1;
        println('x='.x.' y='.y);
        x=x+1;
}
 

the JUMP and IF_FALSE instructions must not jump to the label for another WHILE or IF statement than the one they have been generated for. Therefore, we distinguish between the different IF and WHILE statements, by numbering the labels:

Show/Hidden php code
View source
 
while_prefix: WHILE;                            LABEL <WHILE_START>-<number>              assign <number> in while_prefix
while_clause: while_prefix ( expression )       IF_FALSE <WHILE_END>-<number>             retrieve <number> from while_prefix   
                                                                                          store <number> in while_clause
while_clause block ;                            JUMP <WHILE_START>-<number>               retrieve <number> from while_clause
                                                LABEL <WHILE_END>-<number>                         
 

5.5. Adding custom callback functions to the parser for the WHILE statement

First, we need to recognize the correct rules to which to associate callback functions. We do this in the TLParser.php file in the associateRuleNumbers() method:

Show/Hidden php code
View source
 
$this->associateRuleNumberForLHSSymbol('WHILE_STATEMENT','while_statement');
$this->associateRuleNumberForLHSSymbol('WHILE_CLAUSE','while_clause');
$this->associateRuleNumberForLHSSymbol('WHILE_PREFIX','while_prefix');
 

Next, we associate functions to the rule names in the associateReduceCallBacks() method:

Show/Hidden php code
View source
 
$this->registerNamedReduceCallback(
        'WHILE_PREFIX',
        function($me,$stackStates)
        {
                $me->newLabelNumber($stackStates);
                $me->emitLabel($stackStates,'WHILE_START');
        }
);
$this->registerNamedReduceCallback(
        'WHILE_CLAUSE',
        function($me,$stackStates)
        {
                $me->propagateLabelNumber($stackStates);
                $me->emitIfFalse($stackStates,'WHILE_END');
        }
);
$this->registerNamedReduceCallback(
        'WHILE_STATEMENT',
        function($me,$stackStates)
        {
                $me->propagateLabelNumber($stackStates);
                $me->emitJump($stackStates,'WHILE_START');
                $me->emitLabel($stackStates,'WHILE_END');
        }
);
 

Of course, the emission of bytecode, is handled by the same emit() methods, as the ones we have implemented for the IF statement.

5.6. Adding support to the virtual machine for the WHILE statement

For the IF statement, we had already implemented support for the LABEL, JUMP, and IF_FALSE instructions. Therefore, the WHILE statement is now also automatically supported.

 

6. The DO WHILE statement

6.1. Analysis of the DO WHILE statement

Show/Hidden php code
View source
 
do_while_statement:
     do dowhile_statements
              while ( condition );
 

The DO WHILE statement executes the dowhile_statements. If the condition at the end of the statement is true, it executes the dowhile_statements again. The DO WHILE statement keeps repeating this cycle for as long as the condition is true. Therefore, in terms of the stack machine, first, we execute the dowhile_statements. Next, we evaulate the condition's result. It will be available at the top of the stack. If it is true, we jump back to the start of the dowhile_statements. Otherwise, we leave the DO WHILE statement:

Show/Hidden php code
View source
 
do {
                                LABEL <DO_WHILE_START>
--dowhile_statements --
} while --condition--       
                                IF_TRUE <DO_WHILE_START>
 

As you can see, we can simply re-use the existing LABEL instruction. We will, however, need to implement an IF_TRUE instruction, similar to the IF_FALSE instruction we implemented earlier on.

6.2. Adding support to the lexer for the DO WHILE statement

Our lexer now needs to the ability to recognize the DO keyword. We extend the lexer rules for this, in the file TLParser.php, in the method prepareLexer():

Show/Hidden php code
View source
 
$lexerPattern->addRule('DO','do');
 

6.3. Adding support to the parser for the DO WHILE statement

The rules for the DO WHILE statement, look initially like this:

Show/Hidden bash code
View source
 
do_while_statement: do { block } while ( expression );
 

We need an additional callback point, however:

Show/Hidden bash code
View source
 
do_while_statement: do <<CB>> block while ( condition ); <<CB>> 
 

We decompose the rule for the DO WHILE statement, in order to be able to add the callback point needed:

Show/Hidden php code
View source
 
do_start: do ;                              LABEL <DO_START>
do_while_statement: do_start { block } 
                    while ( expression ) ;  IF_TRUE <DO_START>
 

Since we automatically obtain a callback point at the end of each rule, we have now enough callback points to generate our instructions.

6.4. Dealing with embedded DO WHILE statements

The JUMP, IF_FALSE, and IF_TRUE instructions must not jump to the label for another IF, WHILE, or DO WHILE statement than the one they have been generated for. Therefore, we distinguish between the different IF, WHILE, and DO WHILE statements, by numbering the labels:

Show/Hidden php code
View source
 
do_start: do ;                    LABEL <DO_START>-<number>            assign <number> in do_start
do_while_statement: 
       do_start { block } 
            while ( expression ); IF_TRUE <DO_START>-<number>          retrieve <number> from do_start
 

6.5. Adding custom callback functions to the parser for the DO WHILE statement

First, we need to recognize the correct rules to which to associate callback functions. We do this in the TLParser.php file in the associateRuleNumbers() method:

Show/Hidden php code
View source
 
$this->associateRuleNumberForLHSSymbol('DO_WHILE_STATEMENT','do_while_statement');
$this->associateRuleNumberForLHSSymbol('DO_PREFIX','do_prefix');
 

Next, we associate functions to the rule names in the associateReduceCallBacks() method:

Show/Hidden php code
View source
 
$this->registerNamedReduceCallback(
        'DO_PREFIX',
        function($me,$stackStates)
        {
                $me->newLabelNumber($stackStates);
                $me->emitLabel($stackStates,'DO_START');
        }
);
$this->registerNamedReduceCallback(
        'DO_WHILE_STATEMENT',
        function($me,$stackStates)
        {
                $me->propagateLabelNumber($stackStates);
                $me->emitIfTrue($stackStates,'DO_START');
        }
);
 

Of course, the emission of bytecode, is handled by the same emit() series of functions, as the ones we have implemented for the IF statement. We just need to add the IF_TRUE variant:

Show/Hidden php code
View source
 
function emitIfTrue($stackStates,$labelPrefix)
{
        $this->emitLabeledInstruction($stackStates,'IF_TRUE',$labelPrefix);
}
 

6.6. Adding support to the virtual machine for the DO WHILE statement

For the IF statement, we had already implemented support for the LABEL instruction. Now, we still need to implement support for the IF_TRUE instruction, in a similar way as we did for the IF_FALSE instruction:

Show/Hidden php code
View source
 
$this->registerInstructionCallback('IF_TRUE',
        function($vm,$vmInstruction)
        {
                $vm->branchConditionally($vmInstruction,true);
        }
);                
 

 

7. Bytecode produced for the example TL program

Our little compiler now manages to produce the following bytecode for the example:

Show/Hidden bash code
View source
 
----          |----|------        |--- |---|---|---    |-----
code          |args|symbol        |pos |len|lin|col    |value
----          |----|------        |--- |---|---|---    |-----
PUSH_OPERAND  |    |IDENTIFIER    |46  |1  |2  |1      |a
PUSH_OPERAND  |    |NUMBER        |48  |1  |2  |3      |0
EXEC_OPERATOR |2   |ASSIGN        |47  |1  |2  |2      |=
LABEL         |    |              |52  |15 |4  |1      |<WHILE_START_1>
PUSH_OPERAND  |    |IDENTIFIER    |58  |1  |4  |7      |a
PUSH_OPERAND  |    |NUMBER        |60  |2  |4  |9      |15
EXEC_OPERATOR |2   |LT            |59  |1  |4  |8      |<
IF_FALSE      |    |              |0   |13 |0  |0      |<WHILE_END_1>
PUSH_OPERAND  |    |IDENTIFIER    |77  |1  |6  |12     |a
PUSH_OPERAND  |    |NUMBER        |79  |1  |6  |14     |5
EXEC_OPERATOR |2   |LT            |78  |1  |6  |13     |<
IF_FALSE      |    |              |74  |8  |6  |9      |<ELSE_2>
PUSH_OPERAND  |    |STRING        |90  |21 |6  |25     |a is smaller than 5
PUSH_OPERAND  |    |IDENTIFIER    |82  |7  |6  |17     |println
EXEC_OPERATOR |2   |FUNCTION_CALL |82  |0  |6  |17     |
JUMP          |    |              |0   |10 |0  |0      |<IF_END_2>
LABEL         |    |              |0   |8  |0  |0      |<ELSE_2>
PUSH_OPERAND  |    |STRING        |135 |20 |7  |22     |a is larger than 5
PUSH_OPERAND  |    |IDENTIFIER    |127 |7  |7  |14     |println
EXEC_OPERATOR |2   |FUNCTION_CALL |127 |0  |7  |14     |
LABEL         |    |              |0   |10 |0  |0      |<IF_END_2>
PUSH_OPERAND  |    |IDENTIFIER    |169 |1  |8  |12     |a
PUSH_OPERAND  |    |NUMBER        |172 |1  |8  |15     |9
EXEC_OPERATOR |2   |GE            |170 |2  |8  |13     |>=
PUSH_OPERAND  |    |IDENTIFIER    |177 |1  |8  |20     |a
PUSH_OPERAND  |    |NUMBER        |180 |2  |8  |23     |10
EXEC_OPERATOR |2   |LE            |178 |2  |8  |21     |<=
EXEC_OPERATOR |2   |AND           |174 |2  |8  |17     |&&
IF_FALSE      |    |              |166 |8  |8  |9      |<ELSE_3>
PUSH_OPERAND  |    |STRING        |192 |23 |8  |35     |a is between 9 and 10
PUSH_OPERAND  |    |IDENTIFIER    |184 |7  |8  |27     |println
EXEC_OPERATOR |2   |FUNCTION_CALL |184 |0  |8  |27     |
JUMP          |    |              |0   |10 |0  |0      |<IF_END_3>
LABEL         |    |              |0   |8  |0  |0      |<ELSE_3>
LABEL         |    |              |0   |10 |0  |0      |<IF_END_3>
PUSH_OPERAND  |    |IDENTIFIER    |229 |1  |9  |12     |a
PUSH_OPERAND  |    |NUMBER        |232 |1  |9  |15     |1
EXEC_OPERATOR |2   |EQ            |230 |2  |9  |13     |==
PUSH_OPERAND  |    |IDENTIFIER    |237 |1  |9  |20     |a
PUSH_OPERAND  |    |NUMBER        |240 |1  |9  |23     |9
EXEC_OPERATOR |2   |EQ            |238 |2  |9  |21     |==
EXEC_OPERATOR |2   |OR            |234 |2  |9  |17     |||
IF_FALSE      |    |              |226 |8  |9  |9      |<ELSE_4>
PUSH_OPERAND  |    |STRING        |251 |17 |9  |34     |a is now 1 or 9
PUSH_OPERAND  |    |IDENTIFIER    |243 |7  |9  |26     |println
EXEC_OPERATOR |2   |FUNCTION_CALL |243 |0  |9  |26     |
JUMP          |    |              |0   |10 |0  |0      |<IF_END_4>
LABEL         |    |              |0   |8  |0  |0      |<ELSE_4>
LABEL         |    |              |0   |10 |0  |0      |<IF_END_4>
PUSH_OPERAND  |    |STRING        |287 |4  |10 |17     |a:
PUSH_OPERAND  |    |IDENTIFIER    |292 |1  |10 |22     |a
EXEC_OPERATOR |2   |CONCAT        |291 |1  |10 |21     |.
PUSH_OPERAND  |    |IDENTIFIER    |279 |7  |10 |9      |println
EXEC_OPERATOR |2   |FUNCTION_CALL |279 |0  |10 |9      |
PUSH_OPERAND  |    |IDENTIFIER    |304 |1  |11 |9      |a
PUSH_OPERAND  |    |IDENTIFIER    |306 |1  |11 |11     |a
PUSH_OPERAND  |    |NUMBER        |308 |1  |11 |13     |1
EXEC_OPERATOR |2   |PLUS          |307 |1  |11 |12     |+
EXEC_OPERATOR |2   |ASSIGN        |305 |1  |11 |10     |=
JUMP          |    |              |0   |15 |0  |0      |<WHILE_START_1>
LABEL         |    |              |0   |13 |0  |0      |<WHILE_END_1>
PUSH_OPERAND  |    |IDENTIFIER    |314 |1  |14 |1      |x
PUSH_OPERAND  |    |NUMBER        |316 |1  |14 |3      |1
EXEC_OPERATOR |2   |ASSIGN        |315 |1  |14 |2      |=
LABEL         |    |              |320 |12 |16 |1      |<DO_START_5>
PUSH_OPERAND  |    |IDENTIFIER    |333 |1  |18 |9      |y
PUSH_OPERAND  |    |IDENTIFIER    |335 |1  |18 |11     |x
PUSH_OPERAND  |    |NUMBER        |337 |1  |18 |13     |2
EXEC_OPERATOR |2   |EXPONENTIATE  |336 |1  |18 |12     |^
EXEC_OPERATOR |2   |ASSIGN        |334 |1  |18 |10     |=
PUSH_OPERAND  |    |STRING        |356 |7  |19 |17     |x is:
PUSH_OPERAND  |    |IDENTIFIER    |364 |1  |19 |25     |x
EXEC_OPERATOR |2   |CONCAT        |363 |1  |19 |24     |.
PUSH_OPERAND  |    |STRING        |366 |12 |19 |27     | and y is:
EXEC_OPERATOR |2   |CONCAT        |365 |1  |19 |26     |.
PUSH_OPERAND  |    |IDENTIFIER    |379 |1  |19 |40     |y
EXEC_OPERATOR |2   |CONCAT        |378 |1  |19 |39     |.
PUSH_OPERAND  |    |IDENTIFIER    |348 |7  |19 |9      |println
EXEC_OPERATOR |2   |FUNCTION_CALL |348 |0  |19 |9      |
PUSH_OPERAND  |    |IDENTIFIER    |394 |1  |20 |12     |x
PUSH_OPERAND  |    |NUMBER        |397 |1  |20 |15     |4
EXEC_OPERATOR |2   |EQ            |395 |2  |20 |13     |==
PUSH_OPERAND  |    |IDENTIFIER    |403 |1  |20 |21     |y
PUSH_OPERAND  |    |NUMBER        |405 |1  |20 |23     |4
EXEC_OPERATOR |2   |GT            |404 |1  |20 |22     |>
PUSH_OPERAND  |    |IDENTIFIER    |410 |1  |20 |28     |y
PUSH_OPERAND  |    |NUMBER        |412 |2  |20 |30     |30
EXEC_OPERATOR |2   |LT            |411 |1  |20 |29     |<
EXEC_OPERATOR |2   |AND           |407 |2  |20 |25     |&&
EXEC_OPERATOR |2   |OR            |399 |2  |20 |17     |||
IF_FALSE      |    |              |391 |8  |20 |9      |<ELSE_6>
PUSH_OPERAND  |    |STRING        |452 |51 |22 |25     |x is 4 or y is larger than 4, but smaller than 30
PUSH_OPERAND  |    |IDENTIFIER    |444 |7  |22 |17     |println
EXEC_OPERATOR |2   |FUNCTION_CALL |444 |0  |22 |17     |
JUMP          |    |              |0   |10 |0  |0      |<IF_END_6>
LABEL         |    |              |0   |8  |0  |0      |<ELSE_6>
PUSH_OPERAND  |    |STRING        |563 |41 |26 |25     |nothing special to mention about x or y
PUSH_OPERAND  |    |IDENTIFIER    |555 |7  |26 |17     |println
EXEC_OPERATOR |2   |FUNCTION_CALL |555 |0  |26 |17     |
LABEL         |    |              |0   |10 |0  |0      |<IF_END_6>
PUSH_OPERAND  |    |IDENTIFIER    |625 |1  |28 |9      |x
PUSH_OPERAND  |    |IDENTIFIER    |627 |1  |28 |11     |x
PUSH_OPERAND  |    |NUMBER        |629 |1  |28 |13     |1
EXEC_OPERATOR |2   |PLUS          |628 |1  |28 |12     |+
EXEC_OPERATOR |2   |ASSIGN        |626 |1  |28 |10     |=
PUSH_OPERAND  |    |IDENTIFIER    |640 |1  |30 |7      |x
PUSH_OPERAND  |    |NUMBER        |642 |2  |30 |9      |10
EXEC_OPERATOR |2   |LT            |641 |1  |30 |8      |<
IF_TRUE       |    |              |0   |12 |0  |0      |<DO_START_5>
 

When compiling the program and executing the bytecode on the virtual machine, we obtain the following output:

Show/Hidden bash code
View source
 
turing-complete-lang-vm$ ./tl examples/example1.tl 
a is smaller than 5
a:0
a is smaller than 5
a is now 1 or 9
a:1
a is smaller than 5
a:2
a is smaller than 5
a:3
a is smaller than 5
a:4
a is larger than 5
a:5
a is larger than 5
a:6
a is larger than 5
a:7
a is larger than 5
a:8
a is larger than 5
a is between 9 and 10
a is now 1 or 9
a:9
a is larger than 5
a is between 9 and 10
a:10
a is larger than 5
a:11
a is larger than 5
a:12
a is larger than 5
a:13
a is larger than 5
a:14
x is:1 and y is:1
nothing special to mention about x or y
x is:2 and y is:4
nothing special to mention about x or y
x is:3 and y is:9
x is 4 or y is larger than 4, but smaller than 30
x is:4 and y is:16
x is 4 or y is larger than 4, but smaller than 30
x is:5 and y is:25
x is 4 or y is larger than 4, but smaller than 30
x is:6 and y is:36
nothing special to mention about x or y
x is:7 and y is:49
nothing special to mention about x or y
x is:8 and y is:64
nothing special to mention about x or y
x is:9 and y is:81
nothing special to mention about x or y
 

8. Conclusion

Our little compiler, and the little virtual machine assorted with it, are now Turing-complete. It can now express and execute any algorithm that can be executed by any other Turing-complete machine, including a Php machine.

In the following blog posts, I will, one by one, add support for additional language constructions that are not strictly necessary for Turing completeness, but that make programming more convenient. It is only after completing the TL language, that we will start working on making it run faster.

But then again, TL is and remains a scripting language. For as long as it remains a virtual-machine-on-virtual-machine architecture, TL programs will also never be as fast as programs in the Php host language. Even though some scripting engines may manage to come close to native C speed -- for some of their language constructs -- generally attaining parity with native C speed remains an elusive goal, that will most likely never be attained. I will elaborate in future series of blog posts why this goal is probably unattainable. Convenience in programming will always keep costing in terms of performance.