11. Compiler stages and targets in Perl 6

Welcome to the new year! Today, let us switch for a while from the discussion about obsolete messages to something different.

Stages

If you followed the exercises in the previous posts, you might have noticed that some statistics was printed in the console when compiling Rakudo:

Stage start      :   0.000
Stage parse      :  44.914
Stage syntaxcheck:   0.000
Stage ast        :   0.000
Stage optimize   :   4.245
Stage mast       :   9.476
Stage mbc        :   0.200

You could have also noticed that the bigger the file you changed, the slower it is compiled, up to dozens of seconds when you modify Grammar.pm.

It is also possible to see the statistics for your own programs. The --stagestats command-line option does the job:

$ ./perl6 --stagestats -e'say 42'
Stage start      :   0.000
Stage parse      :   0.065
Stage syntaxcheck:   0.000
Stage ast        :   0.000
Stage optimize   :   0.001
Stage mast       :   0.003
Stage mbc        :   0.000
Stage moar       :   0.000
42

So, let’s look at these stages. Roughly, half of them is about Perl 6, and half is about MoarVM. In the case Rakudo is configured to work with the JVM backend, the output will differ in the second half.

The Perl 6 part is clearly visible in the src/main.nqp file:

# Create and configure compiler object.
my $comp := Perl6::Compiler.new();
$comp.language('perl6');
$comp.parsegrammar(Perl6::Grammar);
$comp.parseactions(Perl6::Actions);
$comp.addstage('syntaxcheck', :before);
$comp.addstage('optimize', :after);
hll-config($comp.config);
nqp::bindhllsym('perl6', '$COMPILER_CONFIG', $comp.config);

Look at the selected lines. If you have played with Perl 6 Grammars, you know that big grammars are usually split into two parts: the grammar itself and the actions. The Perl 6 compiler does exactly the same thing for the Perl 6 grammar. There are two files: src/Perl6/Grammar.nqp and src/Perl6/Actions.nqp.

When looking at src/main.nqp, it is not quite clear that there are eight stages. Add the following line to the file:

for ($comp.stages()) { nqp::say($_) }

Now, recompile Rakudo and run any program:

$ ./perl6 -e'say 42'
start
parse
syntaxcheck
ast
optimize
mast
mbc
moar
42

Here they are.

The names of the first three stages—start, parse, and syntaxcheck—are quite self-explanatory. The ast stage is the stage of building an abstract syntax tree, which is then optimized in the optimize stage.

At this point, your Perl 6 program has been transformed into the abstract syntax tree and is about to be passed to the backend, MoarVM virtual machine in our case. The stages names start with m. The mast stage is the stage of the MoarVM assembly (not abstract) syntax tree, mbc stands for MoarVM bytecode and moar is when the VM executes the code.

Targets

Now that we know the stages of the Perl 6 program workflow, let’s make use of them. The --target option lets the compiler to stop at the given stage and display the result of it. This option supports the following values: parse, syntaxcheck, ast, optimize, and mast. With those options, Rakudo prints the output as a tree, and you can see how the program changes at different stages.

Even for small programs, the output, especially with the abstract syntax tree or an assembly tree of the VM is quite verbose. Let’s look at the parse tree of the ‘Hello, World!’ program, for example:

$ ./perl6 --target=parse -e'say "Hello, World!"'
- statementlist: say "Hello, World!"
  - statement: 1 matches
    - EXPR: say "Hello, World!"
      - args:  "Hello, World!"
        - arglist: "Hello, World!"
          - EXPR: "Hello, World!"
            - value: "Hello, World!"
              - quote: "Hello, World!"
                - nibble: Hello, World!
      - longname: say
        - name: say
          - identifier: say
          - morename:  isa NQPArray
        - colonpair:  isa NQPArray

All the names here correspond to rules, tokens, or methods of the Grammar. You can find them in src/Perl6/Grammar.nqp. As an exercise, try predicting if the name is a method, or a rule, or a token. Say, a value should be a token, as it is supposed to be a compact string, while a statementlist is a rule.

3. Playing with the code of Rakudo Perl 6

Yesterday, we looked at the two methods of the Bool class that return strings. The string representation that the functions produce is hardcoded in the source code.

Let’s use this observation and try changing the texts.

So, here is the fragment that we will modify:

Bool.^add_multi_method('gist', my multi method gist(Bool:D:) {
    self ?? 'True' !! 'False'
});

This gist method is used to stringify a defined variable.

To make things happen, you need to have the source codes of Rakudo on your computer so that you can compile them. Clone the project from GitHub first:

$ git clone https://github.com/rakudo/rakudo.git

Compile with MoarVM:

$ cd rakudo
$ perl Configure.pl --gen-moar --gen-nqp --backends=moar
$ make

Having that done, you get the perl6 executable in the rakudo directory.

Now, open the src/core/Bool.pm file and change the strings of the gist method to use the Unicode thumbs instead of plain text:

Bool.^add_multi_method('gist', my multi method gist(Bool:D:) {
    self ?? '👍' !! '👎'
});

After saving the file, you need to recompile Rakudo. Bool.pm is in the list of files to be compiled in Makefile:

M_CORE_SOURCES = \
    src/core/core_prologue.pm\
    src/core/traits.pm\
    src/core/Positional.pm\
    . . .
    src/core/Bool.pm\
    . . .

Run make and get the updated perl6. Run it and enjoy the result:

:~/rakudo$ ./perl6
To exit type 'exit' or '^D'
> my Bool $b = True;
👍
> $b = !$b; 
👎
>

As an exercise, let us improve your local Perl 6 by adding the gist method for undefined values. By default, it does not exist, and we saw that yesterday. It means that an attempt to interpolate an undefined variable in a string will be rejected. Let’s make it better.

Interpolation uses the Str method. It is similar to both gist and perl, so you will have no difficulties in creating the new version.

This is what currently is in Perl 6:

Bool.^add_multi_method('Str', my multi method Str(Bool:D:) {
    self ?? 'True' !! 'False'
});

This is what you need to add:

Bool.^add_multi_method('Str', my multi method Str(Bool:U:) {
    '¯\_(ツ)_/¯'
});

Notice that self is not needed (and cannot be used) in the second variant.

Compile and run perl6:

$ ./perl6
To exit type 'exit' or '^D'
> my Bool $b;
(Bool)
> "Here is my variable: $b"
Here is my variable: ¯\_(ツ)_/¯
>

It works as expected. Congratulations, you’ve just changed the behaviour of Perl 6 yourself!