13. Let 1 + 2 * 3 = 9

Is it easy to break the behaviour of Perl 6? Well, the answer probably depends on what exactly you want to break.

Playing with operator precedence, I wanted to change the rules for the arithmetic operators + and * so that they are evaluated in a different order, namely, addition first, multiplication second.

Sounds like an easy task. Go to src/Perl6/Grammar.nqp and change a couple of lines that set the precedence of the + and * infixes:

- token infix:sym<*>    { <sym> <O(|%multiplicative)> }
+ token infix:sym<*>    { <sym> <O(|%additive)> }
. . .
- token infix:sym<+>    { <sym> <O(|%additive)> }
+ token infix:sym<+>    { <sym> <O(|%multiplicative)> }
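To see what swapping the two precedence levels does to an expression, here is a minimal sketch of a precedence-climbing evaluator. It is written in Python and is not how Rakudo's operator precedence parser is implemented; the names evaluate, STANDARD and SWAPPED are made up for this illustration. The only difference between the two runs is the precedence table, which plays the role of %additive and %multiplicative in the grammar.

```python
import re

def evaluate(expr, prec):
    """Evaluate an expression of integers, + and * using the given precedence table."""
    tokens = re.findall(r"\d+|[+*]", expr)
    pos = 0

    def parse(min_prec):
        nonlocal pos
        left = int(tokens[pos])
        pos += 1
        # Keep consuming operators that bind at least as tightly as min_prec.
        while pos < len(tokens) and prec[tokens[pos]] >= min_prec:
            op = tokens[pos]
            pos += 1
            # Left-associative: the right operand must bind strictly tighter.
            right = parse(prec[op] + 1)
            left = left + right if op == "+" else left * right
        return left

    return parse(0)

STANDARD = {"+": 1, "*": 2}  # multiplication binds tighter (the usual rules)
SWAPPED = {"+": 2, "*": 1}   # addition binds tighter (as in the patched grammar)

print(evaluate("1+2*3", STANDARD))  # 7
print(evaluate("1+2*3", SWAPPED))   # 9
```

With the swapped table, 1+2 is grouped first and the result is multiplied by 3, which is the behaviour the patch is after.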

Ready? Compile!

Recompiling the grammar takes a while, so at first everything looks promising, but a few seconds later the compilation stops with an error:

Month out of range. Is: -935111296, should be in 1..12

Makefile:517: recipe for target 'perl6-m' failed
make: *** [perl6-m] Error 1

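Month out of range?? Oh, we changed the rules of the Universe, and even before Perl 6 is fully compiled, the new rules of arithmetic are already applied. Much of Rakudo (everything in src/core) is itself written in Perl 6, so those files are compiled with the modified grammar.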

OK, let’s add some anaesthesia and suppress the error message. The code that checks for the correct month value is located in src/core/DateTime.pm, namely, inside the DateTime constructor. Comment that line out:

method !new-from-positional(DateTime:
    Int() $year,
    Int() $month,
    Int() $day,
    Int() $hour,
    Int() $minute,
    :$timezone = 0,
) {
    # (1..12).in-range($month,'Month');
    (1 .. self.DAYS-IN-MONTH($year,$month)).in-range($day,'Day');
    . . .

This time, the month range check does not stop us from going further, but another error pops up:

MVMArray: Index out of bounds

Makefile:517: recipe for target 'perl6-m' failed
make: *** [perl6-m] Error 1

Looks cryptic. MVMArray is a MoarVM array, obviously. So, we not only broke Perl 6 but MoarVM, too. Let’s go fix it.

The sources of MoarVM are located in a separate git repository at nqp/MoarVM. The message we saw can be found in nqp/MoarVM/src/6model/reprs/VMArray.c:

if (index < 0)
    MVM_exception_throw_adhoc(tc, "MVMArray: Index out of bounds");

There are two places like that, so let’s not guess which of them we need and preventatively change both of them to the following:

if (index < 0)
    index = 0;
    // MVM_exception_throw_adhoc(tc, "MVMArray: Index out of bounds");

(This is C, not Perl.)
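The difference between the original check and the patched one can be modelled with a short Python sketch. The functions get_strict and get_clamped are hypothetical stand-ins and not MoarVM code; they only show that clamping makes the error disappear while silently returning a different element.

```python
def get_strict(items, index):
    """Models the original check: a negative index is an error."""
    if index < 0:
        raise IndexError("MVMArray: Index out of bounds")
    return items[index]

def get_clamped(items, index):
    """Models the patched check: a negative index silently becomes 0."""
    if index < 0:
        index = 0
    return items[index]

data = ["a", "b", "c"]

try:
    get_strict(data, -5)
except IndexError as error:
    print(error)              # MVMArray: Index out of bounds

print(get_clamped(data, -5))  # a
```

The clamped version never complains, which is exactly why it is fine for an experiment and dangerous anywhere else.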

From nqp/MoarVM, compile and re-install MoarVM, and then try compiling Rakudo again:

~/rakudo/nqp/MoarVM$ make
~/rakudo/nqp/MoarVM$ make install

~/rakudo/nqp/MoarVM$ cd ../..
~/rakudo$ make

This time, the error pops up immediately (as no NQP files need to be recompiled):

Use of Nil in numeric context

Use of Nil in numeric context

Day out of range. Is: -51, should be in 1..0

Makefile:517: recipe for target 'perl6-m' failed
make: *** [perl6-m] Error 1

It looks like we can ignore the Nils for the moment, but DateTime hurts us again. We know the remedy:

# (1..12).in-range($month,'Month');
# (1 .. self.DAYS-IN-MONTH($year,$month)).in-range($day,'Day');

Yahoo! This time, the compilation process was calm and we got a new perl6 executable, which works as we wanted:

$ ./perl6 -e'say 1+2*3'
9

Don’t forget to restore the files before further experiments with Perl 6 🙂


In a comment to this blog post, you can see a reference to a commit that changes the way Rakudo checks the validity of a DateTime object. Instead of using the in-range method, simpler checks are used now, for example:

1 <= $month <= 12
    || X::OutOfRange.new(:what<Month>,:got($month),:range<1..12>).throw;
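The same kind of check can be sketched in Python, which also supports chained comparisons. This is only an illustration: OutOfRange and check_month are hypothetical names standing in for X::OutOfRange and the inline check, and the message format is copied from the error shown earlier in this post.

```python
class OutOfRange(ValueError):
    """Hypothetical stand-in for the X::OutOfRange exception."""
    def __init__(self, what, got, valid):
        super().__init__(f"{what} out of range. Is: {got}, should be in {valid}")

def check_month(month):
    # The chained comparison mirrors `1 <= $month <= 12`.
    if not (1 <= month <= 12):
        raise OutOfRange("Month", month, "1..12")
    return month

print(check_month(5))  # 5

try:
    check_month(-935111296)
except OutOfRange as error:
    print(error)  # Month out of range. Is: -935111296, should be in 1..12
```

A plain comparison needs no range object and no method call, which is a plausible reason for the simpler check being faster.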

Here are the timings of two runs of a loop that creates DateTime objects, before and after the update:

$ time ./perl6 -e'DateTime.new(2018,1,5,12,30,0) for ^500000'
real 0m7.261s
user 0m7.276s
sys 0m0.020s

. . .

$ time ./perl6 -e'DateTime.new(2018,1,5,12,30,0) for ^500000'
real 0m4.457s
user 0m4.476s
sys 0m0.012s
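A quick calculation (in Python, using only the real times printed above) puts the two runs in proportion. The per-object figures are rough upper bounds, since the measured time also includes starting the compiler and running the loop itself.

```python
before = 7.261     # real time in seconds, before the update
after = 4.457      # real time in seconds, after the update
objects = 500_000  # number of DateTime objects created in the loop

speedup = before / after
saved = (before - after) / before

print(f"speedup: {speedup:.2f}x")  # speedup: 1.63x
print(f"time saved: {saved:.1%}")  # time saved: 38.6%
print(f"per object: {before / objects * 1e6:.1f} -> "
      f"{after / objects * 1e6:.1f} microseconds")  # per object: 14.5 -> 8.9 microseconds
```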

12. The beginning of the Grammar of Perl 6

Yesterday, we talked about the stages of the compilation process of a Perl 6 program and saw the parse tree of a simple ‘Hello, World!’ program. Today, our journey begins at the starting point of the Grammar.

So, here is the program:

say 'Hello, World!'

The grammar of Perl 6 is written in Not Quite Perl 6 (NQP) and is located in src/Perl6/Grammar.nqp 🙂 That is amazing: if you know how to work with grammars, you will be able to read the heart of the language.

The Perl 6 Grammar is defined as follows:

grammar Perl6::Grammar is HLL::Grammar does STD {
    . . .

It is a class derived from HLL::Grammar (HLL stands for High-Level Language) and implements the STD (Standard) role. Let’s not focus on the hierarchy for now, though.

The Grammar has the TOP method. Notice that this is a method, not a rule or a token. The main feature of a method is that its body contains regular code (NQP, in this case), not regexes.

As we did earlier, let’s use our beloved method of reverse engineering by adding our own printing instructions to different places of Rakudo sources, recompiling it and watching how it works. The first target is the TOP method:

grammar Perl6::Grammar is HLL::Grammar does STD {
    my $sc_id := 0;
    method TOP() {
        nqp::say('At the TOP');
        . . .

As this is NQP, you need to call functions in the nqp:: namespace (although say is available without the namespace prefix, too). One of the notable differences between Perl 6 and NQP is the need to always have parentheses in function calls: if you omit them, the code won’t compile.

Perl inside regexes inside Perl

For training purposes, let’s try adding a similar instruction to the comp_unit token (compilation unit). This token is a part of the Grammar and is also one of the first methods called when parsing a Perl 6 program.

The body of the TOP method shown above is written in NQP. The body of a token is written in another language, the regex language. Thus, to embed an instruction in Perl (or NQP), you need to switch the language.

There are two options: use a code block in curly braces or the colon-prefixed syntax that is very widely used in Rakudo sources to declare variables.

token comp_unit {
    :my $x := nqp::say('Var in grammar');
    . . .

Notice that in NQP, the binding operator := has to be used in place of the assignment operator =.

Statement list

So, back to the grammar. In the output that the --target=parse command-line option produces, we can see a statementlist node at the top of the parse tree. Let us look at its implementation in the Grammar. With some simplifications, it looks very lightweight:

rule statementlist($*statement_level = 0) {
    . . .
    | $
    | <?before <.[\)\]\}]>>
    | [ <statement> <.eat_terminator> ]*
    . . .

Basically, it says that a statement list is a list of zero or more statements. Square brackets in Perl 6 grammars create a non-capturing group, and we see three alternatives here. One of the alternatives is just the end of data, another one is the end of the block (e.g., ending with a closing curly brace). For the sake of art, an additional vertical bar is added before the first alternative too.

The top-level rule is simple but the rest is becoming more and more complex. For example, let’s have a quick look at the eat_terminator token:

token eat_terminator {
    || ';'
    || <?MARKED('endstmt')> <.ws>
    || <?before ')' | ']' | '}' >
    || $
    || <?stopper>
    || <?before [if|while|for|loop|repeat|given|when] » > {
       $/.'!clear_highwater'(); self.typed_panic(
          'X::Syntax::Confused', reason => "Missing semicolon" ) }
    || { $/.typed_panic( 'X::Syntax::Confused', reason => "Confused" ) }
}

And this is just a small separator between the statements 🙂
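The || alternatives in the token are tried strictly in order, and the first one that matches wins. Here is a simplified Python sketch of that logic. It is an assumption-laden model, not a translation of the token: the function eat_terminator below is hypothetical, works on a plain string and position, and leaves out the MARKED('endstmt'), stopper and keyword ("Missing semicolon") branches.

```python
def eat_terminator(text, pos):
    """Return the position after the statement terminator, trying alternatives in order."""
    # || ';'  : an explicit semicolon is consumed.
    if text.startswith(";", pos):
        return pos + 1
    # || <?before ')' | ']' | '}'>  : a closing bracket ends the statement
    # but is only looked at, not consumed.
    if pos < len(text) and text[pos] in ")]}":
        return pos
    # || $  : the end of the input also terminates the statement.
    if pos == len(text):
        return pos
    # Last alternative: nothing matched, so the parser is "Confused".
    raise SyntaxError("Confused")

print(eat_terminator("say 1; say 2", 5))  # 6
print(eat_terminator("{ say 1 }", 8))     # 8
```

Because a lookahead does not move the position, the closing brace is still there for the enclosing block rule to consume.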

The grammar file is more than 5500 lines of code; it is not possible to discuss and understand it all in a single blog post. Let us stop here for today and continue with easier stuff tomorrow.