🔬8. Digging into operator precedence in Perl 6, part 2

Yesterday, we took a look at how the ? and so operators are dispatched depending on the type of the variable. We did it with the intention to understand what is the difference between them.

Here is once again an excerpt from the src/core/Bool.pm file, where the bodies of the subs look alike:

proto sub prefix:<?>(Mu $) is pure {*}
multi sub prefix:<?>(Bool:D \a) { a }
multi sub prefix:<?>(Bool:U \a) { Bool::False }
multi sub prefix:<?>(Mu \a) { a.Bool }

proto sub prefix:<so>(Mu $) is pure {*}
multi sub prefix:<so>(Bool:D \a) { a }
multi sub prefix:<so>(Bool:U \a) { Bool::False }
multi sub prefix:<so>(Mu \a) { a.Bool }

Both of them coerce the arguments to a Bool value. The difference is in their operator precedence. You cannot say for sure what is the precedence if you only look at the Bool.pm file. You will find more details in the src/Perl6/Grammar.nqp file describing the Perl 6 language grammar. Here are the fragments we need:

token prefix:sym<so> { <sym><.end_prefix> <O(|%loose_unary)> }
. . .
token prefix:sym<?> { <sym> <!before '??'> <O(|%symbolic_unary)> }

These look complex but let’s first concentrate only on the last part of the token definitions: <O(|%loose_unary)> and <O(|%symbolic_unary)>. Obviously, these are what define the rules for precedence. You can find a list of about 30 different kind of precedences in the same file:

## Operators

. . .
my %symbolic_unary := nqp::hash('prec', 'v=', 'assoc', 'unary', 'dba', 'symbolic unary');
. . .
my %list_assignment := nqp::hash('prec', 'i=', 'assoc', 'right', 'dba', 'list assignment', 'sub', 'e=', 'fiddly', 1);
my %loose_unary := nqp::hash('prec', 'h=', 'assoc', 'unary', 'dba', 'loose unary');
my %comma := nqp::hash('prec', 'g=', 'assoc', 'list', 'dba', 'comma', 'nextterm', 'nulltermish', 'fiddly', 1);
. . .

Let’s avoid digging deeper into how it works at the moment. Looking at the list you can guess that the letters k, j, h, and g define the preference order of different kinds of preference rules. As well a right or left dictate the associativity of the operators.

So, the so operator has the loose unary precedence level and the ? operator has a higher symbolic unary precedence.

The old conditional operator

Before we wrap up for today, let’s look at another interesting place where the single question mark can be caught in the Perl 6 program. I am talking about the following token in the grammar (notice that this time this is for an infix, not for a prefix):

token infix:sym<?> {
    <sym> {} <![?]> <?before <.-[;]>*?':'>
    <.obs('? and : for the ternary conditional operator', '?? and !!')>
    <O(|%conditional)>
}

This code catches the usage of a single ?, which was a part of the ternary operator in Perl 5 unlike the double ?? from the ternary operator in Perl 6.

The <.obs...> part of the token regex prints a warning about obsolete syntax:

$ ./perl6 -e'say 1 ? True : False'
===SORRY!=== Error while compiling -e
Unsupported use of ? and : for the ternary conditional operator;
in Perl 6 please use ?? and !!
at -e:1
------> say 1 ?⏏ True : False

So, if you use the old syntax, you’ll get not only an error message but also a hint on how to fix the issue.

🔬7. Digging into operator precedence in Perl 6, part 1

Today, we’ll once again look at the src/core/Bool.pm file. This is a good example of a full-fledged Perl 6 class, which is still not very difficult to examine.

Look at the definitions of the ? and so operators:

proto sub prefix:<?>(Mu $) is pure {*}
multi sub prefix:<?>(Bool:D \a) { a }
multi sub prefix:<?>(Bool:U \a) { Bool::False }
multi sub prefix:<?>(Mu \a) { a.Bool }

proto sub prefix:<so>(Mu $) is pure {*}
multi sub prefix:<so>(Bool:D \a) { a }
multi sub prefix:<so>(Bool:U \a) { Bool::False }
multi sub prefix:<so>(Mu \a) { a.Bool }

There’s no visual difference between the two implementations, but it would be a mistake to conclude that there is no difference between the two of them. Both ? and so cast a value to the Bool type.

When am I called?

Before we go discussing the precedence, let us first examine when the above subs are called. For simplifying the task, add a few printing instructions into their bodies:

proto sub prefix:<?>(Mu $) is pure {*}
multi sub prefix:<?>(Bool:D \a) { say 1; a }
multi sub prefix:<?>(Bool:U \a) { say 2; Bool::False }
multi sub prefix:<?>(Mu \a) { say 3; a.Bool }

proto sub prefix:<so>(Mu $) is pure {*}
multi sub prefix:<so>(Bool:D \a) { say 4; a }
multi sub prefix:<so>(Bool:U \a) { say 5; Bool::False }
multi sub prefix:<so>(Mu \a) { say 6; a.Bool }

Re-compile Rakudo and make a few tests with both ? and so (you’ll get some numbers printed before the prompt appears):

$ ./perl6
> my Bool $b;
(Bool)
> ?$b;
2
> so $b;
5
>

At the moment, there are no surprises. For an undefined Boolean variable, those subs are called that have the (Bool:U) signature.

Now, try an integer:

> my Int $i;
(Int)
> ?$i;
3
> so $i;
6

Although the variable is of the Int type, the compiler calls the subs from Bool.pm (notice that those functions are regular subs, not the methods of the Bool class). This time, the subs having the (Mu) signature are called, as Int is a grand-grandchild of Mu (via Cool and Any). For the undefined variable, the subs call the Bool method from the Mu class.

proto method Bool() {*}
multi method Bool(Mu:U: --> False) { }
multi method Bool(Mu:D:) { self.defined }

For a defined integer, the Bool method of the Int class is used instead:

multi method Bool(Int:D:) {
    nqp::p6bool(nqp::bool_I(self));
}

To visualise the routes, add more printing commands to the files. In src/core/Mu.pm:

proto method Bool() {*}
multi method Bool(Mu:U:) { nqp::say('7'); False }
multi method Bool(Mu:D:) { nqp::say('8'); self.defined }

And in src/core/Int.pm:

multi method Bool(Int:D:) {
    say 9;
    nqp::p6bool(nqp::bool_I(self));
}

During the compilation, a lot of 7s and 8s flood the screen, which means that the changes we’ve just made are already used even during the compilation process.

It’s time to play with defined and undefined integers now:

> my Int $i;
(Int)
> say $i.Bool;
7
False
> $i = 42;
42
> say $i.Bool;
9
True

For the undefined variable, the method from the Mu class (printing 7) is triggered; for the defined variable, the one from Int (9).

🔬6. The dd routine of Rakudo Perl 6

In Rakudo, there is a useful routine dd, which is not a part of Perl 6 itself. It dumps its argument(s) in a way that you immediately see the type and content of a variable. For example:

$ ./perl6 -e'my Bool $b = True; dd($b)'
Bool $b = Bool::True

It works well with data of other types, for example, with arrays:

$ ./perl6 -e'my @a = < a b c >; dd(@a)'
Array @a = ["a", "b", "c"]

Today, we will look at the definition of the dd routine.

It is located in the src/core/Any.pm module as part of the Any class. The code is quite small, so let us show it here:

sub dd(|) {
    my Mu $args := nqp::p6argvmarray();
    if nqp::elems($args) {
        while $args {
            my $var  := nqp::shift($args);
            my $name := try $var.VAR.?name;
            my $type := $var.WHAT.^name;
            my $what := $var.?is-lazy
              ?? $var[^10].perl.chop ~ "... lazy list)"
              !! $var.perl;
            note $name ?? "$type $name = $what" !! $what;
        }
    }
    else { # tell where we are
        note .name
          ?? "{lc .^name} {.name}{.signature.gist}"
          !! "{lc .^name} {.signature.gist}"
          with callframe(1).code;
    }
    return
}

Call with arguments

The vertical bar, which we have already seen earlier, is a signature that captures argument lists with no type checking. It is not possible to omit it and leave empty parentheses, as in that case the routine can only be called without arguments.

Inside, some NQP-magic happens but that is quite readable for us. If there are arguments, the routine loops over them, shifting the next argument in each cycle.

Then, there is an attempt to get the name, type and content:

my $name := try $var.VAR.?name;
my $type := $var.WHAT.^name;

Notice the presence of try and ? in the method call. We already saw the pattern when we were taking about string interpolation. The ?name is only called on an object if the method exists there, and does not generate an error if not.

The content is a bit more difficult thing:

my $what := $var.?is-lazy
    ?? $var[^10].perl.chop ~ "... lazy list)"
    !! $var.perl;

The result depends on whether an object is a lazy list or not. For example, try dumping an infinite range:

$ ./perl6 -e'dd 1..∞'
(1, 2, 3, 4, 5, 6, 7, 8, 9, 10... lazy list)

Only the first ten items are listed. For a non-lazy object, the perl method is called.

Finally, the result is printed to STDERR:

note $name ?? "$type $name = $what" !! $what;

Call with no arguments

The second branch of the dd routine is triggered when there are no arguments. In that case, the routine tries to give some information about the place where it is called. Look at the following example:

sub f() { dd }
f;

The result of running this program shows the name and the signature of the function:

sub f()

A good use case can be thus to use dd in multi-functions instead of printing manual text messages.

multi sub f(Int) { dd }
multi sub f(Str) { dd }

f(42);
f('42');

Run the program, and it prints an extremely useful debugging information:

sub f(Int)
sub f(Str)

That’s all for today. See you tomorrow!

🔬5. Lurking behind interpolation in Perl 6

In the previous articles, we’ve seen that the undefined value cannot be easily interpolated in a string, as an exception occurs. Today, our goal is to see where exactly that happens in the source code of Rakudo.

So, as soon as we’ve looked at the Boolean values, let’s continue with them. Open perl6 in the REPL mode and create a variable:

$ perl6
To exit type 'exit' or '^D'
> my $b
(Any)

The variable is undefined, so be ready to get an exception when interpolating it:

> "$b"
Use of uninitialized value $b of type Any in string context.
Methods .^name, .perl, .gist, or .say can be used to stringify it to something meaningful.
 in block  at  line 1

Interpolation uses the Str method. For undefined values, this method is absent in the Bool class. So we have to trace back to the Mu class, where we can see the following collection of base methods:

proto method Str(|) {*}

multi method Str(Mu:U \v:) {
   my $name = (defined($*VAR_NAME) ?? $*VAR_NAME !! try v.VAR.?name) // '';
   $name ~= ' ' if $name ne '';
 
   warn "Use of uninitialized value {$name}of type {self.^name} in string"
      ~ " context.\nMethods .^name, .perl, .gist, or .say can be"
      ~ " used to stringify it to something meaningful.";
   ''
}

multi method Str(Mu:D:) {
    nqp::if(
        nqp::eqaddr(self,IterationEnd),
        "IterationEnd",
        self.^name ~ '<' ~ nqp::tostr_I(nqp::objectid(self)) ~ '>'
    )
}

The proto-definition gives the pattern for the Str methods. The vertical bar in the signature indicates that the proto does not validate the type of the argument and can also capture more arguments.

In the Str(Mu:U) method you can easily see the text of the error message. This method is called for the undefined variable. In our case, with the Boolean variable, there’s no Str(Bool:U) method in the Bool class, so the call is dispatched to the method of the Mu class.

Notice how the variable name is obtained:

my $name = (defined($*VAR_NAME) ?? $*VAR_NAME !! try v.VAR.?name) // '';

It tries either the dynamic variable $*VAR_NAME or the name method of the VAR object.

You can easily see which branch is used: just add a couple of printing instructions to the Mu class and recompile Rakudo:

proto method Str(|) {*}
multi method Str(Mu:U \v:) {
    warn "VAR_NAME=$*VAR_NAME" if defined $*VAR_NAME;
    warn "v.VAR.name=" ~ v.VAR.name if v.VAR.?name;
    . . .

Now execute the same interpolation:

> my $b ;
(Any)
> "$b"
VAR_NAME=$b
  in block  at  line 1

So, the name was taken from the $*VAR_NAME variable.

What about the second multi-method Str(Mu:D:)? It is important to understand that it will not be called for a defined Boolean object because the Bool class has a proper variant already.

🔬4. Exploring the Bool type in Perl 6, part 2

Today, we are continuing reading the source codes of the Bool class: src/core/Bool.pm, and will look at the methods that calculate the next or the previous values, or increment and decrement the values. For the Boolean type, it sounds simple, but you still have to determine the behaviour of the edge cases.

pred and succ

In Perl 6, there are two complementary methods: pred and succ that should return, correspondingly, the preceding and the succeeding values. This is how they are defined for the Bool type:

Bool.^add_method('pred', my method pred() { Bool::False });
Bool.^add_method('succ', my method succ() { Bool::True });

As you see, these methods are regular (not multi) methods and do not distinguish between defined or undefined arguments. The result neither depends on the value!

If you take two Boolean variables, one set to False and another to True, the prec method returns False for both variables:

my Bool $f = False;
my Bool $t = True;
my Bool $u;

say $f.pred;    # False
say $t.pred;    # False
say $u.pred;    # False
say False.pred; # False
say True.pred;  # False

Similarly, the succ method always returns True:

say $f.succ;    # True
say $t.succ;    # True
say $u.succ;    # True
say False.succ; # True
say True.succ;  # True

Increment and decrement

The variety of the ++ and -- operations is even more, as another dimension—prefix or postfix—is added.

First, the two prefixal forms:

multi sub prefix:<++>(Bool $a is rw) { $a = True; }
multi sub prefix:<-->(Bool $a is rw) { $a = False; }

When you read the sources, you start slowly understand that many strangely behaving bits of the language may be well explained, because the developers have to think about huge combinations of arguments, variables, positions, etc., about which you may not even think when using the language.

The prefix forms simply set the value of the variable to either True or False, and it happens for both defined and undefined variables. The is rw trait allows modifying the argument.

Now, the postfix forms. This time, the state of the variable matters.

multi sub postfix:<++>(Bool:U $a is rw --> False) { $a = True }
multi sub postfix:<-->(Bool:U $a is rw) { $a = False; }

We see a new element of syntax—the return value is mentioned after an arrow in the sub signature:

(Bool:U $a is rw --> False)

The bodies of the operators that work on defined variables, are wordier. If you look at the code precisely, you can see that it avoids assigning the new value to a variable if, for example, a variable containing True is incremented.

multi sub postfix:<++>(Bool:D $a is rw) {
    if $a {
        True
    }
    else {
        $a = True;
        False
    }
}


multi sub postfix:<-->(Bool:D $a is rw) {
    if $a {
        $a = False;
        True
    }
    else {
        False
    }
}

As you see, the changed value of the variable after the operation may be different from what the operator returns.

🔬3. Playing with the code of Rakudo Perl 6

Yesterday, we looked at the two methods of the Bool class that return strings. The string representation that the functions produce is hardcoded in the source code.

Let’s use this observation and try changing the texts.

So, here is the fragment that we will modify:

Bool.^add_multi_method('gist', my multi method gist(Bool:D:) {
    self ?? 'True' !! 'False'
});

This gist method is used to stringify a defined variable.

To make things happen, you need to have the source codes of Rakudo on your computer so that you can compile them. Clone the project from GitHub first:

$ git clone https://github.com/rakudo/rakudo.git

Compile with MoarVM:

$ cd rakudo
$ perl Configure.pl --gen-moar --gen-nqp --backends=moar
$ make

Having that done, you get the perl6 executable in the rakudo directory.

Now, open the src/core/Bool.pm file and change the strings of the gist method to use the Unicode thumbs instead of plain text:

Bool.^add_multi_method('gist', my multi method gist(Bool:D:) {
    self ?? '👍' !! '👎'
});

After saving the file, you need to recompile Rakudo. Bool.pm is in the list of files to be compiled in Makefile:

M_CORE_SOURCES = \
    src/core/core_prologue.pm\
    src/core/traits.pm\
    src/core/Positional.pm\
    . . .
    src/core/Bool.pm\
    . . .

Run make and get the updated perl6. Run it and enjoy the result:

:~/rakudo$ ./perl6
To exit type 'exit' or '^D'
> my Bool $b = True;
👍
> $b = !$b; 
👎
>

As an exercise, let us improve your local Perl 6 by adding the gist method for undefined values. By default, it does not exist, and we saw that yesterday. It means that an attempt to interpolate an undefined variable in a string will be rejected. Let’s make it better.

Interpolation uses the Str method. It is similar to both gist and perl, so you will have no difficulties in creating the new version.

This is what currently is in Perl 6:

Bool.^add_multi_method('Str', my multi method Str(Bool:D:) {
    self ?? 'True' !! 'False'
});

This is what you need to add:

Bool.^add_multi_method('Str', my multi method Str(Bool:U:) {
    '¯\_(ツ)_/¯'
});

Notice that self is not needed (and cannot be used) in the second variant.

Compile and run perl6:

$ ./perl6
To exit type 'exit' or '^D'
> my Bool $b;
(Bool)
> "Here is my variable: $b"
Here is my variable: ¯\_(ツ)_/¯
>

It works as expected. Congratulations, you’ve just changed the behaviour of Perl 6 yourself!

🔬2. Exploring the Bool type in Perl 6, part 1

This is the excerpt for your very first post.

Today, we will be digging into the internals of the Bool type using the source code of Rakudo, available on GitHub.

Perl 6 is written in the Perl 6 and NQP (Not Quite Perl 6) languages, which makes it relatively easy to read the sources. Of course, there are many things that are not easy to understand or which are not reflected in the publicly available documentation of the Perl 6 language. Neither you can find the deep details in the Perl 6 books so far. Anyway, this is still possible with some intermediate understanding of Perl 6.

OK, so back to the src/core/Bool.pm file. It begins with a few BEGIN phasers that add some methods and multi-methods to the Bool class. We’ll talk about the details of metamodels and class construction next time. Today, the more interesting for us is what the methods of the Bool class are doing.

gist and perl

The gist and perl methods return the string representation of the object: gist is implicitly called when a variable is stringified, perl is supposed to be called directly. It works for any object in Perl 6, but of course, the behaviour should be defined somewhere. And here they are:

Bool.^add_method('gist', my proto method gist(|) {*});
Bool.^add_multi_method('gist', my multi method gist(Bool:D:) { 
    self ?? 'True' !! 'False'
});
Bool.^add_multi_method('gist', my multi method gist(Bool:U:) {
    '(Bool)'
});

Bool.^add_method('perl', my proto method perl(|) {*});
Bool.^add_multi_method('perl', my multi method perl(Bool:D:) {
    self ?? 'Bool::True' !! 'Bool::False'
});
Bool.^add_multi_method('perl', my multi method perl(Bool:U:) {
    'Bool' 
});

Try out the methods in the following simple program:

my Bool $b = True;
say $b;      # True
say "[$b]";  # [True]
$b.perl.say; # Bool::True

As you can see, the True string is returned by the gist method, while the perl method returns Bool::True.

Both methods are multi-methods, and in the above example, the version with a defined argument was used. If you look at the signatures, you will see that the methods are different in the way an argument is specified: Bool:D: or Bool:U:. The letters D and U stay for defined and undefined, correspondingly. The first colon adds an attribute to the type, while the second one indicates that the argument is actually an invocant.

So, different versions of the methods are triggered depending on whether they are called on a defined or an undefined Boolean variable. To demonstrate the behaviour of the other two variants, simply remove the initialiser part from the code:

my Bool $b;
say $b;      # (Bool)
$b.perl.say; # Bool

As the variable $b has a type, Perl 6 knows the type of the object, on which it should call methods. Then it is dispatched to the versions with the (Bool:U:) signature because the variable is not defined yet.

When an undefined variable appears in the string, for example, say "[$b]", the gist method is not called. Instead, you get an error message.

Use of uninitialized value $b of type Bool in string context.
Methods .^name, .perl, .gist, or .say can be used to stringify it to something meaningful.
 in block  at bool-2.pl line 3
[]

The error message says that Perl knows of what type the variable was, but refuses to call a stringifying method.

That’s all for today. Next time, we’ll look at other methods defined for the Bool data type.