🔬5. Lurking behind interpolation in Perl 6

In the previous articles, we’ve seen that an undefined value cannot be interpolated in a string silently: a warning is issued. Today, our goal is to see where exactly that happens in the source code of Rakudo.

Since we have just looked at the Boolean values, let’s continue with them. Open perl6 in REPL mode and create a variable (it is left untyped here, so its type is Any):

$ perl6
To exit type 'exit' or '^D'
> my $b
(Any)

The variable is undefined, so be ready to get a warning when interpolating it:

> "$b"
Use of uninitialized value $b of type Any in string context.
Methods .^name, .perl, .gist, or .say can be used to stringify it to something meaningful.
 in block  at  line 1

Interpolation uses the Str method. For undefined values, this method is absent in the Bool class. So we have to trace back to the Mu class, where we can see the following collection of base methods:

proto method Str(|) {*}

multi method Str(Mu:U \v:) {
   my $name = (defined($*VAR_NAME) ?? $*VAR_NAME !! try v.VAR.?name) // '';
   $name ~= ' ' if $name ne '';
 
   warn "Use of uninitialized value {$name}of type {self.^name} in string"
      ~ " context.\nMethods .^name, .perl, .gist, or .say can be"
      ~ " used to stringify it to something meaningful.";
   ''
}

multi method Str(Mu:D:) {
    nqp::if(
        nqp::eqaddr(self,IterationEnd),
        "IterationEnd",
        self.^name ~ '<' ~ nqp::tostr_I(nqp::objectid(self)) ~ '>'
    )
}

The proto-definition gives the pattern for the Str methods. The vertical bar in the signature indicates that the proto does not validate the type of the argument and can also capture more arguments.
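
To see the same pattern outside of the core, here is a minimal sketch with a made-up class. The name Lamp and its describe method are only illustrations and do not come from Rakudo; the proto with the vertical bar lets any call through, and the two multi-methods split the work between the type object and an instance.

```raku
# Hypothetical class, used only to illustrate the dispatch pattern.
class Lamp {
    # The proto accepts any arguments; the multis decide what is valid.
    proto method describe(|) {*}

    # Called on the type object (an undefined invocant).
    multi method describe(Lamp:U:) { 'type object' }

    # Called on an instance (a defined invocant).
    multi method describe(Lamp:D:) { 'instance' }
}

say Lamp.describe;     # type object
say Lamp.new.describe; # instance
```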

In the Str(Mu:U) method you can easily see the text of the warning. This method is called for an undefined variable. The untyped $b above is of type Any, and the same happens with a Bool variable: there’s no Str(Bool:U) method in the Bool class, so the call is dispatched to the method of the Mu class.

Notice how the variable name is obtained:

my $name = (defined($*VAR_NAME) ?? $*VAR_NAME !! try v.VAR.?name) // '';

It tries either the dynamic variable $*VAR_NAME or the name method of the VAR object.
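
The second branch can be tried in isolation without touching the compiler. The sketch below is not Rakudo code; the variable names are invented for the illustration. It only shows what .VAR.name gives for a scalar container and how .?name together with // falls back to an empty string when there is no container to ask.

```raku
my $b;

# A scalar container knows the name it was declared with.
my $from-container = $b.VAR.name;
say $from-container; # $b

# A literal has no container; .?name returns Nil rather than dying,
# and // then supplies the empty-string default.
my $from-literal = (42.VAR.?name) // '';
say $from-literal.perl; # ""
```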

You can easily see which branch is used: just add a couple of printing instructions to the Mu class and recompile Rakudo:

proto method Str(|) {*}
multi method Str(Mu:U \v:) {
    warn "VAR_NAME=$*VAR_NAME" if defined $*VAR_NAME;
    warn "v.VAR.name=" ~ v.VAR.name if v.VAR.?name;
    . . .

Now execute the same interpolation:

> my $b ;
(Any)
> "$b"
VAR_NAME=$b
  in block  at  line 1

So, the name was taken from the $*VAR_NAME variable.

What about the second multi-method Str(Mu:D:)? It is important to understand that it will not be called for a defined Boolean object because the Bool class has a proper variant already.
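
A quick way to observe the difference is to compare Bool with a class that has no Str method of its own. In this sketch, Plain is a made-up class; the expected format of its string is taken from the Str(Mu:D:) source shown above, so treat the pattern as an assumption tied to that version of the code.

```raku
# Hypothetical class without a Str method of its own.
class Plain {}

my $text = Plain.new.Str;

# Falls back to Str(Mu:D:), which yields the class name and an object id.
say so $text ~~ /^ 'Plain<' \d+ '>' $/; # True

# Bool has its own Str(Bool:D:), so the fallback is not used.
say True.Str;  # True
say False.Str; # False
```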

🔬4. Exploring the Bool type in Perl 6, part 2

Today, we continue reading the source code of the Bool class, src/core/Bool.pm, and look at the methods that calculate the next or the previous value, or increment and decrement the value. For the Boolean type, it sounds simple, but you still have to determine the behaviour in the edge cases.

pred and succ

In Perl 6, there are two complementary methods: pred and succ that should return, correspondingly, the preceding and the succeeding values. This is how they are defined for the Bool type:

Bool.^add_method('pred', my method pred() { Bool::False });
Bool.^add_method('succ', my method succ() { Bool::True });

As you see, these methods are regular (not multi) methods and do not distinguish between defined and undefined invocants. Nor does the result depend on the value!

If you take two Boolean variables, one set to False and another to True, the pred method returns False for both of them, as well as for an undefined variable:

my Bool $f = False;
my Bool $t = True;
my Bool $u;

say $f.pred;    # False
say $t.pred;    # False
say $u.pred;    # False
say False.pred; # False
say True.pred;  # False

Similarly, the succ method always returns True:

say $f.succ;    # True
say $t.succ;    # True
say $u.succ;    # True
say False.succ; # True
say True.succ;  # True

Increment and decrement

The variety of the ++ and -- operations is even greater, as another dimension, prefix or postfix, is added.

First, the two prefixal forms:

multi sub prefix:<++>(Bool $a is rw) { $a = True; }
multi sub prefix:<-->(Bool $a is rw) { $a = False; }

When you read the sources, you slowly start to understand that many strangely behaving bits of the language can be explained well: the developers have to think about a huge number of combinations of arguments, variables, positions, etc., which you may never think about when using the language.

The prefix forms simply set the value of the variable to either True or False, and it happens for both defined and undefined variables. The is rw trait allows modifying the argument.
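
A short sketch of what this means in practice. The variable names are mine; the behaviour follows from the two definitions above: no matter where you start, ++ ends in True and -- ends in False.

```raku
my Bool $up;   # undefined: holds the Bool type object
++$up;         # becomes True
++$up;         # incrementing True leaves it True
say $up;       # True

my Bool $down; # undefined again
--$down;       # becomes False
--$down;       # decrementing False leaves it False
say $down;     # False
```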

Now, the postfix forms. This time, the state of the variable matters.

multi sub postfix:<++>(Bool:U $a is rw --> False) { $a = True }
multi sub postfix:<-->(Bool:U $a is rw) { $a = False; }

We see a new element of syntax—the return value is mentioned after an arrow in the sub signature:

(Bool:U $a is rw --> False)
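
The same syntax is available in your own code. Below is a minimal sketch with a made-up sub called mark: as the signature names a constant after the arrow, that constant is what the caller gets back, while the assignment in the body still changes the variable.

```raku
# Hypothetical sub: the signature fixes the return value to False,
# so the value of the last statement in the body is not returned.
sub mark(Bool $seen is rw --> False) { $seen = True }

my Bool $seen;
my $returned = mark($seen);

say $returned; # False
say $seen;     # True
```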

The bodies of the operators that work on defined variables are wordier. If you look at the code carefully, you can see that it avoids assigning a new value to the variable if, for example, a variable containing True is incremented.

multi sub postfix:<++>(Bool:D $a is rw) {
    if $a {
        True
    }
    else {
        $a = True;
        False
    }
}


multi sub postfix:<-->(Bool:D $a is rw) {
    if $a {
        $a = False;
        True
    }
    else {
        False
    }
}

As you see, the changed value of the variable after the operation may be different from what the operator returns.
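
Here is a small session-like sketch that walks through all four cases for a defined variable. The variables $r1 to $r4 are just holders for the returned values.

```raku
my Bool $b = False;

my $r1 = $b++;   # returns the old value, then $b becomes True
say "$r1 $b";    # False True

my $r2 = $b++;   # already True: nothing to change
say "$r2 $b";    # True True

my $r3 = $b--;   # returns the old value, then $b becomes False
say "$r3 $b";    # True False

my $r4 = $b--;   # already False: nothing to change
say "$r4 $b";    # False False
```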

🔬3. Playing with the code of Rakudo Perl 6

Yesterday, we looked at the two methods of the Bool class that return strings. The string representation that the functions produce is hardcoded in the source code.

Let’s use this observation and try changing the texts.

So, here is the fragment that we will modify:

Bool.^add_multi_method('gist', my multi method gist(Bool:D:) {
    self ?? 'True' !! 'False'
});

This gist method is used to stringify a defined variable.

To make things happen, you need to have the source code of Rakudo on your computer so that you can compile it. Clone the project from GitHub first:

$ git clone https://github.com/rakudo/rakudo.git

Compile with MoarVM:

$ cd rakudo
$ perl Configure.pl --gen-moar --gen-nqp --backends=moar
$ make

Once that is done, you get the perl6 executable in the rakudo directory.

Now, open the src/core/Bool.pm file and change the strings of the gist method to use the Unicode thumbs instead of plain text:

Bool.^add_multi_method('gist', my multi method gist(Bool:D:) {
    self ?? '👍' !! '👎'
});

After saving the file, you need to recompile Rakudo. Bool.pm is in the list of files to be compiled in Makefile:

M_CORE_SOURCES = \
    src/core/core_prologue.pm\
    src/core/traits.pm\
    src/core/Positional.pm\
    . . .
    src/core/Bool.pm\
    . . .

Run make and get the updated perl6. Run it and enjoy the result:

:~/rakudo$ ./perl6
To exit type 'exit' or '^D'
> my Bool $b = True;
👍
> $b = !$b; 
👎
>

As an exercise, let us improve your local Perl 6 by adding the Str method for undefined values. By default, it does not exist, and we saw that yesterday. It means that an attempt to interpolate an undefined variable in a string triggers a warning. Let’s make it better.

Interpolation uses the Str method. It is similar to both gist and perl, so you will have no difficulties in creating the new version.

This is what currently is in Perl 6:

Bool.^add_multi_method('Str', my multi method Str(Bool:D:) {
    self ?? 'True' !! 'False'
});

This is what you need to add:

Bool.^add_multi_method('Str', my multi method Str(Bool:U:) {
    '¯\_(ツ)_/¯'
});

Notice that self is not needed in the second variant: for an undefined invocant, it would only be the Bool type object.

Compile and run perl6:

$ ./perl6
To exit type 'exit' or '^D'
> my Bool $b;
(Bool)
> "Here is my variable: $b"
Here is my variable: ¯\_(ツ)_/¯
>

It works as expected. Congratulations, you’ve just changed the behaviour of Perl 6 yourself!
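
If you do not want to recompile Rakudo, a similar experiment can be sketched in an ordinary script with a class of your own. Flag is a made-up class standing in for Bool here; the assumption is that a multi-method Str with the :U invocant is picked up for the type object in the same way as in the patched Bool class.

```raku
# Hypothetical class standing in for Bool, so that no recompilation is needed.
class Flag {
    # Candidate for the undefined case: the invocant is the type object.
    multi method Str(Flag:U:) { '¯\_(ツ)_/¯' }
}

my Flag $f;
my $text = "Here is my variable: $f";
say $text; # Here is my variable: ¯\_(ツ)_/¯
```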

🔬2. Exploring the Bool type in Perl 6, part 1

Today, we will be digging into the internals of the Bool type using the source code of Rakudo, available on GitHub.

Rakudo Perl 6 is written in Perl 6 and NQP (Not Quite Perl 6), which makes it relatively easy to read the sources. Of course, there are many things that are not easy to understand or that are not reflected in the publicly available documentation of the Perl 6 language. Nor can you find the deep details in the Perl 6 books so far. Still, reading the sources is possible with an intermediate understanding of Perl 6.

OK, so back to the src/core/Bool.pm file. It begins with a few BEGIN phasers that add some methods and multi-methods to the Bool class. We’ll talk about the details of metamodels and class construction next time. Today, what interests us more is what the methods of the Bool class do.

gist and perl

The gist and perl methods return a string representation of the object: gist is implicitly called by say, while perl is supposed to be called directly. It works for any object in Perl 6, but of course, the behaviour should be defined somewhere. And here they are:

Bool.^add_method('gist', my proto method gist(|) {*});
Bool.^add_multi_method('gist', my multi method gist(Bool:D:) { 
    self ?? 'True' !! 'False'
});
Bool.^add_multi_method('gist', my multi method gist(Bool:U:) {
    '(Bool)'
});

Bool.^add_method('perl', my proto method perl(|) {*});
Bool.^add_multi_method('perl', my multi method perl(Bool:D:) {
    self ?? 'Bool::True' !! 'Bool::False'
});
Bool.^add_multi_method('perl', my multi method perl(Bool:U:) {
    'Bool' 
});

Try out the methods in the following simple program:

my Bool $b = True;
say $b;      # True
say "[$b]";  # [True]
$b.perl.say; # Bool::True

As you can see, the True string is returned by the gist method, while the perl method returns Bool::True.

Both methods are multi-methods, and in the above example, the version with a defined argument was used. If you look at the signatures, you will see that the methods differ in the way the argument is specified: Bool:D: or Bool:U:. The letters D and U stand for defined and undefined, correspondingly. The first colon attaches this definedness constraint to the type, while the second one indicates that the argument is actually an invocant.
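
The two colons can be tried separately. In the sketch below, state-of and Switch are made-up names: the multi-subs use only the first colon (the definedness constraint on a regular parameter), and the method uses the second colon to give the invocant an explicit name.

```raku
# Hypothetical subs: the same :D and :U markers work on ordinary parameters.
multi sub state-of(Bool:D $x) { 'defined' }
multi sub state-of(Bool:U $x) { 'undefined' }

my Bool $b;
say state-of($b); # undefined
$b = False;
say state-of($b); # defined

# In a method, the colon after the first parameter marks it as the invocant.
class Switch {
    method label(Switch:D $me:) { 'I am ' ~ $me.^name }
}
say Switch.new.label; # I am Switch
```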

So, different versions of the methods are triggered depending on whether they are called on a defined or an undefined Boolean variable. To demonstrate the behaviour of the other two variants, simply remove the initialiser part from the code:

my Bool $b;
say $b;      # (Bool)
$b.perl.say; # Bool

As the variable $b has a type, Perl 6 knows the type of the object on which it should call the methods. The call is then dispatched to the versions with the (Bool:U:) signature because the variable is not defined yet.

When an undefined variable appears in a string, for example, say "[$b]", the gist method is not called. Instead, you get a warning, and an empty string is interpolated:

Use of uninitialized value $b of type Bool in string context.
Methods .^name, .perl, .gist, or .say can be used to stringify it to something meaningful.
 in block  at bool-2.pl line 3
[]

The message says that Perl 6 knows the type of the variable but refuses to pick a stringifying method for it.
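
If the empty string is what you want, or if you prefer one of the suggested methods, both options can be written explicitly. This is only a sketch: quietly suppresses the warning for the block it wraps, and a closure inside the string lets you choose the method yourself.

```raku
my Bool $b;

# quietly suppresses the warning; the string is still built.
my $text = quietly { "[$b]" };
say $text; # []

# Calling the methods explicitly inside the string avoids the warning.
my $with-gist = "[{$b.gist}]";
my $with-perl = "[{$b.perl}]";
say $with-gist; # [(Bool)]
say $with-perl; # [Bool]
```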

That’s all for today. Next time, we’ll look at other methods defined for the Bool data type.

🦋1. The proto keyword in Perl 6

Today, we are looking precisely at the proto keyword. It gives a hint for the compiler about your intention to create multi-subs.

Example 1

Consider an example of the function that either flips a string or negates an integer.

multi sub f(Int $x) {
    return -$x;
}

multi sub f(Str $x) {
    return $x.flip;
}

say f(42);      # -42
say f('Hello'); # olleH

What if we create another variant of the function that takes two arguments?

multi sub f($a, $b) {
    return $a + $b;
}

say f(1, 2); # 3

This code works perfectly, but it looks like its harmony is broken. Even if the name of the function says nothing about what it does, we intended to have a function that somehow returns a ‘reflected’ version of its argument. The function that adds up two numbers does not fit this idea.

So, it is time to clearly announce the intention with the help of the proto keyword.

proto sub f($x) {*}

Now, an attempt to call the two-argument function won’t compile:

===SORRY!=== Error while compiling proto.pl
Calling f(Int, Int) will never work with proto signature ($x)
at proto.pl:15
------> say f(1,2)

The calls of the one-argument variants work perfectly. The proto-definition creates a pattern for the function f: its name is f, and it takes one scalar argument. Multi-functions specify the behaviour and narrow their expertise to either integers or strings.
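
As long as a new candidate fits the pattern, it can be added freely. The sketch below repeats the proto and the two variants from above and adds a third one for rational numbers, which is my own addition for the illustration: it returns the reciprocal, another kind of ‘reflection’.

```raku
# The proto fixes the shape: exactly one positional argument.
proto sub f($x) {*}

multi sub f(Int $x) { -$x }
multi sub f(Str $x) { $x.flip }

# Hypothetical extra candidate: still one argument, so it fits the proto.
multi sub f(Rat $x) { 1 / $x }

say f(42);      # -42
say f('Hello'); # olleH
say f(0.5);     # 2
```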

Example 2

Another example involves a proto-definition with two typed arguments in the function signature.

proto sub g(Int $x, Int $y) {*}

In this example, the function returns a sum of the two integers. When one of the numbers is much bigger than the other, the smaller number is just ignored as being not significant enough:

multi sub g(Int $x, Int $y) {
   return $x + $y;
}

multi sub g(Int $x, Int $y where {$y > 1_000_000 * $x}) {
   return $y;
}

Call the function with integer arguments and see how Perl 6 picks the correct variant:

say g(1, 2);          # 3
say g(3, 10_000_000); # 10000000

Remember that the prototype insists on two integers. Try passing floating-point numbers:

say g(pi, e);

We got a compile-time error:

===SORRY!=== Error while compiling proto-int.pl
Calling g(Num, Num) will never work with proto signature (Int $x, Int $y)
at proto-int.pl:13
------> say ⏏g(pi, e);

The prototype has caught the error in the function usage. What happens if there is no proto for the g sub? The function is still not called, but the error message is different. It happens at run-time this time:

Cannot resolve caller g(3.14159265358979e0, 2.71828182845905e0); none of these signatures match:
 (Int $x, Int $y)
 (Int $x, Int $y where { ... })
 in block <unit> at proto-int.pl line 13

We still have no acceptable signature for the floating-point numbers, but the compiler cannot see that until the program flow reaches the code.
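
Because the failure happens at run-time, it can be caught like any other exception. In this sketch, the arguments go through untyped variables (an assumption I make so that their types are definitely known only at run-time), and try keeps the program alive so that the exception can be inspected.

```raku
multi sub g(Int $x, Int $y) { $x + $y }

# Untyped variables, so the argument types are only known at run time.
my $x = pi;
my $y = e;

# No candidate accepts two Num values; try catches the exception.
my $result = try g($x, $y);
my $error  = $!;

say $result.defined; # False
say $error.^name;    # X::Multi::NoMatch
```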