📘 The Brainfuck interpreter written in Perl 6

Create the interpreter for the Brainfuck language.

Brainfuck is an esoteric programming language that has a small set of instructions, each of them a single punctuation character.

It is assumed that the Brainfuck program has built-in data memory, which is an array of integers, and a pointer to the currently selected item. The two instructions, + and -, increment and decrement the current element. The < and > instructions move the data pointer one position to the left or to the right.

Another two instructions, . and ,, handle output and input. The . instruction prints the current element, using its value as an ASCII codepoint (in theory, it can be Unicode). The , instruction reads a character from the standard input and stores its numeric value in the current element of the data array.

Finally, [ and ] create loops. If the current data element is not zero when the program reads a closing bracket, the program returns to the corresponding opening bracket. If the current data element is zero when the program reads an opening bracket, the whole block between the two matching brackets is skipped. A block that is always skipped can also be used for embedding comments.

All other characters are ignored. This gives the ability to separate the program instructions with spaces or newlines, as well as to add comments right next to the main code. Such comments simply must not contain any of the eight instruction characters.

Online, you can find many examples of the Brainfuck code. We’ll test our program on the following ‘Hello World!’ program:

++++++++[>++++[>++>+++>+++>+<<<<-]>+>+>->>+[<]<-]>>.>---.+++++++..+++.>>.<-.<.+++.------.--------.>>+.>++.

Now, we are ready to create the interpreter of Brainfuck in Perl 6.

First, read the source code into the $program variable, and pass it to the main interpreter subroutine:

my $program = $*IN.slurp;
brainfuck($program);

The interpreter first creates the containers it needs: @program holds the program as an array of characters; $program_pointer is set to the beginning of it; @data_memory keeps the data, and its current position, $data_pointer, is also set to 0.

sub brainfuck($program) {
    my @program = $program.comb('');
    my $program_pointer = 0;
    my @data_memory;
    my $data_pointer = 0;

Now, iterate over the program instructions.

    while $program_pointer < @program.elems {

At this point of the main loop, the @program[$program_pointer] element contains the current program instruction. A given/when block examines it and performs the corresponding action. The first four commands are straightforward:

        given @program[$program_pointer] {
            when '>' {$data_pointer++}
            when '<' {$data_pointer--}
            when '+' {@data_memory[$data_pointer]++}
            when '-' {@data_memory[$data_pointer]--}

Let’s skip the comma command for now and move on to the output command, the dot. It takes the current element of the @data_memory array and uses the chr method to translate the codepoint to a character.

            when '.' {
                print @data_memory[$data_pointer].chr
            }

Finally, the loop commands [ and ]. Their behaviour depends on the value of the current data element @data_memory[$data_pointer]. If the condition is met (i.e., if the current element is zero for [ and non-zero for ]), the $program_pointer must be moved to the position of the matching bracket.

To simplify the program, the code that finds the balancing brackets is placed in separate functions, _move_forward and _move_back. They take the program pointer as an argument and return its new value.

            when '[' {
                $program_pointer =
                    _move_forward(@program, $program_pointer)
                unless @data_memory[$data_pointer];
            }
            when ']' {
                $program_pointer =
                    _move_back(@program, $program_pointer)
                if @data_memory[$data_pointer];
            }
        }

All other instructions, which are not listed in the when clauses, are simply ignored. After the current instruction has been processed, the program pointer is moved to the next position:

        $program_pointer++;
    }
}

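Finally, here is the code for the functions that search for the balancing brackets. They move either forward or backwards and count the opening and closing brackets. The $level variable holds the current nesting depth: it is increased when the search meets another bracket of the same kind as the starting one, which opens a nested loop, and decreased when it meets a bracket of the opposite kind. The matching bracket is the one that brings $level down to zero.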

sub _move_back(@program, $program_pointer is copy) {
    my $level = 1;
    while $level && $program_pointer >= 0 {
        $program_pointer--;
        given @program[$program_pointer] {
            when '[' {$level--}
            when ']' {$level++}
        }
    }   
    return $program_pointer - 1;
}

sub _move_forward(@program, $program_pointer is copy) {
    my $level = 1;
    while $level && $program_pointer < @program.elems {
        $program_pointer++;
        given @program[$program_pointer] {
            when '[' {$level++}
            when ']' {$level--}
        }
    }   
    return $program_pointer - 1;
}

The subroutines use the same given/when approach for dealing with command characters as the main loop.

To prevent infinite loops in an incorrect program, both subs check whether the $program_pointer has reached the beginning or the end of the program. Notice that because the $program_pointer is modified inside the subs, it is declared as is copy in their signatures. The return value is intentionally decremented by one to compensate for its subsequent increment in the main loop: $program_pointer++.
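The bracket search can be cross-checked outside Perl 6. The following is a minimal Python sketch, not the code of this task: the helper name find_matching is hypothetical, it covers both directions in one function, and, unlike _move_forward and _move_back, it returns the index of the matching bracket itself and raises an error for unbalanced input.

```python
def find_matching(program: str, pos: int) -> int:
    """Return the index of the bracket that matches the one at program[pos]."""
    if program[pos] not in "[]":
        raise ValueError("program[pos] must be a bracket")
    # Scan to the right from '[' and to the left from ']'.
    step = 1 if program[pos] == "[" else -1
    level = 0
    i = pos
    while 0 <= i < len(program):
        if program[i] == "[":
            level += step   # deepens nesting on a forward scan, unwinds it on a backward scan
        elif program[i] == "]":
            level -= step   # the opposite of the line above
        if level == 0:
            return i        # nesting is back to zero: this is the partner bracket
        i += step
    raise ValueError(f"unbalanced bracket at position {pos}")
```

Because this helper returns the position of the bracket itself, a caller that increments the program pointer afterwards continues with the instruction right after that bracket.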

The interpreter is complete. Save the ‘Hello World!’ program in a file and feed it to the interpreter on standard input:

$ perl6 brainfuck.pl < helloworld.bf 
Hello World!
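To double-check the expected output independently, here is a compact Python sketch of the same machine. It is an illustration under stated assumptions rather than a translation of the Perl 6 code: instead of scanning for the partner bracket on every jump, it precomputes a jump table with a stack (a different technique from the one used above), cells are unbounded integers, the , instruction leaves the cell unchanged at the end of input, and the names build_jumps and run are hypothetical.

```python
from collections import defaultdict

def build_jumps(program: str) -> dict:
    """Map the position of every bracket to the position of its partner."""
    jumps, stack = {}, []
    for i, c in enumerate(program):
        if c == "[":
            stack.append(i)
        elif c == "]":
            if not stack:
                raise ValueError(f"unmatched ] at position {i}")
            j = stack.pop()
            jumps[i], jumps[j] = j, i
    if stack:
        raise ValueError(f"unmatched [ at position {stack[-1]}")
    return jumps

def run(program: str, stdin: str = "") -> str:
    """Execute a Brainfuck program and return everything it prints."""
    jumps = build_jumps(program)
    data = defaultdict(int)          # data memory, every cell starts at 0
    data_pointer = program_pointer = 0
    chars = iter(stdin)
    out = []
    while program_pointer < len(program):
        c = program[program_pointer]
        if c == ">":
            data_pointer += 1
        elif c == "<":
            data_pointer -= 1
        elif c == "+":
            data[data_pointer] += 1
        elif c == "-":
            data[data_pointer] -= 1
        elif c == ".":
            out.append(chr(data[data_pointer]))
        elif c == ",":
            ch = next(chars, None)   # None signals the end of input
            if ch is not None:
                data[data_pointer] = ord(ch)
        elif c == "[" and data[data_pointer] == 0:
            program_pointer = jumps[program_pointer]   # skip the block
        elif c == "]" and data[data_pointer] != 0:
            program_pointer = jumps[program_pointer]   # repeat the block
        program_pointer += 1         # all other characters are ignored
    return "".join(out)

HELLO = ("++++++++[>++++[>++>+++>+++>+<<<<-]>+>+>->>+[<]<-]>>.>---.+++++++..+++."
         ">>.<-.<.+++.------.--------.>>+.>++.")
print(run(HELLO), end="")
```

The last instruction of the test program prints codepoint 10, so the output already ends with a newline.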

As an exercise, modify the interpreter so that it understands the , command. You need to extend the given/when list in the main loop with the code that reads a character from the input:

when ',' {@data_memory[$data_pointer] = $*IN.getc.?ord}

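The $*IN.getc method returns Nil when there are no more characters in the input. Try to catch this situation to avoid filling the data memory with empty data. Note that the interpreter above already consumes the whole standard input when it slurps the program source, so to give a program real input, the source has to come from somewhere else, for example, from a file named on the command line. Here is a test program that copies the input to the output: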

>+[[>],.-------------[+++++ +++++ +++[<]]>]<<[<]>>[.>]

Another useful modification would be error handling. There are a few places in the program where incrementing or decrementing one of the pointers may take it outside the array bounds. Add code that checks for this and displays an error message. To make the process easier, use some simple debugging code like the one below to visualise the position of the program pointer and the data state at each iteration of the main loop:

say $program;
say ' ' x $program_pointer ~ '^';
say @data_memory[0..$data_pointer - 1] ~ ' [' ~
    @data_memory[$data_pointer] ~ '] ' ~
    @data_memory[$data_pointer + 1..*];
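The same three-line view can be sketched in Python for experimenting with the layout. This is only an illustration: the function name render_state is hypothetical, the data memory is assumed to be a plain list, and missing cells up to the data pointer are shown as zeros.

```python
def render_state(program: str, program_pointer: int,
                 data: list, data_pointer: int) -> str:
    """Return the program, a caret under the current instruction, and the data cells."""
    # Pad with zeros so that the cell under the data pointer always exists.
    cells = list(data) + [0] * (data_pointer + 1 - len(data))
    left = " ".join(str(c) for c in cells[:data_pointer])
    right = " ".join(str(c) for c in cells[data_pointer + 1:])
    memory = f"{left} [{cells[data_pointer]}] {right}".strip()
    return "\n".join([program, " " * program_pointer + "^", memory])

print(render_state("+>+", 1, [1], 1))
```

The current cell is shown in square brackets, as in the Perl 6 snippet above.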

📘 Converting Morse to text using Perl 6

Convert the Morse sequence to plain text.

To save the effort of typing the decoding table, we can reuse the %code hash from Task 98, Text to Morse code, and create the ‘inverted’ hash, where the keys are the Morse sequences, and the values are letters or digits:

my %char = %code.kv.reverse;

Printing this variable shows its contents in the following way:

{- => t, -- => m, --- => o, ----- => 0, ----. => 9, ---.. => 8, 
--. => g, --.- => q, --.. => z, --... => 7, -. => n, -.- => k, 
-.-- => y, -.-. => c, -.. => d, -..- => x, -... => b, -.... => 6, 
. => e, .- => a, .-- => w, .--- => j, .---- => 1, .--. => p,
.-. => r, .-.. => l, .. => i, ..- => u, ..--- => 2, ..-. => f,
... => s, ...- => v, ...-- => 3, .... => h, ....- => 4, ..... => 5}
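Why reversing the flat key-value list inverts the hash is easier to see on a small table. The Python sketch below mimics the idea step by step; the function name and the three-entry table are made up for illustration, and the trick assumes that all values are distinct, as they are in the Morse table.

```python
def invert_via_flat_reverse(table: dict) -> dict:
    """Invert a mapping by reversing its flattened key-value list."""
    flat = [item for pair in table.items() for item in pair]  # k1, v1, k2, v2, ...
    flat.reverse()                                            # ..., v2, k2, v1, k1
    # Reading the reversed list pairwise puts each value first and its key second.
    return dict(zip(flat[0::2], flat[1::2]))

code = {"a": ".-", "t": "-", "5": "....."}
print(invert_via_flat_reverse(code))
```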

Despite the fact that Perl 6’s output does not print quotes, all the keys and values in %char are strings. The next step is to replace the sequences from the keys of the hash with their values. The small difficulty is that, unlike the text-to-Morse conversion, the regex has to match a sequence of several characters (dots and dashes), so it must be anchored to the boundaries of the Morse characters.

The built-in << and >> regex anchors for word boundaries assume that the words are sequences of letters and digits, while Morse sequences are dots and dashes. Let’s use a space to serve as a separating character. To simplify the task, just add an additional space to the string before decoding it.

my $text = prompt('Morse phrase> ') ~ ' ';
$text ~~ s:g/(<[.-]>+) ' '/%char{$0}/;
$text ~~ s:g/\s+/ /;
say $text;
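The two substitutions can be replayed in Python to see what each of them does to the string. This is a sketch under assumptions, not part of the original solution: the table is rebuilt inline so that the example is self-contained, the names CODE, CHAR and decode are illustrative, and an unknown Morse sequence raises a KeyError here.

```python
import re
import string

# Morse sequences for a..z followed by 0..9, in that order.
CODE = dict(zip(
    string.ascii_lowercase + string.digits,
    (".- -... -.-. -.. . ..-. --. .... .. .--- -.- .-.. -- -. --- .--. --.- "
     ".-. ... - ..- ...- .-- -..- -.-- --.. "
     "----- .---- ..--- ...-- ....- ..... -.... --... ---.. ----.").split()))
CHAR = {morse: char for char, morse in CODE.items()}

def decode(morse_text: str) -> str:
    text = morse_text + " "   # make sure the last sequence is followed by a space too
    # Replace every run of dots and dashes, together with its separating space.
    text = re.sub(r"([.-]+) ", lambda m: CHAR[m.group(1)], text)
    # What is left between words is squeezed into a single space.
    return re.sub(r"\s+", " ", text)

print(decode(".... . .-.. .-.. ---  .-- --- .-. .-.. -.."))
```

Consuming one space together with each sequence is what turns the double space between words into a single one.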

📘 Converting text to Morse code using Perl 6

Convert the given text to the Morse code.

Converting text to the Morse code is a relatively easy task. The solution is to replace all the alphanumeric characters with the corresponding representation in the Morse code.

In this solution, all the other characters are ignored and are removed from the source string. In the Morse code, letters are separated by a pause of one dash (three dot units), and words are separated by a pause of seven dot units, a little more than two dashes. In the program, one space therefore separates characters, and two spaces separate words.

The above logic is programmed as a series of replacements. First, lowercase the whole phrase (Morse code makes no distinction between lower- and upper-case letters). Then replace every non-alphanumeric character with a space and collapse each run of whitespace into a single space. Finally, replace each remaining character with the corresponding Morse sequence followed by a space; together with the space already standing between words, this yields the double space at word boundaries.

my %code = (
    a => '.-',      b => '-...',    c => '-.-.',
    d => '-..',     e => '.',       f => '..-.',
    g => '--.',     h => '....',    i => '..',
    j => '.---',    k => '-.-',     l => '.-..',
    m => '--',      n => '-.',      o => '---',
    p => '.--.',    q => '--.-',    r => '.-.',
    s => '...',     t => '-',       u => '..-',
    v => '...-',    w => '.--',     x => '-..-',
    y => '-.--',    z => '--..',    0 => '-----', 
    1 => '.----',   2 => '..---',   3 => '...--',
    4 => '....-',   5 => '.....',   6 => '-....',
    7 => '--...',   8 => '---..',   9 => '----.'
);

my $phrase = prompt('Your phrase in plain text> ');

$phrase.=lc;
$phrase ~~ s:g/<-[a..z0..9]>/ /;
$phrase ~~ s:g/\s+/ /;
$phrase ~~ s:g/(<[a..z0..9]>)/%code{$0} /;

say $phrase; 

Let us test this on a random phrase:

$ perl6 morse.pl 
Your phrase in plain text> Hello, World!
.... . .-.. .-.. ---  .-- --- .-. .-.. -..  
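The chain of replacements can be cross-checked with a Python sketch that follows the same four steps. It is an illustration only: the table is rebuilt compactly, and the names CODE and encode are not part of the original program.

```python
import re
import string

# Morse sequences for a..z followed by 0..9, in that order.
CODE = dict(zip(
    string.ascii_lowercase + string.digits,
    (".- -... -.-. -.. . ..-. --. .... .. .--- -.- .-.. -- -. --- .--. --.- "
     ".-. ... - ..- ...- .-- -..- -.-- --.. "
     "----- .---- ..--- ...-- ....- ..... -.... --... ---.. ----.").split()))

def encode(phrase: str) -> str:
    phrase = phrase.lower()                      # case does not matter in Morse
    phrase = re.sub(r"[^a-z0-9]", " ", phrase)   # anything unencodable becomes a space
    phrase = re.sub(r"\s+", " ", phrase)         # one space between words
    # Each character turns into its sequence plus a separating space.
    return re.sub(r"[a-z0-9]", lambda m: CODE[m.group()] + " ", phrase)

print(repr(encode("Hello, World!")))
```

Printing the repr makes the trailing spaces visible: the exclamation mark becomes a space of its own after the last letter.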

The conversion table takes the biggest part of the program.

The regexes show how character classes are created in Perl 6.

A character class with ranges of symbols:

<[a..z0..9]>

A negated character class, which matches any character outside the given ranges:

<-[a..z0..9]>

These character classes list all the allowed characters that can be encoded by the given %code hash. It is also possible to use \w and \W or <alnum> and <-alnum> instead of the above regexes if you are sure that the input string is pure ASCII; bear in mind that both \w and <alnum> also match the underscore. All the regexes in the program come with the :g adverb to make them global. Both matching and replacement use the smartmatch operator ~~.

📘 Reading directory content using Perl 6

Print the file names from the current directory.

Reading a directory in Perl 6 can be done using the dir routine defined in the IO::Path class.

say dir();

This tiny program does not really do the task satisfactorily, as the dir routine returns a lazy sequence (an object of the Seq data type) of IO::Path objects, not plain file names.

To get the textual file names, take the path part of an IO::Path object using the path method:

.path.say for dir;

The code is equivalent to the more verbose fragment:

for dir() -> $file {
    say $file.path;
}

If you want to print full paths of the files in a directory, use the absolute method:

.absolute.say for dir;

The test named argument of the dir routine selects the file names that match a certain regex, for example, to list all the JPEG files with the .jpg extension:

for dir(test => /\.jpg$/) -> $file {
    say $file.path;
}
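For comparison, the same listing with an optional filter can be sketched in Python with the standard pathlib module. The function name list_names and its parameters are illustrative assumptions; the names are sorted only to make the result predictable.

```python
import re
from pathlib import Path

def list_names(directory: str = ".", pattern: str = None) -> list:
    """Return the sorted entry names of a directory, optionally filtered by a regex."""
    names = sorted(entry.name for entry in Path(directory).iterdir())
    if pattern is None:
        return names
    # Keep only the names in which the pattern is found.
    return [name for name in names if re.search(pattern, name)]

for name in list_names(".", r"\.jpg$"):
    print(name)
```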

📘 The uniq utility written in Perl 6

Create a simple equivalent of the UNIX uniq utility, which prints only those lines from the STDIN input that do not repeat the previous line.

The solution of this task can be built on the solution of Task 95, The cat utility. This time, the entered lines have to be saved, so let’s introduce the $previous variable and make an explicit code block for the loop.

my $previous = '';
# .defined keeps an empty line from ending the loop (an empty string is false)
while (my $line = $*IN.get).defined {
    say $line unless $line eq $previous;
    $previous = $line;
}

On each iteration, the next line from the $*IN handle is read and saved in the $line variable. If the value is different from the previous line, which is saved in the $previous variable, then the current line is printed.

At the moment, only adjacent duplicates are suppressed. If two identical lines are separated by other lines, both are printed. Let us modify the program so that it prints each distinct line only once per whole input stream.
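The comparison with the previous line can be expressed as a small Python generator. It is a sketch for illustration: the name uniq_adjacent is hypothetical, and it starts with None rather than an empty string, so a leading empty line is kept.

```python
def uniq_adjacent(lines):
    """Yield every line that differs from the line immediately before it."""
    previous = None
    for line in lines:
        if line != previous:
            yield line
        previous = line

print(list(uniq_adjacent(["a", "a", "b", "a", "", "", "c"])))
```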

my %seen;
# .defined keeps an empty line from ending the loop (an empty string is false)
while (my $line = $*IN.get).defined {
    next if %seen{$line};
    say $line;
    %seen{$line} = 1;
}

Here, the %seen hash stores the lines that have already been printed. A Set object is a good alternative; see the example of using a set in Task 54, Exclusion of two arrays.
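The set-based variant looks like this in a Python sketch (the function name uniq_global is illustrative). The set only answers the question of whether a line has been seen, while the order of the output follows the order of the first occurrences.

```python
def uniq_global(lines):
    """Yield each distinct line once, in the order of first appearance."""
    seen = set()
    for line in lines:
        if line not in seen:
            seen.add(line)
            yield line

print(list(uniq_global(["a", "b", "a", "c", "b"])))
```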

📘 The cat utility written in Perl 6

Create the equivalent of the UNIX cat utility that copies its STDIN input directly to STDOUT.

Reading from the input and sending it to the output is a relatively easy task. The $*IN handle is by default connected to STDIN. Being an object of the IO::Handle type, it has the slurp method that returns the whole input text in one go. What is left to do is just to print it to the output, which defaults to STDOUT.

Here’s the complete program:

say $*IN.slurp;

This works well, and we can use it via the command line:

$ perl6 cat.pl < file1.txt > file2.txt

However, in the interactive mode, the program does not replicate each entered line; instead, it waits until the end of the input (say, until you press Ctrl+D). This behaviour is a direct consequence of using the slurp method.

Let us modify the program so that it prints each line as soon as it is entered. The IO::Handle class has another method, get, which reads one line from the handle and returns it. Create a loop and print the line after it is delivered by the get method:

# .defined keeps an empty line from ending the loop (an empty string is false)
.say while ($_ = $*IN.get).defined;

Here, the default variable $_ is used. This avoids creating a new variable and makes the whole program more compact. The .say call is a shortcut for $_.say.
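The line-by-line approach can be compared with a Python sketch that copies a stream as the lines arrive. The function name cat and its two parameters are assumptions made for the example; by default it works with the standard streams.

```python
import io
import sys

def cat(source=None, sink=None):
    """Copy the source to the sink line by line, flushing after every line."""
    source = sys.stdin if source is None else source
    sink = sys.stdout if sink is None else sink
    for line in source:      # each line keeps its trailing newline
        sink.write(line)
        sink.flush()         # show the line immediately in interactive use

# Demonstration on in-memory streams instead of the real STDIN and STDOUT.
buffer = io.StringIO()
cat(io.StringIO("one\n\ntwo\n"), buffer)
print(buffer.getvalue(), end="")
```

Iterating over the stream ends only at the end of the input, so empty lines are copied like any other line.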