Welcome to another sorting episode, this time we’ll talk about Merge sort in Perl 6.
In Merge sort, you first split the data into halves until the pieces become atomic (in the original meaning of the word), that is either each piece contains a single element, or, after the current split, the second part contains no elements.
The second step is two merge pairs of pieces. If the pieces are trivial, you simply attach one element to another in the correct order and get a one- or two-item sublist. As the fragments grow, you have to take the two pieces and create a new list by taking the values from the beginning of each piece so that the final sequence is sorted (numeric or lexicographic, but we only talk about numbers so far).
The first ad-hoc implementation is shown below.
sub merge-sort(@data) { return @data if @data.elems <= 1; sub merge(@l, @r) { my @a; while (@l and @r) { my $v; if @l[0] < @r[0] { $v = @l.shift; } else { $v = @r.shift; } @a.push($v); } @a.push(|@l, |@r); return @a; } my $mid = @data.elems div 2; my @l = @data[^$mid]; my @r = @data[$mid .. *-1]; @l = merge-sort(@l); @r = merge-sort(@r); my @a = merge(@l, @r); return @a; } my @data = 4, 5, 7, 1, 46, 78, 2, 2, 1, 9, 10; @data = merge-sort @data; say @data;
Notice that the data is not sorted in-place. Also notice that the merge procedure is contained in a separate internal function.
The algorithm calls itself iteratively: for each @l
and @r
parts another round of merge-sort
is initiated before the parts are merged via merge(@l, @r)
.
Let us beautify the code as far as it is possible. First, let’s get rid of a separate storage for the halves and move them straight to the function call:
my @l = merge-sort(@data[^$mid]); my @r = merge-sort(@data[$mid .. *-1]);
The body of the merge
function contains a lot of one-line fragments of code, which can be efficiently re-written using a ternary operator.
while (@l and @r) { my $v = @l[0] < @r[0] ?? @l.shift !! @r.shift; @a.push($v); }
Now, the merge
function seems to be redundant, and its actions can be moved to the place where the function is called. At this step, we can also gain some space from inlining temporary assignments and using postfix structures.
sub merge-sort(@data) { return @data if @data.elems <= 1; my $mid = @data.elems div 2; my @l = merge-sort(@data[^$mid]); my @r = merge-sort(@data[$mid .. *-1]); my @a; @a.push(@l[0] < @r[0] ?? @l.shift !! @r.shift) while @l and @r; @a.push(|@l, |@r); return @a; } my @data = 4, 5, 7, 1, 46, 78, 2, 2, 1, 9, 10; @data = merge-sort @data; say @data;
Maybe it is worth give comments about the two push
es. This code gets two arrays, which are already sorted (as they are the result of the merge-sort
call). Each time, you look at the first elements in them, and pick the one which is smaller.
The first push
adds elements to the result while both @l
and @r
parts contain elements. After at least one of them is exhausted, the second push adds the remaining element to the result. The remaining item is always not less than the last taken, as the arrays were already sorted.
While the call stack has been collapsing, the array gets sorted.
The final touch can be done to remove the temporary @a
variable (and—how sad it can be to invent the wheel—make the code look alike its Rosettacode’s counter partner). In Perl 6, there is a pair of routines: take
and gather
, that collect data while you generate it, and then return the whole at once. Here is the modified version of the function:
sub merge-sort(@data) { return @data if @data.elems <= 1; my $mid = @data.elems div 2; my @l = merge-sort(@data[^$mid]); my @r = merge-sort(@data[$mid .. *-1]); return flat(gather { take(@l[0] < @r[0] ?? @l.shift !! @r.shift) while @l and @r; take(@l, @r); }); }
Also notice that flat
before return
allows you not to flatten the data in the second take
.
You are welcome to further play with the code, which is available on GitHub, and merge interesting solutions into it.
One thought on “💡 103. Merge sort in Perl 6”