Sam's Blog
Some simple "white-space trim" benchmarks
Date: Wednesday, 3 March 2010, 15:38.
Categories: perl, ironman, benchmarking, trim, regexp, optimization, basic, tutorial.
Part of the series: Explaining Benchmark Analysis.
Laufeyjarson asked on Monday, about stripping whitespace from both ends of a string. The comments contains lots of suggestions, but no hard figures, so I thought I'd reproduce them here along with the code used to generate the benchmarks - it provides a simple example of how to write a quick and reliable benchmark.
I used a *cough* "simple one-liner" *cough* to run my benchmark from the command-line, I've added some spaces for clarity, but you should still be able to cut-n-paste this into bash (or the shell of your choice) to see the results yourself.
perl -MBenchmark -e '
my $s = " some text ";
my $t;
my %h = (
TWO => sub { $t = $s; $t =~ s/^\s+//; $t =~ s/\s+$//; $t; },
SUBSTITUTE => sub { $t = $s; $t =~ s/^\s*(.*?)\s*$/$1/; $t; },
GROUP_OR => sub { $t = $s; $t =~ s/(^\s+|\s+$)//g; $t; },
NONGROUP_OR => sub { $t = $s; $t =~ s/(?:^\s+|\s+$)//g; $t; },
UNGROUPED_OR => sub { $t = $s; $t =~ s/^\s+|\s+$//g; $t; },
);
foreach my $key ( keys( %h ) )
{
print qq~$key => "~, $h{ $key }->(), qq~"\n~;
}
Benchmark::cmpthese( -5, \%h );'
This code is fairly simple, if ugly (I said it was from a one-liner!), but the important points are:
We set up the test string, making sure it has whitespace inside the body text, so we notice if our regexp goes berserk and trims all whitespace.
We write our test-cases so that they return their result, this makes the benchmark less accurate, but the benefit is that we can test that our code snippets actually work.
We do a single-pass through the snippets and display the results, so we can check they did what they're supposed to. If you've ever leapt from your chair in excitement at finding a tenfold speed increase, only to later find that it's so much faster because it doesn't do anything anymore, you'll understand that this is important.
Finally we run the benchmark via the Benchmark module, for 5 CPU seconds for each snippet. Adjust this to taste.
And the output:
SUBSTITUTE => "some text" TWO => "some text" GROUP_OR => "some text" NONGROUP_OR => "some text" UNGROUPED_OR => "some text" Rate GROUP_OR SUBSTITUTE UNGROUPED_OR NONGROUP_OR TWO GROUP_OR 211544/s -- -15% -28% -29% -67% SUBSTITUTE 248507/s 17% -- -15% -16% -61% UNGROUPED_OR 293694/s 39% 18% -- -1% -54% NONGROUP_OR 296478/s 40% 19% 1% -- -53% TWO 632438/s 199% 154% 115% 113% --
So, we can see that running two s/// is faster than the alternatives, by a quite significant amount: enough to make our benchmark run three times faster than the slowest (two times faster than the next-best) even with the overhead of assigning the variable and returning the variable each time.
Now, whether that speed improvement is important to you is a whole different matter: there's usually a trade-off between performance and maintainability. I'm not sure that applies here, since the two-statement solution is no less readable than the others - if you have trouble understanding the regexp in two parts, you're clearly going to have trouble with the combined versions. The only point of confusion really is why you're doing it as two statements rather than one, in which case:
# Doing this in two statements is faster than one.
$t =~ s/^\s+//;
$t =~ s/\s+$//;
This is what comments are best used for: clarifying your reasoning.
Some further points:
I'm rather abusing the notion of a "one-liner", nevertheless if you remove the line-breaks from the example, you'll have approximately the sort of thing I type at the command-line when I'm curious about this sort of question. You might find it easier to write it as a script rather than a throwaway command but, for better or for worse, that isn't how my mind approaches throwaway problems.
Once you're sure that your regexps are doing what you intended, you can strip out the return of $t and see how that changes the results, it should increase the percentage difference in speeds: yes, it will change the relative performance as well as the absolute performance. This is what I meant about it making the benchmark less accurate.
The SUBSTITUTE example has a subtle gotcha in that it doesn't use the /s switch, so if you're using it to trim the whitespace off the front and end of a multi-line block of text, you might not get the behaviour you were after when the (.*?) doesn't match across the line-endings. If we'd used a different test string, with a line-break in the middle, this would have been obvious - so while the choice of test string tried to anticipate problems, it didn't get all of them. I've left the example as-is to illustrate this point, a revised example is below.
Revised example:
perl -MBenchmark -e '
my $s = " some\n text ";
my $t;
my %h = (
TWO => sub { $t = $s; $t =~ s/^\s+//; $t =~ s/\s+$//; $t; },
SUBSTITUTE => sub { $t = $s; $t =~ s/^\s*(.*?)\s*$/$1/; $t; },
SUBSTITUTE_S => sub { $t = $s; $t =~ s/^\s*(.*?)\s*$/$1/s; $t; },
GROUP_OR => sub { $t = $s; $t =~ s/(^\s+|\s+$)//g; $t; },
NONGROUP_OR => sub { $t = $s; $t =~ s/(?:^\s+|\s+$)//g; $t; },
UNGROUPED_OR => sub { $t = $s; $t =~ s/^\s+|\s+$//g; $t; },
);
foreach my $key ( keys( %h ) )
{
print qq~$key => "~, $h{ $key }->(), qq~"\n~;
}
Benchmark::cmpthese( -5, \%h );'
And the output:
SUBSTITUTE => " some text " TWO => "some text" SUBSTITUTE_S => "some text" GROUP_OR => "some text" NONGROUP_OR => "some text" UNGROUPED_OR => "some text" Rate GROUP_OR SUBSTITUTE_S UNGROUPED_OR NONGROUP_OR SUBSTITUTE TWO GROUP_OR 196921/s -- -9% -21% -22% -32% -68% SUBSTITUTE_S 216958/s 10% -- -13% -14% -25% -65% UNGROUPED_OR 248735/s 26% 15% -- -1% -14% -60% NONGROUP_OR 252084/s 28% 16% 1% -- -13% -59% SUBSTITUTE 288192/s 46% 33% 16% 14% -- -53% TWO 616676/s 213% 184% 148% 145% 114% --
Note that SUBSTITUTE didn't actually strip anything at all. Whoops. Given it's now the second-fastest implementation, we could have been mislead if we hadn't been checking it actually worked.
So, in conclusion, points to remember when benchmarking:
Choose your test input carefully to make sure it includes edge-cases.
Check that the benchmarked code is giving the right results - fast, broken code isn't going to do you any good at all.
Always be aware if you're making a trade-off of clarity for performance, and be sure it's worth the trade-off. If you're looking at a two microsecond saving, you need to be calling it a lot to have any real impact. (I first investigated this subject while writing the Template::Sandbox module and even with extremely heavy amounts of white-space trimming it had minor impact.)
Outside the scope of this article, but also vary your test input and analyze how the performance changes to reveal the strengths and weaknesses of each approach. Understanding why one approach is slower than the other will teach you more than just knowing which is fastest, and knowing that the situation may change with different input is even more valuable.
This blog entry is part of the series: Explaining Benchmark Analysis.
- Some simple "white-space trim" benchmarks
- Advanced Benchmark Analysis I: Yet more white-space trimming
- Advanced Benchmark Analysis II: Probing strengths and weaknesses