Sam's Blog

Advanced Benchmark Analysis I: Yet more white-space trimming

Date: Friday, 5 March 2010, 09:43.

Categories: perl, ironman, benchmarking, analysis, trim, regexp, optimization, intermediate, tutorial.

Part of the series: Explaining Benchmark Analysis.

It seems my previous blog, "Some simple "white-space trim" benchmarks", caught people's attention, and I've received some interesting suggestions and observations worthy of a followup article. This also gives me the chance to delve into some more advanced benchmark analysis.

So, deep breath, here goes.

Firstly, Aaron Sherman proposed an interesting twist: using a substr() assignment to shorten the string, so that the faster m// can be used instead of s///.

( $t =~ /^\s+/ ) && ( substr( $t, 0, $+[ 0 ] ) = "" );
( $t =~ /\s+$/ ) && ( substr( $t, $-[ 0 ] )    = "" );

I'm sure he'll forgive me for saying this does nothing for code clarity, but it's an interesting angle to have tried, and shows that sometimes approaching the problem from another angle can give you results.
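For anyone unfamiliar with them, @- and @+ are Perl's special arrays holding the offsets of the most recent successful match; $-[ 0 ] and $+[ 0 ] cover the whole match. Here's a minimal standalone sketch of the trick (the sample string is mine, not Aaron's):

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $t = "   some\n text   ";

# After a successful match, $+[ 0 ] is the offset just past the end of
# the whole match, and $-[ 0 ] is the offset of its start.  Assigning
# "" to an lvalue substr() deletes that slice of the string in place.
( $t =~ /^\s+/ ) && ( substr( $t, 0, $+[ 0 ] ) = "" );   # trim leading
( $t =~ /\s+$/ ) && ( substr( $t, $-[ 0 ] )    = "" );   # trim trailing

print qq{"$t"\n};
```

The appeal is that each line only ever *matches* with m//, which is cheaper than asking s/// to perform a substitution.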

Secondly, Lars Dieckow made the observation that CPAN provides an XS implementation, which is apparently pretty fast. (Apologies to Lars if I've mangled his name, I couldn't get the unicode bits into this article.)

I'm ashamed to admit, I'd completely forgotten about String::Strip. Now that my memory has been jogged, I remember looking at it last year. The documentation didn't seem to imply it made much of an improvement, and I didn't bother (shame on me) to check myself. The gain didn't seem worth an extra dependency for Template::Sandbox.

So, let's take a look at our revised "one-liner".

perl -MBenchmark -MString::Strip=StripLTSpace -e '
my $s = "   some\n text   ";
my $t;
my %h = (
    CTRL    => sub { $t = $s;                                 $t; },
    TWO     => sub { $t = $s; $t =~ s/^\s+//; $t =~ s/\s+$//; $t; },
    CAPTURE => sub { $t = $s; $t =~ s/^\s*(.*?)\s*$/$1/s;     $t; },
    G_OR    => sub { $t = $s; $t =~ s/(^\s+|\s+$)//g;         $t; },
    NG_OR   => sub { $t = $s; $t =~ s/(?:^\s+|\s+$)//g;       $t; },
    UG_OR   => sub { $t = $s; $t =~ s/^\s+|\s+$//g;           $t; },
    STRIP   => sub { $t = $s; StripLTSpace( $t );             $t; },
    SUBSTR  => sub { $t = $s;
        ( $t =~ /^\s+/ ) && ( substr( $t, 0, $+[ 0 ] ) = "" );
        ( $t =~ /\s+$/ ) && ( substr( $t, $-[ 0 ] )    = "" );
        $t; },
);
foreach my $key ( keys( %h ) ) {
    print qq~$key => "~, $h{ $key }->(), qq~"\n~;
}
Benchmark::cmpthese( -5, \%h );'

Note that you'll need to install String::Strip before running this command. ("cpan String::Strip" should do the trick.)

I've renamed the code snippets so that the benchmark table doesn't wrap. They're slightly less understandable, but that's an improvement over the table getting messed up across several lines.

I've also removed the broken "missing /s" version of the capture method, now that we've demonstrated that it doesn't work.

Finally, I've added the CTRL snippet as "the control group" which, as any good scientist will tell you, provides a baseline to correct for the overheads, background noise, etc. of the experiment. This will be important later.

The results:

G_OR => "some text"
TWO => "some text"
SUBSTR => "some text"
NG_OR => "some text"
CTRL => "   some text   "
CAPTURE => "some text"
STRIP => "some text"
UG_OR => "some text"

             Rate    G_OR  SUBSTR   NG_OR   UG_OR CAPTURE     TWO   STRIP    CTRL
G_OR     202098/s      --    -12%    -19%    -20%    -21%    -68%    -84%    -97%
SUBSTR   228861/s     13%      --     -8%     -9%    -10%    -63%    -81%    -96%
NG_OR    249683/s     24%      9%      --     -1%     -2%    -60%    -80%    -96%
UG_OR    251118/s     24%     10%      1%      --     -2%    -60%    -80%    -96%
CAPTURE  254970/s     26%     11%      2%      2%      --    -59%    -79%    -96%
TWO      625570/s    210%    173%    151%    149%    145%      --    -49%    -89%
STRIP   1228842/s    508%    437%    392%    389%    382%     96%      --    -79%
CTRL    5930490/s   2834%   2491%   2275%   2262%   2226%    848%    383%      --

The first thing to stand out is just how much faster String::Strip is, rather a lot more than the modest 35% its own documentation lays claim to. It's about twice as fast as the next-best implementation, so the documentation appears to be doing quite a job of underselling itself.

Secondly, the substr() approach doesn't fare too well with this input. Aaron had seen higher results in his benchmarks than I'm seeing here, and the reason is that he'd changed the input. We'll investigate that in part two of this blog, as it makes for a good example of my recommendation to understand how changing the input changes the performance.

The third thing to note is the CTRL benchmark. This is the speed of a function that does only the set-up work and result return, with no white-space trimming: it gives us an estimate of how much overhead we're incurring on each test.

We can see that it's managing nearly six million iterations per second, but what does that figure actually mean for our results?

Well, you need to subtract the performance of the control from the other benchmarks if you want a more accurate result. That much should be obvious; what isn't necessarily obvious is just what to subtract, and from what.

This is because the benchmarks are given as runs-per-second rather than seconds-per-run; if we had seconds-per-run values we could simply subtract the control value directly from each benchmark.

Basic maths should tell you that runs-per-second and seconds-per-run are the reciprocal of each other, so to get the corrected runs-per-second we need the reciprocal of the difference of the reciprocals of the uncorrected runs-per-second.

Yes, that's a mouthful, and no, I can't easily work out 1 / ( ( 1 / 1228842 ) - ( 1 / 5930490 ) ) at a glance either.

Fortunately there's an easy way to approximate it in your head: you can say that 5930490 is, waves hands, roughly 5 times bigger than 1228842, so 1/5930490 is therefore roughly 1/5th of 1/1228842, and subtracting it makes 1/1228842 about 20% smaller, which in turn makes the reciprocal about 20% bigger... more hand-waving.

The short of it is: take the multiple, turn it on its head to get a fraction, express the fraction as a percentage, and that roughly gives you the error margin in your benchmark caused by your overhead.
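To make the hand-waving concrete, here's a small standalone script, using the rates from the table above, that computes both the exact corrected rate and the rule-of-thumb error margin:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Raw runs-per-second figures from the benchmark table above.
my $ctrl  = 5930490;   # CTRL: set-up and return overhead only
my $strip = 1228842;   # STRIP
my $two   =  625570;   # TWO

# Exact correction: convert to seconds-per-run, subtract the control's
# overhead, then convert back to runs-per-second.
my $corrected = 1 / ( ( 1 / $strip ) - ( 1 / $ctrl ) );
printf "Corrected STRIP rate: %.0f/s\n", $corrected;

# Rule of thumb: the error margin is roughly the benchmark's rate
# expressed as a fraction of the control's rate.
printf "STRIP error margin: ~%.0f%%\n", 100 * $strip / $ctrl;
printf "TWO   error margin: ~%.0f%%\n", 100 * $two   / $ctrl;
```

Running that gives a corrected STRIP rate of around 1.55 million/s, with error margins of roughly 21% for STRIP and 11% for TWO, matching the in-your-head estimates.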

We can see that for STRIP we're getting roughly a 20% error, which isn't insignificant, but given that it outperforms its nearest rival by about 100%, it only influences our results if we care about "just exactly how much faster is it?".

The next fastest result is TWO, with 625570/s, which is 1/10th of the rate of our control, so roughly a 10% error. 10% is about the variance we're getting from only running the benchmarks for 3 CPU seconds, so we can safely ignore that entirely, and since the influence of the overhead only diminishes for the slower results we can ignore it for those too.

So, as far as our results go, the overhead isn't having an impact we care about. This is good to know, because it gives us confidence to draw conclusions about the relative performances without worrying that the overhead is distorting those relationships.

If we'd been unfortunate and the overhead was distorting the results significantly, we'd want to rewrite our benchmarking to eliminate the overhead automatically. How to do that will be the topic of a future blog. (Edit: This topic has now been covered in "Monkey-patching to auto-correct custom controls".)

Now, returning to our results, I'll freely confess that I was suspicious of the speed of String::Strip, and then I noticed that its documentation specifically mentions removing only spaces. "Aha!", I thought.

So, I tweaked the input.

my $s = " \n some\n text \t ";

I was pleasantly surprised by the result.

G_OR => "some text"
TWO => "some text"
SUBSTR => "some text"
NG_OR => "some text"
CTRL => " 
 some text 	 "
CAPTURE => "some text"
STRIP => "some text"
UG_OR => "some text"

             Rate    G_OR  SUBSTR   UG_OR   NG_OR CAPTURE     TWO   STRIP    CTRL
G_OR     201332/s      --    -12%    -20%    -20%    -20%    -68%    -83%    -96%
SUBSTR   229227/s     14%      --     -9%     -9%     -9%    -63%    -80%    -95%
UG_OR    250638/s     24%      9%      --     -0%     -1%    -60%    -78%    -95%
NG_OR    250638/s     24%      9%      0%      --     -1%    -60%    -78%    -95%
CAPTURE  252560/s     25%     10%      1%      1%      --    -60%    -78%    -95%
TWO      626429/s    211%    173%    150%    150%    148%      --    -46%    -86%
STRIP   1160674/s    476%    406%    363%    363%    360%     85%      --    -75%
CTRL    4593969/s   2182%   1904%   1733%   1733%   1719%    633%    296%      --

It seems String::Strip isn't just faster than its own documentation claims, it also trims all white-space types, not just the literal spaces it professes to handle.

Let's round up our conclusions for today:

  • Establishing a control helps you account for the overhead in setting up your benchmark. In its crudest form it lets you use a rule-of-thumb to estimate whether it's having an influence that you can dismiss, or whether you need to get more sophisticated.

  • Aaron's method shows that there's more than one way to approach the problem: you may have more options than are obvious if you've become stuck in the mentality that you must use s///.

  • Lars shows that CPAN is your friend. It's always worth a look there: even if someone hasn't solved your problem exactly, you'll probably find something worth stealing, er, being inspired by.

  • Module documentation is unreliable: don't believe what the author tells you, test it. Module authors can understate the benefits of their module just as readily as they can overstate them. What may have held true when they wrote the documentation, may no longer be true. What holds true on their machine may not hold true for yours.

In my next blog we'll return to the performance discrepancy that Aaron noticed and subject it to deeper analysis, showing that your choice of input can vastly influence the results of your benchmarks and the conclusions you draw from them.

We'll also cover how to avoid that trap.

This blog entry is part of the series: Explaining Benchmark Analysis.

  1. Some simple "white-space trim" benchmarks
  2. Advanced Benchmark Analysis I: Yet more white-space trimming
  3. Advanced Benchmark Analysis II: Probing strengths and weaknesses

© 2009-2013 Sam Graham, unless otherwise noted. All rights reserved.