Sam's Blog entries for April 2010

Did you mean +, not *, in that regexp?

Date: Wednesday, 28 April 2010, 13:36.

Categories: perl, ironman, regexp, craft, basic, tutorial.

Continuing from my previous article "Anchoring Regexps", another common regexp mistake I see is the use of * where the author really meant +.

So today I cover + and *: what's the difference and why does it matter?
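As a rough illustration of the difference, here is a minimal sketch. The "id=" input format and the capture_star/capture_plus helper names are made up for this example and are not taken from the article itself.

```perl
use strict;
use warnings;

# Hypothetical helpers, for illustration only.

# \d* means "zero or more digits", so an empty value still matches.
sub capture_star {
    my ( $text ) = @_;
    return $text =~ /^id=(\d*)$/ ? $1 : undef;
}

# \d+ means "one or more digits", so an empty value is rejected.
sub capture_plus {
    my ( $text ) = @_;
    return $text =~ /^id=(\d+)$/ ? $1 : undef;
}

for my $input ( 'id=42', 'id=', 'name=sam' ) {
    my $star = capture_star( $input );
    my $plus = capture_plus( $input );
    printf "%-10s *: %-12s +: %s\n",
        $input,
        defined $star ? "matched '$star'" : 'no match',
        defined $plus ? "matched '$plus'" : 'no match';
}
```

With 'id=' the * version succeeds with an empty capture while the + version fails, which is usually the behaviour the author wanted in the first place.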

Mangled Ironman Feed?

Date: Sunday, 25 April 2010, 20:43.

Categories: perl, ironman.

Is it just me or has the ironman feed started mangling some people's posts?

For example, this is supposed to be the second paragraph, not mangled into the first. It also seems to be stripping URLs from those people's posts.

It used to work fine. I hope it's just a temporary artefact of the new ironman system, because it's horrible to have your nicely formatted blog intro mangled into an unreadable wall of text.

Unfortunately I can't tell: there's no list of known issues or anything else about the ironman feed on the ironman site.

Anchoring Regexps

Date: Thursday, 22 April 2010, 15:14.

Categories: perl, ironman, regexp, craft, optimization, basic, tutorial.

A common mistake I find whenever I look at someone else's regexps is a failure to anchor the regexp.

Anchoring is often, in my experience, the single biggest thing you can do to improve the performance of a regexp. It's one of those things you should learn to do in every regexp where applicable, which should be almost every regexp unless you're specifically looking for "something somewhere in the middle but I don't know where".

So, what is anchoring, and why does it have such a big impact?
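As a minimal sketch of what anchoring looks like, assuming a made-up "items" input and hypothetical helper names (none of this comes from the article itself): an unanchored pattern may be tried at one start position after another, whereas a pattern anchored with ^ can give up after trying only the start of the string. How much time that saves depends on the pattern and on the regexp engine's own optimisations, so it is worth measuring rather than assuming.

```perl
use strict;
use warnings;

# Hypothetical helpers, for illustration only.

# Unanchored: matches if the pattern occurs anywhere in the text.
sub matches_anywhere {
    my ( $text ) = @_;
    return $text =~ /\d+ items/ ? 1 : 0;
}

# Anchored at the start: ^ only lets the match begin at position 0.
sub matches_at_start {
    my ( $text ) = @_;
    return $text =~ /^\d+ items/ ? 1 : 0;
}

# Anchored at both ends: \z pins the match to the very end of the text.
sub matches_whole {
    my ( $text ) = @_;
    return $text =~ /^\d+ items\z/ ? 1 : 0;
}

for my $text ( '12 items', '12 items shipped', 'note: 12 items shipped' ) {
    printf "%-24s anywhere=%d start=%d whole=%d\n",
        "'$text'",
        matches_anywhere( $text ),
        matches_at_start( $text ),
        matches_whole( $text );
}
```

Anchoring changes what the pattern accepts as well as how quickly it fails, so it only applies when you really do know where in the string the match has to sit.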

Text::Matrix.pm Released

Date: Wednesday, 14 April 2010, 17:31.

Categories: perl, ironman, text-matrix, template-benchmark, qa.

Even with twenty thousand distributions on CPAN, a figure that should truly boggle the mind, I'm still often surprised to find myself trying to do something reasonably basic that hasn't been covered already.

While writing Template::Benchmark I wanted to lay out a "feature matrix", a matrix of template engines and the features they supported: a simple grid of Y/N characters in even, regular spacing.

To my surprise none of the CPAN table modules covered this: they'd all force the layout to depend on the width of the column labels or try to force the column labels to wrap at single-character width. So I hacked together some ugly code myself and got on with writing the rest of Template::Benchmark.

As with all ugly one-off code though, you find yourself wanting to use it elsewhere and constrained by how un-reusable it is.

Well, I did what I should have done in the first place and turned it into Text::Matrix - Text table layout for matrices of short regular data.

Only a beta 0.99_01 release, but should be hitting a CPAN mirror near you shortly if it isn't there already.

See below the cut for example output and more details.

As part of my development environment for Template::Sandbox, I maintain a suite of regression benchmarks, using Template::Benchmark against all previous versions of the distribution.

While a crude tool, it's something I find useful, and I thought I'd use this week's column to share how I automated away as much of the pain as I could.

© 2009-2013 Sam Graham, unless otherwise noted. All rights reserved.