So Sed Perl

Perl is supposedly shell scripting on steroids: a Swiss Army chainsaw of sorts, they say.

All very exciting, yes, but despite having written a couple of reasonably large text-processing scripts (~50 lines) in Perl, I could never get it to match the brevity of Sed, the non-interactive text editor that is known to cause searing pain in the temples (and assault you with garbled line noise). Sed, incorporating the power of regular expressions, allows for fantastically obscure edits in text files with just a few characters of input. The “conditional disemvowelment” Sed command, which strips the vowels from every line that begins with a lowercase letter,

$ sed -e '/^[a-z]/ s_[aeiouAEIOU]__g' testfile.txt

can be emulated with Perl, but it’s a nearly ten-line script that’s almost impossible to type correctly at the command line in one attempt:

$ perl -e ' open TEST, "testfile.txt"; while (<TEST>) { if (m/^[a-z]/){ s/[aeiouAEIOU]//g; } print; }'

Painful. A few more hacks bring the number of lines down to five (while keeping the script readable), but it still can’t hold a candle to Sed, which is somewhat disappointing given the much-touted claim that Perl makes sed/awk/sh obsolete. And then, after six months of using Perl, I read the manpage.

Apparently, there is more than one way to do this.

  • The -p and -n flags wrap the Perl code given on the command line in a loop over the lines of the file passed to it as an argument:

$ perl -n -e '#yourscript' testfile.txt

is roughly equivalent to

$ perl -e ' open TEST,"testfile.txt"; while (<TEST>){ #yourscript }'

  • Even better,

$ perl -p -e '#yourscript' testfile.txt

expands to

$ perl -e ' open TEST, "testfile.txt"; while (<TEST>){ #yourscript } continue { print; }'

  • There’s more: the -a flag, used together with -n or -p, splits $_ on whitespace, and -F/regexp/ makes it split on /regexp/ instead; either way, the result of the split is stored in @F. With -n and -a, -F/regexp/ is equivalent to

$ perl -e ' while (<>){ @F = split(/regexp/); #yourscript }'

And so Perl does emulate Sed after all. The conditional disemvowelment reduces to:

$ perl -p -e 's/[aeiouAEIOU]//g if (m/^[a-z]/);' testfile.txt

A few more neat constructions:

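  • Mass renaming: rename files matching in-regexp to out-regexp. Note that $& holds only the matched text, so in-regexp should match the whole file name, and that -p also passes unmatched names through to sh: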

$ ls | perl -p -e 's/in-regexp/mv "$&" "out-regexp"/' | sh

  • Emulating grep:

$ perl -n -e 'print if (m/regexp/);' testfile.txt

More info here and here.
