Regexes For Life

Archive for the ‘Open questions’ Category

Obscurantism

without comments

Seinfeld (the sitcom) was filled to the brim with self-reflexive scenarios. On a number of occasions, Costanza and Seinfeld engage in a conversation on their conversation- but the one occasion this was particularly striking  was when the two brainstorm to find an idea for the sitcom Jerry. (The irony.) The series nested itself in a self-deprecating public display of self-awareness, exemplified by the instance where Costanza utters in a moment of ennui: “This, this is what the show’s about! The show, is about nothing.”- breaking the fourth wall in the process (and, some might add, exposing it to be the tripe it is).

From mathworld, a corky, off-beat answer to a question from the Google Labs Aptitude Test:

19. ‘Tis known in refined company, that choosing K things out of N can be done in ways as many as choosing N minus K from N: I pick K, you the remaining.

Find though a cooler bijection, where you show a knack uncanny, of making your choices contain all K of mine.  Oh, for pedantry: let K be no more than half N.

Answer: ‘Tis more problematic to disentangle semantic meaning precise from the this paragraph of verbiage peculiar.

At this point, some exposition might be in order. What can a satirical in-joke from a TV series have in common with a well known binomial coefficient rule (and sentence construction that would spoof a recursive transition network)?

What, indeed.

Read the rest of this entry »

Written by Karthik

August 26, 2008 at 5:59 pm

Posted in Open questions

Tagged with , ,

Recursive Regexes

without comments

…have been troubling me for days now.

The Regex engine maintains a stack, so regular expressions in Perl can be constructed recursively. For instance,

$paren = qr/ \( ( [^()]+ # Not parens | (??{$paren}) # Another balanced group )* \) /x;

matches a string if parentheses in the string are paired correctly. (Taken from Regexp Power.)

That funny (??{}) (apparently) keeps qr/ / from interpolating the variable in the funny brackets when the expression is compiled- saving the interpolation for when the matching is carried out, which sends it recursing deeper.

So here’s the question:

A palindrome is a nested set of palindromes. That is, a [ palindrome - {first character, last character} ] is a palindrome again, and so forth until there is (i) nothing, or (ii) one character left. Cases (i) and (ii) are palindromes by definition. So how do you write a regex that matches any palindrome?

The best I could do was:

$palin = qr/ (.)? #Any character or no character, captured (??{$palin})? #A palindrome, not interpolated yet \1 #Character captured in line 1 /x;

But this doesn’t work. Any ideas?

Written by Karthik

August 18, 2008 at 5:14 pm

Posted in Open questions

Tagged with , ,