Unicode/charnames.pm and regular expressions

James Coupe james at zephyr.org.uk
Sun Mar 27 17:04:00 BST 2011


On 27 March 2011 15:50, Peter Corlett <abuse at cabal.org.uk> wrote:

> Hi,
>
> I've found this rather odd interaction between charnames.pm and regular
> expressions. I discovered this when I wanted to build up a complex regex
> incrementally. I'm running Debian vendor Perl, i.e. 5.10.1.
>
> This code:
>
> perl -Mcharnames=:full -e 'my $foo = qr/\N{EM DASH}/; my $bar = qr/$foo$foo/; "whatever" =~ $bar'
>
> keels over with 'Constant(\N{EM DASH}) unknown: (possibly a missing "use
> charnames ...") in regex' when the match is attempted. I am of course using
> charnames.
> ...
> I don't think it's unreasonable for me to expect the first version to work.
> Have I tripped over an actual bug in Perl, or is there something I
> misunderstand about Perl regexes and Unicode?
>

Is that the same as this bug from 2008?

http://www.nntp.perl.org/group/perl.perl5.porters/2008/07/msg138502.html

The issue appears to be a change in how \N{ } constructs are stringified
inside a compiled regex, made to prevent the named character from turning
into a metacharacter when the pattern is interpolated.

Elsewhere in that thread, Andreas points out the change that appeared to
break it:
http://www.nntp.perl.org/group/perl.perl5.porters/2008/06/msg138241.html
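
If that is the cause, one way to sidestep it might be to keep \N{ } out of
the pattern text altogether, so nothing needs a charnames lookup when the
interpolated regex is recompiled at run time. A minimal sketch, not tested
on 5.10.1; the variable names ($dash, $foo_hex, $bar_hex, $text) are mine
and purely illustrative:

```perl
use strict;
use warnings;
use charnames ':full';

# Variant 1: resolve the name in a double-quoted string (done at compile
# time), then interpolate the resulting character into the pattern.
my $dash = "\N{EM DASH}";
my $foo  = qr/$dash/;
my $bar  = qr/$foo$foo/;

# Variant 2: use the code point directly (EM DASH is U+2014), which
# needs no charnames at all.
my $foo_hex = qr/\x{2014}/;
my $bar_hex = qr/$foo_hex$foo_hex/;

# A test string containing two consecutive em dashes.
my $text = "a\x{2014}\x{2014}b";
print "variant 1 matched\n" if $text =~ $bar;
print "variant 2 matched\n" if $text =~ $bar_hex;
```

Variant 1 assumes the named character is not itself a regex metacharacter
(true for EM DASH); for an arbitrary character, wrapping the interpolation
in \Q...\E would be the safer choice.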

James.

