Data::Munge - various utility functions
    use Data::Munge;

    my $re = list2re qw/f ba foo bar baz/;
    # $re = qr/bar|baz|foo|ba|f/;

    print byval { s/foo/bar/ } $text;
    # print do { my $tmp = $text; $tmp =~ s/foo/bar/; $tmp };

    foo(mapval { chomp } @lines);
    # foo(map { my $tmp = $_; chomp $tmp; $tmp } @lines);

    print replace('Apples are round, and apples are juicy.', qr/apples/i, 'oranges', 'g');
    # "oranges are round, and oranges are juicy."
    print replace('John Smith', qr/(\w+)\s+(\w+)/, '$2, $1');
    # "Smith, John"

    my $trimmed = trim " a b c ";
    # "a b c"

    my $x = 'bar';
    if (elem $x, [qw(foo bar baz)]) { ... }
    # executes: $x is an element of the arrayref

    my $contents = slurp $fh;  # or: slurp *STDIN
    # reads all data from a filehandle into a scalar

    eval_string('print "hello world\\n"');  # says hello
    eval_string('die');                     # dies
    eval_string('{');                       # throws a syntax error

    my $fac = rec {
        my ($rec, $n) = @_;
        $n < 2 ? 1 : $n * $rec->($n - 1)
    };
    print $fac->(5);  # 120

    if ("hello, world!" =~ /(\w+), (\w+)/) {
        my @captured = submatches;
        # @captured = ("hello", "world")
    }
This module defines a few generally useful utility functions. I got tired of redefining or working around them, so I wrote this module.
list2re converts a list of strings to a regex that matches any of the strings. It is especially useful in combination with keys. Example:
    my $re = list2re keys %hash;
    $str =~ s/($re)/$hash{$1}/g;
This function takes special care to get several edge cases right:
Empty list: An empty argument list results in a regex that doesn't match anything.
Empty string: An argument list consisting of a single empty string results in a regex that matches the empty string (and nothing else).
Prefixes: The input strings are sorted by descending length to ensure longer matches are tried before shorter matches. Otherwise list2re('ab', 'abcd') would generate qr/ab|abcd/, which (on its own) can never match abcd (because ab is tried first, and it always succeeds where abcd could).
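The three edge cases above can be reproduced with core Perl alone. The following is a minimal sketch, assuming a hypothetical helper named list2re_sketch; it illustrates the described behavior and is not the module's actual implementation.

```perl
use strict;
use warnings;

# Hypothetical re-implementation for illustration; not the module's code.
sub list2re_sketch {
    my @strings = @_;

    # Empty list: (?!) is a negative lookahead that always fails.
    return qr/(?!)/ if !@strings;

    # Longest strings first, so 'abcd' is tried before its prefix 'ab'.
    # Ties are broken alphabetically to keep the result deterministic.
    my $alternation = join '|',
        map  { quotemeta($_) }
        sort { length($b) <=> length($a) || $a cmp $b } @strings;

    # The (?:...) group keeps the pattern well-formed for a single empty string.
    return qr/(?:$alternation)/;
}

my $re = list2re_sketch('ab', 'abcd');
print "matched: $1\n" if 'abcd' =~ /($re)/;   # matched: abcd
```

Escaping each string with quotemeta means the inputs are matched literally, which is what makes the result safe to build from hash keys.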
byval takes a code block and a value, runs the block with $_ set to that value, and returns the final value of $_. The global value of $_ is not affected. $_ isn't aliased to the input value either, so modifying $_ in the block will not affect the passed-in value. Example:
    foo(byval { s/!/?/g } $str);
    # Calls foo() with the value of $str, but all '!' have been replaced by '?'.
    # $str itself is not modified.
Since perl 5.14 you can also use the /r flag:
foo($str =~ s/!/?/gr);
But byval works on all versions of perl and is not limited to s///.
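To make the copy semantics concrete, here is a minimal sketch of a byval-like function built from core Perl. The name byval_sketch is hypothetical and the body is only one possible way to get the described behavior. The usage lines show an operator other than s/// (here tr///) being applied by value.

```perl
use strict;
use warnings;

# Hypothetical stand-in for byval, for illustration only.
sub byval_sketch(&$) {
    my ($block, $value) = @_;   # assigning to a lexical copies the caller's value
    for ($value) {              # foreach localizes $_ and aliases it to the copy
        $block->();
    }
    return $value;              # the block's own return value is ignored
}

my $str   = 'Hi! Bye!';
my $upper = byval_sketch { tr/a-z/A-Z/ } $str;   # tr/// returns a count, not the string
print "$upper\n";   # HI! BYE!
print "$str\n";     # Hi! Bye!
```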
mapval works like a combination of map and byval; i.e. it behaves like map, but $_ is a copy, not aliased to the current element, and the return value is taken from $_ again (it ignores the value returned by the block). Example:
    my @foo = mapval { chomp } @bar;
    # @foo contains a copy of @bar where all elements have been chomp'd.
    # This could also be written as chomp(my @foo = @bar); but that's not
    # always possible.
submatches returns a list of the strings captured by the last successful pattern match. Normally you don't need this function because this is exactly what m// returns in list context. However, submatches also works in other contexts, such as the RHS of s//.../e.
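The s//.../e case is where a helper like this is most useful, because the replacement expression has no list of captures to work with. The sketch below assumes a hypothetical function named submatches_sketch that collects $1, $2, ... through symbolic references; it illustrates the idea and may differ from how the module does it.

```perl
use strict;
use warnings;

# Hypothetical stand-in for submatches, for illustration only.
sub submatches_sketch {
    no strict 'refs';
    my @captures;
    # $#+ is the number of capture groups in the last successful match;
    # ${1}, ${2}, ... resolve symbolically to $1, $2, ...
    push @captures, ${$_} for 1 .. $#+;
    return @captures;
}

my $date = '2023-01-15';
$date =~ s{(\d+)-(\d+)-(\d+)}{ join '/', reverse submatches_sketch() }e;
print "$date\n";   # 15/01/2023
```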
replace is a clone of JavaScript's String.prototype.replace, called as replace(STRING, REGEX, REPLACEMENT) or replace(STRING, REGEX, REPLACEMENT, FLAG). It works almost the same as byval { s/REGEX/REPLACEMENT/FLAG } STRING, but with a few important differences. REGEX can be a string or a compiled qr// object. REPLACEMENT can be a string or a subroutine reference. If it's a string, it can contain the following replacement patterns:
$$: Inserts a '$'.
$&: Inserts the matched substring.
$`: Inserts the substring preceding the match.
$': Inserts the substring following the match.
$N: Inserts the substring matched by the Nth capturing group (for example, $1 or $2).
Note that these aren't variables; they're character sequences interpreted by replace.
If REPLACEMENT is a subroutine reference, it's called with the following arguments: First the matched substring (like $& above), then the contents of the capture buffers (as returned by submatches), then the offset where the pattern matched (like $-[0], see "@-" in perlvar), then the STRING. The return value will be inserted in place of the matched substring.
Normally only the first occurrence of REGEX is replaced. If FLAG is present, it must be 'g' and causes all occurrences to be replaced.
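The callback form is easiest to follow with the argument order written out. The following is a simplified sketch that handles only a subroutine-reference REPLACEMENT; the name replace_cb_sketch is hypothetical, and string replacement patterns are deliberately left out.

```perl
use strict;
use warnings;

# Hypothetical, simplified stand-in: supports only a code-reference REPLACEMENT.
sub replace_cb_sketch {
    my ($string, $regex, $callback, $flag) = @_;

    # Gathers the callback arguments from the match that s/// just made.
    my $call = sub {
        no strict 'refs';
        my @groups;
        push @groups, ${$_} for 1 .. $#+;              # contents of $1, $2, ...
        return $callback->($&, @groups, $-[0], $string);
    };

    my $result = $string;                               # leave the input untouched
    if (defined $flag && $flag eq 'g') {
        $result =~ s/$regex/$call->()/ge;               # replace every occurrence
    }
    else {
        $result =~ s/$regex/$call->()/e;                # replace the first occurrence only
    }
    return $result;
}

my $out = replace_cb_sketch(
    'width=3, height=4',
    qr/(\w+)=(\d+)/,
    sub {
        my ($match, $key, $value, $offset, $whole) = @_;
        return "$key=" . $value * 2;
    },
    'g',
);
print "$out\n";   # width=6, height=8
```

Without the 'g' flag the same call would rewrite only width=3 and leave height=4 as it is.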
trim returns STRING with all leading and trailing whitespace removed. Like length, it returns undef if the input is undef.
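A minimal sketch of the same behavior in core Perl, including the undef pass-through, might look like this. The name trim_sketch is hypothetical and the two substitutions are just one common way to write it.

```perl
use strict;
use warnings;

# Hypothetical stand-in for trim, for illustration only.
sub trim_sketch {
    my ($string) = @_;
    return undef if !defined $string;   # undef in, undef out
    $string =~ s/^\s+//;                # drop leading whitespace
    $string =~ s/\s+\z//;               # drop trailing whitespace, including newlines
    return $string;
}

print '[', trim_sketch("  a b c \n"), "]\n";   # [a b c]
```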
elem returns a boolean value telling you whether SCALAR is an element of ARRAYREF. Two scalars are considered equal if they're both undef, if they're both references to the same thing, or if they're both not references and eq to each other.
This is implemented as a linear search through ARRAYREF that terminates early if a match is found (i.e. elem 'A', ['A', 1 .. 9999] won't even look at elements 1 .. 9999).
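The three equality rules and the early exit can be sketched as follows. The name elem_sketch is hypothetical, and comparing references with refaddr from Scalar::Util is an assumption made for this illustration rather than a statement about the module's code.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

# Hypothetical stand-in for elem, for illustration only.
sub elem_sketch {
    my ($needle, $haystack) = @_;
    for my $item (@$haystack) {
        if (!defined $needle) {
            return 1 if !defined $item;                  # undef equals undef
        }
        elsif (ref $needle) {
            return 1 if ref $item
                && refaddr($item) == refaddr($needle);   # same referent
        }
        else {
            return 1 if defined $item && !ref $item
                && $item eq $needle;                     # plain string equality
        }
    }
    return '';                                           # no element matched
}

my $shared = [1, 2];
printf "%s\n", elem_sketch('bar', [qw(foo bar baz)]) ? 'yes' : 'no';   # yes
printf "%s\n", elem_sketch($shared, [$shared])       ? 'yes' : 'no';   # yes
printf "%s\n", elem_sketch([1, 2], [$shared])        ? 'yes' : 'no';   # no
```

The last call prints "no" because the two array references have equal contents but point to different arrays.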
eval_string evals STRING just like eval but doesn't catch exceptions. Caveat: Unlike with eval, the code runs in an empty lexical scope:
    my $foo = "Hello, world!\n";
    eval_string 'print $foo';
    # Dies: Global symbol "$foo" requires explicit package name
That is, the eval'd code can't see variables from the scope of the eval_string call.
slurp reads and returns all remaining data from FILEHANDLE as a string, or undef if it hits end-of-file. (Interaction with non-blocking filehandles is currently not well defined.)
slurp $handle is equivalent to do { local $/; scalar readline $handle }.
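The equivalence above can be tried without touching the filesystem by opening an in-memory filehandle. In this sketch the wrapper name slurp_sketch and the sample data are assumptions for illustration.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the equivalent expression given above.
sub slurp_sketch {
    my ($fh) = @_;
    local $/;                     # undefined record separator: read to end-of-file
    return scalar readline $fh;
}

my $data = "line 1\nline 2\n";
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

my $contents = slurp_sketch($fh);
print length($contents), "\n";   # 14
```

Because $/ is localized inside the function, the caller's line-by-line reading behavior is restored as soon as the call returns.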
rec creates an anonymous sub as sub BLOCK would, but supplies the called sub with an extra argument that can be used to recurse:
    my $code = rec {
        my ($rec, $n) = @_;
        $rec->($n - 1) if $n > 0;
        print $n, "\n";
    };
    $code->(4);
That is, when the sub is called, an implicit first argument is passed in $_[0] (all normal arguments are moved one up). This first argument is a reference to the sub itself. This reference could be used to recurse directly or to register the sub as a handler in an event system, for example.
A note on defining recursive anonymous functions: Doing this right is more complicated than it may at first appear. The most straightforward solution using a lexical variable and a closure leaks memory because it creates a reference cycle. Starting with perl 5.16 there is a __SUB__ constant that is equivalent to $rec above, and this is indeed what this module uses (if available).
However, this module works even on older perls by falling back to either weak references (if available) or a "fake recursion" scheme that dynamically instantiates a new sub for each call instead of creating a cycle. This last resort is slower than weak references but works everywhere.
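The weak-reference approach mentioned above can be sketched with Scalar::Util. The name rec_sketch and the variable names are hypothetical, and this shows only the general technique, not the module's actual fallback code.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

# Hypothetical stand-in for rec using a weak self-reference.
sub rec_sketch(&) {
    my ($body) = @_;
    my $weak_self;
    # The wrapper closes over $weak_self and passes it as the implicit first argument.
    my $strong_self = $weak_self = sub { $body->($weak_self, @_) };
    weaken($weak_self);    # the closure now refers to itself only weakly: no cycle
    return $strong_self;   # the caller holds the only strong reference
}

my $fac = rec_sketch {
    my ($rec, $n) = @_;
    $n < 2 ? 1 : $n * $rec->($n - 1);
};
print $fac->(5), "\n";   # 120
```

A plain closure over a strong `$self` would keep itself alive forever; weakening the captured copy lets the sub be freed once the caller drops `$fac`.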
Lukas Mai, <l.mai at web.de>
Copyright 2009-2011, 2013-2015, 2023 Lukas Mai.
This program is free software; you can redistribute it and/or modify it under the terms of either: the GNU General Public License as published by the Free Software Foundation; or the Artistic License.
See https://dev.perl.org/licenses/ for more information.