escape the word in regexp

I have a string like this:
$str = '"filename","lf","$data","{ }",0';
How can I remove all " characters from the string?
I tried this regexp:
$str =~ s/"(.+?)"//s;
I expected it to match each word and remove the quotes around it.

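You can do it like this: $str =~ s/\"//g; (the /g flag applies the substitution to every double quote; the backslash is optional, since " is not special in a regex).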
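The attempt in the question deletes only the first quoted field, content included, because the pattern spans both quotes and has no /g flag. A minimal sketch comparing it with the global substitution, plus tr/"//d as an alternative the answers here do not mention (the variable names $attempt, $stripped and $via_tr are illustrative):

```perl
use strict;
use warnings;

my $str = '"filename","lf","$data","{ }",0';

# The question's attempt: deletes the first quoted field, quotes and content.
(my $attempt = $str) =~ s/"(.+?)"//s;

# Global substitution: deletes every double quote, keeps the content.
(my $stripped = $str) =~ s/"//g;

# tr with /d deletes a single character class without using the regex engine.
(my $via_tr = $str) =~ tr/"//d;

print "$attempt\n$stripped\n$via_tr\n";
```

The (my $copy = $str) =~ ... idiom is used only so that each variant starts from the same unmodified input.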

Your $str looks like you're dealing with a CSV file. Saddam's answer will work for most cases of course, but if you're really working with a .csv file, then I suggest that you use an actual parser like Text::CSV. That way, if there are commas embedded in your double-quoted values, they'll be handled properly:
use Text::CSV;
use strict;
use warnings;
my $csv = Text::CSV->new();
my $str = '"filename","lf","$data","{ }",0';
$csv->parse($str);
my @columns = $csv->fields();
use Data::Dump;
dd \@columns;

Related

Perl: regex for conditional replace?

in this string
ab<(CN)cdXYlm<(CI)efgXYop<(CN)zXYklmn<(CI)efgXYuvw<
I want to replace each substring between XY and < by either ONE or TWO depending on characters between previous brackets:
if XY after (CN) replace substring by ONE
if XY after (CI) replace substring by TWO
So the result should be:
ab<(CN)cdONE<(CI)efgTWO<(CN)zONE<(CI)efgTWO<
XY and following characters should be replaced but not angle bracket <.
This is for modifying HTML and arbitrary characters can occur between XY and <.
I guess I need two regex for (CN) and (CI).
# This one replaces just all XY:
my $s = 'ab<(CN)cdXYlm<(CI)efgXYop<(CN)zXYklmn<(CI)efgXYuvw<';
$s =~ s/(XY(.*?))</ONE/g;
# But how to add the conditions to the regex?

You don't need two regexes. Capture the C[NI] and retrieve the corresponding replacement value from a hash:
#!/usr/bin/perl
use warnings;
use strict;
my $s = 'ab<(CN)cdXYlm<(CI)efgXYop<(CN)zXYklmn<(CI)efgXYuvw<';
my %replace = (CN => 'ONE', CI => 'TWO');
$s =~ s/(\((C[NI])\).*?)XY.*?</$1$replace{$2}</g;
my $exp = 'ab<(CN)cdONE<(CI)efgTWO<(CN)zONE<(CI)efgTWO<';
use Test::More tests => 1;
is $s, $exp;

My guess is that this expression or maybe a modified version of that might work, not sure though:
([a-z]{2}<\([A-Z]{2}\)[a-z]{2})([^<]+)(<\([A-Z]{2}\)[a-z]{3})([^<]+)(<\([A-Z]{2}\)[a-z])([^<]+)(<\([A-Z]{2}\)[a-z]{3})([^<]+)<
Test
use strict;
use warnings;
my $str = 'ab<(CN)cdXYlm<(CI)efgXYop<(CN)zXYklmn<(CI)efgXYuvw<';
my $regex = qr/([a-z]{2}<\([A-Z]{2}\)[a-z]{2})([^<]+)(<\([A-Z]{2}\)[a-z]{3})([^<]+)(<\([A-Z]{2}\)[a-z])([^<]+)(<\([A-Z]{2}\)[a-z]{3})([^<]+)</mp;
my $subst = '"$1ONE$3TWO$5ONE$7TWO<"';
my $result = $str =~ s/$regex/$subst/rgee;
print $result;
The expression is explained on the top right panel of this demo, if you wish to explore/simplify/modify it, and in this link, you can watch how it would match against some sample inputs step by step, if you like.

This can be done with a single substitution, using the /e modifier and the ternary operator ?: in the replacement part.
The /r modifier returns the resulting string and leaves the original string $s unmodified.
use strict;
use warnings;
my $s = 'ab<(CN)cdXYlm<(CI)efgXYop<(CN)zXYklmn<(CI)efgXYuvw<';
print(($s =~ s/\(([^)]+)\)([^(]+)XY[^(]+</"($1)$2" . (($1 eq 'CN') ? 'ONE' : 'TWO') . "<"/gre) . "\n");
Output:
ab<(CN)cdONE<(CI)efgTWO<(CN)zONE<(CI)efgTWO<

Extract only pattern matched text

I have written a basic program using regular expression.
However the entire line is being returned instead of the matched part.
I want to extract the number only.
use strict;
use warnings;
my $line = "ABMA 1234";
$line =~ /(\s)(\d){4}/;
print $line; #prints *ABMA 1234*
Is my regular expression incorrect?

If you want to print 1234, you need to move the quantifier inside the capture group (as written, (\d){4} captures only the last digit) and print the second capture, $2:
use strict;
use warnings;
my $line = "ABMA 1234";
$line =~ /(\s)(\d{4})/;
print $2;
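A match in list context returns its capture groups, so the number can also be assigned directly instead of reading $2 afterwards. A small sketch, with $number as an illustrative name and only the digits captured:

```perl
use strict;
use warnings;

my $line = "ABMA 1234";

# In list context the match returns its captures; only the digits are grouped here.
my ($number) = $line =~ /\s(\d{4})/;

# $number is undef when the pattern does not match, so check before using it.
print defined $number ? "$number\n" : "no match\n";
```

Checking the result this way avoids printing a stale $1 or $2 left over from an earlier successful match.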

You can use a substitution to replace the whole match with the captured digits. Your code only matches; it never removes the leading text:
use strict;
use warnings;
my $line = "ABMA 1234";
$line =~ s/([A-Za-z]*)\s+(\d+)/$2/;
print $line; #prints only 1234
If you want to store the value in a new string instead:
(my $newstring = $line) =~ s/([A-Za-z]*)\s+(\d+)/$2/;
print $newstring; #prints only 1234

I don't know how you print the match in Perl, but with the regex below the full match is just the four digits. With your current regex, the result probably has a leading space, because \s is part of the match.
\b[\d]{4}

Perl regex return matches from substitution

I am trying to simultaneously remove and store (into an array) all matches of some regex in a string.
To return matches from a string into an array, you could use
my @matches = $string =~ /$pattern/g;
I would like to use a similar pattern for a substitution regex. Of course, one option is:
my @matches = $string =~ /$pattern/g;
$string =~ s/$pattern//g;
But is there really no way to do this without running the regex engine over the full string twice? Something like
my @matches = $string =~ s/$pattern//g
Except that this returns only the number of substitutions, regardless of list context. I would also take, as a consolation prize, a way to use qr// where I could simply turn the quoted regex into a substitution regex, but I don't know if that's possible either (and that wouldn't avoid searching the same string twice).

Perhaps the following will be helpful:
use warnings;
use strict;
my $string = 'I thistle thing am thinking this Thistle a changed thirsty string.';
my $pattern = '\b[Tt]hi\S+\b';
my @matches;
$string =~ s/($pattern)/push @matches, $1; ''/ge;
print "New string: $string; Removed: @matches\n";
Output:
New string: I   am    a changed  string.; Removed: thistle thing thinking this Thistle thirsty
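The push-inside-/e technique can be wrapped in a small helper. The sketch below assumes a hypothetical sub named extract_matches (not from any module) and a made-up input string. It also assumes the pattern has no capture groups of its own, since the helper wraps it in group 1:

```perl
use strict;
use warnings;

# Hypothetical helper: removes every match of $pattern from the string behind
# $string_ref and returns the removed pieces, in one pass over the string.
# Assumes $pattern contains no capture groups of its own.
sub extract_matches {
    my ($string_ref, $pattern) = @_;
    my @removed;
    $$string_ref =~ s/($pattern)/push @removed, $1; ''/ge;
    return @removed;
}

my $text    = 'id=1 name=foo id=22 done';
my @removed = extract_matches(\$text, qr/id=\d+\s*/);

print "Removed: ", join('|', @removed), "\n";
print "Left: $text\n";
```

Passing a reference lets the helper modify the caller's string in place while still returning the list of removed matches.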

Here is another way to do it without executing Perl code inside the substitution. The trick is that s/// without /g removes one match per call and returns false when nothing matches, which ends the while loop.
use strict;
use warnings;
use Data::Dump;
my $string = "The example Kenosis came up with was way better than mine.";
my @matches;
push @matches, $1 while $string =~ s/(\b\w{4}\b)\s//;
dd @matches, $string;
__END__
(
"came",
"with",
"than",
"The example Kenosis up was way better mine.",
)

How to pull out every comma, hyphen, underscore, space, and joining remaining words without spaces?

I guess this would be a rather long regular expression, but is there a way to take out underscores, spaces, commas, and hyphens from a string and then join the words together in Perl?
Example:
_Car - Eat, Tree
Becomes:
CarEatTree

You can use a simple substitution:
$string =~ s/[_ ,-]//g;

This can also be done without regular expressions, using the transliteration operator tr///:
use warnings;
use strict;
my $s = '_Car - Eat, Tree';
$s =~ tr/_ ,\-//d;
print "$s\n";
__END__
CarEatTree
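tr/// also returns the number of characters that matched its search list, which can be handy for reporting how much was stripped. A brief sketch, with $deleted as an illustrative name:

```perl
use strict;
use warnings;

my $s = '_Car - Eat, Tree';

# tr returns how many characters matched the search list (here, how many were deleted).
my $deleted = ($s =~ tr/_ ,\-//d);

print "$s ($deleted characters removed)\n";
```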

If you're looking to strip any punctuation, you can always use s/[[:punct:]]//g

search for [_, -] and replace with the empty string ""
$str = "_Car - Eat, Tree";
$str =~ s/[_, -]//g;

my $str = '_Car - Eat, Tree';
$str =~ s/[\_\-\,\s]*//g;

Using the transliteration operator with the (d)elete and (c)omplement modifiers:
#!/usr/bin/perl
use strict;
use warnings;
use 5.012;
my $str = '_Car - Eat, Tree';
$str =~ tr/a-zA-Z//cd;
print $str;
__END__
C:\Old_Data\perlp>perl t6.pl
CarEatTree

How can I find all matches to a regular expression in Perl?

I have text in the form:
Name=Value1
Name=Value2
Name=Value3
Using Perl, I would like to match /Name=(.+?)/ every time it appears and extract the (.+?) and push it onto an array. I know I can use $1 to get the text I need and I can use =~ to perform the regex matching, but I don't know how to get all matches.

A m//g in list context returns all the captured matches.
#!/usr/bin/perl
use strict; use warnings;
my $str = <<EO_STR;
Name=Value1
Name=Value2
Name=Value3
EO_STR
my @matches = $str =~ /=(\w+)/g;
# or my @matches = $str =~ /=([^\n]+)/g;
# or my @matches = $str =~ /=(.+)$/mg;
# depending on what you want to capture
print "@matches\n";
However, it looks like you are parsing an INI style configuration file. In that case, I will recommend Config::Std.
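When the pattern has more than one capture group, m//g in list context returns all captures as one flat list. A minimal sketch, assuming we also want to capture the key, with @flat and @values as illustrative names:

```perl
use strict;
use warnings;

my $str = "Name=Value1\nName=Value2\nName=Value3\n";

# With two capture groups, m//g in list context returns a flat list:
# (key, value, key, value, ...).
my @flat = $str =~ /^(\w+)=(\w+)$/mg;

# Keep the elements at odd indexes to get just the values.
my @values = @flat[ grep { $_ % 2 } 0 .. $#flat ];

print "@values\n";
```

Assigning the flat list straight to a hash would keep only the last value here, because every line uses the same key.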

my @values;
while(<DATA>){
chomp;
push @values, /Name=(.+?)$/;
}
print join " " => @values,"\n";
__DATA__
Name=Value1
Name=Value2
Name=Value3

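The following will collect all the captured values in an array. The /m modifier is needed so that $ matches at the end of every line, and $1 holds the capture without the leading =.
push(@matches, $1) while ($string =~ /=(.+)$/mg);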

Use a Config:: module to read configuration data. For something simple like that, I might reach for ConfigReader::Simple. It's nice to stay out of the weeds whenever you can.

Instead of using a regular expression you might prefer trying a grammar engine like:
Parse::RecDescent
Regexp::Grammars
I've given a snippet of a Parse::RecDescent answer before on SO. However, Regexp::Grammars looks very interesting and is influenced by Perl6 rules & grammars.
So I thought I'd have a crack at Regexp::Grammars ;-)
use strict;
use warnings;
use 5.010;
my $text = q{
Name=Value1
Name = Value2
Name=Value3
};
my $grammar = do {
use Regexp::Grammars;
qr{
<[VariableDeclare]>*
<rule: VariableDeclare>
<Var> \= <Value>
<token: Var> Name
<rule: Value> <MATCH= ([\w]+) >
}xms;
};
if ( $text =~ $grammar ) {
my @Name_values = map { $_->{Value} } @{ $/{VariableDeclare} };
say "@Name_values";
}
The above code outputs Value1 Value2 Value3.
Very nice! The only caveat is that it requires Perl 5.10 and that it may be overkill for the example you provided ;-)
/I3az/
