Perl replace delimiters - regex

I have CSV text like
1,2,3,{4,5,6,7,8},9,10,100
I want to replace the delimiter of fields between {}. The text should look like:
1,2,3,{4|5|6|7|8},9,10,100
I tried perl -0777 -pe 's/\{.*?,\}/|/g'
but nothing happens. What should I do instead?

This will do as you ask. It replaces every comma that is followed by a run of non-brace characters and then a closing brace. It assumes the braces are balanced and not nested.
use strict;
use warnings;
use 5.010;
my $s = '1,2,3,{4,5,6,7,8},9,10,100';
$s =~ s/,(?=[^{}]*\})/|/g;
say $s;
Output:
1,2,3,{4|5|6|7|8},9,10,100

You can use the following regex with $1$2| replacement string:
(\{\s*|(?<!^)\G)(\d+),(?=[,0-9]*\})
Output:
1,2,3,{4|5|6|7|8},9,10,100
Sample code:
#!/usr/bin/perl
$txt = "1,2,3,{4,5,6,7,8},9,10,100";
$txt =~ s/(\{\s*|(?<!^)\G)(\d+),(?=[,0-9]*\})/$1$2|/g;
print $txt;

Here's a command line version for Perl 5.14 and greater.
perl -pe 's/([{][\d,]+[}])/$1 =~ s~,~|~gr/ge'
The /e modifier means the replacement is evaluated as a Perl expression rather than treated as an interpolated string. Here the expression takes the first capture ($1) and runs a nested substitution on it with /r, which returns the modified copy and so avoids the error you would get from trying to modify the read-only $1.
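The same nested substitution can be sketched as a small script. This is only an illustration of how /e and /r combine; the variable names $line and $fixed are made up for the example, and the braces are escaped as \{ and \} rather than written as character classes.

```perl
use strict;
use warnings;
use 5.014;    # the /r modifier needs Perl 5.14 or later

my $line = '1,2,3,{4,5,6,7,8},9,10,100';

# Outer s///e: the replacement is Perl code, run once per {...} group.
# Inner s///r: returns a modified copy of $1 instead of changing $1.
# The outer /r leaves $line untouched and returns the result.
my $fixed = $line =~ s/(\{[\d,]+\})/$1 =~ s{,}{|}gr/ger;

say $fixed;
```

The inner substitution uses {} delimiters so that it does not clash with the / delimiters of the outer one.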

You can try this:
$st = "1,2,3,{4,5,6,7,8},9,10,100";
if ( $st =~ /\{(.*)\}/ ) {
    $tr = $1;
    $tr =~ s/,/|/g;
    # Replace the whole {...} group with its rewritten contents.
    $st =~ s/\{.*\}/{$tr}/;
    print "$st\n";
}
Output:
1,2,3,{4|5|6|7|8},9,10,100

Related

How to change a pattern like XX1/XXSomething/XX1/Something to XXSomething/XX1/Something in perl

I'm having a file in which some lines have some patterns like
M1/XX2/XX1 XX2/XX1/XX2/WCLKB XX2/XX1/XX2/P001
M1/XX4/XX5 XX4/XX5/XX4/WCLKB XX4/XX5/XX4/P001
Here in some patterns XX2 is repeating. I need to change the above line to
M1/XX2/XX1 XX1/XX2/WCLKB XX1/XX2/P001
M1/XX4/XX5 XX5/XX4/WCLKB XX5/XX4/P001
These XX can vary XX[0..9]
The code is in Perl.
I tried using some regex but was confused.
open(FILE,$FilePath);
@linesInFile = <FILE>;
close(FILE);
foreach $item(@linesInFile){
if(grep(/^XX?\/XX.\/XX)
#I dont know how to complete this
}
If you're looking specifically for XXn/XXm/XXn/ (where n is the same number both times), you can use backreferences:
s{(XX[0-9]+/)(XX[0-9]+/\1)}{$2}g
Here \1 refers back to and matches the same string as the first capturing group, (XX[0-9]+/).
Live demo:
#!/usr/bin/perl
use strict;
use warnings;
while (my $line = readline DATA) {
$line =~ s{(XX[0-9]+/)(XX[0-9]+/\1)}{$2}g;
print $line;
}
__DATA__
M1/XX2/XX1 XX2/XX1/XX2/WCLKB XX2/XX1/XX2/P001
M1/XX4/XX5 XX4/XX5/XX4/WCLKB XX4/XX5/XX4/P001
Output:
M1/XX2/XX1 XX1/XX2/WCLKB XX1/XX2/P001
M1/XX4/XX5 XX5/XX4/WCLKB XX5/XX4/P001
If it's ok to blindly remove the first part:
while (<>) {
s{ \K[^\s/]+/}{}g;
print;
}
As a one-liner:
perl -pe's{ \K[^\s/]+/}{}g'
If you want to make sure it matches the pattern you specified:
while (<>) {
s{(?<!\S)(XX\d)/(?=XX[^\s/]+/\1/\S)}{}ag;
print;
}
As a one-liner:
perl -pe's{(?<!\S)(XX\d)/(?=XX[^\s/]+/\1/\S)}{}ag'
The key is \1, which means "match what the first capture captured".
Based on what you have explained in the description of your problem XX[0..9], the following perl command should do the trick:
Input:
$ cat input
M1/XX2/XX1 XX2/XX1/XX2/WCLKB XX2/XX1/XX2/P001
M1/XX4/XX5 XX4/XX5/XX4/WCLKB XX4/XX5/XX4/P001
Command:
perl -pe 's#\bXX(\d)/XX(\d)/XX\1#XX$2/XX$1#g' input
Output:
M1/XX2/XX1 XX1/XX2/WCLKB XX1/XX2/P001
M1/XX4/XX5 XX5/XX4/WCLKB XX5/XX4/P001

Extract only pattern matched text

I have written a basic program using regular expression.
However the entire line is being returned instead of the matched part.
I want to extract the number only.
use strict;
use warnings;
my $line = "ABMA 1234";
$line =~ /(\s)(\d){4}/;
print $line; #prints *ABMA 1234*
Is my regular expression incorrect?
If you want to print 1234, you need to change your regex and print the 2nd match:
use strict;
use warnings;
my $line = "ABMA 1234";
$line =~ /(\s)(\d{4})/;
print $2;
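An alternative sketch, assuming you only need the number itself: in list context a successful match returns its capture groups, so the digits can be assigned straight to a variable. The name $number is made up for this example.

```perl
use strict;
use warnings;

my $line = "ABMA 1234";

# In list context the match returns its captures; $line is not changed.
# If the match fails, $number stays undefined.
my ($number) = $line =~ /\s(\d{4})/;

print defined $number ? "$number\n" : "no match\n";
```

Checking for a defined value avoids printing a stale or empty capture when the line does not contain a four-digit number.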
You can also substitute the whole match with just the captured number. Your code only matches; it never removes the leading text from $line.
use strict;
use warnings;
my $line = "ABMA 1234";
$line=~s/([A-Za-z]*)\s+(\d+)/$2/;
print $line; #prints only 1234
If you want to store the value in the new string then
(my $newstring = $line)=~s/([A-Za-z]*)\s+(\d+)/$2/;
print $newstring; #prints only 1234
Just try this:
I don't know how you print the match in Perl, but the regex below matches only the four digits. Your current regex also matches the preceding whitespace, which may be why you get a space in front of your result.
\b[\d]{4}

Add html to perl Regex

I am trying to replace each pair of backticks (``) with an HTML code tag
replace:
$string = "Foo `FooBar` Bar";
with:
$string = "Foo <code>FooBar</code> Bar";
I tried these:
$pattern = '`(.*?)`';
my $replace = "<code/>$&</code>";
$subject =~ s/$pattern/$replace/im;
#And
$subject =~ s/$pattern/<code/>$&</code>/im;
but none of them works.
Assuming you meant $string instead of $subject...
use strict;
use warnings;
use v5.10;
my $string = "Foo `FooBar` Bar";
my $pattern = '`(.*?)`';
my $replace = "<code/>$&</code>";
$string =~ s{$pattern}{$replace}im;
say $string;
This results in...
$ perl ~/tmp/test.plx
Use of uninitialized value $& in concatenation (.) or string at /Users/schwern/tmp/test.plx line 9.
Foo <code/></code> Bar
There are some problems here. First, $& means the string matched by the last match. That would be all of `FooBar`, backticks included. You just want FooBar, which is inside the capturing parens. You get that with $1. See Extracting Matches in the Perl Regex Tutorial.
Second, $& and $1 are variables. If you put them in double quotes, as in $replace = "<code/>$&</code>", then Perl interpolates them immediately, before any match has happened. That means $replace is <code/></code>. This is where the warning comes from. If you want to use $1, it has to go directly into the replacement part of the substitution.
Finally, when storing a regex in a variable it's best to use qr{}. It applies regex quoting rules, which avoids all sorts of quoting issues.
Put it all together...
use strict;
use warnings;
use v5.10;
my $string = "Foo `FooBar` Bar";
my $pattern = qr{`(.*?)`};
$string =~ s{$pattern}{<code>$1</code>}im;
say $string;
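If a line can contain more than one backtick pair, a single substitution only handles the first. A minimal sketch, assuming that is what "replace all" means here: add the /g modifier. The sample string is made up for the illustration.

```perl
use strict;
use warnings;
use v5.10;

# Hypothetical input with two backtick pairs.
my $string  = "Use `foo` and `bar` here";
my $pattern = qr{`(.*?)`};

# /g repeats the substitution for every backtick pair in the string.
$string =~ s{$pattern}{<code>$1</code>}g;

say $string;
```

The non-greedy .*? matters once /g is involved: a greedy .* would swallow everything from the first backtick to the last one.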

Perl : How to replace a _[0-9] with a comma in perl or any language

I have a file with the following pattern
'21pro_ABCD_EDG_10800_48052_2 0.0'
How do I replace the _[0-9] with a , (comma)
so that i can get the output as
21pro_ABCD_EDG,10800,48052,2, 0.0
To replace the _[0-9] with a , you can do this:
$s =~ s/_([0-9])/,$1/g;
#the same without capturing groups
$s =~ s/_(?=[0-9])/,/g;
Edit:
To get the extra comma after the 2 you can do this:
#This puts a , before all whitespace.
$s =~ s/_(?=[0-9])|(?=\s)/,/g;
#This one puts a , between [0-9] and any whitespace
$s =~ s/_(?=[0-9])|(?<=[0-9])(?=\s)/,/g;
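To compare the variants side by side, here is a small sketch that applies the lookahead-only substitution and the digit-then-whitespace one to the sample string. The names $basic and $extra are made up for the example.

```perl
use strict;
use warnings;

my $s = '21pro_ABCD_EDG_10800_48052_2 0.0';

# An underscore that is followed by a digit becomes a comma.
(my $basic = $s) =~ s/_(?=[0-9])/,/g;

# Same, plus a comma inserted between a digit and following whitespace.
(my $extra = $s) =~ s/_(?=[0-9])|(?<=[0-9])(?=\s)/,/g;

print "$basic\n";
print "$extra\n";
```

The second alternative is a zero-width match: it consumes no characters, so the comma is inserted without removing the space.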
The sed approach would be something like the following:
rupert@hake:~ echo '21pro_ABCD_EDG_10800_48052_2 0.0' | sed 's/_\([0-9]\)/,\1/g'
21pro_ABCD_EDG,10800,48052,2 0.0
Using the expression mentioned by jacob, here is the code snippet to perform the substitution for a large file
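#!/usr/local/bin/perl
use strict;
use warnings;
# 'test' is the name of the input file.
open(my $fh, '<', 'test') or die "Cannot open test: $!";
while (my $s = <$fh>) {
    chomp $s;
    $s =~ s/_(?=[0-9])|(?<=[0-9])(?=\s)/,/g;
    $s =~ s/\s//g;    # drop the remaining whitespace
    print "$s\n";
}
close($fh);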

How do I substitute with an evaluated expression in Perl?

There's a file dummy.txt
The contents are:
9/0/2010
9/2/2010
10/11/2010
I have to change the month portion (0,2,11) to +1, ie, (1,3,12)
I wrote the substitution regex as follows
$line =~ s/\/(\d+)\//\/\1+1\//;
It is printing
9/0+1/2010
9/2+1/2010
10/11+1/2010
How do I make it add numerically (giving 3) rather than perform string concatenation (giving 2+1)?
Three changes:
You have to use the e modifier to allow an expression in the replacement part.
To make the replacement global, use the g modifier. This is not needed if you have one date per line.
Use $1 on the replacement side, not the backreference \1.
This should work:
$line =~ s{/(\d+)/}{'/'.($1+1).'/'}eg;
Also if your regex contains the delimiter you're using(/ in your case), it's better to choose a different delimiter ({} above), this way you don't have to escape the delimiter in the regex making your regex clean.
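As a quick check, the substitution can be run over the three sample dates from the question. This is only a sketch; the @lines array stands in for reading dummy.txt.

```perl
use strict;
use warnings;

# Stand-in for the lines of dummy.txt.
my @lines = ('9/0/2010', '9/2/2010', '10/11/2010');

for my $line (@lines) {
    # /e runs the replacement as code, so $1 + 1 is real arithmetic.
    # $line is an alias, so the array element itself is updated.
    $line =~ s{/(\d+)/}{'/' . ($1 + 1) . '/'}e;
    print "$line\n";
}
```

Without /e the replacement would be an ordinary interpolated string, which is why the original attempt printed 0+1 instead of 1.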
This works (the e modifier evaluates the replacement as an expression; see the perlrequick documentation):
$line = '8/10/2010';
$line =~ s!/(\d+)/!('/'.($1+1).'/')!e;
print $line;
It helps to use ! or some other character as the delimiter if your regular expression has / itself.
You can also use the interpolation trick from the question "Can Perl string interpolation perform any expression evaluation?":
$line = '8/10/2010';
$line =~ s!/(\d+)/!("/@{[$1+1]}/")!e;
print $line;
But if this is a homework question, be ready to explain how you reached this solution when the teacher asks.
How about this?
$ cat date.txt
9/0/2010
9/2/2010
10/11/2010
$ perl chdate.pl
9/1/2010
9/3/2010
10/12/2010
$ cat chdate.pl
use strict;
use warnings;
open my $fp, '<', "date.txt" or die $!;
while (<$fp>) {
chomp;
my @arr = split (/\//, $_);
my $temp = $arr[1]+1;
print "$arr[0]/$temp/$arr[2]\n";
}
close $fp;
$