Regex works in terminal but not in Perl script - regex

This is related to this question
Where combination of all lines that end with a backslash character is done. The following Perl command works in terminal but not in Perl script:
perl -i -p -e 's/\\\n//' input_file
In Perl script, I set it this way but it does not work:
`perl -i -p -e 's/\\\n//' $input_file_path_variable`;
Update: added the file content and the code used.
Input file content as follows:
foo bar \
bash \
baz
dude \
happy
Desired output content:
foo bar bash baz
dude happy
Current script:
#!/usr/bin/env perl
use Getopt::Long;
use FindBin;
use File::Path;
use File::Copy;
use DirHandle;
use FileHandle;
use File::Basename;
use Env;
use File::Slurp;
my $dummy_file = "/wdir/dummy.txt";
my $file_content = read_file($dummy_file);
print "$file_content\n";
print "==============================================\n";
my $ttest = $file_content =~ s/\\\n//;
print "$ttest\n";
Current Output
foo bar \
bash \
baz
dude \
happy
==============================================
1

I think I realized what your problem is. You need to make your substitution match multiple times.
You are working with this one-liner:
perl -i -p -e 's/\\\n//' input_file
Which will replace the backslash and newline once per line in a file read in line-by-line mode. And now you are trying to implement it inside another program, where you are slurping a whole file into a variable, as mentioned in your comment on another answer:
I tried reading the whole file into a variable and applied your solution, not working for me: my $file_content = read_file($input_file_path_variable) and then $file_content =~ s/\\n//;
This replaces only once: the first match in the file. You need to add the /g modifier to make the substitution global, so that it is applied as many times as possible:
$file_content =~ s/\\\n//g;
I am not sure how you are using your code, since you are not showing us. This makes answering simple questions hard. But if my assumptions are correct, this will fix your problem.

I think you've changed your code/approach at least once since originally asking the question, but based on your current code, your issue is this line:
my $ttest = $file_content =~ s/\\\n//;
Firstly, the s/// operator needs a g at the end so that it's a global search and replace, instead of stopping after the first substitution.
Secondly, $file_content =~ s/\\\n// doesn't return the final string; it modifies the string in place (so it's changing $file_content) and returns the number of substitutions made, which is why your script prints 1. From Perl 5.14 onward, you can add an r modifier to the end to get it to return the modified string instead, leaving the original untouched.
So either of these will work for you:
my $ttest = ( $file_content =~ s/\\\n//gr );
Or:
( my $ttest = $file_content ) =~ s/\\\n//g;
Full script which produces the output you want:
#!/usr/bin/env perl
use strict;
use warnings;
use File::Slurp;
my $file_content = read_file('/tmp/dummy.txt');
(my $ttest = $file_content) =~ s/\\\n//g;
print "$file_content\n";
print "==============================================\n";
print "$ttest\n";
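The difference between the two return values can be seen in a small sketch. The input string here is an assumption that mirrors the sample file from the question, so no file is read:

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Stand-in for the slurped file content (assumed to match the question's sample).
my $text = "foo bar \\\nbash \\\nbaz\ndude \\\nhappy\n";

# Without /r, s///g edits the string in place and returns the number of substitutions.
my $copy  = $text;
my $count = ($copy =~ s/\\\n//g);

# With /r (Perl 5.14+), the original is left alone and the new string is returned.
my $joined = $text =~ s/\\\n//gr;

print "substitutions: $count\n";
print $joined;
```

Here $count holds a number, while $joined holds the text with the continuation lines merged.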

In a Perl script you should write:
$input_file_variable =~ s/\\\n//;

Related

Adding quotes to a CSV using perl

I've got a CSV that looks as follows:
A,01,ALPHA
00,D,CHARLIE
E,F,02
This is the desired file after transformation:
"A",01,"ALPHA"
00,"D","CHARLIE"
"E","F",02
As you can see, the fields that are entirely numeric are left unquoted, whilst the alpha (or alphanumeric ones) are quoted.
What would be a sensible way to go about this in Perl ?
Already commented below, but I've tried stuff like
perl -pe 's/(\w+)/"$1"/g'
And that doesn't work because \w obviously picks up the numerics.
I recommend not reinventing the wheel, but rather to use an already existing module, as zdim recommends. Here is your example using Text::CSV_XS
test.pl
#!/usr/bin/env perl
use warnings;
use strict;
use Text::CSV_XS;
use Scalar::Util qw( looks_like_number );
my $csv = Text::CSV_XS->new();
while (my $row = $csv->getline(*STDIN)) {
my @quoted_row = map { looks_like_number($_) ? $_ : '"'. $_ .'"' } @$row;
print join(',',@quoted_row) . "\n";
}
Output
cat input | perl test.pl
"A",01,"ALPHA"
00,"D","CHARLIE"
"E","F",02
Another one-liner, input file modified to add a line with alphanumeric fields
$ cat ip.csv
A,01,ALPHA
00,D,CHARLIE
E,F,02
23,AB12,53C
$ perl -F, -lane 's/.*[^0-9].*/"$&"/ foreach(@F); print join ",", @F' ip.csv
"A",01,"ALPHA"
00,"D","CHARLIE"
"E","F",02
23,"AB12","53C"
To modify OP's attempt:
$ perl -pe 's/(^|,)\K\d+(?=,|$)(*SKIP)(*F)|\w+/"$&"/g' ip.csv
"A",01,"ALPHA"
00,"D","CHARLIE"
"E","F",02
23,"AB12","53C"
(^|,)\K\d+(?=,|$)(*SKIP)(*F) this will skip the fields with digits alone and the alternate pattern \w+ will get replaced
It seems that you are after a one-liner. Here is a basic one
perl -lpe '$_ = join ",", map /^\d+$/ ? $_ : "\"$_\"", split ",";' input.csv
This splits each line on , and passes the resulting list to map. There each element is tested for being digits-only with /^\d+$/ and is either passed through untouched or wrapped in " otherwise. The list returned by map is then joined with ,.
The -l switch strips the newline (and adds it back on print), which is needed because otherwise the newline would end up inside the quotes around the last field. The result is assigned back to $_ so that -p can be used and no explicit print is needed.
The code is easily used in a script if you don't insist on a one-liner.
Processing of csv files is far better done by modules, for example Text::CSV
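As a script, the split/map approach might look like the sketch below. The helper name quote_non_numeric is hypothetical, and the plain split on , assumes that no field contains an embedded comma or quote, which is exactly the case where a CSV module is the safer choice:

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: wrap every field that is not digits-only in double quotes.
# The -1 limit keeps trailing empty fields instead of silently dropping them.
sub quote_non_numeric {
    my ($line) = @_;
    return join ',', map { /^\d+$/ ? $_ : qq("$_") } split /,/, $line, -1;
}

# Sample rows taken from the question and the answers above.
my @rows = ('A,01,ALPHA', '00,D,CHARLIE', 'E,F,02', '23,AB12,53C');
print quote_non_numeric($_), "\n" for @rows;
```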

Perl regexp substitution - multiple matches

Friends,
need some help with substitution regex.
I have a string
;;;;;;;;;;;;;
and I need to replace it by
;\N;\N;\N;\N;\N;\N;\N;\N;\N;\N;\N;\N;
I tried
s/;;/;\\N;/g
but it gives me
;\N;;\N;;\N;;\N;;\N;;\N;;
tried to fiddle with lookahead and lookbehind, but can't get it solved.
I wouldn't use a regex for this, and instead make use of split:
#!/usr/bin/env perl
use strict;
use warnings;
my $str = ';;;;;;;;;;;;;';
print join ( '\N', split ( //, $str ) );
Splitting on nulls, to get each character, and making use of the fact that join puts delimiters between characters. (So not before first, and not after last).
This gives:
;\N;\N;\N;\N;\N;\N;\N;\N;\N;\N;\N;\N;
Which I think matches your desired output?
As a oneliner, this would be:
perl -ne 'print join ( q{\N}, split // )'
Note - we need single quotes ' rather than double around the \N so it doesn't get interpolated.
If you need to handle variable content (e.g. not just ; ) you can add grep or map into the mix - I'd need some sample data to give you a useful answer there though.
I use this for infile edit, the regexp suits me better
Following on from that - perl is quite clever. It allows you to do in place editing (if that's what you're referring to) without needing to stick with regular expressions.
Traditionally you might do
perl -i.bak -p -e 's/something/somethingelse/g' somefile
What this is doing is expanding that out into a loop:
LINE: while (defined($_ = <ARGV>)) {
s/something/somethingelse/g;
}
continue {
die "-p destination: $!\n" unless print $_;
}
E.g. what it's actually doing is:
opening the file
iterating it by lines
transforming the line
printing the new line
And with -i that print is redirected to the new file name.
You don't have to restrict yourself to -p though - anything that generates output will work in this way - although bear in mind if it doesn't 'pass through' any lines that it doesn't modify (as a regular expression transform does) it'll lose data.
But you can definitely do:
perl -i.bak -ne 'print join ( q{\N}, split // )'
And inplace edit - but it'll trip over on lines that aren't just ;;;;; as your example.
So to avoid those:
perl -i.bak -ne 'if (m/;;;;/) { print join ( q{\N}, split // ) } else { print }'
Or perhaps more succinctly:
perl -i.bak -pe '$_ = join ( q{\N}, split // ) if m/;;;/'
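Since the same character can't be matched twice, your approach doesn't work: each ;; match consumes both semicolons, so the second one is no longer available as the start of the next match. To solve the problem, only check for the presence of a following ; with a lookahead (the second ; isn't part of the match):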
s/;(?=;)/;\\N/g
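A short sketch comparing the lookahead substitution with the join/split approach on the string from the question (13 semicolons); the variable names are made up for illustration:

```perl
#!/usr/bin/env perl
use strict;
use warnings;

my $str = ';' x 13;

# Lookahead: only the first ; of each pair is consumed, so every ; that is
# followed by another ; gets a \N appended.
(my $with_lookahead = $str) =~ s/;(?=;)/;\\N/g;

# join/split: put \N between every pair of adjacent characters.
my $with_join = join '\N', split //, $str;

print "$with_lookahead\n";
print $with_lookahead eq $with_join ? "same result\n" : "different results\n";
```

For input consisting only of semicolons the two agree; they differ as soon as the line contains other characters, since split // separates every character.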

Not able to run system command through perl script

Hi, I am inserting an image location into a particular line of a file using a Perl one-liner, as below:
my $image="/home/users/images/image1.tar";
system(q(perl -pi -e 'print "\n$image" if ($. == 5 && $_=~ /^\s*$/ )' myfile.txt));
I am not able to insert the image location into the file.
Can anyone please help me out?
As was pointed out, using single quotes q{} did not allow your $image variable to interpolate.
To fix, just concatenate that variable into your string:
#!/usr/bin/perl
use strict;
use warnings;
my $image = '/home/users/images/image1.tar';
system(q{perl -pi -e 'print "\n} . $image . q{" if ($. == 5 && $_=~ /^\s*$/ )' myfile.txt});
However, a much better solution is just to do this processing local to your perl script.
The following does the exact same processing without the secondary call to perl by using $INPLACE_EDIT:
my $image = "/home/users/images/image1.tar";
local @ARGV = 'myfile.txt';
local $^I = '';
while (<>) {
print "\n$image" if $. == 5 && $_ =~ /^\s*$/;
print;
}
For additional methods for editing a file, just read perlfaq5 - How do I change, delete, or insert a line in a file, or append to the beginning of a file?
I'm not sure why you're shelling out to a second Perl process to do this. Why not just do the processing within the original Perl program?
But your problem seems to be that the string you're passing to system is in single quotes (using q(...)) which means that the $image variable won't be expanded. You probably want to change that to a double-quoted string (using qq(...)).
Update:
This is why shelling out to an external process is fraught with difficulty. You have one variable ($image) which needs to be passed through and another variable ($_) which needs to be internal to the second process. You also have an escape sequence (\s) which the double-quoted string tries (but fails) to interpret before the command ever reaches the second process.
Liberal application of backslashes to escape special characters gives this:
#!/usr/bin/perl
use strict;
use warnings;
my $image='/home/users/images/image1.tar';
system(qq(perl -pi -e 'print "\n$image" if (\$. == 5 && /^\\s*\$/ )' myfile.txt));
Which seems to work. But I still think you'd be far better off doing this all in one Perl program.
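If a child process really is wanted, one way to sidestep the escaping is the list form of system, which does not go through a shell. The sketch below is only an illustration under a few assumptions: the path is handed over in an environment variable (IMAGE is an arbitrary name), and a temporary file stands in for myfile.txt so that the example is self-contained:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Scratch file standing in for myfile.txt; its fifth line is blank.
my ($fh, $file) = tempfile(UNLINK => 1);
print {$fh} "one\ntwo\nthree\nfour\n\nsix\n";
close $fh;

my $image = '/home/users/images/image1.tar';

# The child reads the path from its environment, so nothing has to be
# interpolated into the code string; single quotes keep $. and \s intact.
local $ENV{IMAGE} = $image;
my $code = 'print "\n$ENV{IMAGE}" if $. == 5 && /^\s*$/';

# List form: every argument reaches the child verbatim, with no shell in between.
# $^X is the perl binary that is running this script.
system($^X, '-pi', '-e', $code, $file) == 0
    or die "child perl failed: $?";

open my $in, '<', $file or die "Cannot open $file: $!";
my @lines = <$in>;
close $in;
print @lines;
```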

Perl Command Line Print - Test Regexp at Command Line

I'm very new to Perl and I'm trying to figure out just how to get this thing to work.
I found an example in this thread: Perl Regex - Print the matched value
Here is the example I'm trying to work with:
perl -e '$str="the variable Xy = 3 in this case"; print $str =~ /(Xy = 3)/;'
I've tried running it using cmd, but that produces this error:
"Can't find string terminator "'" anywhere before EOF at -e line 1."
When I run it in powershell nothing happens.
My ultimate goal is to set a variable at the command line, run a regexp find (and sometimes replace), and to print the result. This way I don't have to write a script every time I write a regexp pattern.
I've tried using the debugger, but nothing happens when I do this after setting the variable:
print $str =~ /(Xy = 3)/;
In a Windows environment it is better to put your statements inside a Perl script, because you will need double quotes for most of your Perl code and escaping them on the command line eventually gets messy.
Ok, well I figured it out for powershell. I had to escape the double quotes.
perl -e '$str=\"the variable Xy = 3 in this case\"; print $str =~ /(Xy = 3)/;'
And the same goes for cmd except that I had to replace the single quotes with double quotes since single quotes aren't a string delimiter there.
perl -e "$str=\"the variable Xy = 3 in this case\"; print $str =~ /(Xy = 3)/;"
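For comparison, a minimal sketch of the same test as a script file, which avoids command-line quoting altogether. The file name test_regex.pl is hypothetical; it would be run as perl test_regex.pl from either cmd or PowerShell:

```perl
#!/usr/bin/env perl
# test_regex.pl (hypothetical file name)
use strict;
use warnings;

my $str = "the variable Xy = 3 in this case";

# In list context a match returns its captured groups.
my ($match) = $str =~ /(Xy = 3)/;

print defined $match ? "$match\n" : "no match\n";
```

Changing the string or the pattern then only means editing the file, with no escaping rules to remember.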

How to use regular expression to remove $ character in string?

I want to use a regular expression to remove the three characters $, % and # from a string, but it seems that $ can't be removed, and the error message says "undefined variable".
How can I solve this problem?
here is my code
perl Remove.pl $ABC#60%
#!/usr/bin/perl
$Input = $ARGV[0];
$Input =~ s/\$|%|#//g;
print $Input;
thanks
I think your problem is with the shell, not with the Perl code. Single quote the argument to the script:
perl remove.pl '$ABC#60%'
The shell can interpret '$ABC' as a variable name in which case the script will receive no arguments. Perl will then complain about undefined variable in substitution.
$Input =~ s/[\$%#]//g;
ought to work
If you just want to remove some characters, it is better to use tr.
Try this:
perl -e '$arg = shift; $arg =~ tr/$%#//d; print $arg' '$asdf#$'
Your code is fine, but the parameter you pass to the program is expanded by bash. You should put it in single quotes.
Try this:
perl Remove.pl '$ABC#60%'
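To check the Perl side independently of the shell, the string can be hard-coded in single quotes. This is only a sketch comparing the character-class substitution with tr; the variable names are made up:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Single quotes: Perl does not interpolate $ABC here.
my $input = '$ABC#60%';

# Character class: \$ is a literal dollar sign.
(my $by_regex = $input) =~ s/[\$%#]//g;

# tr never interpolates, so $, % and # are taken literally; /d deletes them.
(my $by_tr = $input) =~ tr/$%#//d;

print "$by_regex\n$by_tr\n";
```

Both variables end up holding the same cleaned string, which suggests that the original failure comes from how the argument reaches the script and not from the substitution itself.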