perl simple grep - regex

I want to write a simple grep call in Perl.
I have two variables, $var1 and $var2, and I want to get all the files whose names start with $var1 and end with $var2.
What would the syntax be for a grep command in Perl that does that?

Gets all file names in the current directory that start with $var1 and end with $var2:
my @matchingFileNames = <$var1*$var2>;
EDIT: To handle spaces and special characters, as @Schwern and @ikegami correctly pointed out:
my @matchingFileNames = <\Q$var1\E*\Q$var2\E>;

Something like:
my @files = grep { /\A$var1.*$var2\z/ } @input_files;
will do.

opendir(my $dh, "yourDIR") or die "Cannot open yourDIR: $!";
my @FILES = readdir($dh);
closedir($dh);
my @matching_files;
for my $file (@FILES) {
    push(@matching_files, $file) if ($file =~ /\A$var1/ and $file =~ /$var2\z/);
}
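The answers above interpolate $var1 and $var2 directly into a regex, so a dot or other metacharacter in either variable would be treated as pattern syntax. A minimal sketch of the same filter with the variables quoted literally follows; the helper name names_with_affixes and the sample file names are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: keep the names that start with $prefix and end with $suffix.
sub names_with_affixes {
    my ($prefix, $suffix, @names) = @_;
    # \Q...\E makes regex metacharacters in the variables match literally.
    return grep { /\A\Q$prefix\E.*\Q$suffix\E\z/s } @names;
}

# Sample names, invented for the demonstration.
my @found = names_with_affixes('log.', '.txt',
    'log.1.txt', 'logx1.txt', 'log.2.txt.bak');
print "$_\n" for @found;
```

Without \Q...\E, the dot in 'log.' would match any character and 'logx1.txt' would also be kept.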

Related

How to replace string in a file with Perl in script (not in command line)

I want to replace a string in a file. Of course I can use
perl -pi -e 's/pattern/replacement/g' file
but I want to do it with a script.
Is there any other way to do that instead of system("perl -pi -e 's/pattern/replacement/g' file")?
The -i switch relies on the fact that you can still read from a filehandle after its file has been unlinked. You can see the code it uses in perlrun. Do the same thing yourself.
use strict;
use warnings;
use autodie;
sub rewrite_file {
my $file = shift;
# You can still read from $in after the unlink, the underlying
# data in $file will remain until the filehandle is closed.
# The unlink ensures $in and $out will point at different data.
open my $in, "<", $file;
unlink $file;
# This creates a new file with the same name but points at
# different data.
open my $out, ">", $file;
return ($in, $out);
}
# Pass the name of the file to rewrite ("file", as in the question).
my ($in, $out) = rewrite_file("file");
# Read from $in, write to $out as normal.
while(my $line = <$in>) {
$line =~ s/foo/bar/g;
print $out $line;
}
You can duplicate what Perl does with the -i switch easily enough.
{
local ($^I, @ARGV) = ("", 'file');
while (<>) { s/foo/bar/; print; }
}
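The $^I idiom above can be wrapped in a subroutine so it is reusable from a larger script. This is only a sketch: the name replace_in_file and its parameters are assumptions, and the replacement is treated as a plain string, so capture variables such as $1 are not expanded in it.

```perl
use strict;
use warnings;

# Hypothetical helper: edit $file in place, replacing every match of $pattern.
sub replace_in_file {
    my ($file, $pattern, $replacement) = @_;
    # An empty $^I turns on in-place editing without a backup copy.
    local ($^I, @ARGV) = ('', $file);
    local $_;
    while (<>) {
        s/$pattern/$replacement/g;
        print;    # goes to the rewritten file, not to STDOUT
    }
    return;
}
```

A call would look like replace_in_file('file', qr/foo/, 'bar'). Setting $^I to '.bak' instead of '' would keep a backup, as -i.bak does on the command line.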
You can try the below simple method. See if it suits your requirement best.
use strict;
use warnings;
# Get file to process
my ($file, $pattern, $replacement) = @ARGV;
# Read file
open my $FH, "<", $file or die "Unable to open $file for read exited $? $!";
chomp (my @lines = <$FH>);
close $FH;
# Replace text and write every line back to the same file
open $FH, ">", $file or die "Unable to open $file for write exited $? $!";
for (@lines){
    s/$pattern/$replacement/g;
    print {$FH} "$_\n";
}
close $FH;
1;
file.txt:
Hi Java, This is Java Programming.
Execution:
D:\swadhi\perl>perl module.pl file.txt Java Source
file.txt
Hi Source, This is Source Programming.
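You can handle the use case in the question without recreating the -i flag's functionality or creating throwaway variables. Note that on Linux the kernel passes everything after the interpreter path as a single argument, so the env form of the shebang below may fail there; #!/usr/bin/perl -i avoids that. Add the flag to the shebang line of a Perl script and read the files named on the command line with <>: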
#!/usr/bin/env perl -i
while (<>) {
s/pattern/replacement/g;
print;
}
Usage: save the script, make it executable (with chmod +x), and run
path/to/the/regex-script test.txt
(or regex-script test.txt if the script is saved to a directory in your $PATH.)
Going beyond the question:
If you need to run multiple sequential replacements, that's
#!/usr/bin/env perl -i
while (<>) {
s/pattern/replacement/g;
s/pattern2/replacement2/g;
print;
}
As in the question's example, the source file will not be backed up. Exactly like in an -e oneliner, you can back up to file.<backupExtension> by adding a backupExtension to the -i flag. For example,
#!/usr/bin/env perl -i.bak
You can use
sed 's/pattern/replacement/g' file > /tmp/file$$ && mv /tmp/file$$ file
Some sed versions support the -i option, so you won't need a temp file. The -i option creates the temp file and does the move for you; basically it is the same solution.
Another solution (Solaris/AIX) is to use a here-document in combination with vi:
vi file 2>&1 >/dev/null <<@
1,$ s/pattern/replacement/g
:wq
@
I do not like the vi solution. When your pattern has a / or another special character, it will be hard debugging what went wrong. When replacement is given by a shell variable, you might want to check the contents first.

How to extract all text of the form "<key>=<value>" from a log file

Hi, I have a requirement where I need to pull text of the form <key>=<value> from a large log file.
The log file consists of data like this:
[accountNumber=0, email=tom.cruise@gmail.com, firstName=Tom, lastName= , message=Hello How are you doing today ?
The output I expect is:
accountNumber=0
email=tom.cruise@gmail.com
firstName=Tom
etc.
Can anyone please help ? Also please explain the solution so that I can extend it to cater to my similar needs.
I wrote a one-liner for this:
perl -nle 's/^\[//; for (split(/,/)){s/(?:^\s+|\s+$)//g; print}' input.txt
I also made another line of input to test with:
Matt@MattPC ~/perl/testing/13
$ cat input.txt
[accountNumber=0, email=tom.cruise@gmail.com, firstName=Tom, lastName= , message=Hello How are you doing today ?
[accountNumber=2, email=john.smith@gmail.com, firstName=John, lastName= , message=What is up with you?
Here is the output:
Matt@MattPC ~/perl/testing/13
$ perl -nle 's/^\[//; for (split(/,/)){s/(?:^\s+|\s+$)//g; print}' input.txt
accountNumber=0
email=tom.cruise@gmail.com
firstName=Tom
lastName=
message=Hello How are you doing today ?
accountNumber=2
email=john.smith@gmail.com
firstName=John
lastName=
message=What is up with you?
Explanation:
Expanded code:
perl -nle '
s/^\[//;
for (split(/,/)){
s/(?:^\s+|\s+$)//g;
print
}'
input.txt
Line by line explanation:
perl -nle calls perl with the command line options -n, -l, and -e. The -n adds a while loop around the program like this:
LINE:
while (<>) {
... # your program goes here
}
The -l adds a newline at the end of every print. And the -e specifies my code which will be in single quotes (').
s/^\[//; removes the first [ if there is one. This searches and replaces on $_ which is equal to the line.
for (split(/,/)){ begins the for loop, which loops through the list returned by split(/,/). Because split was called with just one argument, it splits $_, and it splits on commas. $_ was equal to the line, but inside the for loop $_ is set to the list element we are currently on.
s/(?:^\s+|\s+$)//g; this line removes leading and trailing white space.
print will print $_ followed by a newline, which is our string=value.
}' close the for loop and finish the '.
input.txt provide input to the program.
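For use inside a script rather than on the command line, the same steps (strip the bracket, split on commas, trim each field) can be sketched as a subroutine. The name extract_pairs is made up here, and the sketch assumes, like the one-liner, that values never contain a comma.

```perl
use strict;
use warnings;

# Hypothetical helper: return the key=value fields of one log line.
sub extract_pairs {
    my ($line) = @_;
    $line =~ s/^\s*\[//;     # drop the leading [ if there is one
    $line =~ s/\]\s*$//;     # drop a trailing ] if there is one
    my @pairs;
    for my $field (split /,/, $line) {
        $field =~ s/^\s+|\s+$//g;               # trim both ends
        push @pairs, $field if $field =~ /=/;   # keep only key=value fields
    }
    return @pairs;
}

my $sample = '[accountNumber=0, email=tom.cruise@gmail.com, firstName=Tom, lastName= , message=Hello How are you doing today ?';
print "$_\n" for extract_pairs($sample);
```

Returning a list instead of printing makes it easy to extend, for example by splitting each field on the first = to fill a hash.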
Going off your specific data and desired output, you could try the following:
use strict;
use warnings;
open my $fh, '<', 'file.txt' or die "Can't open file $!";
my $data = do { local $/; <$fh> };
my @matches = $data =~ /(\w+=\S+),/g;
print join "\n", @matches;
Perl One-Liner
Use this:
perl -0777 -ne 'while(m/[^ ,=]+=[^,]*/g){print "$&\n";}' yourfile
Assuming that each line of the log ends with a closing square bracket, you can use this:
#!/usr/bin/perl
use strict;
use warnings;
my $line = '[accountNumber=0, email=tom.cruise@gmail.com, firstName=Tom, lastName= , message=Hello How are you doing today ?]';
while($line =~ /([^][,\s][^],]*?)\s*[],]/g) {
print $1 . "\n";
}

sed delete 1st line and remove leading/trailing white spaces

I am trying to delete the first line and remove leading and trailing white space from the subsequent lines using sed.
If I have something like
line1
line2
line3
It should print
line2
line3
So I tried this command on unix shell:
sed '1d;s/^ [ \t]*//;s/[ \t]*$//' file.txt
and it works as expected.
When I try the same in my perl script:
my @templates = `sed '1d;s/^ [ \t]*//;s/[ \t]*$//' $MY_FILE`;
It gives me this message "sed: -e expression #1, char 10: unterminated `s' command" and doesn't print anything. Can someone tell me where I am going wrong?
Why would you invoke Sed from Perl anyway? Replacing the sed with the equivalent Perl code is just a few well-planned keystrokes.
my @templates;
if (open(my $fh, '<', $MY_FILE)) {
    @templates = map { s/(?:^\s*|\s*$)//g; $_ } <$fh>;
    shift @templates;
    close $fh;
} else {
    # die horribly?
}
The backticks work like double-quotes. Perl interpolates variables inside them, as you already know due to your use of $MY_FILE. What you may not know is that $/ is actually a variable, the input record separator (by default a newline character). The same is true for the backslashes before the tab character. Here Perl will interpret \t for you and replace it with the tab character. You'll need a second backslash so that sed sees \t instead of an actual tab character. The latter might work as well, though.
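The interpolation described above can be made visible without running sed at all, because a double-quoted string is interpolated the same way as backticks. The sketch below only builds and prints the two command strings; the value 'file.txt' for $MY_FILE is a placeholder.

```perl
use strict;
use warnings;

my $MY_FILE = 'file.txt';    # placeholder file name, for illustration only

# As written in the question: Perl expands \t to a tab and $/ to a newline.
my $as_written = "sed '1d;s/^ [ \t]*//;s/[ \t]*$//' $MY_FILE";

# With the backslashes and the dollar sign escaped, sed receives them intact.
my $escaped = "sed '1d;s/^ [ \\t]*//;s/[ \\t]*\$//' $MY_FILE";

print "as written: $as_written\n";
print "escaped:    $escaped\n";
```

The first string contains a literal newline in the middle of the second s command, which is why sed reports it as unterminated.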
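Consider using a safe pipe open instead of backticks, to avoid problems with escaping. For example:
my @templates = do {
    # "-|" reads from the command; "|-" would write to it instead.
    open my $fh, "-|", 'sed', '1d;s/^ [ \t]*//;s/[ \t]*$//', $MY_FILE
        or die $!;
    <$fh>;    # list context: one element per output line, like backticks
};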
You have a typo in your expression. You need a semicolon between the 2 substitution statements. You should use the following instead:
my @templates = `sed '1d;s/^ [ \\t]*//;s/[ \\t]*\$//' $MY_FILE`;
escaping $ and \ as suggested in the other answer. I should note that it also worked for me without escaping \ as it was replaced by a literal tab.
As others have mentioned, I would recommend you do this only in Perl, or only in Sed, because there's really no reason to use both for this task. Using Sed in Perl will mean you have to worry about escaping, quoting and capturing the output (unless reading from a pipe) somehow. Obviously, all that complicates things and it also makes the code very ugly.
Here is a Perl one-liner that will handle your reformatting:
perl -le 'my $line = <>; while (<>) { chomp; s/^\s+|\s+$//g; print $_; }' file.txt
Basically, you just take the first line and store in a variable that won't be used, then process the rest of the lines. Below is a small script version that you can add to your existing script.
#!/usr/bin/env perl
use strict;
use warnings;
my $usage = "$0 infile";
my $infile = shift or die $usage;
open my $in, '<', $infile or die "Could not open file: $infile";
my $first = <$in>;
while (<$in>) {
chomp;
s/^\s+|\s+$//g;
# process your data here, or just print...
print $_, "\n";
}
close $in;
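If the lines are already in memory, or you want the result in an array as in the question, the same logic fits in a small subroutine. This is a sketch with a made-up name, trimmed_body. It needs the /g flag on the substitution so that both the leading and the trailing alternative are applied.

```perl
use strict;
use warnings;

# Hypothetical helper: drop the first line, trim the rest.
sub trimmed_body {
    my @lines = @_;               # work on a copy of the caller's list
    shift @lines;                 # discard the first line
    s/^\s+|\s+$//g for @lines;    # trim both ends (this also removes the newline)
    return @lines;
}

print "$_\n" for trimmed_body("line1\n", "  line2 \n", "\tline3\t\n");
```

With a filehandle, trimmed_body(<$in>) would produce the same array the backtick version was meant to build.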
This can also be done with awk:
awk 'NR>1 {$1=$1;print}' file
line2
line3

Grep in perl never seems to work for me

I have a simple script that reads and list file from a directory.
But I don't want to list hidden files, files with a " . " in front.
So I've tried using the grep function to do this, but it returns nothing. I get no listing of files.
opendir(Dir, $mydir);
while ($file = readdir(Dir)){
$file = grep !/^\./ ,readdir Dir;
print "$file\n";
I don't think I'm using the regular expression correctly.
I don't want to use an array cause the array doesn't format the list correctly.
You can either iterate over directory entries using a loop, or read all the entries in the directory at once:
while (my $file = readdir(Dir)) {
print "$file\n" if $file !~ /^\./;
}
or
my #files = grep { !/^\./ } readdir Dir;
See perldoc -f readdir.
You're calling readdir() twice in a loop. Don't.
or like so:
#!/usr/bin/env perl -w
use strict;
opendir my $dh, '.';
print map {$_."\n"} grep {!/^\./} readdir($dh);
Use glob:
my #files = glob( "$mydir/*" );
print "@files\n";
See perldoc -f glob for details.
while ($file = readdir(Dir))
{
print "\n$file" if ( grep !/^\./, $file );
}
Or you can use a regular expression:
while ($file = readdir(Dir))
{
print "\n$file" unless ( $file =~ /^\./ );
}
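Pulling the answers together, here is a minimal sketch that reads a directory once, filters out the hidden entries with grep, and prints one name per line, so the array does not need to be interpolated into a single string. The helper name visible_entries is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: names in $dir that do not start with a dot.
sub visible_entries {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Cannot open $dir: $!";
    # readdir is called exactly once, in list context.
    my @names = sort grep { !/^\./ } readdir($dh);
    closedir($dh);
    return @names;
}

print "$_\n" for visible_entries('.');
```

Printing inside a for loop, rather than print "@files\n", avoids the space-separated output that array interpolation produces.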

Using BASH - Find CSS block or definition and print to screen

I have a number of .css files spread across some directories. I need to find those .css files, read them and if they contain a particular class definition, print it to the screen.
For example, if I'm looking for ".ExampleClass" and it exists in /includes/css/MyStyle.css, I would want the shell command to print
.ExampleClass {
color: #ff0000;
}
Use find to filter on all css files and execute a sed script on those files printing lines, between two regular expressions:
find ${DIR} -type f -name "*.css" -exec sed -n '/\.ExampleClass.{/,/}/p' \{\} \+
Considering that the CSS file can have multiline class definitions, and that there can be several occurrences in the same file, I'd bet Perl is the way to go.
For example:
# Input: CSS file name, and CSS class name without the dot (in the example, ExampleClass)
use strict;
use warnings;
my ($filen, $classn) = @ARGV;
my $out = findclassuse($filen, $classn);
# Print file name and result if non-empty
print("===== $filen : ==== \n" . $out . "\n") if ($out);
sub findclassuse {
    my ($filename, $classname) = @_;
    my $output = "";
    open(my $fh, '<', $filename) or die $!;
    local $/ = undef;    # so that the full file content is read at once
    my $css = <$fh>;
    close $fh;
    $css =~ s#/\*.*?\*/# #gs;    # strip out comments, including multi-line ones
    while ($css =~ /([^}{]*\.$classname\b.*?\{.*?\})/gs) {
        $output .= "\n\n" . $1;
    }
    return $output;
}
But this is not 100% foolproof: there remain some issues with comments, and the CSS parsing is surely not perfect.
find /starting/directory -type f -name '*.css' | xargs -ti grep '\.ExampleClass' {}
will find all the css files, print the filename and search string and then print the results of the grep. You could pipe the output through sed to remove any unnecessary text.
ETA: the regex needs work if we want to catch multiline expressions. Likely the EOL character should be set to } so that complete classes are considered one line. If this were done, then piping the find to perl -e rather than grep would be more effective.
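The idea of treating } as the end-of-line character can be sketched in Perl by setting the input record separator $/ to '}'. The subroutine name css_blocks_for_class and the sample CSS are made up for illustration, and the sketch assumes no nested braces (such as @media blocks) and no comments containing braces.

```perl
use strict;
use warnings;

# Hypothetical helper: return the blocks whose selector mentions .$class.
sub css_blocks_for_class {
    my ($fh, $class) = @_;
    local $/ = '}';    # each "line" now ends at a closing brace
    my @blocks;
    while (my $record = <$fh>) {
        # The class must appear in the selector, before the opening brace.
        next unless $record =~ /\.\Q$class\E\b[^{}]*\{/;
        $record =~ s/^\s+//;    # drop the newline left over from the previous block
        push @blocks, $record;
    }
    return @blocks;
}

# Sample stylesheet read through an in-memory filehandle.
my $css = ".Other {\n  color: blue;\n}\n.ExampleClass {\n  color: #ff0000;\n}\n";
open(my $fh, '<', \$css) or die $!;
print "$_\n" for css_blocks_for_class($fh, 'ExampleClass');
close $fh;
```

Because each record is a whole block, a single-line regex test is enough, whether the block was written on one line or several.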
Assuming you never do anything weird, like putting the opening brace on a separate line, or putting an unindented (nested) closing brace before the intended one, you can do this:
sed -n '/\.ExampleClass *{/,/^}/p' *.css
And if the files are all over a directory structure:
find . -name '*.css' | xargs sed ...
This version handles multi-line blocks as well as blocks on a single line:
sed -n '/^[[:space:]]*\.ExampleClass[[:space:]]*{/{p;q}; /^[[:space:]]*\.ExampleClass[[:space:]]*{/,/}/p'
Examples:
foo { bar }
or
foo {
bar
}