Perl: Splitting a variable containing a path to directory - regex

So I'm having a problem trying to split a variable that has the complete path to a directory. I want to fetch the files under 'logs' directory of this path:
'/nfs/fm/disks/fm_mydiskhere_00000/users/me/repotest/myrepo/tools/toolsdir/logs/*';
and in my Perl script, I have a variable that contains this path: '$log_file'. When I print '$log_file', it contains the entire path; I want to go to the last directory 'logs' and fetch the files under it. By the way, this complete path is in a separate configuration file that is being read by my Perl script this way:
sub read_file {
my ($log_file) = shift(@_);
info ("Using file : $log_file");
my $fh = new FileHandle ("$log_file");
printError ("Could not open this file : '$log_file' - $!") unless defined $fh;
my $contents;
{
local $/ = undef;
$contents = <$fh>;
}
$fh->close();
eval $contents;
if ($@) {
chomp $@;
my $msg = "BAD (perl) syntax in file:\n\n$@\n";
if ( $msg =~ /requires explicit package name/ ) {
$msg .= "\n -> A 'requires explicit package name' message means".
" a NON-VALID variable name was found\n";
}
die "Error: $msg\n\n";
}
return 1;
}
And I'm using variable $log_file in another subroutine this way:
my $fh = read_file ($log_file);
if ($log_file eq "abc.txt"){
while (my $line = <$fh>) {
#do something
}
}
Can anybody help me here, please? Am I missing something or using $log_file in the wrong way?
Thanks in advance!

Try:
my @files = glob($log_file);
use Data::Dumper;
print Dumper(\@files);
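A minimal sketch of that idea, assuming $log_file holds a plain string ending in a wildcard as in the question. The temporary directory and the file names a.log and b.log are made up for the demonstration, and File::Basename is only there to show how to get at the 'logs' directory itself:

```perl
use strict;
use warnings;
use File::Basename qw(dirname basename);
use File::Temp qw(tempdir);

# Hypothetical stand-in for the real path: a temporary tree ending in logs/.
my $root = tempdir(CLEANUP => 1);
mkdir "$root/logs" or die "Cannot create $root/logs: $!";
for my $name (qw(a.log b.log)) {
    open my $out, '>', "$root/logs/$name" or die "Cannot create $name: $!";
    close $out;
}

my $log_file = "$root/logs/*";       # same shape as the configured value

my $log_dir = dirname($log_file);    # directory part, ending in "logs"
my @files   = sort glob($log_file);  # expand the wildcard into file names

print "Directory: $log_dir\n";
print "File: ", basename($_), "\n" for @files;
```

Each entry of @files is a full path, so it can be passed straight to open.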

Related

Write to file a multiple line string variable

I am trying to extract the content of a text file between two tags and store it to another file.
I managed to read the input file into a multi-line string variable and then successfully used a regexp to get what I want into the variable.
But I fail to write the variable to a file; I assume this is because the string contains multiple \n characters.
I would appreciate any help. (This is my first Perl script…)
For the test, I use an index.html file, but it can be any text file.
Edit : solved, see correction in comments
Here below my documented code :
# Extract string between two tags
use strict;
use warnings;
my $inputfile = "";
my $outputfile = "";
# Parse Macro Arguments arguments
if(@ARGV < 2)
{
print "Usage: perl Macro_name.pl <inputfile.HTML> <outfile.HTML>\n";
exit(1);
}
$inputfile = $ARGV[0];
$outputfile = $ARGV[1];
my $body="";
# Convert input file to multiple line string #
$body = File_to_Var_Multi_Line($inputfile);
# First tag & Second tag match
if ( $body =~ /(.*)<body(.*?)>(.*)<\/body>/s )
{ # error :
my $body = $3; # $body is local here
# correction :
#Print to check if extract ok # declare another variable outside if
print $body, "\n";
}
# Write to file my match multiple line string #
open(my $fh_body, '>:encoding(UTF-8)', $outputfile)
or die "Could not open file '$outputfile' $!";
print $fh_body "$body\n";
close $fh_body;
# sub #
sub File_to_Var_Multi_Line
{
if(@_ < 1)
{
print "Usage: line=File_to_Var_Multi_Line<file>\n";
exit(1);
}
my $inputfile_2 = "";
$inputfile_2 = $_[0];
open(my $fl_in, '<:encoding(UTF-8)', $inputfile_2)
or die "Could not open file '$inputfile_2' $!";
my $line = "";
my $row_2 = "";
while (my $row_2 = <$fl_in>)
{
$line .= $row_2;
}
return $line
}
And the input test file :
<html>
<body>
page 1<br>
page 2<br>
page 3<br>
page 4<br>
page 5<br>
</body>
</html>
Notwithstanding the advice in "RegEx match open tags except XHTML self-contained tags" (do not parse HTML with a regex),
you may find the 'range operator' useful for iterating through a file.
For example:
while ( <$fl_in> ) {
if ( m,<BODY>,i .. m,</BODY>,i ) {
print;
}
}
The condition will be true, if you're within the 'body' tags. (Although it's line oriented, so trailing stuff will be 'caught').
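Here is a self-contained sketch of the same technique that collects the lines into a variable instead of printing them. The sample input is made up and read through an in-memory filehandle; the looser /<body\b/ pattern is my own assumption so that a tag with attributes would also match:

```perl
use strict;
use warnings;

# Made-up sample input, read through an in-memory filehandle.
my $html = "<html>\n<body>\npage 1<br>\npage 2<br>\n</body>\n</html>\n";
open my $in, '<', \$html or die "Cannot open in-memory file: $!";

my $body = '';
while (my $line = <$in>) {
    # True from the line matching <body ...> through the line matching </body>.
    if ($line =~ /<body\b/i .. $line =~ m{</body>}i) {
        $body .= $line;
    }
}
close $in;

print $body;
```

Note that the lines holding the tags themselves are included, since the operator works on whole lines.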

Substitute Text with RegEX in external file Perl

I am trying to write a script that will update a variable in an input file. The RegEx matches, but the substitution does not end up in the file. What am I doing wrong?
sub updateInputDeck {
my $powerLevel = shift;
my $file = $outputFiles{input};
open INPUTFILE,"<",$file or die "Cannot open file $file $!\n";
while (<INPUTFILE>) {
if (s/((?<=\s{3}RP\s{2}=\s{2})\d+)/$powerLevel/) {
print $_;
print "Updating Input File for Power Level: $powerLevel";
}
}
close INPUTFILE;
}
UPDATE
I am trying to update the file pointed to by the filehandle. Can I only do this via a print statement? If that is the case, I just want to reprint that one line. Is that possible?
You can use perl's in-place editing:
sub updateInputDeck {
my $powerLevel = shift;
my $file = $outputFiles{input};
local @ARGV = ($file);
local $^I = '.bac';
while( <> ){
s/((?<=\s{3}RP\s{2}=\s{2})\d+)/$powerLevel/;
print;
}
#unlink "$file$^I" or die "Can't delete backup";
return;
}
Also note that reading the global $outputFiles{input} inside your function is poor style. Pass the file name to the function as a parameter instead.
I think you have to do it in 2 passes. First read the whole file into an array, edit it locally and write the whole thing back out, overwriting the original file.
open INPUTFILE,"<$file" or die "Cannot open file $file $!\n";
my @lines = <INPUTFILE>; # Read in entire file.
close INPUTFILE;
open INPUTFILE,">$file" or die "Cannot open file $file $!\n";
foreach my $line (@lines) {
$line =~ s/((?<=\s{3}RP\s{2}=\s{2})\d+)/$powerLevel/;
print INPUTFILE $line;
}
close INPUTFILE;
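A runnable sketch of this two-pass approach, using lexical filehandles and taking the file name as a parameter. The sub name update_power_level, the temporary file, and the sample lines are all made up; the "   RP  =  " layout is only inferred from the regex in the question:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical input deck; the spacing is inferred from the regex.
my ($tmp, $file) = tempfile(UNLINK => 1);
print {$tmp} "   RP  =  100\n   XX  =  5\n";
close $tmp or die "Cannot close $file: $!";

sub update_power_level {
    my ($path, $power_level) = @_;

    # Pass 1: read the whole file into memory.
    open my $in, '<', $path or die "Cannot open $path: $!";
    my @lines = <$in>;
    close $in;

    # Replace only the digits that follow "   RP  =  ".
    s/(?<=\s{3}RP\s{2}=\s{2})\d+/$power_level/ for @lines;

    # Pass 2: overwrite the original file with the edited lines.
    open my $out, '>', $path or die "Cannot write $path: $!";
    print {$out} @lines;
    close $out or die "Cannot close $path: $!";
    return;
}

update_power_level($file, 250);

open my $check, '<', $file or die "Cannot open $file: $!";
my @result = <$check>;
close $check;
print @result;
```

Only the RP line changes; every other line is written back untouched.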

How to match a regex pattern for multiple files under a directory in perl?

I wrote a script to match a pattern and return a statement for a file
#!/usr/bin/perl
use strict;
use warnings;
my $file = '/home/Sidtest/sid.txt';
open my $info , $file or die " Couldn't open the $file:$!";
while( my $line = <$info>) {
if ($line =~ m/^#LoadModule ssl_module/) {
print "FileName =",$file," Status = Failed \n";
}
elsif ($line =~ m/^LoadModule ssl_module/) {
print "FileName =",$file," Status = Passed \n";
}
}
close $info;
So now I am trying to modify this script to work for multiple files under the same directory. I haven't been able to do that successfully. Can anyone please help in how I can make it work for any number of files in a directory.
This will read every file in ./directory and, for each file, print out each line.
The print statement can be altered to print if /match/, or whatever you want:
my @dir = <directory/*>;
foreach my $file (@dir){
open my $input, '<', $file;
while (<$input>){
print "PASS: $_\n" if m/^#LoadModule ssl_module/;
[...]
}
}
The variable @ARGV contains a list of arguments sent to the script when started. Loop through @ARGV and call the script with the files you want to process:
#!/usr/bin/perl
use strict;
use warnings;
foreach my $file (@ARGV) {
open my $info , $file or die " Couldn't open the $file:$!";
while( my $line = <$info>) {
if ($line =~ m/^#LoadModule ssl_module/) {
print "FileName =",$file," Status = Failed \n";
}
elsif ($line =~ m/^LoadModule ssl_module/) {
print "FileName =",$file," Status = Passed \n";
}
}
close $info;
}
# process all files *.txt in your dir: ./myscript.pl /home/Sidtest/*.txt
Check perldoc perlrun, and look at the -p and -n parameters. Essentially, they treat your script as if it were the contents of a loop over stdin, where stdin is generated by iterating through the files supplied on the command line. The name of the file currently-being-processed can be accessed using the $ARGV variable.
So, you might go for an approach where your whole script looks more like this, using the -n param, where $_ contains the current line:
if ( m/^#LoadModule ssl_module/) {
print "FileName =",$ARGV," Status = Failed \n";
} elsif (m/^LoadModule ssl_module/) {
print "FileName =",$ARGV," Status = Passed \n";
}

Pull regular expressions from file and compare to each line in a file

I found something that I could use on perlmonks.org (http://www.perlmonks.org/?node_id=870806) but I can't get it to work.
I can read the file without issue and build an array. Then, I'd like to compare each index of the array (each regex) to each line of a file, printing out the line before and the line after the matched line.
My code:
# List of regex's. If this file doesn't exist, we can't continue
open ( $fh, "<", $DEF_FILE ) || die ("Can't open regex file: $DEF_FILE");
while (<$fh>) {
chomp;
push (@bad_strings, $_);
}
close $fh || die "Cannot close regex file: $DEF_FILE: $!";
$file = '/tmp/mydirectory/myfile.txt';
eval { open ( $fh, "<", $file ); };
if ($@) {
# If there was an error opening the file, just move on
print "Error opening file: $file.\n";
} else {
# If no error, process the file
foreach $bad_string (@bad_strings) {
$this_line = "";
$do_next = 0;
seek($fh, 0, 0); # move pointer to 0 each time through
while(<$fh>) {
$last_line = $this_line;
$this_line = $_;
my $rege = eval "sub{ \$_[0] =~ $bad_string }"; # Real-time regex
if ($rege->( $this_line )) { # Line 82
print $last_line unless $do_next;
print $this_line;
$do_next = 1;
} else {
print $this_line if $do_next;
$last_line = "";
$do_next = 0;
}
}
}
} # End "if error opening file" check
This was working before when I had just a string per line in the file and performed a simple test such as if ($this_line =~ /$string_to_search_for/i ) but when I switched to regex in the file and a "real-time" eval statement, I now get Can't use string ("") as a subroutine ref while "strict refs" in use at scrub_file.pl line 82 and line 82 is if ($rege->($this_line)) {.
Prior to that error message, I'm receiving: Use of uninitialized value in subroutine entry at scrub_hhsysdump_file.pl line 82, <$fh> I have some understanding of that error message but can't seem to make the perl engine happy with my code thus far.
Still new to perl and always looking for pointers. Thanks in advance.
I fail to see the reason for those eval statements - all they seem to do is make the code a lot more complicated and difficult to debug.
But $rege is undef because eval "sub{ \$_[0] =~ $bad_string }" isn't working, due to the string having a syntax error. I don't know what's in $DEF_FILE, but unless it has properly-delimited regular expressions then you need to add the delimiters in the eval string.
my $rege = eval "sub{ \$_[0] =~ /$bad_string/ }"
may work, but you may need /\Q$bad_string/ instead if the strings in $DEF_FILE contain regex metacharacters and you want them to be treated as literal characters.
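To illustrate the difference, here is a small sketch using qr// rather than a string eval. The pattern 'a.b' and the test strings are made up:

```perl
use strict;
use warnings;

my $bad_string = 'a.b';   # made-up pattern; "." is a regex metacharacter

my $as_regex   = qr/$bad_string/;      # "." matches any character
my $as_literal = qr/\Q$bad_string\E/;  # "." matches only a literal dot

for my $text ('a.b', 'axb') {
    printf "%-3s  regex: %s  literal: %s\n",
        $text,
        ($text =~ $as_regex   ? 'match' : 'no match'),
        ($text =~ $as_literal ? 'match' : 'no match');
}
```

With \Q the string 'axb' no longer matches, because the dot is treated as an ordinary character.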
I suggest this version of your program which seems to do what you need without the fuss of the eval calls.
use strict;
use warnings;
use Fcntl ':seek';
my $DEF_FILE = 'myfile';
my @bad_strings = do {
open my $fh, '<', $DEF_FILE or die qq(Can't open regex file "$DEF_FILE": $!);
<$fh>;
};
chomp @bad_strings;
my $file = '/tmp/mydirectory/myfile.txt';
open my $fh, '<', $file or die qq(Unable to open "$file" for input: $!);
for my $bad_string (@bad_strings) {
my $regex = qr/$bad_string/;
my ($last_line, $this_line, $do_next) = ('', '', 0);
seek $fh, 0, SEEK_SET;
while (<$fh>) {
($last_line, $this_line) = ($this_line, $_);
if ($this_line =~ $regex) {
print $last_line unless $do_next;
print $this_line;
$do_next = 1;
}
else {
print $this_line if $do_next;
$do_next = 0;
}
}
}

Removing stop words and saving the new file Perl

I have created a Perl file to load in an array of "Stop words".
Then I load in a directory with ".ner" files contained in it.
Each file gets opened and each word is split and compared to the words in the stop file.
If the word matches the word it is changed to "" (nothing-and gets removed)
I then copy the file to another location. So I can differentiate between files with stop words and files without.
But does this change the file to now contain no stop words or will it revert back to the original?
#!/usr/bin/perl
#use strict;
#use warnings;
my @stops;
my @file;
use File::Copy;
open( STOPWORD, "/Users/jen/stopWordList.txt" ) or die "Can't Open: $!\n";
@stops = <STOPWORD>;
while (<STOPWORD>) #read each line into $_
{
chomp @stops; # Remove newline from $_
push @stops, $_; # add the line to @triggers
}
close STOPWORD;
$dirtoget="/Users/jen/temp/";
opendir(IMD, $dirtoget) || die("Cannot open directory");
@thefiles= readdir(IMD);
foreach $f (@thefiles){
if ($f =~ m/\.ner$/){
print $f,"\n";
open (FILE, "/Users/jen/temp/$f")or die"Cannot open FILE";
if ( FILE eq "" ) {
close FILE;
}
else{
while (<FILE>) {
foreach $word(split(/\|/)){
foreach $x (@stops) {
if ($x =~ m/\b\Q$word\E\b/) {
$word = '';
copy("/Users/jen/temp/$f","/Users/jen/correct/$f")or die "Copy failed: $!";
close FILE;
}
}
}
}
}
}
}
closedir(IMD);
exit 0;
The format of the file I am splitting and comparing is as follows:
'<title>|NN|O Woman|NNP|O jumped|VBD|O for|IN|O life|NN|O after|IN|O firebomb|NN|O attack|NN|O -|:|O National|NNP|I-ORG News|NNP|I-ORG ,|,|I-ORG Frontpage|NNP|I-ORG -|:|I-ORG Independent.ie</title>|NNP|'
Should I be outlining where the words should be split ie: split(/|/)?
You should ALWAYS use:
use strict;
use warnings;
Use the three-argument form of open and test the open for failure.
As codaddict said, a split with no arguments is equivalent to split(' ', $_).
Here is a proposal to do the job (as far as I understood what you wanted).
#!/usr/bin/perl
use strict;
use warnings;
use 5.10.1;
my @stops = qw(put here your stop words);
my %stops = map{$_ => 1} @stops;
my @thefiles;
my $path = '/Users/jen/temp/';
my $out = $path.'outputfile';
open my $fout, '>', $out or die "can't open '$out' for writing : $!";
foreach my $file(@thefiles) {
next unless $file =~ /\.ner$/;
open my $fh, '<', $path.$file or die "can't open '$file' for reading : $!";
my @lines = <$fh>; # read from the filehandle, not from the file name
close $fh;
foreach my $line(@lines) {
my @words = split/\|/,$line;
foreach my $word(@words) {
$word = '' if exists $stops{$word};
}
print $fout join '|',@words;
}
}
close $fout;
A split with no arguments is equivalent to split(' ', $_).
Since you want the lines to be split on | you need to do:
split/\|/
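One caveat, as a sketch only: the sample line in the question looks like space-separated tokens of the form word|POS|tag, so splitting the whole line on | yields fields such as "O for" rather than "for". Assuming that token shape, and with made-up stop words, one way to drop whole tokens is to split on whitespace first and on | second:

```perl
use strict;
use warnings;

my %stops = map { $_ => 1 } qw(for after);   # made-up stop words

# Shortened version of the sample line from the question.
my $line = 'Woman|NNP|O jumped|VBD|O for|IN|O life|NN|O after|IN|O firebomb|NN|O';

my @kept;
for my $token (split ' ', $line) {
    my ($word) = split /\|/, $token;   # first field is the word itself
    next if $stops{ lc $word };        # skip the whole token for a stop word
    push @kept, $token;
}

my $filtered = join ' ', @kept;
print "$filtered\n";
```

The hash lookup avoids looping over the whole stop list for every word.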
@jenniem001,
open(FILE, "<", $fh) || die("cant");
undef $/;
my $whole_file = <FILE>;
close FILE;
foreach my $word (@words) { $whole_file =~ s/\b\Q$word\E\b//ig; }
open(FILE, ">>", $duplicate) || die("cant");
print FILE $whole_file;
close FILE;
That will remove the stop words from your file's contents and write them to a duplicate. Just give $duplicate a name :)