Perl - Printing the next line - regex

I am a noob Perl user trying to get my work done ASAP so I can go home on time today :)
Basically I need to print the line that follows each blank line in a text file.
The following is what I have so far. It can locate blank lines perfectly fine. Now I just have to print the next line.
open (FOUT, '>>result.txt');
die "File is not available" unless (@ARGV ==1);
open (FIN, $ARGV[0]) or die "Cannot open $ARGV[0]: $!\n";
@rawData=<FIN>;
$count = 0;
foreach $LineVar (@rawData)
{
if($_ = ~/^\s*$/)
{
print "blank line \n";
#I need something HERE!!
}
print "$count \n";
$count++;
}
close (FOUT);
close (FIN);
Thanks a bunch :)

open (FOUT, '>>result.txt');
die "File is not available" unless (@ARGV == 1);
open (FIN, $ARGV[0]) or die "Cannot open $ARGV[0]: $!\n";
$count = 0;
while(<FIN>)
{
if($_ =~ /^\s*$/)
{
print "blank line \n";
$count++;
$_ = <FIN>; # read the line that follows the blank one
last unless defined $_; # the blank line was the last line of the file
print $_;
}
print "$count \n";
$count++;
}
close (FOUT);
close (FIN);
Not reading the entire file into @rawData saves memory, especially in the case of large files.
Only while(<FIN>) assigns the line it reads to $_ implicitly; a bare <FIN>; statement reads a line and throws it away, so the next line has to be assigned explicitly with $_ = <FIN>;
print; by itself is a synonym for print $_; (although I went for the more explicit variant this time).

Elaborating on Ron Savage's solution:
foreach $LineVar (@rawData)
{
if ( $lastLineWasBlank )
{
print $LineVar;
$lastLineWasBlank = 0;
}
if($LineVar =~ /^\s*$/)
{
print "blank line \n";
#I need something HERE!!
$lastLineWasBlank = 1;
}
print "$count \n";
$count++;
}

I'd go like this but there's probably other ways to do it:
for ( my $i = 0 ; $i < @rawData ; $i++ ){
if ( $rawData[$i] =~ /^\s*$/ ){
print $rawData[$i + 1] if defined $rawData[$i + 1] ; ## nothing to print when the blank line is the last line
}
}
J.

sh> perl -ne 'if ($b) { print }; if ($b = !/\S/) { ++$c }; END { print $c,"\n" }'
Add input filename(s) to your liking.

Add a variable like $lastLineWasBlank, and set it at the end of each loop.
if ( $lastLineWasBlank )
{
print "blank line\n" . $LineVar;
}
something like that. :-)
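The flag idea above can be sketched as follows. This is only an illustration under stated assumptions: the helper name lines_after_blank and the in-memory sample text are made up here and are not part of the original script. Note that with this approach a blank line that directly follows another blank line is itself returned.

```perl
use strict;
use warnings;

# Hypothetical helper: returns every line that directly follows a blank line.
sub lines_after_blank {
    my ($fh) = @_;
    my @found;
    my $last_line_was_blank = 0;
    while (my $line = <$fh>) {
        push @found, $line if $last_line_was_blank;
        # At the end of each iteration, remember whether this line was blank.
        $last_line_was_blank = ($line =~ /^\s*$/) ? 1 : 0;
    }
    return @found;
}

# Usage: an in-memory filehandle stands in for the real input file.
my $text = "first\n\nsecond\nthird\n";
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
print lines_after_blank($fh);
close $fh;
```

Setting the flag last in the loop body means the check at the top always refers to the previous line, which is the point of the suggestion.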

Related

Check if multiple lines exist in text file

Using Perl I would like to check if the two lines highlighted below exist in a text file . Each line is preceded by a tab.
CF=CFU-ALL-PROV-NONE-YES-NO-NONE-YES;
CF=CFB-ALL-PROV-NONE-YES-YES-NONE-YES;
***CF=CFU-TS10-ACT-NONE-YES-NO-NONE-YES;***
CF=CFNRY-ALL-PROV-NONE-YES-YES-NONE-YES;
CF=CFNRC-ALL-PROV-NONE-YES-NO-NONE-YES;
***CF=CFB-TS10-ACT-NONE-YES-NO-NONE-YES;***
CF=CFD-TS10-REG-9124445544-YES-YES;
I am using the following if statement but it is not matched
if (/\t*CF=(CFU-TS10-ACT-(NONE|\d+))/ && /\t*CF=(CFB-TS10-ACT-(NONE|\d+))/)
{
say "this case is found here .....";
}
What am I doing wrong ?
Edited
This is the program I wrote :-
#!/usr/bin/perl
use strict;
use warnings;
use feature 'say';
my $HSSIN='D:\testproject\HSS-export-test-run-small.txt';
my $ofile = 'D:\testproject\HSS-output.txt';
open (INFILE, $HSSIN) or die "Can't open input file";
open (OUTFILE,"> $ofile" ) or die "Cant open file";
my $add;
my $MSISDN;
my $line;
sub callForwardingsCF()
{
if (/\t*CF=(CFU-TS10-ACT-(NONE|\d+))/ && /\t*CF=(CFB-TS10-ACT-(NONE|\d+))/)
{
say "this case is found here .....";
}
} # end sub callForwardingsCF
while (<INFILE>)
{
if (/<SUBEND/)
{
say "SUBEND found";
#$line = $1 if /^\s*MSISDN=(\d+);/;
print OUTFILE "processSingle UpdateCommand GSUB MKEY $line";
print OUTFILE "\n";
}
if ($_ =~ /^\t*MSISDN=(\d+);/)
{ #find MSISDN in file global search
say "STARTER MSISDN is $1";
$MSISDN = $1;
$add = $1;
$line = "$1"; #group 1
}
callForwardingsCF(); #callForwardings
}
close INFILE;
close OUTFILE;
Example of a record in the input file
<BEGINFILE>
<SUBBEGIN
IMSI=232191400029053;
MSISDN=4369050064401;
DEFCALL=TS11;
CURRENTNAM=BOTH;
CAT=COMMON;
TBS=TS11&TS12&TS21&TS22;
VLRLIST=10;
SGSNLIST=10;
SMDP=MSC;
CB=BAOC-ALL-PROV;
CB=BOIC-ALL-PROV;
CB=BOICEXHC-ALL-PROV;
CB=BICROAM-ALL-PROV;
CW=CW-ALL-PROV;
CF=CFU-ALL-PROV-NONE-YES-NO-NONE-YES-65535-NO-NO-NO-NO-NO-NO-NO-NO-NO-NO;
CF=CFB-ALL-PROV-NONE-YES-YES-NONE-YES-65535-NO-NO-NO-NO-NO-NO-NO-NO-NO-NO;
CF=CFU-TS10-ACT-NONE-YES-NO-NONE-YES-65535-YES-YES-NO-NO-NO-NO-NO-NO-NO-NO;
CF=CFNRY-ALL-PROV-NONE-YES-YES-NONE-YES-65535-NO-NO-NO-NO-NO-NO-NO-NO-NO-NO;
CF=CFNRC-ALL-PROV-NONE-YES-NO-NONE-YES-65535-NO-NO-NO-NO-NO-NO-NO-NO-NO-NO;
CF=CFB-TS10-ACT-NONE-YES-NO-NONE-YES-65535-YES-YES-NO-NO-NO-NO-NO-NO-NO-NO;
CF=CFD-TS10-REG-91436903000-YES-YES-25-YES-65535-YES-YES-NO-NO-NO-YES-YES-YES-YES-NO;
TCSISTATE=YES;
OCSISTATE=YES;
CONTROL=SUB;
WPA=0;
GS=HOLD&MPTY&ECT&CLIR&CLIP;
CLIRES=TEMPALLOW;
CLIPOC=NO;
OCSI=10;
CFSMS=ACT-10-914366488325207-YES-YES-NO-NO-NO;
ARD=PROV;
SUBRES=ALLPLMN;
IST_ALERT_TIMER=120;
IST_ALERT_RESPONSE=2;
SUB_AGE=0;
MIMSI=240076400029053-ONELIVE-2-2-1-0-0;
MIMSI=232191400029053-ONELIVE-1-1-1-0-0;
SID=2805158185721065;
MCSISTATE=YES;
CLRBSG=CLIP-YES-NO-NO-NO-NO;
UPLCSLCK=NO;
UPLPSLCK=NO;
DEFOFAID=10;
EPS_PROFILE_ID=1;
TGPPAMBRMAXUL=50000000;
TGPPAMBRMAXDL=150000000;
ARD_EXT=NULL-NULL-NULL-N3GPPNOTALLOWED;
FRAUDTPL_ID=10;
HLR_INDEX=1;
LTEAUTOPROV=NO;
PSSER=1-1-10-1-NONE-DYNAMIC-00000000;
EPSSER=1-10-10-1-NONE-DYNAMIC-00000000-1;
MPS=NO;
<SUBEND
Thanks,
Graham
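Your loop reads the file one line at a time, so each regex is only ever matched against a single line.
So if you are trying to match input that spans multiple lines, you first have to read all of those lines into one string; depending on the pattern you may also need one of the modifiers that change how the regex treats newlines.
See the Perl regex documentation (perlre), the chapter "Modifiers".
With the whole record in $_, you can add the s modifier (it only makes . match a newline, so for these patterns it is optional) and change your if statement to: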
if ( /\t*CF=(CFB-TS10-ACT-(NONE|\d+))/s &&
/\t*CF=(CFU-TS10-ACT-(NONE|\d+))/s ) {
say "found";
}
If you read line by line, both regexes will never match on the same line, so you would need to apply them separately, as already suggested by the other answer.
#$/ = ""; #without paragraph mode
open my $file, '<', 'data_file';
binmode $file;
while(<$file>){
print $_ if ( $_ =~ /\s+CF=CFU-TS10-ACT-NONE-YES-NO-NONE-YES-\d+-YES-YES-NO-NO-NO-NO-NO-NO-NO-NO;/ ||
$_ =~ /\s+CF=CFB-TS10-ACT-NONE-YES-NO-NONE-YES-\d+-YES-YES-NO-NO-NO-NO-NO-NO-NO-NO;/ );
}
EDIT:
OR, you can do it in paragraph mode if conditions allow it.
$/ = "";
open my $file, '<', 'data_file';
binmode $file;
while(<$file>){
(undef, $first) = split (/\s+(CF=CFU-TS10-ACT-NONE-YES-NO-NONE-YES-\d+-YES-YES-NO-NO-NO-NO-NO-NO-NO-NO;)/, $_);
(undef, $second) = split(/\s+(CF=CFB-TS10-ACT-NONE-YES-NO-NONE-YES-\d+-YES-YES-NO-NO-NO-NO-NO-NO-NO-NO;)/, $_ );
print $first . "\n" . $second;
}
Code is tested and seems to work fine with supplied data.
Also, those are not tabs "\t" ... those are spaces "\s+" preceding those lines. Best thing is to learn your data set before you try to parse it ;)
Typically perl processes file "line by line".
Try something like sample script below:
use feature 'say';
my($line1,$line2);
while(<STDIN>) {
$line1=$_ if /\t*CF=(CFU-TS10-ACT-(NONE|\d+))/;
$line2=$_ if /\t*CF=(CFB-TS10-ACT-(NONE|\d+))/;
if( $line1 and $line2 ) {
say "this case is found here .....";
last; # skip processing remaining lines
}
}
Alternatively you may "slurp" whole file into one scalar variable.
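A minimal sketch of the slurp approach, assuming the whole input fits in memory. The helper name has_both_cf and the shortened sample record are made up for illustration. Note that slurping the entire export file checks all subscribers at once; to test each subscriber separately you would have to read one record at a time instead.

```perl
use strict;
use warnings;

# Hypothetical helper: true if both call-forwarding lines occur anywhere in $text.
sub has_both_cf {
    my ($text) = @_;
    # /m lets ^ match at the start of every line inside the slurped string.
    return ( $text =~ /^\s*CF=CFU-TS10-ACT-(?:NONE|\d+)/m
          && $text =~ /^\s*CF=CFB-TS10-ACT-(?:NONE|\d+)/m ) ? 1 : 0;
}

# An in-memory filehandle stands in for the real export file.
my $record = "  CF=CFU-TS10-ACT-NONE-YES;\n"
           . "  CF=CFNRY-ALL-PROV-NONE;\n"
           . "  CF=CFB-TS10-ACT-NONE-YES;\n";
open my $in, '<', \$record or die "Cannot open in-memory file: $!";
my $content = do { local $/; <$in> };   # with $/ undefined, one read returns everything
close $in;

print "this case is found here .....\n" if has_both_cf($content);
```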

Printing the output to a file in perl

I have been trying to find examples related to file handling, but I haven't found anything that will solve my problem.
use strict;
use warnings;
my ($m, $b) = @ARGV;
my $count_args = $#ARGV + 1;
my $times = 2**$m;
my $output = "main fucntion to be called???\n";
open(OUTFILE, "> output1.txt") || die "could not open output file";
print OUTFILE "$output\n"; ## Notice, there is not a comma after the file handle and the string to be output
close(OUTFILE);
main();
sub main {
if ($m =~ /^\d+$/) {
if ($b =~ /^and$/i) {
func_and();
}
else {
print " not supported\n";
}
}
else {
print " enter valid number of pins\n";
}
}
sub func_and {
my $bit;
for (my $i = 0 ; $i < $times ; $i++) {
my $p = sprintf("%0${m}b", $i);
print "$p\t";
my @c = split('', $p);
my $result = 100;
foreach $bit (@c) {
if ($result < 100) {
$result = $result && $bit;
}
else {
$result = $bit;
}
}
print "result is $result \n";
}
}
This program prints its output when I provide 2 as the input, but the output is printed on the screen.
I want to change the file handle STDOUT of this program, that is, I want to print the output to the output1.txt file.
Can you please point out the mistake I am making?
----->
# copy STDOUT to another filehandle
open (my $STDOLD, '>&', STDOUT);
# redirect STDOUT to log.txt
open (STDOUT, '>>', "$ARGV[1]".".log");
print main();
# restore STDOUT
open (STDOUT, '>&', $STDOLD);
print " check .log file\n This should show in the terminal\n";
This works for me, but at the end of the log file the digit "1" is printed and I don't know why. I believe it is the timestamp being printed. I need to remove it. Do you know why this is happening?
It sounds like you want to redirect the output from the program to a file?
To do that from the command line, enter
perl myprogram.pl > myoutput.txt
If you want everything printed to STDOUT to go to a file then just open it as a file handle at the start of your program.
Like this
open STDOUT, '>', 'myoutput.txt' or die $!;
Any file handle you assign to the typeglob *::STDOUT will become STDOUT.
if ( my $fh = FileHandle->new( ">>$stdout_path" )) {
*::STDOUT = $fh;
}
Although I usually only redirect STDERR to STDOUT or a file.
I would try the following (avoiding double quotes and using a "clear then append" approach):
:
:
my $output = "main fucntion to be called???\n";
open(OUTFILE, '> output1.txt') || die "could not clear down output file";
close OUTFILE;
open(OUTFILE, '>> output1.txt') || die "could not open append output file";
print OUTFILE "$output\n"; ## Notice, there is not a comma after the file handle and the string to be output
close OUTFILE;
:
:
No need for parentheses in the close statement. The first open effectively empties the file. The second is an open-extend (or append).
I assume you will then use print OUTFILE ... throughout and move the second close OUTFILE; to the end of your script.
OR...
:
:
my $output = "main fucntion to be called???\n";
open(STDOUT, '> output1.txt') || die "could not clear down output file";
close STDOUT;
open(STDOUT, '>> output1.txt') || die "could not open append output file";
print "$output\n"; ## No file handle needed here: STDOUT now points at output1.txt
:
:
close STDOUT;
:
:
Pretty sure this works - without need for STDOUT on every print command.
From what I understand, You are trying to send everything you "print" to a file. If that is what you want to do, it is fairly simple.
open (OUTPUT, '>output1.txt');
#whatever logic you want to do
print OUTPUT "whatever you want to print","temporary outputs";
#any other logic you have for computing the output
print OUTPUT "output1","output2";
#after completing all print statements, close the file handle
close(OUTPUT);
Basically, you need to give your File Handle (in my example OUTPUT) to "print" and then print like you would do normally. For example, if your original line is
print "not supported\n";
It will become
print OUTPUT "not supported\n";
You can use select to set the default filehandle for print:
open(my $OUTFILE, '>', 'output1.txt') || die "could not open output file: $!";
select $OUTFILE;
print $output;
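A small sketch of how select behaves, assuming you also want to get the terminal back afterwards. The in-memory filehandle and the variable names $captured and $previous are only for illustration; a real script would open output1.txt instead.

```perl
use strict;
use warnings;

my $captured = '';
# An in-memory filehandle stands in for output1.txt in this sketch.
open my $out, '>', \$captured or die "could not open output: $!";

my $previous = select $out;   # select returns the handle that was the default before
print "result is 1\n";        # goes to $out, not to the terminal
select $previous;             # restore the old default (normally STDOUT)
close $out;

print "captured: $captured";
```

Keeping the return value of select avoids hard-coding STDOUT when restoring, which matters if the default handle had already been changed elsewhere.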

Matching a pattern : regex - perl

If I want to find in this file all instances of the words USER and PASS and then put the number of times they appear into the two variables respectively, how would I go about that? Thanks!
open MYFILE, '<', 'source_file.txt' or die $!;
open OUTFILE, '>', 'Header.txt' or die $!;
$user = 0;
$pass = 0;
while (<MYFILE>) {
chomp;
my @header = split (' ',$_);
print OUTFILE "$linenum: @header\n\n";
if (/USER/ig) {
$user++;
}
if (/PASS/ig) {
$pass++;
}
}
Above is the new code and it works.
I set my variables equal to 0 and used the ++ incrementor on the variables.
But I am still open to suggestions perhaps on expanding my regex's capabilities? (if that makes sense)
You could simply do.
my $user = 0;
my $pass = 0;
while (<MYFILE>) {
chomp;
my @header = split ' ', $_;
print OUTFILE "$linenum: @header\n\n";
$user++ if /user/ig;
$pass++ if /pass/ig;
}
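Note that the loop above adds at most one per line, so it counts matching lines. If every occurrence should be counted, one possible sketch is the following; the helper name count_words and the sample text are made up for illustration, and the match is a plain case-insensitive substring match, so "password" also counts as PASS.

```perl
use strict;
use warnings;

# Hypothetical helper: counts every case-insensitive occurrence of each word.
sub count_words {
    my ($fh, @words) = @_;
    my %count = map { $_ => 0 } @words;
    while (my $line = <$fh>) {
        for my $word (@words) {
            # A /g match in list context returns all matches; assigning that
            # list to () and then to a scalar yields the number of matches.
            my $n = () = $line =~ /\Q$word\E/gi;
            $count{$word} += $n;
        }
    }
    return %count;
}

# Usage: an in-memory filehandle stands in for source_file.txt.
my $log = "USER bob PASS x\nuser alice user\npassword\n";
open my $src, '<', \$log or die "Cannot open in-memory file: $!";
my %totals = count_words($src, 'USER', 'PASS');
close $src;

print "USER: $totals{USER}, PASS: $totals{PASS}\n";
```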

Pull regular expressions from file and compare to each line in a file

I found something that I could use on perlmonks.org (http://www.perlmonks.org/?node_id=870806) but I can't get it to work.
I can read the file without issue and build an array. Then, I'd like to compare each index of the array (each regex) to each line of a file, printing out the line before and the line after the matched line.
My code:
# List of regex's. If this file doesn't exist, we can't continue
open ( $fh, "<", $DEF_FILE ) || die ("Can't open regex file: $DEF_FILE");
while (<$fh>) {
chomp;
push (@bad_strings, $_);
}
close $fh || die "Cannot close regex file: $DEF_FILE: $!";
$file = '/tmp/mydirectory/myfile.txt';
eval { open ( $fh, "<", $file ); };
if ($@) {
# If there was an error opening the file, just move on
print "Error opening file: $file.\n";
} else {
# If no error, process the file
foreach $bad_string (@bad_strings) {
$this_line = "";
$do_next = 0;
seek($fh, 0, 0); # move pointer to 0 each time through
while(<$fh>) {
$last_line = $this_line;
$this_line = $_;
my $rege = eval "sub{ \$_[0] =~ $bad_string }"; # Real-time regex
if ($rege->( $this_line )) { # Line 82
print $last_line unless $do_next;
print $this_line;
$do_next = 1;
} else {
print $this_line if $do_next;
$last_line = "";
$do_next = 0;
}
}
}
} # End "if error opening file" check
This was working before when I had just a string per line in the file and performed a simple test such as if ($this_line =~ /$string_to_search_for/i ) but when I switched to regex in the file and a "real-time" eval statement, I now get Can't use string ("") as a subroutine ref while "strict refs" in use at scrub_file.pl line 82 and line 82 is if ($rege->($this_line)) {.
Prior to that error message, I'm receiving: Use of uninitialized value in subroutine entry at scrub_hhsysdump_file.pl line 82, <$fh> I have some understanding of that error message but can't seem to make the perl engine happy with my code thus far.
Still new to perl and always looking for pointers. Thanks in advance.
I fail to see the reason for those eval statements - all they seem to do is make the code a lot more complicated and difficult to debug.
But $rege is undef because eval "sub{ \$_[0] =~ $bad_string }" isn't working, due to the string having a syntax error. I don't know what's in $DEF_FILE, but unless it has properly-delimited regular expressions then you need to add the delimiters in the eval string.
my $rege = eval "sub{ \$_[0] =~ /$bad_string/ }"
may work, but you may need /\Q$bad_string/ instead if the strings in $DEF_FILE contain regex metacharacters and you want them to be treated as literal characters.
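To illustrate the difference, here is a short sketch using qr// instead of a string eval. The sample entry 'a.c' and the variable names are assumptions made for this example only.

```perl
use strict;
use warnings;

my $bad_string = 'a.c';                 # hypothetical entry read from the regex file

my $as_regex   = qr/$bad_string/;       # the dot matches any character
my $as_literal = qr/\Q$bad_string\E/;   # \Q quotes metacharacters: the dot must be a dot

for my $line ('abc', 'a.c') {
    my $r = $line =~ $as_regex   ? 'yes' : 'no';
    my $l = $line =~ $as_literal ? 'yes' : 'no';
    print "$line: as regex $r, as literal $l\n";
}
```

Which form is right depends on whether the file is meant to hold real regular expressions or plain strings.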
I suggest this version of your program which seems to do what you need without the fuss of the eval calls.
use strict;
use warnings;
use Fcntl ':seek';
my $DEF_FILE = 'myfile';
my @bad_strings = do {
open my $fh, '<', $DEF_FILE or die qq(Can't open regex file "$DEF_FILE": $!);
<$fh>;
};
chomp @bad_strings;
my $file = '/tmp/mydirectory/myfile.txt';
open my $fh, '<', $file or die qq(Unable to open "$file" for input: $!);
for my $bad_string (@bad_strings) {
my $regex = qr/$bad_string/;
my ($last_line, $this_line, $do_next) = ('', '', 0);
seek $fh, 0, SEEK_SET;
while (<$fh>) {
($last_line, $this_line) = ($this_line, $_);
if ($this_line =~ $regex) {
print $last_line unless $do_next;
print $this_line;
$do_next = 1;
}
else {
print $this_line if $do_next;
$do_next = 0;
}
}
}

match only once but not more

I want to check existence of server name taken from one file in another one. The idea is that second file holds multiple lines with server name + additional info in each.
So the output for the example name "server01" is
server01
server01
server01
I want to have it only once in the output xls file for each name that exists in both files, of course.
The program so far is:
#!/usr/bin/perl -w
use Spreadsheet::WriteExcel;
#OPEN FILES
open(FILE, "CEP06032012.csv") or die("Unable to open CEP file");
@CEP_file = <FILE>;
close(FILE);
open(FILE, "listsystems_temp") or die("Unable to open listsystems file");
@listsystems_file = <FILE>;
close(FILE);
#XLS properties
my $workbook = Spreadsheet::WriteExcel->new('report.xls');
my $worksheet_servers = $workbook->add_worksheet();
#MAIN
my $r = 0;
foreach my $lines(@CEP_file){
my @CEP_file = split ";", $lines;
my $server = $CEP_file[8];
foreach my $lines2(@listsystems_file){
if ($lines2 =~ m/.*$server.*/i && $server ne ""){
print "$server \n";
#print "$lines2 \n";
$worksheet_servers->write($r, 0, "$server");
$r++;
}
}
}
exit();
any ideas how to change it?
Try this. It skips repeated occurrences of $server and keeps only the first one.
my %seen;
foreach my $lines2(@listsystems_file){
next if $seen{$server};
if ($lines2 =~ m/.*$server.*/i && $server ne ""){
print "$server \n";
#print "$lines2 \n";
$worksheet_servers->write($r, 0, "$server");
$r++;
$seen{$server} = 1;
}
}
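The same idea as a self-contained sketch, without the spreadsheet part. The helper name servers_present and the sample data are made up for illustration; it also uses last to stop scanning the listing after the first hit, and \Q so that characters such as dots in a server name are matched literally.

```perl
use strict;
use warnings;

# Hypothetical helper: returns each server name that appears in at least one
# listing line, once, in the order the names were given.
sub servers_present {
    my ($servers, $listing) = @_;
    my (%seen, @present);
    for my $server (@$servers) {
        next if $server eq '' or $seen{$server};
        for my $line (@$listing) {
            if ($line =~ /\Q$server\E/i) {
                $seen{$server} = 1;
                push @present, $server;
                last;   # one hit is enough; skip the remaining listing lines
            }
        }
    }
    return @present;
}

my @found = servers_present(
    [ 'server01', 'server02', 'server01' ],
    [ "server01 web\n", "server01 db\n", "SERVER03 mail\n" ],
);
print "$_\n" for @found;
```

In the original program, each name returned here would be written to the worksheet once and $r incremented once per name.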