Data extract from MQ output issue in perl code

Data extract from MQ output issue in perl code - regex

I'm having problem with my perl code, the code that I wrote is suppose to grab some information from MQ command dis ql(*) all,
below is one of the output example from above command,
AMQ8409: Display Queue details.
QUEUE(XXX.DATATYPE.NETSTATVM) TYPE(QLOCAL)
ACCTQ(QMGR) ALTDATE(2016-08-01)
ALTTIME(18.33.19) BOQNAME( )
BOTHRESH(0) CLUSNL( )
CLUSTER( ) CLCHNAME( )
CLWLPRTY(0) CLWLRANK(0)
CLWLUSEQ(QMGR) CRDATE(2016-08-01)
CRTIME(18.33.19) CURDEPTH(0)
CUSTOM( ) DEFBIND(OPEN)
DEFPRTY(0) DEFPSIST(YES)
DEFPRESP(SYNC) DEFREADA(NO)
DEFSOPT(SHARED) DEFTYPE(PREDEFINED)
DESCR(Queue for XXX.DataType.netstatvm)
DISTL(NO) GET(ENABLED)
HARDENBO INITQ( )
IPPROCS(1) MAXDEPTH(20000)
MAXMSGL(33554432) MONQ(QMGR)
MSGDLVSQ(PRIORITY) NOTRIGGER
NPMCLASS(NORMAL) OPPROCS(0)
PROCESS( ) PUT(ENABLED)
PROPCTL(COMPAT) QDEPTHHI(80)
QDEPTHLO(20) QDPHIEV(DISABLED)
QDPLOEV(DISABLED) QDPMAXEV(ENABLED)
QSVCIEV(NONE) QSVCINT(999999999)
RETINTVL(999999999) SCOPE(QMGR)
SHARE STATQ(QMGR)
TRIGDATA( ) TRIGDPTH(1)
TRIGMPRI(0) TRIGTYPE(FIRST)
USAGE(NORMAL)
Above output is grab from one of the queue in MQ instead of all queue which the command run.
From above, I want to extract the value from QUEUE, CURDEPTH and MAXDEPTH, as below:-
QUEUE(XXX.DATATYPE.NETSTATVM)
CURDEPTH(0)
MAXDEPTH(20000)
So, I wrote a perl code to obtain the value from QUEUE, CURDEPTH and MAXDEPTH, below is my code,
my $qm = XXX;
open (CHS_OUT, "echo 'dis ql(*) all'|runmqsc $qm|");
while (<CHS_OUT>) {
if ( /QUEUE\(/ ){
my $QueueName =~ /QUEUE/(/\S+)/g;
}
if ( /CURDEPTH\(/ ){
my $CurDepth =~ s/\D//g;
chomp $CurDepth;
print "$CurDepth \n";
}
if ( /MAXDEPTH\(/ ){
my $MaxDepth =~ s/\D//g;
chomp $MaxDepth;
print "$MaxDepth \n";
}
}
The output suppose to be as below,
XXX.DATATYPE.NETSTATVM
0
20000
However, I received a multiple error to extract all of this 3 information, one of the error as below,
Use of uninitialized value $MaxDepth in substitution (s///) at mq_test.pl line 26, line 7361.
Use of uninitialized value $MaxDepth in scalar chomp at mq_test.pl line 27, line 7361.
Use of uninitialized value $MaxDepth in concatenation (.) or string at mq_test.pl line 28, line 7361.
This make me confuse since I already do multiples changes of this code but still not success.

You could use the following regular Expression
(?:QUEUE|CURDEPTH|MAXDEPTH)\(\K[^()]+
See a demo on regex101.com.
That is
(?:QUEUE|CURDEPTH|MAXDEPTH) # one of the alternatives
\( # an opening bracket
\K # "forget" everything
[^()]+ # not (), at least once
In Perl this would be:
my #matches = $str =~ /(?:QUEUE|CURDEPTH|MAXDEPTH)\(\K[^()]+/g;
print "#matches\n";
# XXX.DATATYPE.NETSTATVM
# 0
# 20000

=~ is the binding operator. It binds the left hand side string to the match on the right hand side. But you have my $variable on the LHS - so the string is empty. What you want is to match against the implicit variable, and possibly store a part of the match. This is done by normal assignment in list context:
#!/usr/bin/perl
use warnings;
use strict;
while (<>) {
if ( /QUEUE\(/ ) {
my ($QueueName) = /QUEUE\((\S+)\)/;
print "QN: $QueueName\n";
}
if ( /CURDEPTH\(/ ) {
my ($CurDepth) = /CURDEPTH\((\d+)/;
print "CD: $CurDepth\n";
}
if ( /MAXDEPTH\(/ ) {
my ($MaxDepth) = /MAXDEPTH\((\d+)/;
print "MD: $MaxDepth\n";
}
}
You can combine all the regexes into one, too, and use a hash to store the values keyed by the word before the parenthesis:
#!/usr/bin/perl
use warnings;
use strict;
my %info;
while (<>) {
if (my ($key, $value)
= / ( QUEUE | CURDEPTH | MAXDEPTH ) \( ( [^)]+ ) /x
) {
$info{$key} = $value;
}
}
for my $key (keys %info) {
print "$key: $info{$key}\n";
}

I use the following to do something similar with awk:
echo "DIS QL(*) CURDEPTH MAXDEPTH"|runmqsc $qm | grep -o '^\w\+:\|\w\+[(][^)]\+[)]' | awk -F '[()]' -v OFS='\n' 'function printValues() { if ("QUEUE" in p) { print p["QUEUE"], p["CURDEPTH"], p["MAXDEPTH"], "" } } /^\w+:/ { printValues(); delete p; next } { p[$1] = $2 } END { printValues() }'
Output would look like this:
XXX.DATATYPE.NETSTATVM
0
20000
YYY.DATATYPE.NETSTATVM
50
10000

Related

Dynamically capture regular expression match in Perl

I'm trying to dynamically catch regex matching in Perl. I've known that eval will help me do this but I may be doing something wrong.
Code:
use strict;
use warnings;
my %testHash = (
'(\d+)\/(\d+)\/(\d+)' => '$1$2$3'
);
my $str = '1/12/2016';
foreach my $pattern (keys (%testHash)) {
my $value = $testHash{$pattern};
my $result;
eval {
local $_ = $str;
/$pattern/;
print "\$1 - $1\n";
print "\$2 - $2\n";
print "\$3 - $3\n";
eval { print "$value\n"; }
}
}
Is it also possible to store captured regex patterns in an array?

I believe what you really want is a dynamic version of the following:
say $str =~ s/(\d+)\/(\d+)\/(\d+)/$1$2$3/gr;
String::Substitution provides what we need to achieve that.
use String::Substitution qw( gsub_copy );
for my $pattern (keys(%testHash)) {
my $replacement = $testHash{$pattern};
say gsub_copy($str, $pattern, $replacement);
}
Note that $replacement can also be a callback. This permits far more complicated substitutions. For example, if you wanted to convert 1/12/2016 into 2016-01-12, you could use the following:
'(\d+)/(\d+)/(\d+)' => sub { sprintf "%d-%02d-%02d", #_[3,1,2] },
To answer your actual question:
use String::Substitution qw( interpolate_match_vars last_match_vars );
for my $pattern (keys(%testHash)) {
my $template = $testHash{$pattern};
$str =~ $pattern # Or /$pattern/ if you prefer
or die("No match!\n");
say interpolate_match_vars($template, last_match_vars());
}

I am not completely sure what you want to do here, but I don't think your program does what you think it does.
You are useing eval with a BLOCK of code. That's like a try block. If it dies inside of that eval block, it will catch that error. It will not run your string like it was code. You need a string eval for that.
Instead of explaining that, here's an alternative.
This program uses sprintf and numbers the parameters. The %1$s syntax in the pattern says _take the first argument (1$) and format it as a string (%s). You don't need to localize or assign to $_ to do a match. The =~ operator does that on other variables for you. I also use qr{} to create a quoted regular expression (essentially a variable containing a precompiled pattern) that I can use directly. Because of the {} as delimiter, I don't need to escape the slashes.
use strict;
use warnings;
use feature 'say'; # like print ..., "\n"
my %testHash = (
qr{(\d+)/(\d+)/(\d+)} => '%1$s.%2$s.%3$s',
qr{(\d+)/(\d+)/(\d+) nomatch} => '%1$s.%2$s.%3$s',
qr{(\d+)/(\d+)/(\d\d\d\d)} => '%3$4d-%2$02d-%1$02d',
qr{\d} => '%s', # no capture group
);
my $str = '1/12/2016';
foreach my $pattern ( keys %testHash ) {
my #captures = ( $str =~ $pattern );
say "pattern: $pattern";
if ($#+ == 0) {
say " no capture groups";
next;
}
unless (#captures) {
say " no match";
next;
}
# debug-output
for my $i ( 1 .. $#- ) {
say sprintf " \$%d - %s", $i, $captures[ $i - 1 ];
}
say sprintf $testHash{$pattern}, #captures;
}
I included four examples:
The first pattern is the one you had. It uses %1$s and so on as explained above.
The second one does not match. We check the number of elements in #captured by looking at it in scalar context.
The third one shows that you can also reorder the result, or even use the sprintf formatting.
The last one has no capture group. We check by looking at the index of the last element ($# as the sigil for arrays that usually have an # sigil) in #+, which holds the offsets of the ends of the last successful submatches in the currently active dynamic scope. The first element is the end of the overall match, so if this only has one element, we don't have capture groups.
The output for me is this:
pattern: (?^:(\d+)/(\d+)/(\d\d\d\d))
$1 - 1
$2 - 12
$3 - 2016
2016-12-01
pattern: (?^:(\d+)/(\d+)/(\d+) nomatch)
no match
pattern: (?^:\d)
no capture groups
pattern: (?^:(\d+)/(\d+)/(\d+))
$1 - 1
$2 - 12
$3 - 2016
1.12.2016
Note that the order in the output is mixed up. That's because hashes are not ordered in Perl, and if you iterate over the keys in a hash without sort the order is random.

Apologies! I realized both my question and sample code were both vague. But after reading your suggestions I came of with the following code.
I haven't optimized this code yet and there is a limit to the replacement.
foreach my $key (keys %testHash) {
if ( $str =~ $key ) {
my #matchArr = ($str =~ $key); # Capture all matches
# Search and replace (limited from $1 to $9)
for ( my $i = 0; $i < #matchArr; $i++ ) {
my $num = $i+1;
$testHash{$key} =~ s/\$$num/$matchArr[$i]/;
}
$result = $testHash{$key};
last;
}
}
print "$result\n";

Evaluing the regexp in list context returns the matches. so in your example:
use Data::Dumper; # so we can see the result
foreach my $pattern (keys (%testHash)) {
my #a = ($str =~/$pattern/);
print Dumper(\#a);
}
would do the job.
HTH
Georg

Is it also possible to store captured regex patterns in an array?
Of course it is possible to store captured substrings in an array:
#!/usr/bin/env perl
use strict;
use warnings;
my #patterns = map qr{$_}, qw{
(\d+)/(\d+)/(\d+)
};
my $str = '1/12/2016';
foreach my $pattern ( #patterns ) {
my #captured = ($str =~ $pattern)
or next;
print "'$_'\n" for #captured;
}
Output:
'1'
'12'
'2016'
I do not quite understand what you are trying to do with combinations of local, eval EXPR and eval BLOCK in your code and the purpose of the following hash:
my %testHash = (
'(\d+)\/(\d+)\/(\d+)' => '$1$2$3'
);
If you are trying to codify that this pattern should result in three captures, you can do that like this:
my #tests = (
{
pattern => qr{(\d+)/(\d+)/(\d+)},
ncaptures => 3,
}
);
my $str = '1/12/2016';
foreach my $test ( #tests ) {
my #captured = ($str =~ $test->{pattern})
or next;
unless (#captured == $test->{ncaptures}) {
# handle failure
}
}
See this answer to find out how you can automate counting the number of capture groups in a pattern. Using the technique in that answer:
#!/usr/bin/env perl
use strict;
use warnings;
use Test::More;
my #tests = map +{ pattern => qr{$_}, ncaptures => number_of_capturing_groups($_) }, qw(
(\d+)/(\d+)/(\d+)
);
my $str = '1/12/2016';
foreach my $test ( #tests ) {
my #captured = ($str =~ $test->{pattern});
ok #captured == $test->{ncaptures};
}
done_testing;
sub number_of_capturing_groups {
"" =~ /|$_[0]/;
return $#+;
}
Output:
ok 1
1..1

perl count line in double looping, if match regular expression plus 1

I open a file by putting the line to an array. Inside this file based on the regular expression that contains a duplicate value. If the regular expression is a match I want to count it. The regular expression may look like this
$b =~ /\/([^\/]+)##/. I want to match $1 value.
my #array = do
{
open my $FH, '<', 'abc.txt' or die 'unable to open the file\n';
<$FH>;
};
Below is the way I do, it will get the same line in my file. Thank for help.
foreach my $b (#array)
{
$conflictTemp = 0;
$b =~ /\/([^\/]+)##/;
$b = $1;
#print "$b\n";
foreach my $c (#array)
{
$c =~ /\/([^\/]+)##/;
$c = $1;
if($b eq $c)
{
$conflictTemp ++;
#print "$b , $c \n"
#if($conflictTemp > 1)
#{
# $conflict ++;
#}
}
}
}
Below is the some sample data, two sentences are duplicates
/a/b/c/d/code/Debug/atlantis_digital/c/d/code/Debug/atlantis_digital.map##/main/place.09/2
/a/b/c/d/code/C5537_mem_map.cmd##/main/place.09/0
/a/b/c/d/code/.settings/org.eclipse.cdt.managedbuilder.core.prefs##/main/4
/a/b/c/d/code/.project_initial##/main/2
/a/b/c/d/code/.project##/main/CSS5/5
/a/b/c/d/code/.cproject##/main/CSS5/10
/a/b/c/d/code/.cdtproject##/main/place.09/0
/a/b/c/d/code/.cdtproject##/main/place.09/0
/a/b/c/d/code/.cdtbuild_initial##/main/2
/a/b/c/d/code/.**cdtbuild##**/main/CSS5/2
/a/b/c/d/code/.**cdtbuild##**/main/CSS5/2
/a/b/c/d/code/.ccsproject##/main/CSS5/3

It looks like you're trying to iterate each element of the array, select some data via pattern match, and then count dupes. Is that correct?
Would it not be easier to:
my %count_of;
while ( <$FH> ) {
my ( $val ) = /\/([^\/]+)##/;
$count_of{$val}++;
}
And then, for the variables that have more than one (e.g. there's a duplicate):
print join "\n", grep { $count_of{$_} > 1 } keys %count_of;
Alternatively, if you're just wanting to play 'spot the dupe':
#!/usr/bin/env perl
use strict;
use warnings;
my %seen;
my $match = qr/\/([^\/]+)##/;
while ( <DATA> ) {
my ( $value ) = m/$match/ or next;
print if $seen{$value}++;
}
__DATA__
/a/b/c/d/code/Debug/atlantis_digital/c/d/code/Debug/atlantis_digital.map##/main/place.09/2
/a/b/c/d/code/C5537_mem_map.cmd##/main/place.09/0
/a/b/c/d/code/.settings/org.eclipse.cdt.managedbuilder.core.prefs##/main/4
/a/b/c/d/code/.project_initial##/main/2
/a/b/c/d/code/.project##/main/CSS5/5
/a/b/c/d/code/.cproject##/main/CSS5/10
/a/b/c/d/code/.cdtproject##/main/place.09/0
/a/b/c/d/code/.cdtproject##/main/place.09/0
/a/b/c/d/code/.cdtbuild_initial##/main/2
/a/b/c/d/code/.cdtbuild##/main/CSS5/2
/a/b/c/d/code/.cdtbuild##/main/CSS5/2
/a/b/c/d/code/.ccsproject##/main/CSS5/3

The problem has been solved by the previous answer - I just want to offer an alternate flavour that;
Spells out the regex
Uses the %seen hash to record the line the pattern first appears; to enable
slightly more detailed reporting
use v5.12;
use warnings;
my $regex = qr/
\/ # A literal slash followed by
( # Capture to $1 ...
[^\/]+ # ... anything that's not a slash
) # close capture to $1
## # Must be immdiately followed by literal ##
/x;
my %line_num ;
while (<>) {
next unless /$regex/ ;
my $pattern = $1 ;
if ( $line_num{ $pattern } ) {
say "'$pattern' appears on lines ", $line_num{ $pattern }, " and $." ;
next ;
}
$line_num{ $pattern } = $. ; # Record the line number
}
# Ran on data above will produce;
# '.cdtproject' appears on lines 7 and 8
# '.cdtbuild' appears on lines 10 and 11

Perl regular expressions to capture parts of command output

Edited the question : I have added a third function. I have come up with a regex which seems to be correct . (1st and 2nd function work as expected. )
I have written a couple of functions which are in a library, and I call the functions from my test script. I'm having some issues with the regular expressions. Can somebody help me out with the regular expressions?
Function 1:
sub ipsec_version {
my ($self) = #_;
my $cmd = 'sudo -s ipsec version ';
my $version = 0;
#execute the command
$self->execute($cmd);
foreach my $line ( #{ $self->get_stdout() } ) {
if ( $line =~ m/strongSwan/msx ) {
$version = $1;
}
}
return $version;
}
Function call:
$self->{'ipsec_version'} = $self->{'ipsec_obj'}->ipsec_version();
INFO('[Startup] ipsec version is : ' . $self->{'ipsec_version'} );
Actual output:
Use of uninitialized value in concatenation (.) or string at ... line 37.
ipsec version is :
Expected output:
strongSwan U5.1.2/K3.16.0-30-generic
Command output:
I need the script to capture the expected output string from this
Linux strongSwan U5.1.2/K3.16.0-30-generic
Institute for Internet Technologies and Applications
University of Applied Sciences Rapperswil, Switzerland
See 'ipsec --copyright' for copyright information.
Function 2:
sub ipsec_status {
my ($self,$connection_name) = #_;
my $cmd = 'sudo -s ipsec status ' . $connection_name;
my $status = 0;
#execute the command
$self->execute($cmd);
foreach my $line ( #{ $self->get_stdout() } ) {
if ( $line =~ m/Security\sassociations\d\()/ ) {
$status = $1;
}
}
return $status;
}
Function call:
$self->{'ipsec_status'} = $self->{'ipsec_obj'}->ipsec_status('connection');
('[Startup] ipsec status is : ' . $self->{'ipsec_status'} );
Actual output:
INFO [Startup] ipsec status is : 0
Expected output:
Security Associations (1 up, 0 connecting)
Command output:
I need the script to capture the expected output string from this
Security Associations (1 up, 0 connecting):
connection[3]: ESTABLISHED 3 seconds ago, 1.1.1.19[1.1.1.19]...10.81.1.50[10.81.1.50]
connection{3}: INSTALLED, TUNNEL, ESP in UDP SPIs: cb343e86_i abf6d1f2_o
Function 3 :
sub ipsec_restart {
my ($self) = #_;
my $cmd = 'sudo -s ipsec restart';
my $restart = 0;
$self->execute($cmd);
foreach my $line ( #{ $self->get_stdout() } ) {
if ( $line =~ /(Starting strongSwan.*IPsec$)/ ) {
$restart = $1;
last;
}
}
return $restart;
}
Function call :
$self->{'ipsec_restart'} = $self->{'ipsec_obj'}->ipsec_restart();
('[Startup] ipsec restart status is : ' . $self->{'ipsec_restart'} );
Expected output : See the highlighted text below.
I checked in https://regex101.com/ . My regex seems to be correct. /(Starting strongSwan.*IPsec$)/
Starting strongSwan 5.1.2 IPsec
Actual output is : 0

The problem is that you're using $1 which is set only if there are captures in the regular expression that you are using. Also, you should add a last so that you don't go on searching for a matching line once you have found one. lastly, it is wrong to habitually add the /m, /s and /x modifiers to every regex match regardless of whether they make any difference, and a leading m/ is only necessary if you are using something other than the default slash delimiters.
The first function should contain
for my $line ( #{ $self->get_stdout } ) {
if ( $line =~ /(strongSwan.*\S)/i ) {
$version = $1;
last;
}
}
and the second function is similar, although it looks like you're trying to match Security associations immediately followed by a digit, which isn't what is in the text. You also have a small a instead of a capital, which is fine if you also use an /i modifier. This captures Security Associations up to and including the last closing parenthesis on the same line. Is that what you need?
for my $line ( #{ $self->get_stdout } ) {
if ( $line =~ /(Security\s+Associations.*\))/i ) {
$status = $1;
last;
}
}

For function 1, assuming that all that changes is the version number, we can use
m/(strongSwan.*?) Institute/ms
as the regular expression. This will match everything from strongSwan to Institute, in a non-greedy way, and store all of it except Institute in $1.
For function 2, we will use the fact that the unknown data is contained in parentheses.
m/(Security associations \(.*?\))/

Using iterated variables with regex

The point of the overall script is to:
step 1) open a single column file and read off first entry.
step 2) open a second file containing lots of rows and columns, read off EACH line one at a time, and find anything in that line that matches the first entry from the first file.
step3) if a match is found, then "do something constructive", and if not, go to the first file and take the second entry and repeat step 2 and step 3, and so on...
here is the script:
#!/usr/bin/perl
use strict; #use warnings;
unless(#ARGV) {
print "\usage: $0 filename\n\n"; # $0 name of the program being executed
exit;
}
my $list = $ARGV[0];
chomp( $list );
unless (open(LIST, "<$list")) {
print "\n I can't open your list of genes!!! \n";
exit;
}
my( #list ) = (<LIST>);
close LIST;
open (CHR1, "<acembly_chr_sorted_by_exon_count.txt") or die;
my(#spreadsheet) = (<CHR1>);
close CHR1;
for (my $i = 0; $i < scalar #list; $i++ ) {
print "$i in list is $list[$i]\n";
for (my $j = 1; $j < scalar #spreadsheet; $j++ ) {
#print "$spreadsheet[$j]\n";
if ( $spreadsheet[$j] ) {
print "will $list[$i] match with $spreadsheet[$j]?\n";
}
else { print "no match\n" };
} #for
} #for
I plan to use a regex in the line if ( $spreadsheet[$j] ) { but am having a problem at this step as it is now. On the first interation, the line print "will $list[$i] match with $spreadsheet[$j]?\n"; prints $list[$i] OK but does not print $spreadsheet[$j]. This line will print both variables correctly on the second and following iterations. I do not see why?

At first glance nothing looks overtly incorrect. As mentioned in the comments the $j = 1 looks questionable but perhaps you are skipping the first row on purpose.
Here is a more perlish starting point that is tested. If it does not work then you have something going on with your input files.
Note the extended trailing whitespace removal. Sometimes if you open a WINDOWS file on a UNIX machine and use chomp, you can have embedded \r in your text that causes weird things to happen to printed output.
#!/usr/bin/perl
use strict; #use warnings;
unless(#ARGV) {
print "\usage: $0 filename\n\n"; # $0 name of the program being executed
exit;
}
my $list = shift;
unless (open(LIST, "<$list")) {
print "\n I can't open your list of genes!!! \n";
exit;
}
open(CHR1, "<acembly_chr_sorted_by_exon_count.txt") or die;
my #spreadsheet = map { s/\s+$//; $_ } <CHR1>;
close CHR1;
# s/\s+$//; is like chomp but trims all trailing whitespace even
# WINDOWS files opened on a UNIX system.
for my $item (<LIST>) {
$item =~ s/\s+$//; # trim all trailing whitespace
print "==> processing '$item'\n";
for my $row (#spreadsheet) {
if ($row =~ /\Q$item\E/) { # see perlre for \Q \E
print "match '$row'\n";
}
else {
print "no match '$row'\n";
}
}
}
close LIST;

Negate regular expression in Perl

I am splitting a text file into blocks in order to extract those blocks which do not contain a certain line by using a regular expression.
The text file looks like this:
[Term]
id: id1
name: name1
xref: type1:aab
xref: type2:cdc
[Term]
id: id2
name: name2
xref: type1:aba
xref: type3:fee
Someone helped me a few days ago by showing me how to extract those blocks which do contain a certain regular expression (for example "xref: type3"):
while (<MYFILE>) {
BEGIN { $/ = q|| }
my #lines = split /\n/;
for my $line ( #lines ) {
if ( $line =~ m/xref:\s*type3/ ) {
printf NEWFILE qq|%s|, $_;
last;
}
}
}
Now I want to write all blocks in a new file which do not contain "xref: type3". I tried to do this by simply negating the regex
if ( $line !~ m/xref:\s*type3/ )
or alternatively by negating the if statement by using
unless ( $line =~ m/xref:\s*type3/ )
Unfortunately it doesn't work - the output file is the same as the the original one. Any ideas what I'm doing wrong?

You have:
For every line, print this block if this line doesn't match the pattern.
But you want:
For every line, print this line if none of the other lines in the block match the pattern.
As such, you can't start printing the block before you examined every line in the block (or at all lines until you find a matching line).
local $/ = q||;
while (<MYFILE>) {
my #lines = split /\n/;
my $skip = 0;
for my $line ( #lines ) {
if ( $line =~ m/^xref:\s*type3/ ) {
$skip = 1;
last;
}
}
if (!$skip) {
for my $line ( #lines ) {
print NEWFILE $line;
}
}
}
But there's no need to split into lines. We can check and print the whole block at once.
local $/ = q||;
while (<MYFILE>) {
print NEWFILE $_ if !/^xref:\s*type3/m;
}
(Note the /m to make ^ match the start of any line.)

The problem is that you are using unless with !~ which is interpreted as if $line does not NOT match do this. ( a double negative )
When using the unless block with the normal pattern matching operator =~ you code worked perfectly, that is I see the first block as output because it does not contain type3.
LOOP:
while (<$MYFILE>) {
BEGIN { $/ = q|| }
my #lines = split /\n/;
for my $line ( #lines ) {
unless ( $line =~ m/xref:\s*type3/ ) {
printf qq|%s|, $_;
last LOOP;
}
}
}
# prints
# [Term]
# id: id1
# name: name1
# xref: type1:aab
# xref: type2:cdc

Do not process the records line by line. Use a paragraph mode:
{ local $/ = q();
while (<MYFILE>) {
if (! /xref:\s*type3/ ) {
printf NEWFILE qq|%s|, $_;
last;
}
}

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

Data extract from MQ output issue in perl code - regex

Related

Dynamically capture regular expression match in Perl

perl count line in double looping, if match regular expression plus 1

Perl regular expressions to capture parts of command output

Using iterated variables with regex

Negate regular expression in Perl

Categories

Resources