I'm trying to match lines from a file and extract a certain part.
My Regex works with all online testers I could find but not with my perl.
I'm on version v5.10.0 and cannot update.
The regex looks like this:
sub parse_bl_line {
if ($_[0] =~ m/^copy\s+.*?\s+(.*?\_.*)/) {
return $1;
} else {
log_msg("Line discarded: $_[0]", 4);
return "0";
}
}
A couple lines of test data which should match (only the last matches):
@bl_lines = (
"copy xxxxxx_/cpu b_relCAP_R3.0-1_INT5_xxxxx_cpu_p1",
"copy xxxxxxxx_/va_xxx_parameters b_relCAP_R3.0-1_INT5_xxxxx_va_xxx_parameters_p1",
"copy xxxxxxxx_/xxxxxxx_view.tcl b_relCAP_R3.0-1_INT5_xxxxxx_view.tcl_p0",
"copy xxxxx_/xxxxxarchivetool.jar b_relEARLY_DROP1_xxxxxarchivetool.jar_xx");
And calling the function:
foreach(@bl_lines) {
$file=parse_bl_line($_);
if ($file !~ "0") {
log_msg("Line accepted: $_", 4);
log_msg("File extracted: $file", 4);
}else {
log_msg("Line rejected: $_", 2);
}
}
I'm trying to match the last part e.g.
b_relEARLY_DROP1_xxxxxarchivetool.jar_xx
The output looks like the following:
20120726 13:15:34 - [XXX] ERROR: Line rejected: copy xxxxxx_/cpu b_relCAP_R3.0-1_INT5_xxxxx_cpu_p1
20120726 13:15:34 - [XXX] ERROR: Line rejected: copy xxxxxxxx_/va_xxx_parameters b_relCAP_R3.0-1_INT5_xxxxx_va_xxx_parameters_p1
20120726 13:15:34 - [XXX] ERROR: Line rejected: copy xxxxxxxx_/xxxxxxx_view.tcl b_relCAP_R3.0-1_INT5_xxxxxx_view.tcl_p0
20120726 13:15:35 - [XXX] INFO: Line accepted: copy xxxxx_/xxxxxarchivetool.jar b_relEARLY_DROP1_xxxxxarchivetool.jar_xx
20120726 13:15:35 - [XXX] INFO: File extracted: b_relEARLY_DROP1_xxxxxarchivetool.jar_xx
Hint
I did some of the testing that @BaL proposed and found that the pattern matches when I leave out the capturing parentheses:
if ($_[0] =~ m/^copy\s+.+?\s+.+?\_.+$/) {
The test if ($file !~ "0") { is true only when $file does not contain a 0 at any position, which is the case for the last string only.
I guess you want to use if ($file ne '0') { or, even shorter, if ($file) {
Apart from this, you should really always use strict; and use warnings;.
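To make the difference between the two tests concrete, here is a minimal sketch. The helper names accepted_by_regex and accepted_by_ne are made up for illustration; the sample strings come from the question's test data.

```perl
use strict;
use warnings;

# Hypothetical helper: mirrors the question's test, `$file !~ "0"`.
# The string "0" is used as a pattern, so this asks "is there no 0 anywhere?".
sub accepted_by_regex {
    my ($file) = @_;
    return $file !~ "0" ? 1 : 0;
}

# Hypothetical helper: the string comparison suggested above.
sub accepted_by_ne {
    my ($file) = @_;
    return $file ne '0' ? 1 : 0;
}

for my $file (
    'b_relCAP_R3.0-1_INT5_xxxxx_cpu_p1',          # contains a 0 in "R3.0"
    'b_relEARLY_DROP1_xxxxxarchivetool.jar_xx',   # contains no 0 at all
    '0',                                          # the "discarded" marker
) {
    printf "%-42s regex=%d ne=%d\n",
        $file, accepted_by_regex($file), accepted_by_ne($file);
}
```

Only the ne version treats every extracted file name as accepted and rejects just the literal "0" marker.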
What are you trying to match? The last part?
Don't use * if you know that there is something to match; use + instead:
if ($_[0] =~ m/^copy\s+.+?\s+(.+?)$/) {
return $1;
}
I'm guessing that the last line of your test file is the only one that doesn't end with a "\n". Funny little buggers are always getting in the way.....
Change the comparison operator in your if statement from !~ to ne, as you are making a string comparison. When I made this change, all log lines were accepted.
I tested this on perl 5.14.2, not 5.10, but I didn't use any special features. Give it a go! The code is below:
use 5.14.2;
sub log_msg{
say shift;
}
sub parse_bl_line {
if ($_[0] =~ m/^copy\s+.*?\s+(.*?\_.*)/) {
return $1;
}
else {
log_msg("Line discarded: $_[0]", 4);
return "0";
}
}
my @bl_lines = (
"copy xxxxxx_/cpu b_relCAP_R3.0-1_INT5_xxxxx_cpu_p1",
"copy xxxxxxxx_/va_xxx_parameters b_relCAP_R3.0-1_INT5_xxxxx_va_xxx_parameters_p1",
"copy xxxxxxxx_/xxxxxxx_view.tcl b_relCAP_R3.0-1_INT5_xxxxxx_view.tcl_p0",
"copy xxxxx_/xxxxxarchivetool.jar b_relEARLY_DROP1_xxxxxarchivetool.jar_xx"
);
foreach(@bl_lines) {
my $file = parse_bl_line($_);
if ($file ne "0") { # Changed the comparison operator here
log_msg("Line accepted: $_", 4);
log_msg("File extracted: $file", 4);
}
else {
log_msg("Line rejected: $_", 2);
}
}
Related
This code is supposed to find a line with a regular expression and replace the line with "test". It finds that line and replaces it with "test", but it also deletes the line under it, no matter what is in that next line. I feel like I am just missing something about how a switch works in PowerShell.
Note: This is super boiled down code. There is a larger program this is part of.
$reg = '^HI\*BH'
$appendText = ''
$file = Get-ChildItem (join-path $PSScriptRoot "a.txt.BAK")
foreach ($f in $file){
switch -regex -file $f {
$reg
{
$appendText = "test"
}
default {
If ($appendText -eq '') {$appendText = $_}
$appendText
$appendText = ''
}
}
}
a.txt.BAK
HI*BH>00>D8>0*BH>00>D8>0*BH>A1>D8>0*BH>B1>D8>0000000~
HI*BE>02>>>0.00*BE>00>>>0.00~
NM1*71*1*TTT*NAME****XX*0000000~
PRV*AT*PXC*000V00000X~
Output:
test
NM1*71*1*TTT*NAME****XX*0000000~
PRV*AT*PXC*000V00000X~
The switch is not "deleting" anything, but you explicitly ask it to overwrite $appendText on match, and you only ever output (and reset) $appendText when there is no match. So on the line after a match, the default branch prints the pending "test" instead of that line.
This code is supposed to find a line with a regular expression and replace the line with "test".
In that case I suggest you simplify your switch:
switch -regex -file $f {
$reg {
"test"
}
default {
$_
}
}
That's it - no fiddling around with variables - just output "test" on match, otherwise output the line as-is.
If you insist on using the intermediate variable, you'll need to output + reset the value in both cases:
switch -regex -file $f {
$reg {
$appendText = "test"
$appendText
$appendText = ''
}
default {
$appendText = $_
$appendText
$appendText = ''
}
}
I am trying to create a filter so that the if clause only runs when the character in the 4th position is one of "W", "X", "Y", "Z".
The code so far looks like this:
$name= "123Y567"
if ($name.notcontains("Y"))
{
"Action 1"
}
else
{
"Action 2"
}
}
I have tried this as well
$name= "123Y567"
if ($name -notmatch (/^.{4}["Y"]/))
{
"Action 1"
}
else
{
"Action 2"
}
}
What would be the best solution for this? I am new to scripting and PowerShell in general.
Without regex, as Lee_Daily already suggested, use switch on the 4th character:
$name= "123Y567"
switch ($name[3]) {
'W' { Action_1 }
'X' { Action_2 }
'Y' { Action_3 }
'Z' { Action_4 }
}
Or if the action to perform is the same for any of the characters 'W','X','Y','Z' :
$name= "123Y567"
if ('W','X','Y','Z' -contains $name[3] ) { Action_1 }
If you want to use regex to check if the fourth char is a char in a custom range/set you may use
if ($name -cmatch '^.{3}[W-Z]')
Note that PowerShell regex matching with -match is case-insensitive; use -cmatch to make it case-sensitive.
The pattern matches
^ - start of string
.{3} - any 3 chars other than a newline
[W-Z] - a W, X, Y or Z.
Using Perl, I would like to check whether the two lines highlighted below exist in a text file. Each line is preceded by a tab.
CF=CFU-ALL-PROV-NONE-YES-NO-NONE-YES;
CF=CFB-ALL-PROV-NONE-YES-YES-NONE-YES;
***CF=CFU-TS10-ACT-NONE-YES-NO-NONE-YES;***
CF=CFNRY-ALL-PROV-NONE-YES-YES-NONE-YES;
CF=CFNRC-ALL-PROV-NONE-YES-NO-NONE-YES;
***CF=CFB-TS10-ACT-NONE-YES-NO-NONE-YES;***
CF=CFD-TS10-REG-9124445544-YES-YES;
I am using the following if statement but it is not matched
if (/\t*CF=(CFU-TS10-ACT-(NONE|\d+))/ && /\t*CF=(CFB-TS10-ACT-(NONE|\d+))/)
{
say "this case is found here .....";
}
What am I doing wrong ?
Edited
This is the program I wrote :-
#!/usr/bin/perl
use strict;
use warnings;
use feature 'say';
my $HSSIN='D:\testproject\HSS-export-test-run-small.txt';
my $ofile = 'D:\testproject\HSS-output.txt';
open (INFILE, $HSSIN) or die "Can't open input file";
open (OUTFILE,"> $ofile" ) or die "Cant open file";
my $add;
my $MSISDN;
my $line;
sub callForwardingsCF()
{
if (/\t*CF=(CFU-TS10-ACT-(NONE|\d+))/ && /\t*CF=(CFB-TS10-ACT-(NONE|+\d+))/)
{
say "this case is found here .....";
}
} # end sub callForwardingsCFD
while (<INFILE>)
{
if (/<SUBEND/)
{
say "SUBEND found";
#$line = $1 if /^\s*MSISDN=(\d+);/;
print OUTFILE "processSingle UpdateCommand GSUB MKEY $line";
print OUTFILE "\n";
}
if ($_ =~ /^\t*MSISDN=(\d+);/)
{ #find MSISDN in file global search
say "STARTER MSISDN is $1";
$MSISDN = $1;
$add = $1;
$line = "$1"; #group 1
}
callForwardingsCF(); #callForwardings
}
close INFILE;
close OUTFILE;
Example of a record in the input file
<BEGINFILE>
<SUBBEGIN
IMSI=232191400029053;
MSISDN=4369050064401;
DEFCALL=TS11;
CURRENTNAM=BOTH;
CAT=COMMON;
TBS=TS11&TS12&TS21&TS22;
VLRLIST=10;
SGSNLIST=10;
SMDP=MSC;
CB=BAOC-ALL-PROV;
CB=BOIC-ALL-PROV;
CB=BOICEXHC-ALL-PROV;
CB=BICROAM-ALL-PROV;
CW=CW-ALL-PROV;
CF=CFU-ALL-PROV-NONE-YES-NO-NONE-YES-65535-NO-NO-NO-NO-NO-NO-NO-NO-NO-NO;
CF=CFB-ALL-PROV-NONE-YES-YES-NONE-YES-65535-NO-NO-NO-NO-NO-NO-NO-NO-NO-NO;
CF=CFU-TS10-ACT-NONE-YES-NO-NONE-YES-65535-YES-YES-NO-NO-NO-NO-NO-NO-NO-NO;
CF=CFNRY-ALL-PROV-NONE-YES-YES-NONE-YES-65535-NO-NO-NO-NO-NO-NO-NO-NO-NO-NO;
CF=CFNRC-ALL-PROV-NONE-YES-NO-NONE-YES-65535-NO-NO-NO-NO-NO-NO-NO-NO-NO-NO;
CF=CFB-TS10-ACT-NONE-YES-NO-NONE-YES-65535-YES-YES-NO-NO-NO-NO-NO-NO-NO-NO;
CF=CFD-TS10-REG-91436903000-YES-YES-25-YES-65535-YES-YES-NO-NO-NO-YES-YES-YES-YES-NO;
TCSISTATE=YES;
OCSISTATE=YES;
CONTROL=SUB;
WPA=0;
GS=HOLD&MPTY&ECT&CLIR&CLIP;
CLIRES=TEMPALLOW;
CLIPOC=NO;
OCSI=10;
CFSMS=ACT-10-914366488325207-YES-YES-NO-NO-NO;
ARD=PROV;
SUBRES=ALLPLMN;
IST_ALERT_TIMER=120;
IST_ALERT_RESPONSE=2;
SUB_AGE=0;
MIMSI=240076400029053-ONELIVE-2-2-1-0-0;
MIMSI=232191400029053-ONELIVE-1-1-1-0-0;
SID=2805158185721065;
MCSISTATE=YES;
CLRBSG=CLIP-YES-NO-NO-NO-NO;
UPLCSLCK=NO;
UPLPSLCK=NO;
DEFOFAID=10;
EPS_PROFILE_ID=1;
TGPPAMBRMAXUL=50000000;
TGPPAMBRMAXDL=150000000;
ARD_EXT=NULL-NULL-NULL-N3GPPNOTALLOWED;
FRAUDTPL_ID=10;
HLR_INDEX=1;
LTEAUTOPROV=NO;
PSSER=1-1-10-1-NONE-DYNAMIC-00000000;
EPSSER=1-10-10-1-NONE-DYNAMIC-00000000-1;
MPS=NO;
<SUBEND
Thanks,
Graham
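A regex is applied to whatever string you give it; it is the while (<INFILE>) loop that hands it one line at a time.
So if you want both patterns to match the same input, that input has to contain multiple lines (for example a whole record read into one scalar), and you should check which modifiers you need for multi-line strings.
See the Perl regex documentation (perlre), chapter "Modifiers".
The s modifier only changes what . matches, so it is harmless here; with a multi-line string in $_ your if statement would be: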
if ( /\t*CF=(CFB-TS10-ACT-(NONE|\d+))/s &&
/\t*CF=(CFU-TS10-ACT-(NONE|\d+))/s ) {
say "found";
}
If you read linewise, you will never have both of your regexes match the same line, so you would need to apply your regexes separately, as already suggested by the other answer.
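If you keep reading line by line, one way to apply the regexes separately is to remember per record which entries have been seen. This is only a sketch: count_records_with_both is a hypothetical helper, and it assumes that records are delimited by <SUBBEGIN and <SUBEND as in the sample file.

```perl
use strict;
use warnings;

# Hypothetical helper: scans lines one at a time and returns the number of
# <SUBBEGIN ... <SUBEND records that contain both TS10-ACT entries.
sub count_records_with_both {
    my (@lines) = @_;
    my ($seen_cfu, $seen_cfb, $count) = (0, 0, 0);
    for my $line (@lines) {
        # A new record starts: forget what the previous record contained.
        ($seen_cfu, $seen_cfb) = (0, 0) if $line =~ /<SUBBEGIN/;
        $seen_cfu = 1 if $line =~ /^\s*CF=CFU-TS10-ACT-(?:NONE|\d+)/;
        $seen_cfb = 1 if $line =~ /^\s*CF=CFB-TS10-ACT-(?:NONE|\d+)/;
        # The record ends: count it only if both entries were seen.
        $count++ if $line =~ /<SUBEND/ && $seen_cfu && $seen_cfb;
    }
    return $count;
}

my @sample = (
    "<SUBBEGIN\n",
    "\tCF=CFU-TS10-ACT-NONE-YES-NO-NONE-YES;\n",
    "\tCF=CFB-TS10-ACT-NONE-YES-NO-NONE-YES;\n",
    "<SUBEND\n",
    "<SUBBEGIN\n",
    "\tCF=CFU-TS10-ACT-NONE-YES-NO-NONE-YES;\n",
    "<SUBEND\n",
);
print "records with both entries: ", count_records_with_both(@sample), "\n";
```

The \s* prefix is used instead of \t* so that the sketch does not care whether the lines are indented with tabs or spaces.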
#$/ = ""; #without paragraph mode
open my $file, '<', 'data_file';
binmode $file;
while(<$file>){
print $_ if ( $_ =~ /\s+CF=CFU-TS10-ACT-NONE-YES-NO-NONE-YES-\d+-YES-YES-NO-NO-NO-NO-NO-NO-NO-NO;/ ||
$_ =~ /\s+CF=CFB-TS10-ACT-NONE-YES-NO-NONE-YES-\d+-YES-YES-NO-NO-NO-NO-NO-NO-NO-NO;/ );
}
EDIT:
OR, you can do it in paragraph mode if conditions allow it.
$/ = "";
open my $file, '<', 'data_file';
binmode $file;
while(<$file>){
(undef, $first) = split (/\s+(CF=CFU-TS10-ACT-NONE-YES-NO-NONE-YES-\d+-YES-YES-NO-NO-NO-NO-NO-NO-NO-NO;)/, $_);
(undef, $second) = split(/\s+(CF=CFB-TS10-ACT-NONE-YES-NO-NONE-YES-\d+-YES-YES-NO-NO-NO-NO-NO-NO-NO-NO;)/, $_ );
print $first . "\n" . $second;
}
Code is tested and seems to work fine with supplied data.
Also, those are not tabs "\t" ... those are spaces "\s+" preceding those lines. Best thing is to learn your data set before you try to parse it ;)
Perl typically processes a file line by line.
Try something like the sample script below:
use feature 'say';
my($line1,$line2);
while(<STDIN>) {
$line1=$_ if /\t*CF=(CFU-TS10-ACT-(NONE|\d+))/;
$line2=$_ if /\t*CF=(CFB-TS10-ACT-(NONE|\d+))/;
if( $line1 and $line2 ) {
say "this case is found here .....";
last; # skip processing remaining lines
}
}
Alternatively you may "slurp" whole file into one scalar variable.
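A minimal sketch of the slurp approach follows. To keep it self-contained it reads from an in-memory filehandle holding made-up sample data; with a real file you would open the path instead. The helper name slurp is an assumption for this example.

```perl
use strict;
use warnings;

# Sample data standing in for the input file (an assumption for this sketch).
my $data = join "\n",
    "<SUBBEGIN",
    "\tCF=CFU-TS10-ACT-NONE-YES-NO-NONE-YES;",
    "\tCF=CFB-TS10-ACT-NONE-YES-NO-NONE-YES;",
    "<SUBEND",
    "";

# Hypothetical helper: reads everything from a filehandle into one scalar.
sub slurp {
    my ($fh) = @_;
    local $/;              # undefined record separator: read to end of file
    return scalar <$fh>;
}

open my $fh, '<', \$data or die "Can't open in-memory file: $!";
my $content = slurp($fh);
close $fh;

# With the whole input in one string, both patterns can match the same value.
# The /m modifier lets ^ match at the start of every line inside the string.
my $found = ( $content =~ /^\s*CF=CFU-TS10-ACT-(?:NONE|\d+)/m
           && $content =~ /^\s*CF=CFB-TS10-ACT-(?:NONE|\d+)/m ) ? 1 : 0;
print $found ? "this case is found here .....\n" : "not found\n";
```

Note that this checks the file as a whole; if the two lines must belong to the same record, the content would still have to be split per record first.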
Now suppose I have this line in a file:
my %address = (
or any similar line in which I have defined a hash.
I want to find the character "(" in the line and store "address" in, say, $hash_name. How do I do it?
The basic idea is to capture the names of the hashes defined in the files.
What I am trying to do is:
foreach $line <MYFILE> {
if($line =~ /($/ {
How do I proceed further?
Not sure if I understood your problem, but, how about:
my %hash;
while (my $line = <MYFILE>) {
if ($line =~ /\%(\w+)\s*=\s*\($/) {
$hash{$1} = 1;
}
}
open (F1,"inputfile.txt") or die("unable to open inputfile.txt");
my $hash_name;
while (<F1>) {
if (/%(\w+) *= *\(/) {
$hash_name = $1;
print $hash_name;
}
}
DON'T ASK WHY but...
I have a regex that needs to be case insensitive if run on windows BUT case sensitive when run on *nix.
Here is an example snippet of what I am kind-of doing at the moment.
sub relative_path
{
my ($root, $path) = @_;
if ($os eq "windows")
{
# case insensitive with regex option 'i'
if ($path !~ /^\Q$root\E[\\\/](.*)$/i)
{
print "\tFAIL:$root not in $path\n";
}
else
{
return $1;
}
}
else
{
# case sensitive
if ($path !~ /^\Q$root\E[\\\/](.*)$/)
{
print "\tFAIL:$root not in $path\n";
}
else
{
return $1;
}
}
return "";
}
Argh! The repetition hurts my OCD but my perl-fu is weak. Somehow I want to make the regex option 'i' for case-insensitive matching conditional, but I don't know how.
You can use an extended construct to specify the option. For example:
#!/usr/bin/env perl
use warnings; use strict;
my $s = 'S';
print check($s, 'i'), "\n";
print check($s, '-i'), "\n";
sub check {
my ($s, $opt) = @_;
return "Matched" if $s =~ /(?$opt)^s\z/;
return "Did not match";
}
See perldoc perlre.
You can create patterns and store them in scalars using the qr operator:
sub relative_path
{
my ($root, $path) = @_;
my $pattern = ($os eq "windows") ? qr/^\Q$root\E[\\\/](.*)$/i : qr/^\Q$root\E[\\\/](.*)$/;
if ($path !~ $pattern)
{
print "\tFAIL:$root not in $path\n";
}
else
{
return $1;
}
}
This might not be 100% perfect, but hopefully you should get the idea.
Make sure to check out the section "Quote and Quote-Like Operators" in perlop.
EDIT: Okay, here's a DRY solution since people are complaining about it.
sub relative_path
{
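my ($root, $path) = @_;
# Keep the base pattern as a string: a qr// object keeps its own flags when
# interpolated, so wrapping a compiled pattern in qr/.../i would not add /i.
my $base_pattern = "^\Q$root\E[\\\\/](.*)\$";
my $pattern = ($os eq "windows") ? qr/$base_pattern/i : qr/$base_pattern/;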
if ($path !~ $pattern)
{
print "\tFAIL:$root not in $path\n";
}
else
{
return $1;
}
}
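One thing worth knowing when reusing patterns this way: a pattern compiled with qr// carries its own flags, and interpolating it into another regex does not let the outer flags override them. The following sketch illustrates this with a toy pattern; the helper names match_compiled_i and match_source_i are made up for the comparison.

```perl
use strict;
use warnings;

my $compiled = qr/^abc$/;   # compiled without /i, flags are baked in
my $source   = '^abc$';     # plain string, no flags attached

# Hypothetical helper: interpolates the compiled pattern under an outer /i.
sub match_compiled_i {
    my ($s) = @_;
    return $s =~ /$compiled/i ? 1 : 0;
}

# Hypothetical helper: interpolates the pattern source under an outer /i.
sub match_source_i {
    my ($s) = @_;
    return $s =~ /$source/i ? 1 : 0;
}

print "compiled pattern with outer /i matches 'ABC': ", match_compiled_i('ABC'), "\n";
print "string pattern with outer /i matches 'ABC':   ", match_source_i('ABC'), "\n";
```

So if the case sensitivity is decided at run time, either build the pattern from a string or put the (?i) modifier inside the pattern itself.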
In addition to achieving the stated objective, this properly handles volumes unlike the regex patterns previously posted.
use Path::Class qw( dir );
sub relative_path {
my ($root, $path) = @_;
if ($^O =~ /Win32/) {
require Win32;
$root = Win32::GetLongPathName($root);
$path = Win32::GetLongPathName($path);
}
$root = dir($root);
$path = dir($path);
if ($root->subsumes($path)) {
return $path->relative($root);
} else {
print "\tFAIL:$root not in $path\n";
return "";
}
}
By the way, it's not very appropriate to handle the error there. The function should return an error signal (return undef, throw an exception, etc.) and the caller should handle it as it sees fit. Separation of concerns.
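As a rough illustration of that separation, here is a regex-based sketch (not the Path::Class version above) that returns undef on failure and leaves the reporting to the caller. The name relative_path_or_undef and the $ignore_case parameter are assumptions made for this example.

```perl
use strict;
use warnings;

# Hypothetical variant: returns the relative part, or undef when $path is
# not below $root. It prints nothing; the caller decides what to do.
sub relative_path_or_undef {
    my ($root, $path, $ignore_case) = @_;
    # Pattern source as a string: quoted root, one slash or backslash, rest.
    my $source = '^' . quotemeta($root) . '[\\\\/](.*)$';
    my $re = $ignore_case ? qr/$source/i : qr/$source/;
    return $path =~ $re ? $1 : undef;
}

# The caller handles the error signal as it sees fit.
my $rel = relative_path_or_undef('/usr/local', '/usr/local/bin/perl', 0);
if (defined $rel) {
    print "relative: $rel\n";
} else {
    print "\tFAIL: path is not under root\n";
}
```

Returning undef rather than "" also lets the caller tell a failed match apart from an empty relative path.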
You can also do it using local modifiers (perl extended regexes option):
sub relative_path
{
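my ($root, $path) = @_;
# In a double-quoted string the backslashes must be doubled and the final $
# escaped, otherwise $" (the list separator) is interpolated into the pattern.
my $pattern = "^\Q$root\E[\\\\/](.*)\$";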
$pattern = "(?i)$pattern" if ($os eq "windows");
if ($path =~ /$pattern/)
{
return $1;
}
else
{
print "\tFAIL:$root not in $path\n";
}
}
(After I typed my answer I saw that Sinan had also suggested this, but I decided to post mine as well, since it gives a more concrete answer to the question.)