match words in a string - regex

I have the following code:
$string = "hello to all world";
$strings_compare = tomorrow, hello, world;
$string_arrays =split(',',$strings_compare);
for ($i=0; $i<count($string_arrays); $i++){
$resultado = preg_match("/$string_arrays[$i]/",$string);
if($resultado == false){
echo "no match";
}else {
echo "match";
}
}
With this code the results are:
no match, match, no match
but the results should be: no match, match, match. What is my error?
If I change $string to $string='say hello to all world now',
the results are match, match, match, which is OK.

It works fine for me when I'm using valid array syntax.
<?php
$string = "hello to all world";
$string_arrays = array("tomorrow", "hello", "world");
for ($i=0; $i<count($string_arrays); $i++) {
$resultado = preg_match("/$string_arrays[$i]/",$string);
if(!$resultado) {
echo "no match";
} else {
echo "match";
}
}
Returns
no matchmatchmatch

Try this:
$resultado = preg_match("/".preg_quote($string_arrays[$i])."/",$string);
Also:
$string_arrays = array("tomorrow", "hello", "world");
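One thing worth checking in the original code: splitting "tomorrow, hello, world" on "," alone leaves a leading space on " hello" and " world", so the patterns are not the bare words. A minimal sketch of the trim-then-escape idea, written in Python purely for illustration (re.escape stands in for preg_quote, and the helper name contains is made up):

```python
import re

text = "hello to all world"
raw = "tomorrow, hello, world"

# Splitting on "," alone keeps the space after each comma (" hello", " world"),
# so each piece is stripped before it is used as a pattern.
words = [w.strip() for w in raw.split(",")]


def contains(word: str, haystack: str) -> bool:
    # re.escape plays the role preg_quote plays in PHP: the word is matched literally.
    return re.search(re.escape(word), haystack) is not None


results = [contains(w, text) for w in words]
print(results)
```

With the words trimmed, only "tomorrow" fails to match.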

Related

allocating a string based on PHP if statement

I am quite new to PHP, and this seems to be causing me problems:
<?php
$ppe1="Water";
echo $ppe1;
if ($ppe1="Tap") {
$dog="time";
}
else
{$dog="travel";}
echo $dog;
?>
I am obviously being very stupid, but I cannot seem to assign the $dog string based on the if statement.
You should read up on Comparison Operators.
In short: You are not comparing but assigning a value with =.
<?php
$ppe1 = "Water";
$dog = $ppe1 === "Tap" ? "time" : "travel";
echo $dog;
This uses the ternary operator, which has the form $variable = condition ? value_if_true : value_if_false;
Or with your code structure:
<?php
$ppe1 = "Water";
$dog = "";
if($ppe1 === "Tap")
{
$dog = "time";
}
else
{
$dog = "travel";
}
echo $dog;
Take a look at the Comparison Operators and use the following code. Also, since you said you are a beginner, I recommend reading this.
<?php
$ppe1 = "Water";
echo $ppe1; // "Water"
if ($ppe1 == "Tap") {
$dog = "time";
} else {
$dog="travel";
}
echo $dog; // "travel"
?>
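For comparison, here is the same branch as a small Python sketch (Python is used only as an illustration, and pick_dog is a made-up name). Python rejects `if ppe1 = "Tap":` as a syntax error, so the assignment-instead-of-comparison slip cannot happen silently there; the conditional expression below corresponds to the ternary form.

```python
def pick_dog(ppe1: str) -> str:
    # "==" compares the two strings; a single "=" would be an assignment.
    return "time" if ppe1 == "Tap" else "travel"


print(pick_dog("Water"))
print(pick_dog("Tap"))
```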

Can you compare a string to a "template" string using grep?

my @array = (
'There were \d* errors that occurred',
'Your system exploded because \.*',
);
my $error = 'There were 22 errors that occurred';
if (grep(/$error/, @array)) {
print 'That error is ok, continue...';
} else {
die;
}
Is there any way in perl to compare a full string to a string containing regex?
Like in this example I'd want both $error = 'There were 22 errors that occurred' and $error = 'There were 12341235 errors that occurred' to be compared to a kind of "template" string and have a boolean returned if it matches. Using grep is probably not possible, I guess.
Maybe something like this that actually works:
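my @s = ('there were \d* errors');
print _error_checker(@s, 'there were 10 errors');
sub _error_checker {
# An array in a list assignment swallows every argument, so take the text off the end first.
my $text = pop @_;
my @acceptable_errors = @_;
foreach my $error (@acceptable_errors) {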
if ($text =~ /$error/) {
return 1;
}
}
return 0;
}
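You were close; you just need to invert the test in the grep, so that the error message is matched against each pattern rather than each pattern against the message.
my @ok_errors = (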
'There were \d* errors that occurred',
'Your system exploded because \.*',
);
my $errmsg = 'There were 22 errors that occurred';
if (grep {$errmsg =~ /$_/} @ok_errors) {
print 'That error is ok, continue...';
} else {
die;
}
Additionally, you can precompile the regular expressions using qr{}:
my @ok_errors = (
qr{There were \d* errors that occurred},
qr{Your system exploded because \.*},
);
my $errmsg = 'There were 22 errors that occurred';
if (grep {$errmsg =~ $_} @ok_errors) {
print 'That error is ok, continue...';
} else {
die;
}
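The same "is this message covered by any template" check can be sketched in Python, shown here only as an illustration of the idea. The patterns are the ones from the question; the names OK_ERRORS and is_ok_error are made up.

```python
import re

# Precompiled once, much like the qr{} list above.
OK_ERRORS = [
    re.compile(r"There were \d* errors that occurred"),
    re.compile(r"Your system exploded because \.*"),
]


def is_ok_error(message: str) -> bool:
    # True as soon as any template matches somewhere in the message.
    return any(pattern.search(message) for pattern in OK_ERRORS)


print(is_ok_error("There were 22 errors that occurred"))
print(is_ok_error("Disk full"))
```

Like the grep version, this matches anywhere in the message; anchoring with ^ and $ (or using fullmatch in Python) would be needed if the whole string has to fit the template.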

Perl Regex not matching

I'm trying to match lines from a file and extract a certain part.
My Regex works with all online testers I could find but not with my perl.
I'm on version v5.10.0 and cannot update.
The regex looks like this:
sub parse_bl_line {
if ($_[0] =~ m/^copy\s+.*?\s+(.*?\_.*)/) {
return $1;
} else {
log_msg("Line discarded: $_[0]", 4);
return "0";
}
}
A couple lines of test data which should match (only the last matches):
@bl_lines = (
"copy xxxxxx_/cpu b_relCAP_R3.0-1_INT5_xxxxx_cpu_p1",
"copy xxxxxxxx_/va_xxx_parameters b_relCAP_R3.0-1_INT5_xxxxx_va_xxx_parameters_p1",
"copy xxxxxxxx_/xxxxxxx_view.tcl b_relCAP_R3.0-1_INT5_xxxxxx_view.tcl_p0",
"copy xxxxx_/xxxxxarchivetool.jar b_relEARLY_DROP1_xxxxxarchivetool.jar_xx");
And calling the function:
foreach(@bl_lines) {
$file=parse_bl_line($_);
if ($file !~ "0") {
log_msg("Line accepted: $_", 4);
log_msg("File extracted: $file", 4);
}else {
log_msg("Line rejected: $_", 2);
}
}
I'm trying to match the last part e.g.
b_relEARLY_DROP1_xxxxxarchivetool.jar_xx
The output looks like this:
20120726 13:15:34 - [XXX] ERROR: Line rejected: copy xxxxxx_/cpu b_relCAP_R3.0-1_INT5_xxxxx_cpu_p1
20120726 13:15:34 - [XXX] ERROR: Line rejected: copy xxxxxxxx_/va_xxx_parameters b_relCAP_R3.0-1_INT5_xxxxx_va_xxx_parameters_p1
20120726 13:15:34 - [XXX] ERROR: Line rejected: copy xxxxxxxx_/xxxxxxx_view.tcl b_relCAP_R3.0-1_INT5_xxxxxx_view.tcl_p0
20120726 13:15:35 - [XXX] INFO: Line accepted: copy xxxxx_/xxxxxarchivetool.jar b_relEARLY_DROP1_xxxxxarchivetool.jar_xx
20120726 13:15:35 - [XXX] INFO: File extracted: b_relEARLY_DROP1_xxxxxarchivetool.jar_xx
Hint
I did some of the testing that @BaL proposed and found out that the pattern matching works without the capturing parentheses.
if ($_[0] =~ m/^copy\s+.+?\s+.+?\_.+$/) {
The test if ($file !~ "0") { is true only when $file does not contain a 0 at any position, which is the case for the last string only.
I guess you want to use if ($file ne '0') { or, even shorter, if ($file) {
Apart from this, you should really always use strict; and use warnings;
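To make the difference concrete, here is a small Python sketch (an illustration only; the function names are made up) that contrasts "contains a 0 somewhere" with "is exactly the string 0", using two of the extracted file names from the question:

```python
import re


def contains_zero(value: str) -> bool:
    # What the regex test looks at: is there a "0" anywhere in the value?
    return re.search("0", value) is not None


def is_zero_marker(value: str) -> bool:
    # What was intended: is the value exactly the failure marker "0"?
    return value == "0"


names = [
    "b_relCAP_R3.0-1_INT5_xxxxx_cpu_p1",
    "b_relEARLY_DROP1_xxxxxarchivetool.jar_xx",
]

for name in names:
    print(name, contains_zero(name), is_zero_marker(name))
```

The first name contains a 0 (in R3.0), so a substring test treats it like the failure marker even though it is a valid result.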
What are you trying to match? The last part?
Don't use * if you know that you have something to match; use + instead:
if ($_[0] =~ m/^copy\s+.+?\s+(.+?)$/) {
return $1;
}
I'm guessing that the last line of your test file is the only one that doesn't end with a "\n". Funny little buggers are always getting in the way.....
Change the comparison operator in your if statement from !~ to ne, as you are making a string comparison. When I made this change, all log lines were accepted.
I tested this on perl 5.14.2, not 5.10, but I didn't use any special features. Give it a go! The code is below:
use 5.14.2;
sub log_msg{
say shift;
}
sub parse_bl_line {
if ($_[0] =~ m/^copy\s+.*?\s+(.*?\_.*)/) {
return $1;
}
else {
log_msg("Line discarded: $_[0]", 4);
return "0";
}
}
my @bl_lines = (
"copy xxxxxx_/cpu b_relCAP_R3.0-1_INT5_xxxxx_cpu_p1",
"copy xxxxxxxx_/va_xxx_parameters b_relCAP_R3.0-1_INT5_xxxxx_va_xxx_parameters_p1",
"copy xxxxxxxx_/xxxxxxx_view.tcl b_relCAP_R3.0-1_INT5_xxxxxx_view.tcl_p0",
"copy xxxxx_/xxxxxarchivetool.jar b_relEARLY_DROP1_xxxxxarchivetool.jar_xx"
);
foreach(@bl_lines) {
my $file = parse_bl_line($_);
if ($file ne "0") { # Changed the comparison operator here
log_msg("Line accepted: $_", 4);
log_msg("File extracted: $file", 4);
}
else {
log_msg("Line rejected: $_", 2);
}
}

How to make perl regex options conditional

DON'T ASK WHY but...
I have a regex that needs to be case insensitive if run on windows BUT case sensitive when run on *nix.
Here is an example snippet of what I am kind-of doing at the moment.
sub relative_path
{
my ($root, $path) = @_;
if ($os eq "windows")
{
# case insensitive with regex option 'i'
if ($path !~ /^\Q$root\E[\\\/](.*)$/i)
{
print "\tFAIL:$root not in $path\n";
}
else
{
return $1;
}
}
else
{
# case sensitive
if ($path !~ /^\Q$root\E[\\\/](.*)$/)
{
print "\tFAIL:$root not in $path\n";
}
else
{
return $1;
}
}
return "";
}
Argh! The repetition hurts my OCD but my perl-fu is weak. Somehow I want to make the regex option 'i' (case insensitive) conditional, but I don't know how.
You can use an extended construct to specify the option. For example:
#!/usr/bin/env perl
use warnings; use strict;
my $s = 'S';
print check($s, 'i'), "\n";
print check($s, '-i'), "\n";
sub check {
my ($s, $opt) = @_;
return "Matched" if $s =~ /(?$opt)^s\z/;
return "Did not match";
}
See perldoc perlre.
You can create patterns and store them in scalars using the qr operator:
sub relative_path
{
my ($root, $path) = @_;
my $pattern = ($os eq "windows") ? qr/^\Q$root\E[\\\/](.*)$/i : qr/^\Q$root\E[\\\/](.*)$/;
if ($path !~ $pattern)
{
print "\tFAIL:$root not in $path\n";
}
else
{
return $1;
}
}
This might not be 100% perfect, but hopefully you should get the idea.
Make sure to check out the section "Quote and Quote-Like Operators" in perlop.
EDIT: Okay, here's a DRY solution since people are complaining about it.
sub relative_path
{
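my ($root, $path) = @_;
# Keep the base pattern as a string: a compiled qr// object keeps its own flags,
# so wrapping it in qr/.../i would not make it case insensitive.
my $base_pattern = '^' . quotemeta($root) . '[\\\\/](.*)$';
my $pattern = ($os eq "windows") ? qr/$base_pattern/i : qr/$base_pattern/;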
if ($path !~ $pattern)
{
print "\tFAIL:$root not in $path\n";
}
else
{
return $1;
}
}
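The "choose the option once, then run a single match" idea carries over to other regex engines as well. A minimal Python sketch, for illustration only: os_name is a hypothetical parameter standing in for $os, and None is returned on failure instead of printing.

```python
import re


def relative_path(root: str, path: str, os_name: str):
    # Decide the flag once instead of duplicating the whole match.
    flags = re.IGNORECASE if os_name == "windows" else 0
    # re.escape plays the role of \Q...\E; [\\/] accepts either separator.
    pattern = re.compile(r"^" + re.escape(root) + r"[\\/](.*)$", flags)
    match = pattern.match(path)
    return match.group(1) if match else None


print(relative_path("C:\\Work", "c:\\work\\src\\a.txt", "windows"))
print(relative_path("/home/U", "/home/u/a", "linux"))
```

Returning None leaves it to the caller to decide how to report a path that is outside the root.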
In addition to achieving the stated objective, this properly handles volumes unlike the regex patterns previously posted.
use Path::Class qw( dir );
sub relative_path {
my ($root, $path) = @_;
if ($^O =~ /Win32/) {
require Win32;
$root = Win32::GetLongPathName($root);
$path = Win32::GetLongPathName($path);
}
$root = dir($root);
$path = dir($path);
if ($root->subsumes($path)) {
return $path->relative($root);
} else {
print "\tFAIL:$root not in $path\n";
return "";
}
}
By the way, it's not very appropriate to handle the error there. The function should return an error signal (return undef, throw an exception, etc.) and the caller should handle it as it sees fit. Separation of concerns.
You can also do it using local modifiers (perl extended regexes option):
sub relative_path
{
my ($root, $path) = @_;
# Built with single quotes so the backslashes reach the regex engine unchanged.
my $pattern = '^' . quotemeta($root) . '[\\\\/](.*)$';
$pattern = "(?i)$pattern" if ($os eq "windows");
if ($path =~ /$pattern/)
{
return $1;
}
else
{
print "\tFAIL:$root not in $path\n";
}
}
(After I typed my answer I saw that Sinan had also suggested this, but I decided to post my answer as well, since it gives a more concrete answer to the question.)

Text Pattern Processing in paragraph with unix linux utilities

I have a file with the following pattern (please note this is a file generated using sed,
awk, grep etc processing). The part of file input is as follows.
filename1,
BASE=a/b/c
CONFIG=$BASE/d
propertiesfile1=$CONFIG/e.properties
EndOfFilefilename1
filename2,
BASE=f/g/h
CONFIG=$BASE/i
propertiesfile1=$CONFIG/j.properties
EndOfFilefilename2
filename3,
BASE=k/l/m
CONFIG=$BASE/n
propertiesfile1=$CONFIG/o.properties
EndOfFilefilename3
I want the output like
filename1,a/b/c/d/e.properties,
filename2,f/g/h/i/j.properties,
filename3,k/l/m/n/o.properties,
I could not find a solution with sed, awk, or grep, so I am stuck. Please let me know if you know a solution with these Unix utilities or any other language or platform.
Regards,
Suhaas
Assuming you generated the original file, and therefore it is safe to execute it as a script:
sed -e 's/^.*,/FILE=&/' \
-e 's/^.*=\$CONFIG/PROPFILE=$CONFIG/' \
-e 's/^EndOfFile.*/echo $FILE $PROPFILE/' < yourInputFile | sh
This converts each section of your file into the form:
FILE=filename1,
BASE=a/b/c
CONFIG=$BASE/d
PROPFILE=$CONFIG/e.properties
echo $FILE $PROPFILE
... and then sends it into a shell for processing.
Line-by-line explanation:
Line 1: Searches for the lines ending in a comma (the filenames), and sets FILE to the name.
Line 2: Searches for lines that set the properties file, and renames the variable to PROPFILE.
Line 3: Replaces the EndOfFile lines with a command to echo the file name and the properties file, then pipes it into a shell.
This is an excellent use case for structural regular expressions, which have been implemented as a Python library, amongst other places. Here's an article which describes how to emulate SREs in Perl.
And here is an awk script to process that input and generate what you want:
BEGIN {
FS="="
state = 0;
base = "";
config = "";
prop = "";
filename = "";
dbg = 0;
}
/^BASE=/ {
if (dbg) {
print "BASE";
print $0;
}
if (state != 1) {
print "Error base!";
exit 1;
}
state++;
base = $2;
if (dbg > 1) printf ("BASE = %s\n", base);
}
/^CONFIG=/ {
if (dbg) {
print "CONFIG";
print $0;
}
if (state != 2) {
print "Error config!";
exit 1;
}
state++;
config = $2;
sub (/\$BASE/, base, config);
if (dbg > 1) printf ("CONFIG = %s\n", config);
}
/^propertiesfile1=/ {
if (dbg) {
print "PROP";
print $0;
}
if (state != 3) {
print "Error pF!";
exit 1;
}
state++;
prop = $2;
sub (/\$CONFIG/, config, prop);
}
/^EndOfFile/ {
if (dbg) {
print "EOF";
print $0;
}
if (state != 4) {
print "Error EOF!";
print state;
exit 1;
}
state = 0;
printf ("%s%s,\n", filename, prop);
}
/,$/{
if (dbg) {
print "FILENAME";
print $0;
}
if (state != 0) {
print "Error filename!";
print state;
exit 1;
}
state++;
filename = $1;
}
gawk
gawk -vRS= 'BEGIN{FS="BASE[=]?|CONFIG|\n"}
{
s=$1
for(i=1;i<=NF;i++){
if($i~/\// ){ s=s $i }
}
print s
s=""
}' file
output
$ more file
filename1,
BASE=a/b/c
CONFIG=$BASE/d
propertiesfile1=$CONFIG/e.properties
EndOfFilefilename1
filename2,
BASE=f/g/h
CONFIG=$BASE/i
propertiesfile1=$CONFIG/j.properties
EndOfFilefilename2
filename3,
BASE=k/l/m
CONFIG=$BASE/n
propertiesfile1=$CONFIG/o.properties
EndOfFilefilename3
$ ./shell.sh
filename1,a/b/c/d/e.properties
filename2,f/g/h/i/j.properties
filename3,k/l/m/n/o.properties
A Perl script that does what you want would be something like this (note that it is untested):
while (<>) {
$base = $1 if (m/BASE=(.+)/);
$config = $1 if (m/CONFIG=(.+)/);
if (m/propertiesfile1=(.+)/) {
$props = $1;
$props =~ s/\$CONFIG/$config/;
$props =~ s/\$BASE/$base/;
print $ARGV . ", " . $props . "\n";
}
}
You give the script the filenames as arguments.
Multi-steps but it works!
cat yourInputFile | egrep ',|\/' | \
sed -e "s/^.*=//g" -e "s/\$.*\(\/.*\)/\1/g" | \
awk '{if($0 ~ "properties") print $0; else printf $0}'
The egrep grabs the lines containing a "," or a "/" and so eliminates the last line:
BASE=a/b/c
CONFIG=$BASE/d
propertiesfile1=$CONFIG/e.properties
The sed reduces the output to:
filename1,
a/b/c
/d
/e.properties
The awk portion reassembles the line to:
filename1,a/b/c/d/e.properties
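Since the question allows any language, here is a minimal Python sketch of the same transformation. It assumes the input has exactly the shape shown in the question (a name line ending in a comma, KEY=value lines, then an EndOfFile line); the function name flatten and the SAMPLE constant are made up for this example.

```python
import re
from typing import List

SAMPLE = """\
filename1,
BASE=a/b/c
CONFIG=$BASE/d
propertiesfile1=$CONFIG/e.properties
EndOfFilefilename1
filename2,
BASE=f/g/h
CONFIG=$BASE/i
propertiesfile1=$CONFIG/j.properties
EndOfFilefilename2
"""


def flatten(text: str) -> List[str]:
    out, name, env = [], None, {}
    for line in text.splitlines():
        line = line.strip()
        if line.endswith(","):
            # A new block starts: remember its name and forget old variables.
            name, env = line, {}
        elif line.startswith("EndOfFile"):
            name = None
        elif "=" in line and name is not None:
            key, value = line.split("=", 1)
            # Expand $VAR references using values seen earlier in this block.
            value = re.sub(r"\$(\w+)", lambda m: env.get(m.group(1), m.group(0)), value)
            env[key] = value
            if key.startswith("propertiesfile"):
                out.append(f"{name}{value},")
    return out


for row in flatten(SAMPLE):
    print(row)
```

Unlike the sed-into-shell approach, nothing from the input is executed, so this also works when the file does not come from a trusted source.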