Perl regex in a variable string - regex

I'm new to regular expressions in Perl.
I'm trying to build a string that matches the OID 1.3.6.1.2.1.4.22.1.2.*.192.168.1.1, but I'm not sure how to do it.
I tried the code below, but I get an error saying the OID is not recognized.
my $matchanyoid = "/(\d+)$/";
my $dot1dTpFdbAddress = '1.3.6.1.2.1.4.22.1.2.',$matchanyoid,'\.',$srcip;

Comma is not a concatenation operator, dot is:
my $dot1dTpFdbAddress = '1.3.6.1.2.1.4.22.1.2.' . $matchanyoid . '\.' . $srcip;
If you are trying to build a regular expression, note that the first several dots are not backslashed, so each of them can match any character, not just a literal dot. To avoid lots of backslashes, you can use the \Q ... \E construct:
my $matchanyoid = '(\d+)';
my $srcip = 12;
my $regex = qr/\Q1.3.6.1.2.1.4.22.1.2.\E$matchanyoid\.$srcip/;
print '1.3.6.1.2.1.4.22.1.2.123.12' =~ $regex;
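
For readers who want to try the same idea outside Perl, here is a rough Python sketch in which re.escape plays the role of \Q ... \E. The helper name build_oid_pattern and the sample IP address are assumptions made for illustration; they do not come from the question. Unlike the Perl snippet above, the sketch also escapes the dots in the IP address and anchors the end of the pattern.

```python
import re

OID_PREFIX = "1.3.6.1.2.1.4.22.1.2."  # literal prefix from the question


def build_oid_pattern(src_ip):
    """Hypothetical helper: literal prefix, one numeric component, then the IP."""
    # re.escape backslashes the dots so they only match literal dots.
    return re.compile(
        re.escape(OID_PREFIX) + r"(\d+)\." + re.escape(src_ip) + r"$"
    )


pattern = build_oid_pattern("192.168.1.1")  # sample IP, assumed
match = pattern.match("1.3.6.1.2.1.4.22.1.2.7.192.168.1.1")
print(match.group(1) if match else "no match")
```

Escaping the IP as well matters when it contains dots: without it, 192.168.1.1 would also match something like 192x168y1z1.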

Using preg_replace with varying variable replacements

I'm trying to use preg_replace to search for a string but only replace a portion of the string, rather than the entire string, in a dynamic fashion.
For example, I am able to find the strings 'od', ':od', 'od:', '#od', and 'od ' with my code below. I want to replace only the 'od' portion with the word 'odometer' and leave the colon, hashtag, and white spaces untouched. However, the way that my current preg_replace is written would replace the colons and the hashtag in addition to the letters themselves. Any creative solutions to replace the characters only but preserve the surrounding symbols?
Thank you!
if(isset($_POST["text"]))
{
$original = $_POST["text"];
$abbreviation= array();
$abbreviation[0] = 'od';
$abbreviation[1] = 'rn';
$abbreviation[2] = 'ph';
$abbreviation[3] = 'real';
$translated= array();
$translated[0] ='odometer';
$translated[1] ='run';
$translated[2] ='pinhole';
$translated[3] ='fake';
function add_regex_finders($str){
return "/[\s:\#]" . $str . "[\s:]/i";
}
$original_parsed = array_map('add_regex_finders',$original);
preg_replace($original_parsed,$translated,$original);
}
You can add capture groups around the characters before and after the matched abbreviation, and then add the group references to the replacement string:
function add_regex_finders($str){
return "/([\s:\#])" . $str . "([\s:])/i";
}
$abbrevs_parsed = array_map('add_regex_finders', $abbreviation);
$translt_parsed = array_map(function ($v) { return '$1' . $v . '$2'; }, $translated);
echo preg_replace($abbrevs_parsed, $translt_parsed, $original);
Note that you had a typo in your code: you passed $original to array_map (which calls add_regex_finders) when it should have been $abbreviation.
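
The capture-and-reinsert technique is not specific to PHP. A minimal Python sketch of the same approach follows; the function name expand_abbreviations and the sample sentence are assumptions made for illustration. It escapes each abbreviation before embedding it in the pattern, which the PHP code above does not do.

```python
import re

# Abbreviation -> expansion pairs taken from the question.
EXPANSIONS = {"od": "odometer", "rn": "run", "ph": "pinhole", "real": "fake"}


def expand_abbreviations(text):
    """Replace each abbreviation but keep the delimiters around it."""
    for abbrev, full in EXPANSIONS.items():
        # Group 1 is the leading delimiter, group 2 the trailing one.
        pattern = re.compile(
            r"([\s:#])" + re.escape(abbrev) + r"([\s:])", re.IGNORECASE
        )
        # Rebuild the match from the captured delimiters plus the expansion.
        text = pattern.sub(lambda m: m.group(1) + full + m.group(2), text)
    return text


print(expand_abbreviations("trip #od: 120 km"))
```

As with the original pattern, an abbreviation at the very start or end of the text, with no delimiter on one side, is left untouched.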

How to write regular expression in powershell

I need a regular expression in PowerShell to split a string on the string ## and remove everything up to another character (;).
I have the following string.
$temp = "admin#test.com## deliver, expand;user1#test.com## deliver, expand;group1#test.com## deliver, expand;"
Now I want to split this string and collect only the email IDs into a new array. My expected output looks like this:
admin#test.com
user1#test.com
group1#test.com
To get the above output, I need to split the string on ## and remove the substring up to the semicolon (;).
Can anyone help me write a regex to achieve this in PowerShell?
If you want to use regex-based splitting with your approach, you can use ##[^;]*; regex and this code that will also remove all the empty values (with | ? { $_ }):
$res = [regex]::Split($temp, '##[^;]*;') | ? { $_ }
The ##[^;]*; matches:
## - double #
[^;]* - zero or more characters other than ;
; - a literal ;.
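
The split-and-filter approach can be checked quickly in Python as well. This is only a sketch of the same regex applied with re.split; the variable names parts and emails are assumptions.

```python
import re

temp = (
    "admin#test.com## deliver, expand;"
    "user1#test.com## deliver, expand;"
    "group1#test.com## deliver, expand;"
)

# Split on "##" followed by everything up to and including the next ";".
parts = re.split(r"##[^;]*;", temp)

# The final delimiter leaves an empty string at the end; drop empty items.
emails = [p for p in parts if p]
print(emails)
```

The list comprehension does the same job as the | ? { $_ } filter in the PowerShell one-liner.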
Use [regex]::Matches to get all occurrences of your regular expression. If this suits you, you probably don't need to split the string at all:
\b\w+#[^#]*
PowerShell code:
[regex]::Matches($temp, '\b\w+#[^#]*') | ForEach-Object { $_.Groups[0].Value }
Output:
admin#test.com
user1#test.com
group1#test.com
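
The match-instead-of-split idea maps onto re.findall in Python. The sketch below reuses the sample string from the question; the variable name ids is an assumption.

```python
import re

temp = (
    "admin#test.com## deliver, expand;"
    "user1#test.com## deliver, expand;"
    "group1#test.com## deliver, expand;"
)

# A word, a "#", then everything up to (not including) the next "#".
ids = re.findall(r"\b\w+#[^#]*", temp)
for item in ids:
    print(item)
```

Words such as "deliver" and "expand" are skipped because they are not directly followed by a #.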

Regular Expression to find $0.00

Need to count the number of "$0.00" in a string. I'm using:
my $zeroDollarCount = ("\Q$menu\E" =~ tr/\$0\.00//);
but it doesn't work. The issue is the $ sign is throwing the regex off. It works if I just want to count the number of $, but fails to find $0.00.
How is this a duplicate? Your solution does not address dollar sign which is an issue for me.
You are using the transliteration operator tr///. That doesn't have anything to do with a pattern. You need the match operator m// instead. And because you want it to find all occurrences of the pattern, use the /g modifier.
my $count = () = $menu =~ m/\$0\.00/g;
If we run this program, the output is 2.
use strict;
use warnings;
my $menu = '$0.00 and $0.00';
my $count = () = $menu =~ m/\$0\.00/g;
print $count;
Now let's take a look at what is going on. First, the pattern of the match.
/\$0\.00/
This is fairly straightforward. There is a literal $, which we need to escape with a backslash \. The zero is followed by a literal dot ., which we also need to escape because, like the $, it has a special meaning in regular expressions.
my $count = () = $menu =~ m/\$0\.00/g;
This whole line looks weird. We can break it up into a few lines to make it more readable.
my @matches = ( $menu =~ m/\$0\.00/g );
my $count = scalar @matches;
We need the /g switch on the regular expression match to make it match all occurrences. In list context, the match operation returns all matches (which will be the string "$0.00" a number of times). Because we want the count, we then force that into scalar context, which gives us the number of elements. That can be shortened to one line by the idiom shown above.
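
To make the difference between counting pattern matches and counting characters concrete, here is a small Python sketch. The character-set count is only a rough analogue of what a transliteration-style count does, and the variable names are assumptions.

```python
import re

menu = "$0.00 and $0.00"

# Every non-overlapping match of the escaped pattern; the length is the count.
count = len(re.findall(r"\$0\.00", menu))
print(count)  # 2

# Counting characters from the set {$, 0, .} instead, roughly what a
# transliteration-style count does, gives a very different number.
char_hits = sum(ch in "$0." for ch in menu)
print(char_hits)  # 10

# For a fixed literal like this one, a plain substring count also works.
literal_count = menu.count("$0.00")
```

The first count answers the question that was asked; the second shows why a character-based operator cannot.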

split one line regex in a multiline regexp in perl

I have trouble splitting my regex across multiple lines. I want my regex to match the following line:
* Code "l;k""dfsakd;.*[])_lkaDald"
So I created this regex, which works:
my $firstRegexpr = qr/^\s*\*\s*Code\s+\"(?<Code>((\")*[^\"]+)+)\"/x;
But now I want to split it across multiple lines like this (and still have it match the same thing):
my $firstRegexpr = qr/^\s*\*\s*Code\s+\"
(?<Code>((\")*[^\"]+)+)\"/x;
I read about this, but I have trouble using it:
/
^\s*\*\s*Code\s+\"
(?<Code>((\")*[^\"]+)+)\"
/x
My last question is about preventing variable interpolation in a Perl regex:
my $firstRegexpr = qr/^\s*\*\s*Code\s+\"(?<Code>((\")*[^\"$]+)+)\"\$/x;
The characters $] are interpreted as a variable inside the regex. How can I make them literal instead?
Thanks a lot for your time, and please provide an explicit example.
All the x flag does is tell the regex engine to ignore whitespace in the pattern.
So literal space characters in the pattern no longer match anything, and you have to use \s or similar instead.
So you can write:
if ( m/
^
\d+\s+
fish:\w+\s+
$
/x ) {
print "Matched\n";
}
You can test regular expressions on various websites; one example is https://regex101.com/
So to take your example: https://regex101.com/r/eG5jY8/1
But how is yours not working?
This matches:
my $string = q{* Code "l;k""dfsakd;.*[])_lkaDald"};
my $firstRegexpr = qr/^\s*
\*
\s*
Code\s+
\"
(?<Code>((\")*[^\"]+)+)
\"
/x;
print "Compiled_Regex: $firstRegexpr\n";
print "Matched\n" if ( $string =~ m/$firstRegexpr/ );
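
The same multi-line layout is available in Python through re.VERBOSE, which may be a convenient way to experiment with the pattern. This is a sketch rather than a translation of the Perl code: the inner groups are made non-capturing and the comments are added for illustration.

```python
import re

line = '* Code "l;k""dfsakd;.*[])_lkaDald"'

code_line = re.compile(r'''
    ^\s* \* \s*                # optional indent, then the asterisk
    Code \s+                   # keyword followed by whitespace
    "                          # opening quote
    (?P<Code>(?:"*[^"]+)+)     # body: quote runs followed by non-quotes
    "                          # closing quote
''', re.VERBOSE)

m = code_line.search(line)
print(m.group("Code") if m else "no match")
```

Because whitespace in the pattern is ignored, the spaces between tokens are purely cosmetic; any space that must match has to be written as \s.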
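As for keeping $] from being treated as a variable: escape the dollar sign with a backslash, so the character class becomes [^\"\$]. Note that \Q...\E does not help here, because variables are still interpolated inside it.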

Extracting first two words in perl using regex

I want to extract the first two words from a sentence using a Perl function in PostgreSQL. In PostgreSQL, I can do this with:
text = "I am trying to make this work";
Select substring(text from '(^\w+-\w+|^\w+(\s+)?(!|,|\&|'')?(\s+)?\w+)');
It would return "I am".
I tried to build a Perl function in PostgreSQL that does the same thing.
CREATE OR REPLACE FUNCTION extract_first_two (text)
RETURNS text AS
$$
my $my_text = $_[0];
my $temp;
$pattern = '^\w+-\w+|^\w+(\s+)?(!|,|\&|'')?(\s+)?\w+)';
my $regex = qr/$pattern/;
if ($my_text=~ $regex) {
$temp = $1;
}
return $temp;
$$ LANGUAGE plperl;
But I receive a syntax error near the regular expression. I am not sure what I am doing wrong.
Extracting words is non-trivial even in English. Take the following contrived example using Locale::CLDR:
use Locale::CLDR;
my $locale = Locale::CLDR->new('en');
my @words = $locale->split_words('adf543. 123.25');
@words now contains
adf543
.
123.25
Note that the full stop after adf543 is split into a separate word, but the one between 123 and 25 is kept as part of the number 123.25, even though the '.' is the same character.
It gets worse when you look at non-English languages, and much worse when you use non-Latin scripts.
You need to define precisely what you think a word is; otherwise the following French gets split incorrectly.
Je avais dit «Elle a dit «Il a dit «Ni» il ya trois secondes»»
The parentheses are mismatched in your regex pattern. It has three opening parentheses and four closing ones.
Also, you have two single quotes in the middle of a singly-quoted string, so
'^\w+-\w+|^\w+(\s+)?(!|,|\&|'')?(\s+)?\w+)'
is parsed as two separate strings:
'^\w+-\w+|^\w+(\s+)?(!|,|\&|'
and
')?(\s+)?\w+)'
But I can't suggest how to fix it as I don't understand your intention.
Did you mean a double quote perhaps? In which case (!|,|\&|")? can be written as [!,&"]?
Update
At a rough guess I think you want this
my $regex = qr{ ^ \w++ \s* [-!,&"]* \s* \w+ }x;
$temp = $1 if $my_text=~ /($regex)/;
but I can't be sure. If you describe what you're looking for in English then I can help you better. For instance, it's unclear why you don't have question marks, full stops, and semicolons in the list of intervening punctuation.
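
One way to pin down the intended behaviour is to prototype it with a few test sentences. The Python sketch below rests on assumptions: a "word" is a run of \w characters, and the two words are separated by at least one whitespace or punctuation character from the set ! , & " -. Requiring at least one separator stands in for the possessive \w++ used above; the function name simply mirrors the one in the question.

```python
import re

# Assumption: two \w runs separated by one or more of: whitespace ! , & " -
FIRST_TWO = re.compile(r'^\w+[\s!,&"-]+\w+')


def extract_first_two(text):
    """Return the first two words with their separator, or None."""
    m = FIRST_TWO.match(text)
    return m.group(0) if m else None


for sample in ("I am trying to make this work", "well-known fact", "Hello"):
    print(repr(extract_first_two(sample)))
```

A single-word input returns None rather than splitting the word in two, which is the failure mode the possessive quantifier guards against.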