Perl regex that can match both positive and negative values - regex

I have a list of data which I want to match:
0:1
0:3
0:-1
0:2
0:-4
What regex can I use to match all of them?
I tried this, but it doesn't work:
$line =~ /0:(\w+)/
It only matches the positive values.

\w matches word characters: letters, digits and underscore. That means your regexp will match not only 0:34 but also something like 0:hello, and it won't match the minus sign.
If you need only digits, then /0:-?\d+/ should work. And if you need to match the whole string (to filter out strings like a0:-3b), you can use /^0:-?\d+$/.
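To make the difference concrete, here is a minimal sketch that runs the anchored pattern over the sample data plus two lines that should be rejected. The helper name value_after_zero and the capture group are my own additions for illustration, not part of the answer above.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the signed integer after "0:", or undef.
sub value_after_zero {
    my ($line) = @_;
    # ^ and $ reject lines such as "a0:-3b"; -? allows an optional minus sign.
    return $line =~ /^0:(-?\d+)$/ ? $1 : undef;
}

for my $line (qw(0:1 0:3 0:-1 0:2 0:-4 0:hello a0:-3b)) {
    my $value = value_after_zero($line);
    printf "%-7s => %s\n", $line, defined $value ? $value : 'no match';
}
```

The five sample lines yield their numbers, sign included, while 0:hello and a0:-3b report no match.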

How about $line =~ /0:[-]?[0-9]+/

Related

What is the regex to match exactly an alphanumeric 16 character string?

Here is a regex string I need to use, but I only want it to match exactly 16 alphanumeric characters, not 16 characters within a longer string.
[A-Z]{6}[0-9]{2}[A-E,H,L,M,P,R-T][0-9]{2}[A-Z0-9]{5}
It matches PLDTLL47S04L424T and MRTMTT25D09F205Z perfectly. But what I don't want it to match is a 16-character run in the middle of a long string like this one (here FEEECC32E1246530):
FA4127E57FE52E49BC1FEEECC32E1246530EE1C#BL2PRD9301MB014.024d.mgd.msft.net
Thanks in advance!
You didn't say which regex flavor you're using, but the issue is that you're missing start and end anchors.
Add ^ and $ to your regex as such:
^[A-Z]{6}[0-9]{2}[A-E,H,L,M,P,R-T][0-9]{2}[A-Z0-9]{5}$
^ means match at the start of a string, or the point after any newline in multiline mode.
$ means the opposite: the end of a string, or the point before the newline in multiline mode.
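As a rough illustration of what the anchors change, the sketch below tests the three strings from the question with and without ^ and $. The helper name is_exact_code is hypothetical, and I have dropped the commas from the character class on the assumption that they were meant as separators (inside [...] a comma is a literal comma).

```perl
use strict;
use warnings;

# The asker's pattern, with the commas removed from the character class
# (assumption: they were intended as separators, not as literal commas).
my $code = qr/[A-Z]{6}[0-9]{2}[A-EHLMPR-T][0-9]{2}[A-Z0-9]{5}/;

# Hypothetical helper: true only if the whole string is one 16-character code.
sub is_exact_code {
    my ($s) = @_;
    return $s =~ /^$code$/ ? 1 : 0;
}

for my $s ('PLDTLL47S04L424T', 'MRTMTT25D09F205Z',
           'FA4127E57FE52E49BC1FEEECC32E1246530EE1C#BL2PRD9301MB014.024d.mgd.msft.net') {
    my $unanchored = $s =~ $code       ? 'yes' : 'no';
    my $anchored   = is_exact_code($s) ? 'yes' : 'no';
    printf "unanchored=%-3s anchored=%-3s %s\n", $unanchored, $anchored, $s;
}
```

All three strings match without anchors, but only the two 16-character codes match once the pattern is anchored.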
In addition to the previous answers:
Assuming that you want a match if and only if the line starts with something that matches your pattern, both the anchor ^ and the word boundary \b will do at the front (though \b also matches after any non-word character inside the line, so ^ is the stricter choice).
Under that assumption, however, ending the pattern with the anchor $ and/or \b is NOT correct, because it also rejects a line that starts with a match and then continues.
See some example code:
#!/usr/bin/perl -w
# Two test lines: the first starts with a valid code, the second has a leading 0.
my @tests = qw/
AAAAAA00A00AAAAA49BC1FEEECC32E1246530EE1C#BL2PRD9301MB014.024d.mgd.msft.net
0AAAAAA00A00AAAAA49BC1FEEECC32E1246530EE1C#BL2PRD9301MB014.024d.mgd.msft.net
/;
foreach my $test (@tests) {
    # ^ anchors the pattern to the start of the line; there is no end anchor.
    if ( $test =~ /^([A-Z]{6}[0-9]{2}[A-EHLMPR-T][0-9]{2}[A-Z0-9]{5})/ ) {
        print "$1 matches\n";
    } else {
        print "NO MATCH\n";
    }
}
generates output:
marc:tmp marc$ perl test.pl
AAAAAA00A00AAAAA matches
NO MATCH
If you change the pattern to
if ( $test =~ /^([A-Z]{6}[0-9]{2}[A-EHLMPR-T][0-9]{2}[A-Z0-9]{5}$)/ ) {
the result is:
marc:tmp marc$ perl test.pl
NO MATCH
NO MATCH
You can use boundary matchers to match the beginnings and ends of lines, strings, words or other things. What is available depends on your flavour of regex. The start-of-input and end-of-input matchers are pretty universal.
^[A-Z]{6}[0-9]{2}[A-E,H,L,M,P,R-T][0-9]{2}[A-Z0-9]{5}$
Again depending on the flavour of regex you are using, you can also use POSIX character classes to match alphanumerics, with \p{Alpha} and \p{Digit}. This will simplify your regex a bit.
You should use ^ and $ to bound the regex.
You can use word boundaries \b for this purpose:
\b[A-Z]{6}[0-9]{2}[A-E,H,L,M,P,R-T][0-9]{2}[A-Z0-9]{5}\b
^^                                                    ^^
Edit: I suggest word boundaries rather than the start ^ and end $ anchors because I am assuming you just want to avoid matching a substring of a longer token, and that your input looks like your sample string but with spaces around the codes.
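A small sketch of the word-boundary approach, under the assumption that codes may appear anywhere in a line surrounded by spaces. The helper standalone_codes and the sample sentence are made up for illustration, and the character class is written without the commas.

```perl
use strict;
use warnings;

my $code = qr/[A-Z]{6}[0-9]{2}[A-EHLMPR-T][0-9]{2}[A-Z0-9]{5}/;

# Hypothetical helper: every code that stands alone as a word in $text.
sub standalone_codes {
    my ($text) = @_;
    # \b on both sides rejects a 16-character run inside a longer token.
    return $text =~ /\b($code)\b/g;
}

# Made-up sample line: one standalone code, one code-like run inside a long token.
my $text = 'id PLDTLL47S04L424T from FA4127E57FE52E49BC1FEEECC32E1246530EE1C#BL2PRD9301MB014';
print "$_\n" for standalone_codes($text);
```

Only PLDTLL47S04L424T is reported. The run FEEECC32E1246530 is skipped because it has word characters on both sides, so neither \b can match there.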
You may try this regex, which accepts any 16 alphanumeric characters containing at least one digit and one letter: ^(?=.*[0-9])(?=.*[a-zA-Z])[a-zA-Z0-9]{16}$

Regex to get text from within parentheses including delimited parentheses

I am using powershell, if that matters.
Let's say I have
$s = "One two (three) four \(five\) six (\(seven\)) eight"
I want a regex that will return
three
(seven)
I need all matches, and I know how PowerShell stores the matches in $matches, similar to perl's $1 $2 $3 (but that's the easy part).
Use the regex below and take the string you want from capture group 1.
(?<!\\)\(((?:\\[()]|[^()])*)\)
The negative lookbehind (?<!\\) asserts that the opening parenthesis is not preceded by a \ character.
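Here is a sketch of that regex applied to the sample string, written in Perl rather than PowerShell since the pattern itself is the same in both. The helper paren_contents is hypothetical, and the step that strips the backslashes is my assumption about how the asker gets from \(seven\) to the requested (seven).

```perl
use strict;
use warnings;

# Hypothetical helper: text inside each unescaped (...) pair.
sub paren_contents {
    my ($s) = @_;
    # Group 1 holds the content; \\[()] lets escaped parentheses sit inside it.
    my @found = $s =~ /(?<!\\)\(((?:\\[()]|[^()])*)\)/g;
    # Assumption: the asker wants "(seven)", so drop the escaping backslashes.
    s/\\([()])/$1/g for @found;
    return @found;
}

my $s = 'One two (three) four \(five\) six (\(seven\)) eight';
print "$_\n" for paren_contents($s);
```

This prints three and (seven). The escaped \(five\) is skipped because its opening parenthesis follows a backslash.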
(?<!\\)\(([^)]+\))
Try this. See the demo:
http://regex101.com/r/kT6vO6/3

Need help in matching regexp

I have a string, say
my $str = "FILLER-1-1,EQPT:MN,EQPT_MISSING,NSA,04-30,15-07-13,NEND,NA";
I want to match a pattern, say
my $pattern = "FILLER-1-1";
I am using the regexp below:
$reg = $str =~ /$pattern/;
This works fine.
Now the problem is that it also matches if the string is
FILLER-1-10/FILLER-1-11/FILLER-1-12 and so on.
I don't want it to match these. I also don't want my regexp to be like
$reg = $str =~ /$pattern\W+/;
This one solves the issue above, but a \W character may or may not follow the pattern: in some strings it is there, in others it is not. So I need the regexp to match only FILLER-1-1, without using \W+, and not FILLER-1-10.
Note: If somebody downvotes my question, please let me know what's wrong with it. I would appreciate it if that person left a comment too.
As \w matches [a-zA-Z0-9_], you can use the zero-width assertion \b, which marks a change in \w state (called a "word boundary", hence the "b"):
/FILLER-1-1\b/
This means that the final 1 must be followed by a non-word character or by the end of the string, that is, by a change in word state.
It will match
FILLER-1-1.
FILLER-1-1&
FILLER-1-1,
It will not match
FILLER-1-1a
FILLER-1-16
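Since the question keeps the name in a variable, here is a sketch of the same idea with interpolation. The helper has_exact_name is hypothetical, and wrapping the variable in \Q...\E is my addition, so that any regex metacharacters in the name are taken literally.

```perl
use strict;
use warnings;

# Hypothetical helper: does $str contain $name, not followed by more word characters?
sub has_exact_name {
    my ($str, $name) = @_;
    # \Q...\E quotes regex metacharacters in $name; \b stops "FILLER-1-1"
    # from matching the first ten characters of "FILLER-1-10".
    return $str =~ /\Q$name\E\b/ ? 1 : 0;
}

my $pattern = 'FILLER-1-1';
for my $str ('FILLER-1-1,EQPT:MN,EQPT_MISSING,NSA,04-30,15-07-13,NEND,NA',
             'FILLER-1-10,EQPT:MN',
             'FILLER-1-1') {
    printf "%-60s %s\n", $str, has_exact_name($str, $pattern) ? 'match' : 'no match';
}
```

The first and third strings match, including the one where nothing follows the name, while FILLER-1-10 does not.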
If you want to match FILLER at the start of the input (line) followed by two numbers, this simple regex should work:
/^FILLER-\d+-\d+/
^ matches the beginning of the input
\d matches any digit ([0-9])
+ matches at least one, but can match any number
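Use the ? quantifier like so:
/FILLER-\d-\d\W?/
\W? means a non-word character, zero or one time. Because that character is optional, this pattern still matches the start of FILLER-1-10, so on its own it does not exclude the longer names.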

How to replace ',' without replacing it in unit in a string

I would like to replace "," with # in the following strings, but without changing it inside a number written in 10,000 format.
x,y,z to x#y#z
x1,y1,z1 to x1#y1#z1
x1,y1 10,000,z1 to x1#y1 10,000#z1
I used s/(\D),/\1#/g, but it doesn't work for examples 2 and 3. How do I express that the comma should be left alone only when there is a digit on both sides? Can someone help? Thanks so much.
You need a regex that matches a comma unless it has a digit on both sides, that is, a comma that is not preceded by a digit or not followed by a digit.
s/(?<!\d),|,(?!\d)/#/g
The negative lookbehind assertion (?<!\d) allows matches such as the comma in "x,", since x is not a digit. Because it is a negated assertion, it also matches at the beginning of a line, e.g. ,x. The negative lookahead assertion (?!\d) allows matches against commas that are not followed by a digit. Neither alternative will match a comma that has a digit on both sides.
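A quick way to check the two-alternative pattern against the three examples from the question is a sketch like this one. The helper name hash_separators is mine, not part of the answer.

```perl
use strict;
use warnings;

# Hypothetical helper: replace separator commas with '#', keep digit,digit commas.
sub hash_separators {
    my ($s) = @_;
    # First alternative: comma not preceded by a digit.
    # Second alternative: comma not followed by a digit.
    $s =~ s/(?<!\d),|,(?!\d)/#/g;
    return $s;
}

for my $s ('x,y,z', 'x1,y1,z1', 'x1,y1 10,000,z1') {
    printf "%-16s => %s\n", $s, hash_separators($s);
}
```

The comma in 10,000 survives because it has a digit on both sides; every other comma fails at least one of the two digit tests and becomes #.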
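Try the following alternative:
s/,(?<!\d)(?!\d)/\#/g;
Note that the lookbehind is placed after the comma, so it inspects the comma itself and always succeeds. The pattern is effectively ,(?!\d), which replaces only commas that are not followed by a digit (so x,1 would stay unchanged).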
sample script
use strict;
use warnings;
my @array = ( 'x,y,z', 'x1,y1,z1', 'x1,y1 10,000,z1');
for my $string (@array) {
    # Replace each comma that is not followed by a digit.
    $string =~ s/,(?<!\d)(?!\d)/\#/g;
    print "$string\n";
}
#OUTPUT
#x#y#z
#x1#y1#z1
#x1#y1 10,000#z1

Insertion with Regex to format a date (Perl)

Suppose I have a string 04032010.
I want it to be 04/03/2010. How would I insert the slashes with a regex?
To do this with a regex, try the following:
my $var = "04032010";
$var =~ s{ (\d{2}) (\d{2}) (\d{4}) }{$1/$2/$3}x;
print $var;
\d matches a single digit, and {n} means the preceding element repeated n times. Combined, \d{2} matches two digits and \d{4} matches four digits. By surrounding each set in parentheses, the matches are stored in the variables $1, $2, $3 and so on.
Some of the prior answers used a . to match. This is not a good thing because it'll match any character. The one we've built here is much stricter in what it'll accept.
You'll notice I used extra spacing in the regex. I used the x modifier to tell the engine to ignore whitespace in my regex, which can be quite helpful for making the regex a bit more readable.
Compare s{(\d{2})(\d{2})(\d{4})}{$1/$2/$3}x; vs s{ (\d{2}) (\d{2}) (\d{4}) }{$1/$2/$3}x;
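For completeness, a sketch that wraps the substitution in a helper and shows what happens with input that is not eight digits. The name add_slashes and the ^ and $ anchors are my additions; the answer above does not anchor the pattern.

```perl
use strict;
use warnings;

# Hypothetical helper: insert slashes into an 8-digit date string.
sub add_slashes {
    my ($date) = @_;
    # Assumption: anchor with ^ and $ so only exactly eight digits are rewritten.
    $date =~ s{ ^ (\d{2}) (\d{2}) (\d{4}) $ }{$1/$2/$3}x;
    return $date;
}

for my $input ('04032010', '4032010', '04-03-10') {
    printf "%-9s => %s\n", $input, add_slashes($input);
}
```

Only 04032010 is changed, to 04/03/2010. The other two inputs come back untouched because the anchored pattern does not match them.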
Well, a regular expression just matches, but you can try something like this:
s/(..)(..)(....)/$1\/$2\/$3/
#!/usr/bin/perl
$var = "04032010";
$var =~ s/(..)(..)(....)/$1\/$2\/$3/;
print $var, "\n";
Works for me:
$ perl perltest
04/03/2010
I always prefer to use a different delimiter if / is involved, so I would go for
s| (\d\d) (\d\d) |$1/$2/|x ;