Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 9 years ago.
I want to match a value that falls within a range from x to y. I need a generic Perl regular expression because x and y are dynamic.
Please help
This is an excessively bad idea. Not impossible, but hard to write as a general solution.
Let's write a regular expression that matches all numbers between 2 and 123. We have to look at each possible number of digits separately.
1 digit: [2-9] – 2 or larger
2 digits: [1-9][0-9] – any two-digit number
3 digits: [1](?:[0-1][0-9]|[2][0-3]) – either 100 through 119, or 12x where 0 <= x <= 3.
Together: /\A(?:[2-9]|[1-9][0-9]|[1](?:[0-1][0-9]|[2][0-3]))\z/
Is this readable or maintainable? Certainly not.
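One way to gain some confidence in a hand-built pattern like this is to compare it against a plain numeric check for every candidate value. The sketch below is in Python rather than Perl, so it relies on re.fullmatch instead of the \A and \z anchors; the names PATTERN and matches_range and the 0..999 test range are choices made here for illustration.

```python
import re

# Same alternation as the pattern above; re.fullmatch supplies the anchoring.
PATTERN = re.compile(r"(?:[2-9]|[1-9][0-9]|[1](?:[0-1][0-9]|[2][0-3]))")

def matches_range(text):
    """Return True if the regex accepts the whole string."""
    return PATTERN.fullmatch(text) is not None

# Compare the regex with an ordinary numeric check for every value 0..999.
mismatches = [n for n in range(1000) if matches_range(str(n)) != (2 <= n <= 123)]
print(mismatches)  # prints [] when the pattern agrees with 2 <= n <= 123
```

A check like this has to be rerun every time x or y changes, which is part of why the pattern is hard to maintain.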
You could use embedded code: /\A([0-9]+)(?(?{ not($x <= $^N && $^N <= $y) })(*F))\z/, but that's rather silly as well.
The best solution is to use code for what should be done with code. Regexes are simply not an appropriate tool here.
my ($num) = $string =~ /\A([0-9]+)\z/ or die "no number in \$string";
if (not($x <= $num and $num <= $y)) {
    die "Number $num out of range [$x .. $y]";
}
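The same parse-then-compare idea can be sketched in Python. The function name check_in_range and the use of ValueError are assumptions made for this illustration, not part of the answer above.

```python
import re

def check_in_range(text, lo, hi):
    """Parse text as a non-negative integer and verify lo <= value <= hi."""
    # The regex only validates the shape of the input (ASCII digits only).
    if re.fullmatch(r"[0-9]+", text) is None:
        raise ValueError(f"no number in {text!r}")
    num = int(text)
    # The range check is ordinary arithmetic, so lo and hi can change freely.
    if not (lo <= num <= hi):
        raise ValueError(f"Number {num} out of range [{lo} .. {hi}]")
    return num

print(check_in_range("42", 2, 123))  # prints 42
```

Because the bounds are plain arguments, nothing has to be regenerated when the range changes.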
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 3 years ago.
I need a regex pattern to match any text that comes between Health & Beauty that may or may not include a space and/or special character "&" but should not exceed the character limit of 10. In said case, I would want to extract:
Beauty & Fashion
The following is the regex I use to extract anchor text:
(<[a|A][^>]*>|)
But I want to limit the text to between 1 and 10 characters. Is that possible?
For PCRE:
https://regex101.com/r/GJSlZl/1
For JS:
https://regex101.com/r/FIdlyU/1
The solution depends on the regex flavor:
js: (?<=<a[^>]+>)([\w &]{1,10})(?=<\/a>)
pcre: <a[^>]+>\K([\w &]{1,10})(?=<\/a>)
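For comparison, Python's built-in re module supports neither variable-width lookbehind nor \K, so a capture group is the usual substitute there. The following is only a sketch: the name ANCHOR_TEXT, the sample string, and the \b after "a" (added here so tags such as <abbr> are skipped) are assumptions of this example.

```python
import re

# Capture 1 to 10 characters of anchor text made of word characters, spaces, or "&".
ANCHOR_TEXT = re.compile(r"<a\b[^>]*>([\w &]{1,10})</a>", re.IGNORECASE)

def short_anchor_texts(html):
    """Return the anchor texts of at most 10 characters found in html."""
    return ANCHOR_TEXT.findall(html)

sample = '<a href="/x">Health</a> <a>Hair & Spa</a> <a>Health & Beauty</a>'
print(short_anchor_texts(sample))  # prints ['Health', 'Hair & Spa']
```

"Health & Beauty" is 15 characters long, so it is skipped rather than truncated.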
My guess is that you're looking to find some expression similar to,
(?<=&|>)([^&\r\n]{0,10}(?=&|<\/a>))*
to which you might want to add more boundaries on the left side,
(?<=&|>)
Test
$re = '/(?<=&|>)([^&\r\n]{0,10}(?=&|<\/a>))*/s';
$str = '<a>Health & Beauty</a>
Health & Beauty
Health & Beauty 1 & Health & Beauty 1
<a>Health & Beauty 1 & Health & Beauty 1 </a>
<a>Health & Beauty 1 & Some other words & Beauty 1 & Some other words 2</a>
';
preg_match_all($re, $str, $matches, PREG_SET_ORDER, 0);
var_dump($matches);
If you wish to explore/simplify/modify the expression, it's been explained on the top right panel of regex101.com. If you'd like, you can also watch in this link how it would match against some sample inputs.
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 4 years ago.
I want to validate a 5 digit number.
Max of 5 digits
First 2 digits are within various ranges like (00-16) or (20-25) or (32-39)
00789 - valid
11569 - valid
22698 - valid
32567 - valid
17895 - not valid
41578 - not valid
Is there a Regex Guru that can help with a regex expression that would work for this scenario?
I don't know anything about regular expressions. This is a small part of a bigger solution that has legacy code using regular expression strings for data validation. A number comes in as a parameter. A lookup is done to get a regex validation string. The number and the regex string are passed to a validator, where a regex.IsMatch is performed.
My question is: can the above validation scenario be written as a regular expression, and if so, what would that look like? I could then add the expression to the existing library of regular expressions in my app.
Why regex? First you need a collection to store your ranges, for example:
Dim ranges = New List(Of Tuple(Of Int32, Int32))
ranges.Add(Tuple.Create(0, 16))
ranges.Add(Tuple.Create(20, 25))
ranges.Add(Tuple.Create(32, 39))
The check itself is pretty easy:
' Use the first two characters as-is: trimming leading zeros would turn "00789" into "78".
Dim number As Int32
Dim isValid = (text.Length = 5) AndAlso
    Int32.TryParse(text.Substring(0, 2), number) AndAlso
    ranges.Any(Function(t) number >= t.Item1 AndAlso number <= t.Item2)
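Since the existing validator expects a regex string, the three ranges can also be spelled out digit by digit. This is a sketch in Python; the names FIVE_DIGIT and is_valid are made up here. In .NET, Regex.IsMatch looks for a match anywhere in the input, so the same pattern would presumably need ^ and $ around it.

```python
import re

# First two digits in 00-16, 20-25, or 32-39, followed by exactly three more digits.
FIVE_DIGIT = re.compile(r"(?:0[0-9]|1[0-6]|2[0-5]|3[2-9])[0-9]{3}")

def is_valid(code):
    """Return True if code is five digits and its first two fall in an allowed range."""
    return FIVE_DIGIT.fullmatch(code) is not None

for code in ["00789", "11569", "22698", "32567", "17895", "41578"]:
    print(code, is_valid(code))
```

The first four samples from the question are accepted and the last two rejected. Adding or changing a range means editing the alternation by hand, which is the maintenance cost of keeping this in a regex.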
Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 5 years ago.
I have the following sentence:
b.g The big bag of bits was bugged.
How can I exclude the b.g from it by using a regular expression?
I am sure I need a negative lookahead but I cannot get it right yet.
Something like
^(?!b\.g)
I would do it this way:
[^\S].*
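[^\S] is a double negative: it matches any character that is not a non-whitespace character, so it is equivalent to \s. The pattern therefore starts matching at the first whitespace character, and .* consumes the rest of the line, so the match begins with that space. There is no need here for a negative or positive lookbehind.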
Demo: regex101
If you prefer to do it with positive Lookbehind, you can do it this way
(?<=b\.g).*
Demo: regex101
sed 's/^...//' strips the first 3 characters, "b.g", but I doubt that's what you're really asking. Your ^ anchor appears to be a red herring.
You already have correct escaping for . period, just stick with that:
sed 's/b\.g//'
Python's positive lookbehind ?<= may be what you are trying to find words to express:
>>> m = re.search(r'(?<=b\.g)(.*)', 'b.g The big bag of bits was bugged.')
>>> print(m.group(1))
The big bag of bits was bugged.
In Python you could do something like this:
import re
w = 'b.g The big bag of bits was bugged.'
print(w)
# Escape the dot so it matches a literal period rather than any character.
d = re.compile(r'^b\.g\s')
a = d.sub('', w)
print(a)
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 5 years ago.
Which expression should I use to identify the number of hydrogen atoms in a chemical formula?
For example:
C40H51N11O19 - 51 hydrogens
C2HO - 1 hydrogen
CO2 - no hydrogens (empty)
Any suggestions?
Thanks!
Cheers!
You can start with this regex:
H\d*
H -> matches the literal character H
\d* -> matches a digit zero or more times
See the example and try other regexes yourself at:
https://regex101.com/r/vdvH8S/2
But the regex won't convert the result for you; a regex only does the lookup.
You need to process the result as follows:
H with a number: extract the number
only H: 1
no match: 0
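The three post-processing rules above can be sketched in Python as follows. The function name count_hydrogens is made up for this example, and the negative lookahead (?![a-z]) is an addition not present in the answer: it keeps element symbols such as He or Hg from being counted as hydrogen. Summing over all matches is also an assumption, chosen so that formulas with several H groups still work.

```python
import re

# "H" not followed by a lowercase letter, with an optional count captured.
HYDROGEN = re.compile(r"H(?![a-z])(\d*)")

def count_hydrogens(formula):
    """Sum the hydrogen counts in a flat formula (no parentheses or hydrates)."""
    # An empty capture means a bare "H", which stands for one atom.
    return sum(int(digits) if digits else 1 for digits in HYDROGEN.findall(formula))

for formula in ["C40H51N11O19", "C2HO", "CO2"]:
    print(formula, count_hydrogens(formula))
```

For the three formulas in the question this gives 51, 1, and 0. Grouped formulas such as Ca(OH)2 are outside what this sketch handles.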
A regular expression that matches H followed by digits would be:
/H(\d+)/g
The 'H' is a literal character match for the H in the given chemical formula.
() declares a capture group, so you can then grab the captured group without the H in whatever programming language you are using.
\d matches any digit, and the + quantifier matches one or more of them.
There is no catch-all solution here; you might be better off using something other than a regex.
Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 7 years ago.
Explain the below code:
$x = '12aba34ba5';
@num = split /(a|b)+/, $x;
# gives @num = ('12','a','34','a','5')
@num = split /(?:a|b)+/, $x;
# gives @num = ('12','34','5')
In the first case you are capturing (a|b), so a gets captured. (a|b)+ will match aba, but only a will be stored, because the regex engine remembers only the last capture when a group repeats. So the split happens at runs of a and b in any order. In the second case you don't capture (a|b), so you get the plain split result.
The string 12aba34ba5 is being split on runs of one or more a or b characters, giving the result 12, 34, 5.
However, you also have a capture in the split regex, which inserts the captured string into the split list.
If you write 'aba' =~ /(a|b)+/ then there are three occurrences of the pattern (a|b), but only the last one can be saved in $1, and this is the value that split inserts.
So you are picking up the last value of aba (a) and the last value of ba (another a) and inserting them into the list, giving 12, a, 34, a, 5.
If you wanted the letters separated from the numbers, you could write
@num = split /((?:a|b)+)/, $x;
or, equivalently and more neatly
@num = split /([ab]+)/, $x;
giving 12, aba, 34, ba, 5
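Python's re.split treats capture groups in the separator the same way, which makes it a convenient place to experiment. This is a Python analogue rather than the Perl code itself, and the variable names are chosen here for illustration.

```python
import re

x = "12aba34ba5"

# Capturing group repeated with +: only the last captured letter is kept.
last_only = re.split(r"(a|b)+", x)
# Non-capturing group: the separators are dropped entirely.
dropped = re.split(r"(?:a|b)+", x)
# Capturing the whole run keeps each separator intact.
whole_run = re.split(r"([ab]+)", x)

print(last_only)  # ['12', 'a', '34', 'a', '5']
print(dropped)    # ['12', '34', '5']
print(whole_run)  # ['12', 'aba', '34', 'ba', '5']
```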