overall matching - regex

Searching for a single character with a regex is easy.
For example, to require at least one digit:
\d
Now I need to match at least 2 digits anywhere in the text.
.*\d{2}.* or .*\d\d.* #### "d2dr5" -> no match; "d22r" or "d00r" match
This does not work because the regex engine looks for consecutive digits. How can I count matches across the whole string? For example:
I want to match text with at least 3 digits and 2 uppercase letters, and the text can be at most 12 characters long. How can I do this? An explained example would give me a starting point for further research.
Example match:
a9r2lDpDf2 - matches: at least 3 digits, 2 uppercase letters, and no more than 12 characters in total.

If you want to make sure there are exactly three digits in the string, you can try this (add start-of-string and end-of-string anchors if needed):
[^\d]*\d[^\d]*\d[^\d]*\d[^\d]*
[^\d]* - anything except digits.
The same pattern can be used to check for uppercase letters:
[^A-Z]*[A-Z][^A-Z]*[A-Z][^A-Z]*
Regex is not the best tool for checking length. The language you use has something like length(str), str.length, or str.length().
It can be done with the lookahead feature, though. This is how the regex looks in Perl:
/^(?=.*\d.*\d.*\d)(?=.*[A-Z].*[A-Z]).{12}$/
(?=.*\d.*\d.*\d) - "looks ahead" to see if there are at least 3 digits
(?=.*[A-Z].*[A-Z]) - "looks ahead" to see if there are at least 2 uppercase letters
.{12} - the length must be exactly 12 characters (any character, 12 times). For a maximum of 12 characters, as the question asks, use .{1,12} instead.
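For comparison, here is a minimal Python sketch of the same lookahead idea. It assumes that "max 12" means 1 to 12 characters, and the name meets_rules is made up for illustration; it is not from the answer above.

```python
import re

# Two lookaheads count characters anywhere in the string; the final
# .{1,12} caps the total length. Assumption: "max 12" means 1 to 12.
PATTERN = re.compile(r"(?=(?:.*\d){3})(?=(?:.*[A-Z]){2}).{1,12}")


def meets_rules(text: str) -> bool:
    """True if text has >= 3 digits, >= 2 uppercase letters and <= 12 chars."""
    # fullmatch anchors the pattern at both ends, much like ^...$ in Perl.
    return PATTERN.fullmatch(text) is not None


print(meets_rules("a9r2lDpDf2"))  # sample string from the question
print(meets_rules("d2dr5"))       # only two digits and no uppercase letter
```

Using fullmatch instead of explicit anchors is just a design choice here; writing ^ and $ into the pattern and calling search should behave the same for single-line input.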

I don't think regexes are the optimal solution here, but for academic interest:
(?=(.*[0-9]){3})(?=(.*[A-Z]){2}).{5,12}
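If a regex is not required, plain counting may be easier to read. A small Python sketch, assuming ASCII digits and ASCII uppercase letters are what is meant; meets_rules_plain is a hypothetical name used only for this example.

```python
import string


def meets_rules_plain(text: str, max_len: int = 12) -> bool:
    """Check the three rules from the question without any regex."""
    # Count ASCII digits and ASCII uppercase letters wherever they occur.
    digits = sum(ch in string.digits for ch in text)
    uppers = sum(ch in string.ascii_uppercase for ch in text)
    return digits >= 3 and uppers >= 2 and len(text) <= max_len


print(meets_rules_plain("a9r2lDpDf2"))
```

A minimum length of 5 does not need its own check, because 3 digits plus 2 uppercase letters already imply at least 5 characters.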

Related

Regex to match less than a two-digit number

I have some records as such, in a file:
A 20 year old programmer
A 52 year old politician
A 12 year old practitioner
Many many more of these...
I want to match only lines that contain a number less than 20. I have tried:
^[0-20]{2}$
But it only works for the digits 0-2. How should I construct a regular expression to match numbers < 20? For instance, it should match:
A 12 year old practitioner
But not
A 20 year old programmer
A 52 year old politician
You may use
\b1?[0-9]\b
Details
\b - a word boundary
1?[0-9] - an optional 1 and any digit after it
\b - a word boundary
Word boundary variations
To match anywhere in a string, even if glued to a word:
(?<!\d)1?[0-9](?!\d)
To match only between whitespace characters (or at the start/end of the string):
(?<!\S)1?[0-9](?!\S)
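The difference between the three boundary styles is easiest to see side by side. A hedged Python sketch; the sample strings and the helper first_match are invented for illustration.

```python
import re

BOUNDARY = re.compile(r"\b1?[0-9]\b")                   # word boundaries
NO_DIGIT_AROUND = re.compile(r"(?<!\d)1?[0-9](?!\d)")   # no digit on either side
WHITESPACE_AROUND = re.compile(r"(?<!\S)1?[0-9](?!\S)") # whitespace or string edge


def first_match(pattern, text):
    """Return the first matched substring, or None if there is no match."""
    found = pattern.search(text)
    return found.group() if found else None


for sample in ["is 12 ok", "age12", "(12)", "age123"]:
    print(
        sample,
        first_match(BOUNDARY, sample),
        first_match(NO_DIGIT_AROUND, sample),
        first_match(WHITESPACE_AROUND, sample),
    )
```

Only the digit-lookaround version finds the 12 glued to "age", and only the whitespace version rejects "(12)".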
Using regex to match digit ranges is usually a bit clumsy, but here, you can do it pretty simply with:
\b1?\d\b
https://regex101.com/r/YCWmNo/2
In plain language: an optional one, followed by a digit. So, any standalone digit is allowed, but a two-digit number needs its first digit to be a 1.
If you want to permit leading zeros, change to \b[01]?\d\b.
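As a rough illustration, this is how the pattern could be used in Python to filter the records from the question. The function name lines_under_20 is an assumption made for this sketch.

```python
import re

# An optional 1 followed by one digit, standing alone as a "word".
UNDER_20 = re.compile(r"\b1?\d\b")


def lines_under_20(lines):
    """Keep only the lines that contain a standalone number below 20."""
    return [line for line in lines if UNDER_20.search(line)]


records = [
    "A 20 year old programmer",
    "A 52 year old politician",
    "A 12 year old practitioner",
]
print(lines_under_20(records))
```

Note that any standalone number below 20 anywhere in a line counts, so a record such as "A 52 year old with 2 cats" would also be kept.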

Regex - Exactly 7 digits no more no less

I am looking for help here. I want to write a regex that finds a number with EXACTLY 7 digits in a string - no more, no less.
For instance in this string:
1234567 RE:TKT-2744870-R6P1G0: Gentle Reminder
It should return only 1234567
In this one:
12345678 RE:TKT-2744870-R6P1G0: Gentle Reminder
It should return none.
Can you help me with this one?
Thanks in advance.
The proper regex should include \d{7} (7 digits) and 2 "border criteria",
for both start and end of the match, to block matching of a fragment
from longer sequence of digits.
My first thought was that there can be no digit either before or after the match.
But as I see from your example, these border criteria should be extended.
The set of "forbidden" chars (either before or after the match) should
also include - and letters.
E.g. 2744870 in your example data contains just 7 digits (no more, no less),
but you still don't want it to be matched, apparently because it is surrounded by - chars.
To keep the regex short, I propose:
(?<![\w-])\d{7}(?![\w-])
Details:
(?<![\w-]) - Negative lookbehind for word char or -.
\d{7} - 7 digits.
(?![\w-]) - Negative lookahead for word char or -.
If you decide to extend the set of "forbidden" chars in both border criteria,
just add them to the [...] fragments in the lookbehind / lookahead (but the - char
should remain at the end, otherwise it must be escaped with \).
A regex like (\d{7})[^\d] (from another answer) is wrong,
as it matches the last 7 digits of any longer sequence of digits
(no "front border criterion").
It also matches 2744870 (surrounded by - chars), which is not
to be matched.
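To make the difference concrete, here is a short Python check of both patterns against the two sample strings from the question. The constant names STRICT and LOOSE are only labels chosen for this sketch.

```python
import re

STRICT = re.compile(r"(?<![\w-])\d{7}(?![\w-])")  # lookarounds on both sides
LOOSE = re.compile(r"(\d{7})[^\d]")               # no check in front of the digits

ok = "1234567 RE:TKT-2744870-R6P1G0: Gentle Reminder"
too_long = "12345678 RE:TKT-2744870-R6P1G0: Gentle Reminder"

print(STRICT.findall(ok))        # ['1234567']
print(STRICT.findall(too_long))  # []
print(LOOSE.findall(too_long))   # ['2345678', '2744870']
```

The loose pattern picks up the tail of the 8-digit number and the ticket number, which is exactly the behaviour described above.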
This one should do for your examples:
(\d{7})[^\d]
The first matching group contains the seven digits.
Alternatively –as suggested in the comments– you can use a negative lookahead to only match the seven digits and not require matching groups:
^\d{7}(?!\d)

Regex - matching while ignoring some characters

I am trying to write a regex to match a sequence of numbers that is 5 digits long or more, ignoring any spaces, dashes, parens, or hashes when counting. Here's what I have so far.
(\d|\(|\)|\s|#|-){5,}
The problem is that this will match any sequence of 5 characters including the characters I want to ignore, so something like "#123 " would match. While I do want to ignore the # and space characters, I still need the number itself to be 5 digits or more in order to qualify as a match.
To be clear, these would match:
1-2-3-4-5
123 45
2(134) 5
Bonus points if the matching begins and ends with a number rather than with one of those "special characters" I am excluding.
Any tips for doing this kind of matching?
If I understood the requirements correctly, you can use:
^\d(?:[()\s#-]*\d){4,}$
It always matches a digit at the start. That digit is then followed by 4 or more repetitions of the non-capturing group (?:[()\s#-]*\d), which means 0 or more of the listed special characters followed by a digit.
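A quick way to try this pattern against the examples from the question, sketched in Python. The function name is_long_number is hypothetical, and fullmatch stands in for the ^ and $ anchors.

```python
import re

# One digit, then at least four more digits, each optionally preceded
# by any run of the ignorable characters ( ) whitespace # -
FIVE_PLUS_DIGITS = re.compile(r"\d(?:[()\s#-]*\d){4,}")


def is_long_number(text: str) -> bool:
    """True if the whole string is 5+ digits with only ignorable separators."""
    return FIVE_PLUS_DIGITS.fullmatch(text) is not None


for sample in ["1-2-3-4-5", "123 45", "2(134) 5", "#123 "]:
    print(repr(sample), is_long_number(sample))
```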
So just repeat a digit followed by any sequence of the allowed characters, 5 or more times:
^(\d[()\s#-]*){5,}$
You can ensure it ends on a digit if you subtract one of the repetitions and add an explicit digit at the end:
^(\d[()\s#-]*){4,}\d$
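If the number has to be found inside a longer text rather than validated as a whole string, one option is to drop the anchors and strip the ignored characters afterwards. A Python sketch under that assumption; find_digit_runs and the sample sentence are invented for illustration.

```python
import re

# Same idea as the pattern above, without ^ and $, so it can be searched for.
RUN = re.compile(r"(?:\d[()\s#-]*){4,}\d")
IGNORED = re.compile(r"[()\s#-]")


def find_digit_runs(text: str) -> list:
    """Return every run of 5+ digits with the ignorable characters removed."""
    return [IGNORED.sub("", found.group()) for found in RUN.finditer(text)]


print(find_digit_runs("call #123 or 2(134) 5 now"))  # ['21345']
```

Because the pattern starts and ends on a digit, each raw match also begins and ends with a number, which covers the "bonus points" part of the question.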
You can match non-digits with \D, so it would be something like:
(\d\D*){5,}

Check if a string contains at least 10 digits, 12 uppercase letter and 20 lowercase letter

What could be the regular expression to have at least 10 digits, 12 uppercase letter and 10 lowercase letters?
The string can start with any of the above and could be randomly
placed. For example, AB12jgGGfWisLWfoi34R32SgD42DSf3453jfh.
For at least two uppercase letters, two lowercase letters and two digits I used (?=.*\\d.*\\d)(?![.\\n])(?=.*[A-Z].*[A-Z])(?=.*[a-z].*[a-z]).*$, but repeating \\d 10 times in the expression above doesn't seem like good practice.
Moreover, using \\d{10} doesn't work, as it expects 10 consecutive digits.
You can use this regex:
^(?=(.*?\d){10})(?=(.*?[A-Z]){12})(?=(.*?[a-z]){10})[a-zA-Z0-9]+$
Or an even better-performing regex:
^(?=(?:\D*\d){10})(?=(?:[^A-Z]*[A-Z]){12})(?=(?:[^a-z]*[a-z]){10})[a-zA-Z0-9]+$
This is because a negated character class performs better than the lazy quantifier .*? (thanks to #nhahtdh).
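Since the three minimum counts are the only things that change, the pattern can be generated instead of typed by hand. A Python sketch of that idea; build_min_counts_pattern is a hypothetical helper, and it spells the classes as [0-9] and [^0-9] so that only ASCII digits count.

```python
import re


def build_min_counts_pattern(digits: int, uppers: int, lowers: int) -> re.Pattern:
    """Build the negated-class lookahead pattern for the given minimum counts."""
    return re.compile(
        rf"(?=(?:[^0-9]*[0-9]){{{digits}}})"   # at least `digits` digits
        rf"(?=(?:[^A-Z]*[A-Z]){{{uppers}}})"   # at least `uppers` uppercase
        rf"(?=(?:[^a-z]*[a-z]){{{lowers}}})"   # at least `lowers` lowercase
        r"[a-zA-Z0-9]+"                        # letters and digits only
    )


sample = "AB12jgGGfWisLWfoi34R32SgD42DSf3453jfh"  # example from the question
print(build_min_counts_pattern(10, 12, 10).fullmatch(sample) is not None)
```

The sample string from the question contains 12 digits, 12 uppercase letters and 13 lowercase letters, so it passes with the 10/12/10 limits used in the answers.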

Regex representation for licence plate

Pattern for example:
L(IP)-P(F)-2(014)
More examples:
B-G-2
BI-GH-1245
HH-X-124
The chars in brackets are optional. The first part (max 3 chars, min 1) and the second part (max 2 chars, min 1) consist of letters only. The third part (max 4, min 1) consists of numbers only. The parts are separated by "-".
Any ideas what a regex for this would look like?
You can use the character class [A-Z] to match any uppercase character, and \d to match any digit. You can specify repetition using {m,n}, which means "match the previous element between m and n times".
It might look something like this:
[A-Z]{1,3}-[A-Z]{1,2}-[0-9]{1,4}
You may also want to add beginning and end of string anchors (^ and $ respectively):
^[A-Z]{1,3}-[A-Z]{1,2}-[0-9]{1,4}$
This depends on whether you are trying to pull license plates out of a larger string or trying to see if a particular string is a license plate (and nothing else).
If you also need to match lowercase characters, change each of the [A-Z] classes to [A-Za-z].
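Both uses can be sketched in Python as follows. The names is_plate and find_plates are made up for this example, and the \b word boundaries in the search variant are my own addition, not part of the answer, to avoid picking fragments out of longer tokens.

```python
import re

PLATE = r"[A-Z]{1,3}-[A-Z]{1,2}-[0-9]{1,4}"


def is_plate(text: str) -> bool:
    """Validate that the whole string is a plate (equivalent to ^...$)."""
    return re.fullmatch(PLATE, text) is not None


def find_plates(text: str) -> list:
    """Pull plates out of a larger string; \\b is an added assumption."""
    return re.findall(rf"\b{PLATE}\b", text)


print(is_plate("BI-GH-1245"))
print(find_plates("Seen: BI-GH-1245 and HH-X-124."))
```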
You can use this regex:
^[A-Za-z]{1,3}-[A-Za-z]{1,2}-[0-9]{1,4}$
If I interpret you correctly, you basically need
<1-3 letters>-<1-2 letters>-<1-4 numbers>
or [A-Za-z]{1,3}-[A-Za-z]{1,2}-[0-9]{1,4}