Regex - Can quantifier skip certain range?

I have a simple regex like this [0-9a-zA-Z]{32,45} that matches 0-9,a-z,A-Z 32 to 45 times. Is there a way I can have the regex skip a certain range? For example, I don't want to match if there are 40 characters.

One way to do that would be
\b[0-9a-zA-Z]{32,39}+(?:[0-9a-zA-Z]{2,6})?\b
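You match 32 to 39 occurrences possessively, then an optional run of 2 to 6 more occurrences of the same character class. Because the possessive quantifier never gives characters back, a 40-character string consumes 39 characters, leaves a single character that the optional group cannot match, and then fails at \b.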

Another way is to use an alternation | that repeats the character class either 41-45 times or 32-39 times, with a word boundary \b prepended and appended to the pattern:
\b(?:[0-9a-zA-Z]{41,45}|[0-9a-zA-Z]{32,39})\b
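As a rough check of the alternation approach, here is a minimal Python sketch. The helper name accepts and the test strings made of repeated "a" characters are assumptions for illustration only. The alternation is used rather than the possessive form because Python's re module only supports possessive quantifiers from version 3.11.

```python
import re

# Alternation from the answer: 41-45 or 32-39 alphanumerics between word boundaries.
PATTERN = re.compile(r"\b(?:[0-9a-zA-Z]{41,45}|[0-9a-zA-Z]{32,39})\b")


def accepts(length):
    """Return True if a run of `length` alphanumeric characters matches.

    Hypothetical helper: the input is simply the letter "a" repeated.
    """
    return PATTERN.search("a" * length) is not None


if __name__ == "__main__":
    for n in (31, 32, 39, 40, 41, 45, 46):
        print(n, accepts(n))
```

Only the lengths 32 to 39 and 41 to 45 should be accepted; the word boundaries are what stop a 40-character run from matching through a shorter prefix.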

Related

how to exclude digits using regular expression in VBA

Hello, I need to exclude the sequence of digits from 890000 to 890001;
890002 to 899999 is acceptable.
Is it possible to do this using a regular expression?

No need for regex.
If Value >= 890002 And Value <= 899999 Then
' Accept
End If

OK, if you insist on using regex (maybe for learning purposes):
In this simple case it is actually easier to exclude those two numbers and match the rest:
^89(?!000[01])\d{4}$
Explanation:
^ match from start of text
89 match 89
(?!000[01]) negative lookahead for three zeros followed by one of the characters in the character class (0 or 1). If this doesn't block the match:
\d{4} match 4 digits
$ match end of text.
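The same lookahead idea can be tried outside VBA. The Python sketch below is an illustration only: the function name is_accepted is hypothetical, and the lookahead uses 000[01] so that 890000 and 890001 are the two rejected values, as the question asks.

```python
import re

# 89, then four digits that are not 0000 or 0001.
PATTERN = re.compile(r"^89(?!000[01])\d{4}$")


def is_accepted(value):
    """Return True for 890002..899999, False otherwise (hypothetical helper)."""
    return PATTERN.match(str(value)) is not None


if __name__ == "__main__":
    for v in (890000, 890001, 890002, 899999, 900000):
        print(v, is_accepted(v))
```

The plain numeric comparison from the first answer remains the simpler option; the regex version is mainly useful when the value arrives as text.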

modifying regex so that it finds space and put minimum length restriction

I currently have the following regex
(?:(?<=^)|(?<=\s))(?:\+62|08)\S+\b
the issue is that it can't find texts like
0823 2371 2318
or
+62812 2712 2819
Basically, what follows 08 or +62 can be a digit from 0-9, a single space, a dot, or an _.
I also need to restrict it so that it only finds matches of 10 characters or more.

You may use
(?:\+62|08)[\s._-]?(?=[\d\s._-]{8})\d+(?:[\s._-]\d+)*\b
Details
(?:\+62|08) - +62 or 08
[\s._-]? - an optional whitespace, ., _ or -
(?=[\d\s._-]{8}) - there must be 8 digits/whitespaces/dots/hyphens or underscores immediately to the right of the current location
\d+ - 1+ digits
(?:[\s._-]\d+)* - zero or more repetitions of a whitespace/dot/underscore/hyphens and then 1+ digits
\b - word boundary.
If you need to match only substrings with 10 or more digits, replace the (?=[\d\s._-]{8}) lookahead with (?=(?:[\s._-]*\d){8}), or use the (?:\+62|08)(?:[\s._-]*\d){8,}\b regex.
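To see how the first pattern behaves on the samples from the question, here is a small Python sketch. The function name find_numbers and the surrounding sentence text are assumptions made for the example.

```python
import re

# Prefix, optional separator, a lookahead requiring 8 more allowed characters,
# then digit groups separated by single whitespace/dot/underscore/hyphen.
PHONE = re.compile(r"(?:\+62|08)[\s._-]?(?=[\d\s._-]{8})\d+(?:[\s._-]\d+)*\b")


def find_numbers(text):
    """Return every phone-like substring found in `text` (hypothetical helper)."""
    return PHONE.findall(text)


if __name__ == "__main__":
    print(find_numbers("call 0823 2371 2318 or +62812 2712 2819"))
    print(find_numbers("too short: 08123"))
```

The lookahead is what enforces the minimum length: after the prefix there must be at least 8 digits or separators, so a short string such as 08123 is skipped.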

This RegEx divides your input phone numbers into three groups and it might match your desired patterns in an input string:
(\+62|08)([0-9]+)(\s[0-9]{4})+
You can simply add prefixes other than +62 or 08 to the first group, if you wish.
Edit: I'm not sure if this would cover your input samples. You could maybe modify it using a RegEx such as this one:
(\+62|08)([0-9\.\-\s\_]{3,14})

use ultraedit find and replace Perl regex to insert colon into 4 digit time string

I have multiple 24-hour time strings through several files. For example, 1234, which I wish to replace with 12:34.
Finding them is easy: just \d\d\d\d, which I understand and which works. However, what replace string do I need? In other words, given xx:xx, what do I put in place of each x?
I've tried numbers of things to no avail. I'm obviously not understanding how I get it to remember the digits it found and to recall them in the replace string.

If the 4 digits in your example data represent 24-hour time strings, you could match 2 capturing groups between word boundaries to prevent a match with more than 4 digits. You can adjust the word boundaries to your requirements.
Match
\b(\d{2})(\d{2})\b
Replace
\1:\2 (group 1, a colon, then group 2)
Explanation
\b Match a word boundary
(\d{2}) Capture in a group 2 digits
(\d{2}) Capture in a group 2 digits
\b Match a word boundary
Note
Matching 4 digits does not verify a valid 24 hour time. You could match that using for example \b([01][0-9]|2[0-3])([0-5][0-9])\b and replace with \1:\2
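The capture-and-recall idea is the same in most regex engines. As an illustration outside UltraEdit, this Python sketch applies the stricter 24-hour pattern from the note; the function name add_colons and the sample strings are assumptions.

```python
import re

# Group 1: hours 00-23, group 2: minutes 00-59, both between word boundaries.
TIME = re.compile(r"\b([01][0-9]|2[0-3])([0-5][0-9])\b")


def add_colons(text):
    """Insert a colon between the captured hours and minutes (hypothetical helper)."""
    return TIME.sub(r"\1:\2", text)


if __name__ == "__main__":
    print(add_colons("start 0930, end 2359"))
    print(add_colons("2561 12345"))
```

Strings such as 2561 (not a valid time) and 12345 (five digits) are left untouched, because either the groups or the closing word boundary fail to match.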

Regexp: find out if value that repeats several times

I have strings:
TH 8H 5C QS TC
9S 4S JS KS JS
I want the second one to be picked up by a regexp. Please help me construct the necessary expression.
What I tried so far is S{5}, but of course it looks for five consecutive S characters.
Can I avoid specifying which character I am looking for? I need 5 repetitions of any character. Could it be something like .{5}?
Thanks in advance!

If you have standalone strings, use
^\wS(?: \wS){4}$
If these strings appear inside a larger text, replace the ^ and $ anchors with word boundaries \b:
\b\wS(?: \wS){4}\b
Note that \w matches any alphanumeric or underscore character. If there can be any non-whitespace character, use \S instead:
\b\SS(?: \SS){4}\b
\SS will match a non-whitespace character followed by an S, and (?: \SS){4} will match 4 more such sequences (thus, there will be 5 two-character sequences with S at the end of each).
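A short Python sketch of the anchored pattern follows. The second pattern is not from the answer: it is one possible way to handle the "any suit" part of the question, using a backreference \1 to require that every card ends with whatever character the first card ended with. The function names are hypothetical.

```python
import re

# From the answer: five two-character cards, each ending in S.
SPADES = re.compile(r"^\wS(?: \wS){4}$")

# Assumption, not from the answer: capture the first suit and reuse it via \1.
SAME_SUIT = re.compile(r"^\w(\w)(?: \w\1){4}$")


def is_spade_hand(hand):
    """True if all five cards end in S."""
    return SPADES.match(hand) is not None


def is_same_suit(hand):
    """True if all five cards end in the same character, whatever it is."""
    return SAME_SUIT.match(hand) is not None


if __name__ == "__main__":
    for hand in ("TH 8H 5C QS TC", "9S 4S JS KS JS", "TH 8H 5H QH 2H"):
        print(hand, is_spade_hand(hand), is_same_suit(hand))
```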

Regex to invert search in Notepad++

I have a string
2012-02-19 00:11:12,128|DEBUG|Thread-1|### Time taken is 18 ms
Below regex allows me to search for 18 ms
\d\d\s[m][s]
What I want to do is search for the string prior to 18 ms in Notepad++ and then delete it, so that out of the thousands of rows I have, I can extract just the timings.
Also, I need the regex mentioned above to work with timings that have 3 digits as well as 2 digits. For example, it should be able to find 18 ms as well as 999 ms.
Please help.

You may put your regex into a positive lookahead:
^.*(?=\b\d{2,3}\sms\s*$)
In case you have some text after 18 ms, use a word boundary \b after ms instead of the end-of-line anchor:
^.*(?=\b\d{2,3}\sms\b)
\b matches at a word boundary, which allows a "whole words only" search of the form \bword\b. The \b before \d matters here: .* is greedy, so without it the match would also swallow the first digit of a 3-digit timing such as 999 ms and leave only 99 ms.
{2,3} is a limiting quantifier that matches 2 or 3 occurrences of the preceding subpattern. The general syntax is {min,max}, where min is zero or a positive integer giving the minimum number of repetitions, and max is an integer equal to or greater than min giving the maximum. If the comma is present but max is omitted, there is no upper limit.
Replace with an empty string, and 18 ms will stay on the line.
Note: you can use \d+ to allow 1 or more digits to be matched (with no restriction on the number of digits).
Note 2: if your number is the first of many on the line, you need lazy matching, i.e. use .*? instead of .* at the beginning of the pattern.

Also, I need regex mentioned above to work with timings which are in 3 digits as well as 2 digits.
.*?(?=\d{2,3}\sms\b)
Use the above regex and then replace the match with empty string.

You can use a capturing group (with a lazy .*? so that the group keeps every digit of the timing):
Find:
^.*?(\d{2,}\s[m][s])$
Replace with:
\1
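The effect of lazy versus greedy matching in front of the lookahead can be checked with a short Python sketch. The second log line (with 999 ms) is invented for the example, and the names keep_timing, PREFIX and GREEDY are hypothetical.

```python
import re

LINE_2 = "2012-02-19 00:11:12,128|DEBUG|Thread-1|### Time taken is 18 ms"
# Assumed sample line with a 3-digit timing (not from the original post).
LINE_3 = "2012-02-19 00:11:13,001|DEBUG|Thread-1|### Time taken is 999 ms"

# Lazy prefix: stops at the first position followed by "<2-3 digits> ms".
PREFIX = re.compile(r"^.*?(?=\d{2,3}\sms\b)")

# Greedy prefix with no boundary before the digits, kept only for comparison.
GREEDY = re.compile(r"^.*(?=\d{2,3}\sms\b)")


def keep_timing(line):
    """Delete everything before the timing by replacing the prefix with ''."""
    return PREFIX.sub("", line)


if __name__ == "__main__":
    print(keep_timing(LINE_2))
    print(keep_timing(LINE_3))
    print(GREEDY.sub("", LINE_3))
```

The greedy form backtracks only as far as it must, so on a 3-digit timing it stops once two digits remain for the lookahead; the lazy form (or a \b before the digits) avoids losing that first digit.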
