Regular Expression beginning of string with special characters - regex

Using this as an example string
+$43073$7
I need the 5-digit sequence from it. I'm using the regex
@"\$+(?<lot>\d{5})"
which matches any $ followed by five digits anywhere in the string. I tried
@"^\$+(?<lot>\d{5})"
since the +$ is always at the beginning of the string. What will work?

If you use the anchor ^, you need to include the + symbol first, and don't forget to escape it, because + is a metacharacter in regex that repeats the previous token one or more times.
@"^\+\$(?<lot>\d{5})"
Without the anchor, it would be
@"\$(?<lot>\d{5})"
Then get the 5-digit number you want from the named group lot (group index 1).
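For comparison, here is a minimal JavaScript sketch of the anchored pattern. The thread itself uses .NET string syntax; the function name extractLot is hypothetical and only serves the illustration.

```javascript
// Anchored pattern: a literal "+" and "$" at the start of the string,
// then five digits captured in the named group "lot".
const anchoredLot = /^\+\$(?<lot>\d{5})/;

// Hypothetical helper: returns the five digits, or null when the
// string does not start with "+$" followed by five digits.
function extractLot(input) {
  const m = anchoredLot.exec(input);
  return m ? m.groups.lot : null;
}

console.log(extractLot("+$43073$7"));  // 43073
console.log(extractLot("x+$43073$7")); // null
```

Because of the ^ anchor, a "+$" that appears later in the string is ignored.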

I would match what you want:
\d+
or if you only want digits after "special" characters at the start of input:
^\W+(\d+)
grabbing group 1
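The difference between the two patterns can be sketched in JavaScript as follows. The sample strings and variable names are assumptions chosen for illustration.

```javascript
const sample = "+$43073$7";

// \d+ returns the first run of digits wherever it occurs.
const firstDigits = sample.match(/\d+/)[0];

// ^\W+(\d+) requires one or more non-word characters at the start
// and captures the digits that follow them in group 1.
const afterSymbols = sample.match(/^\W+(\d+)/)[1];

// A string that starts with a digit has no leading non-word
// characters, so the anchored pattern does not match at all.
const noLeadingSymbols = "7$43073".match(/^\W+(\d+)/);

console.log(firstDigits);      // 43073
console.log(afterSymbols);     // 43073
console.log(noLeadingSymbols); // null
```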

Related

Regular Expression: Find a specific group within other groups in VB.Net

I need to write a regular expression that replaces everything except a single group.
E.g.
IN: OK THT PHP This is it 06222021
OUT: This is it
IN: NO MTM PYT Get this content 111111
OUT: Get this content
I wrote the following Regular Expression: (\w{0,2}\s\w{0,3}\s\w{0,3}\s)(.*?)(\s\d{6}(\s|))
This regex creates 4 groups. Using the first entry as an example, the groups are:
OK THT PHP
This is it
06222021
Space character
I need a way to:
Replace Group 1,2,4 with String.Empty
OR
Get Group 3, ONLY
You don't need 4 groups. You can use a single capture group, reference it as $1 in the replacement, and match 6-8 digits for the last part instead of only 6.
Note that \w{0,2} will also match an empty string; use \w{1,2} if there has to be at least one word character.
^\w{0,2}\s\w{0,3}\s\w{0,3}\s(.*?)\s\d{6,8}\s?$
^ Start of string
\w{0,2}\s\w{0,3}\s\w{0,3}\s Match three runs of word characters, each followed by a whitespace character
(.*?) Capture group 1, match any character as few times as possible
\s\d{6,8} Match a whitespace character and 6-8 digits
\s? Match an optional whitespace character
$ End of string
Example code
Dim s As String = "OK THT PHP This is it 06222021"
Dim result As String = Regex.Replace(s, "^\w{0,2}\s\w{0,3}\s\w{0,3}\s(.*?)\s\d{6,8}\s?$", "$1")
Console.WriteLine(result)
Output
This is it
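The same replacement can be sketched in JavaScript, where "$1" in the replacement string also refers to capture group 1. The function name keepMiddle is hypothetical.

```javascript
// Same pattern as above: three short words, the captured text,
// then 6-8 digits and an optional trailing whitespace character.
const linePattern = /^\w{0,2}\s\w{0,3}\s\w{0,3}\s(.*?)\s\d{6,8}\s?$/;

// Hypothetical helper: replaces the whole line with capture group 1.
// A line that does not match the pattern is returned unchanged.
function keepMiddle(line) {
  return line.replace(linePattern, "$1");
}

console.log(keepMiddle("OK THT PHP This is it 06222021"));     // This is it
console.log(keepMiddle("NO MTM PYT Get this content 111111")); // Get this content
```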
My approach does not work with groups and does not use a Replace operation. The match itself yields the desired result.
It uses look-around expressions. To find a pattern between two other patterns, you can use the general form
(?<=prefix)find(?=suffix)
This will only return find as match, excluding prefix and suffix.
If we insert your expressions, we get
(?<=\w{0,2}\s\w{0,3}\s\w{0,3}\s).*?(?=\s\d{6}\s?)
where I simplified (\s|) as \s?. We can also drop it completely, since we don't care about trailing spaces.
(?<=\w{0,2}\s\w{0,3}\s\w{0,3}\s).*?(?=\s\d{6})
Note that this also works if we have more than 6 digits, because the lookahead is satisfied as soon as it has found 6 digits and doesn't care about what follows.
This also gives a match if other things precede our pattern like in 123 OK THT PHP This is it 06222021. We can exclude such results by specifying that the search must start at the beginning of the string with ^.
If the exact length of the words and numbers does not matter, we simply write
(?<=^\w+\s\w+\s\w+\s).*?(?=\s\d+)
If the find part can contain numbers, we must specify that we want to match until the end of the line with $ (and include a possible space again).
(?<=^\w+\s\w+\s\w+\s).*?(?=\s\d+\s?$)
Finally, we use a quantifier for the 3 occurrences of word-space:
(?<=^(\w+\s){3}).*?(?=\s\d+\s?$)
This is compact and will only return This is it or Get this content.
string result = Regex.Match(input, @"(?<=^(\w+\s){3}).*?(?=\s\d+\s?$)").Value; // input is the line to search
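As a rough JavaScript equivalent, assuming a runtime that supports lookbehind assertions (ES2018 or later), the match itself is the result and no group or replacement is needed. The function name middleText is hypothetical.

```javascript
// Lookbehind: three word+whitespace runs counted from the start of the string.
// Lookahead: whitespace, digits, optional whitespace, end of the string.
const betweenPattern = /(?<=^(\w+\s){3}).*?(?=\s\d+\s?$)/;

// Hypothetical helper: returns the text between prefix and suffix,
// or null when the line does not have that shape.
function middleText(line) {
  const m = line.match(betweenPattern);
  return m ? m[0] : null;
}

console.log(middleText("OK THT PHP This is it 06222021"));     // This is it
console.log(middleText("NO MTM PYT Get this content 111111")); // Get this content
```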

Validating usernames using regular expressions

I have to validate a username in React.
The conditions are:
It should be alphanumeric
It should be longer than 5 characters and shorter than 11 characters
It should not start with a digit
My solution is not working:
value.match(/^[a-zA-Z][a-zA-Z0-9]{6,10}/)
You just need to change {6,10} to {5,9}, since [a-zA-Z] already matches one character, and add $ at the end:
value.match(/^[a-zA-Z][a-zA-Z0-9]{5,9}$/)
You are already matching 1 character in the first character class [a-zA-Z].
To match more than 5 and fewer than 11 characters in total, you could use {5,9} as the quantifier for the second character class and assert the end of the string with $. This prevents the match from returning the first 10 characters when the string is longer than 10 characters.
^[a-zA-Z][a-zA-Z0-9]{5,9}$
const strings = [
  "A123456789BBBBBBB",
  "A123456789"
];
let pattern = /^[a-zA-Z][a-zA-Z0-9]{5,9}$/;
strings.forEach((value) => {
  console.log(value.match(pattern));
});
Another method is to use lookaheads to validate the string against each rule.
You could use the pattern ^(?=[a-zA-Z0-9]{6,10}$)(?!\d).+.
The lookahead (?=[a-zA-Z0-9]{6,10}$) asserts that what follows the beginning of the string ^ is 6 to 10 alphanumeric characters, that is, more than 5 and fewer than 11.
The second, negative, lookahead (?!\d) prevents a match when the first character of the string is a digit.
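A minimal sketch of the lookahead approach follows. It uses the bounds 6 to 10 so that it agrees with the stated requirement of more than 5 and fewer than 11 characters; the function name isValidUsername is hypothetical.

```javascript
// First lookahead: the whole string is 6 to 10 alphanumeric characters.
// Second lookahead: the first character is not a digit.
const usernamePattern = /^(?=[a-zA-Z0-9]{6,10}$)(?!\d).+/;

// Hypothetical helper wrapping the pattern in a boolean check.
function isValidUsername(value) {
  return usernamePattern.test(value);
}

console.log(isValidUsername("abc123"));      // true  (6 characters)
console.log(isValidUsername("abc12"));       // false (only 5 characters)
console.log(isValidUsername("abcdefghijk")); // false (11 characters)
console.log(isValidUsername("1abcdef"));     // false (starts with a digit)
console.log(isValidUsername("abc_123"));     // false (underscore is not alphanumeric)
```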

Regex to extract last period and md5 string

I have the following regular expression:
/^[a-f0-9]{8}$/ --- This expression extracts an 8 character string as a md5 hash, for example: if I have the following string "hello world .305eef9f x1xxx 304ccf9f test1232" it will return "304ccf9f"
I also have the following regular expression:
/.[^.]*$/ --- This expression extracts a string after the last period (included), for example, if I have "hello world.this.is.atest.case9.23919sd3xxxs" it will return ".23919sd3xxxs"
Thing is, I've read a bit about regex, but I can't combine both expressions to find the md5 string after the last period (included), for example:
topLeftLogo.93f02a9d.controller.99f06a7s ----> must return ".99f06a7s"
Thanks in advance for your time and help!
/^[a-f0-9]{8}$/ --- This expression extracts an 8 character string as a md5 hash
Yes but it doesn't return "304ccf9f" from "hello world .305eef9f x1xxx 304ccf9f test1232" because ^ in regex means start of string. How is it possible for it to match in middle of a string?
/.[^.]*$/ --- This expression extracts a string after the last period
No. It only does that if you escape the first dot: \.
To combine these two you have to replace ^ with \.:
\.[a-f0-9]{8}$
To match 8 characters from the range [a-f0-9] after the last dot, you might use (if supported) a negative lookahead (?!.*\.) to assert that what follows the match does not contain a dot:
\.[a-f0-9]{8}(?!.*\.)
If you want to match characters from a-z instead of a-f like 99f06a7s you could use [a-z0-9]
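The effect of the character range can be checked with a short JavaScript sketch. The variable names are assumptions; the input is the example string from the question.

```javascript
// A dot, eight characters from the class, and no further dot afterwards.
const hexAfterLastDot = /\.[a-f0-9]{8}(?!.*\.)/;
const alnumAfterLastDot = /\.[a-z0-9]{8}(?!.*\.)/;

const logoName = "topLeftLogo.93f02a9d.controller.99f06a7s";

// "99f06a7s" ends in "s", which is outside a-f, so the hex-only
// pattern finds nothing after the last dot.
const hexResult = hexAfterLastDot.exec(logoName);

// Widening the class to a-z lets the last segment match.
const alnumResult = alnumAfterLastDot.exec(logoName);

console.log(hexResult);      // null
console.log(alnumResult[0]); // .99f06a7s
```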
About the first example
The regex ^[a-f0-9]{8}$ matches exactly 8 characters from the character class, and because of the anchors ^ and $ those 8 characters must make up the whole string. It would not find a match in hello world .305eef9f x1xxx 304ccf9f test1232 on the same line.
About the second example
.[^.]*$ matches any single character followed by zero or more characters that are not a dot, up to the end of the string. It would, for example, also match a single a, and it is not bound to first matching a dot, because you have to escape the dot to match it literally.
I'm adding this in case anyone needs to solve a similar case:
Case 1: we want to get the 8-character hexadecimal ([a-f0-9]) string from our filename, between the last period of the name and the file extension, in order, for example, to remove that "hashed" part:
Example:
file.name2222.controller.2567d667.js ------> returns .2567d667
We will need to use the following regex:
\.[a-f0-9]{8}(?=\.\w+$)
Case 2: for example, we want the same as above but ignoring the first period:
Example:
file.name2222.controller.2567d667.js ------> returns 2567d667
We will need to use the following regex
[a-f0-9]{8}(?=\.\w+$)
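Both cases, plus the removal of the hashed part mentioned above, can be sketched in JavaScript like this. The variable names are assumptions for illustration.

```javascript
const hashedFile = "file.name2222.controller.2567d667.js";

// Case 1: eight hex characters with the leading period, followed by
// the file extension (checked by the lookahead, not consumed).
const withDot = /\.[a-f0-9]{8}(?=\.\w+$)/;

// Case 2: the same without the leading period.
const withoutDot = /[a-f0-9]{8}(?=\.\w+$)/;

const hashWithDot = hashedFile.match(withDot)[0];
const hashOnly = hashedFile.match(withoutDot)[0];

// Removing the hashed part keeps the extension, because the
// lookahead does not consume it.
const unhashed = hashedFile.replace(withDot, "");

console.log(hashWithDot); // .2567d667
console.log(hashOnly);    // 2567d667
console.log(unhashed);    // file.name2222.controller.js
```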

Regex for a string with alpha numeric containing a '.' character

I have not been able to find a proper regex to match any string not starting and ending with some condition.
This matches
AS.E
23.5
3.45
This doesn't match
.263
321.
.ASD
The string can contain alphanumeric characters with an optional '.' character, and its length has to be within the range 2-4 (minimum 2 characters, maximum 4).
I was able to create this one:
^[^\.][A-Z|0-9|\.]{2,4}$
but with it I couldn't exclude a '.' character at the end of the string.
Thanks.
Maybe not the most optimized but a working one. Created step by step:
The first character should be alphanumeric
^[a-zA-Z0-9]
0, 1 or 2 characters, each alphanumeric or ., that are not at the end of the string
[a-zA-Z0-9\.]{0,2}
an alphanumeric character matching end of string
[a-zA-Z0-9]$
Concatenate all of this to obtain your regex
^[a-zA-Z0-9][a-zA-Z0-9\.]{0,2}[a-zA-Z0-9]$
Edit: This regex allows multiple dots (up to 2)
If I guessed correctly, you want to match all words that are
Between 2 and 4 characters long ...
... and start and end with a character from [A-Z0-9] ...
... and have characters from [A-Z0-9.] in the middle ...
... and are not preceded or followed by a '.'.
Try this regex to match all these substrings in a text:
(?<=^|[^.])[A-Z0-9][A-Z0-9.]{0,2}[A-Z0-9](?=$|[^.])
However, note that this will match the AA in ".AAAA.". If you don't want this match, then please give more details on your requirements.
When you are only interested in the number of matches, but not the matched strings, then you could use
(^|[^.])[A-Z0-9][A-Z0-9.]{0,2}[A-Z0-9]($|[^.])
If you have one string, and want to know whether that string completely matches or not, then use
^[A-Z0-9][A-Z0-9.]{0,2}[A-Z0-9]$
If there may be at most one . inside the match, replace the part [A-Z0-9.]{0,2} with ([A-Z0-9]?[A-Z0-9.]?|[A-Z0-9.]?[A-Z0-9]?).
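The whole-string pattern and its "at most one dot" variant can be checked against the examples from the question with a short JavaScript sketch. The variable names and the extra sample A..B are assumptions.

```javascript
// Whole-string check: 2-4 characters, alphanumeric at both ends,
// dots allowed only in the middle.
const wholeString = /^[A-Z0-9][A-Z0-9.]{0,2}[A-Z0-9]$/;

// Variant that allows at most one dot in the middle.
const atMostOneDot =
  /^[A-Z0-9]([A-Z0-9]?[A-Z0-9.]?|[A-Z0-9.]?[A-Z0-9]?)[A-Z0-9]$/;

const accepted = ["AS.E", "23.5", "3.45"].map((s) => wholeString.test(s));
const rejected = [".263", "321.", ".ASD"].map((s) => wholeString.test(s));

console.log(accepted); // every entry is true
console.log(rejected); // every entry is false

// "A..B" has two dots in the middle: the first pattern accepts it,
// the variant rejects it.
console.log(wholeString.test("A..B"));  // true
console.log(atMostOneDot.test("A..B")); // false
console.log(atMostOneDot.test("A.BC")); // true
```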
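You can use this pattern to match what you describe. The first and last characters may be anything except a dot, and {0,2} in the middle keeps the total length within 2-4:
^[^\.][a-zA-Z0-9\.]{0,2}[^\.]$
Check the result here:
https://regex101.com/r/8BNdDg/3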

Regex for String with first two characters fixed and rest digits

Is there a regular expression for the following?
String of length 8
First two characters fixed: 'UE' or 'ue'
Remaining 6 characters must be digits [0-9]
E.g.: https://regex101.com/r/PufypE/1
The expression I tried:
\^(UE|ue){2}[0-9]{6}\
but it's not working (no match found!)
You want:
\b(UE|ue)[0-9]{6}\b
You don't need the {2} next to (UE|ue), since the group already specifies those two characters exactly. The \b is a word boundary, so this will match each item in a list like the one in your comment: UE123456,ue654321. A good site for experimenting with regexes is http://regex101.com
Regex should be:
^[Uu][Ee][0-9]{6}$
(UE|ue){2} in your regex would match 2 occurrences of UE or ue
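The two answers differ slightly, which a short JavaScript sketch can show. The anchored alternation ^(UE|ue)[0-9]{6}$ is an assumed combination of both answers, not taken from either, and the variable names are hypothetical.

```javascript
// Alternation accepts only "UE" or "ue" as the prefix.
const sameCase = /^(UE|ue)[0-9]{6}$/;

// Per-character classes also accept mixed case such as "Ue" or "uE".
const anyCase = /^[Uu][Ee][0-9]{6}$/;

console.log(sameCase.test("UE123456")); // true
console.log(sameCase.test("Ue123456")); // false
console.log(anyCase.test("Ue123456"));  // true
console.log(anyCase.test("UE12345"));   // false (only 5 digits)

// With word boundaries and the g flag, every ID in a list is found.
const idsInList = "UE123456,ue654321".match(/\b(UE|ue)[0-9]{6}\b/g);
console.log(idsInList.length); // 2
```

If mixed case should be rejected, the alternation is the safer choice.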