Regular expression to end when it sees a number? - regex

Right now I'm trying to create a regex that takes in IDs. The ID is a string, so it can contain letters and numbers. However, once the expression reaches digits, it should not accept any more letters and should end the match.
What i have:
[a-zA-Z]([a-zA-Z]|[0-9])*
Example:
"Bob23Dan"
Example answer:
1) "Bob23"
2) "Dan"

This will match a variable number of letters (at least one) followed by a variable number of digits (optional):
[a-zA-Z]+[0-9]*
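As a quick check, here is a minimal Python sketch that applies this pattern to the example from the question. The name `ID_PATTERN` is only illustrative, and Python is used here purely for demonstration since the question does not name a language.

```python
import re

# One or more letters, then an optional run of digits.
ID_PATTERN = re.compile(r"[a-zA-Z]+[0-9]*")

# findall scans left to right and returns every non-overlapping match.
ids = ID_PATTERN.findall("Bob23Dan")
print(ids)  # ['Bob23', 'Dan']
```

Because the digit run is optional, a trailing all-letter ID such as "Dan" is still picked up as its own match.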

If you can tolerate there being letters after the digits in the original string, but you just don't want to match them, then I think you need this:
^[a-zA-Z]*[0-9]+
which will take in any number of letters from the start of the string (^), then at least one digit. It will fail to match if there are no digits, but pass if there are digits but no letters.
If you want to make sure there are no letters after the digits in the original string (that is, the purpose of the regex is to validate the whole string rather than to extract a match), then append the end-of-line anchor ($) like this:
^[a-zA-Z]*[0-9]+$
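The difference between the two anchored patterns can be sketched in Python as follows. The helper names `leading_id` and `is_valid_id` are made up for this illustration.

```python
import re

# Matches a prefix: trailing letters are tolerated but not part of the match.
PREFIX = re.compile(r"^[a-zA-Z]*[0-9]+")
# Matches the whole string: nothing may follow the digits.
WHOLE = re.compile(r"^[a-zA-Z]*[0-9]+$")

def leading_id(text):
    """Return the letters-then-digits prefix, or None if there are no digits."""
    m = PREFIX.search(text)
    return m.group(0) if m else None

def is_valid_id(text):
    """True only if the entire string is letters followed by digits."""
    return WHOLE.search(text) is not None

print(leading_id("Bob23Dan"))   # Bob23
print(is_valid_id("Bob23Dan"))  # False
print(is_valid_id("Bob23"))     # True
```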

Related

regex - allow only single digit numbers in a string

I need to match a string that can be of length from 1 to 20 characters maximum, and it contains letters a-g and numbers 1-7. However, the numbers cannot be next to each other - only single digit numbers are allowed.
Valid strings: aabbca1a6, 4gg1g2g1, 1
Invalid string: aabbca16a - there are two numbers next to each other, forming the two-digit number 16.
I can match most strings quite easily with [a-g1-7]{1,20}, but I have no idea how to efficiently detect when two numbers are next to each other.
Currently in my program, after running the regex, I just go through the whole string again in a loop, making sure no 2 numbers are next to each other. However, I'd prefer it if it could all be done with just one (simple) regex.
You can use the answer from the comments by Ulugbek Umirov using negative lookahead at the start of an anchored string asserting not 2 digits to the right.
^(?!.*\d{2})[a-g1-7]{1,20}$
The pattern matches:
^ Start of string
(?!.*\d{2}) Negative lookahead, assert that no 2 consecutive digits occur anywhere to the right
[a-g1-7]{1,20} Repeat the ranges in the character class 1-20 times
$ End of string
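The pattern breakdown above can be exercised with a short Python sketch, using the valid and invalid strings from the question. The function name `is_valid` is an assumption made for this example.

```python
import re

# Lookahead rejects any string containing two adjacent digits;
# the character class and {1,20} enforce the alphabet and the length.
NO_ADJACENT_DIGITS = re.compile(r"^(?!.*\d{2})[a-g1-7]{1,20}$")

def is_valid(s):
    return NO_ADJACENT_DIGITS.match(s) is not None

for candidate in ["aabbca1a6", "4gg1g2g1", "1", "aabbca16a"]:
    print(candidate, is_valid(candidate))
```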
Another option is to assert the string length in a lookahead and then match in a way that cannot place 2 digits next to each other:
^(?=[a-g1-7]{1,20}$)[a-g]*(?:[1-7][a-g]+)*[1-7]?$
The simplest regex and approach is to check if it doesn't match:
.*\d\d.*
Another way to solve this problem is a two-step check:
- Check that the string contains only digits 1-7 and letters a-g, and that no two digits are adjacent, by using this pattern (the trailing $ is required, otherwise the pattern matches a prefix of any string):
^[1-7]?([a-g]+[1-7]?)*$
- Then check the length of the string using a string method (such as the length property in JavaScript).
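A minimal Python sketch of this two-step idea is shown below. It assumes the pattern is anchored with a trailing `$` so the whole string is checked; `SHAPE` and `is_valid` are illustrative names.

```python
import re

# Optional leading digit, then runs of letters each optionally followed by one digit.
# Assumption: anchored with $ so the entire string must fit this shape.
SHAPE = re.compile(r"^[1-7]?([a-g]+[1-7]?)*$")

def is_valid(s):
    # Step 1: plain length check (1 to 20 characters).
    if not 1 <= len(s) <= 20:
        return False
    # Step 2: the regex only has to verify the shape, not the length.
    return SHAPE.match(s) is not None

print(is_valid("4gg1g2g1"))   # True
print(is_valid("aabbca16a"))  # False
```

Doing the length check first also keeps the input short before the nested quantifier runs, which limits backtracking on strings that do not match.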

regex alphanumeric, but when it gets a numeric, then only numeric

I need to validate an input that must start with a letter and can then be alphanumeric, but once a digit appears, everything up to the end of the string must be numeric.
[a-z][a-z,0-9]{1,5}
This does only part of the job. So it validates correctly for
a1
abc12
ab123
but I do not want
a1b2c1
so once it gets a numeric, the rest must be numeric.
Try this:
^(?=.{2,6}$)([a-z]+[0-9]*)$
First check for 2-6 characters from beginning to end of line. It doesn't even matter what characters they are - you are just checking for length.
Then, 1 or more letters followed by any number of digits. Since you have already checked for 2-6 characters, you don't need to care how many letters are followed by how many digits. At first, I thought it would be much more complicated to list all the possibilities, but the positive lookahead does a lot of the work.
See https://regex101.com/r/HYQIf6/5
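To see the lookahead and the main pattern working together, here is a small Python sketch run against the inputs from the question. The function name `is_valid` is hypothetical.

```python
import re

# (?=.{2,6}$) checks only the total length;
# [a-z]+[0-9]* then requires letters first and digits (if any) last.
PATTERN = re.compile(r"^(?=.{2,6}$)([a-z]+[0-9]*)$")

def is_valid(s):
    return PATTERN.match(s) is not None

for candidate in ["a1", "abc12", "ab123", "a1b2c1"]:
    print(candidate, is_valid(candidate))
```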
This should work for a string of any length:
^[a-z]+([a-z]*|\d*)$
This will match if the string:
starts with one or more letters from a to z
is followed, up to the end, by either zero or more letters or zero or more digits (but not a mix of both)
Edit:
This works as well:
^[a-z]+\d*$

Extracting groups of characters that match a pattern from a string in VB

I'm still a regex baby and need some help parsing a string.
I am using VB, and intend to run a string through NCalc, a library that parses mathematical equations from strings.
The problem is, the equations will have numbers, operations and variables.
An equation may look like this:
P20*4.143/((N2+N3)/2)
As you can see, P20, N2 and N3 are variables. In my case, they are stored in a datatable elsewhere in my application.
What I need to do is parse the string, looking for groups of characters in-between operations (-+/*), get their actual values and replace the variable with the value in the original string all while ignoring actual numbers.
The above string should become:
120.5*4.143/((4500+4570)/2)
So something like this:
Dim equation = "P20*4.143/((N2+N3)/2)"
For Each match As String In Regex(match_all_groups_with_letters)
return replace(match, value)
Next
Then I can do something like:
finalResult = NCalc.Doyourmagic(equation)
You may use a simple pattern like
"\b\d*\p{L}[\p{L}\d]*\b"
It matches a leading word boundary \b, zero or more digits (\d*), a letter (\p{L}), and zero or more digits or letters ([\p{L}\d]*) followed with a trailing word boundary (\b).
Adjust the quantifiers accordingly (if the digits are always present, use \d+ instead of \d*). If the letters can only be ASCII letters, use [A-Za-z] (or just uppercase ASCII - [A-Z]) instead of \p{L} (that matches all Unicode letters).
Assuming that variable names contain at least one letter and may contain digits (requires Imports System.Text.RegularExpressions):
Dim equation = "P20*4.143/((N2+N3)/2)"
' Identifiers can start and end with digits,
' but must contain at least one letter.
Dim pattern = "[A-Za-z0-9]*[A-Za-z][A-Za-z0-9]*"
Dim matches = Regex.Matches(equation, pattern)
For Each m As Match In matches
    ' GetValueFor is your own lookup; it must return the value as a String.
    Dim value = GetValueFor(m.Value)
    ' \b on both sides keeps N2 from also replacing part of N20.
    equation = Regex.Replace(equation, "\b" & m.Value & "\b", value)
Next
\b marks the beginning or end of a word. The square brackets [] enclose character classes. The star * means zero or more repetitions of the preceding element. So we have any number of letters and digits, then exactly one letter, and finally again any number of letters and digits.
Instead of re-using Regex for replacing the identifiers found, you could use the information provided in the match. It has Index and Length properties telling you the exact location of the identifiers in the equation. You can then replace by using string functions. But make sure to iterate the matches the reverse way in order to preserve the positions of the not yet processed identifiers.
You will have to somehow get the values corresponding to identifiers. A Dictionary(Of String, Double) is ideal for holding the values. The key is the identifier.
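The dictionary lookup described above can be sketched in Python (chosen only for illustration, since the thread itself is about VB). Instead of iterating matches in reverse, this version uses `re.sub` with a callback, which replaces each identifier in a single pass. The values in the dictionary are taken from the expected result in the question; `substitute` and `IDENTIFIER` are hypothetical names.

```python
import re

# Values from the question's example; in practice they would come from the datatable.
values = {"P20": 120.5, "N2": 4500, "N3": 4570}

# Optional leading digits, one letter, then letters or digits, on word boundaries.
IDENTIFIER = re.compile(r"\b\d*[A-Za-z][A-Za-z\d]*\b")

def substitute(equation, values):
    # The callback receives each match; an unknown identifier raises KeyError.
    return IDENTIFIER.sub(lambda m: str(values[m.group(0)]), equation)

result = substitute("P20*4.143/((N2+N3)/2)", values)
print(result)  # 120.5*4.143/((4500+4570)/2)
```

Plain numbers such as 4.143 are left alone because the pattern requires at least one letter.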

Regex: Match if strings starts with a letter or number and then

I want to match a string if it begins with either a letter or a number, and from there I want to count the characters in the string (excluding whitespace), and if there are 5 or more, match it.
I believe I'm pretty close, my current regex is:
\s*(?:\S[\t ]*){5,}
What I need to add, is making sure the string starts with either a letter or number (or if it begins with a whitespace, make sure the following character is a letter or number.)
http://regex101.com/r/lD7mZ2/1
How about the regex
^\s*[a-zA-Z0-9]\s*(?:\S[\t ]*){4,}
Example: http://regex101.com/r/lD7mZ2/4
Changes made
^ anchors the regex at the start of the string.
[a-zA-Z0-9] matches letter or digit
{4,} repeats the group at least 4 times. Together with the preceding [a-zA-Z0-9], this gives a minimum of 5 non-whitespace characters
OR
a shorter version would be
^\s*[a-zA-Z0-9]\s*(?:\S\s*){4,}
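A short Python sketch of the shorter pattern follows. The function name `qualifies` and the sample strings are made up for this illustration.

```python
import re

# Optional leading whitespace, one letter or digit,
# then at least 4 more non-whitespace characters (whitespace in between is ignored).
PATTERN = re.compile(r"^\s*[a-zA-Z0-9]\s*(?:\S\s*){4,}")

def qualifies(text):
    return PATTERN.search(text) is not None

print(qualifies("  a b c d e"))  # True
print(qualifies("abcd"))         # False
print(qualifies("-hello"))       # False
```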

Regular expression for first 4 characters

I need a regular expression for 4 characters. The first 3 characters must be digits and the last one must be a letter or a digit.
I came up with this one, but it is not working:
^([0-9]{3}+(([a-zA-Z]*)|([0-9]*)))?$
Some valid matches: 889A, 777B, 8883
In other words, the first 3 characters must be digits and the last one a letter or a digit.
This regex should work:
^[0-9]{3}[a-zA-Z0-9]$
This assumes string is only 4 characters in length. If that is not the case remove end of line anchor $ and use:
^[0-9]{3}[a-zA-Z0-9]
Try this
This will match it anywhere.
\d{3}[a-zA-Z0-9]
This will match only beginning of a string
^\d{3}[a-zA-Z0-9]
You can also try this website: http://gskinner.com/RegExr/
It makes it very easy to create and test your regex.
Just take the stars out...
^([0-9]{3}+(([a-zA-Z])|([0-9])))?$
The stars mean zero or more of something before it. You are already using an or (|) so you want to match exactly one of the class, or one of the other, not zero or more of the class, or zero or more of the other.
Of course, it can be simplified further:
^\d{3}[a-zA-Z\d]$
Which literally means... three digits, followed by a character from either lowercase or uppercase a-z or any digit.
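As a quick check of the simplified pattern, here is a minimal Python sketch using the valid examples from the question. It uses the explicit `[0-9]` form so that only ASCII digits are accepted; `CODE` and `is_valid_code` are illustrative names.

```python
import re

# Exactly three digits followed by exactly one letter or digit.
CODE = re.compile(r"^[0-9]{3}[a-zA-Z0-9]$")

def is_valid_code(s):
    return CODE.match(s) is not None

for candidate in ["889A", "777B", "8883", "88A3", "889AB"]:
    print(candidate, is_valid_code(candidate))
```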