Regex: Check if string contains at least one digit - regex

I have got a text string like this:
test1test
I want to check if it contains at least one digit using a regex.
What would this regex look like?

I'm surprised nobody has mentioned the simplest version:
\d
This will match any digit. If your regular expression engine is Unicode-aware, this means it will match anything that's defined as a digit in any language, not just the Arabic numerals 0-9.
There's no need to put it in [square brackets] to define it as a character class, as one of the other answers did; \d works fine by itself.
Since it's not anchored with ^ or $, it will match any subset of the string, so if the string contains at least one digit, this will match.
And there's no need for the added complexity of +, since the goal is just to determine whether there's at least one digit. If there's at least one digit, this will match; and it will do so with a minimum of overhead.

The regular expression you are looking for is simply this:
[0-9]
You do not mention what language you are using. If your regular expression evaluator forces REs to be anchored, you need this:
.*[0-9].*
Some RE engines (modern ones!) also allow you to write the first as \d (mnemonically: digit) and the second would then become .*\d.*.

In Java:
public boolean containsNumber(String string)
{
return string.matches(".*\\d+.*");
}

you could use look-ahead assertion for this:
^(?=.*\d).+$

Another possible solution, in case you're looking for all the words in a given string, which contain a number
Pattern
\w*\d{1,}\w*
\w* - Matches 0 or more instances of [A-Za-z0-9_]
\d{1,} - Matches 1 or more instances of a number
\w* - Matches 0 or more instances of [A-Za-z0-9_]
The whole point in \w* is to allow not having a character in the beginning or at the end of a word. This enables capturing the first and last words in the text.
If the goal here is to only get the numbers without the words, you can omit both \w*.
Given the string
Hello world 1 4m very happy to be here, my name is W1lly W0nk4
Matches
1 4m W1lly W0nk4
Check this example in regexr - https://regexr.com/5ff7q

Ref this
SELECT * FROM product WHERE name REGEXP '[0-9]'

This:
\d+
should work
Edit, no clue why I added the "+", without it works just as fine.
\d

In perl:
if($testString =~ /\d/)
{
print "This string contains at least one digit"
}
where \d matches to a digit.

If anybody falls here from google, this is how to check if string contains at least one digit in C#:
With Regex
if (Regex.IsMatch(myString, #"\d"))
{
// do something
}
Info: necessary to add using System.Text.RegularExpressions
With linq:
if (myString.Any(c => Char.IsDigit(c)))
{
// do something
}
Info: necessary to add using System.Linq

Related

Regex to match everything except this regex

I think this is a simple thing for a lot of you, but I have a very limited knowlegde of regex at the moment. I want to match everything except a double digit number in a string.
For example:
TEST22KLO4567
QE45C2C
LOP10G7G400
Now I found out the regex to match the double digit numbers:
\d{2}
Which matches the following:
TEST22KLO4567
QE45C2C
LOP10G7G400
Now it seems to me that it would be fairly easy to turn that regex around to match everything BUT "\d{2}". I searched a lot but I can't seem to get it done. I hope someone here can help.
This only works if your regex engine supports look behinds:
^.+?(?=\d{2})|(?<=\d{2}).+$
Explanation:
The | separates two cases where this would match:
^.+?(?=\d{2})
This matches everything from the start of the string (^) until \d{2} is encountered.
(?<=\d{2}).+$
This matches the end of the string, from the place just after two digits.
If your regex engine doesn't support look behinds (JavaScript for example), I don't think it is possible using a pure regex solution.
You can match the first part:
^.+?(?=\d{2})
Then get where the match ends, add 2 to that number, and get the substring from that index.
You are right rejecting a search in regex is usually rather tricky.
In your case I think you want to have [^\d{2}], however, this is tricky as your other strings also contain two digits so your regex using it won't select them.
I would go with this regex (using PCRE 8.36 but should work also in others):
\*{2}\w*\*{2}
Explanation:
\*{2} .... matches "*" literally exactly two times
\w* .... matches "word character" zero or unlimited times
Found one regex pretty straightforward :
^(.*?[^\d])\d{2}([^\d].*?)$
Explanations :
^ : matches the beginnning of a line
(.*?[^\d]) : matches and catches the first part before the two numbers. It can contain anything (.*?) but needs to end with something different to a number ([^\d]) so we ensure that there is only 2 numbers in the middle
\d{2} : is the part you found yourself
([^\d].*?) : is the symetric of (.*?[^\d]) : begins with something different from a number ([^\d]) and matches anything next.
$ : up to the end of the line.
To test this reges you can use this link
It will match the first occurence of double digit, but because OP said there was only one it does the job correctly. I expect it to work with every regex engine as nothing too complex is used.

RegEx more than multiple characters before number

I really don't use RegEx that much. You could say I am RegEx n00b. I have been working on this issue for a half a day.
I am trying to write a pattern that looks backward from a number character. For example:
1. bob1 => bob
2. cat3 => cat
3. Mary34 => Mary
So far I have this (?![A-Z][a-z]{1,})([A-Za-z_])
It only matches for individual characters, I want all the characters before the number character. I tried to add the ^ and $ into my pattern and using an online simulator. I am unsure where to put the ^ and $.
NOTE: I am using RegEx for the .NET Framework
You may use a regex like
[\p{L}_]+(?=\d)
or
[\w-[\d]]+(?=\d)
See the regex demo
Pattern details
[\p{L}_]+ - any 1 or more letters (both lower- and uppercase) and/or _
OR
[\w-[\d]]+ - 1 or more word chars except digits (the -[] inside a character class is a character class subtraction construct)
(?=\d) - a positive lookahead that requires a digit to appear immediately to the right of the current location
If we break down your RegEx, we see:
(?![A-Z][a-z]{1,}) which says "look ahead to find a string that is NOT one uppercase letter followed one or more lowercase letters" and ([A-Za-z_]) which says "match one letter or underscore". This should end up matching any single lowercase letter.
If I understand what you want to achieve, then you want all of the letters before a number. I would write something like that as:
\b([a-zA-Z]+)[0-9]
This will start at a word boundary \b, match one or more letters, and require a digit right after the matched string.
(The syntax I used seems to match this document about .NET RegEx: https://learn.microsoft.com/en-us/dotnet/standard/base-types/regular-expressions)
In light of Wiktor Stribizew's comment, here is a pure match RegEx:
\b[a-zA-Z_]+(?=[0-9])
This matches the pattern and then looks ahead for the digit. This is better than my first lookahead attempt. (Thank you Wiktor.)
http://www.rexegg.com/regex-lookarounds.html

regular expression no characters

I have this regular expression
([A-Z], )*
which should match something like
test, (with a space after the comma)
How to I change the regex expression so that if there are any characters after the space then it doesn't match.
For example if I had:
test, test
I'm looking to do something similar to
([A-Z], ~[A-Z])*
Cheers
Use the following regular expression:
^[A-Za-z]*, $
Explanation:
^ matches the start of the string.
[A-Za-z]* matches 0 or more letters (case-insensitive) -- replace * with + to require 1 or more letters.
, matches a comma followed by a space.
$ matches the end of the string, so if there's anything after the comma and space then the match will fail.
As has been mentioned, you should specify which language you're using when you ask a Regex question, since there are many different varieties that have their own idiosyncrasies.
^([A-Z]+, )?$
The difference between mine and Donut is that he will match , and fail for the empty string, mine will match the empty string and fail for ,. (and that his is more case-insensitive than mine. With mine you'll have to add case-insensitivity to the options of your regex function, but it's like your example)
I am not sure which regex engine/language you are using, but there is often something like a negative character groups [^a-z] meaning "everything other than a character".

Regular expression to match last number in a string

I need to extract the last number that is inside a string. I'm trying to do this with regex and negative lookaheads, but it's not working. This is the regex that I have:
\d+(?!\d+)
And these are some strings, just to give you an idea, and what the regex should match:
ARRAY[123] matches 123
ARRAY[123].ITEM[4] matches 4
B:1000 matches 1000
B:1000.10 matches 10
And so on. The regex matches the numbers, but all of them. I don't get why the negative lookahead is not working. Any one care to explain?
Your regex \d+(?!\d+) says
match any number if it is not immediately followed by a number.
which is incorrect. A number is last if it is not followed (following it anywhere, not just immediately) by any other number.
When translated to regex we have:
(\d+)(?!.*\d)
Rubular Link
I took it this way: you need to make sure the match is close enough to the end of the string; close enough in the sense that only non-digits may intervene. What I suggest is the following:
/(\d+)\D*\z/
\z at the end means that that is the end of the string.
\D* before that means that an arbitrary number of non-digits can intervene between the match and the end of the string.
(\d+) is the matching part. It is in parenthesis so that you can pick it up, as was pointed out by Cameron.
You can use
.*(?:\D|^)(\d+)
to get the last number; this is because the matcher will gobble up all the characters with .*, then backtrack to the first non-digit character or the start of the string, then match the final group of digits.
Your negative lookahead isn't working because on the string "1 3", for example, the 1 is matched by the \d+, then the space matches the negative lookahead (since it's not a sequence of one or more digits). The 3 is never even looked at.
Note that your example regex doesn't have any groups in it, so I'm not sure how you were extracting the number.
I still had issues with managing the capture groups
(for example, if using Inline Modifiers (?imsxXU)).
This worked for my purposes -
.(?:\D|^)\d(\D)

Need a simple RegEx to find a number in a single word

I've got the following url route and i'm wanting to make sure that a segment of the route will only accept numbers. as such, i can provide some regex which checks the word.
/page/{currentPage}
so.. can someone give me a regex which matches when the word is a number (any int) greater than 0 (ie. 1 <-> int.max).
/^[1-9][0-9]*$/
Problems with other answers:
/([1-9][0-9]*)/ // Will match -1 and foo1bar
#[1-9]+# // Will not match 10, same problems as the first
[1-9] // Will only match one digit, same problems as first
If you want it greater than 0, use this regex:
/([1-9][0-9]*)/
This'll work as long as the number doesn't have leading zeros (like '03').
However, I recommend just using a simple [0-9]+ regex, and validating the number in your actual site code.
This one would address your specific problem. This expression
/\/page\/(0*[1-9][0-9]*)/ or "Perl-compatible" /\/page\/(0*[1-9]\d*)/
should capture any non-zero number, even 0-filled. And because it doesn't even look for a sign, - after the slash will not fit the pattern.
The problem that I have with eyelidlessness' expression is that, likely you do not already have the number isolated so that ^ and $ would work. You're going to have to do some work to isolate it. But a general solution would not be to assume that the number is all that a string contains, as below.
/(^|[^0-9-])(0*[1-9][0-9]*)([^0-9]|$)/
And the two tail-end groups, you could replace with word boundary marks (\b), if the RE language had those. Failing that you would put them into non-capturing groups, if the language had them, or even lookarounds if it had those--but it would more likely have word boundaries before lookarounds.
Full Perl-compatible version:
/(?<![\d-])(0*[1-9]\d*)\b/
I chose a negative lookbehind instead of a word boundary, because '-' is not a word-character, and so -1 will have a "word boundary" between the '-' and the '1'. And a negative lookbehind will match the beginning of the string--there just can't be a digit character or '-' in front.
You could say that the zero-width assumption ^ is just one of the cases that satisfies the zero-width assumption (?<![\d-]).
string testString = #"/page/100";
string pageNumber = Regex.Match(testString, "/page/([1-9][0-9]*)").Groups[1].Value;
If not matched pageNumber will be ""
While Jeremy's regex isn't perfect (should be tested in context, against leading characters and such), his advice is good: go for a generic, simple regex (eg. if you must use it in Apache's mod_rewrite) but by any means, handle the final redirect in server's code (if you can) and do a real check of parameter's validity there.
Otherwise, I would improve Jeremy's expression with bounds: /\b([1-9][0-9]*)$/
Of course, a regex cannot provide a check against any max int, at best you can control the number of digits: /\b([1-9][0-9]{0,2})$/ for example.
This will match any string such that, if it contains /page/, it must be followed by a number, not consisting of only zeros.
^(?!.*?/page/([0-9]*[^0-9/]|0*/))
(?! ) is a negative look-ahead. It will match an empty string, only if it's contained pattern does not match from the current position.