Regex for non decimal integer considering exponents - regex

I am looking for a Regex for non decimal integer considering exponents and honestly I have tried a lot before asking here.
The regex should
match 1.23E4,1.2334576E34, 122E3,123,456 etc.
not match 1.234E2 (since it expands to 123.4).
should not match 1.22 and so on.
My try was
^[+-]?([0-9]*\\.?[0-9]+|[0-9]+\\.?[0-9]*)([eE][+]?[0-9]+)?$
However, as you can see, I am not evaluating the exponent, so I cannot tell whether a value X still contains a decimal part after expansion.
Is there any way to extract the number of digits after the decimal point and compare it with the exponent, so that I can be sure the expanded value will not contain a decimal part?
For your information, only a regex that can be evaluated at runtime will work for me.
Please help me, guys...

ok, so this is only if you really need this for some weird regexp-only validation. it's written in python 3 and it makes no attempt to be compact (there's no limitation except available memory in the size of a regexp in python).
def over(n):
    '''make a regexp for an exponent of n or more'''
    assert n < 100
    # exponents with 3+ digits always qualify; otherwise list n..99 explicitly
    return r'([1-9]\d{2,}|%s)' % '|'.join(str(i) for i in range(n, 100))

def make_decimal(n_digits, n_decimal):
    '''make a regexp for a number with an "E" with the given number of significant digits and decimal places'''
    assert n_decimal < n_digits
    assert 100 > n_decimal >= 0
    if n_decimal:
        # the decimal point is escaped so that it only matches a literal "."
        return r'\d{%d}\.\d{%d}E%s' % (n_digits-n_decimal, n_decimal, over(n_decimal))
    else:
        return r'\d{%d}E\d+' % n_digits

def make_e(n_digits):
    '''make a regexp for an integer with an "E" with the given number of significant digits'''
    return '|'.join(make_decimal(n_digits, i) for i in range(n_digits))

def make_regexp(max_digits):
    '''make a regexp for a decimal integer with up to the given number of significant digits'''
    assert max_digits < 100
    # start at 1: zero significant digits would contribute an empty alternative
    return r'(\d+|%s)' % '|'.join(make_e(i) for i in range(1, max_digits+1))
here's some test code.
from re import compile
rx = make_regexp(8)
m = compile('^%s$' % rx)
for n in ['1.23E4', '1.2334576E34', '122E3', '123', '456']:
    assert m.match(n), n
for n in ['1.234E2', '1.22']:
    assert not m.match(n), n
for up to 8 significant digits (to the left of E), which seems a reasonable limit, the regexp generated is about 8774 characters long. you could reduce this significantly (for example, see https://stackoverflow.com/a/17840228/181772), but what's the need (the regular expression engine is capable of generating a much smaller internal automaton from this)?
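As a cross-check for a generated pattern like this, the arithmetic meaning of "expands to an integer" can be sketched with Python's decimal module. The helper name expands_to_integer is made up for this illustration and is not part of the answer above. It compares by value, so an input such as 1.230E2 counts as an integer here even though a digit-counting regex would reject it.

```python
from decimal import Decimal, InvalidOperation

def expands_to_integer(text):
    """Return True if the numeric string denotes a whole number (hypothetical helper)."""
    try:
        value = Decimal(text)          # exact parse, no binary floating point involved
    except InvalidOperation:
        return False                   # not a number at all
    if not value.is_finite():
        return False                   # reject NaN and Infinity
    return value == value.to_integral_value()
```

Running the regexp and this helper over the same inputs is one way to spot cases where the two disagree.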

Description
It's not impossible, but it is rather difficult, and the expression really starts to get out of hand. Take this 2831-character monster, which:
validates that a number with an exponent will expand to an integer
requires the number to be in a format like 123.456e7890 or 1234.678e1,234,567
requires that, if the exponent contains commas, they appear in the correct comma-delimited three-digit groupings
supports only numbers with up to 99 places after the decimal point
As written here it requires the x option, which ignores white space and comments. The expression could be shortened to about 2041 characters by replacing [eE] with e and using the i option, and by replacing [0-9] with \d; however, this can slightly reduce performance because in Unicode-aware engines the \d class matches every Unicode decimal digit and not just 0-9.
^
(?=.*?[eE][0-9]{1,3}(?:,[0-9]{3})*|[0-9]*$) # validate commas are in the correct order
(?=[0-9]+\. # match the integer portion of a real number
(?=
[0-9]{1,99}[eE][1-9](?:,?[0-9]){2,}
|[0-9]{1,9}[eE][1-9],?[0-9]
|[0-9]{10,19}[eE][2-9],?[0-9]
|[0-9]{20,29}[eE][3-9],?[0-9]
|[0-9]{30,39}[eE][4-9],?[0-9]
|[0-9]{40,49}[eE][5-9],?[0-9]
|[0-9]{50,59}[eE][6-9],?[0-9]
|[0-9]{60,69}[eE][7-9],?[0-9]
|[0-9]{70,79}[eE][89],?[0-9]
|[0-9]{80,89}[eE][9],?[0-9]
|[0-9]{90,99}[eE][1-9],?[0-9]
|(?=[0-9]{90}(?=.*?[eE]9)(?:[eE].,?[0-9]|[0-9]{1}[eE].,?[1-9]|[0-9]{2}[eE].,?[2-9]|[0-9]{3}[eE].,?[3-9]|[0-9]{4}[eE].,?[4-9]|[0-9]{5}[eE].,?[5-9]|[0-9]{6}[eE].,?[6-9]|[0-9]{7}[eE].,?[7-9]|[0-9]{8}[eE].,?[89]|[0-9]{9}[eE].,?9))
|(?=[0-9]{80}(?=.*?[eE]8)(?:[eE].,?[0-9]|[0-9]{1}[eE].,?[1-9]|[0-9]{2}[eE].,?[2-9]|[0-9]{3}[eE].,?[3-9]|[0-9]{4}[eE].,?[4-9]|[0-9]{5}[eE].,?[5-9]|[0-9]{6}[eE].,?[6-9]|[0-9]{7}[eE].,?[7-9]|[0-9]{8}[eE].,?[89]|[0-9]{9}[eE].,?9))
|(?=[0-9]{70}(?=.*?[eE]7)(?:[eE].,?[0-9]|[0-9]{1}[eE].,?[1-9]|[0-9]{2}[eE].,?[2-9]|[0-9]{3}[eE].,?[3-9]|[0-9]{4}[eE].,?[4-9]|[0-9]{5}[eE].,?[5-9]|[0-9]{6}[eE].,?[6-9]|[0-9]{7}[eE].,?[7-9]|[0-9]{8}[eE].,?[89]|[0-9]{9}[eE].,?9))
|(?=[0-9]{60}(?=.*?[eE]6)(?:[eE].,?[0-9]|[0-9]{1}[eE].,?[1-9]|[0-9]{2}[eE].,?[2-9]|[0-9]{3}[eE].,?[3-9]|[0-9]{4}[eE].,?[4-9]|[0-9]{5}[eE].,?[5-9]|[0-9]{6}[eE].,?[6-9]|[0-9]{7}[eE].,?[7-9]|[0-9]{8}[eE].,?[89]|[0-9]{9}[eE].,?9))
|(?=[0-9]{50}(?=.*?[eE]5)(?:[eE].,?[0-9]|[0-9]{1}[eE].,?[1-9]|[0-9]{2}[eE].,?[2-9]|[0-9]{3}[eE].,?[3-9]|[0-9]{4}[eE].,?[4-9]|[0-9]{5}[eE].,?[5-9]|[0-9]{6}[eE].,?[6-9]|[0-9]{7}[eE].,?[7-9]|[0-9]{8}[eE].,?[89]|[0-9]{9}[eE].,?9))
|(?=[0-9]{40}(?=.*?[eE]4)(?:[eE].,?[0-9]|[0-9]{1}[eE].,?[1-9]|[0-9]{2}[eE].,?[2-9]|[0-9]{3}[eE].,?[3-9]|[0-9]{4}[eE].,?[4-9]|[0-9]{5}[eE].,?[5-9]|[0-9]{6}[eE].,?[6-9]|[0-9]{7}[eE].,?[7-9]|[0-9]{8}[eE].,?[89]|[0-9]{9}[eE].,?9))
|(?=[0-9]{30}(?=.*?[eE]3)(?:[eE].,?[0-9]|[0-9]{1}[eE].,?[1-9]|[0-9]{2}[eE].,?[2-9]|[0-9]{3}[eE].,?[3-9]|[0-9]{4}[eE].,?[4-9]|[0-9]{5}[eE].,?[5-9]|[0-9]{6}[eE].,?[6-9]|[0-9]{7}[eE].,?[7-9]|[0-9]{8}[eE].,?[89]|[0-9]{9}[eE].,?9))
|(?=[0-9]{20}(?=.*?[eE]2)(?:[eE].,?[0-9]|[0-9]{1}[eE].,?[1-9]|[0-9]{2}[eE].,?[2-9]|[0-9]{3}[eE].,?[3-9]|[0-9]{4}[eE].,?[4-9]|[0-9]{5}[eE].,?[5-9]|[0-9]{6}[eE].,?[6-9]|[0-9]{7}[eE].,?[7-9]|[0-9]{8}[eE].,?[89]|[0-9]{9}[eE].,?9))
|(?=[0-9]{10}(?=.*?[eE]1)(?:[eE].,?[0-9]|[0-9]{1}[eE].,?[1-9]|[0-9]{2}[eE].,?[2-9]|[0-9]{3}[eE].,?[3-9]|[0-9]{4}[eE].,?[4-9]|[0-9]{5}[eE].,?[5-9]|[0-9]{6}[eE].,?[6-9]|[0-9]{7}[eE].,?[7-9]|[0-9]{8}[eE].,?[89]|[0-9]{9}[eE].,?9))
|(?:[eE][0-9]|[0-9]{1}[eE][1-9]|[0-9]{2}[eE][2-9]|[0-9]{3}[eE][3-9]|[0-9]{4}[eE][4-9]|[0-9]{5}[eE][5-9]|[0-9]{6}[eE][6-9]|[0-9]{7}[eE][7-9]|[0-9]{8}[eE][89]|[0-9]{9}[eE]9)
)
|(?=[0-9]+[eE]) # integers
)
[+-]?
([0-9]*\.?[0-9]+|[0-9]+\.?[0-9]*)
[eE][+]?((?:,?[0-9]+)+)
Example
Sample Text
1.2334576E34
1.23E4
1.2334576E34
122E3,123,456
1.234
1.234E2
Matches
[0] => 1.2334576E34
[1] => 1.23E4
[2] => 1.2334576E34
[3] => 122E3,123,456
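If a few lines of code around a much smaller regex are acceptable, the comparison the question asks about (digits after the decimal point versus the exponent) can be sketched as below. The pattern and the name has_integer_expansion are assumptions made for this illustration; comma-grouped exponents and negative exponents are deliberately not handled.

```python
import re

# Group 1 or 2 captures the fraction digits, group 3 the exponent digits.
_NUMBER = re.compile(
    r'[+-]?(?:[0-9]+(?:\.([0-9]*))?|\.([0-9]+))(?:[eE]\+?([0-9]+))?'
)

def has_integer_expansion(text):
    match = _NUMBER.fullmatch(text)
    if match is None:
        return False
    fraction = match.group(1) or match.group(2) or ''
    exponent = int(match.group(3) or 0)
    # Each fraction digit needs one power of ten to move left of the point.
    return len(fraction) <= exponent
```

Like the pure-regex versions, this counts every written fraction digit, so trailing zeros after the decimal point are treated as significant.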

Related

Reg Ex for even number of 0s and 1s

I am trying to create a regular expression that determines if a string (of any length) matches a regex pattern such that the number of 0s in the string is even, and the number of 1s in the string is even. Can anyone help me determine a regex statement that I could try and use to check the string for this pattern?
So completely reformulated my answer to reflect all the changes:
This regex matches all strings that consist only of zeros and ones and contain an even number of each:
^(?=1*(?:01*01*)*$)(?=0*(?:10*10*)*$).*$
I am working here with positive lookahead assertions. The big advantage of a lookahead assertion is that it checks the complete string without consuming it, so both lookaheads start checking at the beginning of the string, each for a different condition.
(?=1*(?:01*01*)*$) checks for an even number of 0s (including none)
(?=0*(?:10*10*)*$) checks for an even number of 1s (including none)
.* then actually matches the string
The first lookahead checks the following (the second works the same way with 0 and 1 swapped):
(?=
  1*      # match 0 or more 1s
  (?:     # open a non-capturing group
    0     # match one 0
    1*    # match 0 or more 1s
    0     # match one 0
    1*    # match 0 or more 1s
  )
  *       # repeat this group zero or more times
  $       # until the end of the string
)
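One way to gain confidence in the lookahead pattern is to compare it with plain counting on every short bit string. The following is a minimal sketch of such a check; the names EVEN_BOTH, counts_are_even and find_disagreement are invented for the example.

```python
import re
from itertools import product

EVEN_BOTH = re.compile(r'^(?=1*(?:01*01*)*$)(?=0*(?:10*10*)*$).*$')

def counts_are_even(bits):
    return bits.count('0') % 2 == 0 and bits.count('1') % 2 == 0

def find_disagreement(max_length=10):
    """Return the first bit string where regex and counting disagree, else None."""
    for length in range(max_length + 1):
        for chars in product('01', repeat=length):
            bits = ''.join(chars)
            if bool(EVEN_BOTH.match(bits)) != counts_are_even(bits):
                return bits
    return None
```

An exhaustive check over short inputs is not a proof, but it catches the typical off-by-one mistakes in patterns like this.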
So, I have come up with a solution to the problem:
(11+00+(10+01)(11+00)*(10+01))*
(Here + denotes alternation, as in formal-language notation.)
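To ensure that the number of 0s is even, you can use the following regex:
^(1*01*01*)*$
(As written, every repetition consumes exactly two 0s, so a non-empty string without any 0s, such as 11, is not matched; ^1*(01*01*)*$ covers that case as well.)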
However, I believe that the question asks for both an even number of 0s and an even number of 1s. Since it is possible to construct a non-deterministic finite automaton (NFA) for this problem, the language is regular and can be represented by a regular expression. The NFA is shown below; S1 is both the start state and the accepting state.
S1 ---1----->S2
|^ <--1----- |^
||           ||
00           00
||           ||
v|           v|
S3----1----->S4
   <---1------
From there, there's a way to convert NFAs to regular expressions, but it's been a while since my computation course. There are some notes below that seem helpful in explaining the steps required to convert an NFA to a regex.
http://www.cs.uiuc.edu/class/sp09/cs373/lectures/lect_08.pdf
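The four-state machine above can also be simulated directly, which makes the role of each state explicit. This is only a sketch; the function name accepts and the representation of a state as a pair of parities are choices made for the illustration.

```python
def accepts(bits):
    """Simulate the four-state machine; the state is (odd number of 0s?, odd number of 1s?)."""
    zeros_odd, ones_odd = False, False   # S1: both counts even (start and accepting state)
    for ch in bits:
        if ch == '0':
            zeros_odd = not zeros_odd    # vertical moves in the diagram
        elif ch == '1':
            ones_odd = not ones_odd      # horizontal moves in the diagram
        else:
            return False                 # not a bit string
    return not zeros_odd and not ones_odd
```

Each of S1 to S4 corresponds to one combination of the two flags, which is why four states are enough.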
RE-UPDATED
Try this : [ check out this demo : http://regexr.com?30m7c ]
^(00|11|0011|0110|1100|1001)+$
Hint :
Even numbers are divisible by 2, thus - in binary - they always end in zero (0)
Not a regular expression (which is likely to be impossible, although I can't prove it: the proof by contradiction via the pumping lemma fails), but the "correct" solution is avoiding a complicated and inefficient regular expression altogether and using something like (in Python):
def even01(string):
    return string.count("1") % 2 == 0 and string.count("0") % 2 == 0
Or if the string has to consist only of 1s and 0s:
import re
def even01(string):
    return not re.search("[^01]", string) and \
        string.count("1") % 2 == 0 and string.count("0") % 2 == 0
^(0((1(00)*1)*0|1(11|00)*01)|1((0(11)*0)*1|0(11|00)*10))*$
If I haven't overlooked anything, this matches any bit string where the number of 0s is even and the number of 1s is even, using only rudimentary regex operators (*, ^, $). It's slightly easier to see how it works if written like this:
^(0((1(00)*1)*0
   |1(11|00)*01)
 |1((0(11)*0)*1
   |0(11|00)*10))*$
The following test code should illustrate the correctness - we compare the result of the pattern match against a function that tells us if a string has an even number of 0s and 1s. All bit strings of length 16 are tested.
import re
balanced = lambda s: s.count('0') % 2 == 0 and s.count('1') % 2 == 0
pat = re.compile('^(0((1(00)*1)*0|1(11|00)*01)|1((0(11)*0)*1|0(11|00)*10))*$')
size = 16
num = 2**size
for i in range(num):
    binstr = bin(i)[2:].zfill(size)
    b, m = balanced(binstr), bool(pat.match(binstr))
    if b != m:
        print("balanced('%s') = %d, pat.match('%s') = %d" % (binstr, b, binstr, m))
        break
    elif i != 0 and i % (num // 10) == 0:
        # Python 3: `//` performs integer division
        print("%d percent done..." % (100 * i // num + 1))
If you try to solve within the same sentence (starting with ^ and ending with $), you are in deep trouble. :-)
You can make sure that you have an even number of 0s (with ^(1*01*01*)*$, as stated by @david-z) OR you can make sure that you have an even number of 1s:
^(1*01*01*)*$|^(0*10*10*)*$
It works for strings with small lengths as well, such as "00" or "101", both valid strings.
I have also been working on lookaheads and lookbehinds in my spare time, and with a lookahead the problem can be solved while also accounting for strings of only 1s or only 0s. So, the expression should also work for 11, 1111, 111111, ... and for 00, 0000, 000000, ....
^(((?=(?:1*01*01*)*$)(?=(?:0*10*10*)*$).*)|([1]{2})*|([0]{2})*)$
Works for all cases.
So, if the string consists of only 1s or only 0s:
([1]{2})*|([0]{2})*
If it contains a mix of 0s and 1s, the positive lookahead will take care of that.
((?=(?:1*01*01*)*$)(?=(?:0*10*10*)*$).*)
Combining both of them, it takes into account all string with even number of 0s and 1s.

Regex - Validation of numeric with up to 4 decimal places

I am having a bit of difficulty with the following:
I need to allow any positive numeric value up to four decimal places. Here are some examples.
Allowed:
123
12345.4
1212.56
8778787.567
123.5678
Not allowed:
-1
12.12345
-12.1234
I have tried the following:
^[0-9]{0,2}(\.[0-9]{1,4})?$|^(100)(\.[0]{1,4})?$
However this doesn't seem to work, e.g. 1000 is not allowed when it should be.
Any ideas would be greatly appreciated.
Thanks
To explain why your attempt is not working for a value of 1000, I'll break down the expression a little:
^[0-9]{0,2} # Match 0, 1, or 2 digits (can start with a zero)...
(\.[0-9]{1,4})?$ # ... optionally followed by (a decimal, then 1-4 digits)
| # -OR-
^(100) # Capture 100...
(\.[0]{1,4})?$ # ... optionally followed by (a decimal, then 1-4 ZEROS)
There is no room for 4 digits of any sort, much less 1000 (there's only room for a 0-2 digit number or the number 100). Use this instead:
^\d* # Match any number of digits (can start with a zero)
(\.\d{1,4})?$ # ...optionally followed by (a decimal and 1-4 digits)
This expression will pass any of the allowed examples and reject all of the Not Allowed examples as well, because you (and I) use the beginning-of-string assertion ^.
It will also pass these numbers:
.2378
1234567890
12374610237856987612364017826350947816290385
000000000000000000000.0
0
... as well as a completely blank line, which might or might not be desired.
To make it reject something that starts with a zero, use this:
^(?!0\d)\d* # Match any number of digits (cannot "START" with a zero)
(\.\d{1,4})?$ # ...optionally followed by (a decimal and 1-4 digits)
This expression (which uses a negative lookahead) has these evaluations:
REJECTED    Allowed
---------   -------
0000.1234   0.1234
0000        0
010         0.0
You could also test for a completely blank line in other ways, but if you wanted to reject it with the regex, use this:
^(?!0\d|$)\d*(\.\d{1,4})?$
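To see the effect of the negative lookahead, the final pattern can be run against the rows of the table above. The sketch below uses Python's re module with re.ASCII so that \d only means 0-9; the names NO_LEADING_ZERO and is_valid are made up for the example.

```python
import re

NO_LEADING_ZERO = re.compile(r'^(?!0\d|$)\d*(\.\d{1,4})?$', re.ASCII)

def is_valid(text):
    return NO_LEADING_ZERO.match(text) is not None

rejected = ['0000.1234', '0000', '010', '', '12.12345', '-1']
allowed = ['0.1234', '0', '0.0', '123', '.2378']

assert not any(is_valid(t) for t in rejected)
assert all(is_valid(t) for t in allowed)
```

The lookahead only inspects the first two characters (or the end of an empty string), so it adds very little work to the match.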
Try this:
^[0-9]*(?:\.[0-9]{0,4})?$
Explanation: match only if starting with a digit (excluding negative numbers), optionally followed by (non-capturing group) a dot and 0-4 digits.
Edit: With this pattern .2134 would also be matched. To require a leading digit, as in 0.2134, replace the first * with a + above.
This regex would do the trick:
^\d+(?:\.\d{1,4})?$
From the beginning of the string, search for one or more digits. If there's a . it must be followed by at least one digit, but by at most 4.
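As a quick sanity check, this pattern can be run against the examples from the question. The snippet is a sketch in Python (the thread does not name a language); re.ASCII restricts \d to 0-9, and the variable names are invented for the example.

```python
import re

UP_TO_FOUR_DECIMALS = re.compile(r'^\d+(?:\.\d{1,4})?$', re.ASCII)

allowed = ['123', '12345.4', '1212.56', '8778787.567', '123.5678']
not_allowed = ['-1', '12.12345', '-12.1234']

for text in allowed:
    assert UP_TO_FOUR_DECIMALS.match(text), text
for text in not_allowed:
    assert not UP_TO_FOUR_DECIMALS.match(text), text
```

Unlike the \d* variants in this thread, this pattern also rejects the empty string and inputs such as .5 that lack an integer part.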
^(?<!-)\+?\d+(\.?\d{0,4})?$
This will match something which doesn't start with -, may have a +, followed by an integer part with at least one digit and an optional fractional part of at most 4 digits.
Note: Regex does not support scientific notation. If you want that too let me know in a comment.
Well asked!!
You can try this:
^([0-9]+[\.]?[0-9]?[0-9]?[0-9]?[0-9]?|[0-9]+)$
If you have a double value with more decimal places and you want to shorten it to 4, then:
// requires: import java.text.DecimalFormat;
double value = 12.3457652133;
value = Double.parseDouble(new DecimalFormat("##.####").format(value));

Decimal or numeric values in regular expression validation

I am trying to use regular expression validation to accept only decimal or numeric values. When the user enters a numeric value, its first digit must not be "0".
How do I do that?
A digit in the range 1-9 followed by zero or more other digits:
^[1-9]\d*$
To allow numbers with an optional decimal point followed by digits: a digit in the range 1-9, followed by zero or more other digits, then optionally followed by a decimal point and at least 1 digit:
^[1-9]\d*(\.\d+)?$
Notes:
The ^ and $ anchor to the start and end basically saying that the whole string must match the pattern
()? matches 0 or 1 of the whole thing between the brackets
Update to handle commas:
In regular expressions . has a special meaning - match any single character. To match literally a . in a string you need to escape the . using \. This is the meaning of the \. in the regexp above. So if you want to use comma instead the pattern is simply:
^[1-9]\d*(,\d+)?$
Further update to handle commas and full stops
If you want to allow a . between groups of digits and a , between the integral and the fractional parts then try:
^[1-9]\d{0,2}(\.\d{3})*(,\d+)?$
i.e. this is a digit in the range 1-9 followed by up to 2 other digits then zero or more groups of a full stop followed by 3 digits then optionally your comma and digits as before.
If you want to allow a . anywhere between the digits then try:
^[1-9][\.\d]*(,\d+)?$
i.e. a digit 1-9 followed by zero or more digits or full stops optionally followed by a comma and one or more digits.
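For the grouped format with . between thousands and , before the fraction, a short check makes the accepted shapes concrete. This sketch uses Python's re module with re.ASCII; the names GROUPED and is_grouped_number are made up for the example.

```python
import re

GROUPED = re.compile(r'^[1-9]\d{0,2}(\.\d{3})*(,\d+)?$', re.ASCII)

def is_grouped_number(text):
    return GROUPED.match(text) is not None

assert is_grouped_number('1.234.567,89')
assert is_grouped_number('999')
assert not is_grouped_number('1234')    # four digits need a group separator
assert not is_grouped_number('12.34')   # groups must have exactly three digits
assert not is_grouped_number('0,5')     # the first digit must be 1-9
```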
Actually, none of the given answers fully covers the request.
As the OP didn't provide a specific use case or types of numbers, I will try to cover all possible cases and permutations.
Regular Numbers
Whole Positive
This number is usually called an unsigned integer, but you can also call it a positive non-fractional number, including zero. This includes numbers like 0, 1 and 99999.
The Regular Expression that covers this validation is:
/^(0|[1-9]\d*)$/
Whole Positive and Negative
This number is usually called signed integer, but you can also call it a non-fractional number. This includes numbers like 0, 1, 99999, -99999, -1 and -0.
The Regular Expression that covers this validation is:
/^-?(0|[1-9]\d*)$/
As you have probably noticed, I have also included -0 as a valid number. Some may object to this and say that it is not a real number (you can read more about Signed Zero here). So, if you want to exclude it from this regex, here's what you should use instead:
/^-?(0|[1-9]\d*)(?<!-0)$/
All I have added is (?<!-0), which asserts that -0 does not appear immediately before that position. The (?<!...) assertion is called a negative lookbehind: whatever pattern replaces the ... must not match immediately before the assertion. Lookbehind has limitations; for example, in many flavors the pattern cannot include quantifiers. That's why in some cases I'll use a lookahead instead, which works the same way but in the opposite direction.
Many regex flavors, including those used by Perl and Python, only allow fixed-length strings. You can use literal text, character escapes, Unicode escapes other than \X, and character classes. You cannot use quantifiers or backreferences. You can use alternation, but only if all alternatives have the same length. These flavors evaluate lookbehind by first stepping back through the subject string for as many characters as the lookbehind needs, and then attempting the regex inside the lookbehind from left to right.
You can read more about Lookaround assertions here.
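The effect of the negative lookbehind can be observed with a fixed-width pattern, which Python's re module supports. The snippet below is a sketch; the name SIGNED_INT_NO_NEG_ZERO is invented, and re.ASCII limits \d to 0-9.

```python
import re

SIGNED_INT_NO_NEG_ZERO = re.compile(r'^-?(0|[1-9]\d*)(?<!-0)$', re.ASCII)

for text in ['0', '1', '-1', '-10', '99999']:
    assert SIGNED_INT_NO_NEG_ZERO.match(text), text
for text in ['-0', '007', '1.5', '']:
    assert not SIGNED_INT_NO_NEG_ZERO.match(text), text
```

Note that -10 is still accepted: at the end of the string the two preceding characters are 10, not -0, so the lookbehind does not fire.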
Fractional Numbers
Positive
This number is usually called an unsigned float or unsigned double, but you can also call it a positive fractional number, including zero. This includes numbers like 0, 1, 0.0, 0.1, 1.0, 99999.000001, 5.10.
The Regular Expression that covers this validation is:
/^(0|[1-9]\d*)(\.\d+)?$/
Some may say that numbers like .1, .0 and .00651 (same as 0.1, 0.0 and 0.00651 respectively) are also valid fractional numbers, and I cannot disagree with them. So here is a regex that additionally supports this format:
/^(0|[1-9]\d*)?(\.\d+)?(?<=\d)$/
Negative and Positive
This number is usually called a signed float or signed double, but you can also call it a fractional number. This includes numbers like 0, 1, 0.0, 0.1, 1.0, 99999.000001, 5.10, -0, -1, -0.0, -0.1, -99999.000001, -5.10.
The Regular Expression that covers this validation is:
/^-?(0|[1-9]\d*)(\.\d+)?$/
For non -0 believers:
/^(?!-0(\.0+)?$)-?(0|[1-9]\d*)(\.\d+)?$/
For those who want to support also the invisible zero representations, like .1, -.1, use the following regex:
/^-?(0|[1-9]\d*)?(\.\d+)?(?<=\d)$/
For the combination of non -0 believers and invisible zero believers, use this regex:
/^(?!-0?(\.0+)?$)-?(0|[1-9]\d*)?(\.\d+)?(?<=\d)$/
Numbers with a Scientific Notation (AKA Exponential Notation)
Some may want their validation to support numbers in scientific notation, written with the character e. Such a number is perfectly valid; the notation exists to represent very long numbers compactly. You can read more about Scientific Notation here. These numbers usually look like 1e3 (which is 1000) or 1e-3 (which is 0.001) and are fully supported by many major programming languages (e.g. JavaScript). You can test this by checking that the expression '1e3'==1000 returns true.
I will go through all the sections above again, this time including numbers with scientific notation.
Regular Numbers
Whole positive number regex validation, supports numbers like 6e4, 16e-10, 0e0 but also regular numbers like 0, 11:
/^(0|[1-9]\d*)(e-?(0|[1-9]\d*))?$/i
Whole positive and negative number regex validation, supports numbers like -6e4, -16e-10, -0e0 but also regular numbers like -0, -11 and all the whole positive numbers above:
/^-?(0|[1-9]\d*)(e-?(0|[1-9]\d*))?$/i
Whole positive and negative number regex validation for non -0 believers, same as the above, except now it forbids numbers like -0, -0e0, -0e5 and -0e-6:
/^(?!-0)-?(0|[1-9]\d*)(e-?(0|[1-9]\d*))?$/i
Fractional Numbers
Positive number regex validation, supports also the whole numbers above, plus numbers like 0.1e3, 56.0e-3, 0.0e10 and 1.010e0:
/^(0|[1-9]\d*)(\.\d+)?(e-?(0|[1-9]\d*))?$/i
Positive number with invisible zero support regex validation, supports also the above positive numbers, in addition numbers like .1e3, .0e0, .0e-5 and .1e-7:
/^(0|[1-9]\d*)?(\.\d+)?(?<=\d)(e-?(0|[1-9]\d*))?$/i
Negative and positive number regex validation, supports the positive numbers above, but also numbers like -0e3, -0.1e0, -56.0e-3 and -0.0e10:
/^-?(0|[1-9]\d*)(\.\d+)?(e-?(0|[1-9]\d*))?$/i
Negative and positive number regex validation for non -0 believers, same as the above, except now it forbids numbers like -0, -0.00000, -0.0e0, -0.00000e5 and -0e-6:
/^(?!-0(\.0+)?(e|$))-?(0|[1-9]\d*)(\.\d+)?(e-?(0|[1-9]\d*))?$/i
Negative and positive number with invisible zero support regex validation, supports also the above positive and negative numbers, in addition numbers like -.1e3, -.0e0, -.0e-5 and -.1e-7:
/^-?(0|[1-9]\d*)?(\.\d+)?(?<=\d)(e-?(0|[1-9]\d*))?$/i
Negative and positive number with the combination of non -0 believers and invisible zero believers, same as the above, but forbids numbers like -.0e0, -.0000e15 and -.0e-19:
/^(?!-0?(\.0+)?(e|$))-?(0|[1-9]\d*)?(\.\d+)?(?<=\d)(e-?(0|[1-9]\d*))?$/i
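The combined pattern is dense enough that a small table of inputs helps when reading it. The following sketch ports it to Python's re module (the /i flag becomes re.IGNORECASE, and re.ASCII keeps \d to 0-9); the name NUMBER and the sample lists are chosen for the example.

```python
import re

NUMBER = re.compile(
    r'^(?!-0?(\.0+)?(e|$))-?(0|[1-9]\d*)?(\.\d+)?(?<=\d)(e-?(0|[1-9]\d*))?$',
    re.IGNORECASE | re.ASCII,
)

accepted = ['0', '11', '6e4', '16e-10', '.1e3', '-.1e-7', '-56.0e-3', '1.010E0']
rejected = ['', '-', '.', '-0', '-0.00000', '-.0e0', '-0e-6', '01', '1e', '1e05']

assert all(NUMBER.match(t) for t in accepted)
assert not any(NUMBER.match(t) for t in rejected)
```

The (?<=\d) lookbehind is what rejects the empty string and a lone dot once both the integer and the fraction parts are optional.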
Numbers with Hexadecimal Representation
In many programming languages, the string representation of a hexadecimal number like 0x4F7A can easily be cast to the decimal number 20346.
Thus, one may want to support it in a validation script.
The following Regular Expression supports only hexadecimal numbers representations:
/^0x[0-9a-f]+$/i
All Permutations
These final Regular Expressions, support the invisible zero numbers.
Signed Zero Believers
/^(-?(0|[1-9]\d*)?(\.\d+)?(?<=\d)(e-?(0|[1-9]\d*))?|0x[0-9a-f]+)$/i
Non Signed Zero Believers
/^((?!-0?(\.0+)?(e|$))-?(0|[1-9]\d*)?(\.\d+)?(?<=\d)(e-?(0|[1-9]\d*))?|0x[0-9a-f]+)$/i
I hope I have covered all number permutations that are supported in many programming languages.
Oh, I forgot to mention: if you want to validate a number that includes a thousands separator, you should remove all the commas (,) first. There may be any type of separator out there, so you can't actually cover them all in the regex.
But you can remove them before the number validation:
// JavaScript
function clearSeparators(number)
{
    return number.replace(/,/g, '');
}
I had the same problem, but I also wanted ".25" to be a valid decimal number. Here is my solution using JavaScript:
function isNumber(v) {
    // [0-9]* Zero or more digits between 0 and 9 (This allows .25 to be considered valid.)
    // ()? Matches 0 or 1 things in the parentheses. (Allows for an optional decimal point)
    // Decimal point escaped with \.
    // If a decimal point does exist, it must be followed by 1 or more digits [0-9]
    // \d and [0-9] are equivalent
    // ^ and $ anchor the endpoints so the whole string must match.
    return v.trim().length > 0 && v.trim().match(/^[0-9]*(\.[0-9]+)?$/);
}
Where my trim() method is
String.prototype.trim = function() {
    return this.replace(/(^\s*|\s*$)/g, "");
};
Matthew DesVoigne
I've tested all given regexes but unfortunately none of them pass those tests:
String []goodNums={"3","-3","0","0.0","1.0","0.1"};
String []badNums={"001","-00.2",".3","3.","a",""," ","-"," -1","--1","-.1","-0", "2..3", "2-", "2...3", "2.4.3", "5-6-7"};
Here is the best I wrote that pass all those tests:
"^(-?0[.]\\d+)$|^(-?[1-9]+\\d*([.]\\d+)?)$|^0$"
A simple regex to match a numeric input and optional 2 digits decimal.
/^\d*(\.)?(\d{0,2})?$/
You can modify the {0,2} to match your decimal preference {min, max}
Snippet for validation:
const source = document.getElementById('source');
source.addEventListener('input', allowOnlyNumberAndDecimals);
function allowOnlyNumberAndDecimals(e) {
    let str = e.target.value
    const regExp = /^\d*(\.)?(\d{0,2})?$/
    // declared locally so it does not overwrite the global window.status
    const status = regExp.test(str) ? 'valid' : 'invalid'
    console.log(status + ' : ' + source.value)
}
<input type="text" id="source" />
Here is a great working regex for numbers. This accepts number with commas and decimals.
/^-?(?:\d+|\d{1,3}(?:,\d{3})+)?(?:\.\d+)?$/
Here is my regex for validating numbers:
^(-?[1-9]+\\d*([.]\\d+)?)$|^(-?0[.]\\d*[1-9]+)$|^0$
Valid numbers:
String []validNumbers={"3","-3","0","0.0","1.0","0.1","0.0001","-555","94549870965"};
Invalid numbers:
String []invalidNumbers={"a",""," ","-","001","-00.2","000.5",".3","3."," -1","--1","-.1","-0"};
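Lists like these are easy to run against the pattern mechanically. The sketch below ports the regex to Python (single backslashes, re.ASCII so that \d means 0-9); the names PATTERN and mismatches are invented. Running it suggests that 0.0 from the valid list is not matched by the pattern as written, because the second alternative requires a non-zero digit somewhere after the decimal point.

```python
import re

PATTERN = re.compile(r'^(-?[1-9]+\d*([.]\d+)?)$|^(-?0[.]\d*[1-9]+)$|^0$', re.ASCII)

valid = ['3', '-3', '0', '0.0', '1.0', '0.1', '0.0001', '-555', '94549870965']
invalid = ['a', '', ' ', '-', '001', '-00.2', '000.5', '.3', '3.', ' -1',
           '--1', '-.1', '-0']

def mismatches():
    """Return the inputs the pattern classifies differently from the two lists."""
    wrong = [s for s in valid if not PATTERN.match(s)]
    wrong += [s for s in invalid if PATTERN.match(s)]
    return wrong

print(mismatches())  # ['0.0']
```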
Below is the perfect one for mentioned requirement :
^[0-9]{1,3}(,[0-9]{3})*(([\\.,]{1}[0-9]*)|())$
Try this code, hope it will help you
String regex = "(\\d+)(\\.)?(\\d+)?"; // for integers and decimals such as 232 and 232.12
/([0-9]+[.,]*)+/ matches any number with or without commas or dots.
It can match
122
122,354
122.88
112,262,123.7678
Bug: it also matches 262.4377,3883 (but it doesn't matter practically).
if you need to validate decimal with dots, commas, positives and negatives try this:
Object testObject = "-1.5";
boolean isDecimal = Pattern.matches("^[\\+\\-]{0,1}[0-9]+[\\.\\,]{1}[0-9]+$", (CharSequence) testObject);
Good luck.
My regex
/^((0((\.\d*[1-9]\d*)?))|((0(?=[1-9])|[1-9])\d*(\.\d*[1-9]\d*)?))$/
The regular expression ^(\d+(\.\d+)?)$ matches any unsigned integer or decimal number: one or more digits, optionally followed by a dot and one or more digits.
For demonstration I embedded it into a runnable JS-fiddle:
const source = document.getElementById('source');
source.addEventListener('input', allowOnlyNumberAndDecimals);
function allowOnlyNumberAndDecimals(e) {
    let str = e.target.value
    const regExp = /^(\d+(\.\d+)?)$/
    let status = regExp.test(str) ? 'valid' : 'invalid'
    console.log(status + ' : ' + source.value)
}
body {
    height: 100vh;
    background: pink;
    color: black;
    justify-content: center;
    align-items: center;
}
<h1>VALIDATE ALL NUMBERS :)</h1>
<input type="text" id="source" />

How to detect a floating point number using a regular expression

What is a good regular expression for handling a floating point number (i.e. like Java's Float)
The answer must match against the following targets:
1) 1.
2) .2
3) 3.14
4) 5e6
5) 5e-6
6) 5E+6
7) 7.e8
8) 9.0E-10
9) .11e12
In summary, it should
ignore preceding signs
require the first character to the left of the decimal point to be non-zero
allow 0 or more digits on either side of the decimal point
permit a number without a decimal point
allow scientific notation
allow capital or lowercase 'e'
allow positive or negative exponents
For those who are wondering, yes this is a homework problem. We received this as an assignment in my graduate CS class on compilers. I've already turned in my answer for the class and will post it as an answer to this question.
[Epilogue]
My solution didn't get full credit because it didn't handle more than 1 digit to the left of the decimal. The assignment did mention handling Java floats even though none of the examples had more than 1 digit to the left of the decimal. I'll post the accepted answer in its own post.
Just make both the decimal dot and the E-then-exponent part optional:
[1-9][0-9]*\.?[0-9]*([Ee][+-]?[0-9]+)?
I don't see why you don't want a leading [+-]? to capture a possible sign too, but, whatever!-)
Edit: there might in fact be no digits left of the decimal point (in which case I imagine there must be the decimal point and 1+ digits after it!), so a vertical-bar (alternative) is clearly needed:
(([1-9][0-9]*\.?[0-9]*)|(\.[0-9]+))([Ee][+-]?[0-9]+)?
[This is the answer from the professor]
Define:
N = [1-9]
D = 0 | N
E = [eE] [+-]? D+
L = 0 | ( N D* )
Then floating point numbers can be matched with:
( ( L . D* | . D+ ) E? ) | ( L E )
It was also acceptable to use D+ rather than L, and to prepend [+-]?.
A common mistake was to write D* . D*, but this can match just '.'.
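The definitions above translate almost literally into a composed pattern. The following sketch builds it in Python from the same four names N, D, E and L; the name FLOAT and the use of re.fullmatch are choices made for this illustration, and the bare dots of the grammar are written as \. so that they only match a literal decimal point.

```python
import re

N = r'[1-9]'
D = r'[0-9]'                      # D = 0 | N
E = rf'[eE][+-]?{D}+'
L = rf'(?:0|{N}{D}*)'
FLOAT = re.compile(rf'(?:(?:{L}\.{D}*|\.{D}+)(?:{E})?|{L}{E})')

targets = ['1.', '.2', '3.14', '5e6', '5e-6', '5E+6', '7.e8', '9.0E-10', '.11e12']
assert all(FLOAT.fullmatch(t) for t in targets)
assert not FLOAT.fullmatch('.')   # the D* . D* mistake would accept this
```

A plain integer such as 5 is not matched by this grammar, since it requires either a decimal point or an exponent.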
[Edit]
Someone asked about a leading sign; I should have asked him why it was excluded but never got the chance. Since this was part of the lecture on grammars, my guess is that either it made the problem easier (not likely) or there is a small detail in parsing where you divide the problem set such that the floating point value, regardless of sign, is the focus (possible).
If you are parsing through an expression, e.g.
-5.04e-10 + 3.14159E10
the sign of the floating point value is part of the operation to be applied to the value and not an attribute of the number itself. In other words,
subtract (5.04e-10)
add (3.14159E10)
to form the result of the expression. While I'm sure mathematicians may argue the point, remember this was from a lecture on parsing.
http://www.regular-expressions.info/floatingpoint.html
Here is what I turned in.
(([1-9]+\.[0-9]*)|([1-9]*\.[0-9]+)|([1-9]+))([eE][-+]?[0-9]+)?
To make it easier to discuss, I'll label the sections
( ([1-9]+ \. [0-9]* ) | ( [1-9]* \. [0-9]+ ) | ([1-9]+)) ( [eE] [-+]? [0-9]+ )?
-------------------------------------------------------- ----------------------
                           A                                       B
A: matches everything up to the 'e/E'
B: matches the scientific notation
Breaking down A we get three parts
( ([1-9]+ \. [0-9]* ) | ( [1-9]* \. [0-9]+ ) | ([1-9]+) )
----------1----------   ---------2----------   ---3----
Part 1: Allows 1 or more digits from 1-9, decimal, 0 or more digits after the decimal (target 1)
Part 2: Allows 0 or more digits from 1-9, decimal, 1 or more digits after the decimal (target 2)
Part 3: Allows 1 or more digits from 1-9 with no decimal (see #4 in target list)
Breaking down B we get 4 basic parts
( [eE] [-+]? [0-9]+ )?
  --1- --2-- --3--- -4
Part 1: requires either upper or lowercase 'e' for scientific notation (e.g. targets 8 & 9)
Part 2: allows an optional positive or negative sign for the exponent (e.g. targets 4, 5, & 6)
Part 3: allows 1 or more digits for the exponent (target 8)
Part 4: allows the scientific notation to be optional as a group (target 3)
@Kelly S. French, this regular expression matches all your test cases.
^[+-]?(\d+\.\d+|\d+\.|\.\d+|\d+)([eE][+-]?\d+)?$
Source: perldoc perlretut
'([-+])?\d*(\.)?\d+(([eE]([-+])?)?\d+)?'
That's the regular expression I have arrived at when trying to solve this kind of task in Matlab. Actually, it won't correctly detect numbers like (1.) but some additional changes may solve the problem... well, maybe the following would fix that:
'([-+])?(\d+(\.)?\d*|\d*(\.)?\d+)(([eE]([-+])?)?\d+)?'
@Kelly S. French: the sign is missing because in a parser it would be added by the unary minus (negation) expression, so it does not need to be detected as part of a float.

Regexes for integer constants and for binary numbers

I have tried 2 questions, could you tell me whether I am right or not?
Regular expression of nonnegative integer constants in C, where numbers beginning with 0 are octal constants and other numbers are decimal constants.
I tried 0([1-7][0-7]*)?|[1-9][0-9]*, is it right? And what string could I match? Do you think 034567 will match and 000083 match?
What is a regular expression for binary numbers x such that h^x + i^x = j^x?
I tried (0|1){32}|1|(10)).. do you think a string like 10 will match and 11 won’t match?
Please tell me whether I am right or not.
You can always use http://www.spaweditor.com/scripts/regex/ for a quick test on whether a particular regex works as you intend it to. This along with google can help you nail the regex you want.
0([1-7][0-7])?|[1-9][0-9] is wrong because there's no repetition - it will only match 1 or 2-character strings. What you need is something like 0[0-7]*|[1-9][0-9]*, though that doesn't take hexadecimal into account (as per spec).
This one is not clear. Could you rephrase that or give some more examples?
Your regex for integer constants will not match base-10 numbers longer than two digits and octal numbers longer than three digits (2 if you don't count the leading zero). Since this is a homework, I leave it up to you to figure out what's wrong with it.
Hint: Google for "regular expression repetition quantifiers".
Question 1:
Octal numbers:
A string that start with a [0] , then can be followed by any digit 1, 2, .. 7 [1-7](assuming no leading zeroes) but can also contain zeroes after the first actual digit, so [0-7]* (* is for repetition, zero or more times).
So we get the following RegEx for this part: 0 [1-7][0-7]*
Decimal numbers:
Decimal numbers must not have a leading zero, hence start with all digits from 1 to 9 [1-9], but zeroes are allowed in all other positions as well hence we need to concatenate [0-9]*
So we get the following RegEx for this part: [1-9][0-9]*
Since we have two options (octal and decimal numbers) and either one is possible we can use the Alternation property '|' :
L = 0[1-7][0-7]* | [1-9][0-9]*
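The two strings the question asks about can be tried directly. The sketch below compares the pattern derived in this answer with the looser 0[0-7]*|[1-9][0-9]* that also appears in this thread; the names STRICT and LOOSE are invented for the example, and fullmatch is used so the whole string has to be consumed.

```python
import re

STRICT = re.compile(r'0[1-7][0-7]*|[1-9][0-9]*')   # pattern derived in this answer
LOOSE = re.compile(r'0[0-7]*|[1-9][0-9]*')         # also accepts 0 and extra leading zeros

assert STRICT.fullmatch('034567')        # octal constant
assert not STRICT.fullmatch('000083')    # 8 is not an octal digit
assert not LOOSE.fullmatch('000083')
assert not STRICT.fullmatch('0')         # a lone 0 is only covered by LOOSE
assert LOOSE.fullmatch('0')
```

Which variant is appropriate depends on whether a lone 0 and redundant leading zeros should count as valid constants.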
Question 2:
Quickly looking at Fermat's Last Theorem:
In number theory, Fermat's Last Theorem (sometimes called Fermat's conjecture, especially in older texts) states that no three positive integers a, b, and c can satisfy the equation a^n + b^n = c^n for any integer value of n greater than two.
(http://en.wikipedia.org/wiki/Fermat%27s_Last_Theorem)
Hence the following sets where n<=2 satisfy the equation: {0,1,2} in base 10 = {0,1,10} in base 2
If any of those elements satisfy the equation, we use the Alternation | (or)
So the regular expression can be: L = 0 | 1 | 10 but can also be L = 00 | 01 | 10 or even be L = 0 | 1 | 10 | 00 | 01
Or can be generalized into:
{0} we can have infinite number of zeroes: 0*
{1} we can have infinite number of zeroes followed by a 1: 0*1
{10} we can have infinite number of zeroes followed by 10: 0*10
So L = 0* | 0*1 | 0*10
max answered the first question.
the second appears to be the unsolvable diophantine equation of fermat's last theorem. if h,i,j are non-zero integers, x can only be 1 or 2, so you're looking for
^0*10?$
does that help?
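A quick way to confirm which binary numerals this pattern accepts is to run it over the first few integers. This is a small sketch; the names ONE_OR_TWO and matching are made up for the example.

```python
import re

ONE_OR_TWO = re.compile(r'^0*10?$')

# Binary numerals of 0..15 that the pattern accepts.
matching = [x for x in range(16) if ONE_OR_TWO.match(format(x, 'b'))]
print(matching)  # [1, 2]
```

Leading zeros are allowed by the 0* prefix, so 0010 is accepted as another spelling of 2.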
There are several tool available to test regular expressions, such as The Regulator.
If you search for "regular expression test" you will find numerous links to online testers.