What is a good regular expression for handling a floating point number (i.e. like Java's Float)?
The answer must match against the following targets:
1) 1.
2) .2
3) 3.14
4) 5e6
5) 5e-6
6) 5E+6
7) 7.e8
8) 9.0E-10
9) .11e12
In summary, it should
ignore preceding signs
require the first character to the left of the decimal point to be non-zero
allow 0 or more digits on either side of the decimal point
permit a number without a decimal point
allow scientific notation
allow capital or lowercase 'e'
allow positive or negative exponents
For those who are wondering, yes this is a homework problem. We received this as an assignment in my graduate CS class on compilers. I've already turned in my answer for the class and will post it as an answer to this question.
[Epilogue]
My solution didn't get full credit because it didn't handle more than 1 digit to the left of the decimal. The assignment did mention handling Java floats even though none of the examples had more than 1 digit to the left of the decimal. I'll post the accepted answer in its own post.
Just make both the decimal dot and the E-then-exponent part optional:
[1-9][0-9]*\.?[0-9]*([Ee][+-]?[0-9]+)?
I don't see why you don't want a leading [+-]? to capture a possible sign too, but, whatever!-)
Edit: there might in fact be no digits left of the decimal point (in which case I imagine there must be the decimal point and 1+ digits after it!), so a vertical-bar (alternative) is clearly needed:
(([1-9][0-9]*\.?[0-9]*)|(\.[0-9]+))([Ee][+-]?[0-9]+)?
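A quick way to check the pattern above against the nine targets from the question is a small Python sketch. The names FLOAT_RE, TARGETS and is_float are made up for illustration, and re.fullmatch is used so that the whole string has to match, not just a prefix.

```python
import re

# Pattern from the answer above; fullmatch() anchors it at both ends.
FLOAT_RE = re.compile(r"(([1-9][0-9]*\.?[0-9]*)|(\.[0-9]+))([Ee][+-]?[0-9]+)?")

# The nine targets listed in the question.
TARGETS = ["1.", ".2", "3.14", "5e6", "5e-6", "5E+6", "7.e8", "9.0E-10", ".11e12"]


def is_float(text):
    """Return True if the whole string matches the pattern."""
    return FLOAT_RE.fullmatch(text) is not None


for target in TARGETS:
    print(target, is_float(target))
```

Note that a lone "." and a value with a leading zero such as "0.5" are rejected, while a plain integer such as "5" is accepted.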
[This is the answer from the professor]
Define:
N = [1-9]
D = 0 | N
E = [eE] [+-]? D+
L = 0 | ( N D* )
Then floating point numbers can be matched with:
( ( L . D* | . D+ ) E? ) | ( L E )
It was also acceptable to use D+ rather than L, and to prepend [+-]?.
A common mistake was to write D* . D*, but this can match just '.'.
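As a rough sketch, the definitions above can be transcribed into a Python regex by building it from the same named pieces. The variable names mirror the professor's N, D, E and L; the literal dot is escaped, and is_float is a helper name invented for this example.

```python
import re

# Building blocks, named as in the definitions above.
N = r"[1-9]"
D = r"[0-9]"
E = rf"[eE][+-]?{D}+"
L = rf"(?:0|{N}{D}*)"

# ( ( L . D* | . D+ ) E? ) | ( L E ), with the literal dot escaped.
FLOAT = re.compile(rf"(?:{L}\.{D}*|\.{D}+)(?:{E})?|{L}{E}")


def is_float(text):
    """Return True if the whole string is a float under this grammar."""
    return FLOAT.fullmatch(text) is not None


for sample in ["1.", ".2", "5e6", "10.25", ".", "5"]:
    print(sample, is_float(sample))
```

Under this grammar a bare "." and a plain integer such as "5" are not floats, while "0.5" and "10.25" are.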
[Edit]
Someone asked about a leading sign; I should have asked him why it was excluded but never got the chance. Since this was part of the lecture on grammars, my guess is that either it made the problem easier (not likely) or there is a small detail in parsing where you divide the problem set such that the floating point value, regardless of sign, is the focus (possible).
If you are parsing through an expression, e.g.
-5.04e-10 + 3.14159E10
the sign of the floating point value is part of the operation to be applied to the value and not an attribute of the number itself. In other words,
subtract (5.04e-10)
add (3.14159E10)
to form the result of the expression. While I'm sure mathematicians may argue the point, remember this was from a lecture on parsing.
http://www.regular-expressions.info/floatingpoint.html
Here is what I turned in.
(([1-9]+\.[0-9]*)|([1-9]*\.[0-9]+)|([1-9]+))([eE][-+]?[0-9]+)?
To make it easier to discuss, I'll label the sections
( ([1-9]+ \. [0-9]* ) | ( [1-9]* \. [0-9]+ ) | ([1-9]+)) ( [eE] [-+]? [0-9]+ )?
-------------------------------------------------------- ----------------------
                           A                                        B
A: matches everything up to the 'e/E'
B: matches the scientific notation
Breaking down A we get three parts
( ([1-9]+ \. [0-9]* ) | ( [1-9]* \. [0-9]+ ) | ([1-9]+) )
----------1---------- ---------2---------- ---3----
Part 1: Allows 1 or more digits from 1-9, decimal, 0 or more digits after the decimal (target 1)
Part 2: Allows 0 or more digits from 1-9, decimal, 1 or more digits after the decimal (target 2)
Part 3: Allows 1 or more digits from 1-9 with no decimal (see #4 in target list)
Breaking down B we get 4 basic parts
( [eE] [-+]? [0-9]+ )?
..--1- --2-- --3--- -4- ..
Part 1: requires either upper or lowercase 'e' for scientific notation (e.g. targets 8 & 9)
Part 2: allows an optional positive or negative sign for the exponent (e.g. targets 4, 5, & 6)
Part 3: allows 1 or more digits for the exponent (target 8)
Part 4: allows the scientific notation to be optional as a group (target 3)
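A small Python check of the submitted pattern (a sketch; SUBMITTED and matches are names invented here) suggests where it falls short. All nine targets pass, but because the integer part is [1-9]+, any integer part containing a zero digit, such as 10.5, is rejected, which may be what the grading comment in the epilogue was getting at.

```python
import re

# The pattern as turned in; fullmatch() requires the whole string to match.
SUBMITTED = re.compile(
    r"(([1-9]+\.[0-9]*)|([1-9]*\.[0-9]+)|([1-9]+))([eE][-+]?[0-9]+)?"
)


def matches(text):
    """Return True if the whole string matches the submitted pattern."""
    return SUBMITTED.fullmatch(text) is not None


for sample in ["3.14", "12.5", "10.5", "100"]:
    print(sample, matches(sample))
```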
@Kelly S. French, this regular expression matches all your test cases.
^[+-]?(\d+\.\d+|\d+\.|\.\d+|\d+)([eE][+-]?\d+)?$
Source: perldoc perlretut
'([-+])?\d*(\.)?\d+(([eE]([-+])?)?\d+)?'
That's the regular expression I have arrived at when trying to solve this kind of task in Matlab. Actually, it won't correctly detect numbers like (1.) but some additional changes may solve the problem... well, maybe the following would fix that:
'([-+])?(\d+(\.)?\d*|\d*(\.)?\d+)(([eE]([-+])?)?\d+)?'
@Kelly S. French: the sign is missing because in a parser it would get added by the unary minus (negation) expression, therefore it is not necessary to detect it as part of a float.
I'm trying to match numbers greater than 40. The good point is that all of them have 2 decimal places, so all of them look like 3.25, 5.89, 999.75, and they don't use any leading zeros (except in the decimal part, which always has 2 digits)...
At first I tried the following code but then I realized this wouldn't match numbers like 100, 1000... even if they are greater than 40.
[4-9][0-9]\.
I don't have to match the decimal part, so don't worry about matching that, just help me to find how to match numbers greater than 40 (up to 9999 would be fine).
Thanks for your help.
This should do the job:
([4-9][0-9]|\d{3,})\.
Check it here:
http://www.regexr.com/3a5v9
Don't use regular expressions for number comparison. If, for example, you're using Javascript:
var aNumber = parseFloat("50");
if (aNumber > 40) {
// yay!
}
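The same idea sketched in Python, for comparison: use a regex only to find the number-shaped tokens, then compare them numerically. The function name numbers_above and the token pattern are assumptions made for this example, and the pattern only covers unsigned decimals like the ones in the question.

```python
import re

# Unsigned decimal tokens such as 3.25 or 100; signs and exponents are ignored.
NUMBER = re.compile(r"\d+(?:\.\d+)?")


def numbers_above(text, threshold=40.0):
    """Find decimal numbers in text and keep those strictly above threshold."""
    return [float(tok) for tok in NUMBER.findall(text) if float(tok) > threshold]


print(numbers_above("3.25 40.00 40.01 999.75 100.00"))
```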
If your regex flavour can use negative lookbehind to match the numbers from 41 to 9999 without decimal:
\b(?:[1-9][0-9]{2,3}|[5-9][0-9]|4[1-9])(?<!\.\d{1,2})\b
(40\.(?!0[^\d]|00)\d{1,2}|(((4[1-9](?!\d)|[5-9][0-9])(?![\d])|\d*[1-9]\d{2,})(\.\d{1,2})?))
This prevents false positives from leading 0s.
This worked for me.
It tries to match 40 followed by 1 or two decimals that are not 00.
It then tries to match 4 followed by 1-9, decimal optional.
If it can't match that it matches 5-9 followed by 0-9, decimal optional.
It then tries to match any digit, any number of times, followed by 1-9, followed by 2 or more digits, decimal optional.
If you want to require the decimal, just remove the last question mark.
This will do it:
([4-9][0-9]+|\d{3,})
This will match all numbers of two or more digits whose first digit is 4 or greater, as well as any number with three or more digits.
As an example http://www.regexr.com/3a5v0
You can use braces to indicate a minimum and, if desired, maximum number of repetitions to match. So,
([4-9][0-9]|[1-9][0-9]{2,})\.
matches 4-9 followed by exactly one digit, or 1-9 followed by two or more digits. Presumably there's a boundary of some sort at the beginning of this, but it sounds like you have that part worked out. This uses an OR to allow for two possible groups of first digits.
(Most of the other answers are perfect for me -- This is paranoia and a bad idea :)
for use with grep -Po or Perl we could use:
'\b(\d{3,}|[4-9]\d)\.\d\d'
but this would get 40.00 (not greater than 40)
'\b(\d{3,}|[5-9]\d|4[1-9])\.\d\d|\b40\.\d?[1-9]\d?'
Corresponding to:
DDD.DD
| [5-9]D.DD
| 4[1-9].DD
| 40.D[1-9]
| 40.[1-9]D
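For anyone without grep -P at hand, the second pattern can be tried in Python as a sketch. ABOVE_40 and is_above_40 are names invented here; re.search is used because the pattern, like the grep version, looks for a match anywhere in the line.

```python
import re

# Second pattern from above: strictly greater than 40, two decimal places.
ABOVE_40 = re.compile(r"\b(\d{3,}|[5-9]\d|4[1-9])\.\d\d|\b40\.\d?[1-9]\d?")


def is_above_40(text):
    """Return True if text contains a number the pattern treats as > 40."""
    return ABOVE_40.search(text) is not None


for sample in ["40.00", "40.01", "41.00", "39.99", "999.75"]:
    print(sample, is_above_40(sample))
```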
In flex(1) you have this code to parse strings and get numbers greater than 40:
pru.l:
%option noyywrap
%%
\+?(0*[4-9][0-9]|0*[1-9][0-9][0-9][0-9]*)(\.[0-9]*)? { printf("Greater than 40: %s\n", yytext); }
\-?[0-9]*(\.[0-9]*)? { printf("Lesser than 40: %s\n", yytext); }
\n |
. ;
%%
int main()
{ yylex(); }
Install flex and compile this file with
make pru
Then run it as:
pru <filein >fileout
or just
pru
This code constructs a deterministic finite automaton from the regular expressions listed and runs the actions listed on the right when it recognizes a value greater than 40. It allows an optional leading sign and leading zeros, and an optional fractional part composed of any number of digits. And it does this with only one assignment and one decision for each character read. You have access to the automaton state table generated by flex (it writes C code for you).
The regex that recognizes numbers greater than 40 (with decimals and leading sign and zeros) is:
\+?(0*[4-9][0-9]|0*[1-9][0-9][0-9][0-9]*)(\.[0-9]*)?
and can be abbreviated as:
\+?(0*[4-9][0-9]|0*[1-9][0-9]{2,})(\.[0-9]*)?
Explanation:
\+? matches an optional plus sign.
(...|...) two options:
0* optional arbitrary number of leading zeros.
[4-9][0-9] the numbers 40 to 99
[1-9][0-9]{2,} the numbers 100 and up.
(\.[0-9]*)? optional decimal point followed by an arbitrary number of digits.
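To try the expression without building the flex scanner, here is a hedged Python sketch using the long form taken from the flex rule. AT_LEAST_40 and accepted are names invented for this example. One thing it makes visible is that 40 itself is accepted, so the pattern really tests for 40 or more.

```python
import re

# Long form of the pattern, copied from the first flex rule above.
AT_LEAST_40 = re.compile(r"\+?(0*[4-9][0-9]|0*[1-9][0-9][0-9][0-9]*)(\.[0-9]*)?")


def accepted(text):
    """Return True if the whole string is accepted by the pattern."""
    return AT_LEAST_40.fullmatch(text) is not None


for sample in ["40", "99.5", "100", "0041", "+1000.25", "39", "-50"]:
    print(sample, accepted(sample))
```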
I am looking for a Regex for non decimal integer considering exponents and honestly I have tried a lot before asking here.
The regex should
match 1.23E4,1.2334576E34, 122E3,123,456 etc.
not match 1.234E2 (since it expands to 123.4).
should not match 1.22 and so on.
My try was
^[+-]?([0-9]*\\.?[0-9]+|[0-9]+\\.?[0-9]*)([eE][+]?[0-9]+)?$
However, as you can see, I am not evaluating the exponent, so I cannot tell whether a value X still contains a decimal part after it is expanded.
Is there any way to extract the number of digits after the decimal . and compare it with the exponent, so that I can be sure that after expanding it will not contain a decimal?
For information, only a regex that can be applied at runtime will work for me.
Please help me guys...
ok, so this is only if you really need this for some weird regexp-only validation. it's written in python 3 and it makes no attempt to be compact (there's no limitation except available memory in the size of a regexp in python).
def over(n):
    '''make a regexp for an exponent of n or more'''
    assert n < 100
    return r'([1-9]\d{2,}|%s)' % '|'.join(str(i) for i in range(n, 100))

def make_decimal(n_digits, n_decimal):
    '''make a regexp for a number with an "E" with the given number of significant digits and decimal places'''
    assert n_decimal < n_digits
    assert 100 > n_decimal >= 0
    if n_decimal:
        # the dot is escaped so that it only matches a literal decimal point
        return r'\d{%d}\.\d{%d}E%s' % (n_digits-n_decimal, n_decimal, over(n_decimal))
    else:
        return r'\d{%d}E\d+' % n_digits

def make_e(n_digits):
    '''make a regexp for an integer with an "E" with the given number of significant digits'''
    return '|'.join(make_decimal(n_digits, i) for i in range(n_digits))

def make_regexp(max_digits):
    '''make a regexp for a decimal integer with up to the given number of significant digits'''
    assert max_digits < 100
    # start at 1: zero digits would add an empty alternative that matches ''
    return r'(\d+|%s)' % '|'.join(make_e(i) for i in range(1, max_digits+1))
here's some test code.
from re import compile
rx = make_regexp(8)
m = compile('^%s$' % rx)
for n in ['1.23E4', '1.2334576E34', '122E3', '123', '456']:
    assert m.match(n), n
for n in ['1.234E2', '1.22']:
    assert not m.match(n), n
for up to 8 significant digits (to the left of E), which seems a reasonable limit, the regexp generated is roughly 8,800 characters long. you could reduce this significantly (for example, see https://stackoverflow.com/a/17840228/181772), but what's the need (the regular expression engine is capable of generating a much smaller internal automaton from this)?
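If the regexp-only constraint can be relaxed, the comparison the question describes (number of fraction digits versus the exponent) is short to sketch in Python. Everything here is an assumption for illustration: the names NUMBER and expands_to_integer, the choice to allow only non-negative exponents (as in the question's own attempt), and the choice to ignore trailing zeros in the fraction.

```python
import re

# Groups: integer digits, fraction digits, exponent digits (non-negative only).
NUMBER = re.compile(r"[+-]?(\d*)(?:\.(\d*))?(?:[eE]\+?(\d+))?")


def expands_to_integer(text):
    """Return True if text is a number with no fractional part once expanded."""
    m = NUMBER.fullmatch(text)
    # Require at least one digit in the mantissa.
    if m is None or not (m.group(1) or m.group(2)):
        return False
    # Trailing zeros after the decimal point do not need exponent digits.
    fraction = (m.group(2) or "").rstrip("0")
    exponent = int(m.group(3) or 0)
    return len(fraction) <= exponent


for sample in ["1.23E4", "1.2334576E34", "122E3", "123", "1.234E2", "1.22"]:
    print(sample, expands_to_integer(sample))
```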
Description
It's not impossible, but rather difficult and the expression will really start to get out of hand. Take this 2831 character monster which:
validates a number with exponent will expand to an integer
requires a number to be in the form 123.456e7890 or 1234.678e1,234,567
if the exponent contains commas they must appear in the correct comma delimited three digit groupings
supports only numbers up to 99 places after the decimal point
As written here it does require the use of the x option which will ignore white space and comments. The expression could be shortened to about 2041 characters by replacing [eE] with e and using the i option, and [0-9] with \d; however, this will slightly reduce performance because in Unicode-aware flavors the \d class matches all Unicode digits and not just 0-9.
^
(?=.*?[eE][0-9]{1,3}(?:,[0-9]{3})*|[0-9]*$) # validate commas are in the correct order
(?=[0-9]+\. # match the integer portion of a real number
(?=
[0-9]{1,99}[eE][1-9](?:,?[0-9]){2,}
|[0-9]{1,9}[eE][1-9],?[0-9]
|[0-9]{10,19}[eE][2-9],?[0-9]
|[0-9]{20,29}[eE][3-9],?[0-9]
|[0-9]{30,39}[eE][4-9],?[0-9]
|[0-9]{40,49}[eE][5-9],?[0-9]
|[0-9]{50,59}[eE][6-9],?[0-9]
|[0-9]{60,69}[eE][7-9],?[0-9]
|[0-9]{70,79}[eE][89],?[0-9]
|[0-9]{80,89}[eE][9],?[0-9]
|[0-9]{90,99}[eE][1-9],?[0-9]
|(?=[0-9]{90}(?=.*?[eE]9)(?:[eE].,?[0-9]|[0-9]{1}[eE].,?[1-9]|[0-9]{2}[eE].,?[2-9]|[0-9]{3}[eE].,?[3-9]|[0-9]{4}[eE].,?[4-9]|[0-9]{5}[eE].,?[5-9]|[0-9]{6}[eE].,?[6-9]|[0-9]{7}[eE].,?[7-9]|[0-9]{8}[eE].,?[89]|[0-9]{9}[eE].,?9))
|(?=[0-9]{80}(?=.*?[eE]8)(?:[eE].,?[0-9]|[0-9]{1}[eE].,?[1-9]|[0-9]{2}[eE].,?[2-9]|[0-9]{3}[eE].,?[3-9]|[0-9]{4}[eE].,?[4-9]|[0-9]{5}[eE].,?[5-9]|[0-9]{6}[eE].,?[6-9]|[0-9]{7}[eE].,?[7-9]|[0-9]{8}[eE].,?[89]|[0-9]{9}[eE].,?9))
|(?=[0-9]{70}(?=.*?[eE]7)(?:[eE].,?[0-9]|[0-9]{1}[eE].,?[1-9]|[0-9]{2}[eE].,?[2-9]|[0-9]{3}[eE].,?[3-9]|[0-9]{4}[eE].,?[4-9]|[0-9]{5}[eE].,?[5-9]|[0-9]{6}[eE].,?[6-9]|[0-9]{7}[eE].,?[7-9]|[0-9]{8}[eE].,?[89]|[0-9]{9}[eE].,?9))
|(?=[0-9]{60}(?=.*?[eE]6)(?:[eE].,?[0-9]|[0-9]{1}[eE].,?[1-9]|[0-9]{2}[eE].,?[2-9]|[0-9]{3}[eE].,?[3-9]|[0-9]{4}[eE].,?[4-9]|[0-9]{5}[eE].,?[5-9]|[0-9]{6}[eE].,?[6-9]|[0-9]{7}[eE].,?[7-9]|[0-9]{8}[eE].,?[89]|[0-9]{9}[eE].,?9))
|(?=[0-9]{50}(?=.*?[eE]5)(?:[eE].,?[0-9]|[0-9]{1}[eE].,?[1-9]|[0-9]{2}[eE].,?[2-9]|[0-9]{3}[eE].,?[3-9]|[0-9]{4}[eE].,?[4-9]|[0-9]{5}[eE].,?[5-9]|[0-9]{6}[eE].,?[6-9]|[0-9]{7}[eE].,?[7-9]|[0-9]{8}[eE].,?[89]|[0-9]{9}[eE].,?9))
|(?=[0-9]{40}(?=.*?[eE]4)(?:[eE].,?[0-9]|[0-9]{1}[eE].,?[1-9]|[0-9]{2}[eE].,?[2-9]|[0-9]{3}[eE].,?[3-9]|[0-9]{4}[eE].,?[4-9]|[0-9]{5}[eE].,?[5-9]|[0-9]{6}[eE].,?[6-9]|[0-9]{7}[eE].,?[7-9]|[0-9]{8}[eE].,?[89]|[0-9]{9}[eE].,?9))
|(?=[0-9]{30}(?=.*?[eE]3)(?:[eE].,?[0-9]|[0-9]{1}[eE].,?[1-9]|[0-9]{2}[eE].,?[2-9]|[0-9]{3}[eE].,?[3-9]|[0-9]{4}[eE].,?[4-9]|[0-9]{5}[eE].,?[5-9]|[0-9]{6}[eE].,?[6-9]|[0-9]{7}[eE].,?[7-9]|[0-9]{8}[eE].,?[89]|[0-9]{9}[eE].,?9))
|(?=[0-9]{20}(?=.*?[eE]2)(?:[eE].,?[0-9]|[0-9]{1}[eE].,?[1-9]|[0-9]{2}[eE].,?[2-9]|[0-9]{3}[eE].,?[3-9]|[0-9]{4}[eE].,?[4-9]|[0-9]{5}[eE].,?[5-9]|[0-9]{6}[eE].,?[6-9]|[0-9]{7}[eE].,?[7-9]|[0-9]{8}[eE].,?[89]|[0-9]{9}[eE].,?9))
|(?=[0-9]{10}(?=.*?[eE]1)(?:[eE].,?[0-9]|[0-9]{1}[eE].,?[1-9]|[0-9]{2}[eE].,?[2-9]|[0-9]{3}[eE].,?[3-9]|[0-9]{4}[eE].,?[4-9]|[0-9]{5}[eE].,?[5-9]|[0-9]{6}[eE].,?[6-9]|[0-9]{7}[eE].,?[7-9]|[0-9]{8}[eE].,?[89]|[0-9]{9}[eE].,?9))
|(?:[eE][0-9]|[0-9]{1}[eE][1-9]|[0-9]{2}[eE][2-9]|[0-9]{3}[eE][3-9]|[0-9]{4}[eE][4-9]|[0-9]{5}[eE][5-9]|[0-9]{6}[eE][6-9]|[0-9]{7}[eE][7-9]|[0-9]{8}[eE][89]|[0-9]{9}[eE]9)
)
|(?=[0-9]+[eE]) # integers
)
[+-]?
([0-9]*\.?[0-9]+|[0-9]+\.?[0-9]*)
[eE][+]?((?:,?[0-9]+)+)
As written here the expression uses the x option which ignores white space
Example
Sample Text
1.2334576E34
1.23E4
1.2334576E34
122E3,123,456
1.234
1.234E2
Matches
[0] => 1.2334576E34
[1] => 1.23E4
[2] => 1.2334576E34
[3] => 122E3,123,456
My aim is to write a regular expression for a decimal number where a valid number is one of
xx.0, xx.125, xx.25, xx.375, xx.5, xx.625, xx.75, xx.875 (i.e. measured in 1/8ths) The xx can be 0, 1 or 2 digits.
I have come up with the following regex:
^\d*\.?((25)|(50)|(5)|(75)|(0)|(00))?$
While this works for 0.25, 0.5, 0.75, it won't work for 0.225, 0.675 etc.
I assumed that the '?' would work in a case where there is a preceding number as well.
Can someone point out my mistake?
Edit: the number is required to be a decimal!
Edit2: I realized my mistake, I was confused about the '?'. Thank you.
I would add another \d* after the literal . check \.
^\d*\.?\d*((25)|(50)|(5)|(75)|(0)|(00))?$
I think it would probably just be easier to multiply the decimal part by 8, but you don't consider digits that lead the last two decimals in the regex.
^\d{0,2}\.(00?|(1|6)?25|(3|8)?75|50?)$
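Both ideas from this answer, the regex and multiplying by 8, can be compared with a short Python sketch. The names EIGHTHS, is_eighth_by_regex and is_eighth_by_arithmetic are made up here. The arithmetic version assumes the input is already a plain decimal literal, and it relies on eighths being exactly representable in binary floating point; unlike the regex it does not limit the number of integer digits.

```python
import re

# Regex from the answer above: up to two integer digits, fraction in 1/8ths.
EIGHTHS = re.compile(r"^\d{0,2}\.(00?|(1|6)?25|(3|8)?75|50?)$")


def is_eighth_by_regex(text):
    """Return True if text matches the eighths pattern."""
    return EIGHTHS.match(text) is not None


def is_eighth_by_arithmetic(text):
    """Return True if the value times 8 is a whole number."""
    return (float(text) * 8).is_integer()


for sample in ["12.125", "0.5", ".75", "3.0", "0.225", "1.3"]:
    print(sample, is_eighth_by_regex(sample), is_eighth_by_arithmetic(sample))
```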
Your mistake is: \.? indicates one optional \., not a digit (or anything else, in this case).
About the ? (question mark) operator: Makes the preceding item optional. Greedy, so the optional item is included in the match if possible. (source)
^\d{0,2}\.(0|(1|2|6)?25|(3|6|8)?75|5)$
Regular expressions are for matching patterns, not checking numeric values. Find a likely string with the regex, then check its numeric value in whatever your host language is (PHP, whatever).
I am having a bit of difficulty with the following:
I need to allow any positive numeric value up to four decimal places. Here are some examples.
Allowed:
123
12345.4
1212.56
8778787.567
123.5678
Not allowed:
-1
12.12345
-12.1234
I have tried the following:
^[0-9]{0,2}(\.[0-9]{1,4})?$|^(100)(\.[0]{1,4})?$
However this doesn't seem to work, e.g. 1000 is not allowed when it should be.
Any ideas would be greatly appreciated.
Thanks
To explain why your attempt is not working for a value of 1000, I'll break down the expression a little:
^[0-9]{0,2} # Match 0, 1, or 2 digits (can start with a zero)...
(\.[0-9]{1,4})?$ # ... optionally followed by (a decimal, then 1-4 digits)
| # -OR-
^(100) # Capture 100...
(\.[0]{1,4})?$ # ... optionally followed by (a decimal, then 1-4 ZEROS)
There is no room for 4 digits of any sort, much less 1000 (there's only room for a 0-2 digit number or the number 100).
^\d* # Match any number of digits (can start with a zero)
(\.\d{1,4})?$ # ...optionally followed by (a decimal and 1-4 digits)
This expression will pass any of the allowed examples and reject all of the Not Allowed examples as well, because you (and I) use the beginning-of-string assertion ^.
It will also pass these numbers:
.2378
1234567890
12374610237856987612364017826350947816290385
000000000000000000000.0
0
... as well as a completely blank line, which might or might not be desired.
To make it reject something that starts with a zero, use this:
^(?!0\d)\d* # Match any number of digits (cannot "START" with a zero)
(\.\d{1,4})?$ # ...optionally followed by (a decimal and 1-4 digits)
This expression (which uses a negative lookahead) has these evaluations:
REJECTED     Allowed
---------    -------
0000.1234    0.1234
0000         0
010          0.0
You could also test for a completely blank line in other ways, but if you wanted to reject it with the regex, use this:
^(?!0\d|$)\d*(\.\d{1,4})?$
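The final expression can be run against the examples from the question with a few lines of Python. This is only a sketch; UP_TO_4_DECIMALS, ALLOWED, NOT_ALLOWED and is_valid are names chosen for this example.

```python
import re

# Final pattern from above: no leading zeros, not blank, at most 4 decimals.
UP_TO_4_DECIMALS = re.compile(r"^(?!0\d|$)\d*(\.\d{1,4})?$")

# Examples taken from the question.
ALLOWED = ["123", "12345.4", "1212.56", "8778787.567", "123.5678"]
NOT_ALLOWED = ["-1", "12.12345", "-12.1234"]


def is_valid(text):
    """Return True if text passes the pattern."""
    return UP_TO_4_DECIMALS.match(text) is not None


for sample in ALLOWED + NOT_ALLOWED + ["", "0000", "0.1234", ".2378"]:
    print(repr(sample), is_valid(sample))
```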
Try this:
^[0-9]*(?:\.[0-9]{0,4})?$
Explanation: match only if starting with a digit (excluding negative numbers), optionally followed by (non-capturing group) a dot and 0-4 digits.
Edit: With this pattern .2134 would also be matched. To only allow 0 < x < 1 of format 0.2134, replace the first * with a + above.
This regex would do the trick:
^\d+(?:\.\d{1,4})?$
From the beginning of the string, search for one or more digits. If there's a . it must be followed by at least one digit but a maximum of 4.
^(?<!-)\+?\d+(\.?\d{0,4})?$
This will match something which doesn't start with -, maybe has a +, followed by an integer part with at least one digit and an optional fractional part of at most 4 digits.
Note: Regex does not support scientific notation. If you want that too let me know in a comment.
Well asked!!
You can try this:
^([0-9]+[\.]?[0-9]?[0-9]?[0-9]?[0-9]?|[0-9]+)$
If you have a double value with more decimal places and you want to shorten it to 4, you can do this (DecimalFormat is in java.text):
double value = 12.3457652133;
value = Double.parseDouble(new DecimalFormat("##.####").format(value));
I have tried 2 questions, could you tell me whether I am right or not?
Regular expression of nonnegative integer constants in C, where numbers beginning with 0 are octal constants and other numbers are decimal constants.
I tried 0([1-7][0-7]*)?|[1-9][0-9]*, is it right? And what string could I match? Do you think 034567 will match and 000083 match?
What is a regular expression for binary numbers x such that h^x + i^x = j^x?
I tried (0|1){32}|1|(10)).. do you think a string like 10 will match and 11 won’t match?
Please tell me whether I am right or not.
You can always use http://www.spaweditor.com/scripts/regex/ for a quick test on whether a particular regex works as you intend it to. This along with google can help you nail the regex you want.
0([1-7][0-7])?|[1-9][0-9] is wrong because there's no repetition - it will only match 1 or 2-character strings. What you need is something like 0[0-7]*|[1-9][0-9]*, though that doesn't take hexadecimal into account (as per spec).
This one is not clear. Could you rephrase that or give some more examples?
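For the first question, the suggested pattern 0[0-7]*|[1-9][0-9]* can be checked against the two strings the asker mentions. This is a Python sketch with invented names (C_INT, is_c_integer); like the pattern itself, it ignores hexadecimal constants and suffixes.

```python
import re

# Octal constants start with 0; decimal constants start with 1-9.
C_INT = re.compile(r"0[0-7]*|[1-9][0-9]*")


def is_c_integer(text):
    """Return True if the whole string is an octal or decimal constant."""
    return C_INT.fullmatch(text) is not None


for sample in ["0", "034567", "000083", "123", "08"]:
    print(sample, is_c_integer(sample))
```

So 034567 matches, while 000083 does not because 8 is not an octal digit.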
Your regex for integer constants will not match base-10 numbers longer than two digits and octal numbers longer than three digits (2 if you don't count the leading zero). Since this is homework, I leave it up to you to figure out what's wrong with it.
Hint: Google for "regular expression repetition quantifiers".
Question 1:
Octal numbers:
A string that start with a [0] , then can be followed by any digit 1, 2, .. 7 [1-7](assuming no leading zeroes) but can also contain zeroes after the first actual digit, so [0-7]* (* is for repetition, zero or more times).
So we get the following RegEx for this part: 0 [1-7][0-7]*
Decimal numbers:
Decimal numbers must not have a leading zero, hence start with all digits from 1 to 9 [1-9], but zeroes are allowed in all other positions as well hence we need to concatenate [0-9]*
So we get the following RegEx for this part: [1-9][0-9]*
Since we have two options (octal and decimal numbers) and either one is possible we can use the Alternation property '|' :
L = 0[1-7][0-7]* | [1-9][0-9]*
Question 2:
Quickly looking at Fermat's Last Theorem:
In number theory, Fermat's Last Theorem (sometimes called Fermat's conjecture, especially in older texts) states that no three positive integers a, b, and c can satisfy the equation a^n + b^n = c^n for any integer value of n greater than two.
(http://en.wikipedia.org/wiki/Fermat%27s_Last_Theorem)
Hence the following sets where n<=2 satisfy the equation: {0, 1, 2} in base 10 = {0, 1, 10} in base 2
If any of those elements satisfy the equation, we use the Alternation | (or)
So the regular expression can be: L = 0 | 1 | 10 but can also be L = 00 | 01 | 10 or even be L = 0 | 1 | 10 | 00 | 01
Or can be generalized into:
{0} we can have infinite number of zeroes: 0*
{1} we can have infinite number of zeroes followed by a 1: 0*1
{10} we can have infinite number of zeroes followed by 10: 0*10
So L = 0* | 0*1 | 0*10
max answered the first question.
the second appears to be the unsolvable Diophantine equation of Fermat's Last Theorem. if h, i, j are non-zero integers, x can only be 1 or 2, so you're looking for
^0*10?$
does that help?
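A tiny Python check of that last expression, as a sketch (ONE_OR_TWO and is_one_or_two are names invented here): it accepts binary 1 and 10 with any number of leading zeros and rejects everything else.

```python
import re

# Binary 1 or 2 (that is, "1" or "10"), allowing leading zeros.
ONE_OR_TWO = re.compile(r"^0*10?$")


def is_one_or_two(bits):
    """Return True if bits is the binary form of 1 or 2."""
    return ONE_OR_TWO.match(bits) is not None


for sample in ["1", "10", "010", "11", "0", "100"]:
    print(sample, is_one_or_two(sample))
```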
There are several tools available to test regular expressions, such as The Regulator.
If you search for "regular expression test" you will find numerous links to online testers.