I am trying to clean up a field so that it only has integers or floating numbers.
Basically, I want the row to be blank if it contains a date or text.
This catches most things:
regex_replace("^(\d*.\d*).*","$1")
but it leaves the leading number if the row is a date (e.g. 2022 for 2022-07-01 20:30:29, or 7 for 7/1/2022).
0
0
1
15
1.8910127482598
2022-07-01 20:30:29
7/1/2022
West
Living
C000000475
1
0
0
0
How can I modify the regex so that it removes the dates as well?
TIA,
LCH
Find all numbers
^(\d*\.?\d*)$
Your regular expression uses \d*.\d*, which is probably not what you intended. The . must be escaped, otherwise it will be interpreted as "any character".
Notice I wrote \.? to find an optional decimal point. A . means "any character", not the decimal dot. We therefore escape it.
I added the $ at the end to denote "end of line".
Replacing with $1 just leaves the number. Use an empty string to remove numbers.
Find a playground on regex101 here:
https://regex101.com/r/P7jwNV/1
This is a slightly tweaked version of your expression. However, it will go through the lines and replace the numbers with themselves. How would that leave the other rows empty?
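One way to get the blank rows the question asks for is to test each row against the anchored pattern and keep the row only when it matches. A minimal Python sketch, assuming the rows are plain strings (the helper name `keep_number` is made up for illustration and is not part of the original tool):

```python
import re

# Anchored pattern from the answer: optional digits, optional dot, optional digits.
NUMBER = re.compile(r"^(\d*\.?\d*)$")

def keep_number(row: str) -> str:
    """Return the row unchanged if the pattern matches it, else an empty string."""
    return row if NUMBER.match(row) else ""

rows = ["0", "15", "1.8910127482598", "2022-07-01 20:30:29", "7/1/2022", "West", "C000000475"]
cleaned = [keep_number(r) for r in rows]
print(cleaned)
```

Because every part of the pattern is optional, it also accepts an empty string and a lone "."; a stricter variant such as \d+(\.\d+)? may be preferable if that matters.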
Remove the numbers
You say you want to remove the non-numbers; however, your regular expression is trying to find numbers and replace them with the full search result, which is the same as not doing anything.
^([^:\-\/\D]+|\d+\.\d+)$
With your examples, this will leave only the non-numbers if we replace with an empty string.
See regex101 playground here:
https://regex101.com/r/VV68Pj/1
Remove the non-numbers
Regular expressions are not designed for matching a pattern and then operating on everything that did not match. So we have to match the patterns we don't want and replace them with an empty string. We can list the kinds of non-numbers as alternatives separated by |:
^((?![\d]+).+|\d+\/\d+\/\d+|\d+-\d+-\d+ \d+:\d+:\d+)$
?! is a negative lookahead; in our case it asserts that the line does not start with a digit
(?![\d]+).+: If the line does not start with a digit...
\d+\/\d+\/\d+: Or the following is a date (I escaped / there, you may not need to)...
\d+-\d+-\d+ \d+:\d+:\d+: Or the following is a date + timestamp
We then simply replace with nothing (an empty string) to remove them.
Regex101 playground to tinker with it:
https://regex101.com/r/ZK5PDZ/1
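For readers who want to try this outside regex101, here is a small Python sketch that applies the pattern row by row. It assumes each row is a separate string and uses Python's `re` module as a stand-in for whatever tool the question uses:

```python
import re

# Pattern from the answer: a line that does not start with a digit,
# a slash-separated date, or a dash-separated date followed by a timestamp.
NON_NUMBER = re.compile(r"^((?![\d]+).+|\d+\/\d+\/\d+|\d+-\d+-\d+ \d+:\d+:\d+)$")

rows = ["0", "1.8910127482598", "2022-07-01 20:30:29", "7/1/2022", "West", "C000000475"]

# Replace every matching row with an empty string; other rows are untouched.
blanked = [NON_NUMBER.sub("", r) for r in rows]
print(blanked)
```

Note that a row such as 12abc starts with a digit and is not one of the two date shapes, so this pattern would leave it in place.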
Related
I have a simple question.
I need a regular expression to match a hexadecimal number without a colon at the end.
For example:
0x85af6b9d: 0x00256f8a ;some more interesting code
// don't match 0x85af6b9d: at all, but match 0x00256f8a
My expression for a hexadecimal number is 0[xX][0-9A-Fa-f]{1,8}
A version with (?!:) does not work, because it will just match 0x85af6b9 (because of the {1,8} token).
Using a $ isn't possible either, because there can be more than one number.
Thanks!
Here is one way to do so:
0[xX][0-9A-Fa-f]{1,8}(?![0-9A-Fa-f:])
See the online demo.
We use a negative lookahead to match all hexadecimal numbers without : at the end. Because of {1,8}, it is also necessary to ensure that the entire hexadecimal number is correctly matched. We therefore reuse the character set ([0-9A-Fa-f]) to ensure that the number does not continue.
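The difference between the two lookaheads can be checked with a short Python sketch. It uses the sample line from the question; the variable names are illustrative only:

```python
import re

line = "0x85af6b9d: 0x00256f8a ;some more interesting code"

# Lookahead that only forbids a colon: the engine backtracks by one hex digit.
naive = re.findall(r"0[xX][0-9A-Fa-f]{1,8}(?!:)", line)

# Lookahead that also forbids a further hex digit, so backtracking cannot help.
strict = re.findall(r"0[xX][0-9A-Fa-f]{1,8}(?![0-9A-Fa-f:])", line)

print(naive)   # ['0x85af6b9', '0x00256f8a']
print(strict)  # ['0x00256f8a']
```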
I would like to use a regular expression in replace() to format currency input with optional length of digits before the decimal point and only one or two digits after the decimal point.
Basically, it should match something like this 9999999,00 or this 9999999.00 and not this 9999999,000.
I have the following regexp:
value.replace(/[^\d*((\.|\,)\d(\d)?)?$]/, "")
But it doesn't work, as it allows digits, . or , in any order, instead of the given format.
Can I put this string ^\d*((\.|\,)\d(\d)?)?$ inside the square brackets [^] to match any characters outside the format? Or maybe there is another way to fix it?
EDIT: I'm going to use it with the react-final-form parse feature, to allow only input in the given format and delete all other characters. Here is my CodeSandbox: https://codesandbox.io/s/react-final-form-simple-example-o9jub?fontsize=14
Check this pattern:
^([1-9]\d*|0?)([,\.]\d{2})?$
Demo with some samples: here
You have a couple of mistakes in your expression:
I'm not sure why you added the square brackets [^...] to the expression - characters within square brackets are treated individually and not as a whole expression.
Note that parentheses (...) create a capturing group, which captures the part of the string matched inside them.
Try this expression:
\d+[\.,]\d{1,2}
The \d+ validates that you have at least 1 digit before the decimal point.
[\.,] is just a more compact way to match either separator.
\d{1,2} for 1 or 2 digits after the decimal point.
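As written, the expression has no anchors, so it can also match inside a longer string. A minimal Python sketch, assuming whole-string validation is what is wanted (the helper name `is_currency` is hypothetical and `fullmatch` is my addition, not part of the answer):

```python
import re

# Expression from the answer: digits, a dot or comma, then one or two digits.
CURRENCY = re.compile(r"\d+[.,]\d{1,2}")

def is_currency(value: str) -> bool:
    """True only when the whole value has the digits + separator + 1-2 digits shape."""
    return CURRENCY.fullmatch(value) is not None

for sample in ["9999999,00", "9999999.00", "9999999,000", "9999999"]:
    print(sample, is_currency(sample))
```

Without anchoring, a search would still find 9999999,00 inside 9999999,000, and a value with no decimal part is rejected because the separator is mandatory in this expression.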
I'm trying to detect a price in regex with this:
^\-?[0-9]+(,[0-9]+)?(\.[0-9]+)?
This covers:
12
12.5
12.50
12,500
12,500.00
But if I pass it
12..50 or 12.5.0 or 12.0.
it still returns a match on the 12. I want it to reject the entire string and return no match at all if there is more than one period anywhere in the string.
I've been trying to get my head around negative lookaheads for an hour and have searched on Stack Overflow but can't seem to find the right answer. How do I do this?
What you are looking for, is this:
^\d+(,\d{3})*(\.\d{1,2})?$
What it does:
^ Start of Line
\d+ one or more Digits followed by
(,\d{3})* zero, one or more times a , followed by three Digits followed by
(\.\d{1,2})? one or zero . followed by one or two Digits followed by
$ End of Line
This will only match valid prices. The comma (,) is optional in this regex, but it will be matched when present.
Look here: http://www.regextester.com/?fam=98001
If you work with prices and want to store them in a database, I recommend saving them as INT. So 1,234.56 becomes 123456 and 1,234 becomes 123400. After you have matched a valid price, all you have to do is remove the commas, split the value at the dot, and pad the value of [1] on the right with zeros using str_pad() (STR_PAD_RIGHT). This makes calculations easier, especially when you work with JavaScript or other languages.
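The validate-then-convert steps can be sketched in Python as follows. This is only an illustration: the function name `to_cents` is hypothetical, and `str.ljust` stands in for PHP's str_pad() with STR_PAD_RIGHT.

```python
import re
from typing import Optional

# Price pattern from the answer: digits, optional ",ddd" groups, optional 1-2 decimals.
PRICE = re.compile(r"^\d+(,\d{3})*(\.\d{1,2})?$")

def to_cents(price: str) -> Optional[int]:
    """Return the price as an integer number of cents, or None if it is not valid."""
    if not PRICE.fullmatch(price):
        return None
    # Drop the thousands separators, then split at the decimal point.
    whole, _, fraction = price.replace(",", "").partition(".")
    # Right-pad the fraction with zeros to exactly two digits.
    return int(whole + fraction.ljust(2, "0"))

for sample in ["1,234.56", "1,234", "12.5", "12..50"]:
    print(sample, to_cents(sample))
```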
Your regex:
^\-?[0-9]+(,[0-9]+)?(\.[0-9]+)?
Note: The modified regex below does not match 12 (without "."). Since there is no ? quantifier after the (?:\.[0-9]+) group, a literal . followed by digits is required.
While there are multiple ways to solve this and the most "correct" answer will depend on your specific requirements, here's a regex that will not match 12..1, but will match 12.1:
(^\-?[0-9]+(?:,[0-9]+)?(?:\.[0-9]+))+
I surrounded the entire regex you provided in a capturing group (...), and added a one or more quantifier + at the end, so that the entire regex will fail if it does not satisfy that pattern.
Also (this may or may not be what you want), I modified the inner groups into non-capturing groups (?: ... ) so that it does not return unnecessary groups.
This site offers a deconstruction of regexes and explains them:
For the regex provided: https://regex101.com/r/EDimzu/2
Unit tests: https://regex101.com/r/EDimzu/2/tests (Note the 12 one's failure for multiple languages).
You can limit it by requiring there is only 0 or 1 periods like this:
^[0-9,]+[\.]{0,1}?[0-9,]+$
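A quick way to see how this last pattern behaves on the inputs from the question is the Python sketch below. The sample list is taken from the question, plus a single digit that I added to show one caveat:

```python
import re

# Pattern from the answer: digits/commas, at most one period, digits/commas.
SINGLE_PERIOD = re.compile(r"^[0-9,]+[\.]{0,1}?[0-9,]+$")

samples = ["12", "12.5", "12.50", "12,500", "12,500.00", "12..50", "12.5.0", "12.0.", "5"]
results = {s: bool(SINGLE_PERIOD.match(s)) for s in samples}

for sample, ok in results.items():
    print(sample, ok)
```

The strings with more than one period are rejected, but so is a single digit such as 5, because both [0-9,]+ parts need at least one character each. The character class also lets commas appear anywhere, so the pattern does not check thousands grouping.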
I'm searching for a regex that keeps all numbers with a length of 10-13 digits and deletes the rest in Notepad++.
My regex doesn't work:
[^\d{10,13}]
it finds numbers with commas too :(
Searching for
^(?:.*?(\d{10,13}).*|.*)$
and replacing with
\1
you keep just the 10 to 13 digit long numbers (and empty lines).
Remove the empty lines searching for
^\n
and replacing with nothing.
See it in action: RegEx101.
Addressing @WiktorStribiżew's comments: if we rely on the sought-after numbers always being surrounded by whitespace (which has been confirmed with the OP, though not for the potential case of lines that effectively hold just numbers), the search expression could be adjusted to
^(?:.*\s(\d{10,13})\s.*|.*)$
still replacing with
\1
to handle comma holding strings of numbers correctly: RegEx101
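The same two-step idea (replace each line with group 1, then drop the empty lines) can be tried outside Notepad++ with a small Python sketch. It uses the first search expression and made-up sample lines, and it assumes Python 3.5 or later, where an unmatched group in the replacement becomes an empty string:

```python
import re

# First search expression from the answer; group 1 holds the 10-13 digit run, if any.
KEEP = re.compile(r"^(?:.*?(\d{10,13}).*|.*)$")

lines = [
    "order 1234567890 shipped",
    "total 1,234,567",
    "no digits here",
    "id 1234567890123 ok",
]

# Step 1: replace each line with group 1. Step 2: drop the lines that ended up empty.
kept = [s for s in (KEEP.sub(r"\1", line) for line in lines) if s]
print(kept)
```

With this first expression, a run of more than 13 digits still yields its first 13 digits, which is one reason the whitespace-delimited variant may be the safer choice.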
By the way:
[^\d{10,13}]
is a character class, which matches any single character that is not:
a digit, or
any character out of "{10,3}" (without the quotes, but including the curly braces).
Please comment if and as this requires adjustment / further detail.
To match numbers that are not 10 to 13 digits long:
\b(\d{1,9}|\d{14,})\b
You can find all 10-13 length stand alone digits like this
(?<!\d)\d{10,13}(?!\d)
What you do then is up to you.
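As an illustration of the lookaround version, here is a short Python sketch that collects the stand-alone 10 to 13 digit runs from a made-up sample string:

```python
import re

# A run of 10-13 digits that is neither preceded nor followed by another digit.
STANDALONE = re.compile(r"(?<!\d)\d{10,13}(?!\d)")

text = "12345 1234567890 12345678901234 1,234,567,890 9876543210123"
found = STANDALONE.findall(text)
print(found)  # ['1234567890', '9876543210123']
```

The 14-digit run is skipped entirely rather than truncated, because the lookarounds forbid a digit on either side of the match.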
I don't know how Notepad++ works, but I think this is the regex you are looking for: ^([0-9]){10,13}$
A good page to create/test regex: http://regexr.com/
I'm having trouble finding the right regex for decimal numbers which include the comma separator.
I did find a few other questions regarding this issue in general, but none of the answers really worked when I tested them.
The best I got so far is:
[0-9]{1,3}(,([0-9]{3}))*(.[0-9]+)?
2 main problems so far:
1) It records numbers with spaces between them "3001 1" instead of splitting them to 2 matches "3001" "1" - I don't really see where I allowed space in the regex.
2) I have a general problem with the beginning/ending of the regex.
The regex should match:
3,001
1
32,012,111.2131
But not:
32,012,11.2131
1132,012,111.2131
32,0112,111.2131
32131
In addition I'd like it to match:
1.(without any number after it)
1,(without any number after it)
as 1
(a comma or point at the end of the number should be overlooked).
Many Thanks!
This is a very long and convoluted regular expression that fits all your requirements. It will work if your regex engine is based on PCRE (hopefully you're using PHP, Delphi or R).
(?<=[^\d,.]|^)\d{1,3}(,(\d{3}))*((?=[,.](\s|$))|(\.\d+)?(?=[^\d,.]|$))
DEMO on RegExr
The things that make it so long:
Matching multiple numbers on the same line separated by only 1 character (a space) whilst not allowing partial matches requires a lookahead and a lookbehind.
Matching numbers ending with . and , without including the . or , in the match requires another lookahead.
(?=[,.](\s|$)) Explanation
When writing this explanation I realised the \s needs to be a (\s|$) to match 1, at the very end of a string.
This part of the regex is for matching the 1 in 1, or the 1,000 in 1,000. So let's say our number is 1,000. (with the . on the end).
Up to this point the regex has matched 1,000; then it can't find another , to repeat the thousands group, so it moves on to our (?=[,.](\s|$))
(?=....) means it's a lookahead: from where we have matched up to, look at what's coming but don't add it to the match.
So it checks whether there is a , or a . and, if there is, that it's immediately followed by whitespace or the end of input. In this case it is, so it leaves the match as 1,000
Had the lookahead not matched, it would have moved on to trying to match decimal places.
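Python's `re` module rejects the variable-width lookbehind (?<=[^\d,.]|^), so the sketch below is an adaptation rather than the answer's exact expression: the lookbehind and the final lookahead are rewritten as the negative forms (?<![\d,.]) and (?![\d,.]), and the groups are made non-capturing so that `findall` returns whole matches. The sample text is made up from the cases in the question.

```python
import re

NUMBER = re.compile(
    r"(?<![\d,.])"            # not preceded by a digit, comma or dot
    r"\d{1,3}(?:,\d{3})*"     # 1-3 digits, then any number of ",ddd" groups
    r"(?:"
    r"(?=[,.](?:\s|$))"       # a trailing , or . before whitespace/end: stop here
    r"|"
    r"(?:\.\d+)?(?![\d,.])"   # optional decimals, then no digit, comma or dot
    r")"
)

text = "3,001 1 32,012,111.2131 32,012,11.2131 32131 1,000."
found = NUMBER.findall(text)
print(found)
```

The badly grouped 32,012,11.2131 and the comma-less 32131 produce no match at all, and the trailing dot after 1,000 is left out of the match.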
This works for all the ones that you have listed
^[0-9]{1,3}(,[0-9]{3})*(([\\.,]{1}[0-9]*)|())$
. means "any character". To use a literal ., escape it like this: \..
As far as I know, that's the only thing missing.