JSP input validate amount - possibly with regex

This is probably a simple question that has been solved many times. I am new to front end dev, so struggling with the validation part. I have a currency input that I used the following statement in JavaScript to only allow numbers. Can I just edit this or add a line to also only allow two decimals as you type?
$("input#amountToSave").on("blur keyup", function() {
this.value=this.value.replace(/[^0-9.]+/,'');
});

You can try something like this
^\$?([1-9]{1}[0-9]{0,2}(\,[0-9]{3})*(\.[0-9]{0,2})?|[1-9]{1}[0-9]{0,}(\.[0-9]{0,2})?|0(\.[0-9]{0,2})?|(\.[0-9]{1,2})?)$
Many currency expressions allow leading zeros, so $01.40 passes through them. This expression rejects them, except for a 0 in the ones column. It works with or without commas and/or a dollar sign. Decimals are not mandatory, unless there is no digit in the ones column and a decimal point is present. It allows $0.00 and .0
E.g.,
$1,234.50 | $0.70 | .7
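A minimal sketch of how the expression above could be tried out in JavaScript. The helper name isValidAmount and the sample list are assumptions for illustration only. Note that this pattern validates a complete string; it does not filter characters while the user types. Because every part of it is optional, the empty string also passes.

```javascript
// The pattern from the answer above, unchanged.
const CURRENCY_PATTERN = /^\$?([1-9]{1}[0-9]{0,2}(\,[0-9]{3})*(\.[0-9]{0,2})?|[1-9]{1}[0-9]{0,}(\.[0-9]{0,2})?|0(\.[0-9]{0,2})?|(\.[0-9]{1,2})?)$/;

// Hypothetical helper: true when the whole string is an accepted amount.
function isValidAmount(text) {
  return CURRENCY_PATTERN.test(text);
}

// Print each sample next to the validation result.
const samples = ["$1,234.50", "$0.70", ".7", "$01.40", "1.234", ""];
for (const sample of samples) {
  console.log(JSON.stringify(sample), isValidAmount(sample));
}
```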

Okay, I think I see the problem. I need an inverted regex; it can be a simple one for the two decimals, since for now I don't need anything complex and no currency symbol is necessary. While reading up I noticed that my code tests whether a character is not 0-9 (or a dot) and, if so, replaces it with an empty string. So I need to extend that regex so that a third decimal digit is also replaced with an empty string.
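One possible way to do this, sketched under assumptions: instead of a single inverted regex, strip the disallowed characters first and then cut the fractional part to two digits. The function name sanitizeAmount is hypothetical and not part of the original code.

```javascript
// Hypothetical helper: keep digits, at most one decimal point,
// and at most two digits after that point.
function sanitizeAmount(raw) {
  // Remove every character that is not a digit or a dot.
  let cleaned = raw.replace(/[^0-9.]/g, "");
  const firstDot = cleaned.indexOf(".");
  if (firstDot !== -1) {
    const intPart = cleaned.slice(0, firstDot);
    // Drop any further dots, then keep only the first two decimals.
    const fracPart = cleaned.slice(firstDot + 1).replace(/\./g, "").slice(0, 2);
    cleaned = intPart + "." + fracPart;
  }
  return cleaned;
}
```

Inside the existing "blur keyup" handler this could be used as this.value = sanitizeAmount(this.value);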

Replace trailing ".1" to ".2"

I am assuming you would need a regex for this. The best I could come up with is
=REGEXREPLACE(C2, "\.(?=[^.]*$)", ".2")
but it only detects the last period, and Google Sheets returns #REF!
Other ways, such as directly changing the cells C2:C5, are also welcome.
You can check whether the last two characters are equal to .1: take two characters from the right and test for equality.
RIGHT(A1,2)=".1"
Then, to convert matching values, you can slice off the last two chars (length-2) and append the .2
LEFT(A1,LEN(A1)-2)&".2"
All together
=IF(RIGHT(A1,2)=".1",LEFT(A1,LEN(A1)-2)&".2",A1)
If you actually want to increment arbitrary values (and not just .1), you can skip the equality check and add 0.1 to the trailing part instead
=LEFT(C3,LEN(C3)-2)&((RIGHT(C3,2)+0.1)&"")
If the trailing part has more than a single digit, extract it into an intermediate column so you can use its length to add the right power of ten (.5+0.1, .993+0.001, etc.) and to exclude the right number of characters when appending.
If you want a full version parser, consider VBA or passing the column to a more practical language
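As a rough illustration of the "more practical language" route, here is a JavaScript sketch. Both function names are hypothetical. The second function treats the value as a version-like string and increments its last numeric component as an integer, which is an assumption about the intent and differs from adding 0.1 arithmetically.

```javascript
// Replace a trailing ".1" with ".2"; any other value is returned unchanged.
function replaceTrailingOne(value) {
  return value.replace(/\.1$/, ".2");
}

// Assumed generalization: increment the last run of digits as an integer,
// so "1.9" becomes "1.10" (version-style), not "2.0".
function bumpLastComponent(value) {
  return value.replace(/(\d+)$/, (digits) => String(Number(digits) + 1));
}
```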

Regex to extract UK Currency including £ symbol and Pence (p)

I am fairly new to RegEx and have had a search around online but am unable to find a regex that fits my requirements.
The ultimate aim is to search a string of text and extract the lowest monetary amount. However, as the string may contain more than one £ amount, I'm happy for a regex to just extract all the monetary values it can find; I can then write a calculation to return the lowest amount.
The string may have numbers that are not monetary values / numerous amounts, therefore the regex should always look for a £ symbol first OR it could end with a "p" or "P" to signify pence. For example "I need 2 of these at £10 each and one of those at 50p" - should return 10.00 & 0.50 - I can then calculate that 0.50 is the lowest amount.
As people also write their amounts in various ways, I need the regex to be able to spot different patterns - including the "," for every thousand. All below values should be valid:
£0
£0.00
£0.00p
£0000
£0000.00
£0000.00p
£0,000
£0,000.00
£0,000.00p
0p
Hopefully someone may be able to advise the best way to approach this.
Thanks
This works on your data set:
(?=^£|.*p$)£?\d*(?:,\d{3})*(\.\d{2})?p?
But it may improperly match some edge cases as well because everything is optional...
https://regex101.com/r/WptUn6/3
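For searching inside a longer sentence rather than matching one amount per line, a separate sketch in JavaScript follows. The pattern here is not the one from the answer: it is an assumed variant with two alternatives (a £ amount, or a whole number followed by p), and the function names are hypothetical. It will not cover every way people write amounts.

```javascript
// Extract all amounts from free text and return them in pounds.
function extractAmounts(text) {
  // Alternative 1: "£" followed by digits/commas and optional two decimals.
  // Alternative 2: a whole number followed by "p" or "P" (pence).
  const pattern = /£(\d[\d,]*(?:\.\d{2})?)p?|\b(\d+)p\b/gi;
  const amounts = [];
  for (const match of text.matchAll(pattern)) {
    if (match[1] !== undefined) {
      // Pound amount: remove thousands separators before parsing.
      amounts.push(parseFloat(match[1].replace(/,/g, "")));
    } else {
      // Pence amount: convert to pounds.
      amounts.push(parseInt(match[2], 10) / 100);
    }
  }
  return amounts;
}

// Return the lowest amount found, or null when there is none.
function lowestAmount(text) {
  const amounts = extractAmounts(text);
  return amounts.length > 0 ? Math.min(...amounts) : null;
}
```

With the sentence from the question, extractAmounts finds 10 and 0.5, and lowestAmount returns 0.5.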

RegEx to clean VISA merchant names (remove random strings)

I am trying to develop a regex (.NET flavor) that I can use to clean VISA merchant names.
Examples:
Norton *AP1223506209 --> Norton *AP
Norton *AP1223511428
EUROWINGS VYJD6J_123001 --> EUROWINGS
EUROWINGS W6PDFI_125626
AER LINGUCB22QKM2 --> AER LINGUCB
AER LINGUCB248L2W
AIR FRANCE JWNCSC --> AIR FRANCE
AIR FRANCE K8L7TT
PAYPAL *AIRBNB HMQXBW --> PAYPAL *AIRBNB
PAYPAL *AIRBNB HMQXNZ
SAS 1174565172360 --> SAS
SAS 1174565172368
I would like to keep the first "name" part, but remove the second "gibberish" part.
The following regex works for Norton and Aer Lingu as well as for Eurowings and Air France, if they contain numbers in the gibberish part. It fails completely for PAYPAL *AIRBNB and other strings that don't contain any numbers in the gibberish part, and also for SAS, probably because the name is too short or there are too many spaces:
Search:
([A-z *-]{2,50}[A-z]{2,50})(.{0,3}([0-9-]{0,3}[A-z *+.#-/]{0,3}){1,10})
Replace:
$1
Is there any way to make this work for gibberish parts that don't contain numbers? I have something like this in mind, but I can't manage to create a corresponding regex:
Group 1 (to keep)
Must contain consonants and vowels
Can contain few numbers, spaces or punctuation signs (e.g.: "7x7: Taxi Service")
Group 2 (to be removed)
Consists of sequences of numbers, letters and optional punctuation signs
OR: consists of consonants, only
OR: consists of numbers, only
Thanks for any help and best regards
Pesche
Edit:
If I add more examples, Linden's solution still works quite well, but it does not recognize all of the examples, or in some cases it removes too much of the string. I tried to adjust it, but with my limited skills I didn't quite succeed:
https://regex101.com/r/7y9zGl/4
The following problems remain:
with a length of 6 for the last \w, longer patterns would not be matched in full length (e.g. after easyjet and after EMP Merchan). Increasing it, however, causes other strings to be truncated (e.g. AER LINGU, potentially also HOTELS.COM if > 12 was used).
The merchant names after PAYPAL * and GOOGLE * should not be deleted, as they are true merchant names. I tried to exclude strings containing GOOGLE * with a negative lookbehind, but it does not seem to work like that.
Whereas the merchant name after PAYPAL * should generally remain, in some cases it is followed by gibberish, e.g. PAYPAL *AIRBNB HMQXBW. If the negative lookbehind worked, those cases would no longer be cleaned.
if the merchant name is not followed by gibberish, part of the name itself may be deleted (e.g. EMP Merchan)
As the full list of merchant names is long and versatile, the approach to detect "gibberish" should be as generic as possible (i.e. not rely on a certain length of the gibberish part). Hence my original, now slightly modified "pattern":
Consists of sequences of numbers, letters and optional punctuation signs
OR: consists of no or very few vowels (EASYJET 000ESJ5TWN -> the gibberish contains only one vowel, EASYJET 3 of them; PAYPAL *NITSCHKE -> NITSCHKE should not be matched, it contains 2 vowels)
OR: consists of numbers, only
Is such a thing even possible? The goal is to use SQL to clean the merchant names. If necessary, this can be done in several run throughs (for different kind of patterns).
Thx again!
Updated regex based on extended sample and desired results:
[\s*<]+\d+$|[\s*<]+(?![A-Z]{6}.*)\w*\d[\w>]*$|\d{6,}$|[\s*<]+[A-Z]{6}$|(?![A-Z]+$)(?<=[A-Z])\w{6}$
I cannot validate as I'm only on my phone, but can you try something like this?
^([0-9A-Za-z\*][ ]{0,2})
This takes all the digits, the letters (upper and lower case), the star and at most 2 spaces from the beginning of the line.
Please check the parentheses, but I think the idea is there.
Sorry, it seems to be wrong when there is no double space.
According to your examples, you want to take all the characters up to 2 spaces or 2 digits.
.* {2}|.*[0-9]{2}
Is it better?
Regards,
Thomas
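The "gibberish" criteria listed in the question (contains digits, or has almost no vowels) can also be sketched as a token-level heuristic instead of a single regex. The following JavaScript is only an illustration under assumptions: the function names, the vowel threshold and the minimum length of 5 are made up for this sketch, only the last whitespace-separated token is examined, and real names with few vowels would be removed by mistake.

```javascript
// Count the vowels (a, e, i, o, u) in a string, ignoring case.
function countVowels(s) {
  const found = s.match(/[aeiou]/gi);
  return found ? found.length : 0;
}

// Hypothetical cleaner: inspect only the last token of the name.
function cleanMerchantName(name) {
  const tokens = name.trim().split(/\s+/);
  if (tokens.length < 2) return name.trim();
  let last = tokens.pop();
  const firstDigit = last.search(/\d/);
  if (firstDigit !== -1) {
    // Token contains digits: cut at the first digit and keep the
    // head only if it still contains at least one vowel.
    const head = last.slice(0, firstDigit);
    last = countVowels(head) > 0 ? head : "";
  } else {
    // No digits: drop the token if it is long and has at most one vowel.
    const letters = last.replace(/[^a-z]/gi, "");
    if (letters.length >= 5 && countVowels(letters) <= 1) last = "";
  }
  if (last) tokens.push(last);
  return tokens.join(" ");
}
```

On the examples from the question this keeps "Norton *AP", "AER LINGUCB", "PAYPAL *AIRBNB" and "PAYPAL *NITSCHKE", and reduces the SAS and EUROWINGS entries to the bare name. Since the stated goal is SQL, the same rules would have to be expressed with the string functions of the target database.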

regex for a real number in flex ignoring leading zeros

I have the following sets:
NUMBER [0-9]+
DECIMAL ("."{NUMBER})|({NUMBER}("."{NUMBER}?)?)
REAL {DECIMAL}([eE][+-]?{NUMBER})?
and I want my lexer to accept real numbers like:
0.002 or 0.004e-10 or .01
The problem is that I want it to ignore the leading zeros but keep the rest of the number. For example:
when I give 000.0002 I want to keep 0.0002, and when I give 0.2e-0100 I want to keep 0.2e-100.
So I was thinking of something like the atof function, but I do not know exactly how to do it.
Any thoughts?
Thanks in advance
lex will return the complete token that your pattern matches as one string. You cannot change that. At the expense of considerable complexity you could use start conditions to match a leading zero (which may be the only digit), and collect tokens for the pieces, e.g.,
0.2e-0100
as
0.2e-
0
100
and glue the first/last tokens together but you would find it much simpler to develop your own string function which filters out the unwanted leading zeroes.
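The string-function approach can be sketched as follows. This is written in JavaScript purely for illustration (in a flex action the same logic would be applied to the matched token text in C); the function name stripLeadingZeros is an assumption. It splits the token into integer part, fraction and exponent, and trims leading zeros from the integer part and the exponent digits while always leaving at least one digit.

```javascript
// Hypothetical helper: remove leading zeros from the integer part and
// from the exponent of a real-number token such as "000.0002" or "0.2e-0100".
function stripLeadingZeros(token) {
  const m = /^(\d*)((?:\.\d*)?)(?:([eE][+-]?)(\d+))?$/.exec(token);
  if (m === null) return token; // not a number of the expected shape
  const [, intPart, fracPart, expPrefix, expDigits] = m;
  // Drop zeros only while another digit follows, so "000" becomes "0".
  const trim = (digits) => digits.replace(/^0+(?=\d)/, "");
  let result = trim(intPart) + fracPart;
  if (expPrefix !== undefined) result += expPrefix + trim(expDigits);
  return result;
}
```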

Extract numbers out of text with inconsistent linebreaks

I have text with 6 numbers typically stored in one line
SomeData\n0.00 0.00 0.00 31,570.07 0.00 31,570.07\nSomeData
SomeData\n0.00 0.00 0.00 485,007.24 0.00 485,007.24\nSomeData
This regex worked fine on it:
\n[0-9,.-]* [0-9,.-]* [0-9,.-]* [0-9,.-]* [0-9,.-]* [0-9,.-]*\n
I noticed that every once in a while I get this:
SomeData\n0.00 0.00 10,921,594\n.89\n-\n9,563,271.0\n6\n0.00 1,358,323.83\nSomeData
Note how the linebreaks are randomly inserted after a sign or between numbers as if the system stored the values without filtering linebreaks.
I am struggling to get this extracted. I tried various expressions but my more successful one was [0-9,.-][\n]{0,1}[0-9,.-][ ]{0,1} to match an individual number.
What expression can I use to match both variations of the number format, preferably already stripping out the inconsistent line breaks?
Update: Going with
[-\n]{0,2}[0-9,]+[\n.0-9]{3,4}[\n ]{0,1}
Please let me know if there's a better way.
One way is to write an exact description of what constitutes a number; in your case [-+]?[0-9]+[0-9,]*(?:\.[0-9]+)? would do the trick. This helps because the search then knows where a number starts and where it ends (a sign can only appear at the start, a dot cannot appear more than once, and so on). Next, you want to match groups of six numbers delimited by either a newline or a space, so wrap the pattern in a capture group and repeat it exactly six times: (...[ \n]*){6,6}. This helps because the regex engine can then work out by backtracking what to treat as a number, since it knows how many it has to match. Finally, you want to allow newlines in pretty much any position, so add the newline to each character class. You might also want to anchor the numbers on both sides, but this is not necessary, because the regex engine will now try to identify valid tuples of 6 numbers. The end result is:
SomeData\n([-+]?[0-9\n]+[0-9,\n]*(?:\.[0-9\n]+)?[ \n]){6,6}SomeData
This will find tuples of 6 numbers no matter where the enters are. Here is an example: https://regex101.com/r/jD5nT8/1
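Once the block of six numbers has been located, the individual values still have to be separated. A possible follow-up step, sketched in JavaScript under a strong assumption: every value has exactly two decimal places, as in the samples shown. This is a different technique from the regex above. It removes all newlines first and then uses the fixed two decimals to tell where each number ends. The function name extractSixNumbers is hypothetical.

```javascript
// Assumes each value ends in exactly two decimals (true for the samples).
function extractSixNumbers(block) {
  // Remove the stray line breaks; this may glue neighbouring numbers together.
  const joined = block.replace(/\n/g, "");
  // The fixed ".dd" ending marks where each number stops.
  const matches = joined.match(/[-+]?\d[\d,]*\.\d{2}/g) || [];
  if (matches.length !== 6) return null; // unexpected shape, give up
  // Drop thousands separators and convert to numbers.
  return matches.map((s) => parseFloat(s.replace(/,/g, "")));
}
```

The input is expected to be only the text between the two SomeData markers.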