Money amount regex currency name after the amount - regex

I am trying to create a regular expression that matches money amounts (various currencies, either in front of or after the given amount; the decimals are separated by a dot or by a comma).
This is what I've got so far:
\$[0-9.,]+|\£[0-9.,]+|\€[0-9.,]+
However, if I put currencies in the square brackets together with the other signs, it does not work as I expect it to (it still doesn't match 20,000$, only $20,000 and I want it to match both).
Can you tell me how I can modify my regex so that it also matches the amounts with the currency after the digits?
Also, is the only way to include more than one currency in the regex to separate them with a pipe and rewrite the same regular expression over and over again?

Updated:
This regex should match numbers with zero or more comma-separated digit groups and an optional decimal part:
(?:\d{1,3},)*\d{1,3}(?:\.\d+)?
For your use-case you should be happy with this regex:
[\$£€](?:\d{1,3},)*\d{1,3}(?:\.\d{1,2})?|(?:\d{1,3},)*\d{1,3}(?:\.\d{1,2})?[\$£€]
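To avoid rewriting the number part for each alternative, the pattern can be assembled from two shared fragments. This is a minimal Python sketch, not part of the original answer; the names CURRENCY, NUMBER, MONEY and find_amounts are made up for illustration.

```python
import re

# Shared building blocks (hypothetical names), so the number part is written once.
CURRENCY = r"[$£€]"
NUMBER = r"(?:\d{1,3},)*\d{1,3}(?:\.\d{1,2})?"

# "symbol then amount" OR "amount then symbol".
MONEY = re.compile(rf"{CURRENCY}{NUMBER}|{NUMBER}{CURRENCY}")


def find_amounts(text):
    """Return every money amount found in text, in order of appearance."""
    return MONEY.findall(text)


if __name__ == "__main__":
    print(find_amounts("Pay $20,000 or 20,000$ or €1,234.56 today"))
```

Adding another currency then only means extending the CURRENCY character class; the alternation itself still has to list both orders.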
Legacy answer
There is no way in regular expressions (at least that I know of) that would allow you to swap the order of two groups of characters, thus you'll have to specify it like "AB or BA".
I hope this one works for you:
[\$\£\€]\d+(?:[.,]\d+)?|\d+(?:[.,]\d+)?[\$\£\€]
The \d+(?:[.,]\d+)? part could be simplified back to [\d.,]+. The simplest form of the regex (with a lot of information lost) is this:
[\$£€]?[\d.,]+[\$£€]?
... but that allows a lot of erroneous inputs, like 20.$ or $.,€ or simply 5.
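The difference between the loose and the stricter pattern can be checked quickly. The following Python sketch is only an illustration; LOOSE, STRICT and classify are hypothetical names, and fullmatch is used so that the whole sample has to match.

```python
import re

# The simplest form from above: accepts almost anything made of digits, dots and commas.
LOOSE = re.compile(r"[$£€]?[\d.,]+[$£€]?")
# The "AB or BA" form from above: requires digits and exactly one currency symbol.
STRICT = re.compile(r"[$£€]\d+(?:[.,]\d+)?|\d+(?:[.,]\d+)?[$£€]")


def classify(sample):
    """Return (loose_ok, strict_ok) for a whole-string match of sample."""
    return (LOOSE.fullmatch(sample) is not None,
            STRICT.fullmatch(sample) is not None)


if __name__ == "__main__":
    for sample in ["$20.50", "20.$", "$.,€", "5"]:
        print(sample, classify(sample))
```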


Using Regex to find repeating groups in phone numbers

I'm looking for a way to use regex to search for obviously false phone numbers that have the same digit repeating. The numbers are all formatted and stored as follows:
(111)111-1111
I'm not able to alter the text in any way.
I've tried modifying a few of the regex lines I've seen such as:
^([0-9])\1{2}.\1{3}.\1{4}$
which was for finding repeating digits with a period in between the numbers. However, I haven't figured out how to get around the first character as a parenthesis.
Any help would be appreciated!
You misunderstand the purpose of the dot operator (.). It does not match a period specifically; it matches any single character. In that (quite bad) regex, it serves only to skip the -, and because it matches anything, the regex will also match something like 111211131111.
Use this regexp instead:
^\(?([0-9])\1{2}\)?\1{3}-?\1{4}$
This checks for optional parentheses around the first group, so it still works without them, and specifically checks for a dash between the second and third groups of digits, which is also optional.
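As a quick check, the pattern can be wrapped in a small Python helper. This is a sketch only; REPEATED and is_repeated_digit_number are illustrative names, not something from the question.

```python
import re

# Group 1 captures the first digit; every \1 must repeat that same digit.
REPEATED = re.compile(r"^\(?([0-9])\1{2}\)?\1{3}-?\1{4}$")


def is_repeated_digit_number(phone):
    """True if all ten digits of the phone number are the same digit."""
    return REPEATED.match(phone) is not None


if __name__ == "__main__":
    for phone in ["(111)111-1111", "(111)211-1111", "(123)456-7890"]:
        print(phone, is_repeated_digit_number(phone))
```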

Intelligent RegEx Replacement

I'm setting up a system to parse a string with very specified syntax and fix user errors. For example, the syntax requires dates in a m/d/yy format (no leading 0s), so I need to make the following substitutions:
10/01/13 -> 10/1/13
10/10/13 -> no change
10/1/13 -> no change
01/10/13 -> 1/10/13
I have a lot of rules like this by which I need to find portions of a string and fix those portions. I can use RegEx to identify what needs to be corrected easily. For an easier example, I want to find CBUx[2-9], but then I need to replace it with something like CBU x [2-9] (spaces around x if preceded by CBU and followed by a digit). Example:
input text: "blah blah CBUx3"
matched: "CBUx3"
replace: "CBU x 3"
output text: "blah blah CBU x 3"
Is this possible? Note that I'm fully aware I could write code to find the slashes and digits. I'm specifically trying to do this with an "intelligent RegEx replace". I have a lot of different types of corrections that I can match with RegEx, and I would like to avoid writing specific correction procedures for each.
Maybe something like that for the leading zeroes:
\b0+([1-9])
And replace with $1 (or \1 depending on the language, though \1 is less common nowadays).
But something a bit better might be with the use of a negative lookbehind:
(?<![.,])\b0+([1-9])
That way, the zeroes in 10,001.002 are not stripped to give 10,1.2.
The word boundary, \b, makes sure that the zeroes (one or more) are at the beginning of the number, and the negative lookbehind handles decimals and thousands separators, assuming that you have floating-point numbers in the string. Note, however, that this will prevent the removal of zeroes in a date format such as 11.01.13. A more complex regex can be built on the assumption that such a date always has at least one digit after a second dot (itself after 2 digits, since days and months take at most 2 digits) without encountering anything other than digits, which makes the regex look like...
(?<![.,](?![0-9]{2}\.[0-9]))\b0+([1-9])
And which renders to something like this.
For the CBUx[2-9], you can use a capture group as well:
CBUx([2-9])
And replace with: CBU x $1 (or \1)
There might be some tweaks I didn't consider for the leading zero removal part, but that's what I can think about right now.
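Since the question mentions many such corrections, one way to avoid a separate procedure for each is a table of (pattern, replacement) pairs applied in order. The Python sketch below assumes that approach; RULES and fix are hypothetical names, and only the two rules discussed above are included.

```python
import re

# Each rule is (compiled pattern, replacement). Python uses \1 for group 1.
RULES = [
    # Strip leading zeroes, unless they directly follow '.' or ','.
    (re.compile(r"(?<![.,])\b0+([1-9])"), r"\1"),
    # Put spaces around the x in CBUx followed by a digit from 2 to 9.
    (re.compile(r"CBUx([2-9])"), r"CBU x \1"),
]


def fix(text):
    """Apply every correction rule to text, in order, and return the result."""
    for pattern, replacement in RULES:
        text = pattern.sub(replacement, text)
    return text


if __name__ == "__main__":
    for sample in ["10/01/13", "01/10/13", "blah blah CBUx3"]:
        print(sample, "->", fix(sample))
```

New corrections are then added as rows in the table rather than as new code paths.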

Add two decimal digits to a number range regex

I've created a Regexp to validate a direction in degrees, between -359 and +359 (with optional sign). This is my regex:
const QString xWindDirectionPattern("[+-]{0,1}([0-9]{1,2}|[12][0-9]{2}|3[0-5][0-9])");
Now, I want to add two decimal numbers, in order to write numbers from -359.99 to +359.99. I've tried something like appending \.[0-9]{1,2}|[0-9]{1,3} but It does not work.
I'd like to have optional decimal point so I can have
23.3 valid
23.33 valid
23 valid
23.333 not valid
I've read some other questions, like this one, but I'm not able to modify the example to match a number range, like in my case.
How can I achieve this result?
Thanks in advance for your replies.
How can achieve this?
I've created a Regexp to validate a direction in degrees, between -359 and +359
No, you can't. You shouldn't. You are using the wrong tool. Regex cannot do the kind of validation that requires digging into the semantics of the characters.
Regex can only process and match text; it cannot identify what the text actually means. Basically, regexes are good for parsing regular languages and bad for almost everything else.
For example:
A regex can match 3 digits, but it would be extremely impractical to use it to match 3 digits that fall in the range [259, 634]. For that you would need to know the meaning of each individual digit in that number.
A regex can match a date-like pattern such as \d\d/\d\d/\d\d, but it cannot identify which part is the day and which part is the month.
Similarly, it can find you two numbers x and y, but it cannot tell whether x < y or not.
Tasks like these require you to understand the meaning of the text. Regex can't do that.
Well, of course you have come up with a regex, but as you can see it is highly inflexible. A little change in your requirements will screw both the regex and you.
You are better off using the corresponding language features, constructs like if-else, to make sure you are reading degrees in that range, rather than regex.
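A possible split of responsibilities, sketched here in Python rather than the question's Qt/C++: a deliberately loose pattern checks only the textual shape (optional sign, at most two decimals), and ordinary arithmetic checks the range. SHAPE and is_valid_direction are illustrative names, and this hybrid is an assumption, not something the answer prescribes.

```python
import re

# Shape only: optional sign, 1 to 3 digits, optionally a dot and 1 or 2 decimals.
SHAPE = re.compile(r"[+-]?[0-9]{1,3}(?:\.[0-9]{1,2})?")


def is_valid_direction(text):
    """True if text looks like a number with at most two decimals in (-360, 360)."""
    if SHAPE.fullmatch(text) is None:
        return False
    # The range check is plain arithmetic, so changing the limits is trivial.
    return abs(float(text)) < 360


if __name__ == "__main__":
    for sample in ["23.3", "23.33", "23", "23.333", "-359.99", "360"]:
        print(sample, is_valid_direction(sample))
```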
You can do this:
[+-]{0,1}((?:[0-9]{1,2}|[12][0-9]{2}|3[0-5][0-9])(?:\.[0-9]{1,2})?)
This will allow a decimal point followed by one or two digits. You'll probably also want to use start and end anchors (^ / $) to ensure that there are no characters other than this pattern in your string. Without them, 23.333 would be allowed because 23.33 matches the above pattern:
^[+-]{0,1}((?:[0-9]{1,2}|[12][0-9]{2}|3[0-5][0-9])(?:\.[0-9]{1,2})?)$
You can test it out here.
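The effect of the anchors can be seen by running both variants against the sample values from the question. This is a small Python sketch with made-up names (CORE, UNANCHORED, ANCHORED, accepts); the question itself uses Qt, where the details of matching may differ.

```python
import re

CORE = r"[+-]{0,1}((?:[0-9]{1,2}|[12][0-9]{2}|3[0-5][0-9])(?:\.[0-9]{1,2})?)"
UNANCHORED = re.compile(CORE)
ANCHORED = re.compile("^" + CORE + "$")


def accepts(pattern, text):
    """True if pattern matches somewhere in text (search semantics)."""
    return pattern.search(text) is not None


if __name__ == "__main__":
    for sample in ["23.3", "23.33", "23", "23.333", "360"]:
        print(sample, accepts(UNANCHORED, sample), accepts(ANCHORED, sample))
```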
Try [+-]?([1-9]\d?|[12]\d{2}|3[0-5]\d)(\.\d{1,2})?.
[+-]? Optional Sign
[1-9]\d? 1 or 2 digit number, 1 to 99 (note that a plain 0 is not matched)
[12]\d{2} 100 to 299
3[0-5]\d 300 to 359
(\.\d{1,2})? Optional decimal point followed by 1 or two digits

emacs syntax highlight numbers not part of words (with regex?)

I've moved to emacs recently and I am used to/like numbers being highlighted. A quick hack I took from here puts the following in my .emacs:
(add-hook 'after-change-major-mode-hook
'(lambda () (font-lock-add-keywords
nil
'(("\\([0-9]+\\)"
1 font-lock-warning-face prepend)))))
Which gives a good start, i.e. any digit is highlighted. However, I am a complete beginner with regex and would ideally like the following behaviour:
Also highlight the decimal point if it's part of a float, e.g. 12.34
Do not highlight any part of the number if it is next/part of a word. e.g. in these cases: foo11 ba11r 11spam, none of the '1's should be highlighted
Allow 'e' within two number integers to allow scientific notation (not required, bonus credit)
Unfortunately this looks very much like a 'do this for me' question, which I am loath to post, but I have failed thus far to make any decent progress myself.
About as far as I have got is discovering [^a-zA-Z][0-9]+[^a-zA-Z] to match anything but a letter either side (e.g. an equals sign), but all this does is include the adjacent symbol in the highlighting. I am not sure how to tell it 'only highlight the numbers if there isn't a letter on either side'.
Of course, I can't imagine regex is the way to go with complicated syntax highlighting, so any good ideas for number highlighting in emacs are also welcome.
Any help very much appreciated. (In case it makes any difference, this is for use when Python coding.)
Start by going to your scratch buffer and typing in some test text. Put some numbers in there, some identifiers that contain numbers, some numbers with missing parts (like .e12), etc. These will be our test cases and will let us experiment rapidly. Now run M-x re-builder to enter the regex builder mode, which will let you try out any regex against the text of the current buffer to see what it matches. This is a very handy mode; you'll be able to use it all the time. Just note that because Emacs Lisp requires you to put regexes into strings, you must double up all of your backslashes. You're already doing that correctly, but I'm not going to double them up in here.
So, limiting the match to numbers that are not part of identifiers is pretty easy. \b will match word boundaries, so putting one at either end of your regex will make it match a whole word.
You can match floats just by adding a period to the character class you started with, so that it becomes [0-9.]. Unfortunately, that can match a period all on its own; what we really want is [0-9]*\.?[0-9]+, which will match 0 or more digits followed by an optional period followed by one or more digits.
A leading sign can be matched with [-+]?, so that gets us negative numbers.
To match exponents we need an optional group: \(...\)?, and since we are only using this for highlighting, and don't actually need to separate out the content of the group, we can do \(?:...\), which will save the regex matcher a little time. Inside the group we will need to match an "e" ([eE]), an optional sign ([-+]?), and one or more digits ([0-9]+).
Putting it all together: [-+]?\b[0-9]*\.?[0-9]+\(?:[eE][-+]?[0-9]+\)?\b. Note that I've put the optional sign before the first word boundary, because the "+" and "-" characters create a word boundary.
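Outside Emacs, the same pattern can be tried against the test cases from the question. The sketch below is a translation into Python syntax, which is an assumption on my part: Python writes groups as (?:...) without backslashes, and its \b follows its own definition of word characters rather than the Emacs syntax table, so results in a real buffer may differ. NUMBER and highlighted are hypothetical names.

```python
import re

# Python spelling of the pattern: groups use plain parentheses here.
NUMBER = re.compile(r"[-+]?\b[0-9]*\.?[0-9]+(?:[eE][-+]?[0-9]+)?\b")


def highlighted(text):
    """Return the substrings that the pattern would pick out as numbers."""
    return NUMBER.findall(text)


if __name__ == "__main__":
    print(highlighted("foo11 ba11r 11spam 12.34 1e5 -3 x=42"))
```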
First of all, lose the add-hook and the lambda. The font-lock-add-keywords call doesn't need either. If you want this only for python-mode, pass the mode symbol as the first argument instead of nil.
Second, there are two main ways to do that.
Add a grouping construct around the digits. The numbers in the font-lock-keywords forms correspond to the groups, so this would be '(("\\([^a-zA-Z]\\([0-9]+\\)[^a-zA-Z]\\)" 2 font-lock-warning-face prepend)). The outer grouping is rather useless here, though, so this can be simplified to '(("[^a-zA-Z]\\([0-9]+\\)[^a-zA-Z]" 1 font-lock-warning-face prepend)).
Just use the beginning and end of symbol backslash constructs. Then the regexp looks like this: \_<[0-9]+\_>. We can highlight the whole match here, so the group number is simply 0: '(("\\_<[0-9]+\\_>" 0 font-lock-warning-face prepend)). As a variation, you could use the beginning-of-word and end-of-word constructs, but you probably don't want to highlight numbers adjacent to underscores or whatever other characters, if any, python-mode has in the syntax class symbol.
And lastly, there's probably no need for prepend. The numbers are likely all unhighlighted before this, and if you consider possible interaction with other minor modes like whitespace, you'd better choose append, or just omit this element entirely.
End result:
(font-lock-add-keywords nil '(("\\_<[0-9]+\\_>" . font-lock-warning-face)))

Preparing number using abbreviations

RegEx for BMHT in a sequence is my previous post.
I'm looking to build a number using abbreviations, and of course using regex.
Now I know how to validate a number with BMTH abbreviations.
Now my next and final target is to build a number using the abbreviations.
e.g. -2T2H22.55 should be displayed as -2,222.55
-2M2H22.63 should be displayed as -2,000,222.63
Help appreciated.
Flex's scripting language, ActionScript, is an ECMAScript implementation like JavaScript, so regex literals have to be delimited with slashes, for example: /^(?:\d+B)?(?:\d{1,3}M)?(?:\d{1,3}T)?(?:\d{1}H)?(\.[0-9]*)?/.
But that regex still has some problems. For one thing, you don't account for the minus sign or the two digits after the hundreds place. And, while the decimal point may be optional, if it is present you should require it to be followed by at least one digit (so +, not * in that last group).
Finally, you'll need to capture the various components so you can use them to construct the number. Here's my result:
/^(-?)(?:(\d+)B)?(?:(\d{1,3})M)?(?:(\d{1,3})T)?(?:(\d)H)?(\d{0,2})(\.\d+)?$/
The minus sign, if present, will be captured in group $1. The rest of the components will be in groups $2 through $7. You can use them in a callback function to construct the number. Also, notice that everything in this regex is optional; it will match an empty string or just a hyphen, so you'll need to check for that.
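The construction step could look roughly like the following. This sketch is in Python rather than ActionScript, and it assumes that B, M, T and H stand for billions, millions, thousands and hundreds, which is what the two examples in the question suggest; PATTERN and expand are illustrative names.

```python
import re

PATTERN = re.compile(
    r"^(-?)(?:(\d+)B)?(?:(\d{1,3})M)?(?:(\d{1,3})T)?(?:(\d)H)?(\d{0,2})(\.\d+)?$"
)


def expand(text):
    """Turn an abbreviated number such as -2T2H22.55 into -2,222.55."""
    match = PATTERN.match(text)
    # Everything in the pattern is optional, so reject the degenerate matches.
    if match is None or text in ("", "-"):
        raise ValueError(f"not an abbreviated number: {text!r}")
    sign, billions, millions, thousands, hundreds, rest, fraction = match.groups()
    whole = (
        int(billions or 0) * 10**9
        + int(millions or 0) * 10**6
        + int(thousands or 0) * 10**3
        + int(hundreds or 0) * 100
        + int(rest or 0)
    )
    # The ',' format specifier inserts the thousands separators.
    return f"{sign}{whole:,}{fraction or ''}"


if __name__ == "__main__":
    print(expand("-2T2H22.55"))
    print(expand("-2M2H22.63"))
```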