I've been reading the post "Return only one group with OR condition in Regex" to understand how to get only one group per match, but that approach does not work with my pattern.
Here is the input string:
Ledigingen 4 lediging € 32,48 € 50,92 21,00 %
van 01-01-2019 t/m 31-01-2019
Huur 1 Maand € 8,63 € 8,63 21,00 %
toeslag over € 50,42 (21% BTW) € 2,76 21,00 %
(WHITESPACE) Totaal exclusief BTW € 50,18
BTW hoog (21%) € 50,18 € 50,89
totaal inclusief BTW € 70,07
Currently it extracts every occurrence of an amount. Is there a way to get only the values that follow [Tt]otaa?l excl/incl BTW?
I guess I've been using the positive/negative lookahead wrong.
Desired output from the given input is:
€ 50,18
€ 70,07
DEMO
RegEx
(?!<=[tT]otaa?l\s*?.*?)([€$]\s*\d+(?:[,.]\d{0,2})?)
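One way to sidestep the lookaround entirely is to match the label and capture only the amount that follows it. Below is a minimal Python sketch, assuming the label is always spelled "[Tt]otaa?l exclusief/inclusief BTW" and sits directly before the amount. The names TOTAL_AMOUNT and extract_totals are made up for illustration.

```python
import re

# Match the "Totaal/totaal exclusief|inclusief BTW" label (assumed spelling),
# but capture only the currency amount that follows it.
TOTAL_AMOUNT = re.compile(
    r"[Tt]otaa?l\s+(?:ex|in)clusief\s+BTW\s*([€$]\s*\d+(?:[,.]\d{1,2})?)"
)

def extract_totals(text):
    """Return the amounts that directly follow a total excl/incl BTW label."""
    # With exactly one capturing group, findall returns that group's text.
    return TOTAL_AMOUNT.findall(text)

invoice = """Ledigingen 4 lediging € 32,48 € 50,92 21,00 %
van 01-01-2019 t/m 31-01-2019
Huur 1 Maand € 8,63 € 8,63 21,00 %
toeslag over € 50,42 (21% BTW) € 2,76 21,00 %
 Totaal exclusief BTW € 50,18
BTW hoog (21%) € 50,18 € 50,89
totaal inclusief BTW € 70,07"""

print(extract_totals(invoice))  # ['€ 50,18', '€ 70,07']
```

The other amounts are skipped simply because no total label precedes them, so no lookbehind is needed.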
I have a Google Sheets formula that extracts a £ currency value or a percentage discount from a block of text.
=REGEXEXTRACT(B2,"[\d,.£%]+") - Extracts £ value or % discount (but other numbers too)
=REGEXEXTRACT(B2,"[\d,.]+") - Extracts digits, commas, or periods
However, if the text contains any other numbers before the £ value or % discount, they get extracted first.
How can I only extract the £ value or % discount from each cell in Google Sheets?
Discounts are displayed with at most 2 decimal places, which may help in building a formula to extract 4 digits left or right of the value.
EXAMPLE DATA
Amy Wills 44% Discount
1Direction Food 45.37% Discount
AllUnder20 £120 Commission
AATU 13.31% Discount
Tickets4You £70 Commission
AllAboutU £7 Commission
Andrea Cardini 4% Discount
You can use
=JOIN("", REGEXEXTRACT(B2, "£(\d+(?:[.,]\d+)?)|(\d+(?:[.,]\d+)?)%"))
Details:
£(\d+(?:[.,]\d+)?) - matches a £, then captures into Group 1 one or more digits, optionally followed by a . or , and one or more digits
| - or
(\d+(?:[.,]\d+)?)% - captures into Group 2 one or more digits, optionally followed by a . or , and one or more digits, and then matches a %. Only one of the two groups takes part in any match, so JOIN simply concatenates the captured value with an empty string.
See the RE2 regex demo.
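The same two-alternative pattern can be tried outside Sheets. This is a rough Python sketch of the idea, not the Sheets formula itself. PATTERN and extract_value are illustrative names, and picking whichever group matched stands in for the JOIN step.

```python
import re

# Group 1: number after a pound sign; Group 2: number before a percent sign.
PATTERN = re.compile(r"£(\d+(?:[.,]\d+)?)|(\d+(?:[.,]\d+)?)%")

def extract_value(cell):
    """Return the £ amount or % discount as text, or None if neither is found."""
    m = PATTERN.search(cell)
    if m is None:
        return None
    # Only one group participates in a match; the other is None.
    return m.group(1) or m.group(2)

rows = [
    "Amy Wills 44% Discount",
    "1Direction Food 45.37% Discount",
    "AllUnder20 £120 Commission",
    "Tickets4You £70 Commission",
]
for row in rows:
    print(row, "->", extract_value(row))
```

Digits that are part of a name, such as the 1 in 1Direction or the 20 in AllUnder20, are skipped because they are neither preceded by £ nor followed by %.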
Based on your samples, this should work.
=SUMPRODUCT(N(SPLIT(B2," ")))
You can see it at work here in cell C2.
I need to get the price from a string, but no other numbers. There are no restrictions on what the string can say, but it will always have a dollar amount in it. It's the dollar amount I need to get from the string.
The closest solution I've been able to find is \d{1,3}[,\\.]?(\\d{1,2})?
On an example string like "2 BED / 2 BATH for $120,000.00, what a deal!!!", the regex should return only $120,000.00 and no other numbers. The solution above will return 2, 2, and 120,000.00. An ideal solution should NOT match any digits outside of the dollar amount. It also needs to include the symbol immediately before the number, to account for all possible currency symbols (USD, GBP, EUR, etc.).
So, the price that's matched by the regex should look like: $120,000.00, but it could also match on something like €40,000
If you want to match all currency symbols before a number with the number itself, you may combine the two expressions:
Currency symbol regex: \b(?:[BS]/\.|R(?:D?\$|p))|\b(?:[TN]T|[CJZ])\$|Дин\.|\b(?:Bs|Ft|Gs|K[Mč]|Lek|B[Zr]|k[nr]|[PQLSR]|лв|ден|RM|MT|lei|zł|USD|GBP|EUR|JPY|CHF|SEK|DKK|NOK|SGD|HKD|AUD|TWD|NZD|CNY|KRW|INR|CAD|VEF|EGP|THB|IDR|PKR|MYR|PHP|MXN|VND|CZK|HUF|PLN|TRY|ZAR|ILS|ARS|CLP|BRL|RUB|QAR|AED|COP|PEN|CNH|KWD|SAR)\b|\$[Ub]|[\p{Sc}ƒ]
Number regex: (?<!\d)(?<!\d\.)(?:\d{1,3}(?:,\d{3})*|\d+)(?:\.\d{1,2})?(?!\.?\d)
Currencies are taken from World Currency Symbols, the 3-letter currency codes used in the pattern are the most commonly used ones, but the comprehensive list can also be compiled using those data.
The answer is
(?:\b(?:[BS]/\.|R(?:D?\$|p))|\b(?:[TN]T|[CJZ])\$|Дин\.|\b(?:Bs|Ft|Gs|K[Mč]|Lek|B[Zr]|k[nr]|[PQLSR]|лв|ден|RM|MT|lei|zł|USD|GBP|EUR|JPY|CHF|SEK|DKK|NOK|SGD|HKD|AUD|TWD|NZD|CNY|KRW|INR|CAD|VEF|EGP|THB|IDR|PKR|MYR|PHP|MXN|VND|CZK|HUF|PLN|TRY|ZAR|ILS|ARS|CLP|BRL|RUB|QAR|AED|COP|PEN|CNH|KWD|SAR)|\$[Ub]|[\p{Sc}ƒ])\s?(?:\d{1,3}(?:,\d{3})*|\d+)(?:\.\d{1,2})?(?!\.?\d)
See the regex demo
It is created like this: (?:CUR_SYM_REGEX)\s?NUM_REGEX, with the lookbehinds in number regex stripped from the pattern since the left-hand context is already defined.
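The (?:CUR_SYM_REGEX)\s?NUM_REGEX composition can be sketched in Python as follows. Python's built-in re module does not support \p{Sc}, so this sketch swaps in a deliberately short, hand-picked symbol list as an assumption. CURRENCY, NUMBER, PRICE and find_prices are illustrative names.

```python
import re

# Assumption: a small symbol/code subset stands in for the full currency regex,
# because the standard re module has no \p{Sc} property class.
CURRENCY = r"(?:\b(?:USD|GBP|EUR)|[$€£¥ƒ])"

# Number regex from the answer, with the lookbehinds stripped as described.
NUMBER = r"(?:\d{1,3}(?:,\d{3})*|\d+)(?:\.\d{1,2})?(?!\.?\d)"

# Currency symbol, an optional single whitespace character, then the number.
PRICE = re.compile(CURRENCY + r"\s?" + NUMBER)

def find_prices(text):
    """Return every currency-prefixed amount found in text."""
    # No capturing groups are used, so findall returns the full matches.
    return PRICE.findall(text)

print(find_prices("2 BED / 2 BATH for $120,000.00, what a deal!!!"))
print(find_prices("USD 1,500.50 or £70"))
```

Keeping the two parts as separate strings makes it easy to extend the symbol list without touching the number syntax.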
you can use this one
[$€]{1}(?P<amount>[\d,\.]+(?>\.\d{2}){0,})\b
add any other currency sign to the leading character class [$€] to match it as well
and try it online here
This alternative will match any amount without specifying the currency
\S+\d[\d,\.]*?\b
If you have to specify currency due to misspellings in the input, then you can also use the following regex as an alternative:
(?:\p{Sc}|ƒ)[\d,\.]+\b
Note: \p{Sc} can match any Currency Symbol.
The regex '\S+\d[\d,\.]*?\b' was tested in the following Java test program to show that it handles any amount and currency:
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// The class name is arbitrary; it only makes the snippet compilable.
public class PriceExtractor {
    public static void main(String[] args) {
        List<String> inputs = Arrays.asList(
            "2 BED / 2 BATH for $120,000.00, what a deal!!!",
            "$1 2 BED / 2 BATH for $120,000.00, what a deal $3",
            "$1.00 2 BED / 2 BATH for $2,000.00, what a deal $300",
            "£40.00 2 BED / 2 BATH for $50,000, what a deal €600.00",
            "₧10 2 BED / 2 BATH for ƒ80.00, what a deal ₨9"
        );
        // Non-space prefix (the currency sign), a digit, then digits/commas/dots up to a word boundary.
        Pattern pattern = Pattern.compile("\\S+\\d[\\d,\\.]*?\\b");
        for (String input : inputs) {
            System.out.printf("Line to match: '%s'%n", input);
            Matcher matcher = pattern.matcher(input);
            System.out.println("Extracted price string:");
            // Print every match found in the current line.
            while (matcher.find()) {
                System.out.println(matcher.group());
            }
            System.out.println("=======================");
        }
    }
}
Output:
Line to match: '2 BED / 2 BATH for $120,000.00, what a deal!!!'
Extracted price string:
$120,000.00
=======================
Line to match: '$1 2 BED / 2 BATH for $120,000.00, what a deal $3'
Extracted price string:
$1
$120,000.00
$3
=======================
Line to match: '$1.00 2 BED / 2 BATH for $2,000.00, what a deal $300'
Extracted price string:
$1.00
$2,000.00
$300
=======================
Line to match: '£40.00 2 BED / 2 BATH for $50,000, what a deal €600.00'
Extracted price string:
£40.00
$50,000
€600.00
=======================
Line to match: '₧10 2 BED / 2 BATH for ƒ80.00, what a deal ₨9'
Extracted price string:
₧10
ƒ80.00
₨9
=======================
Link to more currency signs:
https://en.wikipedia.org/wiki/Currency_sign_(typography)
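For readers working in Python: the standard re module has no \p{Sc}, but the Unicode category of a character can be checked with unicodedata. The following is only a sketch of that workaround. The helper names are invented, ƒ is special-cased because it is not in category Sc, and the amount syntax is a simplifying assumption.

```python
import re
import unicodedata

# Assumed amount syntax: digits with optional ,/. separated digit groups.
AMOUNT = re.compile(r"\d+(?:[.,]\d+)*")

def is_currency_symbol(ch):
    """True for Unicode category Sc (currency symbols), plus the florin sign."""
    return unicodedata.category(ch) == "Sc" or ch == "ƒ"

def find_prices_by_category(text):
    """Return each currency symbol together with the amount right after it."""
    prices = []
    for i, ch in enumerate(text):
        if is_currency_symbol(ch):
            # Try to match an amount starting immediately after the symbol.
            m = AMOUNT.match(text, i + 1)
            if m:
                prices.append(ch + m.group())
    return prices

print(find_prices_by_category("₧10 2 BED / 2 BATH for ƒ80.00, what a deal ₨9"))
```

Unlike the \S+ variant, this only accepts a real currency symbol directly before the number.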
Here's the string I'm interrogating:
[Card-1 Intake : 30 C] [Card-1 Exhaust : 35 C] [Card-1 CPU : 38 C] [Card-1 Switch CPU : 47 C]
I am completely lost in how I can use regex (PCRE) to grab the 'Intake' value, i.e. 30.
Any help appreciated.
To match just the value, use a lookbehind:
(?<=Intake : )\S+
See live demo.
If lookbehinds are not supported, use group 1 of:
Intake : (\S+)
See live demo.
/Card-1 Intake : (\d+) C/
If you assign the result of string.match with this regex to x, the captured value is available as x[1].
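Both approaches from the answers above can be compared side by side. A short Python sketch (the question asks about PCRE, so this only illustrates the same patterns in another engine; the variable names are arbitrary):

```python
import re

line = ("[Card-1 Intake : 30 C] [Card-1 Exhaust : 35 C] "
        "[Card-1 CPU : 38 C] [Card-1 Switch CPU : 47 C]")

# Variant 1: lookbehind, so the whole match is just the value.
via_lookbehind = re.search(r"(?<=Intake : )\S+", line).group()

# Variant 2: capturing group, so the value is group 1 of a longer match.
via_group = re.search(r"Card-1 Intake : (\d+) C", line).group(1)

print(via_lookbehind, via_group)  # 30 30
```

The capturing-group form is the more portable of the two, since not every engine supports lookbehind.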
I have multiple formats of strings from which I have to extract a number of exactly 10 digits.
I have tried the following regexes, but they extract the first 10 digits of a longer number instead of ignoring that number.
([0-9]{10}|[0-9\s]{12})
([[:digit:]]{10})
These are the formats
Format 1
KINDLY AUTH FOR FUNDS
ACC 1469007967 (Number needs to be extracted)
AMT R5 000
DD 15/5
FROM:006251
Format 2
KINDLY AUTH FOR FUNDS
ACC 146900796723423 **(Want to ignore this number)**
AMT R5 000
AMT R30 000
DD 15/5
FROM:006251
Format 3
PLEASE AUTH FUNDS
ACC NAME-PREMIER FISHING
ACC NUMBER -1186 057 378 **(the number after - sign needs to be extracted)**
CHQ NOS-7132 ,7133,7134
AMOUNTS-27 000,6500,20 000
THANKS
FROM:190708
Format 4
PLEASE AUTHORISE FOR FUNDS ON AC
**1162792833** CHQ:104-R8856.00 AND (The number in ** needs to be extracted)
CHQ:105-R2772.00
REGARDS,
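To match numbers that are either 10 consecutive digits or grouped as 4 digits, a space, 3 digits, a space and 3 digits, you might use a backreference \1 to a capturing group that matches an optional space. The backreference forces the second separator to be the same as the first, so either both spaces are present or neither is.
Surround the pattern with word boundaries \b to prevent the digits from being part of a longer number or word.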
\b\d{4}( ?)\d{3}\1\d{3}\b
Regex demo
Your expression seems fine; it is just missing a word boundary. You may also want to make the second alternative stricter, just in case:
\b([0-9]{10}|[0-9]{4}\s[0-9]{3}\s[0-9]{3})\b
In this demo, the expression is explained, if you might be interested.
Adding a word boundary \b helps. The regex becomes: (\b([0-9]{10}|[0-9\s]{12})\b).
Check it here https://regex101.com/r/6Hm8PD/2
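To see how the word boundaries and the backreference behave on the four formats, here is a small Python sketch using the \b\d{4}( ?)\d{3}\1\d{3}\b pattern from above. ACCOUNT and find_account_numbers are illustrative names, and stripping the spaces from the result is an extra step not asked for in the question.

```python
import re

# 4 digits, optional space (group 1), 3 digits, the same separator again, 3 digits.
ACCOUNT = re.compile(r"\b\d{4}( ?)\d{3}\1\d{3}\b")

def find_account_numbers(text):
    """Return all 10-digit account numbers in text, with inner spaces removed."""
    # finditer is used because findall would return only the captured separator.
    return [m.group().replace(" ", "") for m in ACCOUNT.finditer(text)]

print(find_account_numbers("ACC 1469007967"))            # ['1469007967']
print(find_account_numbers("ACC 146900796723423"))       # []
print(find_account_numbers("ACC NUMBER -1186 057 378"))  # ['1186057378']
print(find_account_numbers("**1162792833** CHQ:104-R8856.00 AND"))  # ['1162792833']
```

The 15-digit number in Format 2 is rejected because no word boundary exists after its tenth digit.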
I want to validate Indian phone numbers as well as mobile numbers. The format of the phone number and mobile number is as follows:
For land Line number
03595-259506
03592 245902
03598245785
For mobile number
9775876662
0 9754845789
0-9778545896
+91 9456211568
91 9857842356
919578965389
I would like a single regular expression that covers both. I have tried the following regex but it is not working properly.
{^\+?[0-9-]+$}
For land Line Number
03595-259506
03592 245902
03598245785
you can use this
\d{5}([- ]*)\d{6}
NEW for all ;)
OLD: ((\+*)(0*|(0 )*|(0-)*|(91 )*)(\d{12}+|\d{10}+))|\d{5}([- ]*)\d{6}
NEW: ((\+*)((0[ -]*)*|((91 )*))((\d{12})+|(\d{10})+))|\d{5}([- ]*)\d{6}
9775876662
0 9754845789
0-9778545896
+91 9456211568
91 9857842356
919578965389
03595-259506
03592 245902
03598245785
This site is useful for me, and maybe for you ;) http://gskinner.com/RegExr/
Use the following regex
^(\+91[\-\s]?)?[0]?(91)?[789]\d{9}$
This will support the following formats:
8880344456
+918880344456
+91 8880344456
+91-8880344456
08880344456
918880344456
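A quick way to check the listed formats is to wrap the pattern in a small validator. This is a Python sketch only; MOBILE and is_valid_mobile are illustrative names.

```python
import re

# Optional +91 (with optional - or whitespace), optional 0, optional 91,
# then a 10-digit number starting with 7, 8 or 9.
MOBILE = re.compile(r"^(\+91[\-\s]?)?[0]?(91)?[789]\d{9}$")

def is_valid_mobile(number):
    """True if number matches one of the supported mobile formats."""
    return MOBILE.match(number) is not None

for candidate in ["8880344456", "+91 8880344456", "08880344456", "12345"]:
    print(candidate, is_valid_mobile(candidate))
```

Note that numbers starting with 6 are rejected by this pattern because of the [789] class.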
This works well:
\+?\d[\d -]{8,12}\d
Matches:
03598245785
9775876662
0 9754845789
0-9778545896
+91 9456211568
91 9857842356
919578965389
987-98723-9898
+91 98780 98802
06421223054
9934-05-4851
WAQU9876567892
ABCD9876541212
98723-98765
Does NOT match:
2343
234-8700
1 234 765
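Because this pattern is unanchored, it finds a phone-like run inside longer text rather than validating the whole string, which is why entries such as WAQU9876567892 appear in the list. A Python sketch of that behaviour (LOOSE and find_candidate are made-up names):

```python
import re

# Optional +, a digit, 8 to 12 digits/spaces/dashes, and a closing digit.
LOOSE = re.compile(r"\+?\d[\d -]{8,12}\d")

def find_candidate(text):
    """Return the first phone-like substring in text, or None."""
    m = LOOSE.search(text)
    return m.group() if m else None

print(find_candidate("WAQU9876567892"))   # 9876567892
print(find_candidate("+91 98780 98802"))  # +91 98780 98802
print(find_candidate("234-8700"))         # None
```

The short inputs fail because the pattern needs at least 10 characters between the first and last digit, inclusive.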
for mobile number:
const re = /^[6-9]{1}[0-9]{9}$/;
I use the following in one of my Python projects
Regex
(\+91)?(-)?\s*?(91)?\s*?(\d{3})-?\s*?(\d{3})-?\s*?(\d{4})
Python usage
import re
re.search(r'(\+91)?(-)?\s*?(91)?\s*?(\d{3})-?\s*?(\d{3})-?\s*?(\d{4})', text_to_search).group()
Explanation
(\+91)? // optionally match '+91'
(91)? // optionally match '91'
-? // optionally match '-'
\s*? // optionally match whitespace
(\d{3}) // compulsory match 3 digits
(\d{4}) // compulsory match 4 digits
Tested & works for
9992223333
+91 9992223333
91 9992223333
91999 222 3333
+91999 222 3333
+91 999-222-3333
+91 999 222 3333
91 999 222 3333
999 222 3333
+919992223333
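Since the three digit blocks are captured as groups 4, 5 and 6, the match can also be normalized to a bare 10-digit number. A small sketch of that idea; the normalize helper is hypothetical and not part of the original answer.

```python
import re

INDIAN_MOBILE = re.compile(
    r"(\+91)?(-)?\s*?(91)?\s*?(\d{3})-?\s*?(\d{3})-?\s*?(\d{4})"
)

def normalize(text):
    """Return the first matched number as 10 plain digits, or None."""
    m = INDIAN_MOBILE.search(text)
    if m is None:
        return None
    # Groups 4-6 hold the 3 + 3 + 4 digit blocks without prefix or separators.
    return "".join(m.group(4, 5, 6))

print(normalize("+91 999-222-3333"))  # 9992223333
print(normalize("919992223333"))      # 9992223333
```

Checking for None before calling group avoids the AttributeError that a bare .group() raises when nothing matches.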
For both mobile & fixed numbers: (?:\s+|)((0|(?:(\+|)91))(?:\s|-)*(?:(?:\d(?:\s|-)*\d{9})|(?:\d{2}(?:\s|-)*\d{8})|(?:\d{3}(?:\s|-)*\d{7}))|\d{10})(?:\s+|)
Explanation:
(?:\s+|) // leading spaces
((0|(?:(\+|)91)) // prefixed 0, 91 or +91
(?:\s|-)* // connecting space or dash (-)
(?:(?:\d(?:\s|-)*\d{9})| // 1 digit STD code & number with connecting space or dash
(?:\d{2}(?:\s|-)*\d{8})| // 2 digit STD code & number with connecting space or dash
(?:\d{3}(?:\s|-)*\d{7}))| // 3 digit STD code & number with connecting space or dash; the extra ) closes the STD-code alternatives
\d{10}) // plain 10 digit number
(?:\s+|) // trailing spaces
I've tested it on the following text
9775876662
0 9754845789
0-9778545896
+91 9456211568
91 9857842356
919578965389
0359-2595065
0352 2459025
03598245785
07912345678
01123456789
sdasdcsd
+919898101353
dasvsd0
+91 dacsdvsad
davsdvasd
0112776654
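The line-by-line explanation maps naturally onto a verbose-mode pattern. Here is a Python sketch of the same regex written with re.VERBOSE and applied with fullmatch to one candidate at a time. Treating each line as a separate candidate is an assumption, and the names PHONE and is_indian_number are illustrative.

```python
import re

PHONE = re.compile(r"""
    (?:\s+|)                         # leading spaces
    (
      (0|(?:(\+|)91))                # prefix 0, 91 or +91
      (?:\s|-)*                      # connecting space or dash
      (?:
          (?:\d(?:\s|-)*\d{9})       # 1-digit STD code and number
        | (?:\d{2}(?:\s|-)*\d{8})    # 2-digit STD code and number
        | (?:\d{3}(?:\s|-)*\d{7})    # 3-digit STD code and number
      )
      | \d{10}                       # plain 10-digit number
    )
    (?:\s+|)                         # trailing spaces
""", re.VERBOSE)

def is_indian_number(candidate):
    """True if the whole candidate string is a mobile or fixed-line number."""
    return PHONE.fullmatch(candidate) is not None

for s in ["9775876662", "0 9754845789", "0359-2595065", "sdasdcsd"]:
    print(s, is_indian_number(s))
```

Verbose mode ignores the layout whitespace and comments, so the compiled pattern is equivalent to the one-line version.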
You can use regular expression like this.
/^[(]+\ ++\d{2}[)]+[^0]+\d{9}/
For Indian Mobile Numbers
Regular expression to validate an 11- or 12-digit number (starting with 0 or 91):
String regx = "(0|91)?[7-9][0-9]{9}";
String mobileNumber = "09756432848";
Check:
if (mobileNumber.matches(regx)) {
    System.out.println("VALID MOBILE NUMBER");
} else {
    System.out.println("INVALID MOBILE NUMBER");
}
You can check for a 10-digit mobile number by removing "(0|91)?" from the regular expression regx.
You can use the following regex:
regex = '^[6-9][0-9]{9}$'
All mobile numbers in India start with 9, 8, 7 or 6. If you do not need to handle the prefixes (+91 or 0), use r'^[6-9]\d{9}$'; the website regextester.com can help you test it.
If you want to validate phone numbers with the prefixes (+91 or 0), use r'^(\+91[-\s]?)?[0]?(91)?[6-9]\d{9}$'.
r'\+?(91|0)?[\-\s]?[3-9]\d{3}[\-\s]?\d{6}$'
explanation
\+? # Optional leading plus sign
(91|0)? # Followed by 91 or 0 or neither
[\-\s]? # Followed by an optional - or whitespace
[3-9] # Followed by any digit from 3 to 9
\d{3} # Followed by any three digits
[\-\s]? # Followed by an optional - or whitespace
\d{6} # Followed by any six digits
$ # The string must end at that point
You Can Use Regex Like This:
^[0-9\-\(\)\, ]+$
All Landline Numbers and Mobile Number
^[\d]{2,4}[- ]?[\d]{3}[- ]?[\d]{3,5}|([0])?(\+\d{1,2}[- ]?)?[789]{1}\d{9}$
var phonereg = /^(\+\d{1,3}[- ]?)?\d{10}$/;