I am trying to extract the last number (the price) from these strings:
"1 601 15.01.2019 14.01.2022 21.224,00"
"1 601 01.01.2019 31.12.2021 38.354,00"
"1 601 01.01.2019 31.12.2021 1629,32"
My pattern:
.Pattern = "\s\d{1,3}\.\d{3}"
The expected result:
21.224,00
38.354,00
1629,32
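You could use this pattern: \d{1,2}\.?\d{3,4},\d{2} with the g (global) and m (multiline) flags set. The trailing /gm shown in the demo is flag notation, not part of the pattern itself.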
See this demo: https://regex101.com/r/F45CmK/1
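For readers who want to try the suggested pattern outside regex101, here is a minimal Python sketch. Python is an assumption on my part (the question uses a VBA-style .Pattern property), the helper name last_price is hypothetical, and the trailing $ is my addition so that only the number at the end of the string can match.

```python
import re

# Pattern from the answer, plus a "$" anchor (an addition, not in the answer)
PRICE = re.compile(r"\d{1,2}\.?\d{3,4},\d{2}$")

def last_price(line):
    """Return the trailing price as text, or None if nothing matches."""
    m = PRICE.search(line)
    return m.group(0) if m else None

rows = [
    "1 601 15.01.2019 14.01.2022 21.224,00",
    "1 601 01.01.2019 31.12.2021 38.354,00",
    "1 601 01.01.2019 31.12.2021 1629,32",
]
prices = [last_price(r) for r in rows]
print(prices)
```

Note that the pattern needs at least four digits before the comma, so a price such as 99,00 would not be found; the quantifiers would have to be loosened if such values can occur.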
RegExp reExp = RegExp(r'^(0|\+|(\+[0-9]{2,4}|\(\+?[0-9]{2,4}\)) ?)([0-9]*|\d{2,4}-\d{2,4}(-\d{2,4})?)$');
String phoneNumber = phoneNumberOne.text.replaceFirst(reExp, '');
I think what you are looking for is the following.
You have this input:
+855 98675432
(855) 98675432
055 98675432
and you want to get the output as below:
98675432
98675432
98675432
The regex to do that:
void main() {
RegExp regExp = RegExp(r'^((0[0-9]{2,4}|\+[0-9]{2,4}|\(\+?[0-9]{2,4}\)) ?)');
String firstNumber = "+855 98675432".replaceFirst(regExp, '');
print(firstNumber);
String secondNumber = "(855) 98675432".replaceFirst(regExp, '');
print(secondNumber);
String thirdNumber = "055 98675432".replaceFirst(regExp, '');
print(thirdNumber);
}
Explanation:
^ anchors the match at the beginning of the string.
(0[0-9]{2,4} means 0 followed by two to four digits (0 to 9).
| means or.
\+[0-9]{2,4} means + followed by two to four digits.
\(\+?[0-9]{2,4}\)) means an optional + followed by two to four digits, all inside parentheses.
The trailing " ?" (a space followed by ?) means zero or one space.
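The regex above can be made even shorter like this:
^(\(?(0|\+|)[0-9]{2,4}\)?) ?
But it's a little bit harder to read. It is also looser: because the 0 or + is optional, it strips the first two to four digits of a number that has no prefix at all, and it accepts unbalanced parentheses.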
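To compare the two patterns side by side, here is a small Python sketch (Python rather than Dart is my choice for illustration; re.sub with count=1 plays the role of replaceFirst). The names PREFIX, SHORT and strip_prefix are hypothetical.

```python
import re

# Longer pattern from the answer: a prefix must start with 0, + or (
PREFIX = re.compile(r"^((0[0-9]{2,4}|\+[0-9]{2,4}|\(\+?[0-9]{2,4}\)) ?)")
# Shortened pattern from the answer: the 0 / + and the parentheses are optional
SHORT = re.compile(r"^(\(?(0|\+|)[0-9]{2,4}\)?) ?")

def strip_prefix(number, pattern=PREFIX):
    """Remove one leading country/area prefix, if the pattern finds one."""
    return pattern.sub("", number, count=1)

samples = ["+855 98675432", "(855) 98675432", "055 98675432"]
stripped = [strip_prefix(n) for n in samples]

# A number without any prefix shows where the two patterns differ
bare_long = strip_prefix("98675432")
bare_short = strip_prefix("98675432", SHORT)
print(stripped, bare_long, bare_short)
```

With the longer pattern an unprefixed number is left alone, while the shortened one removes its first four digits, so the shorter form is only safe when every input is known to carry a prefix.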
I need to get the price from this string: "Prix\xa0de base : 26 900 euros – bonus", but there is a 0 in 'Prix\xa0de' and I don't know how to skip it.
Thanks for your help!
You can use something like this:
import re

subject = "Prix\xa0de base : 26 900 euros – bonus"
# Skip everything up to the colon, then capture the run of digits and spaces
match = re.search(r"^.*:\s+([\d ]+)\s+", subject)
if match:
    result = match.group(1)
else:
    result = ""
result will be "26 900".
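If the number is always followed by the word 'euros', then it is as simple as:
'(\d+ ?\d+) euros'
This captures the number (optionally with one space as a separator) before 'euros'. Note that it needs at least two digits, so a single-digit price would not match.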
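If the goal is an actual number rather than the text "26 900", one possible follow-up is sketched below. The pattern is my own variation on the one above, not the answerer's: it allows any number of three-digit groups and uses \s, which in Python 3 string patterns also matches a non-breaking space.

```python
import re

subject = "Prix\xa0de base : 26 900 euros – bonus"

# Digits, then optional groups of "whitespace + three digits", then "euros"
m = re.search(r"(\d+(?:\s\d{3})*)\s*euros", subject)

# Drop the separators inside the captured text before converting
price = int(re.sub(r"\s", "", m.group(1))) if m else None
print(price)
```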
I have a text file and extract every line of it into a list of strings. I then loop over that list to manipulate and validate the extracted data. For every line, I want to check whether it contains a numbered marker such as 1). I used a regex for this, but with no luck.
Code
Dim strRegexPattern As String = "^\d{1,6}[)]\s$"
Dim myRegex As New Regex(strRegexPattern, RegexOptions.None)
Dim _strMatch As Match = myRegex.Match(line) '<-- I use For Each line As String In listOfExtractedLines
If _strMatch.Success Then
    MsgBox(_strMatch.Value)
End If
String extracted from text file(with formatting and spaces)
Title : 8015B DRO(C10-C28) - ORO (C18-C36)
Column01 Col2 Col3 Column04 Col5 Col06 Col(007)
--------------------------------------------------------------------------
Intxxxxx xxxxxxxxx
1) zzzzzzzzzzzzzzzzzz 4.464 168 212614 25.00 xyz 0.00
33) aaaaaaaaaaaaaaaaaaa 4.818 114 330529 25.00 xyz 0.00
51) bbbbbbbbbbbbbbbb 6.742 117 318044 25.00 xyz 0.00
64) cccccccccccccccccccccc 8.397 152 186712 25.00 xyz 0.00
21) Endosulfan Sulfa 12.51 13 918.2E6 840.8E6 106.315
22) Endrin Ketone 13.11 14 143.4E6 992.2E6 104.978
^.*?\s\d{1,6}[)]\s.*$
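Try this to match the whole line. Your pattern fails because ^ and $ force the entire line to consist of nothing but the number, the ) and a single whitespace character.
Edit: the pattern above needs a whitespace character before the number. This version also matches when the number is at the very start of the line: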
(?:^|\s+)\d{1,6}[)]\s.*$
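As a quick way to check the idea against the sample lines, here is a Python sketch (Python instead of VB.NET is an assumption made for illustration). The pattern is a slightly stricter variation of mine that only accepts the number at the start of the line, after optional indentation; the leading space in one sample line is also assumed, since the original spacing is not preserved above.

```python
import re

# Optional indentation, 1 to 6 digits, a closing parenthesis, then whitespace
NUMBERED = re.compile(r"^\s*\d{1,6}\)\s")

lines = [
    "Title : 8015B DRO(C10-C28) - ORO (C18-C36)",
    "Column01 Col2 Col3 Column04 Col5 Col06 Col(007)",
    "Intxxxxx xxxxxxxxx",
    " 1) zzzzzzzzzzzzzzzzzz 4.464 168 212614 25.00 xyz 0.00",
    "33) aaaaaaaaaaaaaaaaaaa 4.818 114 330529 25.00 xyz 0.00",
]

# Keep only the lines that begin with a numbered marker such as "1)"
numbered = [ln for ln in lines if NUMBERED.match(ln)]
print(numbered)
```

Header lines such as the one ending in Col(007) are rejected because the digits there are not at the start of the line and are not followed by whitespace.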
Having this string s=";123;;123;;456;;124;;123;;567;" in R, which shows some Ids separated by ";", I want to find the repeated IDs, so in this case ";123;" is repeated. I used the following command in R:
gregexpr("(;[1-9]+;).*\1", s)
but it doesn't find the repeated patterns. Any idea what is wrong?
One example of a long string:
1760381;;1774536;;1774614;;1774617;;1774705;;1774723;;1775013;;1902321;;1928678;;2105486;;2105514;;2105544;;2105575;;2105585;;2279115;;2379236;;290927;;542280;;555749;;641540;;683822;;694934;;713228;;713248;;713249;;726949;;727204;;731434;;754522;;7693856;;100095;;1003838;;1045582;;1079057;;1108697;;1231229;;124087;;1249672;;1328126;;1412065;;1419930;;1441743;;1470580;;1476585;;1502106;;1556149;;1637775;;1643922;;1655644;;1755547;;1759001;;1760295;;1760296;;1760320;;1760326;;1760338;;1760348;;1760349;;1760350;;1760353;;1760375;;1760376;;1760377;;1760378;;1760388;;1760401;;1760402;;1760403;;1760410;;1760421;;1760425;;1760426;;1760642;;1760654;;1770463;;1774365;;1774366;;1774394;;1774449;;1774453;;1774454;;1774455;;1774456;;1774457;;1774458;;1774461;;1774462;;1774463;;1774464;;1774466;;1774469;;1774504;;1774505;;1774506;;1774519;;1774520;;1774525;;1774527;;1774529;;1774532;;1774533;;1774539;;1774542;;1774593;;1774595;;1774604;;1774610;;1774616;;1774617;;1774641;;1774660;;1774671;;1774674;;1774684;;1774687;;1774694;;1774704;;1774706;;1774713;;1774717;;1774722;;1774723;;1774726;;1774733;;1774745;;1774750;;1774753;;1774754;;1774766;;1774784;;1774786;;1774795;;1774799;;1774800;;1774803;;1774809;;1774813;;1774835;;1774849;;1774852;;1774853;;1774854;;1774857;;1774858;;1774861;;1774862;;1774867;;1774868;;1774869;;1774870;;1774877;;1774878;;1774880;;1774884;;1774885;;1774886;;1774902;;1774905;;1774934;;1774935;;1774937;;1774939;;1774946;;1774949;;1774950;;1774958;;1774959;;1774960;;1774961;;1774962;;1774964;;1774965;;1774966;;1774967;;1774969;;1774971;;1774972;;1774973;;1774975;;1774977;;1774978;;1774999;;1775000;;1775003;;1775005;;1775006;;1775009;;1775013;;1775014;;1775017;;1775024;;1775026;;1775033;;1775038;;1775040;;1775041;;1775044;;1775087;;1785544;;1811645;;1837210;;1864356;;1928674;;1928678;;1932882;;1954203;;2066856;;2076876;;2105349;;2105351;;2105458;;2105464;;2105476;;2105480;;2105482;;2105484;;2105489;;2105496;;2105500;;2105510;;2105514;;2105518;;2105532;;2105545;
;2105550;;2172257;;2172762;;218438;;2228198;;2229827;;2247909;;2262250;;2263135;;2287260;;2335872;;2335873;;2335874;;2335877;;2338682;;2352560;;2420902;;263946;;265370;;303060;;330571;;338764;;387492;;387750;;388362;;431807;;436056;;436442;;444058;;458026;;491696;;504783;;513098;;529228;;539799;;549649;;559957;;562574;;563116;;576418;;582851;;592273;;599952;;614463;;626416;;645122;;652363;;665854;;668048;;682877;;683822;;688317;;709795;;710684;;723114;;724447;;724526;;725177;;731389;;731434;;876958;;879962;;947924;;987322;;987446;;61326;;1025952;;1095970;;1338018;;1349990;;1373122;;1419930;;1760310;;1760320;;1774705;;1774706;;1774708;;1774712;;1774952;;1774954;;1774963;;1774972;;1774977;;1775077;;1901075;;2022080;;2117779;;2143723;;441554;;450517;;549649;;1010402;;113311;;1148258;;1374348;;1419930;;1606449;;1606515;;1606608;;1606610;;1760320;;1760338;;1760618;;1760642;;1774504;;1774520;;1774595;;1774705;;1774909;;1774977;;1775011;;1775043;;179542;;1928678;;2105598;;2105721;;2188303;;2335873;;340762;;387759;;436442;;504783;;588336;;646185;;682877;;715644;;725080;;741661;;760924
m <- gregexpr("[0-9]+", s)
n <- regmatches(s, m)
n
[[1]]
[1] "123" "123" "456" "124" "123" "567"
data.frame(table(unlist(n)))
  Var1 Freq
1  123    3
2  124    1
3  456    1
4  567    1
The code works for your long string too. Here are the head and tail of the output:
head(data.frame(table(unlist(n))),10)
Var1 Freq
1 100095 1
2 1003838 1
3 1010402 1
4 1025952 1
5 1045582 1
6 1079057 1
7 1095970 1
8 1108697 1
9 113311 1
10 1148258 1
tail(data.frame(table(unlist(n))),10)
Var1 Freq
316 731434 2
317 741661 1
318 754522 1
319 760924 1
320 7693856 1
321 876958 1
322 879962 1
323 947924 1
324 987322 1
325 987446 1
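The same extract-then-count idea can be sketched in Python for comparison (Python is my choice here, not the answerer's; collections.Counter stands in for R's table()).

```python
import re
from collections import Counter

s = ";123;;123;;456;;124;;123;;567;"

# Pull out every run of digits, then tally how often each id occurs
counts = Counter(re.findall(r"[0-9]+", s))

# Ids that occur more than once
duplicated = sorted(i for i, c in counts.items() if c > 1)
print(counts, duplicated)
```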
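1) In the short example the ids are all the same length, so we assume that is a general feature. Try this pattern, where (?=...) is a zero-width lookahead expression (see ?regex). Note that the backreference has to be written \\1 inside an R string; in the question's "\1" the backslash is consumed by the string parser, so the regex engine never sees a backreference.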
pat <- ";([1-9]+);(?=.*\\1)"
gregexpr(pat, s, perl = TRUE)
or this:
library(gsubfn)
strapply(s, pat, perl = TRUE)[[1]]
## [1] "123" "123"
This lists each id one fewer times than it occurs in s (zero times for ids that are not duplicated), so to list each duplicated id once, try unique(st), where st is the result of the last line of code above.
Note: In the second example in the question, i.e. the long string, there is no ; at the end of the string, so the last id can never be matched by the expression unless we first paste a ; onto the end. Also, [1-9]+ does not match ids that contain a 0, which the long string has, so use [0-9]+ there.
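A Python sketch of the lookahead approach follows, for readers who want to experiment with it. Two things are my own changes and not part of the answer: the character class is [0-9]+ so ids containing 0 are found, and the lookahead repeats the ; delimiters around \1 so that the equal-length assumption is not needed.

```python
import re

s = ";123;;123;;456;;124;;123;;567;"

# Match an id only if the same delimited id appears again later in the string
pat = re.compile(r";([0-9]+);(?=.*;\1;)")

hits = pat.findall(s)          # one entry per occurrence except the last
unique_hits = sorted(set(hits))
print(hits, unique_hits)
```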
2) Instead of matching the contents, we could match the delimiters:
st <- strsplit(s, ";+")[[1]][-1]
Here the [-1] drops the empty string produced by the leading ;. st is then just a vector of all the ids, so unique(st[duplicated(st)]) lists each duplicated id once, without any backreference or lookahead.
I have a problem building a regex. This is a sample of the text:
text 123 12345 abc 12 def 67 i 89 o 0 t 2
The numbers are sometimes padded with blanks to the max length (3).
e.g.:
"1" can be "1" or "1 "
"13" can be "13" or "13 "
My regex is at the moment this:
\b([\d](\s*)){1,3}\b
The results of this regex are the following: (. = blank for better visibility)
123.
12....
67.
89.
0....
2
But I need this: (. = blank for better visibility)
123
12.
67.
89.
0..
2
How can I tell the regex engine to count the blanks into the {1,3} option?
Try this:
\b(?:\d[\d\s]{0,2})(?:(?<=\s)|\b)
This will also cover strings like text 123 1 23 12345 123abc 12 def 67 i 89 o 0 t 2 and results in:
123
1.
23.
12.
67.
89.
0..
2
Does this do what you want?
\b(\d){1,3}\s*\b
This will also include whitespace (if available) after the selection.
I think you want this
\b(?:\d[\d\s]{0,2})(?!\d)
See it here on Regexr
The word boundary will not work at the end, because if the match ends in whitespace, there may be no word boundary there. Therefore I use a negative lookahead (?!\d) to ensure that no digit follows.
But if you have a string like "1 23", it will match only the "1" and the "23", but not the whitespace after the "1".
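The behaviour described above can be checked with a short Python sketch (Python is assumed here only for illustration; the sample strings are made up, since the exact padding of the question's text is not preserved).

```python
import re

# A digit, up to two more digits or blanks, and no digit directly after
PADDED = re.compile(r"\b\d[\d\s]{0,2}(?!\d)")

# Numbers padded with blanks up to a width of three
padded = PADDED.findall("abc 12  def 0   t 2")

# The edge case from the answer: the blank after "1" is not captured
adjacent = PADDED.findall("1 23")

# A number longer than three digits is skipped entirely
too_long = PADDED.findall("12345")
print(padded, adjacent, too_long)
```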
Assuming you want to use the padded numbers somewhere else, break the problem into two parts: (simple) parsing of the numbers, and (simple) formatting of the numbers (including padding).
while ( $text =~ /\b(\d{1,3})\b/g ) {
    printf( "%-3d\n", $1 );
}
Alternatively:
my @padded_numbers = map { sprintf( "%-3d", $_ ) } ( $text =~ /\b(\d{1,3})\b/g );
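The same parse-then-format split can be sketched in Python (chosen here for illustration; the Perl above is the answerer's version). The text is the question's sample with its padding collapsed to single blanks.

```python
import re

text = "text 123 12345 abc 12 def 67 i 89 o 0 t 2"

# Step 1: parse. Standalone numbers of one to three digits only.
numbers = re.findall(r"\b\d{1,3}\b", text)

# Step 2: format. Left-align each number in a field of width three.
padded_numbers = [f"{n:<3}" for n in numbers]
print(padded_numbers)
```

Keeping the padding out of the regex means the pattern stays simple, and the field width can be changed in one place.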