I have a regex to match the date formats Sep.23'15 or Sep 23'15 or Sep23'15
[a-zA-Z]{3}[. ]\d{2}'\d{2}
I can match Sep.23'15 and Sep 23'15, but not Sep23'15.
How can I write the regex so that it matches both with and without a space?
I suggest matching an optional dot and then applying the * quantifier to the space, instead of the ? suggested by Tushar:
[a-zA-Z]{3}\.?[ ]*\d{2}'\d{2}
           ^^^^^^^
This regex will also handle a format like Sep. 23'15 (with a dot and one or more spaces between the month and the day'year).
Regex explanation:
[a-zA-Z]{3} - 3 ASCII letters
\.? - 1 or 0 dots
[ ]* - zero or more regular spaces (\h, \p{Zs}, or [[:blank:]] is recommended instead, depending on the regex flavor, if you need to match any horizontal whitespace)
\d{2}'\d{2} - 2 digits + ' + 2 digits.
See demo
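As a quick check of the pattern above, here is a minimal Python sketch. The sample list and the names DATE_RE and results are illustrative assumptions, and fullmatch is used so that each sample has to match from start to end.

```python
import re

# Pattern from the answer: 3 ASCII letters, an optional dot,
# zero or more spaces, then dd'yy.
DATE_RE = re.compile(r"[a-zA-Z]{3}\.?[ ]*\d{2}'\d{2}")

# Illustrative samples; the last one uses a separator the pattern does not allow.
samples = ["Sep.23'15", "Sep 23'15", "Sep23'15", "Sep. 23'15", "Sep-23'15"]

# Map each sample to True (full match) or False (no match).
results = {s: DATE_RE.fullmatch(s) is not None for s in samples}

for sample, ok in results.items():
    print(f"{sample!r}: {'match' if ok else 'no match'}")
```

The first four samples should be reported as matches and the hyphenated one as a non-match.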
You can use the following regex:
[A-Za-z]{3}[\.\s]?\d{1,2}\'\d{2}
I created a Regex to check a string for the following situation:
the first 4 chars are digits
followed by a dot
followed by 3 digits
followed by a dot
followed by 4 to 8 digits or letters
e.g.: 1234.123.125B
My Regex: ^[0-9]{4}[.][0-9]{3}[.][0-9a-zA-Z]{4,8}$
But now I need a wildcard search: The Regex should also match if there is a '*' after the first 8 characters. For example:
1234.123.12* MATCH
1234.123* MATCH
1234.123.45B9* MATCH
1234.12* NO MATCH
1234.12345* NO MATCH
How can I add the wildcard search to my Regex?
Thank you
You may use this regex with alternation:
^\d{4}\.\d{3}(?:\*|\.[\da-zA-Z]{0,7}\*|\.[\da-zA-Z]{4,8})$
RegEx Demo
RegEx Details:
^: Start
\d{4}\.\d{3}: Match 4 digits + 1 dot + 3 digits
(?:\*|\.[\da-zA-Z]{0,7}\*|\.[\da-zA-Z]{4,8}): Match a single * OR a dot, 0 to 7 digits/letters and a * OR a dot and 4 to 8 digits/letters
$: End
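To see the alternation at work, the sketch below runs the pattern against the six examples from the question using Python's re module. The names WILDCARD_RE, expected and results are illustrative, not part of the original answer.

```python
import re

# Pattern from the answer, unchanged.
WILDCARD_RE = re.compile(
    r"^\d{4}\.\d{3}(?:\*|\.[\da-zA-Z]{0,7}\*|\.[\da-zA-Z]{4,8})$"
)

# Examples taken from the question, with the outcome the asker wants.
expected = {
    "1234.123.125B": True,
    "1234.123.12*": True,
    "1234.123*": True,
    "1234.123.45B9*": True,
    "1234.12*": False,
    "1234.12345*": False,
}

results = {s: WILDCARD_RE.match(s) is not None for s in expected}

for sample, ok in results.items():
    print(f"{sample}: {'MATCH' if ok else 'NO MATCH'}")
```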
My assumptions are that:
You don't allow wildcards to be mid-string
Nor do you want to allow wildcards after the full pattern (e.g.: 1234.123.12345678*).
So, alternatively, you could possibly use something like:
^\d{4}\.\d{3}(?!.*\*.)(?![^*]{0,4}$)[.*][*\da-zA-Z]{0,8}$
See the online demo.
^ - Start-of-string anchor.
\d{4}\.\d{3} - Four digits, a dot and another three digits.
(?!.*\*.) - Negative lookahead for zero or more characters followed by asterisk and another character other than newline.
(?![^*]{0,4}$) - Negative lookahead for zero to four characters other than asterisk before the end-of-string anchor.
[.*] - A literal dot or asterisk.
[*\da-zA-Z]{0,8} - Zero to eight characters from the character class.
$ - End-of-string anchor.
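The two assumptions stated above can be exercised with a short Python sketch. The test strings beyond those in the question (a mid-string wildcard, a wildcard after the full pattern, and a too-short suffix without a wildcard) are my own illustrative additions, as are the names LOOKAHEAD_RE, cases and results.

```python
import re

# Lookahead-based pattern from the answer, unchanged.
LOOKAHEAD_RE = re.compile(
    r"^\d{4}\.\d{3}(?!.*\*.)(?![^*]{0,4}$)[.*][*\da-zA-Z]{0,8}$"
)

cases = {
    "1234.123.125B": True,        # full pattern, no wildcard
    "1234.123.45B9*": True,       # wildcard at the end
    "1234.123*": True,            # wildcard right after the first 8 characters
    "1234.123.1*2": False,        # wildcard mid-string
    "1234.123.12345678*": False,  # wildcard after the full pattern
    "1234.123.123": False,        # only 3 trailing characters and no wildcard
}

results = {s: LOOKAHEAD_RE.match(s) is not None for s in cases}

for sample, ok in results.items():
    print(f"{sample}: {'MATCH' if ok else 'NO MATCH'}")
```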
I'm trying to replace 「g」 with 「k」 in some article:
this is "g1"..., and "g2" is......, last "g1034" shows that...
the end result will be
this is "k1"..., and "k2" is......, last "k1034" shows that...
I'm using the following regex to find the pattern:
"([^"]*)"
but I fail to do the replacement successfully.
How can I do the replacement? Thanks in advance.
One option is to use
"\Kg(?=\d+")
" Match "
\Kg Forget what is currently matched using \K, then match g
(?=\d+") Positive lookahead, assert what is on the right is 1+ digits and "
And replace with k
Regex demo
Another option could be using capturing groups
(")g(\d+")
In the replacement use the 2 groups
$1k$2
Regex demo
Note that if you do not want to match only digits, you could use [^"]+ instead of \d+ to match at least 1 char after g or use * to match 0 or more chars after g
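For readers using Python, here is a minimal sketch of the same replacement. Python's built-in re module does not support \K, so the first call uses the capturing-group variant (with Python's \g<1> replacement syntax instead of $1) and the second uses a lookbehind as a stand-in for \K; that substitution is mine, not part of the answer.

```python
import re

text = 'this is "g1"..., and "g2" is......, last "g1034" shows that...'

# Capturing-group variant: keep the opening quote and the digits plus
# closing quote, and swap only the g in between.
with_groups = re.sub(r'(")g(\d+")', r'\g<1>k\g<2>', text)

# Lookaround variant: a lookbehind replaces \K, so only the g itself is matched.
with_lookarounds = re.sub(r'(?<=")g(?=\d+")', 'k', text)

print(with_groups)
print(with_lookarounds)
```

Both calls should print the end result given in the question.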
I have this scenario:
Ex1:
Valid:
12345678|abcdefghij|aaaaaaaa
Invalid:
12345678|abcdefghijk|aaaaaaaaa
This means that the maximum length between pipes is 8. How can I express that in a regex?
I put this
^(?:[^|]+{0,7}(?:\|[^|]+)?$ but it's not working
Try the following pattern:
^.{1,8}(?:\|.{1,8})*$
The basic idea is to match between one and eight characters, followed by | and another 1 to 8 characters, that term repeated zero or more times. Explore the demo with any data you want to see how it works.
Sample data:
123
12345678
abcdefghi (no match)
12345678|abcdefgh|aaaaaaaa
12345678|abcdefghijk|aaaaaaaaa (no match)
Demo here:
Regex101
When you want to match delimited data, you should refrain from using a plain unrestricted dot (.). You need to match the parts between |, so you should consider the [^|] negated character class construct, which matches any char but |.
Since you need to limit the number of the pattern occurrences of the negated character class, restrict it with a limiting quantifier {1,8} that matches 1 to 8 consecutive occurrences of the quantified subpattern.
Use
^[^|]{1,8}(?:\|[^|]{1,8})*$
See the regex demo.
Details
^ - start of a string
[^|]{1,8} - any 1 to 8 chars other than |
(?:\|[^|]{1,8})* - 0 or more consecutive sequences of:
\| - a literal pipe symbol
[^|]{1,8} - any 1 to 8 chars other than |
$ - end of string.
Then, the [^|] can be restricted further as per requirements. If you only need to validate a string that has ASCII letters, digits, (, ), +, ,, ., /, :, ?, whitespace and -, you need to use
^[A-Za-z0-9()+,.\/:?\s-]{1,8}(?:\|[A-Za-z0-9()+,.\/:?\s-]{1,8})*$
See another regex demo.
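The difference between the dot-based pattern and the negated character class shows up once a field is empty. The Python sketch below compares the two on the sample data given earlier plus one extra string, a||b, which is my own illustrative addition; the names DOT_RE, FIELD_RE, dot_results and field_results are likewise hypothetical.

```python
import re

# Dot-based pattern: the dot can also consume pipe characters.
DOT_RE = re.compile(r"^.{1,8}(?:\|.{1,8})*$")

# Negated character class: each field is 1 to 8 chars, none of them a pipe.
FIELD_RE = re.compile(r"^[^|]{1,8}(?:\|[^|]{1,8})*$")

samples = [
    "123",
    "12345678",
    "abcdefghi",
    "12345678|abcdefgh|aaaaaaaa",
    "12345678|abcdefghijk|aaaaaaaaa",
    "a||b",  # empty middle field, added for illustration
]

dot_results = {s: DOT_RE.match(s) is not None for s in samples}
field_results = {s: FIELD_RE.match(s) is not None for s in samples}

for s in samples:
    print(f"{s!r}: dot={dot_results[s]}, negated class={field_results[s]}")
```

The two patterns agree on the original samples. Only a||b differs: the dot version accepts it as a single 4-character field, while the negated class rejects the empty field.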
I need to match these values:
(First approach to a regex that roughly does what I want)
\d+([.,]\d{3})*[.,]\d{2}
like
24,56
24.56
1.234,56
1,234.56
1234,56
1234.56
but I need to not match
1.234.56
1,234,56
So somehow I need to check the last occurrence of "." or "," to not be the same as the previous "." or ",".
Background: Amounts shall be matched in English and German format with (optional) 1000-Separators.
But even with help of regex101 I completely fail at coming up with a correctly working look-behind. Any suggestions are highly appreciated.
UPDATE
Based on the answers I got so far, I came up with this (demo):
\d{1,3}(?:([\.,'])?\d{3})*(?!\1)[\.,\s]\d{2}
But it matches for example 1234.567,23 which is not desirable.
You may capture the digit grouping symbol and use a negative lookahead with a backreference to restrict the decimal separator:
^(?:\d+|\d{1,3}(?:([.,])\d{3})*)(?!\1)[.,]\d{2}$
                  ^    ^        ^^^^^^
See the regex demo
Group 1 will contain the last value of the digit grouping symbol and (?!\1)[.,] will match the other symbol.
Details:
^ - start of string
(?:\d+|\d{1,3}(?:([.,])\d{3})*) - either of the two alternatives:
\d+ - 1+ digits
| - or
\d{1,3} - 1 to 3 digits,
(?:([.,])\d{3})* - zero or more sequences of:
([.,]) - Group 1 capturing . or ,
\d{3} - 3 digits
(?!\1)[.,] - a . or , but not equal to what was last captured with ([.,]) pattern above
\d{2} - 2 digits
$ - end of string.
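The backreference trick can be checked against the values from the question with a small Python sketch. The names AMOUNT_RE, should_match and should_fail are illustrative. Note that in Python's re a backreference to a group that has not participated simply fails to match, so (?!\1) succeeds when no grouping symbol was captured.

```python
import re

# Pattern from the answer, unchanged.
AMOUNT_RE = re.compile(r"^(?:\d+|\d{1,3}(?:([.,])\d{3})*)(?!\1)[.,]\d{2}$")

# Values the question wants to accept.
should_match = ["24,56", "24.56", "1.234,56", "1,234.56", "1234,56", "1234.56"]

# Values the question wants to reject, including the one from its UPDATE.
should_fail = ["1.234.56", "1,234,56", "1234.567,23"]

matched = [s for s in should_match + should_fail if AMOUNT_RE.match(s)]
print(matched)
```

Only the six values from should_match should end up in matched.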
You can use
^\d+(([.,])\d{3})*(?!\2)[.,]\d{2}$
live demo
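A brief Python check of this shorter pattern, using illustrative names (SHORT_RE, accepted): because the leading \d+ is not limited to 1 to 3 digits, it still accepts 1234.567,23, the value the question's UPDATE calls undesirable, while rejecting the two repeated-separator values.

```python
import re

# Shorter pattern from the answer, unchanged.
SHORT_RE = re.compile(r"^\d+(([.,])\d{3})*(?!\2)[.,]\d{2}$")

samples = ["24,56", "1,234.56", "1.234.56", "1,234,56", "1234.567,23"]

# Collect the samples this pattern accepts.
accepted = [s for s in samples if SHORT_RE.match(s)]
print(accepted)
```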
my_string = "2011, this year I made 750,000 dollars"
Is there an elegant way to match "2011" and "750,000" in the string above? The idea is to extract numeric values when they look like numeric values, i.e. \d+ or \d+[\.,]?\d*, depending on whether a comma follows.
I tried this, but it doesn't match exactly what I wanted: I got "2011,", which is no good.
library(stringr)
str_match_all(my_string, "(\\d+[\\.,]?\\d*)")
Here is my expected result:
"2011" "750,000"
You can do:
[0-9]+(?:[,.][0-9]+)*
It's very elegant, I tried it in front of a mirror.
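For comparison outside R, the same pattern can be tried in Python. This is only a sketch; the variable name numbers is illustrative. A separator is consumed only when digits follow it, so the comma after 2011 is left out.

```python
import re

my_string = "2011, this year I made 750,000 dollars"

# One or more digits, then any number of "separator + digits" groups.
numbers = re.findall(r"[0-9]+(?:[,.][0-9]+)*", my_string)

print(numbers)  # ['2011', '750,000']
```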
Here is a single-regex, pure base R approach to extract integer or float values that are not part of a string of digits separated with a hyphen:
> str <- "2011, this year I made 750,000 dollars and 750,000-589 here"
> regmatches(str, gregexpr('(?<!\\d-)\\b\\d+(?:[,.]\\d+)?+(?!-)', str, perl=T))[[1]]
[1] "2011" "750,000"
See the IDEONE demo and a regex demo.
Since the regex contains lookarounds, you need to specify the perl=TRUE argument.
Pattern explanation:
(?<!\d-) - a negative lookbehind failing the match when a digit with a hyphen precedes the current location
\b\d+ - a word boundary (before the next digit, there cannot be a word char - letter, digit or _) followed by 1 or more digits
(?:[,.]\d+)?+ - a non-capturing group ((?:...)) matching 1 or 0 sequences of a comma or dot ([,.]) followed by 1 or more digits (and this sequence is matched possessively (see ?+) so that the regex engine does not backtrack into it and retry the hyphen check right after \b\d+)
(?!-) - a negative lookahead that fails the match if there is a hyphen after the digits detected.
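A rough Python counterpart of this idea is sketched below. Python's re only gained possessive quantifiers in version 3.11, so this variant drops ?+ and instead adds two negative lookaheads, (?![\d-]) and (?![,.]\d), which stop the match from ending in the middle of a number. That substitution and the name NUMBER_RE are my own assumptions, not the answer's pattern.

```python
import re

text = "2011, this year I made 750,000 dollars and 750,000-589 here"

# (?<!\d-)     not preceded by "digit + hyphen"
# \b\d+        a whole run of digits starting at a word boundary
# (?:[,.]\d+)? optionally one separator followed by digits
# (?![\d-])    not followed by another digit or a hyphen
# (?![,.]\d)   not followed by a separator that starts more digits
NUMBER_RE = re.compile(r"(?<!\d-)\b\d+(?:[,.]\d+)?(?![\d-])(?![,.]\d)")

numbers = NUMBER_RE.findall(text)
print(numbers)  # ['2011', '750,000']
```

The extra lookaheads also keep \d+ from backtracking to a shorter run of digits, so nothing from 750,000-589 is returned.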