For example, for this string I want to match all A and a:
"All the apples make good cake."
Here's what I did: /(.)[^.]*\1*/ig
I started by getting the first character in the group, which can be any character: (.) Then I added [^.]* because I don't want to match any other character that isn't the first one. Finally I added \1* because I wanted to match the first character again. All other similar variations that I've tried don't seem to work.
The regex you are trying to build should capture the very first character and then consume everything up to the next occurrence of that same character, using a negative lookahead (a tempered dot):
(?i)(\w)(?:(?!\1).)*
Capturing group 1 holds the character you need.
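As a rough illustration, here is how that pattern behaves in Python's re module. Python is my choice here, since the question does not name a language, and the variable names are made up for the example. Note that it only reports the A/a occurrences because the first word character of the sample happens to be A.

```python
import re

text = "All the apples make good cake."

# Each match starts at an occurrence of the captured character and runs
# up to (but not including) its next occurrence, ignoring case.
pattern = re.compile(r"(\w)(?:(?!\1).)*", re.IGNORECASE)

# With a single capturing group, findall returns the group 1 values.
first_char_hits = pattern.findall(text)
print(first_char_hits)  # ['A', 'a', 'a', 'a']
```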
If your regex engine supports the \K match-reset token, you can append it to the regex above to match only the desired part:
(?i)(\w)(?:(?!\1).)*\K
I just started learning regex and I'm trying to understand how it is possible to do the following:
If I have:
helmut_rankl:20Suzuki12
helmut1195:wasserfall1974
helmut1951:roller11
Get:
helmut_rankl:20Suzuki1
helmut1195:wasserfall197
helmut1951:roller1
I tried using .$ which actually matches the last character of a string, but it doesn't match the letters and numbers before it.
How do I get these results from the input?
You could match as much of the line as possible and assert, with a lookahead, that a single character remains to the right. This version requires the match itself to contain at least one character:
.+(?=.)
If you also want to allow an empty match (for a line that contains only a single character):
.*(?=.)
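A small sketch of both lookahead variants, assuming Python's re module (the question does not say which tool is used) and using the sample lines from the question:

```python
import re

lines = [
    "helmut_rankl:20Suzuki12",
    "helmut1195:wasserfall1974",
    "helmut1951:roller11",
]

# .+ is greedy, but the lookahead (?=.) forces it to leave one character unmatched.
trimmed = [re.search(r".+(?=.)", line).group(0) for line in lines]
print(trimmed)

# On a one-character line, .+(?=.) finds nothing, while .*(?=.) gives an empty match.
one_char_plus = re.search(r".+(?=.)", "x")
one_char_star = re.search(r".*(?=.)", "x")
```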
This will do what you want with regex's match function.
^(.*).$
Broken down:
^ matches the start of the string
( and ) denote a capturing group. The matches which fall within it are returned.
.* matches everything, as much as it can.
The final . matches any single character (i.e. the last character of the line)
$ matches the end of the line/input
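The breakdown above could be applied like this in Python, assuming the re module; drop_last_char is a hypothetical helper name, not something from the answer:

```python
import re

def drop_last_char(line: str) -> str:
    """Return the line without its final character (hypothetical helper)."""
    m = re.match(r"^(.*).$", line)
    # An empty line has no last character, so return it unchanged.
    return m.group(1) if m else line

print(drop_last_char("helmut1951:roller11"))  # helmut1951:roller1
```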
I have a string and would like to match a part of it.
The string is Accept: multipart/mixedPrivacy: nonePAI: <sip:4168755400#1.1.1.238>From: <sip:4168755400#1.1.1.238>;tag=5430960946837208_c1b08.2.3.1602135087396.0_1237422_3895152To: <sip:4168755400#1.1.1.238>
I want to match PAI: <sip:4168755400#
The whitespace can be a word, so I would like to use .* but if I use that it matches most of the string.
The example on that link is showing what i'm matching if i use the whitespace instead of .*
(PAI: <sip:)((?:\([2-9]\d{2}\)\ ?|[2-9]\d{2}(?:\-?|\ ?))[2-9]\d{2}[- ]?\d{4})#
The example on that link is showing what i'm trying to achieve with .* but it should only match PAI: <sip:4168755400#
(PAI:.*<sip:)((?:\([2-9]\d{2}\)\ ?|[2-9]\d{2}(?:\-?|\ ?))[2-9]\d{2}[- ]?\d{4})#
I tried lookarounds but failed.
Any idea?
thanks
Instead of matching a single space, you can use a character class that matches either a space or a word character, repeated 1 or more times so that at least one occurrence is required.
Note that you don't have to escape the spaces, and in both places you can use an optional character class matching either a space or a hyphen: [ -]?
If you only want the match, you can omit the 2 capturing groups.
(PAI:[ \w]+<sip:)((?:\([2-9]\d{2}\) ?|[2-9]\d{2}[ -]?)[2-9]\d{2}[- ]?\d{4})#
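To check the pattern against the string from the question, a minimal sketch in Python (an assumption, since the question does not name a language) might look like this. The pattern is split over two string literals only for readability.

```python
import re

header = (
    "Accept: multipart/mixedPrivacy: nonePAI: <sip:4168755400#1.1.1.238>"
    "From: <sip:4168755400#1.1.1.238>;tag=5430960946837208_c1b08.2.3.1602135087396.0_1237422_3895152"
    "To: <sip:4168755400#1.1.1.238>"
)

pattern = re.compile(
    r"(PAI:[ \w]+<sip:)"
    r"((?:\([2-9]\d{2}\) ?|[2-9]\d{2}[ -]?)[2-9]\d{2}[- ]?\d{4})#"
)

m = pattern.search(header)
print(m.group(0))  # PAI: <sip:4168755400#
print(m.group(2))  # 4168755400
```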
The regex should look like this:
PAI:.*?(<sip:.*?#)
Explanation:
PAI:.*? finds the literal text PAI:, after which anything may follow (.*); the ? makes the quantifier lazy, so it matches as few characters as possible before the next part of the expression is found.
(<sip:.*?#) is the capturing group that holds the result we want.
<sip:.*?# finds <sip: followed by as few characters as possible (.*?) up to the first #.
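The difference the lazy ? makes can be seen in a short Python sketch (Python is my assumption; the greedy variant is added only for comparison and is not part of the answer):

```python
import re

header = (
    "Accept: multipart/mixedPrivacy: nonePAI: <sip:4168755400#1.1.1.238>"
    "From: <sip:4168755400#1.1.1.238>;tag=5430960946837208_c1b08.2.3.1602135087396.0_1237422_3895152"
    "To: <sip:4168755400#1.1.1.238>"
)

lazy = re.search(r"PAI:.*?(<sip:.*?#)", header)
greedy = re.search(r"PAI:.*(<sip:.*#)", header)  # comparison only

print(lazy.group(0))  # PAI: <sip:4168755400#
print(lazy.group(1))  # <sip:4168755400#

# The greedy version runs on to the last <sip: ... # in the string.
print(len(greedy.group(0)) > len(lazy.group(0)))  # True
```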
I am looking for a RegExp to find the last digit in each line and replace it with the same digit followed by a full stop in Memsource, but my attempts do not seem to work properly.
Example:
Pic. 12-1
Pic. 12-2
Pic. 12-3
To:
Pic. 12-1.
Pic. 12-2.
Pic. 12-3.
I've chosen them by \d$ but when I try to replace them with a \., .$ etc. it seems not to be working properly. Any advice would be much appreciated. Thanks!
As @WiktorStribiżew stated in the comments, you can use (\d)$ as the pattern to match and \1. as the string to replace it with. A quick breakdown of how this works is:
(\d) Matches any digit and captures it in group 1
$ Matches the end of the line
\1. Replaces the matched string with the first capture group followed by a period
Resulting in (\d)$ -> \1.
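For illustration only, the same substitution in Python's re module is sketched below. This is a stand-in: Memsource's own replacement syntax may differ (for example in how it writes the group reference), so treat the details as assumptions.

```python
import re

text = "Pic. 12-1\nPic. 12-2\nPic. 12-3"

# re.MULTILINE makes $ match at the end of every line, not just the end of the text.
result = re.sub(r"(\d)$", r"\1.", text, flags=re.MULTILINE)
print(result)
```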
However, is it necessary to even match the digit? Would the following substitution suffice $ -> .? This would simply add the . to the end of each line. The only issue would be that it would not discriminate whether or not the line ends with a digit.
If the line must end with a digit to receive a period, you can also avoid capture groups by using a positive look-behind. In this case the pattern to match would be (?<=\d)$ and the replacement would be a single period (.).
(?<=\d) Is a positive look-behind that checks whether there is a digit immediately before the current position, without consuming any characters.
(?<=\d)$ Checks that the character at the end of the line is a digit without consuming that character (i.e. that character will not be replaced).
The resulting replacement would then be (?<=\d)$ -> . which adds a period to each line that ends with a digit without the need for capture groups.
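A quick comparison of the plain $ substitution and the look-behind version, again sketched in Python as a stand-in for Memsource (the third sample line is invented to show a line that does not end in a digit):

```python
import re

text = "Pic. 12-1\nPic. 12-2\nno digit here"

# Plain $ adds a period to every line, whatever it ends with.
every_line = re.sub(r"$", ".", text, flags=re.MULTILINE)

# The look-behind restricts the (empty) match to line ends preceded by a digit.
digits_only = re.sub(r"(?<=\d)$", ".", text, flags=re.MULTILINE)

print(every_line)
print(digits_only)
```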
I need to create regex to find last underscore in string like 012344_2.0224.71_3 or 012354_5.00123.AR_3.335_8
I wanted to find the last part with the expression [^.]+$ and then find the underscore within that part, but I cannot get it to work.
I hope you can help me :)
Just use a negated character class [^_], which matches any character except an underscore (this ensures no other underscores follow), together with the end-of-string anchor $.
Pattern would look as such:
(_)[^_]*$
The final underscore _ is in a capturing group, so it is available as a submatch. You would then replace group 1 (your underscore).
See it live: Regex101
Notice the green highlighted portion on Regex101, this is your submatch and is what would be replaced.
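Replacing only the captured underscore could be sketched as follows in Python. The helper name replace_last_underscore and the "-" replacement are assumptions made for the example.

```python
import re

def replace_last_underscore(s: str, repl: str) -> str:
    """Replace only the last underscore in s (hypothetical helper)."""
    m = re.search(r"(_)[^_]*$", s)
    if m is None:
        return s  # no underscore at all
    # m.start(1) and m.end(1) delimit the captured underscore (group 1).
    return s[:m.start(1)] + repl + s[m.end(1):]

print(replace_last_underscore("012344_2.0224.71_3", "-"))
print(replace_last_underscore("012354_5.00123.AR_3.335_8", "-"))
```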
The simplest solution I can imagine is using .*\K_, however not all regex flavours support \K.
If not, another idea would be to use _(?=[^_]*$)
Explanation:
.*\K_: Matches any characters up to an underscore. Since the * quantifier is greedy, it will match up to the last underscore. Then \K discards what has been matched so far, and only the underscore remains in the match.
_(?=[^_]*$): Matches an underscore that is followed only by non-underscore characters until the end of the line.
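Python's built-in re module is one of the flavours without \K, so only the lookahead option is sketched here (the "-" replacement is just an example value):

```python
import re

s = "012354_5.00123.AR_3.335_8"

# The lookahead checks that only non-underscore characters remain up to the
# end of the string, so only the last underscore is matched.
last = re.search(r"_(?=[^_]*$)", s)
print(last.start() == s.rfind("_"))  # True

replaced = re.sub(r"_(?=[^_]*$)", "-", s)
print(replaced)
```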
If you want nothing but the "net" (i.e., nothing matched except the last underscore), use positive lookahead to check that no more underscores are in the string:
/_(?=[^_]*$)/gm
The pattern [^.]+$ matches any character except a dot 1+ times and then asserts the end of the string. This will give you the matches 71_3 and 335_8
What you want to match is an underscore when there are no more underscores following.
One way to do that is with a negative lookahead (?!.*_), if it is supported, which asserts that no underscore occurs anywhere to the right of the current position.
_(?!.*_)
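Both observations can be checked with a few lines of Python (assumed here as the test language), using the two sample strings from the question:

```python
import re

samples = ["012344_2.0224.71_3", "012354_5.00123.AR_3.335_8"]

# What the original attempt [^.]+$ actually matches.
tails = [re.search(r"[^.]+$", s).group(0) for s in samples]
print(tails)  # ['71_3', '335_8']

# The negative lookahead rejects any underscore that has another one after it.
last_positions = [re.search(r"_(?!.*_)", s).start() for s in samples]
print(last_positions == [s.rfind("_") for s in samples])  # True
```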
I am quite stuck with a regex I can't get to work. It should capture everything except digits and the word fiktiv (not single characters of it!). Objective is to get rid of this content.
I have tried something like (?!\d|fiktiv).* on my sample string 123456788daswqrt fiktiv
https://regex101.com/r/kU8mF3/1
However this does match the fiktiv at the end as well.
One possibility would be to use a negated character class, which is written by putting a ^ at the start of [] brackets. So you basically say: skip the digits, then match as many non-digits as you can until whitespace and the word fiktiv appear.
This capturing will be "saved" in the capturing group 1 for later use.
([^\d]+)\s+fiktiv
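Run against the sample string from the question, a minimal Python sketch (my choice of language) shows what ends up in group 1:

```python
import re

sample = "123456788daswqrt fiktiv"

m = re.search(r"([^\d]+)\s+fiktiv", sample)
captured = m.group(1)
print(captured)  # daswqrt
```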
Testing could be done here:
https://regex101.com/
It should capture everything except digits and the word fiktiv (not single characters of it!). Objective is to get rid of this content.
So, you want to remove any character that is not a digit (that is, \D or [^0-9] pattern) and not a fiktiv char sequence.
You may use a regex with a capturing group and alternation:
(fiktiv)|[^0-9]
and replace with the contents of Group 1 using a $1 backreference, which restores fiktiv in the result whenever it was matched (and is empty otherwise).
See the regex demo
C# implementation:
Regex.Replace(input, "(fiktiv)|[^0-9]", "$1")
Also, see Use RegEx in SQL with CLR Procs.
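For comparison with the C# call above, a rough Python equivalent might look like this. The function name keep_digits_and_fiktiv is made up, and a replacement function is used instead of a $1-style template so that the unmatched-group case is handled explicitly.

```python
import re

def keep_digits_and_fiktiv(text: str) -> str:
    """Remove everything except digits and the word fiktiv (hypothetical helper)."""
    # If the first alternative matched, put "fiktiv" back; otherwise the match
    # was a single non-digit character, which is dropped.
    return re.sub(r"(fiktiv)|[^0-9]", lambda m: m.group(1) or "", text)

print(keep_digits_and_fiktiv("123456788daswqrt fiktiv"))  # 123456788fiktiv
```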