I want to match ___ except within {}:
https://regex101.com/r/PYRWIA/1
I don't understand why it still matches, though, with
/___\s*\n*(?!})/
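First of all, there is no need to use \n in your regex, since \s also matches line breaks.
The second issue is the use of * (0 or more occurrences): because \s* may match zero characters, the regex engine can backtrack and satisfy the negative lookahead right after the last underscore, since the next character there is a line break, not }.
You can use either of these two patterns (note that the second one requires at least one whitespace character after ___, and it can still backtrack into a match when more than one whitespace character precedes the }):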
___(?!\s*})
___\s+(?!})
Updated RegEx Demo
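To make the backtracking issue concrete, here is a minimal Python sketch. The sample strings are made up for illustration (they are not the text from the linked demo), and the variable names are hypothetical.

```python
import re

# Hypothetical sample: the first ___ is outside braces, the second is inside.
sample = "___ title\n{ ___\n}\n"

original = re.compile(r"___\s*\n*(?!})")   # pattern from the question
fixed = re.compile(r"___(?!\s*})")         # whitespace moved inside the lookahead

# \s* and \n* can both give back the line break, so the lookahead is
# evaluated in front of "\n" instead of "}" and the inner ___ still matches.
original_starts = [m.start() for m in original.finditer(sample)]

# With \s* inside the lookahead there is nothing to backtrack out of.
fixed_starts = [m.start() for m in fixed.finditer(sample)]

# The \s+ variant still backtracks when two whitespace characters
# sit between ___ and }.
plus_variant_hit = re.search(r"___\s+(?!})", "{ ___ \n}") is not None
fixed_variant_hit = fixed.search("{ ___ \n}") is not None

print(original_starts, fixed_starts, plus_variant_hit, fixed_variant_hit)
```

Of the two suggestions, ___(?!\s*}) is the safer one in this sketch, because the optional whitespace is consumed inside the lookahead itself.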
Related
I have 2 variants of strings:
some_prefix.needed part*some_suffix
some_prefix.needed part
I need only 'needed part' to be matched.
Left boundary is always dot.
Right boundary is asterisk (if exists) or end of line.
Already tried:
/.*[.](.*)[*].*/ - is working for first case
/.*[.](.*)/ - is working for second case
How to do the same with one regex?
You can use
/\.([^*]+)/
See the regex demo.
Details
\. - a dot
([^*]+) - Group 1: any one or more chars other than a *.
You can also make sure you get the rightmost match by using .* before the pattern (as in the original regex):
/.*\.([^*]+)/
If supported, you might also use a lookbehind to assert a . to the left.
(?<=\.)[^*]+
The pattern matches:
(?<=\.) Positive lookbehind, assert . directly to the left
[^*]+ Match 1+ times any char except * using a negated character class
Regex demo
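As a quick check of both variants, here is a small Python sketch run against the two sample strings from the question. The helper names are hypothetical.

```python
import re

CAPTURE = re.compile(r"\.([^*]+)")       # group 1 holds the needed part
LOOKBEHIND = re.compile(r"(?<=\.)[^*]+")  # whole match is the needed part


def needed_part(s):
    """Return the text between the first dot and the next * (or the end)."""
    m = CAPTURE.search(s)
    return m.group(1) if m else None


def needed_part_lookbehind(s):
    """Same result, using a lookbehind instead of a capturing group."""
    m = LOOKBEHIND.search(s)
    return m.group(0) if m else None


samples = ["some_prefix.needed part*some_suffix", "some_prefix.needed part"]
results = [needed_part(s) for s in samples]
results_lb = [needed_part_lookbehind(s) for s in samples]
print(results, results_lb)
```

Note that [^*]+ also matches further dots, so if the needed part itself may contain a dot, the match starts after the first one.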
I have a regex
[a-zA-Z][a-z]
I have to change this regex so that it does not accept strings that start with "de", "DE", "dE" or "De". I cannot use lookbehind or lookahead because my system does not support them.
There's a solution without a lookahead or lookbehind, but you need to be able to use groups.
The idea there is to create a sort of "honeypot" that will match your negative results and keep only the results that do interest you.
In your case, that would write:
[dD][eE].*|(<your-regex>)
If the proposition is de<anything> (case insensitive here), it will match, but group(1) will be null.
On the other hand, di for instance would not match the part before the | and would therefore fall into group(1).
Finally, if the proposition doesn't start with de and doesn't match your regex either, there will be no match and therefore no groups to get at all.
If you need to be sure that your proposition will match the whole provided string, you can update the regex thus:
^(?:[dD][eE].*|(<your-regex>))$
Note that ?: is not a lookahead of any kind; it marks the group as non-capturing, so that <your-regex> is still captured by group(1). Otherwise it would become group(2), and capturing a group is not always free, performance-wise.
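A minimal Python sketch of the "honeypot" idea, assuming <your-regex> is the [a-zA-Z][a-z] from the question and that the whole string must match. The function name is hypothetical.

```python
import re

# First alternative is the honeypot; only the second one captures group 1.
HONEYPOT = re.compile(r"^(?:[dD][eE].*|([a-zA-Z][a-z]))$")


def accepted(s):
    """True only when the string matched through the capturing alternative."""
    m = HONEYPOT.match(s)
    return m is not None and m.group(1) is not None


checks = {s: accepted(s) for s in ["ab", "di", "de", "DE", "Delta", "1x"]}
print(checks)
```

The caller has to test group 1 rather than just "did it match", since strings starting with de do match, only through the non-capturing honeypot branch.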
Simply ignore those characters:
[a-ce-z][a-df-z][a-gi-kwxyzWZXZ]
Make sure the flag is set to case insensitive. Also, [a-gi-kwxyzWZXZ] can then be modified to [a-gi-kwxyz].
EDIT:
As pointed out in this comment, the regex here won't support other words that start with d but are not followed by e. In this case, negative lookahead is a possible solution:
^(?!de)[a-z]+
This matches anything not starting with "DE" (case insensitive, without lookarounds, allowing leading whitespace):
^ *+(?:[^Dd].|.[^Ee])<your regex for rest of input>
See live demo.
The possessive quantifier *+ used for whitespace prevents [^Dd] from being allowed to match a space via backtracking, making this regex hardened against leading spaces.
You can use an alternation excluding matching the d and D from the first character, or exclude matching the e as the second character.
Note that the pattern [a-zA-Z][a-z] matches at least 2 characters, so will the following pattern:
^(?:[abce-zABCE-Z][a-z]|[a-zA-Z][a-df-z]).*
^ Start of string
(?: Non capture group
[abce-zABCE-Z][a-z] Match a char a-zA-Z without d and D followed by a lowercase char a-z
| or
[a-zA-Z][a-df-z] Match a char a-zA-Z followed by a lowercase char a-z other than e
) Close non capture group
.* Match 0+ times any char except a newline
Regex demo
Another option is to use word boundaries \b instead of an anchor ^
\b(?:[abce-zABCE-Z][a-z]|[a-zA-Z][a-df-z])[a-zA-Z]*\b
Regex demo
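The anchored alternation can be checked with a short Python sketch. The test words are made up for illustration.

```python
import re

# Either the first char is not d/D, or the second char is not e.
NOT_DE = re.compile(r"^(?:[abce-zABCE-Z][a-z]|[a-zA-Z][a-df-z]).*")

words = ["de", "De", "dE", "DE", "do", "Dx", "ab", "eel", "dZ"]
matched = [w for w in words if NOT_DE.match(w)]
print(matched)
```

"dZ" is rejected as well, but for a different reason: like the original [a-zA-Z][a-z], the pattern requires a lowercase second character.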
I need to create regex to find last underscore in string like 012344_2.0224.71_3 or 012354_5.00123.AR_3.335_8
I wanted to find the last part with the expression [^.]+$ and then find the underscore in the matched element, but I cannot get it to work.
I hope you can help me :)
Just use a negated character class [^_], which matches any character except an underscore (this ensures that no other underscores follow), and the end-of-string anchor $.
Pattern would look as such:
(_)[^_]*$
The final underscore _ is in a capturing group, so it is returned as a submatch. You would then replace group 1 (your underscore).
See it live: Regex101
Notice the green highlighted portion on Regex101, this is your submatch and is what would be replaced.
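As a sketch of "replace group 1" outside Regex101, the following Python uses the span of the capturing group to swap out only the last underscore. The function name and the "-" replacement are assumptions for illustration.

```python
import re

LAST_UNDERSCORE = re.compile(r"(_)[^_]*$")


def replace_last_underscore(s, replacement):
    """Replace only the underscore captured in group 1, keep the rest."""
    m = LAST_UNDERSCORE.search(s)
    if m is None:
        return s  # no underscore at all
    start, end = m.span(1)
    return s[:start] + replacement + s[end:]


out1 = replace_last_underscore("012344_2.0224.71_3", "-")
out2 = replace_last_underscore("012354_5.00123.AR_3.335_8", "-")
print(out1, out2)
```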
The simplest solution I can imagine is using .*\K_, however not all regex flavours support \K.
If not, another idea would be to use _(?=[^_]*$)
You have a demo of the first and second option.
Explanation:
.*\K_: Matches any characters up to an underscore. Since the * quantifier is greedy, it matches up to the last underscore. Then \K discards what has been matched so far, so only the underscore is returned as the match.
_(?=[^_]*$): Matches an underscore followed only by non-underscore characters until the end of the line
If you want nothing but the "net" (i.e., nothing matched except the last underscore), use positive lookahead to check that no more underscores are in the string:
/_(?=[^_]*$)/gm
Demo
The pattern [^.]+$ matches any char except a dot 1+ times and then asserts the end of the string. This will give you the matches 71_3 and 335_8.
What you want to match is an underscore when there are no more underscores following.
One way to do that is using a negative lookahead (?!.*_) if that is supported which asserts what is at the right does not match any character followed by an underscore
_(?!.*_)
Pattern demo
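Both lookahead variants can be compared in Python, whose built-in re module does not support \K. This is a minimal sketch on the two sample strings from the question; str.rindex is used only as an independent cross-check.

```python
import re

POSITIVE = re.compile(r"_(?=[^_]*$)")  # only non-underscores until the end
NEGATIVE = re.compile(r"_(?!.*_)")     # no further underscore on this line

samples = ["012344_2.0224.71_3", "012354_5.00123.AR_3.335_8"]

positions = []
for s in samples:
    pos = POSITIVE.search(s).start()
    neg = NEGATIVE.search(s).start()
    # Collect both indexes next to the index of the real last underscore.
    positions.append((pos, neg, s.rindex("_")))

print(positions)
```

On single-line input the two patterns agree; on multi-line input they can differ, because . does not cross line breaks while [^_] does.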
Regexp problem. I'd like to have the first four strings below matching. Output should be the 3 characters between _ and . only.
Therefore these will match:
_20101_Bp16tt20_KG2.asc
_201_Bondp0_KGB.ASC
_2011_rndiep16tt20_232.AsC
_20101_odiep16tt20_ab3.ASC
and should return respectively KG2, KGB, 232, ab3.
And these will not match:
_2_ordep16tt.asc
__Bndt20_pippo_K.asc
I am able to select the whole block _KG2.asc, by doing ((?<=_)(...)(\.(?i)(asc))). However, I just want KG2. I think I should apply a positive lookbehind, but my tries all failed. Could you help me?
You could make use of \K and a positive lookahead:
_\K[A-Za-z0-9]{3}(?=\.(?i)asc$)
Regex demo
That would match
_ Match literally
\K Forget previous match
[A-Za-z0-9]{3} Match 3 times an upper/lower case character or a digit (Replace with a dot if you want to match any character)
(?=\.(?i)asc$) Positive lookahead to assert that what follows is a dot and asc in lower or uppercase and assert the end of the string
Use a lookahead as well
((?<=_)(...)(?=\.(?i)(asc)))
See https://regexr.com/40jfa
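For comparison, here is a Python sketch of the lookbehind-plus-lookahead approach on all six file names from the question. Python's re has no \K, and recent Python versions reject an inline (?i) in the middle of a pattern, so this sketch assumes the scoped form (?i:asc) instead. The function name is hypothetical.

```python
import re

# Three alphanumerics, preceded by _ and followed by a case-insensitive .asc
CODE = re.compile(r"(?<=_)[A-Za-z0-9]{3}(?=\.(?i:asc)$)")

names = [
    "_20101_Bp16tt20_KG2.asc",
    "_201_Bondp0_KGB.ASC",
    "_2011_rndiep16tt20_232.AsC",
    "_20101_odiep16tt20_ab3.ASC",
    "_2_ordep16tt.asc",
    "__Bndt20_pippo_K.asc",
]


def extract_code(name):
    """Return the 3-character code, or None when the name does not qualify."""
    m = CODE.search(name)
    return m.group(0) if m else None


codes = [extract_code(n) for n in names]
print(codes)
```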
Maybe this expression will help you:
'_201_Bondp0_KGB.ASC'.match(/(?<=_)(...)(?=\.)/g)
I would like to parse some syslog lines that they look like
Oct 20 16:34:59 artguard TTN-xxxxxxxxxxxxxxxxxxxxxxxxxxxxx
I would like to turn them into
TTN-xxxxxxxxxxxxxxxxxxxxxxxxxxxxx
So I was wondering what the regular expression should look like to do this, since the first part changes every day, because it is prepended by syslog.
EDIT: to avoid duplicates, I am trying to use regex with Filebeat, where not all regex features are supported, as explained here
Regex101
(TTN-.*$)
Debuggex Demo
Explained
1st Capturing Group (TTN-.*$)
TTN- matches the characters TTN- literally (case sensitive)
.* matches any character (except for line terminators)
* Quantifier — Matches between zero and unlimited times, as many times as possible, giving back as needed (greedy)
$ asserts position at the end of a line
Global pattern flags
g modifier: global. All matches (don't return after first match)
m modifier: multi line. Causes ^ and $ to match the begin/end of each line (not only begin/end of string)
The regular expression TTN-\S* is probably a way of doing what you're looking for; here it is in a JavaScript example.
var value = "Oct 20 16:34:59 artguard TTN-xxxxxxxxxxxxxxxxxxxxxxxxxxxxx";
var matches = value.match(
new RegExp("TTN-\\S*", "gi")
);
document.writeln(matches);
It works in two main parts:
The TTN- matches TTN- (obviously)
The \S* matches any character that is not a white-space, this is done as many times as possible.
Currently it always expects at least a '-' after TTN, but if you replace the '-' with '-?' (or '-{0,1}') in the regex, it will expect TTN, optionally a dash, followed by 0-n characters that are not white-space. You could also replace \S* with \w* to get only letters, digits and underscores, or with .* to get all characters up to the end-of-line \n character. To end the match before two consecutive white-space characters, note that [^\s{2}] would not work, since it is a character class; a pattern such as TTN-(?:\S|\s\S)* could be used instead. Hope this was helpful.
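The same extraction can be sketched in Python on the syslog line from the question. The function name is hypothetical, and whether Filebeat accepts a given pattern still depends on its own regex support.

```python
import re

TTN = re.compile(r"TTN-\S*")  # TTN- followed by a run of non-whitespace


def extract_ttn(line):
    """Return the TTN-... token from a syslog line, or None if absent."""
    m = TTN.search(line)
    return m.group(0) if m else None


line = "Oct 20 16:34:59 artguard TTN-xxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
token = extract_ttn(line)
print(token)
```

Because the pattern is not anchored to the timestamp or host name, the changing prefix added by syslog does not need to be described at all.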