I am looking for a regex which does the following:
//SPECIAL_WORD some text -> Should match
//SPECIAL_WORD (123456) -> Should match
//SPECIAL_WORD 123456 -> Should NOT match
=> Basically, anything other than 'SPECIAL_WORD blank 6 digits' should match if SPECIAL_WORD is found.
I found how to match the positive case: SPECIAL_WORD\s\d{6}
I tried a negative lookahead but didn't get it to work: (?!SPECIAL_WORD\s\d{6}). I also tried to negate the whole thing with \b(?=\w)(?!SPECIAL_WORD\s\d{6})\b(\w*), but then everything else is matched...
Any ideas?
You should match SPECIAL_WORD then go for a negative lookahead:
\bSPECIAL_WORD\s(?!\d{6}\b)
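The \b anchors are word boundaries. The first ensures SPECIAL_WORD is not the tail of a longer word; the one inside the lookahead ensures the six digits are not followed by further word characters, so a 7-digit number is still allowed. You may not need them.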
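As a quick check, the pattern can be run against the three sample lines from the question. This is a minimal sketch using Python's re module (the question does not name an engine, so Python is an assumption); the helper name is_match is hypothetical.

```python
import re

# Pattern from the answer: SPECIAL_WORD, one whitespace character, then a
# negative lookahead that rejects exactly six digits ending at a word boundary.
PATTERN = re.compile(r"\bSPECIAL_WORD\s(?!\d{6}\b)")

def is_match(line: str) -> bool:
    """Hypothetical helper: True if SPECIAL_WORD is not followed by a 6-digit number."""
    return PATTERN.search(line) is not None

samples = [
    "//SPECIAL_WORD some text",
    "//SPECIAL_WORD (123456)",
    "//SPECIAL_WORD 123456",
]
results = [is_match(s) for s in samples]
print(results)  # [True, True, False]
```

Because of the \b inside the lookahead, a line such as //SPECIAL_WORD 1234567 still matches; drop that \b if any run starting with six digits should be rejected.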
New to RegEx, PCRE(PHP), have a basic question:
Text String I'm working with is below, text is literal
us%3Aks%2Cus%3Aal%2Cus%3Aok%2Cus%3Aia%2Cus%3Ala%2Cus%3Asc%2Cus%3Aut%2Cus%3Act%2Cus%3Aor%2Cus%3Atn%2Cus%3Amo%2Cus%3Aaz%2Cus%3Ain%2Cus%3Amd%2Cus%3Aco%2Cus%3Awi%2Cus%3Awa
The goal for the first is to get everything up to and including the first %2C -> "us%3Aks%2C"
The goal for the last is to get the last %2C and everything after it -> "%2Cus%3Awa"
What am I doing wrong with my attempts?
1. ^(.+%2C)
2. (%2C.+)$
You may use this regex with a lazy match and a greedy match:
^(.*?%2C).+(%2C.*)$
RegEx Details:
^: Start
(.*?%2C): Match 0 or more characters followed by %2C (lazy match) in group #1
.+: Match 1 or more of any characters (greedy match)
(%2C.*): Match %2C followed by 0 or more characters in group #2
$: End
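The breakdown above can be tried out as follows. This is a minimal sketch in Python's re module rather than PCRE/PHP (an assumption; the pattern itself is unchanged), run on a shortened version of the question's string.

```python
import re

# Shortened version of the string from the question.
text = "us%3Aks%2Cus%3Aal%2Cus%3Awa"

# Group 1: lazy match up to and including the first %2C.
# Group 2: the last %2C and everything after it.
m = re.match(r"^(.*?%2C).+(%2C.*)$", text)
first, last = m.group(1), m.group(2)

print(first)  # us%3Aks%2C
print(last)   # %2Cus%3Awa
```

Note that this single pattern needs two distinct %2C separators with at least one character between them; on a string containing only one %2C it does not match at all.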
It's a matter of greediness, which controls how many characters the expression will gobble before being satisfied. So, instead of using .+, you could use .*?.
For your case (1), the expression becomes:
1. ^(.*?%2C)
For your second case, unfortunately, lazy matching alone will not help. Instead, we have to skip most of the string in advance with a greedy .+, so the second expression becomes something like:
2. .+(%2C.+)$
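To see the effect of greediness side by side, here is a small sketch comparing the two original attempts with the two corrected expressions. It uses Python's re module and a shortened sample string, both of which are assumptions made for illustration.

```python
import re

text = "us%3Aks%2Cus%3Aal%2Cus%3Awa"  # shortened sample

# Original attempt 1: greedy .+ runs all the way to the LAST %2C.
greedy_first = re.search(r"^(.+%2C)", text).group(1)
# Lazy version stops at the FIRST %2C.
lazy_first = re.search(r"^(.*?%2C)", text).group(1)

# Original attempt 2: the leftmost possible match begins at the FIRST %2C.
naive_last = re.search(r"(%2C.+)$", text).group(1)
# A leading greedy .+ consumes as much as it can, leaving only the LAST %2C.
greedy_last = re.search(r".+(%2C.+)$", text).group(1)

print(greedy_first)  # us%3Aks%2Cus%3Aal%2C
print(lazy_first)    # us%3Aks%2C
print(naive_last)    # %2Cus%3Aal%2Cus%3Awa
print(greedy_last)   # %2Cus%3Awa
```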
Regexp problem. I'd like to have the first four strings below matching. Output should be the 3 characters between _ and . only.
Therefore these will match:
_20101_Bp16tt20_KG2.asc
_201_Bondp0_KGB.ASC
_2011_rndiep16tt20_232.AsC
_20101_odiep16tt20_ab3.ASC
and should return respectively KG2, KGB, 232, ab3.
And these will not match:
_2_ordep16tt.asc
__Bndt20_pippo_K.asc
I am able to select the whole block _KG2.asc, by doing ((?<=_)(...)(\.(?i)(asc))). However, I just want KG2. I think I should apply a positive lookbehind, but my tries all failed. Could you help me?
You could make use of \K and a positive lookahead:
_\K[A-Za-z0-9]{3}(?=\.(?i)asc$)
That would match
_ Match literally
\K Forget what has been matched so far (it is dropped from the reported match)
[A-Za-z0-9]{3} Match 3 times an upper/lower case character or a digit (Replace with a dot if you want to match any character)
(?=\.(?i)asc$) Positive lookahead to assert that what follows is a dot and asc in lower or uppercase and assert the end of the string
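Not every engine supports \K. Python's built-in re module, for example, does not, and it also rejects an unscoped (?i) in the middle of a pattern. A hedged equivalent there is a capture group plus a scoped (?i:...) flag; the helper name extract_code below is hypothetical.

```python
import re

# The capture group stands in for \K: the underscore is matched but only
# group 1 is returned. (?i:asc) limits case-insensitivity to the extension.
PATTERN = re.compile(r"_([A-Za-z0-9]{3})(?=\.(?i:asc)$)")

def extract_code(filename):
    """Hypothetical helper: return the 3 characters between _ and .asc, or None."""
    m = PATTERN.search(filename)
    return m.group(1) if m else None

should_match = [
    "_20101_Bp16tt20_KG2.asc",
    "_201_Bondp0_KGB.ASC",
    "_2011_rndiep16tt20_232.AsC",
    "_20101_odiep16tt20_ab3.ASC",
]
should_not_match = ["_2_ordep16tt.asc", "__Bndt20_pippo_K.asc"]

codes = [extract_code(name) for name in should_match]
rejected = [extract_code(name) for name in should_not_match]
print(codes)     # ['KG2', 'KGB', '232', 'ab3']
print(rejected)  # [None, None]
```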
Use a lookahead as well
((?<=_)(...)(?=\.(?i)(asc)))
See https://regexr.com/40jfa
Maybe this expression helps:
'_201_Bondp0_KGB.ASC'.match(/(?<=_)(...)(?=\.)/g)
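With lookarounds on both sides, the whole match is just the three characters, so no group is needed. The sketch below (Python's re, an assumption) contrasts a stricter variant that also checks the extension with a port of the JavaScript expression, which only requires a following dot. The .txt file name is a hypothetical example, not one from the question.

```python
import re

# Lookbehind for "_", three alphanumerics, lookahead for ".asc" at the end.
STRICT = re.compile(r"(?<=_)[A-Za-z0-9]{3}(?=\.asc$)", re.IGNORECASE)
# Port of the JavaScript expression: any three characters before a dot.
LOOSE = re.compile(r"(?<=_)...(?=\.)")

strict_asc = STRICT.findall("_201_Bondp0_KGB.ASC")
loose_asc = LOOSE.findall("_201_Bondp0_KGB.ASC")

# Hypothetical file with a different extension: only the loose pattern accepts it.
strict_txt = STRICT.findall("_201_Bondp0_KGB.txt")
loose_txt = LOOSE.findall("_201_Bondp0_KGB.txt")

print(strict_asc, loose_asc)  # ['KGB'] ['KGB']
print(strict_txt, loose_txt)  # [] ['KGB']
```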
Consider two different highlight matches in Vim:
Pattern 1.
syn match match1 /\$[^$ ]+\$/
Match $foo$, $bar$, $123$
Pattern 2.
syn match match2 /(\w+\|(\$[^\$]+\$)\@=)+__/
I want it to match foo$bar$__ but not $bar$
The problem is that Pattern 1 conflicts with Pattern 2.
I'm trying to use a positive lookahead to bypass Pattern 1 inside Pattern 2,
but the trailing __ (double underscores) breaks the behavior of the positive lookahead.
How do I solve this issue? Or am I doing something wrong?
Update:
Sorry for the bad explanation.
Pattern 1 matches any string surrounded by two dollar signs:
syn match match1 /\$[^$ ]\+\$/
-> $foo$, $bar$
Pattern 2 matches any string that ends with double underscores, but should exclude any part that is matched by Pattern 1:
syn match match2 /\(\w\+\|\(\$[^\$]\+\$\)\@=\)\+__/
-> hello__, world__
So the problem appears when I add a string that Pattern 1 matches:
hello$foo$__
In this case, I want hello and __ to be matched by Pattern 2 (as one continuous match)
while $foo$ is still matched by Pattern 1.
I don't think you understand what lookahead does. It looks like what you're trying to do is to match a string, but skip over parts of it:
foo$bar$___
^^^~~~~~^^^
... where the parts marked ^ form the match proper (discontinuous) and the parts marked ~ are skipped over.
This is not possible with a regex. A regex always matches a continuous piece of string.
What lookahead does is it lets you "peek ahead": It matches a sub-regex as usual, but does not move the current position within the string. Depending on where you put the lookahead, this lets you either check text beyond the end of the match or make sure the same string is matched by two regexes simultaneously (although the latter can also be done with \& in vim).
Example:
\%(foo\)\@=bar
This can never match. It requires the next three characters to be both foo and bar at the same time, which is impossible.
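The same behavior can be reproduced outside Vim. The following is a minimal sketch in Python's re module (chosen here only for illustration), showing the impossible pattern and the two legitimate uses of lookahead described above.

```python
import re

# Counterpart of the Vim example: lookahead for "foo", then "bar" at the
# same position. Both cannot hold at once, so this never matches.
impossible = re.search(r"(?=foo)bar", "foobar")

# Use 1: check text beyond the end of the match. "bar" must follow,
# but it is not part of the reported match.
peek = re.search(r"foo(?=bar)", "foobar")

# Use 2: apply two conditions to the same text: a word that contains
# a digit AND is exactly four word characters long.
both = re.search(r"\b(?=\w*\d)\w{4}\b", "abcd ab1d")

print(impossible)                 # None
print(peek.group(), peek.span())  # foo (0, 3)
print(both.group())               # ab1d
```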
I think what you're looking for is overlapping matches. Vim supports this directly:
syn match match1 /\$[^$ ]\+\$/
syn match match2 /\%(\w\|\$[^$ ]\+\$\)\+__/ contains=match1
Here we're saying matches of match2 can contain matches of match1. This gives you the highlighting you want.
Here is the string:
$$START$$ should be matching along with $$MIDDLE$$
$$NOTMATCH$$ this should NOT be matching
$$LAST$$ this should be matching
In the above paragraph, I need to build a regex which can match all the keywords ($$[a-zA-Z]$$) except $$NOTMATCH$$.
So far I have tried (?!\$\$NOTMATCH\$\$)(\$\$([^\$\$]+)\$\$), but it does not work properly and does not take the $$ at the end of the keyword into account.
Any suggestions are welcome.
Thanks in advance
I need to build a regex which can match all the Keywords ($$[a-zA-Z]$$) except $$NOTMATCH$$
You can use a negative lookahead in the middle, like this:
(?<!\$)\$\$(?!NOTMATCH)[^$\s]+\$\$(?!\$)
(?!NOTMATCH) is a negative lookahead that fails the match if NOTMATCH immediately follows the opening $$ (so it also rejects any keyword that merely starts with NOTMATCH).
(?<!\$) is a negative lookbehind to ensure we don't have a $ before our match.
(?!\$) is a negative lookahead to ensure we don't have a $ after our match.
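A short sketch of the pattern applied to the sample text from the question, using Python's re module (an assumption; the question does not state the engine). The STRICTER variant is a hypothetical tweak, not part of the original answer, that excludes only the exact keyword $$NOTMATCH$$.

```python
import re

text = (
    "$$START$$ should be matching along with $$MIDDLE$$\n"
    "$$NOTMATCH$$ this should NOT be matching\n"
    "$$LAST$$ this should be matching"
)

# Pattern from the answer: no $ before, $$, not NOTMATCH, keyword, $$, no $ after.
KEYWORD = re.compile(r"(?<!\$)\$\$(?!NOTMATCH)[^$\s]+\$\$(?!\$)")
keywords = KEYWORD.findall(text)
print(keywords)  # ['$$START$$', '$$MIDDLE$$', '$$LAST$$']

# Hypothetical variant: including the closing $$ in the lookahead rejects
# only the exact keyword, so $$NOTMATCHING$$ would still be accepted.
STRICTER = re.compile(r"(?<!\$)\$\$(?!NOTMATCH\$\$)[^$\s]+\$\$(?!\$)")
print(KEYWORD.findall("$$NOTMATCHING$$"))   # []
print(STRICTER.findall("$$NOTMATCHING$$"))  # ['$$NOTMATCHING$$']
```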
I need a RegEx that matches when the string is not 1234, not 6789, and not blank.
1234 -> not a match
6789 -> not a match
[blank] -> not a match
abc -> match
5431 -> match
The RegEx engine is the one bundled in the JDK 6, if that matters.
Thanks
Try using negative look aheads:
^(?!.*1234.*$)(?!.*6789.*$)(?!\s*$).+
This negative lookahead should work:
^(?!.*?\b(1234|6789)\b).+$
The word boundaries \b make sure that you don't disallow 11234, 67890, etc.
.+ makes sure that empty input is not matched.
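The lookahead approach can be checked against the cases listed in the question (using its values 1234 and 6789). This is a minimal sketch in Python's re module rather than java.util.regex, which is an assumption; the lookahead syntax used here is the same in both. The NARROW pattern is a hypothetical alternative, assuming the intent is to reject only strings that are exactly 1234 or 6789.

```python
import re

# Rejects any input that contains 1234 or 6789 as a whole word; .+ rejects "".
BROAD = re.compile(r"^(?!.*?\b(1234|6789)\b).+$")
# Hypothetical alternative: rejects only input that is exactly 1234 or 6789.
NARROW = re.compile(r"^(?!(?:1234|6789)$).+$")

cases = ["1234", "6789", "", "abc", "5431", "11234"]
broad_results = [BROAD.search(s) is not None for s in cases]
narrow_results = [NARROW.search(s) is not None for s in cases]
print(broad_results)   # [False, False, False, True, True, True]
print(narrow_results)  # [False, False, False, True, True, True]

# The two differ once the forbidden value is embedded in a longer string.
print(BROAD.search("x 1234") is not None)   # False
print(NARROW.search("x 1234") is not None)  # True
```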