Is it possible to do string negation in regular expressions? I need to match all strings that do not contain the string "..". I know you can use ^[^\.]*$ to match all strings that do not contain "." but I need to match more than one character. I know I could simply match a string containing ".." and then negate the return value of the match to achieve the same result but I just wondered if it was possible.
You can use negative lookaheads:
^(?!.*\.\.).*$
That causes the expression to not match if it can find a sequence of two periods anywhere in the string.
^(?:(?!\.\.).)*$
This also matches only if there are no two consecutive dots anywhere in the string; here the lookahead is checked before each character is consumed.
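To compare the two patterns above side by side, here is a minimal Java sketch. The class name NoDoubleDot, the helper hasNoDoubleDot and the sample strings are made up for illustration, and it assumes single-line input (the dot does not match line terminators by default).

```java
import java.util.regex.Pattern;

public class NoDoubleDot {
    // First answer: a single lookahead at the start scans ahead for "..".
    static final Pattern LOOKAHEAD_ONCE = Pattern.compile("^(?!.*\\.\\.).*$");
    // Second answer: the lookahead is re-checked before every consumed character.
    static final Pattern LOOKAHEAD_EACH = Pattern.compile("^(?:(?!\\.\\.).)*$");

    // Hypothetical helper: true when the whole string matches the pattern,
    // i.e. when the string does not contain "..".
    static boolean hasNoDoubleDot(Pattern p, String s) {
        return p.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"a.b.c", "a..b", "", "..", "end."};
        for (String s : samples) {
            System.out.println("\"" + s + "\" -> "
                    + hasNoDoubleDot(LOOKAHEAD_ONCE, s) + " / "
                    + hasNoDoubleDot(LOOKAHEAD_EACH, s));
        }
    }
}
```

Both patterns should agree on every input of this kind; the choice between them is mostly a matter of readability.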
I can very easily write a regular expression to match a string that contains 2 consecutive repeated characters:
/(\w)\1/
How do I do the complement of that? I want to match strings that don't have 2 consecutive repeated characters. I've tried variations of the following without success:
/(\w)[^\1]/ ;doesn't work as hoped
/(?!(\w)\1)/ ;looks ahead, but some portion of the string will match
/(\w)(?!\1)/ ;again, some portion of the string will match
I don't want any language/platform specific way to take the negation of a regular expression. I want the straightforward way to do this.
The regex below matches strings that don't contain any consecutive repeated characters.
^(?!.*(\w)\1).*
(?!.*(\w)\1) is a negative lookahead which asserts that the string to be matched contains no consecutive repeated characters. .*(\w)\1 matches a string that has a repeated character at the start, in the middle, or at the end. ^(?!.*(\w)\1) therefore matches at the start of a string only when no such repeat follows, and the trailing .* then matches all the characters on that line. Note that this also matches empty strings. If you don't want to match empty lines, change the final .* to .+
Note that ^(?!(\w)\1) checks for the repeated characters only at the start of a string or line.
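A small Java sketch of the difference between the two lookaheads, assuming single-line input. The class name NoRepeat, the helper names and the sample words are hypothetical and only serve the illustration.

```java
import java.util.regex.Pattern;

public class NoRepeat {
    // Rejects a string when a doubled word character occurs anywhere in it.
    static final Pattern ANYWHERE = Pattern.compile("^(?!.*(\\w)\\1).*");
    // Rejects a string only when it starts with a doubled word character.
    static final Pattern START_ONLY = Pattern.compile("^(?!(\\w)\\1).*");

    static boolean hasNoRepeat(String s) {
        return ANYWHERE.matcher(s).find();
    }

    static boolean hasNoRepeatAtStart(String s) {
        return START_ONLY.matcher(s).find();
    }

    public static void main(String[] args) {
        String[] samples = {"world", "hello", "aab", "baa", ""};
        for (String s : samples) {
            System.out.println("\"" + s + "\" -> anywhere: " + hasNoRepeat(s)
                    + ", start only: " + hasNoRepeatAtStart(s));
        }
    }
}
```

With these samples, "baa" is where the two patterns disagree: the start-only check accepts it, while the check with .* inside the lookahead rejects it.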
Lookahead and lookbehind, collectively called "lookaround", are zero-length assertions just like the start and end of line. They do not consume characters in the string, but only assert whether a match is possible or not. Lookaround allows you to create regular expressions that are impossible to create without them, or that would get very longwinded without them.
If, for instance, I have these words
john=14
adam=21
ben=11
john=18
johan=17
john=141
...
and the task is to find all occurrences of john=14.
I came up with the following regular expression: .*=[^14].*\n which matches every string without a leading 1 after the equal sign.
However, I want to exactly match only john=14 in this example (and also for permutations of this example). It doesn't matter if there are one or more john=14. I thought about negation of the regular expression, such that I want to find every string that isn't equal to the one I want to find but I had a problem with the regular expression ([^\bjohn\b=14]\n).
Any help would be appreciated :)!
You need to use negative lookahead.
^(?!john=14$).*
The negative lookahead at the start asserts that the line to be matched is not exactly john=14. If the assertion holds, .* matches all the characters on the line.
Or, to exclude every line that ends with =14 regardless of the name:
^(?!.*=14$).*
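Applied line by line in Java, the two patterns could be used roughly as follows. This is a sketch that assumes Pattern.MULTILINE so that ^ and $ anchor at line boundaries; the class name NotExactLine, the helper linesExcept and the extra mary=14 line are added here purely for illustration and are not part of the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NotExactLine {
    // Hypothetical helper: collects every line of text that the regex matches.
    static List<String> linesExcept(String regex, String text) {
        List<String> out = new ArrayList<>();
        // MULTILINE makes ^ and $ match at the start and end of each line.
        Matcher m = Pattern.compile(regex, Pattern.MULTILINE).matcher(text);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        // Sample lines from the question, plus mary=14 (an added assumption)
        // to show where the two patterns differ.
        String text = String.join("\n",
                "john=14", "adam=21", "ben=11", "john=18",
                "johan=17", "john=141", "mary=14");
        // Keeps every line except the exact line john=14.
        System.out.println(linesExcept("^(?!john=14$).*", text));
        // Keeps every line that does not end in =14.
        System.out.println(linesExcept("^(?!.*=14$).*", text));
    }
}
```

The first pattern keeps mary=14 because only the exact line john=14 is excluded; the second drops it as well.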
I have the string
7,456.23%
where I would like to use a regular expression to match BOTH the comma(,) and percent(%) characters and remove them so the result is
7456.23
I can figure out how to match one character or the other, but not both.
Simply use Character Classes or Character Sets
With a "character class", also called "character set", you can tell the regex engine to match only one out of several characters.
Simply place the characters you want to match between square brackets. If you want to match an a or an e, use [ae].
System.out.println("7,456.23%".replaceAll("[,%]",""));
Or use alternation (the | operator):
System.out.println("7,456.23%".replaceAll(",|%",""));
I've been trying to come up with a regular expression search string that does the following, with no luck:
string contains ipth but does not contain bipth. xipth is acceptable. The string can contain anything before or after "ipth".
Any clues?
You can use this regular expression
([^b]|^)ipth
Use a negative look-behind:
(?<!b)ipth
The regex (?<!b) means "the preceding character must not be b".
The look-behind also matches start of input, so this expression also matches ipth at start of input.
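The practical difference between the two answers is what ends up inside the match. The Java sketch below illustrates this; the class name IpthCheck and the helper firstMatch are hypothetical. The STRICT pattern is not from either answer: it is one possible reading of "does not contain bipth" anywhere in the string, included as an assumption.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class IpthCheck {
    // Negative look-behind: the preceding character is checked but not consumed.
    static final Pattern LOOKBEHIND = Pattern.compile("(?<!b)ipth");
    // Character-class version: the preceding character becomes part of the match.
    static final Pattern CHAR_CLASS = Pattern.compile("([^b]|^)ipth");
    // Assumed stricter reading: reject the string if bipth occurs anywhere in it.
    static final Pattern STRICT = Pattern.compile("^(?!.*bipth).*ipth");

    // Hypothetical helper: returns the first matched text, or null when nothing matches.
    static String firstMatch(Pattern p, String s) {
        Matcher m = p.matcher(s);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String[] samples = {"xipth", "ipth", "bipth", "bipth xipth"};
        for (String s : samples) {
            System.out.println("\"" + s + "\" -> look-behind: " + firstMatch(LOOKBEHIND, s)
                    + ", char class: " + firstMatch(CHAR_CLASS, s)
                    + ", strict: " + firstMatch(STRICT, s));
        }
    }
}
```

For "xipth" the look-behind version matches only ipth, while the character-class version matches xipth. This matters if the matched text is replaced or extracted rather than just tested.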
I have the sentence as below:
First learning of regular expression.
And I want to extract only First learning and expression by means of regular expressions.
Where would I start?
Regular expressions are for pattern matching, which means we'd need to know a pattern that is to be matched.
If you literally just want those strings, you'd just use First learning and expression as your patterns.
As @orique says, this is kind of pointless; you don't need RegEx for that. If you want something more complicated, you'd need to explain what you're trying to match.
Regex is not usually used to match literal text like what you're doing, but instead is used to match patterns of text. If you insist on using regex, you'll have to match the trivial expression
(First learning|expression)
As already pointed out, it is unusual to match a literal string like you are asking, but more common to match patterns such as several word characters followed by a space character etc...
Here is a pattern to match several word characters (which are a-z, A-Z, 0-9 and _) followed by a space, followed by several more word characters, and so on. It ends up capturing three groups: the first group matches the first two words, the second group the next two words, and the third group the fifth word together with the space that precedes it.
$words = "First learning of regular expression.";
preg_match('/(\w+\s\w+)\s(\w+\s\w+)(\s\w+)/', $words, $matches);
// $matches[1] is "First learning"; $matches[3] is " expression" (with its leading space)
$result = $matches[1] . $matches[3];
I hope this matches your requirement.