match everything but particular words using regex

var text = "!john david sue !jay";
I want to get every word except those that begin with "!", such as "!john" and "!jay". For this input I should get the strings "david" and "sue".
Why doesn't this regex work?
/[^(![a-z0-9]+)]/

You can use a negative lookbehind:
(?<!!)\b\w+
Your regex does not work because your pattern is inside [^ ] (a negated character class). Inside a character class, characters such as ( are matched literally, so ( matches a literal ( instead of opening a group. The class also ends at the first ], so the pattern never describes a whole word like !john.
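As a rough illustration, here is the lookbehind pattern from this answer run through Python's re module. The question itself uses JavaScript, where the same pattern should work in engines that support lookbehind; the variable names below are only for this sketch.

```python
import re

text = "!john david sue !jay"

# (?<!!)  the word must not be immediately preceded by "!"
# \b\w+   a whole word, starting at a word boundary
words = re.findall(r"(?<!!)\b\w+", text)

print(words)  # ['david', 'sue']
```

The \b matters here: without it, the engine could start a match at the "o" in "!john", because "o" is preceded by "j" rather than "!".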

Related

Regex: match a string at start or after some special characters

I'm using Java Pattern class to find a string "keyword" which is at the beginning of the string or after a character that is in a list of characters. For example, the list of characters is ' ' and '<', then:
match:
"keyword..."
"...<keyword..."
"... keyword..."
not match:
"...akeyword..."
I've tried all of these:
"[^ <]keyword"
"[ <^]keyword"
"[\\^ <]keyword" (note: in a Java/C# string literal the backslash needs to be escaped)
This question is similar to "Match only at string start or after whitespace", but with only basic regex skills I couldn't adapt it to this problem. I've tried:
"(?<!\\S<)keyword"
"(?<!([\\S<]))keyword"
This seems to be a very basic problem, so there may be a simple, clear way to do it.

This should work: (^|[< ])keyword
The group (...|...) contains ^ and [< ], which says the keyword must either be at the start of the string or come right after a < or a space.

You could use an alternation | in a non-capturing group (?:^|[ <]) to either assert the start of the string with ^ or match a space or < from a character class, and use a capturing group for keyword.
(?:^|[ <])(keyword)\b
Or you could use a positive lookbehind (?<=...) and match only keyword
(?<=^|[< ])keyword\b
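A minimal sketch of the non-capturing-group variant, written in Python rather than the Java of the question. It uses the (?:^|[ <]) form instead of the lookbehind form because Python's re module rejects a lookbehind whose alternatives have different widths. The helper name find_keyword is made up for this example.

```python
import re

# Non-capturing group: start of string, or a single space or '<'.
# The capturing group then holds just the keyword itself.
PATTERN = re.compile(r"(?:^|[ <])(keyword)\b")

def find_keyword(s):
    """Return the captured keyword, or None if it is not preceded correctly."""
    m = PATTERN.search(s)
    return m.group(1) if m else None

for sample in ["keyword...", "...<keyword...", "... keyword...", "...akeyword..."]:
    print(repr(sample), "->", find_keyword(sample))
```

The first three samples yield "keyword" and the last yields None, matching the match/not-match lists in the question.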

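(^keyword|[< ]keyword)
Put the characters you need inside the square brackets. Note that ^ is only an anchor outside the brackets; inside them, when it is not the first character, it is a literal caret.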

How to replace text without changing quoted string with regex

I want to replace
$this->input->post("product_name");
with
$post_data["product_name"];
I want to use Notepad++ regex search, but I couldn't find a proper solution.
In find --> $this->input->post("[\*w\]");
In replace --> $post_data["$1"];
but it's not working.

The $this->input->post("[\*w\]"); pattern does not work because:
$ is a special character matching the end of a line; you need to use \$ to match it as a literal character.
( and ) are grouping metacharacters; you need \( and \) to match literal parentheses.
[\*w\] is a malformed pattern, as there is no matching unescaped ] for the [ that opens the character class. Also, w just matches the letter w, not any letter, digit or underscore; \w does that.
You may use
Find What: \$this->input->post\("(\w*)"\);
Replace With: $post_data["$1"];
If there can be any char inside double quotes use .*? instead of \w*:
Find What: \$this->input->post\("(.*?)"\);
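For readers who want to try the same find/replace outside Notepad++, here is a rough Python equivalent. The sample line in source is invented for the sketch; the pattern is the \w* variant given above, and the replacement uses Python's \1 backreference syntax.

```python
import re

# Hypothetical input line, used only for this sketch.
source = '$name = $this->input->post("product_name");'

# \$, \( and \) are escaped so they match literally; (\w*) captures the key.
pattern = r'\$this->input->post\("(\w*)"\);'

# In a Python replacement string, $ is literal and \1 is the first group.
result = re.sub(pattern, r'$post_data["\1"];', source)

print(result)  # $name = $post_data["product_name"];
```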

Use this pattern to match the desired text: \$this->input->post\(("[^"]+")\);
And replace it with the pattern \$post_data\[\1\];
Explanation:
\$this->input->post - match $this->input->post literally
\(("[^"]+")\); - match ( literally, then match the double quotes and everything between them with "[^"]+" and store that in the first capturing group, then match ); literally

To replace
$this->input->post("product_name");
with
$post_data["product_name"];
run a replace with regular expressions enabled. Find:
this->input->post\("(.*)"\);
and replace with:
post_data\["\1"\];
\x, where x is a number, refers to the x-th group captured by the parentheses. Here we capture whatever sits between the quotes in this->input->post("XXXX");
Don't forget to escape special characters with \.
The special characters here were [ ] ( ).

Select text between brackets starting with ^{text to be selected} using regex

E.g.: sdadjaskdhas ^{sdassad ^{/frac{}{} s dcds &} dsdsadsa} ddsfsafdsfs
The answer should be: {sdassad ^{/frac{}{} s dcds &} dsdsadsa}
It should include any character between the brackets.

Use a backslash to escape the ^, { and } characters, as they have a special meaning in regex, and .* to match any number of any characters.
Here is your answer: \^\{.*\}
EDIT:
You can also use the regex [^\^]\^{(\S)}, print the match, and trim the string. Loop over this to find all matches in a string.
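A small sketch of the first pattern, \^\{.*\}, in Python. A capturing group is added around the braces (an addition for this example, not part of the answer) so that the leading ^ is left out of the result, as the question asks.

```python
import re

text = "sdadjaskdhas ^{sdassad ^{/frac{}{} s dcds &} dsdsadsa} ddsfsafdsfs"

# \^ matches a literal caret. The group captures from the '{' that follows it
# up to the LAST '}' in the string, because .* is greedy.
match = re.search(r"\^(\{.*\})", text)
selected = match.group(1) if match else None

print(selected)  # {sdassad ^{/frac{}{} s dcds &} dsdsadsa}
```

This relies on the wanted closing brace being the last } in the input. The pattern does not count nested braces, so text with another } further along would be over-matched.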

Remove all characters after a certain match

I am using Notepad++ to remove some unwanted strings from the end of a pattern, and for the life of me I can't work it out.
I have the following sets of strings:
myApp.ComboPlaceHolderLabel,
myApp.GridTitleLabel);
myApp.SummaryLabel + '</b></div>');
myApp.NoneLabel + ')') + '</label></div>';
I would like to leave just myApp.[variable] and get rid of, e.g. ,, );, + '...', etc.
Using Notepad++, I can match the strings themselves using ^myApp.[a-zA-Z0-9].*?\b (it's a bit messy, but it works for what I need).
But in reality, I need to negate that regex to match everything at the end, so I can replace it with a blank.

You don't need negation. Just put your regex inside a capturing group and add an extra .*$ at the end; $ matches the end of a line. The whole line is matched and then replaced by the characters inside the first captured group. Also, . matches any character, so you need to escape the dot to match a literal dot.
^(myApp\.[a-zA-Z0-9].*?\b).*$
Replacement string:
\1
OR
Match only the trailing characters and replace them with an empty string.
\b[,); +]+.*$

I think this works equally well:
^(myApp\.\w+).*$
Replacement string:
\1
From difference between \w and \b regular expression meta characters:
\w stands for "word character", usually [A-Za-z0-9_]. Notice the inclusion of the underscore and digits.
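To check the capture-and-replace idea outside Notepad++, here is a rough Python version using the \w+ pattern with the dot escaped. The four input lines are the ones from the question; re.MULTILINE is needed so that ^ and $ apply to each line.

```python
import re

lines = "\n".join([
    "myApp.ComboPlaceHolderLabel,",
    "myApp.GridTitleLabel);",
    "myApp.SummaryLabel + '</b></div>');",
    "myApp.NoneLabel + ')') + '</label></div>';",
])

# Group 1 keeps "myApp.<name>"; the trailing .* swallows the rest of the line,
# and the replacement \1 writes back only the captured part.
cleaned = re.sub(r"^(myApp\.\w+).*$", r"\1", lines, flags=re.MULTILINE)

print(cleaned)
```

Each output line is reduced to myApp. followed by the variable name, with the commas, brackets and string fragments removed.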

(^.*?\.[a-zA-Z]+)(.*)$
Use this and replace with:
$1
See demo.
http://regex101.com/r/lU7jH1/5

extract word with regular expression

I have a string 1/temperatoA,2/CelcieusB!23/33/44,55/66/77 and I would like to extract the words temperatoA and CelcieusB.
I have this regular expression (\d+/(\w+),?)*! but I only get the match 1/temperatoA,2/CelcieusB!
Why?

Your whole match evaluates to '1/temperatoA,2/CelcieusB' because that matches the following expression:
qr{ (      # begin group
    \d+    # at least one digit
    /      # followed by a slash
    (\w+)  # followed by at least one word character
    ,?     # maybe a comma
    )*     # ANY number of repetitions of this pattern.
}x;
'1/temperatoA,' fulfills capture #1 first, but since you are asking the engine to capture as many of those as it can it goes back and finds that the pattern is repeated in '2/CelcieusB' (the comma not being necessary). So the whole match is what you said it is, but what you probably weren't expecting is that '2/CelcieusB' replaces '1/temperatoA,' as $1, so $1 reads '2/CelcieusB'.
Anytime you want to capture anything that fits a certain pattern in a certain string it is always best to use the global flag and assign the captures into an array. Since an array is not a single scalar like $1, it can hold all the values that were captured for capture #1.
When I do this:
use Data::Dumper;

my $str = '1/temperatoA,2/CelcieusB!23/33/44,55/66/77';
my $regex = qr{(\d+/(\w+))};
if ( my @matches = $str =~ /$regex/g ) {
    print Dumper( \@matches );
}
I get this:
$VAR1 = [
'1/temperatoA',
'temperatoA',
'2/CelcieusB',
'CelcieusB',
'23/33',
'33',
'55/66',
'66'
];
Now, I figure that's probably not what you expected. But '3' and '6' are word characters, and so--coming after a slash--they comply with the expression.
So, if this is an issue, you can change your regex to qr{(\d+/(\p{Alpha}\w*))}, which requires the first character to be alphabetic, followed by any number of word characters. Then the dump looks like this:
$VAR1 = [
'1/temperatoA',
'temperatoA',
'2/CelcieusB',
'CelcieusB'
];
And if you only want 'temperatoA' or 'CelcieusB', then you're capturing more than you need to and you'll want your regex to be qr{\d+/(\p{Alpha}\w*)}.
However, the secret to capturing more than one chunk with a capture expression is to assign the match to an array. You can then sort through the array to see if it contains the data you want.
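The same two points can be sketched in Python, as an illustration rather than a translation of the Perl above: a repeated group keeps only its last capture, while a global search collects every capture into a list. [A-Za-z] is used as an ASCII stand-in for \p{Alpha}, since Python's re module has no \p{...} classes.

```python
import re

s = "1/temperatoA,2/CelcieusB!23/33/44,55/66/77"

# Repeated group: each repetition overwrites the previous capture,
# so only the last word survives in group 2.
repeated = re.match(r"(\d+/(\w+),?)*!", s)
last_only = repeated.group(2)

# Global search: findall returns every capture of the single group.
all_words = re.findall(r"\d+/([A-Za-z]\w*)", s)

print(last_only)  # CelcieusB
print(all_words)  # ['temperatoA', 'CelcieusB']
```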

The question here is: why are you using a regular expression that’s so obviously wrong? How did you get it?
The expression you want is simply as follows:
(\w+)

With a Perl-compatible regex engine you can search for
(?<=\d/)\w+(?=.*!)
(?<=\d/) asserts that there is a digit and a slash before the start of the match
\w+ matches the identifier. This allows for letters, digits and underscore. If you only want to allow letters, use [A-Za-z]+ instead.
(?=.*!) asserts that there is a ! ahead in the string, i.e. the regex will fail once we have passed the !.
Depending on the language you're using, you might need to escape some of the characters in the regex.
E.g., for use in C (with the PCRE library), you need to escape the backslashes:
myregexp = pcre_compile("(?<=\\d/)\\w+(?=.*!)", 0, &error, &erroroffset, NULL);
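The same lookaround pattern also works in Python's re module, since the lookbehind \d/ has a fixed width. A brief sketch, using the string from the question:

```python
import re

s = "1/temperatoA,2/CelcieusB!23/33/44,55/66/77"

# (?<=\d/)  preceded by a digit and a slash
# \w+       the word itself
# (?=.*!)   a "!" must still lie ahead, so nothing after the "!" matches
words = re.findall(r"(?<=\d/)\w+(?=.*!)", s)

print(words)  # ['temperatoA', 'CelcieusB']
```

Because the lookarounds consume nothing, the matches are the bare words and no capturing group is needed.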

Will this work?
/([[:alpha:]]\w+)\b(?=.*!)
I made the following assumptions...
A word begins with an alphabetic character.
A word always immediately follows a slash. No intervening spaces, no words in the middle.
Words after the exclamation point are ignored.
You have some sort of loop to capture more than one word. I'm not familiar enough with the C library to give an example.
[[:alpha:]] matches any alphabetic character.
The \b matches a word boundary.
And the (?=.*!) came from Tim Pietzcker's post.
