Regular Expression not Matching Entire Input Text as Expected - regex

I have the following regular expression:
(=)(?<!\\\\)(')(.*?)(?<!\\\\)(')(.*?)
Which should match an equal sign followed by any set of characters between single quotes and then anything that comes after.
But when I test it with the sample text ='abc'xyz, it only matches ='abc'.
I also tested the code here: https://regexr.com/61gof
Any ideas as to why that is?

The ? makes the last (.*?) match lazily, i.e. as few characters as possible. Since nothing follows it in the pattern, that is zero characters. Remove the ? or put a $ at the end of the regex to tell it to match up to the end of the line (if that is what you want).
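A minimal sketch of the behavior described above, using Python's re module as a stand-in for the regexr tester (an assumption; the question does not say which engine is used). The pattern is copied from the question unchanged, and the two variants apply the two suggested fixes.

```python
import re

text = "='abc'xyz"

# Pattern from the question, unchanged: the final (.*?) is lazy.
lazy = re.compile(r"(=)(?<!\\\\)(')(.*?)(?<!\\\\)(')(.*?)")
# Fix 1: make the last group greedy by dropping the ?.
greedy = re.compile(r"(=)(?<!\\\\)(')(.*?)(?<!\\\\)(')(.*)")
# Fix 2: keep the group lazy but anchor the pattern at the end of the line.
anchored = re.compile(r"(=)(?<!\\\\)(')(.*?)(?<!\\\\)(')(.*?)$")

lazy_match = lazy.search(text).group(0)          # stops after the closing quote
greedy_match = greedy.search(text).group(0)      # runs to the end of the text
anchored_match = anchored.search(text).group(0)  # forced to reach the end

print(lazy_match, greedy_match, anchored_match)
```

In both fixed variants the last capture group holds xyz; in the original it is empty.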

Related

Regex matching extra characters

I'm using this tool to evaluate my expression.
My test string: "Little" Timmy (tim) McGraw
my regex:
^[()"]|.["()]
It looks like I'm properly catching the characters I want, but my matches include whatever character comes just before the match. I'm not sure what, if anything, I'm doing wrong to catch the preceding characters like that. The goal is to capture characters we don't want in the name field of one of our systems.
Brief
Your current regex ^[()"]|.["()] says the following:
^[()"]|.["()] Match either of the following:
  ^[()"] Match the following:
    ^ Assert position at the start of the line
    [()"] Match any character present in the list ()"
  .["()] Match the following:
    . Match any character (this is the issue you were having)
    ["()] Match any character present in the list "()
Code
You can actually shorten your regex to just [()"].
Ultimately, however, it would be much easier to create a negated set that lists which characters are valid rather than which are invalid. This approach would get you something like [^\w ], which matches anything not present in the set, i.e. any character that is neither a word character nor a space (in your sample string this matches the symbols ()" since they are not in the set).
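To make the difference concrete, here is a small sketch that runs the three patterns against the test string from the question. It uses Python's re module, which is an assumption; the question does not name the engine behind the testing tool.

```python
import re

name = '"Little" Timmy (tim) McGraw'

# Original pattern: the second alternative starts with ".", so each match
# after the first also swallows the character before the symbol.
original = re.findall(r'^[()"]|.["()]', name)

# Shortened pattern: only the unwanted symbols themselves.
shortened = re.findall(r'[()"]', name)

# Negated set: anything that is neither a word character nor a space.
negated = re.findall(r'[^\w ]', name)

print(original)
print(shortened)
print(negated)
```

The original pattern returns the extra leading characters (for example e" and m)), while the other two return only the four symbols.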

Regex: ignore characters that follow

I'd like to know how I can ignore characters that follow a particular pattern in a regex.
I tried positive lookaheads, but they do not work because they preserve those characters for other matches, while I want them to be just... discarded.
For example, a part of my regex is: (?<DoubleQ>\"\".*?\"\")|(?<SingleQ>\".*?\")
in order to match some "key-parts" of this string:
This is a ""sample text"" just for "testing purposes": not to be used anywhere else.
I want to capture the entire ""sample text"", but then I want to "extract" only sample text and the same with testing purposes. That is, I want the group to match to be ""sample text"", but then I want the full match to be sample text. I partially achieved that with the use of the \K option:
(?<DoubleQ>\"\"\K.*?\"\")|(?<SingleQ>\"\K.*?\")
Which ignores the first "" (or ") from the full match but takes it into account when matching the group. How can I ignore the following "" (")?
Note: positive lookahead does not work: it does not ignore characters from the following matches, it just does not include them in the current match.
Thanks a lot.
I hope I understood your question correctly: you want to match the whole string including the quotes, but extract only the text without the quotes, right?
You can typically use the regex replace functionality to extract just a part of the match.
This is the regex:
""?(.*?)""?
And this is the replace expression:
$1
$1
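A quick sketch of that approach, assuming Python's re module (the question does not say which engine is in use). Note that Python writes the group reference in the replacement as \1 rather than $1.

```python
import re

text = 'This is a ""sample text"" just for "testing purposes": not to be used anywhere else.'

pattern = re.compile(r'""?(.*?)""?')

# The full match keeps the quotes; group 1 holds only the text inside them.
full_matches = [m.group(0) for m in pattern.finditer(text)]
inner = [m.group(1) for m in pattern.finditer(text)]

# Replacing each match with its first group strips the quotes in place.
replaced = pattern.sub(r"\1", text)

print(full_matches)
print(inner)
print(replaced)
```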

search pattern in Notepad++

1) First I want to search a text with pattern such as
app(abs(something),abs(something))
in a large text using Notepad++, a sample of the text shown below:
app(abs(any length of characters here),abs(any length of characters here)),
tapp(abs(any length of characters here),abs(any length of characters here)),
app(abs(any length of characters here),app(any length of characters here)),
app(abs(any length of characters here),some(any length of characters here)),
app(abs(any length of characters here)) ,abs(any length of characters here))
When I search with "app(abs((.?)),abs((.?)))", it finds the first and second lines of the sample above.
The second line is not what I am searching for.
What is wrong with my expression?
2) If possible, I want the opening and closing parentheses ( ) after each "abs" to be balanced, such as
"app( abs(..(..)..),abs(..(..(...)..)..) )"
but not as
"app(abs((), abs())"
where the first abs has an unmatched parenthesis.
Please give some advice!
Thanks in advance
Yes, you should switch Search Mode to Regular expression (at the bottom of Find dialog) and use regular expression as a pattern.
Assuming that asterisk in your pattern means any single character, you should replace * with . (matches any single character in the regular expression syntax) and put \ before each parenthesis (( and ) are special characters and have to be escaped using \). Thus, you will get:
str1\(str2\(.....\),str2\(........\)\)
To make it less ugly, you can replace a run of 5 dots with .{5} (and a run of 8 dots with .{8}):
str1\(str2\(.{5}\),str2\(.{8}\)\)
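As a rough check that the two spellings are equivalent, the sketch below tries both against a made-up input. The sample strings are hypothetical, and Python's re module stands in for Notepad++'s regex engine, which is an assumption; the escaping and the {n} counted repetition behave the same way in both for this pattern.

```python
import re

# Hypothetical input: 5 characters in the first str2(...), 8 in the second.
sample = "str1(str2(abcde),str2(abcdefgh))"

dots = re.compile(r"str1\(str2\(.....\),str2\(........\)\)")
counted = re.compile(r"str1\(str2\(.{5}\),str2\(.{8}\)\)")

dots_ok = dots.fullmatch(sample) is not None
counted_ok = counted.fullmatch(sample) is not None

# A first argument of the wrong length (4 characters) is rejected.
wrong_length_ok = counted.fullmatch("str1(str2(abcd),str2(abcdefgh))") is not None

print(dots_ok, counted_ok, wrong_length_ok)
```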
Answer to the first part of the updated question
Actually, the pattern above doesn't give the results that you describe. .? matches zero or one character, and unescaped parentheses are interpreted as grouping symbols rather than literal characters. Thus, your pattern matches strings like appabsX,abs.
It should be modified like this:
app\(abs\((.*)\),abs\((.*)\)\)
Regarding "it finds first and second line in above sample":
Actually, it finds the part of the second line between t and the trailing comma, and that is correct behavior. If you want to ignore such cases, you need to specify where the string you are searching for must begin. Some examples:
^ matches the beginning of a line:
^app\(abs\((.*)\),abs\((.*)\)\)
(\s+) matches at least one whitespace character:
(\s+)app\(abs\((.*)\),abs\((.*)\)\)
Also, it would be better to enable lazy matching by putting ? after *, like this:
^app\(abs\((.*?)\),abs\((.*?)\)\)
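The effect of the ^ anchor can be sketched against the five sample lines from the question. This uses Python's re module with re.MULTILINE so that ^ matches at the start of every line, which is assumed here to mirror how Notepad++ treats ^ in its regular expression mode.

```python
import re

sample = "\n".join([
    "app(abs(any length of characters here),abs(any length of characters here)),",
    "tapp(abs(any length of characters here),abs(any length of characters here)),",
    "app(abs(any length of characters here),app(any length of characters here)),",
    "app(abs(any length of characters here),some(any length of characters here)),",
    "app(abs(any length of characters here)) ,abs(any length of characters here))",
])

# Without an anchor the pattern also matches inside "tapp(...)" on line 2.
unanchored = re.compile(r"app\(abs\((.*?)\),abs\((.*?)\)\)")
# With ^ (and MULTILINE) a match must begin at the start of a line.
anchored = re.compile(r"^app\(abs\((.*?)\),abs\((.*?)\)\)", re.MULTILINE)

unanchored_hits = [m.group(0) for m in unanchored.finditer(sample)]
anchored_hits = [m.group(0) for m in anchored.finditer(sample)]

print(len(unanchored_hits), len(anchored_hits))
```

Because . does not match a newline by default, neither pattern can run past the end of a line, which corresponds to leaving ". matches newline" unchecked in Notepad++.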
Is that possible in Notepad++?
Yes it is possible with regular expressions.
How to do it?
Take a look at this link: Regular Expressions Notepad
If you want to learn more about building and testing regular expressions, see:
RegExr
Something like this:
^app\(abs\((.*?)\),abs\((.*?)\)\)
The ". matches newline" checkbox in the search window needs to be unchecked.

Regex: Find multiple matching strings in all lines

I'm trying to match multiple strings in a single line using regex in Sublime Text 3.
I want to match all values and replace them with null.
Part of the string that I'm matching against:
"userName":"MyName","hiScore":50,"stuntPoints":192,"coins":200,"specialUser":false
List of strings that it should match:
"MyName"
50
192
200
false
Result after replacing:
"userName":null,"hiScore":null,"stuntPoints":null,"coins":null,"specialUser":null
Is there a way to do this without using sed or any other substitution method, but just by matching the wanted pattern in regex?
You can use this find pattern:
:(.*?)(,|$)
And this replace pattern:
:null\2
The first group matches any character (dot) zero or more times (asterisk), lazily (question mark), meaning it matches as little as possible. The second group matches either a comma or the end of the string. In the replace pattern, I substitute the first group with null (as desired) and leave the symbol matched by the second group unchanged.
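The same find and replace pair can be tried outside the editor. This sketch uses Python's re.sub as a stand-in for Sublime Text's replace dialog (an assumption; the editor uses its own regex engine), with the sample line from the question.

```python
import re

line = '"userName":"MyName","hiScore":50,"stuntPoints":192,"coins":200,"specialUser":false'

# Group 1 is the value (lazy), group 2 is the comma or the end of the string.
# The replacement writes null for the value and puts group 2 back unchanged.
result = re.sub(r":(.*?)(,|$)", r":null\2", line)

print(result)
```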
Here is a variation on amaurs' answer that doesn't have to put the comma back in the replacement:
:\K(.*?)(?=,|$)
And this replacement pattern:
null
This works like amaurs' pattern but starts the match after the colon (using \K to reset the match starting point) and matches up to a comma or the end of the line (using a positive lookahead).
I have tested this and it works in Sublime Text 2 (so it should work in Sublime Text 3).
Another slightly better alternative is:
(?<=:).+?(?=,|$)
which uses a positive lookbehind instead of resetting the match starting point.
Another good alternative (so far the most efficient here):
:\K[^,]*
This may help.
Find: (?<=:)[^,]*
Replace: null
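The lookbehind variants can be checked the same way. The sketch below uses Python's re module, which is an assumption about tooling; Python's built-in re does not support \K, so only the two lookbehind patterns from the answers are shown.

```python
import re

line = '"userName":"MyName","hiScore":50,"stuntPoints":192,"coins":200,"specialUser":false'
expected = '"userName":null,"hiScore":null,"stuntPoints":null,"coins":null,"specialUser":null'

# Lookbehind plus a negated set: everything after a colon up to the next comma.
negated_set = re.sub(r"(?<=:)[^,]*", "null", line)

# Lookbehind plus lookahead: lazy match up to a comma or the end of the line.
lookaround = re.sub(r"(?<=:).+?(?=,|$)", "null", line)

print(negated_set == expected, lookaround == expected)
```

Both patterns leave the colon and the comma out of the match, so the replacement is just null.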

Write a wildcard that matches specific delimiter in Word

I'm writing a wildcard string in Word that should match:
{0>yadayada<}100{>yadayada<0}
Where yadayada can be anything EXCEPT the start of a new delimiter denoted by: {0>
This is what I have so far:
(\{0\>)*(\<\}100\{\>)*(\<0\})
This works, except that the first '*' keeps matching text until it finds <}100{>yadayada<0}
I need to change it so that the * selects everything EXCEPT strings that contain '{0>'
I tried this by replacing the first * with
[!(\{0>)]*
Or everything together:
(\{0\>)[!(\{0>)]*(\<\}100\{\>)*(\<0\})
But this evidently doesn't work.
Please help!
Try this:
\{0>.+?(?=\{0>)
You only need to escape the {
What this regular expression says is:
Match all strings containing {0>, then any text one or more times (.+). The ? after it tells the regex engine to match lazily, since .+ would otherwise consume all the characters it can. A lazy match takes the fewest characters needed before the next part of the regex can take over.
Then the (?=\{0>) says to look for the next delimiter but not include it in the selection.
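A small sketch of how this regex behaves, run with Python's re module. This illustrates the regular expression itself, not Word's wildcard syntax, and the sample text is hypothetical. The second pattern, which also accepts the end of the input, is an assumed variation that is not part of the answer.

```python
import re

# Hypothetical sample: two delimited segments back to back.
text = "{0>foo<}100{>bar<0}{0>baz<}100{>qux<0}"

# Pattern from the answer: lazy .+? up to (but not including) the next {0>.
pattern = re.compile(r"\{0>.+?(?=\{0>)")
segments = pattern.findall(text)

# Assumed variation: also stop at the end of the input, so the final
# segment (which has no following {0>) is matched as well.
with_end = re.compile(r"\{0>.+?(?=\{0>|$)")
all_segments = with_end.findall(text)

print(segments)
print(all_segments)
```

As written in the answer, the pattern only matches a segment that is followed by another {0>, so the last segment in the text is skipped.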
Hope this helps!