search pattern in Notepad++ - regex

1) First, I want to search for text matching a pattern such as
app(abs(something),abs(something))
in a large text using Notepad++. A sample of the text is shown below:
app(abs(any length of characters here),abs(any length of characters here)),
tapp(abs(any length of characters here),abs(any length of characters here)),
app(abs(any length of characters here),app(any length of characters here)),
app(abs(any length of characters here),some(any length of characters here)),
app(abs(any length of characters here)) ,abs(any length of characters here))
When I search with "app(abs((.?)),abs((.?)))", it finds the first and second lines of the sample above.
The second line is not what I am searching for.
What is wrong with my expression?
2) If possible, I would also like the opening and closing parentheses after each "abs" to be balanced, such as
"app( abs(..(..)..),abs(..(..(...)..)..) )"
but not as
"app(abs((), abs())"
where the first abs has an unmatched parenthesis.
Please give some advice!
Thanks in advance

Yes, you should switch the Search Mode to Regular expression (at the bottom of the Find dialog) and use a regular expression as the pattern.
Assuming that asterisk in your pattern means any single character, you should replace * with . (matches any single character in the regular expression syntax) and put \ before each parenthesis (( and ) are special characters and have to be escaped using \). Thus, you will get:
str1\(str2\(.....\),str2\(........\)\)
To make it less ugly, you can replace the five dots with .{5} and the eight dots with .{8}:
str1\(str2\(.{5}\),str2\(.{8}\)\)
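As a quick check of the escaping rule, here is a minimal sketch in Python. Python's re module only stands in for the Notepad++ regex engine, and both sample strings are made up for illustration.

```python
import re

# Escaped parentheses match literally; .{5} and .{8} match exactly 5 and 8 characters.
pattern = re.compile(r"str1\(str2\(.{5}\),str2\(.{8}\)\)")

# Hypothetical sample strings, not taken from the original text.
good = "str1(str2(abcde),str2(abcdefgh))"
bad = "str1(str2(abc),str2(abcdefgh))"  # only 3 characters in the first slot

matches_good = pattern.fullmatch(good) is not None
matches_bad = pattern.fullmatch(bad) is not None
print(matches_good, matches_bad)
```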
Answer to the first part of the updated question
Actually, the pattern above doesn't give the results that you describe. .? matches zero or one character of any kind, and unescaped parentheses are interpreted as grouping symbols rather than literal characters. Thus, your pattern matches strings like appabsX,abs.
It should be modified like this:
app\(abs\((.*)\),abs\((.*)\)\)
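The difference between the unescaped and the escaped pattern can be sketched in Python. This assumes Python's re behaves like the Notepad++ engine for these constructs, and the short sample line is a made-up stand-in for the real text.

```python
import re

line = "app(abs(x),abs(y)),"  # hypothetical short stand-in for a sample line

# Unescaped: ( and ) create groups, so no literal parenthesis is ever required.
unescaped = re.compile(r"app(abs((.?)),abs((.?)))")
# Escaped: \( and \) match literal parentheses; (.*) captures the contents.
escaped = re.compile(r"app\(abs\((.*)\),abs\((.*)\)\)")

unescaped_hits_line = unescaped.search(line) is not None
unescaped_hits_odd = unescaped.fullmatch("appabsX,abs") is not None
escaped_groups = escaped.search(line).groups()
print(unescaped_hits_line, unescaped_hits_odd, escaped_groups)
```

The unescaped pattern only accepts text without parentheses, such as appabsX,abs, while the escaped one captures the contents of both abs(...) parts.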
it finds first and second line in above sample
Actually, it finds the part of the second line between the t and the final comma, and that is correct behavior. If you want to ignore such cases, you need to specify where the string you are searching for must begin. Some examples:
^ matches the beginning of a line:
^app\(abs\((.*)\),abs\((.*)\)\)
(\s+) matches at least one whitespace character:
(\s+)app\(abs\((.*)\),abs\((.*)\)\)
Also, it would be better to enable lazy matching by putting ? after *, like this:
^app\(abs\((.*?)\),abs\((.*?)\)\)
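To see what the ^ anchor and the lazy quantifier change, here is a small Python sketch. The five lines are shortened stand-ins for the sample in the question, and Python's re is assumed to behave like Notepad++ for these constructs.

```python
import re

# Shortened, hypothetical versions of the five sample lines.
lines = [
    "app(abs(aaa),abs(bbb)),",
    "tapp(abs(aaa),abs(bbb)),",
    "app(abs(aaa),app(bbb)),",
    "app(abs(aaa),some(bbb)),",
    "app(abs(aaa)) ,abs(bbb))",
]

unanchored = re.compile(r"app\(abs\((.*?)\),abs\((.*?)\)\)")
anchored = re.compile(r"^app\(abs\((.*?)\),abs\((.*?)\)\)")

# 1-based numbers of the lines each pattern finds a match in.
unanchored_hits = [i + 1 for i, s in enumerate(lines) if unanchored.search(s)]
anchored_hits = [i + 1 for i, s in enumerate(lines) if anchored.search(s)]
print(unanchored_hits)
print(anchored_hits)

# Greedy vs lazy on one line that holds two occurrences.
two = "app(abs(a),abs(b)) app(abs(c),abs(d))"
greedy_first = re.search(r"app\(abs\((.*)\),abs\((.*)\)\)", two).group(1)
lazy_first = unanchored.search(two).group(1)
print(greedy_first)
print(lazy_first)
```

Without ^ the pattern also matches inside the tapp(...) line; with ^ only the first line is found. The greedy (.*) swallows text across both occurrences, while (.*?) stops at the first possible closing.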

Is that possible in Notepad++?
Yes it is possible with regular expressions.
How to do it?
Take a look at this link: Regular Expressions Notepad
Look at this link if you want to learn more about building and testing regular expressions:
RegExr

Something like this:
^app\(abs\((.*?)\),abs\((.*?)\)\)
The ". matches newline" checkbox in the search window must be unchecked.

Related

Skip Second String Between Characters with Regex

I've been working on a regex issue. I have a lot of lines formatted like this:
3240985|#Apple.-+240538|34346|346356356|36433565|6agf8s89auf
The end goal should look like this:
#Apple.-+240538|6agf8s89auf
#Apple.-+240538 is random characters, and 6agf8s89auf is random alphanumeric characters.
I've been using (.*?)[\|] and replacing the parts I need with blank characters in Notepad++ but it's impossible to complete it this way with the number of lines I have.
The regex for this kind of string is (?:(?<=^)|(?<=\|))(\d+(?:$|\|))
Demo: https://regex101.com/r/sO0fZ2/2
However, Find and Replace in Notepad++ may have some issues, because Notepad++ finds and replaces strings only once. Some other text editors, like Sublime Text, find and replace the contents recursively. However, you can simply overcome this by clicking the Replace All button multiple times.
Input
Result after clicking "Replace All in All Opened Documents" twice
In Sublime Text, you can achieve this in a single click:
Input
Result
P.S.: I'm not aware of any feature in Notepad++ that finds and replaces content recursively. You can search for one and use it if it exists. However, I don't think this is a problem, because it only requires a couple more clicks.
There is a simple approach with an alternation:
^\d+\||\|\d+(?=\||$)
Details:
^\d+\| - Branch 1, matching a chunk of 1+ digits (\d+) at the beginning of the string (^) and a | after them
| - alternation operator meaning OR
\|\d+(?=\||$) - Branch 2, matching a literal pipe (\|, must be escaped) with 1+ digits after it (\d+) that are followed by a literal pipe or the end of the string. (?=...) is a positive lookahead that does not advance the regex index, so the pipe it checks is not consumed and adjacent fields can still be matched by the same pattern.
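The alternation can be tried outside Notepad++ as well. Below is a minimal Python sketch using the sample line from the question; re.MULTILINE is my assumption so that ^ and $ work per line when several lines are processed at once.

```python
import re

line = "3240985|#Apple.-+240538|34346|346356356|36433565|6agf8s89auf"

# Branch 1 removes a leading all-digit field together with its pipe.
# Branch 2 removes a pipe plus an all-digit field that is followed by
# another pipe or by the end of the line (checked without consuming it).
pattern = re.compile(r"^\d+\||\|\d+(?=\||$)", re.MULTILINE)

result = pattern.sub("", line)
print(result)
```

In this sketch a single substitution pass is enough, because the lookahead leaves the following pipe available for the next match.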

Regular Expression for a alphanumeric after a text

This is my regular expression
(\b(serial|sheet))+(\s(number|code|no))+?\b
For the input :
Serial no
sheet no
Sheet Number
Requirement is to parse the text which contain:
Serial no : 2424ABC
Sheet No 5 (Without colon)
Sheet No : 5
Serial No = 5335ABC
How do I escape an assignment character (if present) and parse the alphanumeric value that follows it?
This should work:
(\b(serial|sheet))+(\s(number|code|no))+?\b\s*[:=#~– ]*(.*)
You can try it here : https://regex101.com/r/rO2cX1/1
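For a quick local check, the same pattern can be run in Python against the four inputs from the question. Case-insensitive matching is assumed here (the inputs mix "no" and "No"), and the value is read from group 5, the trailing (.*).

```python
import re

pattern = re.compile(
    r"(\b(serial|sheet))+(\s(number|code|no))+?\b\s*[:=#~– ]*(.*)",
    re.IGNORECASE,  # assumption: the pattern is meant to ignore case
)

inputs = [
    "Serial no : 2424ABC",
    "Sheet No 5",
    "Sheet No : 5",
    "Serial No = 5335ABC",
]

# Group 5 is the final (.*), i.e. everything after the optional separator.
values = [pattern.search(s).group(5) for s in inputs]
print(values)
```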
To escape an assignment character, use \=. (On its own, = has no special meaning in most regex flavors, so a plain = works as well.)
To parse the alphanumeric characters, use [a-zA-Z0-9]* or simply \w* (note that \w also matches the underscore).
If the = is optional, you could replace the \s in the regular expression with [=\s] to allow either a space or an equals sign. Perhaps better, and matching your example, try \s=?\s*.
If any of several characters might appear between the word and the number, then perhaps use \s[-=#~_]?\s*. Note that the - goes at the start of the character class; otherwise it would be interpreted as a range of characters. For example, [a-f] means [abcdef], i.e. any of those six characters, whereas [-af] means any of the three characters -, a and f.
Hence the regular expression becomes:
(\b(serial|sheet))+(\s[-=#~_]?\s*(number|code|no))+?\b
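The point about where the - sits inside a character class is easy to verify. Here is a tiny Python sketch with a made-up test string:

```python
import re

text = "a-b-f"  # hypothetical test string

# [a-f] is a range: it matches a, b, c, d, e or f, but not the hyphen.
range_hits = re.findall(r"[a-f]", text)
# [-af] lists three literal characters: the hyphen, a and f.
literal_hits = re.findall(r"[-af]", text)
print(range_hits)
print(literal_hits)
```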
Try the following pattern:
(serial\s+no|sheet\s*no)(\s*\:\s*)([a-z0-9]+)
Demo.
You can add further cases to the first group of the pattern. I covered two cases, separated by |.
You can find the alphanumeric value in the last group of this pattern.
Please note that this pattern is written to be used case-insensitively.
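Here is a sketch of how this pattern behaves on the four inputs from the question, run in Python with case-insensitive matching. The relaxed variant is my own assumption, not part of the answer: it makes the separator optional and also accepts =.

```python
import re

inputs = [
    "Serial no : 2424ABC",
    "Sheet No 5",
    "Sheet No : 5",
    "Serial No = 5335ABC",
]

# Pattern as given: a colon is required between the label and the value.
strict = re.compile(r"(serial\s+no|sheet\s*no)(\s*\:\s*)([a-z0-9]+)", re.IGNORECASE)
# Assumed variant (not from the answer): the separator may be ':', '=' or absent.
relaxed = re.compile(r"(serial\s+no|sheet\s*no)(\s*[:=]?\s*)([a-z0-9]+)", re.IGNORECASE)


def last_group(pattern, text):
    """Return the value captured by group 3, or None when there is no match."""
    m = pattern.search(text)
    return m.group(3) if m else None


strict_values = [last_group(strict, s) for s in inputs]
relaxed_values = [last_group(relaxed, s) for s in inputs]
print(strict_values)
print(relaxed_values)
```

The relaxed variant is more permissive than it may look: with no separator required, it would also read "thing" as the value in a text like "Serial nothing".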

Write a wildcard that matches specific delimiter in Word

I'm writing a wildcard string in Word that should match:
{0>yadayada<}100{>yadayada<0}
Where yadayada can be anything EXCEPT the start of a new delimiter denoted by: {0>
This is what I have so far:
(\{0\>)*(\<\}100\{\>)*(\<0\})
This works, except that the first '*' keeps matching text until it finds <}100{>yadayada<0}
I need to change it so that the * selects everything EXCEPT strings that contain '{0>'
I tried this by changing the first * with
[!(\{0>)]*
Or everything together:
(\{0\>)[!(\{0>)]*(\<\}100\{\>)*(\<0\})
But this evidently doesn't work.
Please help!
Try this:
\{0>.+?(?=\{0>)
You only need to escape the {.
What this regular expression says is:
Match the literal {0>, then any text one or more times (.+). The ? after it tells the regex engine to match lazily, since .+ would otherwise consume as many characters as it can. A lazy match takes the fewest characters needed before the next part of the regex can take over.
Then (?=\{0>) says to look for the next delimiter but not include it in the selection.
Hope this helps!
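The behavior of this expression can be sketched in Python. Note the assumptions: Python's re stands in for a regex engine with lookahead support, the sample text is made up, and this sketch does not check whether the target tool accepts this syntax.

```python
import re

# Hypothetical text with two delimited segments back to back.
text = "{0>foo<}100{>bar<0}{0>baz<}100{>qux<0}"

# Lazily match from {0> up to (but not including) the next {0>.
pattern = re.compile(r"\{0>.+?(?=\{0>)")

segments = pattern.findall(text)
print(segments)
```

Only the first segment is returned in this sketch: the lookahead requires another {0> after the match, so the final segment, which has no following delimiter, is not matched.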

Regular Expression to find matches of String series

I'm a newbie to regular expressions and need help delimiting a string that follows a certain pattern.
My string will always follow a pattern like ".(0.satQA).(1.SomewhatEnjoyable).(0.satQC).(0.ShorterThanExpected).(0.Q12).(0._1)".
My first search should return (the bold one here) (0.satQA).(1.SomewhatEnjoyable).(0.satQC).(0.ShorterThanExpected).(0.Q12).(0._1)
second as (0.satQA).(1.SomewhatEnjoyable).(0.satQC).(0.ShorterThanExpected).(0.Q12).(0._1)
Third as (0.satQA).(1.SomewhatEnjoyable).(0.satQC).(0.ShorterThanExpected).(0.Q12).(0._1)
In short, I need to delimit this into 3 parts (in this case). It should start with "(" and follow with characters (any), must include ").(" in the middle and then end with ")".
The regex for the pattern you are looking for is \(.*?\)\.\(.*?\)
.*? is a reluctant (lazy) quantifier, meaning that it matches as few characters as it can before the rest of the regex can match.
You also need to escape characters like . ) and ( to match them literally.
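Applied to the sample string from the question, the pattern splits it into three pairs. A minimal Python sketch, assuming Python's re matches the asker's engine for these constructs:

```python
import re

text = ".(0.satQA).(1.SomewhatEnjoyable).(0.satQC).(0.ShorterThanExpected).(0.Q12).(0._1)"

# Two parenthesized chunks joined by a literal dot; .*? keeps each chunk minimal.
pattern = re.compile(r"\(.*?\)\.\(.*?\)")

parts = pattern.findall(text)
for part in parts:
    print(part)
```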

RegEx \D matches start and end of line as well

I need to find lines that consist of 3 digits followed by 3 other characters. I thought I could use the following RegEx:
^\d{3}\D{3}$
But take the following sample text file and run the RegEx above (the text must have the empty lines in it):
1
12
123xxx
123y
aaabb
The problem is that there are two matches: 123xxx (which is fine), but also 123y is matched!
I suspect the reason is that "y" + the end-of-line + the beginning-of-next-line are also matched.
How can I tell the regex engine to ignore line beginnings and endings with \D and match characters only, not positions?
The behavior of $ in UltraEdit changes depending on whether you have "Match Whole Word Only" checked or not. To get the behavior you want you need to make sure that that option is checked. Your regular expression doesn't need to change.
Maybe:
/^\d{3}\D{3}$/m
The m means
Treat string as multiple lines. That is, change "^" and "$" from matching the start or end of the string to matching the start or end of any line anywhere within the string.
http://perldoc.perl.org/perlre.html
I don't know about UltraEdit exactly but I expect it will have something similar.
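To illustrate what the m flag changes, here is a small sketch in Python, where the equivalent flag is re.MULTILINE. The sample text is the one from the question without the empty lines, and Python only stands in for Perl or UltraEdit here.

```python
import re

text = "1\n12\n123xxx\n123y\naaabb"

# Without the flag, ^ and $ only match at the very start and end of the text.
single = re.findall(r"^\d{3}\D{3}$", text)
# With re.MULTILINE, ^ and $ also match at the start and end of every line.
multi = re.findall(r"^\d{3}\D{3}$", text, re.MULTILINE)
print(single)
print(multi)
```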
Try this:
^\d{3}[\S]{3}$
This matches lines with 3 digits followed by three non-whitespace characters. Since \S excludes line breaks, the match cannot run past the end of the line.
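The suspicion in the question, that \D also matches line-break characters, can be reproduced in Python. This is a sketch under assumptions: "\n" line endings, two empty lines after 123y, and Python's re instead of UltraEdit. The third pattern is my own variant, not taken from the answers.

```python
import re

# Assumed sample: "\n" line endings with two empty lines after "123y".
text = "1\n12\n123xxx\n123y\n\n\naaabb\n"

# \D matches any non-digit, including "\n", so it can run across empty lines.
with_D = re.findall(r"^\d{3}\D{3}$", text, re.MULTILINE)
# \S excludes all whitespace, so line breaks can no longer be consumed.
with_S = re.findall(r"^\d{3}\S{3}$", text, re.MULTILINE)
# Assumed alternative (not from the answers): any non-digit except line breaks.
no_breaks = re.findall(r"^\d{3}[^\d\r\n]{3}$", text, re.MULTILINE)

print(with_D)
print(with_S)
print(no_breaks)
```

One design difference to keep in mind: \S{3} also accepts digits, so a line like 123456 would match it, whereas [^\d\r\n]{3} keeps the "three non-digits" meaning of the original pattern.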