<.*>|\n.*\s.*\sid="(\w*)".*\n+|.*>\n|\n.+
and replace with $1
This regex extracts every id value from the file. For example, given
<a href="java" class="total" id="maker" placeholder="getTheResult('local6')">master6<a>
Result is maker
How can I extract the getTheResult key name instead,
so that my result will be local6?
I tried <.*>|\n.*\s.*\sgetTheResult('(\w*)').*\n+|.*>\n|\n.+ but it didn't help.
I assume that:
you have files with text like getTheResult('local6')
you may have several values like that on a line
you'd like to keep only those values, one value per line.
I suggest
getTheResult\('([^']*)'\)|(?:(?!getTheResult\(')[\s\S])*
and replace with $1\n. The \n will insert a newline between the values. You can then use ^\n regex (to replace with empty string) to remove empty lines.
Pattern details:
getTheResult\(' - matches getTheResult(' as a literal string (note the ( is escaped)
([^']*) - Group 1 capturing 0+ chars other than '
'\) - a literal ')
| - or
(?:(?!getTheResult\(')[\s\S])* - 0+ chars that are not starting chars of the getTheResult(' character sequence (this is a tempered greedy token).
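For readers who want to try the pattern outside an editor, here is a minimal Python sketch of the same replace-then-clean-up idea. The function name extract_keys is made up for illustration, and the list filter stands in for the ^\n cleanup step; it assumes Python 3.5 or later, where an unmatched group in the replacement is treated as an empty string.

```python
import re

# Pattern from the answer: capture the key, or consume text up to the next key.
PATTERN = re.compile(r"getTheResult\('([^']*)'\)|(?:(?!getTheResult\(')[\s\S])*")


def extract_keys(text):
    """Return the getTheResult('...') keys found in text, in order."""
    # Replace every match with "Group 1 + newline"; an unmatched Group 1 is empty.
    replaced = PATTERN.sub(r"\1\n", text)
    # Stand-in for the follow-up ^\n replacement: drop the empty lines.
    return [line for line in replaced.splitlines() if line]


sample = '''<a href="java" class="total" id="maker" placeholder="getTheResult('local6')">master6<a>'''
print(extract_keys(sample))  # ['local6']
```

If only the values are needed in a script, collecting Group 1 directly with re.findall would be simpler; the replace form above is kept to mirror the editor workflow.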
Related
I have some strings that can be written in two different ways. I am trying to extract both patterns in the same piece of regex.
First, I'm hoping to extract the substring that comes before a given substring (I'll call this "endWord").
So
Title Text (Descriptor Text) endword - More words i don't want
Would turn into "Title Text (Descriptor Text)"
Next, from the substring I just extracted, I'm hoping to extract only the text before the " (" (if it exists).
So the final result will be "Title Text".
I tried (.+?(?= endWord))(.+?(?= \()) but it returns no result.
You can use
^(.*?)\s+\([^()]*\)(?=\s+endword\b)
See the regex demo.
Details:
^ - start of string
(.*?) - Group 1: any zero or more chars other than line break chars as few as possible
\s+ - one or more whitespaces
\([^()]*\) - (, zero or more chars other than ( and ), and then )
(?=\s+endword\b) - a positive lookahead that requires one or more whitespaces and a whole word endword immediately to the right of the current location.
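As a quick check, the pattern can be run against the sample string from the question. This is only a sketch in Python; the variable names are illustrative.

```python
import re

pattern = re.compile(r"^(.*?)\s+\([^()]*\)(?=\s+endword\b)")

text = "Title Text (Descriptor Text) endword - More words i don't want"
m = pattern.search(text)
# Group 1 holds everything before the parenthesized descriptor.
title = m.group(1) if m else None
print(title)  # Title Text
```

Note that this pattern requires the parenthesized part to be present; a string without it does not match.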
I need to do a find and delete the rest in a text file with Notepad++.
I want to use a regex to find variations of thban..... (the word always has at most 5 characters after thban, see the dots).
My search string below hits the last line, but it matches the whole line. I just want the word preserved.
Once this works, I also want to keep the words containing C3.....
The rest of the text file can be deleted.
It should also be case-insensitive.
(?!thban\w+).*\r?\n?
THBANES900 and C3950 bla bla
THBAN
..THBANES901.. C3850 bla bla
THBANMP900
**..thbanes900..**
This should result in
THBANES900 C3950
THBAN
THBANES901 C3850
THBANMP900
thbanes900
Maybe just capture those words of interest instead of replacing everything else? In Notepad++ search for pattern:
^.*\b(thban\S{0,5})(?:.*(\sC3\w+))?.*$|.+
See the Online Demo
^ - Start-of-string anchor.
.*\b - Any character other than newline zero or more times, up to a word boundary.
( - Open 1st capture group.
thban\S{0,5} - Match "thban" and zero to five non-whitespace chars.
) - Close 1st capture group.
(?: - Open non-capturing group.
.* - Any character other than newline zero or more times.
( - Open 2nd capture group.
\sC3\w+ - A whitespace character, then "C3" and one or more word characters.
) - Close 2nd capture group.
)? - Close non-capturing group and make it optional.
.* - Any character other than newline zero or more times.
$ - End-of-string anchor.
| - Alternation (OR).
.+ - Any character other than newline once or more.
Replace with:
$1$2
After this, you may end up with empty lines, which you can swiftly remove using the built-in option (in the English UI: Edit > Line Operations > Remove Empty Lines).
I'm not sure what the English label of the ignore-case checkbox is (it is probably "Match case"), but make sure it is not ticked.
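To see the capture-and-replace approach outside Notepad++, here is a small Python sketch run on the sample lines from the question. The name keep_words and the extra "unrelated line" are made up for illustration; IGNORECASE and MULTILINE stand in for the unticked case checkbox and line-by-line matching, and the final filter stands in for the empty-line cleanup.

```python
import re

PATTERN = re.compile(
    r"^.*\b(thban\S{0,5})(?:.*(\sC3\w+))?.*$|.+",
    re.IGNORECASE | re.MULTILINE,
)


def keep_words(text):
    """Keep only the thban... word (and a following C3... word) on each line."""
    # Lines without a thban word fall into the ".+" branch and become empty.
    replaced = PATTERN.sub(r"\1\2", text)
    # Stand-in for removing the empty lines afterwards.
    return "\n".join(line for line in replaced.split("\n") if line)


sample = "\n".join([
    "THBANES900 and C3950 bla bla",
    "THBAN",
    "unrelated line",  # hypothetical line, not in the original sample
    "..THBANES901.. C3850 bla bla",
    "THBANMP900",
    "**..thbanes900..**",
])
print(keep_words(sample))
```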
You may use
Find What: (?|\b(thban\S{0,5})|\s(C3\w+))|(?s:.)
Replace With: (?1$1\n:)
Screenshot & settings
Details
(?| - start of a branch reset group:
\b(thban\S{0,5}) - Group 1: a word boundary, then thban and any 0 to 5 non-whitespace chars
| - or
\s(C3\w+) - a whitespace char, and then Group 1: C3 and one or more word chars
) - end of the branch reset group
| - or
(?s:.) - any one char (including line break chars)
The replacement is
(?1 - if Group 1 matched,
$1\n - Group 1 value with a newline
: - else, replace with empty string
) - end of the conditional replacement pattern
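Python's built-in re module has neither the branch reset group (?|...) nor the conditional replacement syntax, so the following is only a rough emulation of the idea, not the Notepad++ expression itself. It assumes two separate capture groups instead of a shared Group 1, and a callback (the hypothetical one_word_per_line helper) in place of (?1$1\n:).

```python
import re

# Two ordinary groups replace the branch reset group; (?s:.) still eats
# every other character, including line breaks.
PATTERN = re.compile(r"\b(thban\S{0,5})|\s(C3\w+)|(?s:.)", re.IGNORECASE)


def one_word_per_line(text):
    """Emit each thban.../C3... word followed by a newline, drop the rest."""
    def repl(m):
        word = m.group(1) or m.group(2)
        # Same logic as (?1$1\n:): keep the word plus newline, else nothing.
        return word + "\n" if word else ""
    return PATTERN.sub(repl, text)


sample = "THBANES900 and C3950 bla bla\nTHBAN\n..THBANES901.. C3850 bla bla\n"
print(one_word_per_line(sample), end="")
```

With this approach each kept word ends up on its own line, so a thban word and its C3 word are not kept together on one line.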
I have these two lines of text that I want to manipulate using a regular expression with substitution:
Obj.FieldNameA = Reader.GetEnumFromInt32<ClassName>(QueryGenerator,nameof(Obj.));
Obj.FieldNameB=Reader.GetTrimmedStringOrNull(QueryGenerator,nameof(Obj.));
The first Obj. on each line is followed by a field name; in this case the names are FieldNameA and FieldNameB.
I want to attach these values to the second Obj. found on the same line, so the text should become:
Obj.FieldNameA = Reader.GetEnumFromInt32<ClassName>(QueryGenerator,nameof(Obj.FieldNameA));
Obj.FieldNameB=Reader.GetTrimmedStringOrNull(QueryGenerator,nameof(Obj.FieldNameB));
I have tested this very simple (and wrong) regex:
Obj\.(\w*).*\n
with $1 as the substitution.
But I don't know how to use substitution...
Sample code here
Some Notes:
After FieldNameA there is always an equal sign that could be preceded or followed by a space.
Before the second Obj. there could be any character, including < ( etc...
Could this be achieved?
You may use
Find: (Obj\.(\w+).*\(Obj\.)\)
Replace: $1$2)
See the regex demo.
You may also add ^ to the start of the regex to match only at the start of a line/string.
Details
^ - start of string (only if you add the optional anchor mentioned above)
(Obj\.(\w+).*\(Obj\.) - Group 1 ($1 in the replacement):
Obj\. - Obj. text
(\w+) - Group 2 ($2): 1 or more word chars
.* - any 0+ chars other than line break chars, as many as possible (you may use .*? to match only up to the second Obj. on a line; your current input has only two per line, with the second one closer to the end of the line, so .* will work better)
\(Obj\. - (Obj. text
\) - a ) char.
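The same find and replace can be tried in Python, where the replacement uses \1 and \2 instead of $1 and $2. This is a minimal sketch on the two lines from the question; the variable names are illustrative.

```python
import re

PATTERN = re.compile(r"(Obj\.(\w+).*\(Obj\.)\)")

lines = [
    "Obj.FieldNameA = Reader.GetEnumFromInt32<ClassName>(QueryGenerator,nameof(Obj.));",
    "Obj.FieldNameB=Reader.GetTrimmedStringOrNull(QueryGenerator,nameof(Obj.));",
]

# Group 1 is everything up to and including "(Obj.", Group 2 is the field name;
# the literal ")" consumed by the match is put back after the field name.
fixed = [PATTERN.sub(r"\1\2)", line) for line in lines]
for line in fixed:
    print(line)
```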
If I have the following example:
X-FileName: pallen (Non-Privileged).pst
Here is our forecast
Message-ID: <15464986.1075855378456.JavaMail.evans#thyme>
How can I select the text
Here is our forecast
after "X-FileName .... \n" up to, but excluding, "Message-ID"?
I read about lookahead and lookbehind and tried this, but it didn't work:
(?<=X-FileName:(\n)+$).+(?=Message-ID:)
This should do it:
(?:X-FileName:[^\n]+)\n+([^\n]+)\n+(?:Message-ID:) (group #1 is the match)
Demo
Explanation:
(?:X-FileName:[^\n]+) matches X-FileName: followed by one or more characters that aren't newlines, without capturing it (?:).
\n+ matches one or more consecutive newlines.
([^\n]+) matches and captures one or more consecutive characters that aren't newlines.
\n+, again, matches one or more consecutive newlines.
(?:Message-ID:) matches Message-ID: without capturing it (?:).
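A short Python sketch of this pattern follows. It assumes the three lines of the example are separated by blank lines, which \n+ tolerates either way; the variable names are illustrative.

```python
import re

PATTERN = re.compile(r"(?:X-FileName:[^\n]+)\n+([^\n]+)\n+(?:Message-ID:)")

# Assumed layout: blank lines between the header, the body line and Message-ID.
text = (
    "X-FileName: pallen (Non-Privileged).pst\n"
    "\n"
    "Here is our forecast\n"
    "\n"
    "Message-ID: <15464986.1075855378456.JavaMail.evans#thyme>"
)

m = PATTERN.search(text)
body = m.group(1) if m else None
print(body)  # Here is our forecast
```

Because Group 1 is [^\n]+, this only works when the wanted text is a single line.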
Edit: as @WiktorStribiżew mentioned, though, splitting your text into lines may be an easier and cleaner way to retrieve what you want.
There are two approaches here, and they depend on the broader context. If your expected substring is the second paragraph, just split with \n\n (or \r\n\r\n) and get the second item from the resulting list.
If it is a text inside some larger text, use a regex.
See a Python demo:
import re

# The blank lines inside the string matter: the non-regex approach splits on them
s='''X-FileName: pallen (Non-Privileged).pst

Here is our forecast

Message-ID: <15464986.1075855378456.JavaMail.evans#thyme>'''
# Non-regex way for the string in the exact same format
print(s.split('\n\n')[1])
# Regex way to get some substring in a known context
m = re.search(r'X-FileName:.*[\r\n]+(.+)', s)
if m:
    print(m.group(1))
The regex means:
X-FileName: - a literal substring
.* - any 0+ chars other than line break chars
[\r\n]+ - 1 or more CR or LF chars
(.+) - Group 1: one or more chars other than line break chars, as many as possible.
See the regex demo.
How do you use regex to insert | every two characters from a starting position to the end of the line?
Using regex on the following sample (tshark output of packet data), the regex inserts | after the first two characters and the next two characters, but does not apply the pattern to the rest of the line. I think the issue is with a repeated pattern on the 2nd grouping (or lack thereof).
Sample:
1478646603.255173000 10.10.10.1 0000000000000000000000
^(.{34})(..) replace with \1|\2| OR ^(.{34})(.*?(..)) replace with \1|\2
Produces this:
1478646603.255173000 10.10.10.1 00|00|000000000000000000
What I want is:
1478646603.255173000 10.10.10.1 00|00|00|00|00|00|00|00|00|00|00
You may use
(?:\G(?!^)|^.{36})\K..(?!$)
and replace with $&|.
Details:
(?:\G(?!^)|^.{36}) - matches the location at the end of the previous successful match (with \G(?!^)) or (|) the start of a line (^) and the first 36 characters other than linebreak chars (.{36})
\K - the match reset operator that discards the whole text matched so far
.. - any 2 chars other than linebreak chars
(?!$) - that are not at the end of the string.
The replacement pattern only contains the backreference to the whole match ($&) and a | pipe symbol (a literal symbol in the replacement pattern).
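Python's built-in re module supports neither \G nor \K, so the expression above cannot be used there as is. The sketch below swaps in a different technique: it slices off the fixed-width prefix and applies only the ..(?!$) part to the rest of the line. The function name insert_pipes is made up, and the offset is taken from the sample as it appears in this post rather than from the 36 used above; adjust it to the real column where the hex data starts.

```python
import re


def insert_pipes(line, offset):
    """Insert | after every 2-char pair of line[offset:], except the last pair."""
    head, tail = line[:offset], line[offset:]
    # "..(?!$)" is the tail of the answer's pattern; slicing replaces \G and \K.
    return head + re.sub(r"..(?!$)", r"\g<0>|", tail)


prefix = "1478646603.255173000 10.10.10.1 "  # prefix as shown in the sample line
sample = prefix + "00" * 11
result = insert_pipes(sample, len(prefix))
print(result)
```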