Regex to check number of spaces after full stop - Strictly 2 required - regex

I need to find occurrences where I have put one whitespace after a full stop, and replace it with 2 spaces. I have the regex for it, but Atom seems to call it invalid.
(?<=\.|\") {1,}(?=[a-zA-Z])
Conditions:
1 space after a period.
If the period is followed by a closing double quote, then 1 space after the quote.
The above regex works perfectly for my conditions; however, Atom is not able to validate it. I need to use it for existing files.

You may use
([."]) ([a-zA-Z])
and replace with $1  $2 (two spaces between the two backreferences). See the regex demo.
Details
([."]) - Group 1 (its value is referred to with $1 backreference from the replacement pattern): . or "
(space) - a literal space character (use \s to match any whitespace)
([a-zA-Z]) - Group 2 ($2): an ASCII letter.
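To try the pattern outside the editor, here is a minimal Python sketch of the same replacement. The helper name and sample sentences are assumptions made for illustration, and Python's re module writes the backreferences as \1 and \2 rather than $1 and $2.

```python
import re

# A period or double quote, exactly one space, then an ASCII letter.
SINGLE_SPACE = re.compile(r'([."]) ([a-zA-Z])')

def double_space_after_stop(text: str) -> str:
    """Hypothetical helper: widen a single space after . or " to two spaces."""
    # \1 and \2 put the captured punctuation and letter back, two spaces apart.
    return SINGLE_SPACE.sub(r'\1  \2', text)

if __name__ == "__main__":
    print(double_space_after_stop("First sentence. Second sentence."))
```

Text that already has two spaces after the full stop is left alone, because the pattern requires the letter to follow the first space directly.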

Related

Pattern to match everything except a string of 5 digits

I only have access to a function that can match a pattern and replace it with some text:
Syntax
regexReplace('text', 'pattern', 'new text')
And I need to return only the 5 digit string from text in the following format:
CRITICAL - 192.111.6.4: rta nan, lost 100%
Created Time Tue, 5 Jul 8:45
Integration Name CheckMK Integration
Node 192.111.6.4
Metric Name POS1
Metric Value DOWN
Resource 54871
Alert Tags 54871, POS1
So from this text, I want to replace everything with "" except the "54871".
I have come up with the following:
regexReplace("{{ticket.description}}", "\w*[^\d\W]\w*", "")
This almost works, but it doesn't match the symbols. How can I change it to match any word that includes a letter or symbol?
The pattern I have is very close; I just need it to cover special characters as well as letters, whereas currently it only covers letters.
You can match the whole string but capture the 5-digit number into a capturing group and replace with the backreference to the captured group:
regexReplace("{{ticket.description}}", "^(?:[\w\W]*\s)?(\d{5})(?:\s[\w\W]*)?$", "$1")
See the regex demo.
Details:
^ - start of string
(?:[\w\W]*\s)? - an optional substring of any zero or more chars as many as possible and then a whitespace char
(\d{5}) - Group 1 ($1 contains the text captured by this group pattern): five digits
(?:\s[\w\W]*)? - an optional substring of a whitespace char and then any zero or more chars as many as possible.
$ - end of string.
The easiest regex is probably:
^(.*\D)?(\d{5})(\D.*)?$
You can then replace the string with "$2" ("\2" in other languages) to only place the contents of the second capture group (\d{5}) back.
The only issue is that . doesn't match newline characters by default. Normally you can pass a flag to change . to match ALL characters. For most regex variants this is the s (single line) flag (PCRE, Java, C#, Python). Other variants use the m (multi line) flag (Ruby). Check the documentation of the regex variant you are using for verification.
However, the question suggests that you're not able to pass flags separately, in which case you could pass them as part of the regex itself.
(?s)^(.*\D)?(\d{5})(\D.*)?$
regex101 demo
(?s) - Set the s (single line) flag for the remainder of the pattern, which enables . to match newline characters ((?m) for Ruby).
^ - Match the start of the string (\A for Ruby).
(.*\D)? - [optional] Match anything followed by a non-digit and store it in capture group 1.
(\d{5}) - Match 5 digits and store it in capture group 2.
(\D.*)? - [optional] Match a non-digit followed by anything and store it in capture group 3.
$ - Match the end of the string (\z for Ruby).
This regex will result in the last 5-digit number being stored in capture group 2. If you want to use the first 5-digit number instead, you'll have to use a lazy quantifier in (.*\D)?, so that it becomes (.*?\D)?.
(?s) is supported by most regex variants, but not all. Refer to the regex variant documentation to see if it's available for you.
An example where the inline flags are not available is JavaScript. In such a scenario you need to replace . with something that matches ALL characters. In JavaScript [^] can be used. For other variants this might not work and you need to use [\s\S].
With all this out of the way, and assuming a language that can use "$2" as the replacement, where you do not need to escape backslashes, and a regex variant that supports an inline (?s) flag, the answer would be:
regexReplace("{{ticket.description}}", "(?s)^(.*\D)?(\d{5})(\D.*)?$", "$2")
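The greedy and lazy variants described above can be compared with a short Python sketch. This assumes Python's re module as a stand-in for whatever engine regexReplace uses; the helper name and the shortened ticket text are illustrative, and Python writes the backreference as \g<2> instead of $2.

```python
import re

# Greedy prefix (.*\D)? keeps the LAST standalone 5-digit run.
LAST_FIVE = re.compile(r'(?s)^(.*\D)?(\d{5})(\D.*)?$')
# Lazy prefix (.*?\D)? keeps the FIRST standalone 5-digit run.
FIRST_FIVE = re.compile(r'(?s)^(.*?\D)?(\d{5})(\D.*)?$')

def keep_five_digits(text: str, pattern: re.Pattern = LAST_FIVE) -> str:
    """Hypothetical helper: replace the whole text with capture group 2."""
    return pattern.sub(r'\g<2>', text)

# Shortened version of the ticket description from the question.
ticket = (
    "CRITICAL - 192.111.6.4: rta nan, lost 100%\n"
    "Resource 54871\n"
    "Alert Tags 54871, POS1\n"
)

if __name__ == "__main__":
    print(keep_five_digits(ticket))
```

If the text contains no 5-digit run at all, the pattern does not match and the text is returned unchanged, so the caller may want to check for that case.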

Using regex replacement in Sublime 3

I am trying to use replace in Sublime using regular expressions but I'm stuck. I tried various combinations but don't seem to be getting there.
This is the input and my desired output:
Input: N_BBP_c_46137_n
Output : BBP
I tried combinations of:
[^BBP]+\b
\*BBP*+\g
But none of the above (and many others) seem to work.
To turn N_BBP_c_46137_n into BBP (according to the comment, you just want the entire long name starting with N_BBP_ to be replaced by only BBP), you might also use a capture group to keep BBP.
\bN_(BBP)_\S*
\bN_ Match N preceded by a word boundary
(BBP) Capture group 1, match BBP (or use [A-Z]+ to match 1+ uppercase chars)
_\S* Match _ followed by 0+ times a non whitespace char
In the replacement use the first capturing group $1
Regex demo
You may use
(N_)[^_]*(_c_\d+_n)
Replace with ${1}some new value$2.
Details
(N_) - Group 1 ($1 or ${1} if the next char is a digit): N_
[^_]* - any 0 or more chars other than _
(_c_\d+_n) - Group 2 ($2): _c_, 1 or more digits and then _n.
See the regex demo.
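Both suggestions can be checked quickly in Python. This is only a sketch: Sublime Text uses a different regex engine, but these two patterns use syntax that Python's re module also accepts, and the variable names are made up for the example. Python writes ${1} as \g<1>.

```python
import re

name = "N_BBP_c_46137_n"

# First suggestion: match the whole name and keep only the captured BBP.
only_bbp = re.sub(r'\bN_(BBP)_\S*', r'\1', name)

# Second suggestion: keep the N_ prefix and the _c_<digits>_n suffix,
# and swap whatever sits between them for new text.
# \g<1> is unambiguous even if the new text starts with a digit.
swapped = re.sub(r'(N_)[^_]*(_c_\d+_n)', r'\g<1>some new value\2', name)

if __name__ == "__main__":
    print(only_bbp)
    print(swapped)
```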

Removing everything but the regex result (Notepad++) [duplicate]

If I have a big text and I need to keep only the matched content, how can I do that?
For example, if I have a text like this:
asdas8Isd8m8Td8r
asdia8y8dasd
asd8is88n8gd
asd8t8od8lsdas
as9ea9ad8r1n88r8e87g6765ejasdm8x
If I use this regex: [0-9]([a-z]) to group all letters after a number and replace with \1, I will replace every (number)(letter) with (letter). But what if I want to delete the rest and keep only the matched letters?
Converting this text to
ImTr
y
ing
tol
earnregex
How can I replace this text with the captured groups and delete the rest?
And what if I want to delete everything except the full matches?
In this case, converting the text to:
8I8m8T8r
8y8d
8i8n8g
8t8o8l
9e9a9r1n8r7g5e8x
Can I match everything that is not [0-9]([a-z])?
Thanks! :D
You may use the following regex:
(?i-s)[0-9]([a-z])|.
Replace with (?{1}$1:).
To keep the whole matches instead and delete everything else, use the (?{1}$0:) replacement with the same regex.
Details:
(?i-s) - an inline modifier turning on case insensitive mode and turning off the DOTALL mode (. does not match a newline)
[0-9]([a-z]) - an ASCII digit and any ASCII letter captured into Group 1 (later referred to with $1 or \1 backreference from the string replacement pattern)
| - or
. - any char but a line break char.
Replacement details
(?{1} - start of the conditional replacement: if Group 1 matched then...
$1 - the contents of Group 1 (or the whole match if $0 backreference is used)
: - else... nothing
) - end of the conditional replacement pattern.
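The conditional replacement syntax above is specific to Notepad++. As a rough Python equivalent, assuming the re module, a replacement function can make the same "if Group 1 matched" decision. The helper names are invented for this sketch, and (?-s) is dropped because . already excludes newlines in Python by default.

```python
import re

# Either a digit followed by a letter (letter captured), or any other character.
PAIR_OR_ANY = re.compile(r'(?i)[0-9]([a-z])|.')

def keep_letters(text: str) -> str:
    """Like (?{1}$1:): keep Group 1 when it matched, otherwise nothing."""
    return PAIR_OR_ANY.sub(lambda m: m.group(1) or '', text)

def keep_pairs(text: str) -> str:
    """Like (?{1}$0:): keep the whole digit+letter match, otherwise nothing."""
    return PAIR_OR_ANY.sub(lambda m: m.group(0) if m.group(1) else '', text)

if __name__ == "__main__":
    sample = "asdas8Isd8m8Td8r\nasd8is88n8gd"
    print(keep_letters(sample))
    print(keep_pairs(sample))
```

Line breaks survive in both helpers because the . branch never matches them.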

Regular expressions in notepad++ (Search and Replace)

I have a list of thousands of records within a .txt document.
some of them look like these records
201910031044 "00059" "11.31AG" "Senior Champion"
201910031044 "00060" "GBA146" "Junior Champion"
201910031044 "00999" "10.12G" "ProAM"
201910031044 "00362" "113.1LI" "Abcd"
Whenever a record similar to this occurs, I'd like to get rid of the last words/numbers/etc. in the last quotation marks (like "Senior Champion", "Junior Champion" etc. There are many possibilities here).
e.g. (before)
201910031044 "00059" "11.31AG" "Senior Champion"
after
201910031044 "00059" "11.31AG"
I tried the following regex but it wouldn't work.
Search: ^([0-9]{17,17} + "[0-9]{8,8}" + "[a-zA-Z0-9]").*$
Replace: \1 (replace string)
OK I forgot the . (dot) sign however even if I do not have a . (dot) sign it would not work. Not sure if it has anything to do when using the + sign used more than once.
I'd like to get rid of the last words/numbers/etc in the last quotation marks
This does the job:
Ctrl+H
Find what: ^.+\K\h+".*?"$
Replace with: LEAVE EMPTY
CHECK Wrap around
CHECK Regular expression
UNCHECK . matches newline
Replace all
Explanation:
^ # beginning of line
.+ # 1 or more any character but newline
\K # forget all we have seen until this position
\h+ # 1 or more horizontal spaces
".*?" # something inside quotes
$ # end of line
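Python's re module supports neither \K nor \h, so this sketch swaps in a capture group and [ \t]+ to get the same effect as the steps above. The helper name is an assumption for illustration.

```python
import re

# Group 1 plays the role of the part before \K; [ \t]+ stands in for \h+.
TRAILING_QUOTED = re.compile(r'^(.+)[ \t]+".*?"$', re.MULTILINE)

def drop_last_quoted_field(text: str) -> str:
    """Hypothetical helper: remove the last quoted field on every line."""
    return TRAILING_QUOTED.sub(r'\1', text)

if __name__ == "__main__":
    print(drop_last_quoted_field('201910031044 "00059" "11.31AG" "Senior Champion"'))
```

Because .+ is greedy, it gives back characters only until the remainder of the line is a single quoted field, so the space inside "Senior Champion" does not cause an early split.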
The RegEx looks for the 4th double quote:
^(?:[^"]*\"){4}([^|]*)
You can see this demo: https://regex101.com/r/wJ9yS6/163
You will still need to parse the lines, so it is probably easier to open the file in Excel or parse it as a CSV using code.
You have a problem with the count of your characters:
you specify that the line should start with exactly 17 digits ([0-9]{17,17}). However, there are only 12 digits in the data 201910031044.
you can specify exactly 12 digits by using {12} or if it could be 12-17, then {12,17}. I'll assume exactly 12 based on the current data.
similarly, for the second column you specify that it's exactly 8 digits surrounded by quotes ("[0-9]{8,8}") but it only has 5 digits surrounded by quotes.
again, you can specify exactly 5 with {5} or 5-8 with {5,8}. I will assume exactly 5.
finally, there is no quantifier for the final field, so the regex tries to match exactly one character that is a letter or a number surrounded by quotes "[a-zA-Z0-9]".
I'm not sure if there is any limit on the number of characters, so I would go with one or more using + as quantifier "[a-zA-Z0-9]+" - if you can have zero or more, then you can use *, or if it's any other count from m to n, then you can use {m,n} as before.
This is not a character count problem, but the final column can also contain dots, which the regex doesn't account for. You can just add . inside the square brackets, where it will match only literal dot characters. It's usually used as a wildcard, but it loses its special meaning inside a character class ([]), so you get "[a-zA-Z0-9.]+".
Putting it all together, you get
Search: ^([0-9]{12} + "[0-9]{5}" + "[a-zA-Z0-9.]+").*$
Replace: \1
Which will get rid of anything after the third field in Notepad++.
This can be shortened a bit by using \d instead of [0-9] for digits and \s+ for whitespace instead of a space followed by +. As a benefit, \s will also match other whitespace like tabs, so you don't have to manually account for those. This leads to
Search: ^(\d{12}\s+"\d{5}"\s+"[a-zA-Z0-9.]+").*$
Replace: \1
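The shortened pattern can be tried on the sample records with a small Python sketch. It assumes the re module with the MULTILINE flag so that ^ and $ work per line, as they do in Notepad++; the helper name is made up for the example.

```python
import re

# 12 digits, a quoted 5-digit field, a quoted alphanumeric/dot field,
# then whatever follows on the line (dropped by the replacement).
RECORD = re.compile(r'^(\d{12}\s+"\d{5}"\s+"[a-zA-Z0-9.]+").*$', re.MULTILINE)

def keep_first_three_fields(text: str) -> str:
    """Hypothetical helper: keep Group 1 and drop the rest of each matching line."""
    return RECORD.sub(r'\1', text)

if __name__ == "__main__":
    print(keep_first_three_fields('201910031044 "00060" "GBA146" "Junior Champion"'))
```

Lines that do not fit the record layout are left untouched, since the pattern simply fails to match them.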
If you want to get rid of the last words/numbers/etc in the last quotation marks you could capture in a group what is before that and match the last quotation marks and everything between it to remove it using a negated character class.
If what is between the values can be spaces or tabs, you could use [ \t]+ to match those (using \s could also match a newline)
Note that {17,17} and {8,8} may also be written as {17} and {8} which in this case should be {12} and {5}
^([0-9]{12}[ \t]+"[0-9]{5}"[ \t]+"[a-zA-Z0-9.]+")[ \t]{2,}"[^"\r\n]+"
In parts
^ Start of string
( Capture group 1
[0-9]{12}[ \t]+ Match 12 digits and 1+ spaces or tabs
"[0-9]{5}"[ \t]+ Match 5 digits between " and 1+ spaces or tabs
"[a-zA-Z0-9.]+" Match 1+ times any of the listed between "
) Close group
[ \t]{2,} Match 2 or more spaces or tabs
"[^"\r\n]+" Match 1+ chars other than " or a newline, between double quotes
In the replacement use group 1 $1
Regex demo

Regular expression for substitute a string with another

I have these two lines of text that I want to manipulate using a regular expression and substitution:
Obj.FieldNameA = Reader.GetEnumFromInt32<ClassName>(QueryGenerator,nameof(Obj.));
Obj.FieldNameB=Reader.GetTrimmedStringOrNull(QueryGenerator,nameof(Obj.));
Attached on the first Obj. there is a Field name, so in this case they are FieldNameA,FieldNameB
I want to attach these values to the second Obj. found on the same line, so the text should become:
Obj.FieldNameA = Reader.GetEnumFromInt32<ClassName>(QueryGenerator,nameof(Obj.FieldNameA));
Obj.FieldNameB=Reader.GetTrimmedStringOrNull(QueryGenerator,nameof(Obj.FieldNameB));
I have tested this very simple (and wrong) regex:
Obj\.(\w*).*\n
With substitution as $1
But I don't know how to use substitution...
Sample code here
Some Notes:
After FieldNameA there is always an equal sign that could be preceded or followed by a space.
Before the second Obj. there could be any character, including < ( etc...
Could this be achieved?
You may use
Find: (Obj\.(\w+).*\(Obj\.)\)
Replace: $1$2)
See the regex demo.
You may also add ^ to the start of the regex to match only at the start of a line/string.
Details
^ - start of string (only if you add it, as noted above)
(Obj\.(\w+).*\(Obj\.) - Group 1 ($1 in the replacement):
Obj\. - Obj. text
(\w+) - Group 2 ($2): 1 or more word chars
.* - any 0+ chars other than line break chars as many as possible (you may use .*? to only match the second Obj. on a line, your current input only has two with the second one closer to the end of a line, so .* will work better)
\(Obj\. - (Obj. text
\) - a ) char.
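As a quick check of the find/replace pair, here is a minimal Python sketch. The helper name is an assumption, and Python's re module writes $1$2 as \g<1>\g<2>.

```python
import re

# Group 1: everything from the first Obj. up to and including the "(Obj." before ")".
# Group 2: the field name that follows the first Obj.
FIELD_COPY = re.compile(r'(Obj\.(\w+).*\(Obj\.)\)')

def fill_nameof(source: str) -> str:
    """Hypothetical helper: copy the field name into the empty nameof(Obj.)."""
    return FIELD_COPY.sub(r'\g<1>\g<2>)', source)

if __name__ == "__main__":
    line = "Obj.FieldNameB=Reader.GetTrimmedStringOrNull(QueryGenerator,nameof(Obj.));"
    print(fill_nameof(line))
```

Since . does not cross line breaks here, each line is handled on its own even when several lines are passed in one string.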