RegEx help for NotePad++ - regex

I need help with a regex that I just can't figure out. I need to search for broken hashtags that contain a space.
The strings look like this, for example:
#ThisIsaHashtagWith Space
But the words "With Space" can also appear on their own, and I don't want to replace those.
So what matters is that the string starts with "#", followed by any characters, and then the words "With Space", which I want to replace with "WithSpace" to repair the hashtag.
I have a document with about 10k of these broken hashtags, and I have been trying the whole day without success.
I tried the following regex on regex101.com:
^#+(?:.*?)+(With Space)
Although I think it works on regex101.com, it doesn't work in Notepad++.
Any help is appreciated.
Thanks a lot.
BR

In your current regex you match one or more # characters, then any characters, and then With Space inside a capturing group.
You could instead put the capturing group around the first part of the match:
(#+.*?)With Space
Then you could use that group in the replacement:
$1WithSpace
As an alternative, you could first match one or more # characters followed by any character zero or more times, non-greedily (.*?), and then use \K to reset the starting point of the reported match.
Then match With Space:
#+(?:.*?)\KWith Space
In the replacement, use WithSpace.
The + quantifier after # matches one or more # characters. If the match should start at the beginning of the line, you could add the anchor ^ at the start of the regex.
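The capture-group variant above can be tried outside Notepad++ as well. The following is a minimal sketch in Python, assuming the standard `re` module; the function name `repair_hashtags` is hypothetical. Python's `re` does not support `\K`, so only the capture-group form is shown, and the group reference is written `\g<1>` rather than `$1`.

```python
import re

# Same pattern as (#+.*?)With Space; group 1 keeps everything from the
# first '#' up to the broken part so it can be written back unchanged.
BROKEN_TAG = re.compile(r"(#+.*?)With Space")


def repair_hashtags(text: str) -> str:
    """Join 'With Space' into 'WithSpace' only when a '#' precedes it on the line."""
    return BROKEN_TAG.sub(r"\g<1>WithSpace", text)


print(repair_hashtags("#ThisIsaHashtagWith Space"))
print(repair_hashtags("Plain text With Space"))
```

The second call leaves its input untouched because no # appears before With Space.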

Try using ^(#.+?)(With\s+Space) as your regex, since it also matches multiple spaces and tab characters. If you test it on regex101 with multiple lines, set the g, m and i flags. I just tried it with the following two strings, each on a separate line in Notepad++:
#blablaWith Space
#hello###$aWith Space
The Replace with value is set to $1WithSpace, and I tried both Replace All and replacing one by one. Both give the following result:
#blablaWithSpace
#hello###$aWithSpace
Feel free to comment with other strings you want replaced. Also make sure that you have selected the Regular expression search mode in Notepad++.
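For checking the anchored pattern against several lines at once, here is a small sketch in Python (an assumption; the answer itself only used Notepad++). The third sample line is added for illustration to show that text without a leading # is left alone. `re.MULTILINE` plays the role of the m flag so that ^ matches at the start of every line.

```python
import re

# Two lines from the answer (one with a doubled space) plus one
# illustrative line that does not start with '#'.
sample = "#blablaWith Space\n#hello###$aWith  Space\nJust text With Space\n"

# ^ anchors at each line start; \s+ accepts one or more spaces or tabs.
pattern = re.compile(r"^(#.+?)With\s+Space", re.MULTILINE)

fixed = pattern.sub(r"\g<1>WithSpace", sample)
print(fixed)
```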

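Try this: (#.*)( )
I tried this in Notepad++, and you should be able to just Replace All with $1. Make sure you set the search mode to Regular expression first. Because .* is greedy, this removes the last space on each line that contains a #. The same replacement in JavaScript: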
const str = "#ThisIsAHashtagWith Space";
console.log(str.replace(/(#.*)( )/g, "$1"));

Related

Regex to find if all the characters in a word are the same specific character

I have a set of words coming in one by one, like aa, ##, ???, ~~~, ?~, etc.
I need a regex to find out whether a word contains only ? or only ~.
Of the above input examples, ??? and ~~~ should match but not the others.
I tried ^[\s?]*$ and ^[\s~]*$ separately and they work; now I am trying to combine them.
^[\s?||~]*$ doesn't work, as it also accepts ?~ as valid.
Any help?
You can use this regex, which looks for a string starting with a ~ or a ?, and then asserts that every other character in the string is the same as the first one using a backreference (\1):
^([~?])\1+$
You need to use a backreference to achieve your desired result.
If you want only ~ or ?, use
^([~?])\1+$
If you want any repeated character, use
^(.)\1+$
Explanation: (.) or ([~?]) captures the first character.
Then \1+ matches that same character one or more times (a backreference). Note that this requires at least two characters in total; use \1* instead if a single ? or ~ should also match.
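The backreference pattern can be checked against the question's sample words. This is a small sketch in Python, assuming its `re` module, which supports `\1` backreferences in the same way; the names `SAME_CHAR` and `matches` are illustrative.

```python
import re

# ([~?]) captures the first character; \1+ requires every following
# character to be that same captured character.
SAME_CHAR = re.compile(r"^([~?])\1+$")

words = ["aa", "##", "???", "~~~", "?~"]
matches = [w for w in words if SAME_CHAR.match(w)]
print(matches)
```

Only ??? and ~~~ pass: aa and ## fail the character class, and ?~ fails the backreference.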
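You want to match lines that consist entirely of tildes or question marks. In basic (grep-style) regex syntax that would be ^\(~\|?\)*$, where the parentheses that form the group and the vertical bar that means 'or' need to be backslash-escaped. Note that this pattern also accepts mixed strings such as ?~, so it does not enforce that all characters are the same.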

Notepad++ Regex Find all endline without periods

I'm trying to find all lines that don't end with a period (dot), without also matching blank (empty) lines. After that, I want to add a period to the end of each such line.
Example:
The good is whatever stops such things from happening.
Meaning as the Higher Good
It was from this that I drew my fundamental moral conclusions.
I have tried a few regexes, but they also match empty lines.
Is there a regex for Notepad++ that can achieve that?
Enable Regular Expression match, then search for:
\S(?<!\.)\K\s*$
and replace with:
.$0
Breakdown:
\S Match a non-whitespace character
(?<!\.) It shouldn't be a period
\K Reset match
\s* Match optional whitespace characters
$ End of line
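The breakdown above relies on \K, which Python's `re` module does not support. As a rough equivalent, the sketch below uses two capture groups instead; the sample text (with one empty line added) and the function name `add_periods` are assumptions for illustration, not part of the original answer.

```python
import re

# Assumed sample: the question's three lines plus one empty line.
text = (
    "The good is whatever stops such things from happening.\n"
    "Meaning as the Higher Good\n"
    "\n"
    "It was from this that I drew my fundamental moral conclusions.\n"
)

# ([^\s.])  last visible character on the line, which must not be a period
# ([ \t]*)  optional trailing spaces or tabs, kept after the new period
# $         end of line (re.MULTILINE lets $ match before each newline)
MISSING_PERIOD = re.compile(r"([^\s.])([ \t]*)$", re.MULTILINE)


def add_periods(s: str) -> str:
    """Append a period to every non-empty line that does not already end with one."""
    return MISSING_PERIOD.sub(r"\g<1>.\g<2>", s)


print(add_periods(text))
```

Empty lines are skipped because the first group needs at least one non-whitespace character.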
You could also use something like this to find the line endings you are interested in, and then append the characters you need in the replacement:
(?<!\.)\r\n
This works by using a negative lookbehind, (?<!\.), to check that there is no . before \r. Note that it also matches the line break after an empty line, because that line break is not preceded by a period either.
There is a group of regex operators that can be used to accomplish this type of task:
Look ahead positive (?=)
Look ahead negative (?!)
Look behind positive (?<=)
Look behind negative (?<!)
Try this short and effective solution too.
Search: \w$
Replace: $0.

Regex: ignore characters that follow

I'd like to know how I can ignore characters that follow a particular pattern in a regex.
I tried positive lookaheads, but they do not work because they preserve those characters for other matches, while I want them to be just... discarded.
For example, a part of my regex is: (?<DoubleQ>\"\".*?\"\")|(?<SingleQ>\".*?\")
in order to match some "key-parts" of this string:
This is a ""sample text"" just for "testing purposes": not to be used anywhere else.
I want to capture the entire ""sample text"", but then I want to "extract" only sample text and the same with testing purposes. That is, I want the group to match to be ""sample text"", but then I want the full match to be sample text. I partially achieved that with the use of the \K option:
(?<DoubleQ>\"\"\K.*?\"\")|(?<SingleQ>\"\K.*?\")
Which ignores the first "" (or ") from the full match but takes it into account when matching the group. How can I ignore the following "" (")?
Note: positive lookahead does not work: it does not ignore characters from the following matches, it just does not include them in the current match.
Thanks a lot.
I hope I understood your question correctly: you want to match the whole string including the quotes, but replace or extract only the expression without the quotes, right?
You can typically use the regex replace functionality to extract just a part of the match.
This is the regex:
""?(.*?)""?
And this the replace expression:
$1
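To see what group 1 captures, here is a brief sketch in Python (an assumption; the question does not name its regex engine) that runs the pattern over the question's sample sentence. `re.findall` returns the contents of the single capture group for each match.

```python
import re

sentence = (
    'This is a ""sample text"" just for "testing purposes": '
    "not to be used anywhere else."
)

# ""? matches one or two opening quotes, (.*?) lazily captures the text,
# and the trailing ""? consumes the closing quote or quotes.
QUOTED = re.compile(r'""?(.*?)""?')

inner = QUOTED.findall(sentence)
print(inner)
```

Because the closing quotes are consumed by the match rather than merely looked at, they are not available as the opening quote of a later match.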

Regex: Find multiple matching strings in all lines

I'm trying to match multiple strings in a single line using regex in Sublime Text 3.
I want to match all values and replace them with null.
Part of the string that I'm matching against:
"userName":"MyName","hiScore":50,"stuntPoints":192,"coins":200,"specialUser":false
List of strings that it should match:
"MyName"
50
192
200
false
Result after replacing:
"userName":null,"hiScore":null,"stuntPoints":null,"coins":null,"specialUser":null
Is there a way to do this without using sed or any other substitution method, but just by matching the wanted pattern in regex?
You can use this find pattern:
:(.*?)(,|$)
And this replace pattern:
:null\2
The first group will match any symbol (dot) zero or more times (asterisk) with this last quantifier lazy (question mark), this last part means that it will match as little as possible. The second group will match either a comma or the end of the string. In the replace pattern, I substitute the first group with null (as desired) and I leave the symbol matched by the second group unchanged.
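The find and replace pair described above can be reproduced as follows. This is a minimal sketch in Python rather than Sublime Text, assuming values that contain no commas or colons, as in the question's sample.

```python
import re

line = '"userName":"MyName","hiScore":50,"stuntPoints":192,"coins":200,"specialUser":false'

# Group 1 is the value (lazy), group 2 is the delimiter: a comma or the
# end of the string. \2 writes the delimiter back unchanged.
nulled = re.sub(r":(.*?)(,|$)", r":null\2", line)
print(nulled)
```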
Here is an alternative to amaurs' answer that matches only the value, so the replacement does not have to put the colon or comma back:
:\K(.*?)(?=,|$)
And this replacement pattern:
null
This works like amaurs' answer, but the reported match starts after the colon (using \K to reset the match starting point) and runs until a comma or the end of the line (using a positive lookahead).
I have tested this, and it works in Sublime Text 2 (so it should work in Sublime Text 3).
Another slightly better alternative is:
(?<=:).+?(?=,|$)
which uses a positive lookbehind instead of resetting the match starting point.
Another good alternative (so far the most efficient here):
:\K[^,]*
This may help.
Find: (?<=:)[^,]*
Replace: null
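The lookbehind form carries over to engines without \K. As a sketch, here it is in Python's `re` module (an assumption; the answers target Sublime Text), again assuming that no value contains a comma.

```python
import re

line = '"userName":"MyName","hiScore":50,"stuntPoints":192,"coins":200,"specialUser":false'

# (?<=:) asserts a colon immediately before the match without consuming it;
# [^,]* then takes everything up to the next comma or the end of the string.
nulled_values = re.sub(r"(?<=:)[^,]*", "null", line)
print(nulled_values)
```

Since neither the colon nor the comma is part of the match, the replacement is just null.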

Regular expression to remove parenthesis and space before it

I'm trying to write a regular expression (inside a Google Spreadsheet) to remove the parentheses, the text inside the parentheses, and the space before the opening parenthesis. In other words, I'm trying to extract only the name from the text. For example, I'd like the string "A.J. Smith (iOS Developer, San Francisco)" to become "A.J. Smith"
So far I've gotten both =REGEXEXTRACT(D2,"[^()]*") and =REGEXEXTRACT(D2,"^[^(]+") to extract "A.J. Smith " but it leaves that last space at the end. This is probably a really easy problem to solve, I'm just not great with regex.
Just use a word boundary.
=REGEXEXTRACT(D2,"^[^(]+\\b")
^[^(]+ greedily matches all the characters up to the first ( symbol, including the space before the (. Then, because of the \b in the regex, it backtracks to the last word boundary in the matched string.
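The backtracking to the word boundary can be observed outside Google Sheets too. Below is a small sketch in Python, assuming its `re` module treats \b the same way for this input; the variable names are illustrative.

```python
import re

cell = "A.J. Smith (iOS Developer, San Francisco)"

# [^(]+ first grabs "A.J. Smith " including the trailing space, then gives
# the space back so that \b can match between "h" and the space.
name = re.match(r"^[^(]+\b", cell).group(0)
print(repr(name))
```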
Try this instead:
=REGEXREPLACE(D2,"\s\(.*","")
What I'm doing is replacing everything from a space next to a parenthesis to the end of the string with nothing.
I used https://regoio.herokuapp.com/ to help build a regex. This regex captures the name in this example without the trailing space: ^(.+)\s\(
The regex works like this: the ^ matches the beginning of the string, and the parentheses capture the part you want to use, in this case .+, which matches any character one or more times. The \s matches a whitespace character and \( matches the opening parenthesis.
If you want a regex that removes whitespace at the beginning of the string and any before the parenthesis this should work: ^[\s]*(.+)[\s]+\(
With this regex you can extract all the text you wanted in a single REGEXEXTRACT instead of using multiple ones:
=REGEXEXTRACT(D2,"^[\s]*(.+)[\s]+\(")
I found that =REGEXEXTRACT(D2,"(.*)\s\(") also worked for me.
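The capture-group extraction can be sketched in Python as well (an assumption; Google Sheets uses its own regex engine, though these constructs behave the same here). The lazy variant `(.+?)` is not from the answers; it is added to illustrate that the greedy `(.+)` keeps extra spaces when more than one space precedes the parenthesis.

```python
import re

cell = "A.J. Smith (iOS Developer, San Francisco)"
padded = "  A.J. Smith   (iOS Developer, San Francisco)"

# Pattern from the answer: skip leading whitespace, capture the name,
# require at least one whitespace character before the "(".
greedy = re.compile(r"^[\s]*(.+)[\s]+\(")

# Illustrative variation: a lazy group stops at the first run of
# whitespace that is followed by "(".
lazy = re.compile(r"^\s*(.+?)\s+\(")

print(repr(greedy.match(cell).group(1)))
print(repr(greedy.match(padded).group(1)))
print(repr(lazy.match(padded).group(1)))
```

With a single space before the parenthesis both patterns return the bare name; with several spaces only the lazy one does.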
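This should remove all parentheses, brackets, braces and whitespace characters (note that it removes only those characters themselves, not the text between the parentheses):
=REGEXREPLACE(D2,"\s|\(|\)|\[|]|{|}","")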
Feel free to play around with this on rubular.