Notepad++ regex: insert a value in between a pattern

I need a regex to match the records in a list, each of which ends with a comma.
At the first occurrence of a comma on each line, I want to insert opening and closing h1 tags.
I tried using (.*),

This should capture everything on a line up until and including the comma:
[^,]*?,

You can use this regex:
^([^,]*),
This locates the text before the first comma in a line. The capturing group holds that text so it can be referenced in the replacement.
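As a rough illustration of the capture-and-replace idea, here is a minimal Python sketch. The sample records and the function name wrap_first_field are made up for the example, and it assumes "insert h1 tags" means wrapping the text before the first comma. It uses [^,\n]* rather than [^,]* so that a line without a comma cannot match across a line break.

```python
import re

# Hypothetical sample records; the question does not show its data.
records = "Alice,30,NY,\nBob,25,LA,"


def wrap_first_field(text: str) -> str:
    """Wrap the text before the first comma of each line in <h1> tags."""
    # ^ with re.MULTILINE anchors at every line start, so only the first
    # comma-terminated field of each line is rewritten.
    return re.sub(r"^([^,\n]*),", r"<h1>\1</h1>,", text, flags=re.MULTILINE)


print(wrap_first_field(records))
```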

Try using either (.*?), or ([^,]+),. The former is preferred, but it relies on lazy quantifiers, which very old Notepad++ versions may not support (current versions do).

Related

Regex to find if all the characters in a word are the same specific character

I have a set of words coming in one by one, like aa, ##, ???, ~~~, ?~ etc.
I need a regex to find whether a word contains only ? or only ~.
Of the above input examples, ??? and ~~~ should match but not the others.
I tried ^[\s?]*$ and ^[\s~]*$ separately and they work; now I am trying to combine them.
^[\s?||~]*$ doesn't work, as it also accepts ?~ as valid.
Any help?
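You can use this regex, which looks for a string starting with a ~ or a ?, and then asserts that every other character in the string is the same as the first one using a backreference (\1). Because \1+ requires at least one repeat, a single ? or ~ on its own does not match (use \1* if it should):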
^([~?])\1+$
You need to use a backreference to achieve the desired result.
If you want only ~ or ?, use
^([~?])\1+$
If you want any repeated character, use
^(.)\1+$
Explanation: (.) or ([~?]) captures the first character.
Then \1+ matches that same character one or more times (the backreference).
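The backreference check can be tried quickly in Python. This is a small sketch using the sample words from the question; the helper name is_single_repeated_marker is made up for illustration. re.fullmatch plays the role of the ^ and $ anchors.

```python
import re

# Same pattern as above; fullmatch() replaces the ^...$ anchors.
SAME_CHAR = re.compile(r"([~?])\1+")


def is_single_repeated_marker(word: str) -> bool:
    """True if the word is two or more copies of the same ~ or ? character."""
    return SAME_CHAR.fullmatch(word) is not None


words = ["aa", "##", "???", "~~~", "?~"]
print([w for w in words if is_single_repeated_marker(w)])  # ['???', '~~~']
```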
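In basic (grep-style) regex syntax, where the grouping parentheses and the vertical bar for 'or' need to be backslash-escaped, ^\(~\|?\)*$ matches lines made up of any number of tildes or question marks. Note that it also accepts mixed strings such as ?~ as well as empty lines, so it does not meet the requirement on its own; the backreference pattern above does.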

RegEx help for NotePad++

I need help with a regex I just can't figure out. I need to search for broken hashtags that contain a space.
For example, the strings look like this:
#ThisIsaHashtagWith Space
But the words "With Space" can also occur on their own, and those I don't want to replace.
So the important part is that the string starts with "#", followed by any characters, and then the words "With Space", which I want to replace with "WithSpace" to repair the hashtag.
I have a document with 10k of these broken hashtags and have been trying all day without success.
I tried the following regex on regex101.com:
^#+(?:.*?)+(With Space)
It seems to work on regex101.com, but it doesn't in Notepad++.
Any help is appreciated.
Thanks a lot.
Your current regex matches one or more #, then any characters, and captures With Space in a group.
You could instead capture the first part of the match:
(#+.*?)With Space
Then you could use that group in the replacement:
$1WithSpace
As an alternative, you could first match one or more # followed by any characters, zero or more times and non-greedy (.*?), and then use \K to reset the starting point of the reported match.
Then match With Space.
#+(?:.*?)\KWith Space
In the replacement use WithSpace
The + quantifier after # matches one or more # characters. If the match should start at the beginning of the line, add an anchor ^ at the start of the regex.
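For checking the capture-group variant outside an editor, here is a minimal Python sketch. Python's re module has no \K, so only the (#+.*?)With Space form is shown; the function name repair_hashtags and the second sample line are assumptions for the example.

```python
import re


def repair_hashtags(text: str) -> str:
    """Join 'With Space' into 'WithSpace' when it follows a # on the same line."""
    # Group 1 keeps the '#...' prefix; only the literal 'With Space' is rewritten.
    return re.sub(r"(#+.*?)With Space", r"\1WithSpace", text)


sample = "#ThisIsaHashtagWith Space\nPlain With Space text"
print(repair_hashtags(sample))
```

Because . does not match a line break, a "With Space" on a line without any # is left untouched.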
Try using ^(#.+?)(With\s+Space) as your regex, since it also matches multiple spaces and tab characters. If your tool takes flags and you want to affect multiple rows, use gmi. I tried it with the following two strings, each on a separate line in Notepad++:
#blablaWith Space
#hello###$aWith Space
With the replacement set to $1WithSpace, both Replace All and replacing one by one produce the following:
#blablaWithSpace
#hello###$aWithSpace
Feel free to comment with other strings you want replaced. Also make sure you have selected the Regular expression search mode in Notepad++.
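Try this: (#.*)( )
I tried this in Notepad++, and you should be able to just replace all with $1. Make sure you set the find mode to regular expressions first. Because .* is greedy, this removes only the last space on each line.
// Greedy .* backtracks to the last space on the line; $1 keeps everything before it.
const str = "#ThisIsAHashtagWith Space";
console.log(str.replace(/(#.*)( )/g, "$1")); // prints "#ThisIsAHashtagWithSpace"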

Replace duplicates Items from a string using Regex

I have a string which looks something like this
xyz 123;abc;xyz 123;efg;
I want to remove the duplicates and keep only one occurrence in the string. I want the output to be like this
xyz 123;abc;efg;
I tried using (?<=;|^)([^;]*);(\1)+(?=;|$) but couldn't figure out how to remove one of the duplicates. Any suggestions ?
Brief
Since you didn't specify a language, I'll assume the tokens in your original regex are all working in whatever language you're using.
Code
(([^;]*;).*)\2
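Replace with \1. Note that [^;]* can also match an empty item, so once no duplicates remain the pattern can match a bare ; separator and remove it; using [^;]+ avoids this.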
Explanation
(([^;]*;).*) Capture the following into capture group 1:
  ([^;]*;) Capture the following into capture group 2:
    [^;]* Match any character except the semicolon ; any number of times
    ; Match the semicolon character literally
  .* Match any character any number of times
\2 Match the same text as most recently matched by capture group 2
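Here is a hedged Python sketch of the same idea applied repeatedly until nothing changes. The pattern is a tightened variant and not the answer's exact regex: it uses [^;]+ so an empty item cannot match a bare separator, and a (?<![^;]) boundary plus (?:.*;)? so that only whole items are compared. The name drop_duplicates is made up for the example.

```python
import re

# Group 2 is one whole item including its ';'. Group 1 is that item plus
# everything up to a later copy of it. (?<![^;]) means "at the start of the
# string or right after a ';'", and (?:.*;)? makes the later copy start on
# an item boundary as well.
DUPLICATE = re.compile(r"((?<![^;])([^;]+;)(?:.*;)?)\2")


def drop_duplicates(items: str) -> str:
    """Remove repeated ';'-terminated items, keeping the first occurrence."""
    while True:
        reduced = DUPLICATE.sub(r"\1", items)
        if reduced == items:  # nothing left to remove
            return items
        items = reduced


print(drop_duplicates("xyz 123;abc;xyz 123;efg;"))  # xyz 123;abc;efg;
```

Each pass removes at most one duplicate per match, which is why the substitution is looped.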
Thanks all for your suggestions. I finally got this working with this regex:
(?<=,|^)([^,]*)(?=.*\\b\\1\\b)(?=,|$)
The patterns below are for Java.
To find duplicate words (consecutive or scattered), you can use the regex
\b(\w+)\b(?=.*?\b\1\b)
To find duplicate characters (consecutive or scattered) in a string, you can use
(.)(?=.*?\1)
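The two lookahead patterns use syntax that Python's re module shares with Java, so they can be tried with a short sketch like the one below. The sample strings and helper names are assumptions for illustration. Each match is a word or character that occurs again later in the string.

```python
import re


def repeated_words(text: str) -> list[str]:
    """Words that appear again later in the text, de-duplicated and sorted."""
    return sorted(set(re.findall(r"\b(\w+)\b(?=.*?\b\1\b)", text)))


def repeated_chars(text: str) -> list[str]:
    """Characters that appear again later in the text, de-duplicated and sorted."""
    return sorted(set(re.findall(r"(.)(?=.*?\1)", text)))


print(repeated_words("the cat and the dog and the bird"))  # ['and', 'the']
print(repeated_chars("hello"))  # ['l']
```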

Regex: Find multiple matching strings in all lines

I'm trying to match multiple strings in a single line using regex in Sublime Text 3.
I want to match all values and replace them with null.
Part of the string that I'm matching against:
"userName":"MyName","hiScore":50,"stuntPoints":192,"coins":200,"specialUser":false
List of strings that it should match:
"MyName"
50
192
200
false
Result after replacing:
"userName":null,"hiScore":null,"stuntPoints":null,"coins":null,"specialUser":null
Is there a way to do this without using sed or any other substitution method, but just by matching the wanted pattern in regex?
You can use this find pattern:
:(.*?)(,|$)
And this replace pattern:
:null\2
The first group will match any symbol (dot) zero or more times (asterisk) with this last quantifier lazy (question mark), this last part means that it will match as little as possible. The second group will match either a comma or the end of the string. In the replace pattern, I substitute the first group with null (as desired) and I leave the symbol matched by the second group unchanged.
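The find and replace pair can be checked outside the editor with a short Python sketch, using the sample string from the question. The function name null_values is made up for the example.

```python
import re

line = '"userName":"MyName","hiScore":50,"stuntPoints":192,"coins":200,"specialUser":false'


def null_values(text: str) -> str:
    """Replace every value after a colon with null, keeping the separator."""
    # Group 1 is the value (lazy); group 2 is the comma or the end of the string.
    return re.sub(r":(.*?)(,|$)", r":null\2", text)


print(null_values(line))
```

Like the editor pattern, this sketch assumes that values contain no commas.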
Here is an alternative to amaurs' answer that leaves the comma out of the match, so the replacement does not need to put it back:
:\K(.*?)(?=,|$)
And this replacement pattern:
null
This works like amaurs' answer, but the match starts after the colon (using \K to reset the match starting point) and runs until a comma or the end of the line (using a positive lookahead).
I have tested this and it works in Sublime Text 2 (so it should work in Sublime Text 3).
Another slightly better alternative is:
(?<=:).+?(?=,|$)
which uses a positive lookbehind instead of resetting the match starting point.
Another good alternative (so far the most efficient here):
:\K[^,]*
This may help.
Find: (?<=:)[^,]*
Replace: null
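Python's re module has no \K, so the lookbehind forms are the ones that carry over. The following sketch, with a made-up helper name, suggests that both lookbehind patterns give the same result on the sample from the question.

```python
import re

line = '"userName":"MyName","hiScore":50,"stuntPoints":192,"coins":200,"specialUser":false'

# Both patterns match only the value: the colon stays outside the match
# because it is checked by the lookbehind rather than consumed.
LAZY_WITH_LOOKAHEAD = r"(?<=:).+?(?=,|$)"
NEGATED_CLASS = r"(?<=:)[^,]*"


def null_values_lookbehind(text: str, pattern: str) -> str:
    """Replace each value that follows a colon with the literal text null."""
    return re.sub(pattern, "null", text)


print(null_values_lookbehind(line, LAZY_WITH_LOOKAHEAD))
print(null_values_lookbehind(line, NEGATED_CLASS))
```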

Regex expressions to match text between first comma and the comma before the first number

I have a csv file with all UK areas (43000 rows).
However, even though the fields are separated with commas, they are not enclosed with anything, hence if the field has commas within its contents, import to a database fails.
Fortunately, there is only one field that has commas within its content.
I need a regular expression that I could use to select this field on all rows.
Here is an example of data:
Aberaman,Rhondda, Cynon, Taf (Rhondda, Cynon, Taff),51.69N,03.43W,SO0101
Aberangell,Powys,52.67N,03.71W,SH8410
This should look like:
Aberaman,"Rhondda, Cynon, Taf (Rhondda, Cynon, Taff)",51.69N,03.43W,SO0101
Aberangell,"Powys",52.67N,03.71W,SH8410
So I need to basically select the second field, which is between the first comma and the comma just before the first number.
I will use Sublime Text 2 to perform this regex search.
Sublime Text 2 supports \K.
Regex:
^[^,]*,\K(.*?)(?=,\d)
Replacement string:
"\1"
Explanation:
^ Asserts that we are at the start of a line.
[^,]* Matches any character other than a comma, zero or more times.
, Matches a literal comma.
\K Discards the previously matched characters from the reported match.
(.*?)(?=,\d) Matches any character zero or more times, up to a point that is followed by a comma and a digit. The ? after * makes the match reluctant (lazy).
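To try the same logic on the sample rows in Python, which has no \K, the first field can be kept in a capture group instead. This is a sketch under that substitution; the function name quote_second_field is an assumption, and [^,\n]* is used so a match cannot run across a line break.

```python
import re

rows = (
    "Aberaman,Rhondda, Cynon, Taf (Rhondda, Cynon, Taff),51.69N,03.43W,SO0101\n"
    "Aberangell,Powys,52.67N,03.71W,SH8410"
)


def quote_second_field(csv_text: str) -> str:
    """Wrap the second field of every row in double quotes."""
    # Group 1: first field plus its comma (stands in for the part before \K).
    # Group 2: everything up to the first comma that is followed by a digit.
    return re.sub(r"^([^,\n]*,)(.*?)(?=,\d)", r'\1"\2"', csv_text, flags=re.MULTILINE)


print(quote_second_field(rows))
```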
You can try capturing groups: simply substitute with $1"$2"$3 or \1"\2"\3
^(\w+,)([^\d]*)(,.*)$
You can do it in Notepad++ as well.
Find what: ^(\w+,)([^\d]*)(,.*)$
Replace with: $1"$2"$3
A regex which should be able to solve your problem is:
^.*?,(.*?),\d+
This matches:
anything (non-greedy) up to the first comma (which will not be included in the result)
then anything up to the next comma that is followed by a number (this part is captured in a group)
the additional condition being that there has to be a number right after that comma
So the field you want is in group $1