Could anyone kindly help with a regex for Notepad++ that replaces Word with #Word (only after the first occurrence of #)?
#Celebrity #Glad #Known #Lord Byron #British #Poet
should become
#Celebrity #Glad #Known #Lord #Byron #British #Poet
To replace Word with #Word only after the first occurrence of #, you could use an alternation:
Find what
(?>^[^#]*#\w+\h*|#\w+\h*|\G)\K(\w+\h*)
Replace with
#\1
Explanation
(?> Atomic group
^[^#]*#\w+\h* From the start of the string, match any character except # 0+ times using a negated character class, then match a #. Then match 1+ word characters followed by 0+ horizontal whitespace characters.
| Or
#\w+\h* Match #, a word character 1+ times followed by a horizontal whitespace character 0+ times
| Or
\G Assert position at the end of the previous match
) Close atomic group
\K Forget what was previously matched
(\w+\h*) Capture in a group 1+ word characters followed by 0+ times a horizontal whitespace character
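To check the same transformation outside Notepad++, here is a minimal Python sketch. Python's built-in re module does not support \G, \K or \h, so this swaps in a different technique rather than porting the pattern above: it finds the first # and, from there on, prefixes every word that is not already preceded by #. The function name add_hashes is hypothetical.

```python
import re

def add_hashes(text: str) -> str:
    """Prefix '#' to every bare word that appears after the first '#'."""
    first = text.find("#")
    if first == -1:
        # No '#' at all: nothing qualifies, leave the text untouched.
        return text
    head, tail = text[:first], text[first:]
    # (?<![#\w]) makes sure we start at the beginning of a word
    # that is not already preceded by '#'.
    tail = re.sub(r"(?<![#\w])\w+", r"#\g<0>", tail)
    return head + tail

print(add_hashes("#Celebrity #Glad #Known #Lord Byron #British #Poet"))
```

Words before the first # are left alone because only the tail is rewritten.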
You can use the following regex to match and replace:
\s([^#]\w+)
It starts by matching a whitespace character, then captures a group that starts with any character other than '#', followed by one or more word characters.
You then replace with:
' #$1'
That will add '#' to the words that don't start with it.
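A quick way to try this substitution is a small Python sketch (Python writes the replacement as r" #\1" rather than ' #$1'). Note that [^#]\w+ needs at least two characters, so a single-letter word is left untouched. The helper name prefix_unhashed is made up for illustration.

```python
import re

def prefix_unhashed(text: str) -> str:
    # \s consumes the whitespace before the word,
    # so the replacement has to put a space back in.
    return re.sub(r"\s([^#]\w+)", r" #\1", text)

print(prefix_unhashed("#Celebrity #Glad #Known #Lord Byron #British #Poet"))
```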
I'm trying to filter out strings in project code which have the following form
'alphanumeric.alphanumeric.alphanumeric.alphanumeric'
(surrounded by quotes, with one or more dots between the alphanumeric words)
and another regex to find strings with the form
'this is a regular sentence with space'
I'm new to regex and have the following pattern, which doesn't work. It is supposed to mean:
(' + anything + . + anything + ')
/'*[^.]*'
I need multiple words with . connecting them.
The pattern that you tried /'*[^.]*' matches a /, then optional occurrences of ', followed by optional characters other than a dot, and then a ', so a dot cannot be matched.
You could use 2 separate patterns, each matching either a dot or a space at the start of the repeated group, and match alphanumerics with [^\W_]+, which excludes the underscore from the word characters.
'[^\W_]+(?:\.[^\W_]+)+'
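As a rough check, the dotted pattern can be run unchanged with Python's re module. The sample string below is invented for illustration.

```python
import re

# Quoted run of alphanumeric words joined by one or more dots.
DOTTED = re.compile(r"'[^\W_]+(?:\.[^\W_]+)+'")

sample = "t('app.menu.file.open'); say('hello'); log('a regular sentence')"
print(DOTTED.findall(sample))
```

Only the quoted string containing dots is reported; the single word and the sentence with spaces are skipped.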
Another option is to use a capture group matching either a dot or space and use a backreference in the repetition and match any letter or any number:
'[\p{L}\p{N}]+([.\p{Zs}\t])[\p{L}\p{N}]+(?:\1[\p{L}\p{N}]+)*'
' Match literally
[\p{L}\p{N}]+ Match 1+ alphanumerics
([.\p{Zs}\t])[\p{L}\p{N}]+ Capture group 1, match either . or a space and 1+ alphanumerics
(?:\1[\p{L}\p{N}]+)* Optionally match what is captured in group 1 using the backreference \1 followed by 1+ alphanumerics
' Match literally
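The backreference idea can be sketched in Python as well. The built-in re module has no \p{L}, \p{N} or \p{Zs}, so this version approximates them, as an assumption, with [^\W_] for alphanumerics and a plain space for the space separator. The names SAME_SEP and quoted_with_one_separator are hypothetical.

```python
import re

# Group 1 captures the first separator (dot, space or tab);
# \1 then forces every later separator to be the same character.
SAME_SEP = re.compile(r"'[^\W_]+([. \t])[^\W_]+(?:\1[^\W_]+)*'")

def quoted_with_one_separator(text: str) -> list[str]:
    # finditer + group(0) returns whole matches instead of just group 1.
    return [m.group(0) for m in SAME_SEP.finditer(text)]

print(quoted_with_one_separator("'a.b.c' 'this is a sentence' 'mixed.one two'"))
```

The last quoted string mixes a dot and a space, so the backreference rejects it.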
I need an expression that allows only the pattern below:
end word(dot)(space)start word [e.g. end. start]
In other words:
no space before a colon, semicolon or dot
one space after a colon, semicolon or dot
All other patterns need to be captured so they can be identified, such as
end.start || end . start || end .start
I used
"([\s{0,}][\.]|[\.][\s{2,}a-z]|[\.][\s{0,}a-z])"
but it is not working as I expected. I need your support, please.
You could match 1+ word characters using \w+ and match either a colon or semicolon using a character class [;:] between optional spaces, each written as a space followed by ?.
After that, match again 1+ word characters.
\w+ ?[;:] ?\w+
To match the dot followed by a single space variant, you don't need a character class but you could match the dot only using \.
\w+\. \w+
Edit
To highlight all the matches for the punctuations:
(?: [.:;]|[.:;] {2,}|(?<=\S)[;:.](?=\S))
Explanation
(?: Non capture group
" [.:;]" Match a space followed by either . : or ;
| Or
[.:;] {2,} Match one of the listed followed by 2 or more spaces
| Or
(?<=\S)[;:.](?=\S) Match one of the listed surrounded by non whitespace chars
) Close group
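The highlighting pattern works as written in Python's re module, so a small sketch can show which spacing variants it flags. The function name spacing_errors is an assumption made for this example.

```python
import re

# Flags: a space before the punctuation, 2+ spaces after it,
# or punctuation squeezed between two non-whitespace characters.
BAD_SPACING = re.compile(r"(?: [.:;]|[.:;] {2,}|(?<=\S)[;:.](?=\S))")

def spacing_errors(text: str) -> list[str]:
    return BAD_SPACING.findall(text)

for candidate in ["end. start", "end.start", "end . start", "end .start"]:
    print(repr(candidate), spacing_errors(candidate))
```

An empty list means the text follows the "no space before, one space after" rule.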
Given a string such as below:
word.hi. bla. word.
I want to construct a regex which will match all "."s preceded by "word" and any other non-space characters.
So, in the above example I would want the first, second and last dots to be matched.
While matching the first and last dots would be easy with global flag (/(?:word.*)\K./gU), I'm not sure how to construct a regex that would also match the second dot.
Appreciate any pointers.
You might match word and then get all consecutive matches using the \G anchor excluding matching whitespace chars or a dot.
(?:\bword|\G(?!\A))[^.\s]*\K\.
In parts
(?: Non capture group
\bword Match word preceded by a word boundary
| Or
\G(?!\A) Assert the position at the end of the previous match, not at the start
) Close non capture group
[^.\s]* Match 0+ occurrences of any char except . or a whitespace char
\K Clear the match buffer (forget what is matched until now)
\. Match a dot
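Python's built-in re module supports neither \G nor \K, so the pattern above cannot be used there directly. The sketch below swaps in a two-step technique that should give the same positions: find each run of non-whitespace characters that starts at a word boundary with "word", then collect the dots inside that run. The function name dots_after_word is hypothetical.

```python
import re

def dots_after_word(text: str, word: str = "word") -> list[int]:
    """Return the indexes of dots that follow `word` within the same non-space run."""
    positions = []
    # \bword\S* grabs "word" plus everything up to the next whitespace.
    for run in re.finditer(r"\b" + re.escape(word) + r"\S*", text):
        positions.extend(
            run.start() + offset
            for offset, char in enumerate(run.group())
            if char == "."
        )
    return positions

print(dots_after_word("word.hi. bla. word."))
```

The dot after "bla" is skipped because that run does not begin with "word".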
I am looking to create a match for the following:
"Adam Lambert"
"Mr. Adam Lambert"
"adam#test.com"
But not match the following
"Adam  Lambert"
"Adam Lambert "
Rules:
Any alphanumeric character should be matched
A single space at any point should be matched
Any number of single spaces can be matched
Double spaces are not matched
A single space at the end of a string is not matched
EDIT
I also need to match the following. Sorry I missed this.
name:((\w+(?:\S\w+)*|\s(?:\w+\S)*)\S)*
I need to match to:
name:
name:A
name:Adam Lambert
The above regex matches from "name:Ad..." but it will not match "name:A"
I would generalize a solution to matching a sequence of non-space characters followed by optional groups of non-space characters following a single space only, since your only hard criterion seems to be the number of spaces. For example:
^\S+(?: \S+)*$
^(?:\S+(?:\s\S+)*|\s(?:\S+\s)*)\S$
Meaning:
^ start of the line
(?: non-capturing group
\S+ one or more non-whitespace characters
(?:\s\S+)* zero or more groups of a single whitespace and one or more non-whitespace characters
| or
\s one whitespace character
(?:\S+\s)* zero or more groups of non-whitespace characters and one whitespace character
) end non-capturing group
Finally one non whitespace character \S and the end of the line: $.
In your third example the # won't be matched with \w but it will if you change it to \S (any non-whitespace character)
See it in action here: regexr.com/50lp2
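The simpler of the two patterns can be checked against the examples from the question with a short Python sketch. It uses fullmatch instead of ^ and $, because Python's $ also matches just before a trailing newline. The name is_single_spaced is made up for this example.

```python
import re

# One or more non-space runs, separated by exactly one space each.
SINGLE_SPACED = re.compile(r"\S+(?: \S+)*")

def is_single_spaced(value: str) -> bool:
    return SINGLE_SPACED.fullmatch(value) is not None

for value in ["Adam Lambert", "Mr. Adam Lambert", "adam#test.com",
              "Adam  Lambert", "Adam Lambert "]:
    print(repr(value), is_single_spaced(value))
```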
So I currently have a regex (https://regex101.com/r/zBE4Ju/1) that highlights the words before and after a linebreak. This is nice, but the issue is sometimes there are whitespaces after the word that appears BEFORE the line break. So they end up
You can see on my regex101 how the issue happens, and I have outlined the problem. I need to recognize the word before and after the line break, regardless of whether there is a space after the word.
(\w*(?:[\n](?![\n])\w*)+)
You can see it in action here https://regex101.com/r/zBE4Ju/3
Expected: Line 1
Actual: Line 3
You can use $1 from:
/([^ ]+) *(\r|\n)/gm
https://regex101.com/r/o87VP7/5
If you want to highlight the last "word" in the sentence followed by possible spaces and a newline, you could repeat 0+ times a group matching 1+ non whitespace chars followed by 1+ spaces.
Then capture in a group matching non whitespace chars (\S+) and match possible spaces followed by a newline.
^ *(?:\S+ +)*(\S+) *\r?\n
Explanation
^ Start of string
" *" Match a space 0+ times
(?: Non capturing group
\S+ + Match 1+ non whitespace chars and 1+ spaces
)* Close non capturing group and repeat 0+ times (to also match a single word at the beginning)
(\S+) Capture group 1, match 1+ times a non whitespace char
" *\r?\n" Match a space 0+ times followed by a newline
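A minimal Python sketch of the same pattern, assuming the multiline flag is wanted so that ^ anchors at the start of every line. The sample text is invented for illustration.

```python
import re

# Group 1 is the last word on a line, even when spaces trail it.
LAST_WORD = re.compile(r"^ *(?:\S+ +)*(\S+) *\r?\n", re.MULTILINE)

text = "first line here  \nsecond line\nlast"
print(LAST_WORD.findall(text))
```

The final line has no newline after it, so it produces no match.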