RegEx replace all after and all before [closed]

I have a file with about 2000 lines whose columns are separated by commas (,).
On each line I need to replace every dot (.) that comes after the 10th comma with a comma, without touching any dots that come before that 10th comma.
How can I replace only the dots after the 10th comma?

Find what:
(^(?:[^,\n]*,){10}[^.\n]*|(?!^)\G[^.\n]*)\.
Replace with:
$1,
Place the cursor at the beginning of the line. Then Replace All.
Explanation
( # Capturing group 1: the text that stays the same
^(?:[^,\n]*,){10}[^.\n]* # From the beginning of the line, skip 10 columns
# (with 10 commas), then skip to the nearest dot
| # OR
(?!^)\G[^.\n]* # Continue from where the previous match ended
# and skip to the nearest dot
)
\. # Literal dot, to be replaced
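
Outside Notepad++, the same transformation can be sketched without \G at all. The following is a minimal Python sketch, not the answer's method: it swaps the regex for a split on the first 10 commas, and the function name dots_to_commas_after_nth_comma is made up for illustration.

```python
def dots_to_commas_after_nth_comma(line: str, n: int = 10) -> str:
    """Replace every '.' that appears after the n-th comma with ','."""
    # Split at most n times, so the last piece is everything after the n-th comma.
    parts = line.split(",", n)
    if len(parts) <= n:
        # The line has fewer than n commas: leave it unchanged.
        return line
    parts[-1] = parts[-1].replace(".", ",")
    return ",".join(parts)
```

Lines with fewer than 10 commas are returned unchanged, which mirrors the regex: it cannot match on such lines either.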

I would use this regex:
(?:^(?:[^\r\n,]*,){10}|(?!^)\G)[^\r\n.]*\K\.
And replace with ,.
Are you sure it's Notepad++ v4.6? That version is pretty old and, unfortunately, its regex capabilities won't support the above. The above works on v6.1.
(?: # Beginning of non-capture group
^ # Match only at the start of the line
(?: # Beginning of non-capture group
[^\r\n,]* # Match non-newline and non-comma characters
, # Match a comma
){10} # Close of non-capture group and repeat 10 times
| # OR
(?!^)\G # A \G anchor that is not at the start, to continue from the previous match
) # Close of non-capture group
[^\r\n.]* # Match non-newline and non-dot characters
\K # Reset the start of the match
\. # Match a dot


notepad++ line combine [closed]

We have lines in this order:
129
12
2020
5424180606943758
and we need them combined like this:
5424180606943758|12|2020|129
How can this be done in Notepad++ or in another app?
Ctrl+H
Find what: (\d+)\R(\d+)\R(\d+)\R(\d+)\R?
Replace with: $4|$2|$3|$1
CHECK Wrap around
CHECK Regular expression
Replace all
Explanation:
(\d+) # group 1, 1 or more digits
\R # any kind of linebreak
(\d+) # group 2, 1 or more digits
\R # any kind of linebreak
(\d+) # group 3, 1 or more digits
\R # any kind of linebreak
(\d+) # group 4, 1 or more digits
\R? # any kind of linebreak, optional
Replacement:
$4 # content of group 4
| # a pipe
$2 # content of group 2
| # a pipe
$3 # content of group 3
| # a pipe
$1 # content of group 1
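
For the "another app" part of the question, here is a minimal Python sketch of the same idea. Two things are assumptions on my side: Python's re module has no \R, so \r?\n stands in for it, and the function name combine_lines is invented for illustration. The replacement uses groups in the order 4, 2, 3, 1, which is the order that yields the line shown in the question.

```python
import re

# Four runs of digits on four consecutive lines; \r?\n stands in for \R.
FOUR_LINES = re.compile(r"(\d+)\r?\n(\d+)\r?\n(\d+)\r?\n(\d+)")

def combine_lines(text: str) -> str:
    """Join each block of four numeric lines as group4|group2|group3|group1."""
    return FOUR_LINES.sub(r"\4|\2|\3|\1", text)
```

The trailing line break is deliberately not consumed here, so that several four-line records in one file each stay on their own line.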

How do I return only the nth match using a regular expression? [duplicate]

So I'm looking at a string like this with a | delimiter:
XL231241424|AB|ABCDE|LK|Word|A Phrase|Another random phrase|GH49|Example|31/02/2020|05/03/2020|N/A|N/A|N/A
The end goal is to pluck specific items from this string and concatenate them, giving a more concise bit of text that is easier to read in a reporting platform.
I tried this and this but couldn't get it to work for my string. The match is simple, it's just:
([^\|]*)
but how do I then pull only the match in position x? Specifically, if I want to get Another random phrase, which is in position 7, how do I do that?
Any help is much appreciated and an explanation is even better!
Thanks.
You could use
^(?:[^|]*\|){6}([^|]*)
In your example Another random phrase would be saved to capture group 1.
The regex engine performs the following operations.
^ # match beginning of line
(?: # begin a non-capture group
[^|]*\| # match 0+ chars other than '|' followed by '|'
) # end non-capture group
{6} # execute non-capture group 6 times
([^|]*) # match 0+ chars other than '|' in capture group 1
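
As a rough illustration of the capture-group version in code, here is a small Python sketch. The helper name nth_pipe_field is hypothetical, and n is assumed to be a 1-based position of at least 1.

```python
import re
from typing import Optional

def nth_pipe_field(text: str, n: int) -> Optional[str]:
    """Return the n-th (1-based) pipe-delimited field, or None if there is none."""
    # Skip n-1 "field plus pipe" units, then capture the next field.
    match = re.match(r"(?:[^|]*\|){%d}([^|]*)" % (n - 1), text)
    return match.group(1) if match else None
```

For a fixed delimiter like this, text.split("|")[n - 1] gives the same field without a regex; the pattern is mainly useful when the tool only accepts a regex.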
If your regex engine supports \K (as PCRE (PHP), Perl and others do), you don't need a capture group:
^(?:[^|]*\|){6}\K[^|]*
In your example this matches Another random phrase.
The regex engine performs the following operations.
^ # match beginning of line
(?: # begin a non-capture group
[^|]*\| # match 0+ chars other than '|' followed by '|'
) # end non-capture group
{6} # execute non-capture group 6 times
\K # discard everything matched so far
[^|]* # match 0+ chars other than '|'

How to search for a pattern only if it's the start of a word? [duplicate]

Example: search for the pattern man but only as the beginning of a word (i.e. not preceded directly by letters).
This pattern would be found in the strings man, spider-man, manpower, iron_man. But not in woman or human.
I have assumed that if the word "man" is preceded by a hyphen or underscore, to achieve a match the hyphen or underscore must be preceded by a letter (e.g., "-man" would not be matched).
The \K escape sequence resets the start of the reported match to the current position, so everything matched before it is dropped from the result. If the regex engine supports it, the following regular expression (with the case-insensitive flag set) could be used.
(?:^| |[a-z][-_])\Kman
The selected answer to this SO question provides a list of regex engines that support \K. That list was last updated in August 2019.
The regex engine performs the following operations.
(?: # begin non-capture group
^ # match beginning of line
| # or
# match a space
| # or
[a-z] # match a letter
[-_] # match '-' or '_'
) # end non-capture group
\K # discard everything matched so far
man # match 'man'
Alternatively, a capture group could be used.
(?:^| |[a-z][-_])(man)
You can use positive lookbehind to achieve this:
(?<=^|[a-z][_-]|\s)man
Start the pattern with either a word boundary (\b) or a lookbehind for an underscore:
((?<=_)|\b)man
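
Not every engine accepts the patterns above: Python's built-in re module, for example, has no \K and only allows fixed-width lookbehinds. A minimal sketch for that case, which swaps in a single negative lookbehind for "not directly preceded by a letter" (the names WORD_START_MAN and has_word_start_man are made up for illustration):

```python
import re

# "man" that is not directly preceded by an ASCII letter.
WORD_START_MAN = re.compile(r"(?<![A-Za-z])man", re.IGNORECASE)

def has_word_start_man(text: str) -> bool:
    """True if 'man' occurs in text without a letter immediately before it."""
    return WORD_START_MAN.search(text) is not None
```

This follows the question's wording literally, so unlike the first answer's stricter assumption it also accepts a bare "-man" or "_man".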

Regex to get a name from parentheses without spaces [duplicate]

I'm trying to get the file name between two parentheses that can contain spaces between the name and any parentheses.
example:
( file_name )
I used the regex:
(([A-Za-z_][A-Za-z0-9_]*)[ \t]*)
The problem is that it matches file_name together with the spaces before and after it.
I want to match the file_name without the spaces.
Any help is appreciated.
Just add \s* outside the capturing parentheses. (You also need to escape the outermost parentheses if you want to match literal parentheses.)
\(\s*([A-Za-z_][A-Za-z0-9_]*)\s*\)
You could use
\(\s*(\S+)\s*\)
and take the first group.
Explained:
\( # match ( literally
\s* # zero or more whitespaces
(\S+) # capture anything not a whitespace, at least one character
\s* # zero or more whitespaces again
\) # match ) literally
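
To show how the captured group would be read in code, here is a short Python sketch using the stricter identifier pattern from the first answer. The function name name_in_parens is hypothetical.

```python
import re
from typing import Optional

# Literal parentheses, optional whitespace, and the identifier in group 1.
NAME_IN_PARENS = re.compile(r"\(\s*([A-Za-z_][A-Za-z0-9_]*)\s*\)")

def name_in_parens(text: str) -> Optional[str]:
    """Return the identifier between parentheses without surrounding spaces."""
    match = NAME_IN_PARENS.search(text)
    return match.group(1) if match else None
```

Because the \s* parts sit outside the capture group, group 1 never includes the padding.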

How can I exclude a character from a regex capturing group? [duplicate]

I have a regex capture, and I would like to exclude a character (a space, in this particular case) from the middle of the captured string. Can this be done in one step, by modifying the regex?
(Quick and dirty) example:
Text: Key name = value
My regex: (.*) = (.*)
Output: \1 = "Key name" and \2 = "value"
Desired output: \1 = "Keyname" and \2 = "value"
Update: I'm not sure what regex engine will run this regex, since it's part of a larger software product. If you have a solution, please specify which engines it will run on, and on which it will not.
Update2: The aforementioned product takes a regex as an input, and then uses the matched values further, which is the reason for which a one-step solution is asked for. There is no opportunity to insert an intermediate processing step in the pipeline.
This is a possible theoretical pure-regex implementation using the end-of-previous-match \G anchor:
/(?:\G(\w+)\h*(?:=\h*(\w+))?)+/g
Legend
(?: # Non capturing group 1
\G # Matches where the regex engine stops in the previous step
(\w+) # capture group 1: a regex word of 1+ chars
\h* # zero or more horizontal spaces (space, tabs)
(?: # Non capturing group 2
=\h* # literal '=' followed by zero or more horizontal spaces
(\w+) # capture group 2: a regex word of 1+ chars
)? # make the non capturing group 2 optional
)+ # repeat the non capturing group 1, one or more
In the substitution section of the demo:
\1 actually contains Keyname (the 2 terms are separated by a fake space)
\2 is value
NOTE: I don't recommend using this unless actually needed (why?).
There are several possible two-step approaches. The simplest, as has surely been stated already, is to strip the spaces from the first capturing group of the OP's regex.
I would come up with something like:
(?<key>[\w]+)\s*=\s*(?<value>.+)
# look for one or more word characters and capture them in a group called "key"
# followed by zero or more whitespace characters (\s)
# followed by an equals sign
# followed by zero or more whitespace characters (\s)
# capture the rest in a group called "value"
... and process the captured output afterwards. Note that with the \w character class no whitespace will be matched (do you have keys with whitespace in them?).
As mentioned in the comments, the details depend on your programming language.
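
Where a post-processing step is available, the two-step idea (capture first, then strip spaces from the key) might look like the following Python sketch. This is an illustration under assumptions, not a one-step solution: Python spells named groups (?P<name>...), the key is allowed to contain spaces here, and the name parse_pair is invented.

```python
import re
from typing import Optional, Tuple

# Key: anything up to the first '=' (lazy); value: the rest of the string.
PAIR = re.compile(r"(?P<key>[^=]+?)\s*=\s*(?P<value>.+)")

def parse_pair(text: str) -> Optional[Tuple[str, str]]:
    """Split 'Key name = value' into ('Keyname', 'value'), or return None."""
    match = PAIR.fullmatch(text)
    if match is None:
        return None
    # Second step: drop all whitespace inside the captured key.
    key = "".join(match.group("key").split())
    return key, match.group("value")
```

For the example in the question, parse_pair("Key name = value") gives ("Keyname", "value").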