Regex to split text within parentheses at end of line using Java - regex

I need to parse some text that has data enclosed within parentheses at the end of line.
Snarky Spark was at the stage with his team (Jerry Mander/Kodi Player/Bella Bella)
I need to extract the text within the parentheses, separated by forward slashes, into capture groups.
Jerry Mander
Kodi Player
Bella Bella
I have tried the following to split by /
(?:[^\/])+
But I am not able to restrict the split to the text within the parentheses, or to make it apply only at the end of the line.
Any help is appreciated.

You may use this regex with \G:
(?:\(|(?!^)\G/)([^/)]+)(?=[^()]*\))
RegEx Demo
\G asserts position at the end of the previous match or the start of the string for the first match.
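As a minimal Java sketch of how this pattern could be applied, the loop below calls find() repeatedly and collects capture group 1 each time. The class name ParenSplitDemo and the helper extractItems are hypothetical names chosen for illustration. Note that the pattern itself does not anchor the parentheses to the end of the line; it only requires a closing parenthesis somewhere after each item.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ParenSplitDemo {
    // The pattern from the answer, with backslashes doubled for a Java string literal.
    private static final Pattern ITEM =
            Pattern.compile("(?:\\(|(?!^)\\G/)([^/)]+)(?=[^()]*\\))");

    // Hypothetical helper: returns every slash-separated item found inside the parentheses.
    static List<String> extractItems(String line) {
        List<String> items = new ArrayList<>();
        Matcher m = ITEM.matcher(line);
        while (m.find()) {
            // Group 1 holds the item without the leading "(" or "/".
            items.add(m.group(1));
        }
        return items;
    }

    public static void main(String[] args) {
        String line =
                "Snarky Spark was at the stage with his team (Jerry Mander/Kodi Player/Bella Bella)";
        for (String item : extractItems(line)) {
            System.out.println(item);
        }
    }
}
```

The first match starts at the opening parenthesis; every later match must start exactly where the previous one ended (\G) and continue with a slash, which is what keeps the matches inside the parentheses.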

Related

Regex to match lines that do NOT have exactly three pipes in a Notepad++ csv

I have a pipe-delimited csv. Each row should have just three pipes to separate the columns. I need to find any lines that do NOT have three pipes - more OR less should match.
I'm learning regex and I came up with this (kind of hacked together finding parts I thought would work...)
^(?:[^|\r\n]*\|){3,}.*$
However, it's just matching all rows, regardless of the number of pipes in the row.
What's the correct syntax for what I want to do?
[UPDATE]
As @anubhava pointed out, I should provide an example.
This is example data in my file:
John Doe|1hgds234|Some comment|
Mary Jane|5df678|This column is the end of this record|Harry Jones|3456|Harry's record should be on the next line|
Sue Anderson|037dsf533|Another comment|
Harry Jones' record should start on a new line, starting at "Harry". Each line ends in a pipe and CRLF.
So I need a find/replace with a regex that would match on that second line and put a CRLF after the third pipe in the second line.
Assuming you don't have an escaped | or a | inside a quoted cell value, you can match using this regex:
^((?:[^|\n]*\|){3})(?![\r\n])
And replace this with:
$1\n
RegEx Demo
RegEx Details:
^: Start
(: Start capture group #1
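(?:[^|\n]*\|){3}: Match 0 or more characters that are not | or \n, followed by a |; repeat this non-capturing group exactly 3 times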
): End capture group
(?![\r\n]): Negative lookahead to assert that we don't have \r or \n ahead of the current position
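For readers who want to try the same find/replace outside Notepad++, here is a rough Java sketch of the pattern described above. It assumes Java's regex flavor with the MULTILINE flag so that ^ matches at the start of each line; the class name PipeLineSplitter and the helper splitAfterThirdPipe are hypothetical, and the sample uses \n line endings for brevity rather than the CRLF endings mentioned in the question.

```java
import java.util.regex.Pattern;

class PipeLineSplitter {
    // Same pattern as above; MULTILINE makes ^ match at the start of every line.
    private static final Pattern THREE_PIPES =
            Pattern.compile("^((?:[^|\\n]*\\|){3})(?![\\r\\n])", Pattern.MULTILINE);

    // Hypothetical helper: inserts a line break after the third pipe
    // of any line that has more text following that pipe.
    static String splitAfterThirdPipe(String text) {
        // $1 restores the three captured columns; \n is the inserted line break.
        return THREE_PIPES.matcher(text).replaceAll("$1\n");
    }

    public static void main(String[] args) {
        String text = "John Doe|1hgds234|Some comment|\n"
                + "Mary Jane|5df678|This column is the end of this record|"
                + "Harry Jones|3456|Harry's record should be on the next line|\n"
                + "Sue Anderson|037dsf533|Another comment|\n";
        System.out.print(splitAfterThirdPipe(text));
    }
}
```

A single pass splits each over-long line only once, because the text after the inserted break is not at a line start in the original input. A line holding three or more records would need the replacement to be applied repeatedly.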
The natural thing would be to put a check for 2 pipes plus some data in a positive lookbehind, but Notepad++ doesn't do variable length lookbehind. Instead we can put the lead-up into a capture group and include that capture group in the result.
^(((([^|]*?)|("[^"]*?"))\|){2}([^|]*?|"[^"]*?"))\|(?!$)
This allows for quoting between the pipes. Your replacement string should be $1|\n: $1 restores what is captured in group 1, and the literal | puts back the third pipe, which the pattern matches outside of that group. I took the liberty of allowing a naked pipe character at line end using a negative lookahead.
Try this shorter pattern too; it works as expected:
Find what: ^(.*?\K\|){3}(?=.)
Replace with: |\n

Notepad++ regex code to extract end of line

I have a source file that I need to capture data from. The whole file is a single line, but I am not able to capture the data that I require.
allow=ok&secret=4326dwsaddsafsd286435dsfs754
Now I need to capture this data, 4326dwsaddsafsd286435dsfs754, which changes every time. It contains mixed a-z and 0-9, with a total length of 40 characters.
I tried using Left and Right selectors with "secret=" on the left, but since the source ends at the end of the value, I don't have anything to put in the right selector.
So how can I capture this data? Is there any regex command that would let me do this?
Thanks
Try this:
\&secret=(.*)
and then capture the first group with $1
Use this regex:
[a-z0-9]*$
It searches for the longest sequence of alpha-numeric characters ([a-z0-9]*) at the end of the string ($).
Test here.
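As an illustration only, the first suggestion (capture everything after "secret=") could be sketched in Java as follows. This is not Notepad++ itself; it assumes Java's regex flavor, and the class name SecretExtractor and the helper extractSecret are hypothetical.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SecretExtractor {
    // Captures everything after "&secret=" up to the end of the line.
    private static final Pattern SECRET = Pattern.compile("&secret=(.*)");

    // Hypothetical helper: returns the captured value, or empty if the key is absent.
    static Optional<String> extractSecret(String source) {
        Matcher m = SECRET.matcher(source);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String source = "allow=ok&secret=4326dwsaddsafsd286435dsfs754";
        System.out.println(extractSecret(source).orElse("(no secret found)"));
    }
}
```

Because the value runs to the end of the line, no right-hand delimiter is needed: .* simply stops at the line end.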

How to write Regex to extract first few characters from specific word with or without ending delimiters?

I have the following string and would like to extract the first few characters until the end of the word or until "Response"
<ns2:GetJobStatus
<ns10:JobIDResponse
<ns2:JobStatusResponse
<ns3:GetJobId
I would like a regular expression that extracts either GetJobStatus or GetJobID from all the above lines. I would like to drop "Response" from the result, such that I would get 2 of each in the above example. This is in Splunk, so I can't use awk, sed, or any other Unix/Linux commands.
Here's what I have done so far
<ns\d+:(?P<ws_name>.+?)(?:Response)
with the above I am able to extract only where there is "Response"
With lookbehind and lookahead, you should be able to get the result you want with the pattern
(?<=:)(\w+?)(?=Response|\b|$)
You would be interested in the capture group (\w+?) because it'll come after the ":" character and be before the word "Response". The "\b|$" sets a word boundary or end of line.
Tested at Regex101
You're off to a good start. What you need to find after your ws_name group is either the word Response or a word boundary. Therefore, all you need to do is add |\b in your non-capturing group:
<ns\d+:(?P<ws_name>.+?)(?:Response|\b)
Here's a demo.
References:
Alternation in Regular Expressions.
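To show how the lazy group interacts with the Response|\b alternation, here is a small Java sketch. It is an adaptation rather than Splunk syntax: Java writes named groups as (?<name>...) and does not accept underscores in group names, so the group is called wsName here. The class name WsNameExtractor and the helper extractWsName are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WsNameExtractor {
    // Java adaptation of <ns\d+:(?P<ws_name>.+?)(?:Response|\b).
    // The lazy .+? grows one character at a time until it is followed
    // either by the word "Response" or by a word boundary.
    private static final Pattern WS_NAME =
            Pattern.compile("<ns\\d+:(?<wsName>.+?)(?:Response|\\b)");

    // Hypothetical helper: returns the name without a trailing "Response", or null if no match.
    static String extractWsName(String line) {
        Matcher m = WS_NAME.matcher(line);
        return m.find() ? m.group("wsName") : null;
    }

    public static void main(String[] args) {
        String[] lines = {
            "<ns2:GetJobStatus", "<ns10:JobIDResponse", "<ns2:JobStatusResponse", "<ns3:GetJobId"
        };
        for (String line : lines) {
            System.out.println(extractWsName(line));
        }
    }
}
```

On the sample lines from the question this yields GetJobStatus, JobID, JobStatus and GetJobId, in that order.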

notepad++ regex insert value inbetween pattern

I need a regex matcher to find the pattern for a list consisting of a bunch of records
all of which end with a comma.
I want to, at the first occurrence of the comma insert beginning and end h1 tags.
I tried using (.*),
This should capture everything on a line up until and including the comma:
[^,]*?,
You can use this regex:
^([^,]*),
This will locate the string before the first comma in a line. There is also capturing group that captures the text before the first comma for reference in replacement.
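One possible reading of the question is that the text before the first comma on each line should be wrapped in h1 tags. Under that assumption, a Java sketch of the replacement might look like this; the class name FirstCommaTagger, the helper wrapInH1, and the sample records are all made up for illustration.

```java
import java.util.regex.Pattern;

class FirstCommaTagger {
    // MULTILINE makes ^ match at the start of every line, so the capture
    // group holds the text before the first comma of each line.
    private static final Pattern BEFORE_FIRST_COMMA =
            Pattern.compile("^([^,]*),", Pattern.MULTILINE);

    // Hypothetical helper: wraps the text before the first comma in <h1> tags
    // and keeps the comma itself.
    static String wrapInH1(String text) {
        return BEFORE_FIRST_COMMA.matcher(text).replaceAll("<h1>$1</h1>,");
    }

    public static void main(String[] args) {
        String records = "apple, red, round,\nbanana, yellow,";
        System.out.println(wrapInH1(records));
    }
}
```

In Notepad++ the equivalent would be to put the pattern in "Find what" and <h1>$1</h1>, in "Replace with", again assuming that is the intended output.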
Try using either (.*?), or ([^,]+),. The former is preferred, but Notepad++ may not support it.

regular expression pattern handling brackets

I am looking for a regular expression pattern that is able to handle the following problem:
(to make someone) happy (adj.)
I only want to get the word "happy" and the regular expression pattern should also handle lines if only one part is in brackets e.g.:
(to make someone) happy
happy (adj.)
I've tried the following: "\s*\(.*\)"
But I am somehow wrong with my idea!
This one will get you the right word in the first capturing group in all three options:
(?:\([^)]*\)\s*)?(\w+)(?:\s*\([^)]*\))?
You can adjust and be more permissive in case you'd like to get a couple of words or to allow special characters:
(?:\([^)]*\)\s*)?([^()\n]+)(?:\s*\([^)]*\))?
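A quick way to check the first pattern against all three input shapes is a small Java harness like the one below. It is only a sketch: the class name BracketWord and the helper extractWord are hypothetical, and it uses find() so that the word is read from capture group 1.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BracketWord {
    // Optional leading "(...)", then the word in group 1, then optional trailing "(...)".
    private static final Pattern WORD =
            Pattern.compile("(?:\\([^)]*\\)\\s*)?(\\w+)(?:\\s*\\([^)]*\\))?");

    // Hypothetical helper: returns the word outside the brackets, or null if none is found.
    static String extractWord(String line) {
        Matcher m = WORD.matcher(line);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(extractWord("(to make someone) happy (adj.)"));
        System.out.println(extractWord("(to make someone) happy"));
        System.out.println(extractWord("happy (adj.)"));
    }
}
```

All three calls print happy, since both bracketed parts are optional in the pattern.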
A regex for finding the text between two parenthesized groups is
/(?:\([^)]*\)|^)([^(]*)(?:$|\([^)]*\))/m
The breakdown is as follows:
Start with some text in parentheses or the beginning of a line: (?:\([^)]*\)|^). This matches from an open paren to the first closed paren. The parenthesized alternative is listed first so that it is tried before the zero-width ^; otherwise, on a line that begins with a paren, ^ would succeed immediately and the group would capture an empty string.
Then match the text outside of the parentheses, and put it in a group ([^(]*). This matches up to the next open paren, so the captured text can include the surrounding spaces.
Then match more text in parentheses or the end of a line: (?:$|\([^)]*\))
I used multiline mode (m) so that ^ and $ would match at line breaks as well as at the start and end of the string
Try regex (?:^|\))\s*([^\(\)]+?)\s*(?:\(|$)
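This last pattern trims the surrounding whitespace inside the regex itself, which can be seen in a short Java sketch. The class name TrimmedOuterText and the helper extract are hypothetical; the backslashes inside the character class are dropped because parentheses need no escaping there.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TrimmedOuterText {
    // Start at the line start or just after ")", skip whitespace, lazily capture
    // non-parenthesis text, then require "(" or the end of the input.
    private static final Pattern OUTER =
            Pattern.compile("(?:^|\\))\\s*([^()]+?)\\s*(?:\\(|$)");

    // Hypothetical helper: returns the text outside the brackets, or null if none is found.
    static String extract(String line) {
        Matcher m = OUTER.matcher(line);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(extract("(to make someone) happy (adj.)"));
        System.out.println(extract("(to make someone) happy"));
        System.out.println(extract("happy (adj.)"));
    }
}
```

The lazy quantifier +? together with the trailing \s* is what keeps trailing spaces out of the captured group.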