Matching a string with an optional "(foobar)" part using regex

I have this data:
/blabla/blabla (abs,def)
/yxz
I use this regex
(.*)(?:\(([^$]*)\))?\n
But it doesn't work, and I don't know what's wrong.
I need the first "directory" part and, optionally, the text inside the "()".

Try an online regex tester (e.g. http://www.rubular.com/) to experiment on your own. Many of them highlight matches, which helps you refine your regex.

This regex extracts the first directory in group 1 and anything between the () optionally:
/([^/]*)(?:\((.*?)\)|.)*
Let me know if this works or if you need more help.
Match 1: /blabla/blabla (abs,def) 0 24
Group 1: blabla 1 6
Group 2: abs,def 16 7
Match 2: /yxz 28 4
Group 1: yxz 29 3
Group 2 did not participate in the match
edit for quick joe

something like this, maybe? ([^(\n]+)(?:\(([^)]*)\))?
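To see how the optional group in the last suggestion behaves, here is a minimal Python sketch. Python is only an assumption for illustration, since the question does not name a language; the variable names `data`, `pattern` and `rows` are made up for the example. The trailing space that group 1 picks up before the parenthesis is stripped afterwards.

```python
import re

# Sample input copied from the question above.
data = "/blabla/blabla (abs,def)\n/yxz\n"

# Group 1: everything up to an opening parenthesis or the end of the line.
# Group 2: the text inside "(...)", only if that part is present.
pattern = re.compile(r"([^(\n]+)(?:\(([^)]*)\))?")

# group(2) is None when the optional "(...)" part did not take part in the match.
rows = [(m.group(1).strip(), m.group(2)) for m in pattern.finditer(data)]
print(rows)
```

In Python, a group that did not participate in the match is reported as `None`, which makes the "optional" case easy to test for.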

Related

Regex - capture all repeated iteration

I have a variable like this
var = "!123abcabc123!"
I'm trying to capture all the '123' and 'abc' in this var.
The regex (abc|123) retrieves what I want, but...
My question is: when I try the regex !(abc|123)*! it retrieves only the last iteration. What should I do to get this output?
MATCH 1
1. [1-4] `123`
MATCH 2
1. [4-7] `abc`
MATCH 3
1. [7-10] `abc`
MATCH 4
1. [10-13] `123`
https://regex101.com/r/mD4vM8/3
Thank you!!
If your regex engine supports \G, you can use this:
(?:!|\G(?!^))\K(abc|123)(?=(?:abc|123)*!)
DEMO
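Not every engine offers \G and \K; Python's built-in `re` module, for instance, supports neither. A possible two-step alternative, sketched here in Python purely as an illustration (the question does not say which language is used), is to validate the overall shape first and then scan the captured run for the individual tokens. The names `last_only`, `whole` and `tokens` are hypothetical.

```python
import re

var = "!123abcabc123!"

# A repeated capturing group keeps only its last iteration.
last_only = re.fullmatch(r"!(abc|123)*!", var)
print(last_only.group(1))

# Step 1: validate the overall shape and capture the repeated run as a whole.
whole = re.fullmatch(r"!((?:abc|123)*)!", var)

# Step 2: scan the captured run for the individual tokens.
tokens = re.findall(r"abc|123", whole.group(1)) if whole else []
print(tokens)
```

The first print shows the problem from the question (only the final `123` survives); the second shows all four tokens in order.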

RegEx multiple capture groups replaced in a string

I have a string of data...
"123456712J456","D","TEST1~TEST2~TEST3~TEST4~TEST5"
I want to take the following string and make 5 strings.
"123456712J456","D","TEST1"
"123456712J456","D","TEST2"
"123456712J456","D","TEST3"
"123456712J456","D","TEST4"
"123456712J456","D","TEST5"
I currently have the following regex...
//In a program like Textpad
<FIND> "\(.\{13\}\)","D","\([^~]*\)~\(.*\)
<REPLACE> "\1","D","\2"\n"\1","D","\3
//On the regex101 site
"(.{13})","D","([^~]*)~(.*)
Now if I run this 5 times it would work fine. The problem is there is an unknown number of lines to be made. For example...
"123456712J456","D","TEST1~TEST2~TEST3~TEST4~TEST5"
"123456712J457","D","TEST1~TEST2~TEST3"
"123456712J458","D","TEST1~TEST2"
"123456712J459","D","TEST1~TEST2~TEST3~TEST4"
I was hoping to be able to use a MULTI capture group to make this work. I found this PAGE talking about the common mistake between repeating a capturing group and capturing a repeated group. I need to capture a repeated group. For some reason I just could not make mine work right though. Anyone else have an idea?
RESOURCES:
http://www.regular-expressions.info/captureall.html
http://regex101.com/
Try this; see the demo. Just combine match 1 with each of the remaining matches.
http://regex101.com/r/yR3mM3/17
RegEx:
(.*,)|([^"~]+)
Example:
"1234567123456","T","TEST1~TEST2~TEST3~TEST4~TEST5"
Results:
MATCH 1
1. [0-20] `"1234567123456","T",`
MATCH 2
2. [21-26] `TEST1`
MATCH 3
2. [27-32] `TEST2`
MATCH 4
2. [33-38] `TEST3`
MATCH 5
2. [39-44] `TEST4`
MATCH 6
2. [45-50] `TEST5`
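The "combine match 1 with each of the remaining matches" step has to happen outside the regex. A rough Python sketch of the whole transformation follows; Python is an assumption for illustration (the question uses Textpad and regex101), and the pattern, the helper `expand` and the variable names are invented for this example rather than taken from the answer.

```python
import re

lines = [
    '"123456712J456","D","TEST1~TEST2~TEST3~TEST4~TEST5"',
    '"123456712J458","D","TEST1~TEST2"',
]

# Group 1: the two leading quoted fields with their commas, e.g. "123456712J456","D",
# Group 2: the content of the last quoted field, which holds the ~-separated values.
row = re.compile(r'^("[^"]*","[^"]*",)"([^"]*)"$')

def expand(line):
    """Return one output line per ~-separated value in the last field."""
    m = row.match(line)
    if m is None:
        return [line]  # leave lines that do not fit the pattern untouched
    prefix, values = m.groups()
    return [f'{prefix}"{value}"' for value in values.split("~")]

result = [out for line in lines for out in expand(line)]
print("\n".join(result))
```

Because the splitting is done with `str.split`, the number of values per line does not need to be known in advance.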

REGEX : get a text in middle of string and number

The problem I am facing is that if I have this string:
STARTGAME grindurr 9 51 19 3 7 1 2 2 0
...I want to extract name grindurr from the middle. I tried this regex:
STARTGAME\t.*\t[^\d]
...but it didn't work. :( Can anyone tell me what I'm doing wrong?
STARTGAME\s+(.*?)\s+\d
might work. The result is then in the first capturing group. You can remove the need for the capturing group by using lookaround, but I don't know the exact capabilities of your regex engine, so the above is probably the safest way.
You can use...
STARTGAME\s+(.*?)\s+\d+
That should capture the word between STARTGAME and the first number.
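A quick way to check the pattern is a short Python sketch. Python is assumed only for illustration, as the question does not name the engine; `line` and `name` are made-up variable names.

```python
import re

line = "STARTGAME grindurr 9 51 19 3 7 1 2 2 0"

# \s+ accepts tabs as well as spaces; the lazy (.*?) stops at the first
# run of whitespace that is followed by a digit.
m = re.search(r"STARTGAME\s+(.*?)\s+\d+", line)
name = m.group(1) if m else None
print(name)
```

Using `\s+` instead of a literal `\t` means the pattern still works if the fields turn out to be separated by spaces rather than tabs.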

Matching a group that may or may not exist

My regex needs to parse an address which looks like this:
BLOOKKOKATU 20 A 773 00810 HELSINKI SUOMI
-------------------- ----- -------- -----
1 2 3 4*
Groups one, two and three will always exist in an address. Group 4 may not exist. I've written a regex that helps me get the first, second and third part but I would also need the fourth part. Part 4 is the country name and can either be FINLAND or SUOMI. If the fourth part didn't exist in an address the fourth group would be empty. This is my regex so far but the third group captures the country too. Any help?
(.*?)\s(\d{5})\s(.*)$
(I'm going to be using this with Oracle's REGEXP functions.)
Change the regex to:
(.*?)\s(\d{5})\s(.+?)\s?(FINLAND|SUOMI)?$
Making group three non-greedy lets the pattern match the optional space and country choices. If group 4 doesn't match, I think it will be uninitialized rather than blank, but that depends on the language.
To match a character (or in your case group) that may or may not exist, you need to use ? after the character/subpattern/class in question. I'm answering now because RegEx is complicated and should be explained: only posting the fix without an explanation isn't enough!
A question mark matches zero or one of the preceding character, class, or subpattern. Think of this as "the preceding item is optional". For example, colou?r matches both color and colour because the "u" is optional.
Above quote from http://www.autohotkey.com/docs/misc/RegEx-QuickRef.htm
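As a hedged illustration of the optional fourth group, the pattern from the first answer can be tried in Python. This is only a sketch: the question targets Oracle, whose regex functions may behave differently, and the names `address`, `with_country` and `without_country` are invented for the example.

```python
import re

# Pattern from the first answer: group 4 is optional and limited to the two
# country names given in the question.
address = re.compile(r"(.*?)\s(\d{5})\s(.+?)\s?(FINLAND|SUOMI)?$")

with_country = address.match("BLOOKKOKATU 20 A 773 00810 HELSINKI SUOMI").groups()
without_country = address.match("BLOOKKOKATU 20 A 773 00810 HELSINKI").groups()

print(with_country)
print(without_country)
```

In Python the missing country shows up as `None` in the fourth position, which matches the remark that an unmatched group is uninitialized rather than an empty string.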
Try this:
(.*?)\s(\d{5})\s(.*?)\s?([^\s]*)?$
This will match your input more tightly and each of your groups is in its own regex group:
(\w+\s\d+\s\w\s\d+)\s(\d+)\s(\w+)\s(\w*)
or if space is OK instead of "whitespace":
(\w+ \d+ \w \d+) (\d+) (\w+) (\w*)
Group 1: BLOOKKOKATU 20 A 773
Group 2: 00810
Group 3: HELSINKI
Group 4: SUOMI (optional - doesn't have to match)
(.*?)\s(\d{5})\s(\w+)\s(\w*)
An example:
SQL> with t as
2 ( select 'BLOOKKOKATU 20 A 773 00810 HELSINKI SUOMI' text from dual
3 )
4 select text
5 , regexp_replace(text,'(.*?)\s(\d{5})\s(\w+)\s(\w*)','\1**\2**\3**\4') new_text
6 from t
7 /
TEXT
-----------------------------------------
NEW_TEXT
-----------------------------------------------------------------------------------------
BLOOKKOKATU 20 A 773 00810 HELSINKI SUOMI
BLOOKKOKATU 20 A 773**00810**HELSINKI**SUOMI
1 row selected.
Regards,
Rob.

What is wrong with this Regular Expression?

I am a beginner and have some problems with regexp.
Input text is: something idUser=123654; nick="Tom" something
I need to extract the value of idUser -> 123654
I tried this:
//idUser is already 8 digits number
MatchCollection matchsID = Regex.Matches(pk.html, @"\bidUser=(\w{8})\b");
Text = matchsID[1].Value;
but in the output I get idUser=123654; I need only the number.
The second problem is with nick="Tom": how can I get only the text Tom from this expression?
You don't show your output code, where you get the group from your match collection.
Hint: you will need group 1 and not group 0 if you want to have only what is in the parentheses.
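The difference between group 0 and group 1 can be shown with a small sketch. It is written in Python rather than the question's C#, purely as an illustration of the same idea; the patterns and the names `user` and `nick` are assumptions made for the example.

```python
import re

text = 'something idUser=123654; nick="Tom" something'

user = re.search(r"\bidUser=(\d+)", text)
print(user.group(0))  # the whole match, including the "idUser=" prefix
print(user.group(1))  # only what the parentheses captured

# The same idea for the nick: capture only what sits between the quotes.
nick = re.search(r'\bnick="([^"]*)"', text)
print(nick.group(1))
```

Group 0 is always the entire match; the numbered groups start at 1 and follow the order of the opening parentheses.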
.*?idUser=([0-9]+).*?
That regex should work for you :o)
Here's a pattern that should work:
\bidUser=(\d{3,8})\b|\bnick="(\w+)"
Given the input string:
something idUser=123654; nick="Tom" something
This yields 2 matches (as seen on rubular.com):
First match is idUser=123654, group 1 captures 123654
Second match is nick="Tom", group 2 captures Tom
Some variations:
In .NET regex, you can also use named groups for better readability.
If nick always appears after idUser, you can match the two at once instead of using alternation as above.
I've used {3,8} repetition to show how to match at least 3 and at most 8 digits.
API links
Match.Groups property
This is how you get what individual groups captured in a match
Use look-around
(?<=idUser=)\d{1,8}(?=(;|$))
To fix length of digits to 6, use (?<=idUser=)\d{6}(?=($|;))
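With look-around, the whole match is already the value, so no capturing group is needed. Here is a minimal sketch in Python (an assumption for illustration; the question itself uses .NET). The nick pattern is an extra added for this example and is not part of the answer above.

```python
import re

text = 'something idUser=123654; nick="Tom" something'

# The lookbehind requires "idUser=" before the digits and the lookahead requires
# ";" or the end of the string after them; neither becomes part of the match.
user_id = re.search(r"(?<=idUser=)\d{1,8}(?=;|$)", text)

# Hypothetical counterpart for the nick: the text between the quotes.
nick = re.search(r'(?<=nick=")[^"]*(?=")', text)

print(user_id.group(0), nick.group(0))
```

Python's `re` only accepts fixed-width lookbehind, which is fine here because `idUser=` and `nick="` have a constant length.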