RegEx multiple capture groups replaced in a string


I have a string of data:
"123456712J456","D","TEST1~TEST2~TEST3~TEST4~TEST5"
I want to split this string into 5 strings:
"123456712J456","D","TEST1"
"123456712J456","D","TEST2"
"123456712J456","D","TEST3"
"123456712J456","D","TEST4"
"123456712J456","D","TEST5"
I currently have the following regex...
//In a program like Textpad
<FIND> "\(.\{13\}\)","D","\([^~]*\)~\(.*\)
<REPLACE> "\1","D","\2"\n"\1","D","\3
//On the regex101 site
"(.{13})","D","([^~]*)~(.*)
If I run this replacement repeatedly on the line above, it works fine. The problem is that the number of lines to produce is not known in advance. For example:
"123456712J456","D","TEST1~TEST2~TEST3~TEST4~TEST5"
"123456712J457","D","TEST1~TEST2~TEST3"
"123456712J458","D","TEST1~TEST2"
"123456712J459","D","TEST1~TEST2~TEST3~TEST4"
I was hoping to use a repeated capture group to make this work. I found a page (first link under RESOURCES) about the common mistake of confusing repeating a capturing group with capturing a repeated group. I need to capture a repeated group, but I could not make mine work. Does anyone have an idea?
RESOURCES:
http://www.regular-expressions.info/captureall.html
http://regex101.com/

Try this. See the demo. Just combine match 1 with each of the remaining matches.
http://regex101.com/r/yR3mM3/17
RegEx:
(.*,)|([^"~]+)
Example:
"1234567123456","T","TEST1~TEST2~TEST3~TEST4~TEST5"
Results:
MATCH 1
1. [0-20] `"1234567123456","T",`
MATCH 2
2. [21-26] `TEST1`
MATCH 3
2. [27-32] `TEST2`
MATCH 4
2. [33-38] `TEST3`
MATCH 5
2. [39-44] `TEST4`
MATCH 6
2. [45-50] `TEST5`
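The "combine match 1 with each remaining match" step can be sketched in Python as follows. This is only an illustration under some assumptions: the helper name expand is hypothetical, the values are re-quoted by hand, and the approach assumes the last field contains no commas (otherwise group 1 would swallow part of it).

```python
import re

# Pattern from the answer above: group 1 is the leading fields (everything up
# to the last comma), group 2 is each value between the "~" separators.
PATTERN = re.compile(r'(.*,)|([^"~]+)')


def expand(line):
    """Return one output record per '~'-separated value in the last field."""
    prefix = ""
    records = []
    for match in PATTERN.finditer(line):
        if match.group(1) is not None:
            prefix = match.group(1)  # e.g. '"123456712J458","D",'
        else:
            # Re-add the quotes that the pattern deliberately skips.
            records.append(f'{prefix}"{match.group(2)}"')
    return records


for record in expand('"123456712J458","D","TEST1~TEST2"'):
    print(record)
# "123456712J458","D","TEST1"
# "123456712J458","D","TEST2"
```

Because the loop handles however many group 2 matches appear, the same call works for lines with two values or five.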

Related

How can I use Regex to capture a certain set of ages?

I have a set of data, like below;
1
2
3
4
5
6
7
8
9
10
1,1
1,2
1,3
2,12
11,13,15
7,8,12
And so on. I am trying to use a regex to target ages between 1 and 7, but I am also getting matches on any double-digit number that contains one of these characters. My regex is currently:
/^(1)|(2)|(3)|(4)|(5)|(6)|(7)|$/g
My current matches include 1, 2, 3, 4, 5, 6 and 7, which is perfect. However, it also matches the lines 11,13,15 and 7,8,12, which is not what I want.
Any advice on how to resolve this would be appreciated. Thanks in advance; I am still trying to fix it myself.

You can use word boundaries:
\b[1-7]\b
See a demo on regex101.com.
As pointed out by @Quantic, this matches numbers from 1-7 regardless of where they appear.
If you only want lines that consist of a single number between 1 and 7, you'll need to use anchors:
^[1-7]$
Or if you want to capture the number:
^([1-7])$
With this, you'll need the multiline flag, see a demo on regex101.com as well.
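A short Python sketch of the difference between the two approaches, using a few lines of the sample data from the question (the variable names are only illustrative, and the character class is written as [1-7] to match the stated range):

```python
import re

data = "1\n7\n10\n1,3\n11,13,15\n7,8,12"

# Word boundaries: any standalone 1-7, wherever it appears on a line.
anywhere = re.findall(r"\b[1-7]\b", data)
print(anywhere)    # ['1', '7', '1', '3', '7']

# Anchors plus the multiline flag: only lines that are a single digit 1-7.
whole_line = re.findall(r"^[1-7]$", data, flags=re.MULTILINE)
print(whole_line)  # ['1', '7']
```

Without re.MULTILINE, ^ and $ would only anchor at the start and end of the whole string, so the second pattern would find nothing in this input.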

(?<!\d)[1-7](?!\d)
This looks for any digit 1-7 that does not have another digit on either side of it, using a negative lookbehind and a negative lookahead.
regex101 test
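For purely numeric, comma-separated data the lookaround version and the word-boundary version behave the same. They differ when a digit touches a letter or underscore, as this small Python sketch with a made-up test string suggests:

```python
import re

text = "7 a7 17 x3y 5"  # hypothetical input, not from the question

word_boundary = re.findall(r"\b[1-7]\b", text)
no_adjacent_digit = re.findall(r"(?<!\d)[1-7](?!\d)", text)

print(word_boundary)      # ['7', '5']
print(no_adjacent_digit)  # ['7', '7', '3', '5']
```

The lookarounds only reject neighbouring digits, so the 7 in a7 and the 3 in x3y still match; \b rejects them because letters are word characters too.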

Regex - capture all repeated iteration

I have a variable like this
var = "!123abcabc123!"
I'm trying to capture every '123' and 'abc' in this variable.
The regex (abc|123) retrieves what I want, but...
My question is: when I try the regex !(abc|123)*! it retrieves only the last iteration. What should I do to get this output?
MATCH 1
1. [1-4] `123`
MATCH 2
1. [4-7] `abc`
MATCH 3
1. [7-10] `abc`
MATCH 4
1. [10-13] `123`
https://regex101.com/r/mD4vM8/3
Thank you!!

If your regex flavor supports \G (and \K, which this pattern also uses), you can use this:
(?:!|\G(?!^))\K(abc|123)(?=(?:abc|123)*!)
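Python's standard re module supports neither \G nor \K, so the pattern above cannot be used there as written. A rough alternative (a different technique, not an exact equivalent) is to match the whole !...! run first and then list the tokens inside it. The function name repeated_tokens is hypothetical:

```python
import re


def repeated_tokens(text):
    """Return every abc/123 token found inside a !...! run, in order."""
    tokens = []
    # Step 1: find each run of tokens enclosed in exclamation marks.
    for run in re.finditer(r"!((?:abc|123)*)!", text):
        # Step 2: split the run into its individual tokens.
        tokens.extend(re.findall(r"abc|123", run.group(1)))
    return tokens


print(repeated_tokens("!123abcabc123!"))  # ['123', 'abc', 'abc', '123']
```

The two-step form avoids the "only the last iteration" problem because the repeated group is never asked to hold more than one value.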

Regex and numeric value to capture between two differents tags

I'm trying to write a script that helps me get new books from a website.
I'm working with preg_match_all. I have 7 pieces of information to get: title, author, editor...
I'm having trouble creating my preg_match pattern. For example, I need the product code from the snippet below. There are between 3 and 10 product codes to get on each page:
<li><label>Réf : </label>21608</li>
At first I tried this:
$mask ="/Réf :(.*)<\/li>/Us";
It works, but I want only the numbers. I have searched regex guides on the web, but I don't understand how to use the syntax for my goal, because the product code is not between two tags like this: <open>...</open>. The product code has 4 or 5 digits.
Thanks for any help!

Try the following regular expression:
/Réf :\D*(\d+)<\/li>/
\D: non-digit
\d: digit

Let's build the pattern step by step to match those digits:
We have Réf; write it as /réf/i, using the i modifier to match case-insensitively.
Then comes space, colon, space. To be flexible about whitespace, match it with \s*, which matches zero or more whitespace characters: /réf\s*:\s*/i
Next comes text containing no digits (the closing </label> tag). \D* matches any run of non-digit characters: /réf\s*:\s*\D*/i
We know there are 4 or 5 digits, so we use \d{4,5}, which matches a digit 4 or 5 times: /réf\s*:\s*\D*\d{4,5}/i
We need only the digits, so we put them in a capture group: /réf\s*:\s*\D*(\d{4,5})/i
PHP code
$string = '<li><label>Réf : </label>21608</li>';
preg_match_all('/réf\s*:\s*\D*(\d{4,5})/i', $string, $m);
print_r($m[1]);
Output
Array
(
[0] => 21608
)
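For comparison, the same pattern can be tried in Python on a page fragment with more than one product code. The second list item below is invented for the example; only the first comes from the question.

```python
import re

html = (
    "<li><label>Réf : </label>21608</li>"
    "<li><label>Réf : </label>4521</li>"  # hypothetical second entry
)

# Same pattern as the PHP version; re.IGNORECASE plays the role of the i modifier.
codes = re.findall(r"réf\s*:\s*\D*(\d{4,5})", html, flags=re.IGNORECASE)
print(codes)  # ['21608', '4521']
```

With a single capture group, re.findall returns just the captured digits, much like $m[1] in the PHP code.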

Try this...
/>\s*(\d{3,10})\s*</

How can I get a list of regex matches for a group?

I have a group which can occur any number of times in the input string. I need to get a list of all the matching items.
For example, for input:
example repeattext 1 anything here repeattext 2 anything repeattext 3
My regex is:
(repeattext \d)
I want to get the list of matches for the group. Is it possible to use regex here or do I need to parse it myself?

Yes, you can use regex here. Your existing regex will do fine.
See http://rubular.com/r/fS8c9C61rG for it in use on your example.
If numbers will ever become 10 or higher, consider this regex:
(repeattext \d+)
              ^
              |
              `- matches one or more repetitions of the previous token

Use
result = subject.scan(/repeattext \d+/)
=> ["repeattext 1", "repeattext 2", "repeattext 3"]
See the docs for the .scan() method.
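The closest Python counterpart to Ruby's scan is re.findall. A minimal sketch on the input from the question:

```python
import re

subject = "example repeattext 1 anything here repeattext 2 anything repeattext 3"

# No capture group: findall returns the full text of every match.
result = re.findall(r"repeattext \d+", subject)
print(result)   # ['repeattext 1', 'repeattext 2', 'repeattext 3']

# One capture group: findall returns only what the group captured.
numbers = re.findall(r"repeattext (\d+)", subject)
print(numbers)  # ['1', '2', '3']
```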

What is wrong with this Regular Expression?

I am a beginner and have some problems with regular expressions.
The input text is: something idUser=123654; nick="Tom" something
I need to extract the value of idUser -> 123654
I tried this:
// idUser is an 8-digit number
MatchCollection matchsID = Regex.Matches(pk.html, @"\bidUser=(\w{8})\b");
Text = matchsID[1].Value;
but the output is idUser=123654, and I need only the number.
The second problem is with nick="Tom": how can I get only the text Tom from this expression?

You don't show your output code, where you get the group from your match collection.
Hint: you will need group 1, not group 0, if you want only what is inside the parentheses.

.*?idUser=([0-9]+).*?
That regex should work for you :o)

Here's a pattern that should work:
\bidUser=(\d{3,8})\b|\bnick="(\w+)"
Given the input string:
something idUser=123654; nick="Tom" something
This yields 2 matches (as seen on rubular.com):
First match is idUser=123654, group 1 captures 123654
Second match is nick="Tom", group 2 captures Tom
Some variations:
In .NET regex, you can also use named groups for better readability.
If nick always appears after idUser, you can match the two at once instead of using alternation as above.
I've used {3,8} repetition to show how to match at least 3 and at most 8 digits.
API links
Match.Groups property
This is how you get what individual groups captured in a match
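The difference between the whole match and a captured group, and the named-group variation mentioned above, can be illustrated in Python. The group names id and nick are chosen here for the example and are not part of the original pattern.

```python
import re

text = 'something idUser=123654; nick="Tom" something'
pattern = re.compile(r'\bidUser=(?P<id>\d{3,8})\b|\bnick="(?P<nick>\w+)"')

found = {}
for match in pattern.finditer(text):
    # match.group(0) is the whole match, e.g. 'idUser=123654'.
    # match.lastgroup names the group that took part in this match,
    # and that group holds only the captured value.
    found[match.lastgroup] = match.group(match.lastgroup)

print(found)  # {'id': '123654', 'nick': 'Tom'}
```

Reading group 0 instead of the capture group gives idUser=123654 rather than 123654, which is the symptom described in the question.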

Use look-around
(?<=idUser=)\d{1,8}(?=(;|$))
To require exactly 6 digits, use (?<=idUser=)\d{6}(?=($|;))
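With look-arounds the whole match is already the bare value, so no group lookup is needed. A small Python sketch follows; the nick pattern is an extra illustration that is not in the answer above.

```python
import re

text = 'something idUser=123654; nick="Tom" something'

# The lookbehind and lookahead assert the context without consuming it.
id_match = re.search(r"(?<=idUser=)\d{1,8}(?=;|$)", text)
user_id = id_match.group(0) if id_match else None
print(user_id)  # 123654

# Same idea for the nickname (hypothetical extension of the answer).
nick_match = re.search(r'(?<=nick=")\w+(?=")', text)
nick = nick_match.group(0) if nick_match else None
print(nick)     # Tom
```

Python requires the lookbehind to have a fixed width, which holds here because idUser= and nick=" are literal strings.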
