Regex: Matching only groups that have a specific word embedded [closed] - regex

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
I cannot figure out how to match only the groups that contain a certain word ('test' in the example below). It is a big text file; each group starts with a line 'Group x' and includes text, with an empty line separating it from the next group. I think I need to use lookaheads and lookbehinds but don't know how. I can use vb.net for this, but I am trying out different expressions in the regex testers and can't get anywhere.
Group 1
adfdf
dd test ddfdf
dfdfadf

Group 2
ddfadfa

Group 3
add test
adfdff

Group 4
adfdf
Expected 2 matches:
Group 1
adfdf
dd test ddfdf
dfdfadf

Group 3
add test
adfdff

Start your pattern with ^Group \d+$ and end with (?:^$|\Z). In the middle, match test, but only if no empty line comes before it: (?:.(?!^$))*?test (see "Regular expression to match a line that doesn't contain a word?" for details on how the latter works). Don't forget the m and s modifiers:
^Group \d+$(?:.(?!^$))*?test.*?(?:^$|\Z)
Demo: https://regex101.com/r/kM9qB3/2
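As a rough illustration, the pattern can be tried out in Python (the question mentions vb.net; Python is used here only for the sketch). The sample text and the variable names are assumptions, with groups separated by an empty line as the question describes. The m and s modifiers correspond to re.MULTILINE and re.DOTALL.

```python
import re

# Sample text modelled on the question; groups are separated by an empty line.
text = (
    "Group 1\nadfdf\ndd test ddfdf\ndfdfadf\n"
    "\n"
    "Group 2\nddfadfa\n"
    "\n"
    "Group 3\nadd test\nadfdff\n"
    "\n"
    "Group 4\nadfdf"
)

# m modifier -> re.MULTILINE, s modifier -> re.DOTALL
pattern = re.compile(
    r"^Group \d+$(?:.(?!^$))*?test.*?(?:^$|\Z)",
    re.MULTILINE | re.DOTALL,
)

# Each match ends at the empty line, so strip the trailing newline.
matches = [m.group(0).strip() for m in pattern.finditer(text)]
for block in matches:
    print(block)
    print("---")
```

Only the groups that contain test (Group 1 and Group 3 in this sample) should be printed.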

Related

Return the first occurrence using Regex [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 1 year ago.
I have the following expression:
[Document[_id=5f9ecf8ca9bec5549493ba7d,·policy_name=xxx,·is_mobile=false, Document[_id=6090fead53bc363849fce989,·policy_name=yyy,·is_mobile=true, Document[_id=619cf036761c281e3ad12327,·policy_name=zzz,·is_mobile=false, Document[_id=619cf729ea016d1e3336e903,·policy_name=xyz,·is_mobile=false]
I would like to capture ONLY the first Document id (i.e. 5f9ecf8ca9bec5549493ba7d).
I tried this regex: (?<=Document\[_id=).*?[^,]* but it returns all the Document ids.
1) How can I capture the first / second (Nth match) Document id from the expression?
2) Is it possible to use a regex AND operator to find the Document id with "is_mobile=true"?
(i.e. 5f9ecf8ca9bec5549493ba7d & true)
I would really appreciate any help.
EDIT:
I'm using https://regex101.com/
This is the link where I tried to capture the first / second (nth occurrence) Document id (I need only the number): https://regex101.com/r/ZnYRhq/1
There is no language listed, but one approach could be to use a capture group for the value that you want, and to start the pattern with an anchor ^ to assert the start of the string.
For the first Document id:
^.*?\bDocument\[_id=([^\]\[\s,]+)
For the first Document id that has is_mobile=true (assuming that the order of the key-value pairs is as given in the example and is within the same opening and closing square brackets)
^.*?\bDocument\[_id=([^\]\[\s,]+),[^\]\[]*\bis_mobile=true\b
The pattern matches:
^ Start of string
.*?\bDocument\[_id= Match the first occurrence of Document[_id=
( Capture group 1
[^\]\[\s,]+ Match 1+ times any char except ] [ whitespace char or ,
) Close group 1
,[^\]\[]* Match a comma and optional chars other than ] and [
\bis_mobile=true\b Match is_mobile=true between word boundaries
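A minimal Python sketch of both capture-group patterns follows. The input is modelled on the question, assuming plain spaces after the commas; the variable names are made up for the example.

```python
import re

# Input modelled on the question, assuming plain spaces after the commas.
text = (
    "[Document[_id=5f9ecf8ca9bec5549493ba7d, policy_name=xxx, is_mobile=false, "
    "Document[_id=6090fead53bc363849fce989, policy_name=yyy, is_mobile=true, "
    "Document[_id=619cf036761c281e3ad12327, policy_name=zzz, is_mobile=false, "
    "Document[_id=619cf729ea016d1e3336e903, policy_name=xyz, is_mobile=false]"
)

# First Document id: the value is in capture group 1.
first_id = re.search(r"^.*?\bDocument\[_id=([^\]\[\s,]+)", text)

# First Document id whose own key-value pairs include is_mobile=true.
mobile_id = re.search(
    r"^.*?\bDocument\[_id=([^\]\[\s,]+),[^\]\[]*\bis_mobile=true\b", text
)

print(first_id.group(1))
print(mobile_id.group(1))
```

Because [^\]\[]* cannot cross a square bracket, the is_mobile=true check stays inside the same Document entry as the captured id.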
Or using lookarounds for a single (not global) match:
(?<=Document\[_id=)[^,]*(?=,[^][]*\bis_mobile=true\b)
How about this one for the first id?
(?:^\[Document\[_id=)([^,]+)
For the n-th id you need to use a capturing group, but how to pick the n-th match depends on the language/framework.
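In Python, for instance, one way to get the n-th id is to collect every capture with re.findall and index into the list. This is only a sketch: the helper name nth_document_id is hypothetical, and the input assumes plain spaces after the commas.

```python
import re

# Input modelled on the question, assuming plain spaces after the commas.
text = (
    "[Document[_id=5f9ecf8ca9bec5549493ba7d, policy_name=xxx, is_mobile=false, "
    "Document[_id=6090fead53bc363849fce989, policy_name=yyy, is_mobile=true, "
    "Document[_id=619cf036761c281e3ad12327, policy_name=zzz, is_mobile=false, "
    "Document[_id=619cf729ea016d1e3336e903, policy_name=xyz, is_mobile=false]"
)


def nth_document_id(text, n):
    """Return the n-th (1-based) Document id, or None if there are fewer than n."""
    # With one capture group, findall returns the list of captured ids.
    ids = re.findall(r"Document\[_id=([^,\]\[\s]+)", text)
    return ids[n - 1] if 1 <= n <= len(ids) else None


print(nth_document_id(text, 1))
print(nth_document_id(text, 2))
```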
txt = """
[Document[_id=5f9ecf8ca9bec5549493ba7d,·policy_name=xxx,·is_mobile=false, Document[_id=6090fead53bc363849fce989,·policy_name=yyy,·is_mobile=true, Document[_id=619cf036761c281e3ad12327,·policy_name=zzz,·is_mobile=false, Document[_id=619cf729ea016d1e3336e903,·policy_name=xyz,·is_mobile=false]
"""
# Split on commas, keep the pieces that contain '_id', and take the value after '='.
ids = [part.split('=')[1].strip(' ') for part in txt.split(',') if '_id' in part]
print(ids[0])  # first Document id
output:
5f9ecf8ca9bec5549493ba7d

Regular expression Regex to extract a string [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 4 years ago.
Please can somebody help me? I'm new to regex and have no idea how to do this!
I'm trying to extract from a list which looks like this...
Joe-Age23-46737-251.aspx
Tim-Age18-46909-451.aspx
Roger-Age41-59768-251.aspx
What I want is this...
46737-251.aspx
46909-451.aspx
59768-251.aspx
so basically anything after the second to last hyphen.
Cheers
Let's translate "everything after the second-to-last hyphen" into regex:
(?<=-)[^-]*-[^-]*$
Explanation:
(?<=-) # Assert starting position right after a hyphen
[^-]* # Match zero or more characters except hyphens
- # Match a single hyphen
[^-]* # see above
$ # until end of string.
Test it live on regex101.com.
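As a quick check, the pattern can be applied to the sample names in Python. This is a sketch; the list and variable names are just for illustration.

```python
import re

names = [
    "Joe-Age23-46737-251.aspx",
    "Tim-Age18-46909-451.aspx",
    "Roger-Age41-59768-251.aspx",
]

# Everything after the second-to-last hyphen.
pattern = re.compile(r"(?<=-)[^-]*-[^-]*$")

tails = [pattern.search(name).group(0) for name in names]
print(tails)
```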
Step 1: Split each line on the hyphen (-). You will get an array of strings.
Step 2: Take the last two elements of that array.
Step 3: Join those two elements with a hyphen.
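A split-based variant can be sketched in Python as follows. It is one interpretation of the steps above: split each line on hyphens and keep the last two pieces, which is what the expected output requires. The helper name tail_after_second_last_hyphen is hypothetical.

```python
names = [
    "Joe-Age23-46737-251.aspx",
    "Tim-Age18-46909-451.aspx",
    "Roger-Age41-59768-251.aspx",
]


def tail_after_second_last_hyphen(name):
    """Keep the last two hyphen-separated pieces and rejoin them."""
    # rsplit with maxsplit=2 splits from the right, so earlier hyphens are untouched.
    pieces = name.rsplit("-", 2)
    return "-".join(pieces[-2:])


tails = [tail_after_second_last_hyphen(name) for name in names]
print(tails)
```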

RegEx or PowerShell to remove repeat characters in sequence [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 4 years ago.
Writing a script to convert a Windows host file into a CSV. Used RegEx to get to this stage:
1.1.1.1,,server1,,,
2.2.2.2,,server2
3.3.3.3,,server3
4.5.6.7,,server4,,server5,,server6,,server7,,
8.8.8.8,,server8
9.9.9.9,server9
I need some RegEx that can remove the duplicate commas (in sequence) so it would look like this:
1.1.1.1,server1,
2.2.2.2,server2
3.3.3.3,server3
4.5.6.7,server4,server5,server6,server7,
8.8.8.8,server8
9.9.9.9,server9
I will also need to remove the comma at the end of each line (if there is one), but I think this will be simpler to do.
The regex for your first task of removing duplicate commas was already provided in the comments above, but if you also want to remove trailing commas at the end of the line, you can use this to solve both problems at once:
(?m),(?=,|$)
Explanation:
(?m) # turn on multiline mode ($ matches end-of-line, not just end-of-string)
, # Match a comma
(?= # only if followed by
, # another comma
| # or
$ # the end of the line (because of multiline mode).
) # End of lookahead assertion
Test it live on regex101.com.
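The question is about PowerShell, but the same substitution can be sketched in Python to see the effect. The input lines come from the question; the variable names are assumptions.

```python
import re

hosts_csv = "\n".join([
    "1.1.1.1,,server1,,,",
    "2.2.2.2,,server2",
    "3.3.3.3,,server3",
    "4.5.6.7,,server4,,server5,,server6,,server7,,",
    "8.8.8.8,,server8",
    "9.9.9.9,server9",
])

# Remove every comma that is followed by another comma or by the end of a line.
cleaned = re.sub(r"(?m),(?=,|$)", "", hosts_csv)
print(cleaned)
```

Each run of commas collapses to its last comma, and a run at the end of a line disappears entirely.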

Regex to match custom markdown syntax [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
I want to match the following with multiple capture groups:
Definition 1
: This is the definition text that described the term. Can have markdown formatting and
multiple lines.
Definition 2
: This is the definition text with **markdown**.`code`
I also want to replace it with the following text (HTML definition list):
<dl>
<dt>Definition 1</dt>
<dd>This is the definition text that described the term. Can have markdown formatting and
multiple lines.</dd>
<dt>Definition 2</dt>
<dd>This is the definition text with **markdown**.`code`</dd>
</dl>
You could do this in two steps:
1. Insert the dt and dd tags
Perform a search with:
/(.*)\R: ((?:.+(?:\R|$))*?)(?=\R|.*\R:|$)/g
and substitute by:
<dt>$1</dt>\n<dd>$2</dd>\n
See regex tester.
2. Add the dl tags
Use the result of the previous substitution and perform the following search:
/(<dt>.*?<\/dd>(?!\s*<dt>))/gs
and substitute by:
<dl>\n$1\n</dl>
See regex tester.
Remarks
If the \R escape is not supported, use \n instead.
The back-references $1, $2 might need to be changed to \1, \2 depending on your regex engine.
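A possible Python translation of the two steps is sketched below. It follows the remarks: Python's re module has no \R, so \n is used instead, and the back-references are written \1, \2. The input is modelled on the question without a blank line between the definitions; the variable names are assumptions.

```python
import re

# Input modelled on the question (no blank line between the two definitions).
markdown = (
    "Definition 1\n"
    ": This is the definition text that described the term. "
    "Can have markdown formatting and\n"
    "multiple lines.\n"
    "Definition 2\n"
    ": This is the definition text with **markdown**.`code`"
)

# Step 1: wrap each term and its definition in <dt> and <dd> tags.
step1 = re.sub(
    r"(.*)\n: ((?:.+(?:\n|$))*?)(?=\n|.*\n:|$)",
    r"<dt>\1</dt>\n<dd>\2</dd>\n",
    markdown,
)

# Step 2: wrap the whole run of <dt>/<dd> pairs in one <dl> (s flag -> re.DOTALL).
html = re.sub(
    r"(<dt>.*?</dd>(?!\s*<dt>))",
    r"<dl>\n\1\n</dl>",
    step1,
    flags=re.DOTALL,
)
print(html)
```

With this input, a multi-line definition keeps the newline of its last line inside the <dd> element, so some whitespace trimming may still be wanted afterwards.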

How to match a line which should not contain a word after a matched word [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 7 years ago.
1) /abs/2-bhk-property-for-sale-in-builders-apartments-bang123asdxc/38070127?page=509
2) /vjr-apartments/private-k3zs0gdf
3) /dolphin-jasmine-apartments-navimumbai-approvals/psddp-3qfci22i
4) /kanaka-lakshmi-apartments-andra/private-67mwcdbe
What is the regex to match strings that contain 'apartments' but not 'private'?
I.e. it should match 1) and 3) but not 2) and 4).
I wrote this regex, .*?(-)(apartments)(?!\/private).* but it is not working.
You can use this regex:
-apartments(?!.*?/private)
(?!.*?/private) is a negative lookahead that will fail the match if the string /private comes anywhere after -apartments.
In some languages / needs to be escaped so use:
/-apartments(?!.*?\/private)/
This one matches the whole line:
.*apartments(?!.*/private).*
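The lookahead can be checked against the four sample paths with a short Python sketch (in Python the / does not need escaping; the variable names are only for illustration):

```python
import re

paths = [
    "/abs/2-bhk-property-for-sale-in-builders-apartments-bang123asdxc/38070127?page=509",
    "/vjr-apartments/private-k3zs0gdf",
    "/dolphin-jasmine-apartments-navimumbai-approvals/psddp-3qfci22i",
    "/kanaka-lakshmi-apartments-andra/private-67mwcdbe",
]

# -apartments must not be followed, anywhere later in the string, by /private.
pattern = re.compile(r"-apartments(?!.*?/private)")

matching = [p for p in paths if pattern.search(p)]
for p in matching:
    print(p)
```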