Regex to match two factors in one? - regex

Using <div dir=.*?> works fine to match <div dir="auto">.
However, why does <div dir=.*?><br \/> not match <div dir="auto"><br />?
Code: https://regex101.com/r/5pP38n/1

The regexp starts matching at the first <div dir= in the input. Then it looks for the next ><br \/> in the input. .*? will match everything between them, which is
"auto">Please 🙏 sir my youtube channel delete <div dir="auto"
You don't match <div dir="auto"><br /> because it's contained inside this match, and a regexp doesn't return overlapping matches.
If you don't want .*? to match across multiple tags, you can use [^>]* instead.
<div dir=[^>]*><br \/>
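The difference between the two patterns can be checked with a small Python sketch. The input string here is an assumption, reconstructed from the fragment quoted above; the real input behind the regex101 link may differ.

```python
import re

# Assumed input, rebuilt from the fragment quoted in the answer.
html = '<div dir="auto">Please 🙏 sir my youtube channel delete <div dir="auto"><br />'

# Lazy dot: starts at the FIRST <div dir= and expands until ><br /> is found.
lazy = re.search(r'<div dir=.*?><br />', html)

# Negated class: cannot cross a '>', so the first div fails and the second one matches.
strict = re.search(r'<div dir=[^>]*><br />', html)

print(lazy.group(0))    # spans both div tags
print(strict.group(0))  # only the div directly followed by <br />
```

The lazy quantifier only controls where the match ends, not where it starts, which is why the first pattern swallows the earlier tag.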

Related

Match values in pairs from Html using RegEx

I need to use Regex only to extract the following output:
Match 1: (Group 1: Packaged Quantity) (Group 2: 1)
Match 2: (Group 1: Width) (Group 2: 14.7 cm)
Given the following input:
<li>
<div class="col-3"> Packaged Quantity </div>
<div class="col-5"> 1 </div>
</li>
<li>
<div class="col-3"> Width </div>
<div class="col-5"> 14.7 cm </div>
</li>
So far I have tried using :
(?<=class=\"col-3\">)[^<]+|(?<=class=\"col-5\">)[^<]+
This gives me 4 separate matches. But I want two matches, with two groups in each match. I know I could use XPath to do the same, but I am limited to using regex because of constraints that I can't comment on.
You can match the col-3"> at the start, then capture non-< characters for the first group, match </div> followed by non-> characters, and capture non-< characters again for the second group:
col-3">([^<]+)<\/div>[^>]+>([^<]+)
https://regex101.com/r/YAZFvV/1
(that said, if at all possible, it would be better to use a proper HTML parser for this sort of thing)
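As a rough illustration, the pattern above can be run against the question's input in Python. The variable names are made up for this sketch, and the whitespace stripping is an addition, since the captured groups include the spaces around each value.

```python
import re

html = """<li>
<div class="col-3"> Packaged Quantity </div>
<div class="col-5"> 1 </div>
</li>
<li>
<div class="col-3"> Width </div>
<div class="col-5"> 14.7 cm </div>
</li>"""

# Same pattern as in the answer; '/' needs no escaping in Python.
pattern = re.compile(r'col-3">([^<]+)</div>[^>]+>([^<]+)')

# findall returns one (group 1, group 2) tuple per match.
pairs = [(label.strip(), value.strip()) for label, value in pattern.findall(html)]

for number, (label, value) in enumerate(pairs, start=1):
    print(f"Match {number}: (Group 1: {label}) (Group 2: {value})")
```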

Sublime Text Regex Search for alphanumeric string, not working..

I'm trying to replace a common theme used in hundreds of pages in my project:
<div id="PageTitle"> (Page title as a string) </div>
And the title varies each page. I want to replace it with
<div class="row">
<div class="col-md-12 col-sm-12">
<h3><?= $pageTitle?></h3>
</div>
</div>
I've tried searching with <div id="PageTitle">/^\w+$/</div>, and <div id="PageTitle">"^[a-zA-Z0-9_]*$"</div> with no luck. Any ideas?
You are almost there. It looks like the pattern was copied from somewhere else: the slashes and quotes around it are delimiters from other languages and would be matched literally here, and ^ and $ are anchors that match only at the start and end of a line, so they can never match in the middle of the tag. Remove all of them.
Next, if your page title is only going to contain alphanumeric characters (and no spaces), then \w is fine; otherwise use . instead.
<div id="PageTitle">\w+<\/div>
For a title containing any character:
<div id="PageTitle">.+?<\/div>
Hope this helps!
Try this one as well; I think it's pretty strict:
<div id="PageTitle">(?:(?!<\/div>).)+<\/div>
Or even:
<div id="PageTitle">[\s\S]*?<\/div>
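Outside Sublime Text, the same search-and-replace could be scripted. The following is a minimal Python sketch, assuming a made-up page string; it uses the [\s\S]*? variant so that a title spanning several lines is also handled.

```python
import re

# Hypothetical page content for illustration only.
page = '<body>\n<div id="PageTitle">My First Page</div>\n<p>content</p>\n</body>'

# Replacement markup taken from the question.
replacement = (
    '<div class="row">\n'
    '<div class="col-md-12 col-sm-12">\n'
    '<h3><?= $pageTitle?></h3>\n'
    '</div>\n'
    '</div>'
)

pattern = re.compile(r'<div id="PageTitle">[\s\S]*?</div>')

# A function replacement avoids any escape processing of the replacement text.
result = pattern.sub(lambda match: replacement, page)
print(result)
```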

HTML pattern doesn't work even with correct regular expression

Regular expression: ((?=.*\d)(?=.*[A-Z]))
Input string: qwer1Q
The input string above passes validation if you check it in regex101.
However, if you put the regex in an HTML pattern attribute and validate the same string, it does not pass:
<form>
<div>
<input type="text" placeholder="Password"
pattern="((?=.*\d)(?=.*[A-Z]))">
</div>
<div>
<button>Submit</button>
</div>
</form>
You need to make sure the pattern matches (and consumes) the entire string because the HTML5 pattern regex is anchored by default.
<form>
<div>
<input type="text" placeholder="Password"
pattern="(?=.*\d)(?=.*[A-Z]).*">
</div>
<div>
<button>Submit</button>
</div>
</form>
The (?=.*\d)(?=.*[A-Z]).* pattern will be turned into ^(?:(?=.*\d)(?=.*[A-Z]).*)$ and it will match:
^ - start of string
(?: - start of a non-capturing group:
(?=.*\d) - a positive lookahead check to make sure there is at least 1 digit
(?=.*[A-Z]) - a positive lookahead check to make sure there is at least 1 uppercase letter
.* - any 0+ chars, greedily, up to the end of string
) - end of the non-capturing group
$ - end of string.
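The effect of the implicit anchoring can be imitated in Python, where re.fullmatch requires the whole string to match, much like the pattern attribute does. This is only an analogy: browsers evaluate the attribute with the JavaScript regex engine, not Python's.

```python
import re

original = r'((?=.*\d)(?=.*[A-Z]))'   # lookaheads only, consumes nothing
fixed = r'(?=.*\d)(?=.*[A-Z]).*'      # lookaheads plus .* to consume the input

password = 'qwer1Q'

# Unanchored search (what regex101 does): an empty match at position 0 succeeds.
unanchored = re.search(original, password) is not None

# Whole-string match: the original pattern consumes no characters, so it fails.
original_full = re.fullmatch(original, password) is not None

# The fixed pattern consumes the entire string after the checks pass.
fixed_full = re.fullmatch(fixed, password) is not None

# A string without an uppercase letter is still rejected.
rejected = re.fullmatch(fixed, 'qwer1q') is None

print(unanchored, original_full, fixed_full, rejected)
```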

Regex lookahead and behind?

So I have an unordered list that looks like:
<ul class='radio' id='input_16_5'>
<li>
<input name='input_5' type='radio' value='location_1' id='choice_16_5_0' />
<label for='choice_16_5_0' id='label_16_5_0'>Location 1</label></li>
<li>
<input name='input_5' type='radio' value='location_2' id='choice_16_5_1' />
<label for='choice_16_5_1' id='label_16_5_1'>Location 2</label></li>
<li>
<input name='input_5' type='radio' value='location_3' id='choice_16_5_2' />
<label for='choice_16_5_2' id='label_16_5_2'>Location 3</label></li>
</ul>
I would like to pass a value (e.g. location_2) to a regular expression that will then capture the whole list item containing it, so that I can remove it. So if I pass it location_2, it should match everything from the <li> to the </li> (inclusive) of the list item that it's in.
I can match up to the end of the list item with /location_3.+?(?=<li|<\/ul)/ but is there something I can do to match before and not capture other items?
This should get what you want
<li>(?:(?!<li>)[\S\s])+location_1[\S\s]+?<\/li>
Explanation
<li>: the opening li tag,
(?:(?!<li>)[\S\s])+: matches any characters, including newlines, while the negative lookahead ensures the match never runs across another <li> tag,
location_1: the value that identifies the list item to match,
[\S\s]+?: any characters, including newlines, as few as possible (thanks to @Tensibai, whose comment simplified this part with a non-greedy quantifier),
<\/li>: the closing li tag.
DEMO: https://regex101.com/r/cU4eC6/5
Additional information:
/<li>(?:(?!<li>).)+location_2.+?<\/li>/s
This regex also works; it uses the s modifier so that . matches newlines, instead of [\S\s]. (Thanks again to @Tensibai)
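To pass the value in as a parameter, the pattern can be assembled at run time. Below is a sketch in Python; remove_item is a hypothetical helper name, and re.S plays the role of the s modifier.

```python
import re

html = """<ul class='radio' id='input_16_5'>
<li>
<input name='input_5' type='radio' value='location_1' id='choice_16_5_0' />
<label for='choice_16_5_0' id='label_16_5_0'>Location 1</label></li>
<li>
<input name='input_5' type='radio' value='location_2' id='choice_16_5_1' />
<label for='choice_16_5_1' id='label_16_5_1'>Location 2</label></li>
<li>
<input name='input_5' type='radio' value='location_3' id='choice_16_5_2' />
<label for='choice_16_5_2' id='label_16_5_2'>Location 3</label></li>
</ul>"""


def remove_item(markup: str, value: str) -> str:
    """Remove the first <li>...</li> whose content contains `value` (hypothetical helper)."""
    # re.escape keeps any regex metacharacters in the value literal.
    pattern = re.compile(
        r'<li>(?:(?!<li>).)+' + re.escape(value) + r'.+?</li>',
        re.S,  # let . match newlines, like the s modifier above
    )
    return pattern.sub('', markup, count=1)


result = remove_item(html, 'location_2')
print(result)
```

Note that the value is matched as a plain substring, so a value such as location_1 would also match an item whose value is location_10.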

Regular expression for exactly one match

I am using the following regular expression in my code editor (sublime text) in order to search for the ASP.NET comments.
<%--.*(\n.*)*--%>
I want this regular expression to stop as soon as the first --%> is found, but it keeps going until the last comment's --%>. I suspect I need some kind of flag to make it stop at the first --%>, but I can't figure it out.
Can anyone please tell me how may I modify this regex?
UPDATE
I forgot to post some sample markup. Here it is:
<div class="modal-footer">
<%--<button class="btn" data-dismiss="modal">
Close</button>
<button id="btnAddCountry" class="btn btn-primary" data-dismiss="modal">
Save changes</button>--%>
</div>
</div>
<div class="row-fluid">
<div class="span12">
<div class="box paint_hover">
<div class="title">
<h3>Sale Voucher</span>
</h3>
</div>
<div class="content">
<ul id="tabExample1" class="nav nav-tabs">
<li class="active"><a id="lnkAddEditVoucher" href="#AddEditVoucher" data-toggle="tab">Add/Update Sale Voucher</a></li>
<li><a id="lnkViewVouchers" href="#ViewVouchers" data-toggle="tab">Search Sale Voucher</a></li>
<%-- <li><a id="lnkViewParties" href="#ViewParties" data-toggle="tab">Search Parties</a></li>--%>
</ul>
I just want to match the first comment and not the second one.
You need to make the * quantifiers non-greedy. Usually this is done by adding a ? after them, e.g. .*? instead of just .*.
I've also simplified the regex a bit. Sublime Text supports the (?s) modifier at the beginning of the pattern to make the dot match even newlines:
(?s)<%--.*?--%>
If you prefer matching the newline explicitly:
<%--(.|\n)*?--%>
The problem you seem to have is that you use the greedy version of .*, which matches anything (including --%>). Try using <%--.*?(\n.*?)*?--%> instead to make it non-greedy.
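The greedy and non-greedy behaviour can be compared with a short Python sketch. The markup is a shortened, assumed version of the sample from the question, and Python's re module stands in for Sublime Text's regex engine.

```python
import re

# Shortened version of the question's markup, with two separate comments.
markup = """<div class="modal-footer">
<%--<button class="btn" data-dismiss="modal">
Close</button>--%>
</div>
<ul>
<%-- <li>Search Parties</li>--%>
</ul>"""

# Original greedy pattern: runs on to the LAST --%> in the text.
greedy = re.search(r'<%--.*(\n.*)*--%>', markup)

# Non-greedy pattern with (?s): stops at the FIRST --%>.
lazy = re.search(r'(?s)<%--.*?--%>', markup)

# findall with the lazy pattern returns each comment separately.
comments = re.findall(r'(?s)<%--.*?--%>', markup)

print(greedy.group(0).count('--%>'))
print(lazy.group(0))
print(len(comments))
```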