Need help with a regular expression for Notepad++ - regex

Below is my sample text.
<ul>
<li>Google</li>
<li>Yahoo</li>
<li>Bing</li>
</ul>
I would like to add an extra attribute to each anchor tag, holding the value of the hyperlink, as shown below.
<ul>
<li>Google</li>
<li>Yahoo</li>
<li>Bing</li>
</ul>
I want to do this using a Notepad++ regular expression. I appreciate your help!

You can use this regular expression find/replace:
Find: >([^<>]+)</a>
Replace:  aria-label="$1"$0
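For anyone who wants to try this outside the editor, here is a minimal sketch in Python. The sample HTML (in particular the href values) is an assumption, since the question's anchors are not shown. Python's re module is also not the engine Notepad++ uses, so the replacement is spelled differently: $1 becomes \1 and $0 (the whole match) becomes \g<0>.

```python
import re

# Hypothetical sample: the href values are assumptions, not taken from the question.
html = """<ul>
<li><a href="http://www.google.com">Google</a></li>
<li><a href="http://www.yahoo.com">Yahoo</a></li>
</ul>"""

# Same find pattern as above. The replacement inserts the attribute,
# then re-emits the whole match (">text</a>") unchanged.
result = re.sub(r'>([^<>]+)</a>', r' aria-label="\1"\g<0>', html)
print(result)
```

The pattern starts matching at the > that closes the opening tag, which is why the new attribute lands inside the tag.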
Transforming Quotes
In the comments you asked to also replace each single quote with a doubled single quote, in both texts. This cannot be done in the same replace operation, but you can run a separate one, which should be executed before the one above:
Find: '(?=[^<>]*</a>)
Replace: ''
And then after this is done, you could apply the first replace operation.
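The two-step order can be checked with a small Python sketch. The anchor below is a made-up example, and the replacement syntax is Python's (\1 and \g<0>), not Notepad++'s.

```python
import re

# Hypothetical anchor whose link text contains a single quote.
html = """<a href="http://www.example.com">Bob's Site</a>"""

# Step 1: double every single quote that sits in the text right before </a>.
doubled = re.sub(r"'(?=[^<>]*</a>)", "''", html)

# Step 2: add the aria-label, copying the (already doubled) link text.
labelled = re.sub(r'>([^<>]+)</a>', r' aria-label="\1"\g<0>', doubled)
print(labelled)
```

Running the steps in the other order would leave the quote inside the new attribute untouched, because a > then sits between that quote and </a> and the lookahead refuses to cross it.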

I will assume that all your tags are correctly formed (no missing closing tag, no missing bracket, etc.). You can then do something like:
Replace:
(<a[^>]*)>([^<]*)(<\/a>)
by
$1 aria-label="$2">$2$3
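A quick way to sanity-check this three-group version is a Python sketch like the following. The input line is a hypothetical example, and $1/$2/$3 are written as \1/\2/\3 in Python.

```python
import re

# Hypothetical input line; the href is an assumption.
html = '<li><a href="http://www.google.com">Google</a></li>'

# Group 1: the opening tag without its ">", group 2: the link text, group 3: "</a>".
result = re.sub(r'(<a[^>]*)>([^<]*)(</a>)', r'\1 aria-label="\2">\2\3', html)
print(result)
```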

Use (?<=www\.)(\w+)(\..+\")(?=>) in the Find what field and \1\2 aria-label="\1" in the Replace with field.
Click on Replace All button.
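Note that this variant takes the label from the domain name after www. rather than from the link text. A small Python sketch, using assumed href values with one anchor per line, shows the effect:

```python
import re

# Hypothetical input, one anchor per line; the href values are assumptions.
html = """<li><a href="http://www.google.com">Google</a></li>
<li><a href="http://www.yahoo.com">Yahoo</a></li>"""

# \1 is the word after "www.", \2 is the rest of the URL up to the closing quote.
result = re.sub(r'(?<=www\.)(\w+)(\..+\")(?=>)', r'\1\2 aria-label="\1"', html)
print(result)
```

Because .+ is greedy, this relies on each anchor being on its own line. With several anchors on one line, the match could run on to the last quote that is followed by >.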

Related

How to Match Redundant Lines From Contenteditable Div in Regex

I'm trying to process the html inside a contenteditable div. It might look like:
<div>Hi I'm Jack...</div>
<div><br></div>
<div><br></div>
<div>More text.</div> *<div><br></div>*
*<div><br></div>**<div><br></div>*
*<div><br></div>*
*<div>
<br>
</div>*
What regular expression would match all trailing <div><br></div> but not the ones sandwiched between useful divs containing text, i.e., <div> text (not html) </div>?
I have enclosed all expressions I want to match in asterisks. The asterisks are for reference only and are not part of my string.
Thanks,
Jack
You can use the pattern:
(?:<div>[\n\s]*<br>[\n\s]*<\/div>)(?!.*?<div>[^<]+<\/div>)
Let me know if this works for all your cases and I will write a detailed explanation of the pattern.
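One detail worth checking is that the negative lookahead uses . to scan ahead, so it only sees past the current line when the engine's "dot matches newline" mode is on. Here is a minimal sketch in Python (chosen only for illustration; the question does not name an engine) using the sample from the question with the asterisks removed:

```python
import re

html = """<div>Hi I'm Jack...</div>
<div><br></div>
<div><br></div>
<div>More text.</div> <div><br></div>
<div><br></div><div><br></div>
<div><br></div>
<div>
<br>
</div>"""

pattern = r"(?:<div>[\n\s]*<br>[\n\s]*<\/div>)(?!.*?<div>[^<]+<\/div>)"

# With DOTALL the lookahead can see text divs on later lines.
trailing = re.findall(pattern, html, flags=re.DOTALL)

# Without DOTALL the lookahead stops at the end of the current line.
per_line = re.findall(pattern, html)

cleaned = re.sub(pattern, "", html, flags=re.DOTALL).rstrip()
print(len(trailing), len(per_line))
print(cleaned)
```

With the flag, only the trailing empty divs match. Without it, the two empty divs between the text divs match as well.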

Regular expression: Remove first match pattern in front and behind certain text

I have the following text.
<span style="color:#FF0000;">赤色</span><span style="color:#0;">|*|</span><span style="color:#0070C0;">青色</span><span style="color:#0;">|*|</span><span style="color:#00B050;">緑色</span><span style="color:#0;">|*|</span>
I need to remove any span tag that defines a color for "|*|" only. That is, in this case, I need to remove
<span style="color:#0;">
and
</span>
Can anyone help to do that?
Thanks in advance!
You want something like this:
<span[^>]+style="[^"]*color:[^>]+>(\|\*\|)<\/span>
This matches <span, then one or more non-> characters, then a style attribute that contains color:, then the rest of the tag, then |*|, then </span>.
You would replace with $1 or just |*|.
Note: one reason your attempt didn't work is that you escaped the |s, but not the *. You need to escape the * as \*.
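As a rough check, the same pattern can be run in Python against the text from the question. This is a sketch only; the replacement $1 is written \1 in Python.

```python
import re

text = (
    '<span style="color:#FF0000;">赤色</span><span style="color:#0;">|*|</span>'
    '<span style="color:#0070C0;">青色</span><span style="color:#0;">|*|</span>'
    '<span style="color:#00B050;">緑色</span><span style="color:#0;">|*|</span>'
)

# Only spans whose entire content is |*| match; the colored words are left alone.
result = re.sub(r'<span[^>]+style="[^"]*color:[^>]+>(\|\*\|)<\/span>', r'\1', text)
print(result)
```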

Sublime: replace everything between quotes

I need some help with a regular expression for Search and Replace in Sublime to do the following.
I have HTML-code with links like
href="http://www.example.com/test=123"
href="http://www.example.com/test=6546"
href="http://www.example.com/test=3214"
I want to replace them with empty links:
href=""
href=""
href=""
Please help me create a regex filter to match my case. I guess it would be something like "starts with a quote, followed by http:// ..., ends with a quote, and contains digits and an '=' sign", but I'm not very confident about how to write this as a regular expression.
(?<=href=")[^"]*
Try this. Replace with an empty string.
See demo.
https://regex101.com/r/sH8aR8/40
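The lookbehind keeps href=" out of the match, so only the URL itself is replaced. A minimal Python sketch of the same idea, using the links from the question:

```python
import re

html = '''href="http://www.example.com/test=123"
href="http://www.example.com/test=6546"
href="http://www.example.com/test=3214"'''

# Match everything after href=" up to (not including) the closing quote.
result = re.sub(r'(?<=href=")[^"]*', '', html)
print(result)
```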

Notepad++ ( perl ) regex match multiple line pattern

I want to remove a div from a couple hundred html files
<div id="mydiv">
blahblah blah
more blah blah
more html
<some javascript here too>
</div>
I thought that this would do the job but it doesn't
<div(.*)</div>
Does anyone know which is the proper regex for this?
Regex
<div[^>]+>(.*?)</div>
Don't forget to check the ". matches newline" option in the Replace dialog.
Alternatively, you can use this regex also: <div[^>]+>([\s\S]*?)</div> with or without the checkbox checked.
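To see why the checkbox matters, here is a small Python sketch where the re.DOTALL flag plays the role of ". matches newline". The surrounding <p> tags are assumptions added for illustration.

```python
import re

# Hypothetical file content; the <p> lines are made up for the example.
html = """<p>before</p>
<div id="mydiv">
blahblah blah
more blah blah
</div>
<p>after</p>"""

# "." matches newlines only when DOTALL is set (the checkbox in Notepad++).
with_flag = re.sub(r'<div[^>]+>(.*?)</div>', '', html, flags=re.DOTALL)

# [\s\S] matches any character including newlines, so no flag is needed.
with_class = re.sub(r'<div[^>]+>([\s\S]*?)</div>', '', html)

# Without the flag, "." stops at the first line break and nothing is removed.
no_flag = re.sub(r'<div[^>]+>(.*?)</div>', '', html)

print(with_flag)
```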
Discussion
Since the * quantifier is greedy, you need to tell it to match as few characters as possible (by appending ?).
Check that the divs you want to remove do NOT contain nested divs. If they do, the regex at the start of my answer won't help you.
In that case, I'd suggest using an HTML parser.
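To make the nested-div caveat concrete, here is a rough sketch using Python's standard html.parser. The DivRemover class and remove_div helper are hypothetical names written for this example, not part of any library. It is deliberately minimal: comments and doctype declarations are dropped and character entities are decoded, so treat it as an illustration rather than a drop-in tool.

```python
import re
from html.parser import HTMLParser


class DivRemover(HTMLParser):
    """Copy HTML through, dropping one <div id=...> and everything inside it."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.depth = 0  # div nesting depth inside the dropped div (0 = outside)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if self.depth:
            if tag == "div":
                self.depth += 1  # nested div: wait for its own </div>
        elif tag == "div" and dict(attrs).get("id") == self.target_id:
            self.depth = 1  # entering the div to drop
        else:
            self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> never change the nesting depth.
        if not self.depth:
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.depth:
            if tag == "div":
                self.depth -= 1
        else:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.depth:
            self.out.append(data)


def remove_div(html, target_id):
    parser = DivRemover(target_id)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


# Hypothetical input with a div nested inside the one to remove.
html = '<p>keep</p><div id="mydiv">a<div>nested</div>b</div><p>also</p>'

# The lazy regex stops at the first </div> and leaves debris behind.
regex_result = re.sub(r'<div[^>]+>(.*?)</div>', '', html, flags=re.DOTALL)
parser_result = remove_div(html, "mydiv")
print(regex_result)
print(parser_result)
```

The regex version leaves b</div> in the output, while the depth counter removes the whole subtree.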

How can I find all attributes with single quotes in a Sublime Text 2 document and replace with double quotes?

I'm feeling particularly nit-picky today. I'm working in some HTML docs that have single quotes around all attribute values throughout the docs, like this:
<div class='classone classtwo'>
I'd love to be able to do a find-and-replace in each doc and replace with double quotes, like this:
<div class="classone classtwo">
Many elements in the document will have multiple attributes:
<div class='classone classtwo' data-scripts='lazyload'>
And some will have the correct double quotes:
<div class='classone classtwo' data-scripts="lazyload">
What's the best way to replace all single quotes wrapping values with double?
Doing a "replace" operation, I might try something simple like this:
Find What: =\s*'(.*?)'
Replace With: ="$1"
Make sure to enable the Regular Expression toggle button (.*)
The nice thing about doing this in SublimeText2 is that it will immediately show you which sections are going to be matched before you perform the replace. So the regex doesn't have to be foolproof.
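The same find/replace pair can be tried in Python as a sketch, using the example tags from the question. The replacement $1 is written \1 in Python.

```python
import re

lines = [
    "<div class='classone classtwo'>",
    "<div class='classone classtwo' data-scripts='lazyload'>",
    "<div class='classone classtwo' data-scripts=\"lazyload\">",
]

# Lazy (.*?) stops at the first closing single quote, so each attribute
# value is handled separately; values already in double quotes never match.
fixed = [re.sub(r"=\s*'(.*?)'", r'="\1"', line) for line in lines]
for line in fixed:
    print(line)
```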
In Ruby, I would use this regular expression:
ruby-1.9.2-p290 :001 > s = "<div class='classone classtwo' data-scripts='lazyload'>"
=> "<div class='classone classtwo' data-scripts='lazyload'>"
s.gsub(/=\s?'([^']+)'/, '="\1"')
=> "<div class=\"classone classtwo\" data-scripts=\"lazyload\">"
It all boils down to capturing everything within the two ' ':
'([^']+)'
and returning the captured match wrapped in " ". The replacement uses the \1 backreference in a single-quoted string. Interpolating "#{$1}" would be evaluated before gsub runs, so it would not refer to the current match.
If there are no single quotes anywhere else except in the tags, and you want to replace them all, this is faster:
Highlight any single quote anywhere in the document.
Press Alt+F3.
Type a ".