Match Regex except Regex - regex

I have a text like this:
22 <a data-event="event:noted:tasks" class="btn btn-default show-if-closed" title="Noted Tasks">
25 <a data-event="event:until-today" class="btn btn-default show-if-closed" title="Until Today">
28 <a data-event="event:until-one-week" class="btn btn-default show-if-closed" title="Until One Week">
31 <a data-event="event:until-one-month" class="btn btn-default show-if-closed" title="Until One Month">
Now I want to replace the entire text except the string inside the title attribute.
After replacing the text I would like to get lines like this:
Noted Tasks
Until Today
Until One Week
Until One Month
What Regex-Pattern do I need to match the text except the title-Values? The pattern should be universal, not limited to a-Tags

Use the following regex:
^.*?title="([^"]*)".*$
and replace with \1. In that way, the entire line is replaced with the desired info.
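For illustration, here is a minimal sketch of the same substitution in Python. The multiline flag is an assumption, added so that ^ and $ anchor at every line; the variable names are made up for the example, and the sample is shortened to two lines.

```python
import re

text = '''22 <a data-event="event:noted:tasks" class="btn btn-default show-if-closed" title="Noted Tasks">
25 <a data-event="event:until-today" class="btn btn-default show-if-closed" title="Until Today">'''

# re.MULTILINE lets ^ and $ match at each line boundary, so every line
# is replaced by the contents of its title attribute (group 1).
pattern = re.compile(r'^.*?title="([^"]*)".*$', re.MULTILINE)
titles = pattern.sub(r'\1', text)
print(titles)
```

Lines without a title="..." attribute would be left untouched by this pattern.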
Please note that it is better to use proper HTML parsers to... ahem... parse HTML.
Regarding "The pattern should be universal, not limited to a-Tags": the word title can appear anywhere on a web page (normal text, class names, keywords...), so only a dedicated HTML parser will help you in the long run.
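As a rough sketch of the parser route, Python's standard html.parser module can collect title attributes from any tag while ignoring the word title elsewhere. The class name TitleCollector and the sample markup are hypothetical, chosen only for this example.

```python
from html.parser import HTMLParser


class TitleCollector(HTMLParser):
    """Collects the title attribute of every start tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.titles = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the current tag
        for name, value in attrs:
            if name == "title" and value:
                self.titles.append(value)


# Made-up sample: "title" also appears as a class name and in plain text.
html = ('<a data-event="event:until-today" class="btn" title="Until Today">x</a>'
        '<p class="title">title="not an attribute"</p>'
        '<span title="Until One Week"></span>')

collector = TitleCollector()
collector.feed(html)
collector.close()
print(collector.titles)
```

Only real title attributes are collected; the class name and the text inside the paragraph are skipped.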

Related

RegEx to find a string included between two characters while EXCLUDING the delimiters

I'm kinda lost with Regex and would appreciate some help.
Target: To extract the URL between the two " ", without returning the " themselves.
Base string:
<span class="fa fa-eye fa-fw poptip" data-toggle="tooltip" title="" data-original-title="Inspect in-game"></span>
I came up with the following solution:
(="(.*)" class="btn btn-xs btn-default ")
Too bad it is matching
="somerandomurl" class="btn btn-xs btn-default "
Is it possible to match only the inner result, without the delimiters?
somerandomurl
Since this should be included in a script that should run as fast as possible, maybe there is a faster and better approach? In reality this regex search will be applied on a complete website.
Using regex to match markup is usually not a good idea. If you have the option, you might prefer an HTML/DOM parser.
That said, your regex should match the sample in most languages. But it defines two sets of parentheses, so the result you want is in group 2. Groups 0 and 1 both hold the full match.
If you have trouble reading the correct result group, please provide some additional information, such as which language you're working in and preferably a snippet.
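To make the group numbering concrete, here is a small sketch in Python. The sample tag is made up to fit the match shown in the question, and the second pattern is just one possible simplification, not necessarily what the asker needs.

```python
import re

# Hypothetical input, shaped like the match quoted in the question.
sample = '<a href="somerandomurl" class="btn btn-xs btn-default ">'

# Pattern from the question: the outer parentheses are group 1,
# the inner (.*) is group 2.
m = re.search(r'(="(.*)" class="btn btn-xs btn-default ")', sample)
full, outer, inner = m.group(0), m.group(1), m.group(2)

# Dropping the outer parentheses leaves a single group;
# [^"]* cannot run past the closing quote.
m2 = re.search(r'="([^"]*)" class="btn btn-xs btn-default "', sample)
print(inner, m2.group(1))
```

Here group 0 and group 1 are identical, and only group 2 (or group 1 of the simplified pattern) holds the value without its delimiters.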

How do I conditionally add a space in a regex replace

When I woke up this morning, I didn’t know a stroke of regex. By the time I went to Mass, I’d been able to cobble together this regex to find occurrences of ‘Mph’ in an html document.
(?i)(?<=[\s|\d])mph+
If I run it against the following test data:
<div class="vsMph">
<p>95 Mph</p>
</div>
<div class="vsMph">
<p>95Mph</p>
</div>
It correctly matches:
‘ Mph’ and
‘Mph’
And equally correctly leaves the ‘vsMph’ alone, which is exactly what I want. Eventually, I'm going to use the same technique to match knots, ft, in, km and so on.
I’m executing this expression in Sublime Text 3 using RegReplace. Ultimately, I hope to use this regular expression to find all occurrences of ‘Mph’ preceded by a space or a digit and:
1. Enclose ‘Mph’ in <abbr> tags.
2. Add a space between the digit and the opening <abbr> tag if there was no space between the last digit and 'Mph' originally.
In other words, I want to convert the above test data to:
<div class="vsMph">
<p>95 <abbr title="Miles per hour">Mph</abbr></p>
</div>
<div class="vsMph">
<p>95 <abbr title="Miles per hour">Mph</abbr></p>
</div>
I can get RegReplace to add the <abbr> tags as described in 1. above, but I’ve searched around on Google and I can’t find anything that tells me how to conditionally insert a space in a regex replace.
So I’m wondering. Is it possible in the first place to conditionally add a space in a regex replacement and if so how do I do it, or do I have to search for ‘\sMph’ and ‘\dMph’ and replace them separately?
Regards.
I would suggest using groups to match Mph. You could search for simply the following regex:
(\d)(\s)?(Mph)
Then replace using groups
$1 <abbr title="Miles per hour">$3</abbr>
output:
<div class="vsMph">
<p>95 <abbr title="Miles per hour">Mph</abbr></p>
</div>
<div class="vsMph">
<p>95 <abbr title="Miles per hour">Mph</abbr></p>
</div>
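The same idea can be tried outside the editor. This is a minimal sketch in Python, where group references in the replacement are written \1 and \3 rather than $1 and $3; the variable names are assumptions for the example.

```python
import re

html = '''<div class="vsMph">
<p>95 Mph</p>
</div>
<div class="vsMph">
<p>95Mph</p>
</div>'''

# (\d) keeps the digit, (\s)? swallows an optional space, (Mph) keeps the unit.
# The replacement always writes exactly one space after the digit,
# so no conditional logic is needed.
result = re.sub(r'(\d)(\s)?(Mph)',
                r'\1 <abbr title="Miles per hour">\3</abbr>',
                html)
print(result)
```

Because the pattern requires a digit before the optional space, the vsMph class name is left alone.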

Textmate Find regex, Replace wild

In textmate-1.5 I can use the regex syntax (.*) to find both lines in the below use case:
<span class="class1"></span>
<span class="class2"></span>
Now I want to append more code to each of them, so my find query is span class="(.*)" and my replace query is span class="(.*)" aria-hidden="true", which I had hoped would result in this:
<span class="class1" aria-hidden="true"></span>
<span class="class2" aria-hidden="true"></span>
but it actually resulted in this:
<span class="(.*)" aria-hidden="true"></span>
<span class="(.*)" aria-hidden="true"></span>
Using find/replace (not column selection, which would work for this example but not for the actual situation), is it possible to keep the text matched by the regex in the replacement, with a representative wildcard character or something?
Change your replace query to:
span class="$1" aria-hidden="true"
$1 refers to the characters captured by group 1.
(<span class="[^"]*")
Try this. Replace with $1 aria-hidden="true". See demo:
http://regex101.com/r/wQ1oW3/22
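As a quick check of the backreference idea outside TextMate, here is a sketch in Python, which writes the reference as \1 instead of $1. It uses the [^"]* form so the group stops at the closing quote; the variable names are made up.

```python
import re

html = '<span class="class1"></span>\n<span class="class2"></span>'

# Group 1 captures the class value; \1 writes it back in the replacement.
result = re.sub(r'span class="([^"]*)"',
                r'span class="\1" aria-hidden="true"',
                html)
print(result)
```

The captured text is reinserted, rather than the literal (.*) that the original replace query produced.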

Replacing an open-ended HTML regex match in Sublime Text

I've been doing a lot of finding and replacing in Sublime Text and decided I needed to learn RegEx. So far, so good. I'm no expert by any means, but I'm learning quickly.
The trouble is knowing how to replace open-ended HTML matches.
For example, I wanted to find all <button>s that didn't have a role attribute.
After some hacking and searching, I came up with the following pattern:
<button(?![^>]+role).*?>(.*?)
Great! Except, in the code base I'm working in, there are tons of results.
How do I go about replacing the results safely by injecting role=button at the end of <button, just before the closing > in the opening tag?
Desired results
Before: <button type="button">
After: <button type="button" role="button">
Before: <button class="btn-lg" type="button">
After: <button class="btn-lg" type="button" role="button">
You can capture everything before the ending > and put it back, before the insertion of role=button:
<(button(?![^>]+role).*?)>
This captures everything in the tag.
Replace by:
<$1 role="button">
The $1 contains what the first regex captured.
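To see the capture-and-reinsert step in action, here is a minimal sketch in Python (which uses \1 in place of $1). The third sample line, with an existing role, is an added assumption to show that the negative lookahead skips it.

```python
import re

html = '''<button type="button">
<button class="btn-lg" type="button">
<button type="button" role="tab">'''

# Group 1 holds everything between < and > of a button tag that has no
# role attribute; the replacement puts it back and appends role="button".
result = re.sub(r'<(button(?![^>]+role).*?)>', r'<\1 role="button">', html)
print(result)
```

Tags that already carry a role are left unchanged because the lookahead rejects them before anything is captured.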

regexp, help with assertions

I have the following string:
<a name="subhd_182"></a>
<a name="st_394"></a>
<a name="st_395"></a>
<a name="qn_494"></a>
<a name="st_495"></a>
<a name="qn_594"></a>
<a name="st_595"></a>
<a name="subhd_282"></a>
<a name="qn_694"></a>
<a name="st_695"></a>
<a name="qn_794"></a>
<a name="st_795"></a>
<a name="qn_894"></a>
<a name="st_895"></a>
And I want to replace every <a name="st_\d*"></a> with <a name="qn_\d*"></a> if it follows immediately <a name="subhd_\d*"></a>
I use this regex %(.*<a name="subhd_.*)(?=<a name="st(?!<a name="qn))(<a name=")st(.*)%sU and replace with $1$2qn$3. But it replaces the second case too.
I'm assuming you only want to match the name after the first subhd row above, but not the second, since the first is followed by an "st_" and the second by a "qn_".
Try:
(<a name="subhd_\d+">\s*<\/a>\s*<a name=")st(_\d+">)
where you would replace with $1qn$2. Note that here I have assumed that you were quite literal when you said "it follows immediately".
I don't really understand why you're throwing the lookahead in, unless the actual rule you're trying to implement is more complicated than you've stated.
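For reference, this pattern can be exercised with a short Python sketch (the question appears to use PHP-style delimiters, so treating it in Python is an assumption). The sample is a shortened version of the string from the question, and the replacement is written \1qn\2.

```python
import re

text = '''<a name="subhd_182"></a>
<a name="st_394"></a>
<a name="st_395"></a>
<a name="subhd_282"></a>
<a name="qn_694"></a>
<a name="st_695"></a>'''

# Group 1: the subhd anchor plus the start of the next anchor.
# Group 2: the numeric suffix and closing of the name attribute.
# Only an st_ anchor directly after a subhd anchor is rewritten.
pattern = r'(<a name="subhd_\d+">\s*</a>\s*<a name=")st(_\d+">)'
result = re.sub(pattern, r'\1qn\2', text)
print(result)
```

The second subhd row is followed by a qn_ anchor, so nothing after it is changed.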
Try: %(<a name="subhd_\d+"></a>\n<a name=")st(.*)%sU and replace with $1qn$2. On a side note, I don't really know what the U modifier does for you here. Also, you might want to change your \n newline matcher according to your operating system.
I have found RegExr a really useful tool for regular expressions.