I am fairly new to regular expressions and have been having difficulty using one to extract the data I am after. Specifically, I am looking to extract the date touched and the counter from the following:
<span style="color:blue;"><query></span>
<span style="color:blue;"><pages></span>
<span style="color:blue;"><page pageid="3420" ns="0" title="Test" touched="2011-07-08T11:00:58Z" lastrevid="17889" counter="9" length="6269" /></span>
<span style="color:blue;"></pages></span>
<span style="color:blue;"></query></span>
<span style="color:blue;"></api></span>
I am currently using VS2010. My current expression is:
std::tr1::regex rx("(?:.*touch.*;)?([0-9-]+?)(?:T.*count.*;)([0-9]+)(&.*)?");
std::tr1::regex_search(buffer, match, rx);
match[1] contains the following:
2011-07-08T11:00:58Z" lastrevid="17889" counter="9" length="6269" /></span>
<span style="color:blue;"></pages></span>
<span style="color:blue;"></query></span>
<span style="color:blue;"></api></span>
match[2] contains the following:
6269" /></span>
<span style="color:blue;"></pages></span>
<span style="color:blue;"></query></span>
<span style="color:blue;"></api></span>
I am looking for just "2011-07-08" in match[1] and just "9" in match[2]. The date format will never alter, but the counter will almost certainly be much larger.
Any help would be highly appreciated.
That's because cmatch::operator[](int i) returns a sub_match, whose sub_match::operator basic_string() (used in the context of cout) returns a string starting at the beginning of the match and ending at the end of the source string.
Use sub_match::str(), i.e. match[1].str() and match[2].str().
Moreover, you'll need your expression to be more specific: .* is greedy, so it first grabs as much of the input as it can and only gives characters back when the rest of the pattern fails to match.
Try std::tr1::regex rx("touched=\"([0-9-]+).+counter=\"([0-9]+)");
You could even use non-greedy matchers (like +? and *?) to prevent excessive matching.
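As a rough illustration of the more specific pattern with a non-greedy matcher, here is a minimal sketch. It uses Python's re module as a stand-in for std::tr1::regex (an assumption made only for the example, the pattern itself is the same), and the variable names are hypothetical.

```python
import re

# Sample line from the question, with literal quotes.
buffer = ('<page pageid="3420" ns="0" title="Test" '
          'touched="2011-07-08T11:00:58Z" lastrevid="17889" '
          'counter="9" length="6269" />')

# [0-9-]+ cannot match the "T", so group 1 stops after the date.
# .+? is non-greedy, so it stops at the first counter=" it reaches.
rx = re.compile(r'touched="([0-9-]+).+?counter="([0-9]+)')

match = rx.search(buffer)
date, counter = match.group(1), match.group(2)
print(date, counter)
```

Anchoring on the attribute names touched=" and counter=" is what keeps the groups small; the date group ends on its own because "T" is not in the character class.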
Try
std::tr1::regex rx("(?:.*touch.*;)?([0-9-]+)(?:T.*count.*;)([0-9]+)(&.*)?");
Removing the question mark makes the term greedy, so it will match as much as it can.
Related
I'm kinda lost with Regex and would appreciate some help.
Target: To extract the URL between the two " ", without returning the " themselves.
Base string:
<span class="fa fa-eye fa-fw poptip" data-toggle="tooltip" title="" data-original-title="Inspect in-game"></span>
I came up with the following solution:
(="(.*)" class="btn btn-xs btn-default ")
Too bad it is matching
="somerandomurl" class="btn btn-xs btn-default "
Is it possible to match only the inner result, without the delimiters?
somerandomurl
Since this should be included in a script that should run as fast as possible, maybe there is a faster and better approach? In reality this regex search will be applied on a complete website.
Using RegEx to match markup is usually not a good idea. If you have the option, you might want to prefer an HTML / DOM parser.
That said, your RegEx should match the sample in most languages. But it defines two sets of parentheses, so the result you want is located in group 2. Both group 0 and group 1 will hold the full match.
If you have trouble reading the correct result group, please provide some additional information, like which language you're working in and preferably a snippet.
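To make the group numbering concrete, here is a small sketch in Python. The sample tag (the <a href=...> wrapper) is a hypothetical stand-in, since the full markup was not shown in the question; only the regex is taken from it.

```python
import re

# Hypothetical sample; the real markup was not shown in full.
html = '<a href="somerandomurl" class="btn btn-xs btn-default ">'

# The regex from the question: the outer parentheses are group 1,
# the inner (.*) is group 2.
pattern = re.compile(r'(="(.*)" class="btn btn-xs btn-default ")')
m = pattern.search(html)

full = m.group(0)   # whole match
outer = m.group(1)  # same text as the whole match
inner = m.group(2)  # only the part between the quotes
print(inner)
```

Dropping the outer parentheses would move the URL to group 1, which is usually easier to read.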
I'm doing a find/replace, but I have already made a few changes the slow way. I want to use regex to replace the rest but make sure I don't replace ones I've already done. So, I need it to match 1 but not 2. The end result will be replacing all instances that look like 1 with 2. The -icon can be anything.
1: <span class="glyphicons icon">
2: <span class="glyphicons glyphicons-icon">
More examples:
<span class="glyphicons hand">
<span class="glyphicons flower">
<span class="glyphicons bucket">
<span class="glyphicons glyphicons-stone_head">
<span class="glyphicons glyphicons-decapitated-corpse">
I need to replace the first 3 examples but not the last 2. The application is quite large so I'd really like to be able to do this with one 'replace all'.
Assuming icon can be any word, I'd try replacing glyphicons\s([A-Za-z]+)" by glyphicons glyphicons-$1".
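A quick way to check that this only touches the unconverted lines is to run it over the examples from the question. This is a sketch in Python, where the backreference is written \1 instead of $1; the list and variable names are made up for the example.

```python
import re

lines = [
    '<span class="glyphicons hand">',
    '<span class="glyphicons flower">',
    '<span class="glyphicons bucket">',
    '<span class="glyphicons glyphicons-stone_head">',
    '<span class="glyphicons glyphicons-decapitated-corpse">',
]

# [A-Za-z]+ must run straight into the closing quote, so
# "glyphicons glyphicons-..." fails at the hyphen and is left alone.
pattern = re.compile(r'glyphicons\s([A-Za-z]+)"')
converted = [pattern.sub(r'glyphicons glyphicons-\1"', line) for line in lines]

for line in converted:
    print(line)
```

Note that icon names containing underscores, hyphens or digits would not be picked up by [A-Za-z]+. Widening the class to something like [\w-]+ would also match the already converted lines, so in that case a negative lookahead such as (?!glyphicons-) would be needed after the \s.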
I wrote a section of a webpage that had the following bit...
<span id="item01"> some first presented text</span>
<span id="item02"> some other text, presented second</span>
<span id="item03"> more text</span>
....
<span id="item15"> last bit of text.</span>
I then realized that it should have been numbered from 14 to 0, not 1 to 15. (Yes, bad design on my part, not planning out the JavaScript first.)
Question. Is there an easy way in vim to do math on the numbers in a regular expression? What I would like to do is a search on the text "item[00-99]", and have it return the text "item(15-original number)"
The search seems easy enough -- /item([0-9][0-9])/
(parentheses to put the found numbers into a buffer), but is it even possible to do math on this?
Macro for making numbered lists in vim? gives a way to number something from scratch, but I'm looking for a renumbering method.
:%s/item\zs\d\+/\=15 - submatch(0)/
will do what you want.
Breaking it down:
item\zs\d\+: match numbers after item (the \zs indicates the beginning of the match)
\=: indicate that the replace is an expression
15 - submatch(0): returns 15 minus the number matched
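For comparison, the same idea (a replacement computed from the match) can be sketched outside Vim. This is a Python version, assuming the text is available as a string; the function name renumber is hypothetical.

```python
import re

text = '''<span id="item01"> some first presented text</span>
<span id="item02"> some other text, presented second</span>
<span id="item15"> last bit of text.</span>'''

def renumber(match):
    # group(1) is the digit run after "item"; int() reads it as decimal.
    return 'item{}'.format(15 - int(match.group(1)))

# The callback plays the role of \= in the Vim substitute command.
result = re.sub(r'item(\d+)', renumber, text)
print(result)
```

Like the Vim command, this does not zero-pad the new numbers, so item15 becomes item0 rather than item00.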
Another interesting way is to use g<CTRL-a> (:help v_g_CTRL-A for more information)
Start from
<span id="item01"> some first presented text</span>
<span id="item02"> some other text, presented second</span>
<span id="item03"> more text</span>
....
<span id="item15"> last bit of text.</span>
Use visual block mode to reset all numbers to 00:
<CTRL-V> select all numbers
r0 replace all numbers with zeros
You should see:
<span id="item00"> some first presented text</span>
<span id="item00"> some other text, presented second</span>
<span id="item00"> more text</span>
....
<span id="item00"> last bit of text.</span>
Now restore your block select with gv or just select all lines with V and press g<CTRL+a>
<span id="item01"> some first presented text</span>
<span id="item02"> some other text, presented second</span>
<span id="item03"> more text</span>
....
<span id="item015"> last bit of text.</span>
Unfortunately, one last cleanup is needed here. As you can see, the two-digit numbers get an extra 0 in front. Use visual block mode <CTRL+v> again to select and remove the unwanted zeros.
<span id="item01"> some first presented text</span>
<span id="item02"> some other text, presented second</span>
<span id="item03"> more text</span>
....
<span id="item15"> last bit of text.</span>
Now you are done :)
If you have vim with perl (many distributions have that by default), you can
use :perldo commands to do it. (@Marth's solution is better.)
:perldo s/(?<=item)(\d+)/15 - $1/e
You might want to take a look at the VisIncr plugin. It adds support for increasing / decreasing columns of numbers, dates, and day names, in various formats. Quite handy when you have to deal with this kind of thing.
In textmate-1.5 I can use the regex syntax (.*) to find both lines in the below use case:
<span class="class1"></span>
<span class="class2"></span>
Now I want to append more code to each of them, so my find query is span class="(.*)" and my replace query is span class="(.*)" aria-hidden="true", which I had hoped would result in this:
<span class="class1" aria-hidden="true"></span>
<span class="class2" aria-hidden="true"></span>
but it actually resulted in this:
<span class="(.*)" aria-hidden="true"></span>
<span class="(.*)" aria-hidden="true"></span>
Using find/replace (not using column selection which would work for this example but not for the actual situation) is it possible to maintain the area matched by regex in the replace action with a representative wild character or something?
Change your replace query to:
span class="$1" aria-hidden="true"
$1 refers to the characters captured by group 1.
(<span class="[^"]*")
Try this. Replace with $1 aria-hidden="true". See demo.
http://regex101.com/r/wQ1oW3/22
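The behaviour of the group reference can be checked outside the editor too. Here is a minimal sketch in Python, where the reference is written \1 instead of TextMate's $1; the variable names are made up for the example.

```python
import re

lines = [
    '<span class="class1"></span>',
    '<span class="class2"></span>',
]

# [^"]* keeps the group inside the quotes; \1 in the replacement
# puts the captured class name back.
pattern = re.compile(r'span class="([^"]*)"')
result = [pattern.sub(r'span class="\1" aria-hidden="true"', line)
          for line in lines]

for line in result:
    print(line)
```

Writing (.*) in the replacement, as in the question, is taken literally; only the group reference carries the matched text over.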
I have the following line, and I want to add brackets before and after its text:
from:
<span class="Footnote"> Matt. xx. 19.</span>
or:
<span class="Footnote"> 1 Thess. i. 7.</span>
and other verse references (in other words, anything in between those > and <)
to:
<span class="Footnote"> (Matt. xx. 19.)</span>
and so on (it takes anything in between those > and < and adds ( and ) around it).
p.s. I use notepad++ to search and replace..
edit:
The first 3 replies work great, even for text that is not in the verse format, which is helpful. However, I noticed some cases in the code that don't get changed, for example when there are tags in between, like:
<span class="Footnote"> [See <i>Dan</i>, note 12, p. 26, <i>infra</i>. “Eternal” ="long.”]</span>
or when the code is split across more than one line, like:
<span class="Footnote"> some text
more text
</span>
Thanks in advance,
Find what:
Footnote">\s*([^>]+)\s*<
Replace with:
Footnote">(\1)<
Search for
<span class="Footnote">\s*([^<>]*?)\s*</span>
and replace with
<span class="Footnote">(\1)</span>
This changes
<span class="Footnote"> Matt. xx. 19. </span>
into
<span class="Footnote">(Matt. xx. 19.)</span>
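The two cases from the edit (inline tags inside the footnote, and a footnote split across lines) are not covered by [^<>]*. One possible approach, sketched here in Python rather than Notepad++, is a non-greedy .*? with "dot matches newline" turned on. The samples are shortened from the question and the variable names are hypothetical.

```python
import re

samples = [
    '<span class="Footnote"> Matt. xx. 19.</span>',
    '<span class="Footnote"> [See <i>Dan</i>, note 12, p. 26, <i>infra</i>.]</span>',
    '<span class="Footnote"> some text\nmore text\n</span>',
]

# .*? stops at the first </span>; re.DOTALL lets . cross line breaks.
# The \s* on both sides keeps surrounding whitespace out of the group.
pattern = re.compile(r'(<span class="Footnote">)\s*(.*?)\s*(</span>)',
                     re.DOTALL)
wrapped = [pattern.sub(r'\1(\2)\3', s) for s in samples]

for line in wrapped:
    print(line)
```

This assumes a footnote never contains another span, since the match ends at the first </span> it finds. In Notepad++ the ". matches newline" option should play the role of re.DOTALL.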
Try this (couldn't test it, my family wants me to close the computer at Christmas breakfast):
$result = preg_replace('~(Footnote">)\s*([^<>]*?)\s*(</span>)~i', '$1($2)$3', $subject);