Replacing HTML text elements with increment variable - regex

In the HTML below, I want to replace each piece of link text with an incrementing counter:
<li class="cat-item">
<a href="#" >Beautiful Reclessness</a>
</li>
<li class="cat-item">
<a href="#" >Comfort vs. Appearance</a>
</li>
<li class="cat-item">
<a href="#" >Highlights of the Runway</a>
<ul class='children'>
<li class="cat-item">
<a href="#" >Christian Louboutin Show</a>
</li>
<li class="cat-item">
<a href="#" >Givenchy F/W 2016</a>
</li>
<li class="cat-item">
<a href="#" >Spring by Gaultier</a>
To this using the x++ increment:
<li class="cat-item">
<a href="#" >x1</a>
</li>
<li class="cat-item">
<a href="#" >x2</a>
</li>
<li class="cat-item">
<a href="#" >x3</a>
<ul class='children'>
<li class="cat-item">
<a href="#" >x4</a>
</li>
<li class="cat-item">
<a href="#" >x5</a>
</li>
<li class="cat-item">
<a href="#" >x6</a>
Is there a way in Notepad++ or Vim to find the text contents (whatever sits between > and <) using a regex and replace them with an x counter?

Simple vim answer:
Open the file: vim filename
Set up a convenience variable: :let num=1
Do the replacement: :g/href/execute printf("normal! citx%d", num) | let num=num+1
The :global command lets you perform an operation on all lines matching a pattern (in this case, href). The operation we want is to change the text inside the <a> tag to x followed by the contents of num, and then increment num.
execute lets us build a command line from strings; I often combine with printf() because I find it easier to read. normal! is an Ex-command that lets us execute normal-mode commands. cit is a vim'ism for "change inside tag" from normal mode. Then we just feed it the appropriate replacement text (x%d) and increment the counter.
If you're wondering how I came up with this, it's a pretty well-established pattern among vimmers. In practice, it took me probably about a minute to get the whole sequence done (faster if I used it more often), so it isn't one of those "spend 30 minutes trying to write a good regex" answers—this can be done in a live editing session without too much thought, if the person editing has a good grasp of vim fundamentals.

Hope that helps.
Download python script plugin
plugins > python script > new script > save as "increment.py"
Develop your regex at regex101 or somewhere else and write the script
i = 0
def increment(match):
    # Called once per regex match; returns the replacement text.
    global i
    i = i + 1
    return "x" + str(i)
editor.rereplace('(?<=>)\\b[^><]+', increment)
Save and run your script: plugins > python script > scripts > increment
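For anyone who wants to try the same callback idea outside Notepad++, here is a minimal sketch in plain Python using re.sub. It is not the PythonScript API: the function name number_link_texts and the regex are assumptions made for illustration, and the pattern only handles simple <a>...</a> elements with no nested tags.

```python
import itertools
import re

# Matches an opening <a ...> tag, its plain-text content, and the closing tag.
# Assumption: the link text contains no nested markup.
LINK_TEXT = re.compile(r"(<a\b[^>]*>)([^<]+)(</a>)")


def number_link_texts(html, prefix="x"):
    """Replace the text of each <a> element with prefix + running number."""
    counter = itertools.count(1)

    def repl(match):
        # Keep both tags untouched and swap only the text between them.
        return match.group(1) + prefix + str(next(counter)) + match.group(3)

    return LINK_TEXT.sub(repl, html)


sample = (
    '<li class="cat-item">\n'
    '<a href="#" >Beautiful Reclessness</a>\n'
    '</li>\n'
    '<li class="cat-item">\n'
    '<a href="#" >Comfort vs. Appearance</a>\n'
    '</li>'
)
result = number_link_texts(sample)
print(result)
```

Using a local counter instead of a global keeps the numbering from carrying over if the function is called on a second file.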

A slightly different approach on vim:
:let c=1 | g/a href="#" >\zs.*\ze</ s//\='x'.c/g | let c+=1
Using \zs and \ze we restrict the match to the text we want to replace. The
replacement expression then produces the numbered sequence:
\='x'.c ................. concatenate 'x' with the counter

Related

How can I click on numeric buttons (onclick event) using Selenium in Python?

I have a HTML code like this:
<ul aria-hidden="false" aria-labelledby="resultsPerPage-button" id="resultsPerPage-menu" role="listbox" tabindex="0" class="ui-menu ui-corner-bottom ui-widget ui-widget-content" aria-activedescendant="ui-id-2" aria-disabled="false" style="width: 71px;">
<li class="ui-menu-item">
<div id="ui-id-1" tabindex="-1" role="option" class="ui-menu-item-wrapper">20</div>
</li>
<li class="ui-menu-item"><div id="ui-id-2" tabindex="-1" role="option" class="ui-menu-item-wrapper ui-state-active">50</div>
</li>
<li class="ui-menu-item"><div id="ui-id-3" tabindex="-1" role="option" class="ui-menu-item-wrapper">100</div>
</li>
<li class="ui-menu-item"><div id="ui-id-4" tabindex="-1" role="option" class="ui-menu-item-wrapper">200</div>
</li>
</ul>
I want to click on "200". Can you help me? I am using Selenium with Python 2.7.
I tried doing this:
import time
time.sleep(10)
x=driver.find_element_by_link_text("200").click()
x.click()
time.sleep(8)
The problem here is that the element containing the text 200 is not a link. It is a div inside an li element that the site has made clickable.
The documentation doesn't say so directly, but "link" means a tags only.
The idea is the same, but you'll have to locate the element some other way than by link text. I think XPath is the best fit for this approach:
x = driver.find_element_by_xpath("//div[./text()='200']")
x.click()
That works for finding an element by the text it contains, but to find this specific node it is easier and more robust to use its id, since an id should always be unique:
x = driver.find_element_by_id('ui-id-4')
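The three locator strategies can be compared offline, without a browser. The sketch below is an illustration only: it uses Python's standard xml.etree.ElementTree (which supports a small XPath subset) on a trimmed copy of the menu markup, not Selenium itself, so the expressions are slightly different from the ones passed to the driver.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the menu markup from the question.
MENU = """
<ul id="resultsPerPage-menu" role="listbox">
  <li class="ui-menu-item"><div id="ui-id-1" role="option">20</div></li>
  <li class="ui-menu-item"><div id="ui-id-2" role="option">50</div></li>
  <li class="ui-menu-item"><div id="ui-id-3" role="option">100</div></li>
  <li class="ui-menu-item"><div id="ui-id-4" role="option">200</div></li>
</ul>
"""

root = ET.fromstring(MENU)

# Locate by text content: any div whose full text is exactly "200".
by_text = root.findall(".//div[.='200']")

# Locate by id attribute.
by_id = root.findall(".//div[@id='ui-id-4']")

# "Link text" only considers <a> elements, and the menu contains none.
by_link = [a for a in root.iter("a") if a.text == "200"]

print(len(by_text), len(by_id), len(by_link))
```

The text and id lookups both land on the same div, while the link-text style lookup finds nothing, which is why the original attempt failed.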
I got it working using send_keys:
import time
number.click()
number.send_keys("200")
var200=driver.find_element_by_xpath("""//*[@id="ui-id-4"]""")
var200.click()

RIDE Robot framework Select from dynamic list

I am trying to choose an element ("Classic") from a dynamic dropdown list. The problem is that the word Classic matches 2 elements.
Html page is:
<ul id="dynamic-14" class="results" role="list">
<li class="results-dept result">
<div dynamic-102" class="results" role="option">
<span class="match"/>
</div>
</li>
<li class="results-dept result">
<div dynamic-12" class="results" role="option">
<span class="match"/>
Classic
</div>
</li>
<li class="results-dept result">
<div dynamic-1022" class="results" role="option">
<span class="match"/>
Classic numbers
</div>
</li>
I tried to do it with xpath using:
//ul[@class="results"] //div[contains(.,'Classic')]
but it returns 2 elements, so Robot Framework can't pick the one I need.
Use the normalize-space() function to get rid of the leading and trailing whitespace, then compare for equality instead of using contains():
//ul[@class="results"] //div[normalize-space(.)='Classic']
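To see why contains() matches twice while a normalized equality test matches once, here is a small Python sketch. It does not run XPath; it mimics the two predicates on a cleaned-up copy of the markup. The ids opt-a and opt-b are hypothetical (the ids in the question are garbled), and normalize_space is a plain-Python approximation of the XPath function.

```python
import xml.etree.ElementTree as ET

# Cleaned-up copy of the dropdown markup; the ids are made up for this sketch.
RESULTS = """
<ul id="dynamic-14" class="results" role="list">
  <li class="results-dept result">
    <div id="opt-a" class="results" role="option">
      <span class="match"/>
      Classic
    </div>
  </li>
  <li class="results-dept result">
    <div id="opt-b" class="results" role="option">
      <span class="match"/>
      Classic numbers
    </div>
  </li>
</ul>
"""


def full_text(element):
    """String value of an element: all descendant text joined together."""
    return "".join(element.itertext())


def normalize_space(text):
    """Approximate XPath normalize-space(): trim and collapse whitespace."""
    return " ".join(text.split())


divs = list(ET.fromstring(RESULTS).iter("div"))

# contains(., 'Classic') is a substring test, so both options match.
contains_matches = [d.get("id") for d in divs if "Classic" in full_text(d)]

# normalize-space(.) = 'Classic' is an exact comparison after trimming.
exact_matches = [
    d.get("id") for d in divs if normalize_space(full_text(d)) == "Classic"
]

print(contains_matches, exact_matches)
```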

How can I extract URLs from html content with ruby regexp?

Let's go straight to an example, since it is not easy to explain:
<li id="l_f6a1ok3n4d4p" class="online"> <div class="link"> random strings - 4 <a style="float:left; display:block; padding-top:3px;" href="http://www.webtrackerplus.com/?page=flowplayerregister&a_aid=&a_bid=&chan=flow"><img border="0" src="/resources/img/fdf.gif"></a> <!-- a class="none" href="#">random strings - 4 site2.com - # - </a --> </div> <div class="params"> <span>Submited: </span>7 June 2015 | <span>Host: </span>site2.com </div> <div class="report"> <a title="" href="javascript:report(3191274,%203,%202164691,%201)" class="alert"></a> <a title="" href="javascript:report(3191274,%203,%202164691,%200)" class="work"></a> <b>100% said work</b> </div> <div class="clear"></div> </li> <li id="l_zsgn82c4b96d" class="online"> <div class="link"> <a href="javascript:show('zsgn82c4b96d','random%20strings%204',%20'site1.com');%20" onclick="visited('zsgn82c4b96d');" style
In the above content I want to extract from
javascript:show('f6a1ok3n4d4p','random%20strings%204',%20'site2.com')
the string "f6a1ok3n4d4p" and "site2.com" then make it as
http://site2.com/f6a1ok3n4d4p
and same for
javascript:show('zsgn82c4b96d','random%20strings%204',%20'site1.com')
to become
http://site1.com/zsgn82c4b96d
I need this done with a Ruby regex.
This should give you some insight into how to do it.
https://regex101.com/r/wD4oT8/2
javascript:show\(\'(.*?)'.*?\'([^\']*)\'\) will capture the first argument as $1 and the last quoted argument as $2, so you get what you want by substituting $2/$1.
That's the regex part of it. You can of course adjust the regex as you see fit, for example to also accept " as the quote character, as in javascript:show\((?:\'|\")(.*?)(?:\'|\").*?\'([^\'\"]*)(?:\'|\")\), or to allow only calls with 3 arguments.
/yourregex/.match(yourstring) will extract the information you need.
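As a rough cross-check of the pattern, here is the same extraction sketched in Python rather than Ruby. The function name extract_urls is an assumption for illustration, and the first capture group is tightened to [^']* instead of .*?; the regex itself should carry over to Ruby largely unchanged, for example with String#scan.

```python
import re

# First quoted argument is the id, last quoted argument is the host.
SHOW_CALL = re.compile(r"javascript:show\('([^']*)'.*?'([^']*)'\)")


def extract_urls(html):
    """Build http://host/id for every javascript:show(...) call in html."""
    return [
        "http://{}/{}".format(host, ident)
        for ident, host in SHOW_CALL.findall(html)
    ]


sample = (
    "<a href=\"javascript:show('zsgn82c4b96d',"
    "'random%20strings%204',%20'site1.com');%20\">"
    "<a href=\"javascript:show('f6a1ok3n4d4p',"
    "'random%20strings%204',%20'site2.com')\">"
)
urls = extract_urls(sample)
print(urls)
```

Parsing the HTML first and applying the regex only to href values would be more robust than running it over the raw page, but for a quick extraction the pattern alone is usually enough.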

Zurb-Foundation Tabs with Div Layout

The example from Foundation 3 explains how to set up tabs using lists but how do you use the tabs with a div layout?
<dl class="tabs">
<dd class="active">Simple Tab 1</dd>
<dd>Simple Tab 2</dd>
<dd class="hide-for-small">Simple Tab 3</dd>
</dl>
<ul class="tabs-content">
<li class="active" id="simple1Tab">This is simple tab 1s content. Pretty neat, huh?</li>
<li id="simple2Tab">This is simple tab 2s content. Now you see it!</li>
<li id="simple3Tab">This is simple tab 3s content.</li>
</ul>
<div class="tabs-content">
<div class="active" id="simple1Tab">This is simple tab 1s content. Pretty neat, huh?</div>
<div id="simple2Tab">This is simple tab 2s content. Now you see it!</div>
<div id="simple3Tab">This is simple tab 3s content.</div>
</div>
Put the divs inside the list items:
<ul class="tabs-content">
<li class="active" id="simple1Tab">
<div>This is simple tab 1s content. Pretty neat, huh?</div>
</li>
<li id="simple2Tab">
<div>This is simple tab 2s content.</div>
</li>
<li id="simple3Tab">
<div>This is simple tab 3s content.</div>
</li>
</ul>

Optional param Zend Route Regex

How can I make regex route parameters optional with Zend?
I am trying to produce well-formatted URLs ("search?s=mp&t=w" instead of "search/index/s/mp/t/w") for search filters, e.g.:
Popularity
Most popular (s=mp)
Most viewed (s=mv, default)
Top rated (s=tr)
Most commented (s=mc)
Period
All period (t=a, default)
Today (t=d)
This week (t=w)
This month (t=m)
So, to get all top rated items from today I would have: search?s=tr&t=d
With regex routes I must specify default values, and the problem is that the url view helper generates links with the default values rather than the current values.
Here is my route:
resources.router.routes.search.type = "Zend_Controller_Router_Route_Regex"
resources.router.routes.search.route = "search\?s\=(.+)\&t\=(.+)"
resources.router.routes.search.map.1 = s
resources.router.routes.search.map.2 = t
resources.router.routes.search.defaults.module = front
resources.router.routes.search.defaults.controller = search
resources.router.routes.search.defaults.action = index
resources.router.routes.search.defaults.s = mv
resources.router.routes.search.defaults.t = a
resources.router.routes.search.reverse = "search?s=%s&t=%s"
and the links:
<div class="filters note">
<div class="filters-content">
<h3>Popularity</h3>
<ul class="filters-list">
<li>
<a href="<?=$this->url(array('s' => 'mp'), 'search')?>">
Most popular
</a>
</li>
<li>
<a href="<?=$this->url(array('s' => 'mv'), 'search')?>">
Most viewed
</a>
</li>
<li>
<a href="<?=$this->url(array('s' => 'tr'), 'search')?>">
Top rated
</a>
</li>
<li>
<a href="<?=$this->url(array('s' => 'mc'), 'search')?>">
Most commented
</a>
</li>
</ul>
</div>
</div>
<div class="filters period">
<div class="filters-content">
<h3>Period</h3>
<ul class="filters-list">
<li>
<a href="<?=$this->url(array('t' => 'a'), 'search')?>">
All period
</a>
</li>
<li>
<a href="<?=$this->url(array('t' => 'd'), 'search')?>">
Today
</a>
</li>
<li>
<a href="<?=$this->url(array('t' => 'w'), 'search')?>">
This week
</a>
</li>
<li>
<a href="<?=$this->url(array('t' => 'm'), 'search')?>">
This month
</a>
</li>
</ul>
</div>
</div>
For example, if the current page is "search?s=tr&t=d" and I click on "This week", the link is "search?s=mv&t=w" instead of "search?s=tr&t=w" because of the default values.
I must specify default values or I get an error.
Any ideas?
Thanks,
Benjamin.
I haven't used the regex routes, but I have seen this error. Basically, the defaults.[param] parts need values. In my custom route, I set them to be empty:
; Navigation ID Route (uses navigation id)
resources.router.routes.nav.route = "p/:id/:title/*"
resources.router.routes.nav.defaults.module = "default"
resources.router.routes.nav.defaults.controller = "index"
resources.router.routes.nav.defaults.action = "index"
resources.router.routes.nav.defaults.id =
resources.router.routes.nav.defaults.title =
resources.router.routes.nav.reqs.id = "\d+"
resources.router.routes.nav.reqs.title = ".*"
Are you sure you want to do it this way? A few times I've tried to "fight the framework" but found out the framework knew better (grin). Can I suggest another way, one that Google may like better and that makes your URLs more "people friendly" too? Use URLs like
/top-rated-widgets/today
/most-viewed-widgets/this-month
/most-commented-widgets
where you replace "widgets" with whatever your site is about, e.g. "videos", "blog posts", "unicycles", whatever.
Then each of the above routes to your search controller.
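To make the friendly-URL idea concrete, here is a framework-independent sketch in Python (not Zend code) of how such paths could be translated back into the s and t parameters from the question. The slug tables and the function parse_search_path are assumptions for illustration; in particular the "all-time" slug is made up, since the answer gives no slug for the default period.

```python
# Hypothetical slug tables; the values are the filter codes from the question.
SORT_SLUGS = {
    "most-popular": "mp",
    "most-viewed": "mv",
    "top-rated": "tr",
    "most-commented": "mc",
}
PERIOD_SLUGS = {
    "all-time": "a",  # made-up slug for the default period
    "today": "d",
    "this-week": "w",
    "this-month": "m",
}
DEFAULTS = {"s": "mv", "t": "a"}


def parse_search_path(path, noun="widgets"):
    """Map '/top-rated-widgets/today' to {'s': 'tr', 't': 'd'}.

    Returns None when the path is not a recognised search URL.
    """
    parts = [p for p in path.strip("/").split("/") if p]
    if not parts or len(parts) > 2:
        return None
    suffix = "-" + noun
    if not parts[0].endswith(suffix):
        return None
    sort_slug = parts[0][: -len(suffix)]
    if sort_slug not in SORT_SLUGS:
        return None
    params = dict(DEFAULTS)
    params["s"] = SORT_SLUGS[sort_slug]
    if len(parts) == 2:
        if parts[1] not in PERIOD_SLUGS:
            return None
        params["t"] = PERIOD_SLUGS[parts[1]]
    return params


print(parse_search_path("/top-rated-widgets/today"))
```

With this shape the period segment is optional by construction, so the defaults only apply when a segment is genuinely absent rather than overriding the current values.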