Hi, I'm trying to extract the date from this content:
<div class="article-meta">
<h1>Kelkraščiu ir prieš eismą</h1>
<div class="clear"></div>
<ul>
<li>
<strong>Publikuota:</strong>
2012 spalio 8d.
</li>
<li>
<strong>Autorius:</strong>
Vardas, Pavardė
</li>
<li>
<strong>Rubrika:</strong>
Fotopolicija
</li>
</ul>
<div class="clear"></div>
</div>
I need to get the value 2012 spalio 8d. into a variable.
I tried preg_match, but I don't know how to complete the pattern.
Can someone help me?
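Try this:
// Capture the year, month name and day, including the trailing "d.".
// The u flag lets \p{L} match non-ASCII letters in month names.
preg_match_all('/Publikuota:<\/[^>]+>\s*(\d+\s+\p{L}+\s+\d+\s*d\.)/iu', $html, $out);
print_r($out[1]);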
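For comparison, the same extraction can be sketched in JavaScript. This is a minimal sketch, not the answerer's code: it assumes the date is the only text between the closing </strong> tag and the next tag, and the names html, datePattern and extractPublishedDate are made up for illustration.

```javascript
// Hypothetical sample: the relevant fragment of the page from the question.
const html = `<li>
<strong>Publikuota:</strong>
2012 spalio 8d.
</li>`;

// After the closing </strong>, skip whitespace and capture everything up to
// the next tag, so month names with non-ASCII letters are captured as well.
const datePattern = /Publikuota:<\/strong>\s*([^<]+)/i;

function extractPublishedDate(source) {
  const match = datePattern.exec(source);
  // trim() drops the newline that precedes the closing </li>.
  return match ? match[1].trim() : null;
}

const publishedDate = extractPublishedDate(html);
console.log(publishedDate); // 2012 spalio 8d.
```

Capturing "everything up to the next tag" avoids hard-coding the date format, at the cost of also accepting any other text that might appear in that position.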
Hi all!
As the title suggests, I wonder whether deep linking can work with nested tabs.
With a single tabs container it works fine, but I don't know how to deep link to a tab inside a parent tab. It should first open the parent tab and then display the target child tab.
Is this possible with Foundation 6 deep linking (without hacks)?
Let's say we have this code:
<div>
<ul class="tabs" data-tabs id="tabs" data-deep-link="true">
<li class="tabs-title is-active">Content 1</li>
<li class="tabs-title">Content 2</li>
</ul>
</div>
<div class="tabs-content" data-tabs-content="tabs">
<div class="tabs-panel is-active" id="tab1">
My Content 1
</div>
<div class="tabs-panel" id="tab2">
<ul class="tabs" data-tabs id="tab2-tabs" data-deep-link="true">
<li class="tabs-title is-active">Content 2-1</li>
<li class="tabs-title">Content 2-2</li>
</ul>
<div class="tabs-content" data-tabs-content="tab2-tabs">
<div class="tabs-panel is-active" id="tab2-tab1">
My Content 2-1
</div>
<div class="tabs-panel" id="tab2-tab2">
My Content 2-2
</div>
</div>
</div>
</div>
How can I open "My Content 2-2" using deep linking?
<ul class="products-grid">
<li class="item">
<div class="product-block">
<div class="product-block-inner">
<img src="#/producta.jpg">
<h2 class="product-name">Product A</h2>
<div class="price-box">
<span class="regular-price" id="#">
<span class="price">Rs 1,849</span>
</span>
</div>
</div>
</div>
</li>
<li class="item">
<div class="product-block">
<div class="product-block-inner">
<img src="#/productb.jpg">
<h2 class="product-name">Product B</h2>
<div class="price-box">
<span class="regular-price" id="#">
<span class="price">Rs 1,849</span>
</span>
</div>
</div>
</div>
</li>
</ul>
I am currently scraping the items in a loop.
products = response.xpath('//ul[@class="products-grid"]//li//div[@class="product-block"]//div[@class="product-block-inner"]').extract()
After getting the product-block-inner nodes, I save them into products and then loop over them like this:
for product in products:
    # parse div.product-block-inner further
    # to get the name, price, image, etc.,
    # save them to a dict and yield it
    pass
Is it possible to get the text and href for every div.product-block-inner in the final list without looping?
Yes, but the result is confusing to work with, because it is a single flat list that mixes URLs, names and prices. For example, you could try this:
products = response.xpath(
    '//ul[@class="products-grid"]//li//div[@class="product-block"]//div[@class="product-block-inner"]'
).css(
    '.product-name a::attr(href), .product-name a::text, .price::text'
).extract()
However, I would suggest always looping (by the way, why do you call extract() when you assign the result to products?):
products = response.xpath(
    '//ul[@class="products-grid"]//li//div[@class="product-block"]//div[@class="product-block-inner"]'
)
for product in products:
    # Query each product node separately so the fields stay grouped per product.
    yield {
        'name': product.css('.product-name a::text').extract_first(),
        'url': product.css('.product-name a::attr(href)').extract_first(),
        'price': product.css('.price::text').extract_first(),
    }
(I've used CSS selectors here because the equivalent XPath expressions are longer, but the same can be achieved with XPath.)
I would like to get
PA-1400-11PA ADP-40PH ABA
Here is the HTML code:
</div>
<div class="ref">
<h2 id='affiche_sous_titre'>eee :</h2> <p>
<a href='eee' title='PA-1400-11PA' class='lien_menu'>PA-1400-11PA</a> - <a href='uuu' title='ADP-40PH ABA' class='lien_menu'>ADP-40PH ABA</a> </p>
</div>
<div class="modele_tout">
</div>
<div class="star-customer">
Here is my regex code:
line=line.replace(/[\"\']lien_menu[\"\']>(.*?)<\/a>/ig,"$1\n")
But I only get
ADP-40PH ABA
What is the problem? I don't understand.
Thanks for your help.
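The cause of that output cannot be determined from the fragment shown. One thing worth noting, though, is that replace() only rewrites the matched parts and leaves the surrounding markup in the string, so collecting the captured groups may be closer to the goal. A minimal sketch, assuming the whole anchor line is available as one string (the names line, linkPattern and extractLinkTexts are illustrative, and the sample is copied from the HTML above):

```javascript
// The line from the question that holds both links.
const line = `<a href='eee' title='PA-1400-11PA' class='lien_menu'>PA-1400-11PA</a> - <a href='uuu' title='ADP-40PH ABA' class='lien_menu'>ADP-40PH ABA</a> </p>`;

// Same pattern as in the question; the g flag is required by matchAll().
const linkPattern = /["']lien_menu["']>(.*?)<\/a>/gi;

function extractLinkTexts(source) {
  // matchAll() yields one match object per link; index 1 is the captured text.
  return Array.from(source.matchAll(linkPattern), (m) => m[1]);
}

const models = extractLinkTexts(line);
console.log(models.join(' ')); // PA-1400-11PA ADP-40PH ABA
```

If the input is processed line by line and the two links sit on different lines, the function would need to be applied to each line and the results concatenated.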
I am trying to choose an element ("Classic") from a dynamic dropdown list. The problem is that the word Classic appears in two elements.
The HTML page is:
<ul id="dynamic-14" class="results" role="list">
<li class="results-dept result">
<div id="dynamic-102" class="results" role="option">
<span class="match"/>
</div>
</li>
<li class="results-dept result">
<div id="dynamic-12" class="results" role="option">
<span class="match"/>
Classic
</div>
</li>
<li class="results-dept result">
<div id="dynamic-1022" class="results" role="option">
<span class="match"/>
Classic numbers
</div>
</li>
I tried to do it with XPath using:
//ul[@class="results"]//div[contains(.,'Classic')]
but it returns two elements, so Robot Framework can't select the one I need.
Use the normalize-space() function to strip the leading and trailing whitespace, and compare the whole text for equality instead of using contains():
//ul[@class="results"]//div[normalize-space(.)='Classic']
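The difference between the two predicates can be illustrated outside XPath. The sketch below is only an approximation written in plain JavaScript: normalizeSpace is a hypothetical helper that mimics normalize-space() (JavaScript's trim() treats a few more characters as whitespace than XPath does), and optionTexts is an assumed rendering of the text inside the three option elements.

```javascript
// Hypothetical text contents of the three option <div>s from the question.
const optionTexts = ['\n', '\n  Classic\n', '\n  Classic numbers\n'];

// Rough equivalent of XPath normalize-space(): trim both ends and collapse
// internal runs of whitespace into a single space.
function normalizeSpace(text) {
  return text.trim().replace(/\s+/g, ' ');
}

// Like contains(., 'Classic'): a substring test, so "Classic numbers" matches too.
const containsMatches = optionTexts.filter((t) => t.includes('Classic'));

// Like normalize-space(.) = 'Classic': the whole cleaned-up text must be equal.
const exactMatches = optionTexts.filter((t) => normalizeSpace(t) === 'Classic');

console.log(containsMatches.length); // 2
console.log(exactMatches.length); // 1
```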
I have HTML content like this:
<div class='desc_html_aff'>
sdsdfdfdgdsg
<ul class="pi_ul">
<li>abc</li>
<li>def </li>
<li>ererefe </li>
</ul>
wfwfwsfgdhfhfhdf
dgdfhfj
</div>
I would like to replace the <ul> block with '!!'.
Here is the jQuery code, which doesn't work:
var desc_aff=$('.desc_html_aff').html().replace(/<ul(.*?)>(.*?)<\/ul>/gi,"!!")
I really cannot see what I'm doing wrong. Any ideas?
Please help, thanks.
The HTML string you want to transform contains newline characters, and . does not match a newline, so .*? can never reach the closing </ul> on a later line. You can use [\s\S]*? instead:
var desc_aff=$('.desc_html_aff').html().replace(/<ul(.*?)>([\s\S]*?)<\/ul>/gi,"!!");
console.log(desc_aff);
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
<div class='desc_html_aff'>
sdsdfdfdgdsg
<ul class="pi_ul">
<li>abc</li>
<li>def </li>
<li>ererefe </li>
</ul>
wfwfwsfgdhfhfhdf
dgdfhfj
</div>
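The behaviour can also be checked without jQuery or a browser. The following is a minimal sketch that runs the patterns against a plain string; the variable names are illustrative, and the third variant assumes a runtime that supports the s (dotAll) flag introduced in ES2018.

```javascript
// The inner HTML from the question as a plain multi-line string.
const html = `sdsdfdfdgdsg
<ul class="pi_ul">
<li>abc</li>
<li>def </li>
<li>ererefe </li>
</ul>
wfwfwsfgdhfhfhdf`;

// Original pattern: "." stops at newlines, so nothing matches and the
// string comes back unchanged.
const dotOnly = html.replace(/<ul(.*?)>(.*?)<\/ul>/gi, '!!');

// [\s\S] matches any character, including newlines.
const anyChar = html.replace(/<ul(.*?)>([\s\S]*?)<\/ul>/gi, '!!');

// Alternative: the s flag makes "." match newlines as well.
const dotAll = html.replace(/<ul(.*?)>(.*?)<\/ul>/gis, '!!');

console.log(dotOnly === html); // true
console.log(anyChar);
console.log(dotAll === anyChar); // true
```

Both working variants keep the quantifiers lazy, so only the first closing </ul> after each opening tag is consumed.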