I need help extracting text in this case:
<tr class="detail-middle">
<td colspan="4">
<span class="font-bold">Address</span>
<p>
<strong>Orin Fade</strong>
<br>
19 rue marciere
<br>
Lyon
<br>
Lyon
<br>
France
<br>
Phone Number: +33 0478372730
</p>
</td>
</tr>
I use this iMacros code:
TAG POS=1 TYPE=SPAN ATTR=TXT:"Address" EXTRACT=TXT
But I need the text that comes after the "Address" label. Can iMacros extract the text that follows a given word?
Thank you
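You can use relative positioning in your case. The first TAG command sets the anchor element, and POS=R1 in the second command selects the first P element after that anchor: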
TAG POS=1 TYPE=SPAN ATTR=TXT:"Address"
TAG POS=R1 TYPE=P ATTR=* EXTRACT=TXT
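As a rough analogy for what the two commands do (find an anchor, then take the first matching element after it), here is a minimal Python sketch using only the standard library. It is not how iMacros works internally; the class name AfterAnchor and the shortened sample markup are assumptions made for illustration.

```python
from html.parser import HTMLParser

class AfterAnchor(HTMLParser):
    """Find a <span> whose text equals the anchor, then capture the first <p> after it."""

    def __init__(self, anchor_text):
        super().__init__()
        self.anchor_text = anchor_text
        self.in_span = False
        self.anchored = False  # the anchor span has been seen
        self.in_p = False      # currently inside the target <p>
        self.done = False      # the target <p> has been closed
        self.lines = []

    def handle_starttag(self, tag, attrs):
        if tag == 'span':
            self.in_span = True
        elif tag == 'p' and self.anchored and not self.done:
            self.in_p = True

    def handle_endtag(self, tag):
        if tag == 'span':
            self.in_span = False
        elif tag == 'p' and self.in_p:
            self.in_p = False
            self.done = True

    def handle_data(self, data):
        text = data.strip()
        if self.in_span and text == self.anchor_text:
            self.anchored = True
        elif self.in_p and text:
            self.lines.append(text)

# Shortened copy of the markup from the question (assumed sample input)
html_doc = """
<td colspan="4">
<span class="font-bold">Address</span>
<p>
<strong>Orin Fade</strong>
<br>
19 rue marciere
<br>
Lyon
<br>
France
<br>
Phone Number: +33 0478372730
</p>
</td>
"""

parser = AfterAnchor('Address')
parser.feed(html_doc)
print(parser.lines)
```

Each text fragment of the paragraph ends up as one list item, so the address parts can be joined or picked out individually.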
SITUATION: I am finding it difficult to EXTRACT a specific piece of text from a website.
The template example on the iMacros website (http://wiki.imacros.net/Data_Extraction#Data_Extraction_and_Web_Scraping) for extracting a value with iMacros is as follows:
TAG POS=1 TYPE=SPAN ATTR=CLASS:bdytxt&&TXT:* EXTRACT=HTM
However, in the HTML below, the element containing text1 has no class that I could specify in the ATTR section. I am specifically trying to extract text1 from this example:
//This code is within an html page
<div class="class1">
<img class="class2" src="...">
<strong>
text1
</strong>
<br>
<small>text2</small>
<small class="class3">
<br>
<em>text3:</em>
<span>
<a href="..." class="class4">
<small style="color: #aaa; font-size: 80%">text4</small>
text5
</a>
</span>
<br>
<em>text6</em>
text7,
text8
</small>
</div>
What I have tried:
When I record in "Experimental event recording mode" and click on text1, I get the following code:
EVENT TYPE=CLICK SELECTOR="HTML>BODY>DIV:nth-of-type(5)>DIV>STRONG>A" BUTTON=0
I tested to see if the SELECTOR would work in the EXTRACT code like so:
TAG POS=1 TYPE=SPAN SELECTOR="HTML>BODY>DIV:nth-of-type(5)>DIV>STRONG>A" EXTRACT=TXT
but as you can imagine, it didn't.
QUESTION: Does anyone know how I can extract text1 from the above situation?
Well, there can be several ways to extract this text. For example:
TAG POS=1 TYPE=IMG ATTR=CLASS:"class2"
TAG POS=R1 TYPE=A ATTR=* EXTRACT=TXT
Or, if you use iMacros for Chrome, here is a solution that uses a selector:
TAG SELECTOR="HTML>BODY>DIV:nth-of-type(5)>DIV>STRONG>A" EXTRACT=TXT
In an iMacros script, how can you trigger a click on a link with a specific attribute? In this case, the link I would like to have clicked has a class of "i-project":
<div data-explore-index="1" >
<div class="i-project-card ">
<a href="/xxxxxxxxxxxxxxxxxx" ">
<span ></span>
</a>
<a href="blablabla" class="i-project">
<img src="https://blabla.jpg">
</a>
</div>
</div>
You should be able to select this link based upon its CLASS attribute:
TAG POS=1 TYPE=A ATTR=CLASS:i-project
I am crawling a webpage with BeautifulSoup. I want to skip the content of one particular tag and get the content of the other tags. In the HTML below, I don't want the contents of the div tag, but I couldn't work out how to exclude them. Please help me.
HTML code:
<blockquote class="messagetext">
<div style="margin: 5px; float: right;">
unwanted text .....
</div>
Text..............
<a class="externalLink" rel="nofollow" target="_blank" href="#">text </a>
<a class="externalLink" rel="nofollow" target="_blank" href="#">text</a>
<a class="externalLink" rel="nofollow" target="_blank" href="#">text</a>
,text
</blockquote>
I have tried this:
content = soup.find('blockquote',attrs={'class':'messagetext'}).text
But it also fetches the unwanted text inside the div tag.
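Use the clear() method like this:
from bs4 import BeautifulSoup

# html_doc holds the page source shown in the question
soup = BeautifulSoup(html_doc, 'html.parser')
content = soup.find('blockquote', attrs={'class': 'messagetext'})
# Empty every nested div so its text is excluded from content.text
for tag in content.findChildren():
    if tag.name == 'div':
        tag.clear()
print(content.text)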
This yields:
Text..............
text
text
text
,text
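If BeautifulSoup is not available, the same idea (collect the blockquote's text while ignoring anything inside a nested div) can be sketched with the standard library's html.parser module. This is an illustrative alternative, not the method from the answer above; the class name SkipDivText and the shortened sample html_doc are assumptions.

```python
from html.parser import HTMLParser

class SkipDivText(HTMLParser):
    """Collect text inside <blockquote class="messagetext">, ignoring nested <div>s."""

    def __init__(self):
        super().__init__()
        self.in_quote = False  # inside the target blockquote
        self.div_depth = 0     # nesting depth of divs inside the blockquote
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == 'blockquote' and ('class', 'messagetext') in attrs:
            self.in_quote = True
        elif tag == 'div' and self.in_quote:
            self.div_depth += 1

    def handle_endtag(self, tag):
        if tag == 'blockquote':
            self.in_quote = False
        elif tag == 'div' and self.in_quote and self.div_depth:
            self.div_depth -= 1

    def handle_data(self, data):
        # Keep text only when it is in the blockquote and outside every div
        if self.in_quote and self.div_depth == 0 and data.strip():
            self.parts.append(data.strip())

# Shortened copy of the markup from the question (assumed sample input)
html_doc = """
<blockquote class="messagetext">
<div style="margin: 5px; float: right;">
unwanted text .....
</div>
Text..............
<a class="externalLink" href="#">text </a>
<a class="externalLink" href="#">text</a>
,text
</blockquote>
"""

parser = SkipDivText()
parser.feed(html_doc)
content = ' '.join(parser.parts)
print(content)
```

Tracking a depth counter rather than a simple flag keeps the filter correct if one unwanted div contains another.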
I am not an expert on the iMacros SEARCH SOURCE command. I am trying to find some text in the page source so that I can extract it:
<div id='keywordsDiv' name='keywordsDiv' class='r-sidebar'>
<dl class="list normal-text">
<dt class="key">Category</dt>
<dd class="value"><a class="black" href="http://www.abcd">abcd</a> </dd>
<dt class="key">Style</dt>
<dd class="value"><a class="black" href="http://www.def.com/">def</a> </dd>
<dt class="key">Location</dt>
<dd class="value"><a class="black" href="http://www.ghi.com/">GHI</a> </dd>
<dt class="key">Keywords</dt>
<dd class="value">
</dd>
</dl>
</div>
How can I extract the text inside the div with id=keywordsDiv from the page source?
Thank you
I've used the SEARCH command. It accepts a regular expression and has worked well for me when searching source code. It can be really powerful for automating dynamic pages.
Here is a link:
http://wiki.imacros.net/SEARCH
*Note: I've run into issues with complex regexes. I think there are a few flavors of regex and iMacros uses a specific one, and there are also some regex limitations.
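Because SEARCH matches a regular expression against the raw page source, it can help to prototype the pattern outside iMacros first. The following is a minimal Python sketch that pulls each key/value pair out of the keywordsDiv markup from the question. The pattern and variable names are illustrative assumptions, and since regex flavors differ, a pattern that works with Python's re module may need adjusting before it is used in a SEARCH command.

```python
import re

# Copy of the markup from the question (assumed sample input)
source = """<div id='keywordsDiv' name='keywordsDiv' class='r-sidebar'>
<dl class="list normal-text">
<dt class="key">Category</dt>
<dd class="value"><a class="black" href="http://www.abcd">abcd</a> </dd>
<dt class="key">Style</dt>
<dd class="value"><a class="black" href="http://www.def.com/">def</a> </dd>
<dt class="key">Location</dt>
<dd class="value"><a class="black" href="http://www.ghi.com/">GHI</a> </dd>
<dt class="key">Keywords</dt>
<dd class="value">
</dd>
</dl>
</div>"""

# Group 1: the key in <dt>; group 2: the link text in the following <dd>, if any
pattern = re.compile(
    r'<dt class="key">([^<]+)</dt>\s*'
    r'<dd class="value">\s*(?:<a[^>]*>([^<]*)</a>)?'
)

# findall returns '' for an optional group that did not take part in the match
pairs = {key: value for key, value in pattern.findall(source)}
print(pairs)
```

Keys whose dd element is empty, such as Keywords here, map to an empty string.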
TAG POS=1 TYPE=DIV ATTR=ID:keywordsDiv EXTRACT=TXT
Try this.
I have been using iMacros to enter multiple tags when posting photos, to save time. The site was recently updated, and I cannot figure out how to get iMacros to enter multiple tags.
When I record a macro, this is the code iMacros comes up with:
TAG POS=11 TYPE=INPUT:TEXT FORM=NAME:NoFormName ATTR=* CONTENT=foo,
The trailing comma is needed to start a new tag, but the macro neither starts a new tag nor records the content correctly.
I have looked at the markup where the tags are entered, and this is it:
<section class="tag_editor" style="display: block;">
<div class="tags">
<input class="post_tags" type="text" value="" style="display: none;" name="post[tags]">
<div class="editor_wrapper">
<input class="editor borderless" type="text">
</div>
</div>
</section>
It looks like the input I need is the one with class "editor borderless" inside editor_wrapper, and I think it needs to be referenced in place of FORM=NAME:NoFormName and ATTR=* in the iMacros TAG command. I have tried different combinations, yet iMacros will not autofill the tags for me. The new post feature on Tumblr opens in a pop-up window that appears to be loaded via AJAX.
The old macro that worked before the site update looked like this. Not sure if it will be of any help.
TAG POS=1 TYPE=INPUT:TEXT FORM=ACTION:/blog/foobar/new/photo ATTR=ID:tag_editor_input CONTENT=foo,
WAIT SECONDS=.3
TAG POS=1 TYPE=INPUT:TEXT FORM=ACTION:/blog/foobar/new/photo ATTR=ID:tag_editor_input CONTENT=bar,
WAIT SECONDS=.3
TAG POS=1 TYPE=INPUT:TEXT FORM=ACTION:/blog/foobar/new/photo ATTR=ID:tag_editor_input CONTENT=foo,
Looking for help getting this to work again. It saves me a ton of time to have these tags auto filled for me.
Try this to fill the second input:
TAG POS=1 TYPE=INPUT:TEXT ATTR=CLASS:editor* CONTENT=foo,
Let me know if it works.