in txt file:
<img src= {attach1}></img>
<img src= {attach2}></img>
string.Replace("{attach1}",picture1.jpeg)
string.Replace("{attach2}",picture2.jpeg)
output: two attachment should appear side by side
Related
I have a requirement where I need to modify html 'img' tags in an html string that do not end with a '/>'
ex: <img src=""> needs to be changed to <img src=""/>
I am using following regex: <img(.*[^/])> to replace with <img$1/>
This works fine however for cases like: <center><img src=""/></center> the regex returns: <center><img src=""></center/>
Any suggestions how to impact this regex only upto the end of the img tag? Thanks.
You may use this:
<\s*img\s+([^>]*=(?:\".*?\"|\'.*?\'))[\s\w\-]*>
with following replace by:
<img $1/>
this will match these simple and complex cases:
<img src="images/a.jpg" title="test"><br/>
<img src="a/b.jpg" >
<span><img src="a.jpg"></span>
<img src="" title="">
<img src="" data-val>
<img src="a.jpg" title="a'>b">
<img src="a.jpg" title='a">b'>
<img src="a.jpg" title='a>=b"=>' >
but not following:
<img src="a.jpg" />
<imgXTag src="b.jpg" >
<img src="a.jpg" / >
Sample Demo
I would like to know how I can use a regex for removing everything EXCEPT all image tags.
I already tried these:
(?s)^[^<](.) -> removes all the text before the img tag
(?s)^([^>]+>).* -> removes all the text after the img tag
Does anybody know how to combine these 2 for multiple images?
Here is an example of the content I want to apply it to:
Text text text. <img alt="alt text" src="path/to-image.png" />Text text text. <img alt="alt text" src="path/to-image.png" />Text text text. <img alt="alt text" src="path/to-image.png" />Text text text. <img alt="alt text" src="path/to-image.png" />
My desired result should be:
<img alt="alt text" src="path/to-image.png" /><img alt="alt text" src="path/to-image.png" /><img alt="alt text" src="path/to-image.png" /><img alt="alt text" src="path/to-image.png" />
Your example expressions do not work for me. However, turn "removing everything except all image tags" into "extract only image tags" and you can easily get what you want:
<img [^>]*\/> <!-- EDIT: XHTML only -->
<img [^>]*\/?> <!-- covers HTML and XHTML -->
Try: http://www.regexr.com/3eq09
I use photobucket to host my imagery for my ebay ads when I sell things, so I copy the html out of photobucket into notepad, and I'm always left the <img> tag being wrapped in photobucket's <a> tag, and I have to go through each line and manually delete each <a></a>, which on 26 lines across multiple items can soon equate too hundreds of "highlight and delete" actions.
I already do a search for the closing tag </a> and just do a "replace" with nothing, thus removing it, but the string I cannot fathom to remove, due to the image file name being different on every line is as the following example demonstrates:
So it's essentially the section of the anchor tag up to and including the > I need to be able to remove on a mass scale - Any help would be greatly appreciated!
<img src="http://i1297.photobucket.com/albums/ag35/eye/Programmes/Yes%20joblot/DSC02424c_zpslt9m0cuu.jpg" border="0" alt=" photo DSC05653_zpslt9m0cuu.jpg"/>
<img src="http://i1297.photobucket.com/albums/ag35/eye/Programmes/Yes%20joblot/DSC04444_zpspkgjw6vf.jpg" border="0" alt=" photo DSC05654_zpspkgjw6vf.jpg"/>
<img src="http://i1297.photobucket.com/albums/ag35/eye/Programmes/Yes%20joblot/DSC05655_zpsxuev7czs.jpg" border="0" alt=" photo DSC05655_zpsxuev7czs.jpg"/>
<img src="http://i1297.photobucket.com/albums/ag35/eye/Programmes/Yes%20joblot/DSC06624_zpsifjidypy.jpg" border="0" alt=" photo DSC05656_zpsifjidypy.jpg"/>
<img src="http://i1297.photobucket.com/albums/ag35/eye/Programmes/Yes%20joblot/DSC07777_zpsacyjrnnr.jpg" border="0" alt=" photo DSC05663_zpsacyjrnnr.jpg"/>
<a href="[^"]+?" target="_blank">
would do what you want, or even more general:
<a href=[^>]+?>
I am crawling a webpage and i am using Beautifulsoup. There is a condition where i want to skip the content of one particular tag and get other tag contents. In the below code i don't want div tag contents. But i couldn't solve this. Please help me.
HTML code,
<blockquote class="messagetext">
<div style="margin: 5px; float: right;">
unwanted text .....
</div>
Text..............
<a class="externalLink" rel="nofollow" target="_blank" href="#">text </a>
<a class="externalLink" rel="nofollow" target="_blank" href="#">text</a>
<a class="externalLink" rel="nofollow" target="_blank" href="#">text</a>
,text
</blockquote>
I have tried like this,
content = soup.find('blockquote',attrs={'class':'messagetext'}).text
But it is fetching unwanted text inside div tag also.
Use the clear function like this:
soup = BeautifulSoup(html_doc)
content = soup.find('blockquote',attrs={'class':'messagetext'})
for tag in content.findChildren():
if tag.name == 'div':
tag.clear()
print content.text
This yields:
Text..............
text
text
text
,text
I am using yahoo pipes to get content matching a certian category from my WordPress.com Blog. Everything is working fine but WordPress adds "share" links to the bottom of the feed that I would like to remove.
Here is what's being added:
<a rel="nofollow" target="_blank" href="http://feeds.wordpress.com/1.0/gocomments/bandonrandon.wordpress.com/87/">
<img alt="" border="0" src="http://feeds.wordpress.com/1.0/comments/bandonrandon.wordpress.com/87/"/></a>
<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=bandonrandon.wordpress.com&blog=1046814&post=87&subd=bandonrandon&ref=&feed=1" width="1" height="1"/>
I edited out some of the services but you get the idea. I tried to use regex to remove this content what I tried was this:
<a rel="nofollow" target="_blank" href="http://feeds.wordpress.com/.*?><img alt="" border="0" src="http://feeds.wordpress.com.*?></a>
and
<img alt="" border="0" src="http://stats.wordpress.com.*?>
however it didn't fileter the results at all.
Using this would filter ALL images and works fine
<a.*?><img.*?></a>
<a[^>]+href="http://feeds.wordpress.com[^"]*"[^>]*>\s*<img[^>]+src="http://feeds.wordpress.com/[^"]*"[^>]*>\s*</a>\s*<img[^>]+src="http://stats.wordpress.com/[^"]*"[^>]*>
Regex updated, try that to match the whole lot.