Why doesn't this regexp work for this HTML?

<div class="_1zGQT _2ugFP message-in">
<div class="-N6Gq">
<div class="copyable-text" data-pre-plain-text="[18:09, 3.6.2019] Лера сестра: ">
<div class="_12pGw">
<div class="_3X58t selectable-text invisible-space copyable-text">
<span class="_2ZDCk">
<img crossorigin="anonymous" src="URL" alt="😆" draggable="false" class="_298rb _2FANH selectable-text invisible-space copyable-text" data-plain-text="😆" style="visibility: visible;">
</span>
</div>
</div>
</div>
</div>
</div>
I tried to get the div with this code:
soup.find('div', class_=re.compile('^selectable-text invisible-space copyable-text'))
All I got was None.
The problem is that part of the class (_3X58t) keeps changing.

This is most likely caused by the ^ anchor: the class attribute starts with the changing token (_3X58t), so a pattern anchored at the start of the string cannot match. We could drop the anchor:
soup.find('div', class_=re.compile('selectable-text invisible-space copyable-text'))
or we might try this expression for the divs:
(.+?selectable-text invisible-space copyable-text)
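The effect of the anchor can be checked with the standard re module alone. This is a minimal sketch, assuming the pattern is searched against the full class string as it appears in the HTML above; the helper name matches is only for illustration.

```python
import re

# The class attribute of the target div, copied from the HTML in the question.
class_attr = "_3X58t selectable-text invisible-space copyable-text"

# The asker's pattern: anchored at the start of the string.
anchored = re.compile(r"^selectable-text invisible-space copyable-text")
# The same pattern without the anchor.
unanchored = re.compile(r"selectable-text invisible-space copyable-text")
# Anchoring at the end instead tolerates a changing prefix.
suffix = re.compile(r"selectable-text invisible-space copyable-text$")


def matches(pattern, value):
    """Return True if the pattern is found anywhere in value."""
    return pattern.search(value) is not None


print(matches(anchored, class_attr))    # False: the string starts with _3X58t
print(matches(unanchored, class_attr))  # True
print(matches(suffix, class_attr))      # True
```

Anchoring at the end with $ keeps the match strict about the stable part of the class list while ignoring whatever prefix the page generates.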

I would first see whether a single class from the compound class list could be used, e.g.
soup.select_one('.selectable-text')
Otherwise, match on the combined classes with an ends-with attribute selector:
soup.select_one('[class$="selectable-text invisible-space copyable-text"]')
Either avoids resorting to regex.
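Matching on class tokens rather than on the raw attribute string is what makes the selector approach robust to the changing prefix. The sketch below illustrates the idea with only the standard library's html.parser; it is not how BeautifulSoup is implemented, and ClassTokenFinder and find_div_classes are hypothetical names. With BeautifulSoup itself, a compound selector such as div.selectable-text.invisible-space.copyable-text expresses the same test.

```python
from html.parser import HTMLParser

# The stable class tokens from the question; the generated one (_3X58t) is ignored.
REQUIRED = {"selectable-text", "invisible-space", "copyable-text"}


class ClassTokenFinder(HTMLParser):
    """Records the class list of the first <div> carrying every required token."""

    def __init__(self, required):
        super().__init__()
        self.required = set(required)
        self.found = None

    def handle_starttag(self, tag, attrs):
        if self.found is not None or tag != "div":
            return
        # Split the class attribute into tokens; order and extra tokens do not matter.
        tokens = (dict(attrs).get("class") or "").split()
        if self.required <= set(tokens):
            self.found = tokens


def find_div_classes(html, required=REQUIRED):
    """Return the class tokens of the first matching <div>, or None."""
    finder = ClassTokenFinder(required)
    finder.feed(html)
    return finder.found


sample = (
    '<div class="copyable-text">'
    '<div class="_3X58t selectable-text invisible-space copyable-text">'
    '<span class="_2ZDCk"></span></div></div>'
)
print(find_div_classes(sample))
```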

Related

Non-breaking space with Django template code and Bootstrap 4 badges

I am trying to keep text generated with Django template language which is contained within a Bootstrap 4 badge together with some additional text that is not contained in the badge.
Here is my code:
<span>Submitted by:&nbsp;<span class="badge badge-primary">{{ user.username }}</span></span>
I want all the words in the phrase "Submitted by USER" to always be on the same line, but the code above does not achieve that. Any idea what is wrong?
Add the class text-nowrap to the outer <span> element and remove the unnecessary &nbsp;.
text-nowrap in Bootstrap 4 prevents wrapping, as the name suggests.
<link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/bootstrap/4.0.0/css/bootstrap.min.css" integrity="sha384-Gn5384xqQ1aoWXA+058RXPxPg6fy4IWvTNh0E263XmFcJlSAwiGgFAW/dAiS6JXm" crossorigin="anonymous">
<div class="container">
<div class="row">
<div class="col-4 bg-success">
<span class="text-nowrap">Submitted by: <span class="badge badge-primary">Usernameverylongusernameevenlongerthanthat</span></span>
</div>
</div>
</div>

preg_replace regular expression to replace link within a particular tags

I need some help: I want to replace the href links with my own link, but only within a particular div class.
<div id="slider1" class="owl-carousel owl-theme">
<div class="item">
<div class="imagens">
<img src="https://image.oldste.org" alt="The Fate of the Furious" width="100%" height="100%" />
<span class="imdb">
<b class="icon-star"></b> N/A
</span>
</div>
<span class="ttps">The Fate of the Furious</span>
<span class="ytps">2017</span>
</div>
</div>
Here I want to change http://oldsite.com/ to http://newsite.com/?id=
so that the href links look like this:
<a href="http://newsite.com/?id=the-fate-of-the-furious">
Please help me with a preg_replace regular expression.
Thanks
This may help you:
$content = get_the_content();
$pattern = "/(?<=href=(\"|'))[^\"']+(?=(\"|'))/";
$newurl = get_permalink();
$content = preg_replace($pattern,$newurl,$content);
echo $content;
Lookbehinds are comparatively expensive; use \K to restart the full-string match and avoid a capture group:
<a href="\K[^"]+\/
This pattern is efficient, but note that it matches ALL <a href URLs. It also matches greedily up to the last / in the URL; I assume this is okay given your input sample.
Code:
$in='<div id="slider1" class="owl-carousel owl-theme">
<div class="item">
<div class="imagens">
<img src="https://image.oldste.org" alt="The Fate of the Furious" width="100%" height="100%" />
<span class="imdb"><b class="icon-star"></b> N/A</span>
</div>
<span class="ttps">The Fate of the Furious</span>
<span class="ytps">2017</span>
</div>';
echo preg_replace('/<a href="\K[^"]+\//','http://newsite.com/?id=',$in);
Output:
<div id="slider1" class="owl-carousel owl-theme">
<div class="item">
<div class="imagens">
<img src="https://image.oldste.org" alt="The Fate of the Furious" width="100%" height="100%" />
<span class="imdb"><b class="icon-star"></b> N/A</span>
</div>
<span class="ttps">The Fate of the Furious</span>
<span class="ytps">2017</span>
</div>
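For comparison, here is a rough Python sketch of the same replacement. Python's re module has no \K, so this version swaps in a capture group and re-inserts it in the replacement. The function name rewrite_links and the anchor in the sample input are assumptions for illustration, since the HTML above does not show the original <a> tag.

```python
import re

# Capture the literal prefix, then greedily consume the URL up to its last slash.
HREF_PREFIX = re.compile(r'(<a href=")[^"]+/')


def rewrite_links(html, new_prefix="http://newsite.com/?id="):
    """Replace everything up to the last / of each <a href> URL with new_prefix."""
    # A callable replacement avoids any backslash escaping issues in new_prefix.
    return HREF_PREFIX.sub(lambda m: m.group(1) + new_prefix, html)


# Hypothetical input: an anchor pointing at the old site, next to an untouched <img>.
sample = (
    '<a href="http://oldsite.com/the-fate-of-the-furious">'
    '<img src="https://image.oldste.org" alt="The Fate of the Furious" /></a>'
)
print(rewrite_links(sample))
```

As with the PHP pattern, every <a href> in the input is rewritten, and a URL that ends in a trailing slash would lose its slug because the match is greedy.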

jQuery regex: get several matches, not only one

I would like to get
PA-1400-11PA ADP-40PH ABA
Here is the HTML:
</div>
<div class="ref">
<h2 id='affiche_sous_titre'>eee :</h2> <p>
<a href='eee' title='PA-1400-11PA' class='lien_menu'>PA-1400-11PA</a> - <a href='uuu' title='ADP-40PH ABA' class='lien_menu'>ADP-40PH ABA</a> </p>
</div>
<div class="modele_tout">
</div>
<div class="star-customer">
Here is my regex code:
line=line.replace(/[\"\']lien_menu[\"\']>(.*?)<\/a>/ig,"$1\n")
But I only get:
ADP-40PH ABA
What is the problem? I don't understand.
Thanks for your help.

Find and replace Dreamweaver - Adding a number each time?

I've just been doing some research about regular expressions in Dreamweaver and had no idea you could do so much with Find & Replace.
One thing I haven't been able to figure out is this. If I have:
<div class="item active"> <img src="images/gates2/large/gates1.jpg">
<div class="carousel-caption">WG1</div>
</div>
<div class="item"><img src="images/gates2/large/gates2.jpg">
<div class="carousel-caption">WG1</div>
</div>
<div class="item"><img src="images/gates2/large/gates3.jpg">
<div class="carousel-caption">WG1</div>
</div>
<div class="item"><img src="images/gates2/large/gates4.jpg">
<div class="carousel-caption">WG1</div>
</div>
Can I automatically change the 1 to 2, 3, etc., so that the captions are numbered in order, rather than going through each one manually?
Kind Regards,
Shaun
Unfortunately, you won't be able to do what you're describing using the Find & Replace method in Dreamweaver, because a replacement cannot count up from one match to the next. I know this probably isn't the answer you want to hear, but you'll have to go through them manually. Sorry to be the bearer of bad news.
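Outside Dreamweaver, a short script can do the renumbering. This is a minimal Python sketch, assuming every caption looks exactly like <div class="carousel-caption">WG1</div>; the function name number_captions is hypothetical.

```python
import itertools
import re


def number_captions(html, prefix="WG"):
    """Renumber carousel captions in document order: WG1, WG2, WG3, ..."""
    counter = itertools.count(1)
    pattern = re.compile(
        r'(<div class="carousel-caption">)' + re.escape(prefix) + r"\d+(</div>)"
    )
    # The callback runs once per match, so each caption gets the next number.
    return pattern.sub(
        lambda m: f"{m.group(1)}{prefix}{next(counter)}{m.group(2)}", html
    )


sample = (
    '<div class="item active"><div class="carousel-caption">WG1</div></div>\n'
    '<div class="item"><div class="carousel-caption">WG1</div></div>\n'
    '<div class="item"><div class="carousel-caption">WG1</div></div>\n'
)
print(number_captions(sample))
```

The result could then be pasted back into the page in place of the original markup.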

UnicodeDecodeError in template

I get the following error code when trying to load the template.
'utf8' codec can't decode byte 0x94 in position 720: invalid start byte
Here is the template:
{% extends "base.html" %}
{% block site_wrapper %}
<div id="main">
Skip to main content
<div id="banner">
<div class="bannerIEPadder">
<div class="cart_box">
[link to cart here]
</div>
Modern Musician
</div>
</div>
<div id="navigation">
<div class="navIEPadder">
[navigation here]
</div>
</div>
<div id="middle">
<div id="sidebar">
<div class="sidebarIEPadder">
[search box here]
<br/>
[category listing here]
</div>
</div>
<div id="content">
<a name=”content”></a>
<div class="contentIEPadder">
{% block content %}{% endblock %}
</div>
</div>
</div>
<div id="footer">
<div class="footerIEPadder">
[footer here]
</div>
</div>
</div>
{% endblock %}
In UTF-8, a lone 0x94 byte is not a valid start byte; in Windows-1252 (cp1252), however, it is the right double quotation mark (”). Generally speaking, the plain ASCII quote (") is much safer.
Make sure you're not copying and pasting this out of some blog that uses typographic quotes or something like that.
If you're using a text editor, save the file as ASCII and see which characters go missing.
You have curly double quotes in <a name=”content”> inside div#content; try replacing them with ASCII quotes.
Maybe your template is encoded with something other than UTF-8? It depends on your terminal or editor, or maybe your OS settings.
I had some strange characters in my code because I copied it out of a PDF file.
I had this same error, and it turned out that the problem was a "©" in my source that I had copied in as part of a template.
You have to check that code for strange characters.
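To track down such characters, the raw bytes of the template can be scanned for positions that are not valid UTF-8. The following is a minimal sketch using only built-in codecs; find_invalid_utf8 is a hypothetical helper, and the sample bytes imitate the <a name=”content”> line as it would be saved in Windows-1252.

```python
def find_invalid_utf8(data: bytes):
    """Return (offset, byte) pairs where decoding data as UTF-8 fails."""
    bad = []
    pos = 0
    while pos < len(data):
        try:
            data[pos:].decode("utf-8")
            break  # the rest of the data decodes cleanly
        except UnicodeDecodeError as err:
            offset = pos + err.start
            bad.append((offset, data[offset]))
            pos = offset + 1  # skip the offending byte and keep scanning
    return bad


# The anchor from the template, with the curly quotes stored as cp1252 byte 0x94.
sample = b"<a name=\x94content\x94></a>"

for offset, byte in find_invalid_utf8(sample):
    # Show what the byte means in Windows-1252, a likely source encoding.
    print(offset, hex(byte), bytes([byte]).decode("cp1252", errors="replace"))
```

For a real template, the bytes would come from open(path, "rb").read(). Resuming one byte after each failure is a simplification, so a file with broken multi-byte sequences may report a few neighbouring offsets for a single bad character.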