Re.Sub Expected string or buffer - python-2.7

I read and try this, this and this solutions. But my error not solving.
def read_news(self, news_url):
try:
self.checkRequests(news_url)
result = self.soup.find("div", {'class', 'm-article__entry'})
if isinstance(result, bs4.element.Tag):
for r in result:
re.sub("<.*?>", "", r)
print ('result', result)
return result.lower()
return
except Exception, e:
self.insertErrorLog('thevergetech.read_news', news_url, e)
This is code's part occuring error. I run my code in debugging mode and result return this:
<div class="m-article__entry">
<p>Malloy Aeronautics, a UK-based company, has been slowly and publicly developing a hoverbike over the last few years — it even used Kickstarter to raise funds. But it looks like the project is now headed in a new direction, because the US Department of Defense just announced a deal with Malloy to develop the vehicle for the US Army.</p>
<p>The DoD is interested in the technology for a few reasons. For one, it's safe. The hoverbike's rotors are guarded so they won't tear into humans and other objects. It's also a cheaper option than, say, a helicopter. And it's more maneuverable in tight spaces, with options to operate it autonomously or with a human pilot.</p>
<div class="m-ad m-ad__article-body">
<div class="dfp_ad" data-cb-ad-id="Mobile article body" data-cb-dfp-id="unit=mobile_article_body" id="div-gpt-ad-mobile_article_body">
<script type="text/javascript">
SBN.Ads.showAd("mobile_article_body");
</script>
</div>
</div>
<p><q class="right">From Kickstarter to the US Military-industrial complex</q></p>
<p>Developers of the hoverbike told Reuters that they consider it ideal for search and rescue or cargo delivery missions. It could also be used for surveillance — plans for the full-scale version include an attachable humanoid figure with a head-mounted camera.</p>
<p>The first step in the deal will be to build a functioning full-scale model, and from there the DoD will reportedly design military-grade prototypes. In the meantime, Malloy Aeronautics will continue to make scale models while developing a commercial version of the hoverbike.</p>
<hr/>
<div class="video-wrap p-scalable-video"><div class="chorus-video-embed" data-analytics-placement="entry:middle" data-chorus-video-id="57712" id="chorus-video-57712"></div></div>
<b>Verge Video</b> <i>We rode a hoverboard</i>
<ul class="m-article__sources">
<li class="source"><strong>Source</strong>Reuters</li>
<li class="tags"><strong>Related Items</strong>
us army
department of defense
drones
hoverbike
malloy
</li>
</ul>
</div>
I try for loop just like for r in return but for cycle 2 times and get same error.

Looks like you have 8 space intend than 4

I fix my code. This code not get error;
def read_news(self, news_url):
try:
self.checkRequests(news_url)
result = self.soup.find("div", {'class', 'm-article__entry'})
if isinstance(result, bs4.element.Tag):
result = str(result)
result = re.sub("<.*?>", "", result)
return result.lower()
return
except Exception, e:
self.insertErrorLog('thevergetech.read_news', news_url, e)

Related

How to make a for loop break on counter in Django Templates?

How can I make the for product in products loop break after the if condition is fulfilled 3 times. I have been trying to set up a counter but that isn't working... because set is not accepted inside of for loops. Though testing it out a bit more it isn't being accepted anywhere.
When I use set it throws this exception or a variation of it: Invalid block tag on line 11: 'set', expected 'endblock'. Did you forget to register or load this tag?
I am aware that I should put all the logic I'm using Templates for inside of the view and then pass it through the dictionary but it would take too much time to research everything about that and i just want to be done with this.
Honestly I am really tired it is 6am and idk what to do. Thank you for your help in advance. :)
Edit: I know I have to use namespace() for set to be able to propagate across scopes. But set itself is still raising the same exception as above.
Edit2: I thought that django uses jinja2 as it's templating language but it seems like that is not the case. I fixed the mentions of jinja2 as I haven't changed any of the defaults that come with a fresh install of django.
HTML
{% for category in categories %}
<div>
<h5 class="text-line"><span>{{category.name}}</span></h5>
</div>
<!-- Cards Container -->
<div class="shop-cards">
{% set counter = 0 %}
{% for product in products %}
{% if product.category.categoryID == category.categoryID %}
<!-- CARD 1 -->
<div class="card">
<image class="product-image" src="{% static 'images/product-4.png' %}"></image>
<div class="card-category-price">
<h3 class="product-name">{{product.name}}</h3>
<h3 class="product-price">{{product.price}}</h3>
</div>
</div>
{% endif %}
{% endfor %}
</div>
{% endfor %}
You can't (by design).Django is opinionated by design, and the template language is intended for display and not for logic.
You can use Jinja instead.
Or, you can do the complicated stuff in Python and feed the results to the template language through the context. Bear in mind that appending arbitrary objects to a list in Python is a cheap operation. So (in a CBV) something like
context['first_three'] = self.get_first_three( whatever)
...
def get_first_three(self, whatever):
first_three = []
for obj in ...:
if complicated stuff ...
first_three.append( obj)
if len(first_three) == 3:
break
return first_three
BTW you can't (easily?) implement break in Django templating, but you can easily give it the ability to count:
class Counter( object):
def __init__(self):
self.n = 0
def value(self):
return self.n
def incr(self):
self.n += 1
return ''
In the context pass context['counter'] = Counter()
In the template refer to {{counter.value}} as much as you want, and in a conditional where you want to count occurences, use {{counter.incr}}. But, it's hacky, and probably best avoided.
The problem seems to arise due to a misconception of the templating functionality. It is to be used mainly for display purposes and not filtering purposes, and you are facing a filtering problem:
I want to filter the 3 first elements of a list that fulfill a specific condition
is your main issue and templating frameworks will be poor at solving your issue, whereas this is exactly what python and Django's ORM are good at.
Why don't you first filter in Django, then display in template? For example as follows:
...
products = queryset.filter(abc=condition)[:3]
context = Context({ 'products': products })
return HttpResponse(template.render(context))

Regex: Remove a <p></p> paragraph that has curly brackets inside

I would like to remove any paragraph for article body that has curly brackets inside.
For example, from this piece of content:
<p>While orthotic inserts are able to provide great support and pain relief, they aren’t quite as good as a specialty shoe. Remember that an ill-fitting insert can cause permanent damage and talk to a podiatrist about your foot pain for the best recommendation. Click here if you want to learn more about pain in the foot arch unrelated to plantar fasciitis.</p> <h2>Related Posts</h2> <h2>So What Are These Socks Really Good For?</h2> <h2>Are the bottom of your feet causing you problems?</h2> <h2>A PF Relief Guide</h2> <h2>What is Foot Reflexology & What is it Good For?</h2> <h2>Leave a Reply Cancel reply</h2> <p>Your email address will not be published. Required fields are marked *</p> <p>Name</p> <p>Email</p> <p>Website</p> <p>five − = 2 .hide-if-no-js { display: none !important; } </p><h2>Food For Thought January 2016</h2> <h2>Show Us Some Social Love!!</h2> <h2>Recent Posts</h2> <li> The Climate Pledge of Resistance</li> <li> Green Activism in Boulder, Colorado</li> <li> The Truth About Money and Happiness</li> <li> Why Is There So Much Skepticism About Climate Change?</li> <li> Which Device Would Work Best For You?</li>
I would like to remove this part:
<p>five − = 2 .hide-if-no-js { display: none !important; } </p>
Using the following regex: <p>.*?\{.*?\}.*?</p>
It removes the whole article instead of this paragraph that contains curly braces, for some strange reason...
What am I doing wrong with the regex code?
Thanks!
Lazy / greedy quantifiers not always work as intended, instead of them match the string excluding <, this works for me: <p>[^<]*\{[^<]*</p>
Try this:
var str = '<p>While orthotic inserts are able to provide great support and pain relief, they aren’t quite as good as a specialty shoe. Remember that an ill-fitting insert can cause permanent damage and talk to a podiatrist about your foot pain for the best recommendation. Click here if you want to learn more about pain in the foot arch unrelated to plantar fasciitis.</p> <h2>Related Posts</h2> <h2>So What Are These Socks Really Good For?</h2> <h2>Are the bottom of your feet causing you problems?</h2> <h2>A PF Relief Guide</h2> <h2>What is Foot Reflexology & What is it Good For?</h2> <h2>Leave a Reply Cancel reply</h2> <p>Your email address will not be published. Required fields are marked *</p> <p>Name</p> <p>Email</p> <p>Website</p> <p>five − = 2 .hide-if-no-js { display: none !important; } </p><h2>Food For Thought January 2016</h2> <h2>Show Us Some Social Love!!</h2> <h2>Recent Posts</h2> <li> The Climate Pledge of Resistance</li> <li> Green Activism in Boulder, Colorado</li> <li> The Truth About Money and Happiness</li> <li> Why Is There So Much Skepticism About Climate Change?</li> <li> Which Device Would Work Best For You?</li>';
var result = str.replace(/(<p>[^<]*\{.*<\/p>)/, '');
console.log(result);
Regex Demo
I'd suggest a two step approach (parsing and analyzing the text node).
Below you'll find examples for both Python and PHP (could be adopted for other languages, obviously):
Python:
# -*- coding: utf-8> -*-
import re
from bs4 import BeautifulSoup
html = """
<html>
<p>While orthotic inserts are able to provide great support and pain relief, they aren’t quite as good as a specialty shoe. Remember that an ill-fitting insert can cause permanent damage and talk to a podiatrist about your foot pain for the best recommendation. Click here if you want to learn more about pain in the foot arch unrelated to plantar fasciitis.</p> <h2>Related Posts</h2> <h2>So What Are These Socks Really Good For?</h2> <h2>Are the bottom of your feet causing you problems?</h2> <h2>A PF Relief Guide</h2> <h2>What is Foot Reflexology & What is it Good For?</h2> <h2>Leave a Reply Cancel reply</h2> <p>Your email address will not be published. Required fields are marked *</p> <p>Name</p> <p>Email</p> <p>Website</p> <p>five − = 2 .hide-if-no-js { display: none !important; } </p><h2>Food For Thought January 2016</h2> <h2>Show Us Some Social Love!!</h2> <h2>Recent Posts</h2> <li> The Climate Pledge of Resistance</li> <li> Green Activism in Boulder, Colorado</li> <li> The Truth About Money and Happiness</li> <li> Why Is There So Much Skepticism About Climate Change?</li> <li> Which Device Would Work Best For You?</li>
</html>
"""
soup = BeautifulSoup(html, 'lxml')
regex = r'{[^}]+}'
for p in soup.find_all('p', string=re.compile(regex)):
p.replaceWith('')
print soup
PHP:
<?php
$html = "<html>
<p>While orthotic inserts are able to provide great support and pain relief, they aren’t quite as good as a specialty shoe. Remember that an ill-fitting insert can cause permanent damage and talk to a podiatrist about your foot pain for the best recommendation. Click here if you want to learn more about pain in the foot arch unrelated to plantar fasciitis.</p> <h2>Related Posts</h2> <h2>So What Are These Socks Really Good For?</h2> <h2>Are the bottom of your feet causing you problems?</h2> <h2>A PF Relief Guide</h2> <h2>What is Foot Reflexology & What is it Good For?</h2> <h2>Leave a Reply Cancel reply</h2> <p>Your email address will not be published. Required fields are marked *</p> <p>Name</p> <p>Email</p> <p>Website</p> <p>five − = 2 .hide-if-no-js { display: none !important; } </p><h2>Food For Thought January 2016</h2> <h2>Show Us Some Social Love!!</h2> <h2>Recent Posts</h2> <li> The Climate Pledge of Resistance</li> <li> Green Activism in Boulder, Colorado</li> <li> The Truth About Money and Happiness</li> <li> Why Is There So Much Skepticism About Climate Change?</li> <li> Which Device Would Work Best For You?</li>
</html>";
$html = str_replace(' ', ' ', $html); // only because of the
$xml = simplexml_load_string($html);
# look for p tags
$lines = $xml->xpath("//p");
# the actual regex - match anything between curly brackets
$regex = '~{[^}]+}~';
for ($i=0;$i<count($lines);$i++) {
if (preg_match($regex, $lines[$i]->__toString())) {
# unset it if it matches
unset($lines[$i][0]);
}
}
// vanished without a sight...
print_r($xml);
// convert it back to a string
$html = echo $xml->asXML();
?>
I'd suggest a two step approach (parsing and analyzing the text node). Below you'll find examples for both Python and PHP (could be adopted for other languages, obviously):

Plugging fundamental security vulerabilities in a Django web app

I have a Django app which users congregate to, use as a forum and gain reputation points. Majority of the users belong to underserved communities and use primitive, non-js feature phones with proxy browsers such as Opera mini, over low bandwidth internet. Essentially I'm a "Next Billion" digital non-profit.
There are some security vulnerabilities that users can exploit in my forum - I need advice in plugging those. Here are the facts.
There's no SSL certificate installed - all comm. takes place over HTTP. To post in the forum one uses the following piece of code in the django template:
<form action="{% url 'private_group_reply' slug=unique %}" method="POST" enctype="multipart/form-data">
{% csrf_token %}
<input type="hidden" name="unique" value="{{ unique }}">
<br><span style="color:green;">Image: </span>{{ form.image }}<br>
<br><span style="color:green;">Comment:</span>{{ form.text }}
<br>
<input class="button" type="submit" value="OK" id="id_submit">
</form>
unique is a uuid that identifies which group the comment is to be posted in.
The relevant url pattern is: url(r'^group/(?P<slug>[\w.#+-]+)/private/$', auth(PrivateGroupView.as_view()), name='private_group_reply')
In views.py, the relevant class-based view's method is:
def form_valid(self, form):
if self.request.user_banned:
return redirect("profile", slug=self.request.user.username)
else:
f = form.save(commit=False)
text = f.text
if text == self.request.user.userprofile.previous_retort:
redirect(self.request.META.get('HTTP_REFERER')+"#sectionJ")
else:
self.request.user.userprofile.previous_retort = text
self.request.user.userprofile.score = self.request.user.userprofile.score + 2
self.request.user.userprofile.save()
#print "image: %s" % f.image
if f.image:
image_file = clean_image_file(f.image)
if image_file:
f.image = image_file
else:
f.image = None
else:
f.image = None
#print "image:%s" % f.image
which_group = Group.objects.get(unique=self.request.POST.get("unique"))
reply = Reply.objects.create(writer=self.request.user, which_group=which_group, text=text, image=f.image)
GroupSeen.objects.create(seen_user= self.request.user,which_reply=reply)
try:
return redirect(self.request.META.get('HTTP_REFERER')+"#sectionJ")
except:
return redirect("private_group_reply", slug=reply.which_group.unique)
One particular user is, I think, running a script bot and flooding the group he's a part of, garnering multiple points (each posting is point-incentivized).
Though I'm new to software development, I now have a thriving community on this forum and am thus ramping up my skill-set. I've been reading a ton about SSL security, and rules of thumb any respectable web app ought to follow. What someone can help with currently is this particular case of the flooding user I've described, how to curtail such behavior within my set up. Also any general guidelines to follow will be highly appreciated as well.
Thanks in advance, and please ask for any information you need.
I'm not sure if its the answer you're looking for but rather than trying to patch up this one part of your code, you probably need to look into creating some rules/regulations on your site along with penalties for the users that break this.
If you can prove that a user is doing something against your rules/guidelines, you apply your set penalty (score rollback/bans/whatever) and this would remove the incentive for most to even try doing this.
Yes, ssl may help but who knows, and your users will always have the chance to bypass your safeguards.
In terms of the actual code? I wouldn't save the score to the model until everything else has been done. You might also want to make it an atomic transaction to roll back db changes if something goes awry when saving

sqllite read/write queue concern with django

I'm building a website where college students can order delivery food. One special attribute about our site is that customers have to choose a preset delivery time. For example we have a drop at 7pm, 10pm, and at midnight.
All the information about the food is static (ie price, description, name), except the quantity remaining for that specific drop time.
Obviously i didn't want to hardcode the HTML for all the food items on my menu page, so i wrote a forloop in the html template. So i need to store the quantity remaining for the specific time somewhere in my model. the only problem is that I'm scared that if i use the same variable to transport the quantity remaining number to my template, i'll give out wrong information if alot of people are accessing the menu page at the same time.
For example, lets say the 7pm drop has 10 burritos remaining. And the 10pm drop has 40 burritos. Is there a chance that if someone has faster internet than the other customer, the wrong quantity remaining will display?
how would you guys go around to solve this problem?
i basically need a way to tell my template the quantity remaining for that specific time. and using the solution i have now, doesn't make me feel at ease. Esp if many people are going to be accessing the site at the same time.
view.py
orders = OrderItem.objects.filter(date__range=[now - timedelta(hours=20), now]).filter(time=hour)
steak_and_egg = 0
queso = 0
for food in orders:
if food.product.name == "Steak and Egg Burrito":
steak_and_egg = steak_and_egg + food.quantity
elif food.product.name == "Queso Burrito":
queso = queso + food.quantity
#if burritos are sold out, then tell template not to display "buy" link
quantity_steak_and_egg = max_cotixan_steak_and_egg - steak_and_egg
quantity_queso = max_cotixan_queso - queso
#psuedocode
steakandegg.quantity_remaining = quantity_steak_and_egg
queso.quantity_remaining = quantity_queso
HTML:
{% for item in food %}
<div id="food_set">
<img src="{{item.photo_menu.url}}" alt="" id="thumbnail photo" />
<div style='overflow:hidden'>
<p id="food_name">{{item.name}}</p>
<p id="price">${{item.price}}</p>
</div>
<p id="food_restaurant">By {{item.restaurant}}</p>
<div id="food_footer">
<img src="{{MEDIA_URL}}/images/order_dots.png" alt="" id="order_dots" />
<a id ="order_button" href="{{item.slug}}"></a>
<p id="quantity_remaining">{{item.quantity_remaining}} left</p>
</div><!-- end food_footer-->
</div><!-- end food_set-->
I don't understand what "faster Internet" or "using the same variable" have to do with anything here (or, indeed, what it has to do with sqlite particularly).
This question is about a fundamental property of web apps: that they are request/response based. That is, the client makes a request, and the server replies with a response, which represents the status of the data at that time. There's simply no getting around that: you can make it more dynamic, by using Ajax to update the page after the initial load, which is what StackOverflow does to show update messages while you're on the page. But even then, there's still a delay.
(I should note that there are ways of doing real-time updates, but they're complicated, and almost certainly overkill for a college food-ordering website.)
Now the issue is, why does this matter? It shouldn't. The user sees a page saying there is 1 burrito left - perhaps with a red warning saying "order quickly! almost gone!" - and they press the order button. On submission of that order, your code presumably checks for the actual status at that time. And, guess what, in the meantime you've processed another order and the burrito has already gone. So what? You simply show a message to the user, "sorry, it's gone, try something else". Anyone with any experience ordering things on the web - say, concert tickets - will understand what's happened.

Django forms: custom error list format

New to python and django. Using the forms module and going through the errors one by one (so not just dumping them at the top)
I noticed this solution to be able to set a custom format for errors, in my case, it'd be:
<dd class="error">errstr</dd>
And more or less copying the example provided, I have the following:
forms.py ( I expanded it slightly just for my sake)
class DefinitionErrorList(forms.util.ErrorList):
def __unicode__(self):
return self.view_as_dd()
def view_as_dd(self):
if not self:
return u''
return u'<dd class="error">%s</dd>' % '<br />'.join([u'<span>%s</span>' % e for e in self])
main.py
from poke.forms import PokeForm, DefinitionErrorList
def create_new(response, useless):
if response.method == 'POST':
# They posted something, so collect it (duh)
f = PokeForm(response.POST, error_class=DefinitionErrorList)
if f.is_valid():
cd = f.cleaned_data
template for reference
<dt>A small message to remind yourself</dt>
{{ form.message.errors }}
<dd>
<span class="input_border" style="width: 75%;"> {{ form.message }}</span>
<span class="tooltip_span">{{ tooltip.message }}</span>
</dd>
The problem is is that with the above, if a field has an error, it still uses the default format (the ), and no matter what I try I can't get my one to be used. I'm pretty sure I missed something small or misunderstood some instructions.
Thanks for any help and I'm sorry if I forgot any information!
Edit: I'm using Django 1.1, if that helps any. And to make it (possibly) clearer, the errors are displaying fine, they just aren't looking the way I want them to.
there has been a long lasting ticket on this. http://code.djangoproject.com/ticket/6138
latest update says it's fixed. see if it works in trunk. if not, submit your bug. if it does, get the patch or wait til the next release. (although i thought the latest release was after the date of the last update on the ticket).