According to the task, I have to build a chat in Django 1.6 using long polling (without Redis, websockets and so on). For the authorized user I store a dict in request.session with the last message ID of each thread (dialogue):
request.session['admin_lastid'] = {'dialogue1': 112, 'dialogue2': 34}
I have two long-polling views. One, def get_new(request), watches the ongoing dialogue and updates the dict when a new message arrives:
request.session['admin_lastid']['dialogue1'] = current_thread.lastid
request.session.save()
request.session.modified = True
Another view, def scan_threads(request), scans all the dialogues for new messages and reads the values from request.session['admin_lastid'].
But I see strange session behavior. When get_new writes a new value into request.session['admin_lastid'], the next call to scan_threads sees the updated value. But when get_new runs again, it sees the old value. Why? I cannot understand it...
I am sure I am probably being stupid but struggling to wrap my head around this one.
I have a Flask website and I am setting up a checkout page so users can add their items to the cart. Everything was going great: I was able to add items to the cart, get a total, etc. (using sessions). However, since I tried to implement the ability for users to update the cart on the checkout page, the session data only survives the initial load when my form posts. The print statement shows the data I am collecting is fine, and the session cookie is set initially, as everything updates; however, the moment I change page, it reverts to whatever it was before I made the update.
#views.route("/shopping-cart", methods=['GET','POST'])
def to_cart():
clear_cart = 'views.to_clear_cart'
if 'shopping' in session:
shopping_list = session['shopping']
sum_list = []
for quantity, price in shopping_list.values():
sum_list.append(quantity * price)
total = sum(sum_list)
if request.method == "POST":
new_quantity = int(request.form.get('quantity'))
product_id = request.form.get('issue')
unit_price = int(request.form.get('price'))
print(new_quantity, product_id, unit_price)
shopping_list[f'{product_id}'] = [new_quantity, unit_price]
return redirect(url_for('views.to_cart'))
return render_template("cart.html",
shopping_list=shopping_list,
total=total,
clear_cart=clear_cart,
)
else:
return render_template("cart.html",
clear_cart=clear_cart
)
I just do not really understand why it is not updating: from what I can tell the code runs fine, and it does update, but then the session cookie just reverts itself to whatever it was before (I am using browser-side cookies for testing).
Any help appreciated!!
After much confusion as everything seemed to be working absolutely fine after I rewrote this in about 5 different ways and printed half the app in the console, I finally found the answer and it is indeed me being an idiot.
It turns out that if you modify a value in place, rather than creating or deleting it, Flask does not automatically save the session state; you have to state explicitly that it has been modified.
Turns out the answer was as simple as this line of code.
session.modified = True
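For reference, in the view above the flag goes in the POST branch, roughly like this (a minimal sketch; only the session.modified line is new):
if request.method == "POST":
    new_quantity = int(request.form.get('quantity'))
    product_id = request.form.get('issue')
    unit_price = int(request.form.get('price'))
    # shopping_list is the dict already stored in session['shopping'];
    # mutating it in place is not detected by Flask's cookie session
    shopping_list[f'{product_id}'] = [new_quantity, unit_price]
    session.modified = True  # explicitly mark the session as changed
    return redirect(url_for('views.to_cart'))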
In views, I have a function that is executed when the user submits the form online. After the form submission I perform some database transactions, and then, based on the existing data in the database, APIs are triggered:
def triggerapi():
    # execute API to send an email to the user and the administrator
    # about the submitted form

def databasetransactions():
    # check the data in the submitted form against the data in the DB
    if the last data submitted by the user is 10 minutes old or more:
        triggerapi()

def formsubmitted(request):
    # save the user input in variables
    databasetransactions()
    # save the data from the submitted form in the DB
In the above case, the user clicks the submit button twice within less than 5 milliseconds. So two requests start processing in parallel and both trigger the email, which is not the desired behavior.
Is there a way to avoid this, so that for a given user session the application only accepts new data once all the older data has finished processing?
Since we are talking in pseudo-code, one way could be to use a singleton pattern for triggerapi() and return Not Allowed in case it is already instantiated.
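A rough illustration of that idea (a minimal sketch, assuming a single-process deployment; the module-level lock, the Not Allowed response, and the helper name are placeholders, not part of the original code):
import threading

from django.http import HttpResponse

_trigger_lock = threading.Lock()  # one guard shared by all requests in this process

def triggerapi():
    # refuse to run if another request is already sending the email
    if not _trigger_lock.acquire(blocking=False):
        return HttpResponse("Not Allowed", status=409)
    try:
        send_emails()  # hypothetical helper that calls the email API
    finally:
        _trigger_lock.release()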
There are multiple ways to solve this issue.
One of them would be to create a new session variable
request.session['activetransaction'] = True
This would, however, require you to pass request, unless it is already available in the code portion you changed. You can also add an instance/class flag in the same way and check against it.
Another way, which might work if you need those submissions handled after the previous one finishes, is to add a while request.session['activetransaction']: loop and do the handling afterwards.
def formsubmitted(request):
    if 'activetransaction' not in request.session or not request.session['activetransaction']:
        request.session['activetransaction'] = True
        # save the user input in variables
        databasetransactions()
        # save the data from the submitted form in the DB
        request.session['activetransaction'] = False
    ...
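A slightly more defensive variant of the same flag idea (a sketch; the early save and the try/finally are additions so the flag is persisted before the long-running work starts and is reset even if the processing fails):
def formsubmitted(request):
    if not request.session.get('activetransaction', False):
        request.session['activetransaction'] = True
        request.session.save()  # persist the flag before the slow work begins
        try:
            # save the user input in variables
            databasetransactions()
            # save the data from the submitted form in the DB
        finally:
            request.session['activetransaction'] = False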
I have a ReactJS component inside a Django template, where a user clicks on a checkout button, posts the item_code and gets redirected to checkout:
onCheckout = () => {
    fetch("/onCheckout/", {
        method: "POST",
        body: JSON.stringify({'item': this.props.item_info.code})
    }).then(window.location.replace("/checkout"))
}
A Django view receives the request and stores it in a session.
def onCheckout(request):
    if request.method == "POST":
        items = request.session.get('items', [])
        new_item = json.loads(request.body.decode('utf-8'))['item']
        items.append(new_item)
        request.session['items'] = items
I am having an issue with storing data in the session. The first item gets stored correctly in the array, but when I then check out a second item, the items array starts acting up:
(Pdb) items
['15130BC.ZZ.8042BC.01']
(Pdb) new_item
'5213G-001'
(Pdb) items
['15130BC.ZZ.8042BC.01']
(Pdb) items
['5213G-001']
If I try to access request.session['item'] from any other view function, I get a KeyError.
I am fairly new to Django, any help would be appreciated. Also, I would like to know if there are better alternatives to accomplish the above.
Sessions Config
settings.SESSION_ENGINE = 'django.contrib.sessions.backends.db'
settings.SESSION_CACHE_ALIAS = 'default'
settings.CACHES = {'default': {'BACKEND': 'django.core.cache.backends.locmem.LocMemCache'}}
Some reading on change detection for Django sessions: https://docs.djangoproject.com/en/2.0/topics/http/sessions/#when-sessions-are-saved
Based on your code, it appears to me that the change detection should happen. However, let's try to brute-force this: can you add the following line as the last line of your code, request.session.modified = True, and see if this fixes your issue?
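Applied to the view from the question, that would look roughly like this (a sketch; only the last line is new):
def onCheckout(request):
    if request.method == "POST":
        items = request.session.get('items', [])
        new_item = json.loads(request.body.decode('utf-8'))['item']
        items.append(new_item)
        request.session['items'] = items
        request.session.modified = True  # brute-force: mark the session as changed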
Update: some basic checks
Can you verify the following
Check if your db backend is configured properly
If you want to use a database-backed session, you need to add 'django.contrib.sessions' to your INSTALLED_APPS setting. Once you have configured your installation, run manage.py migrate to install the single database table that stores session data.
Check if your session Middleware is enabled
Sessions are implemented via a piece of middleware. The default settings.py created by django-admin startproject has SessionMiddleware activated. To enable session functionality, edit the MIDDLEWARE_CLASSES setting and make sure it contains 'django.contrib.sessions.middleware.SessionMiddleware'.
Update 2: Test the session
Maybe modify an existing endpoint as follows and see if you are able to store values and persist them in the session:
test_keys = request.session.get('test_keys', [])
test_keys.append(random.randint(0, 100))  # random.randint needs lower and upper bounds
request.session['test_keys'] = test_keys
return Response(request.session.get('test_keys', []))
You should see that each time you hit the api, you get a list with one new integer in it in addition to all past values. Lmk how this goes.
I am going bananas here with my attempts to implement concurrency protection.
So I figured out there are 2 ways to do it.
Optimistic - basically we add a version field to the record and each save increases it. So if my current version field is different from the one on disk, it means something got modified and I give an error.
And the pessimistic approach, which means we just lock the record so it cannot be edited.
I implemented both ways, plus the concurrency package for Django, and nothing works. I tested on SQLite and on Postgres with Heroku.
Below is my view method; it has both approaches:
for pessimistic locking, I lock with transaction.atomic() using select_for_update,
and for optimistic locking I use the django-concurrency package, which increases my version field on each edit, and I do a simple comparison with the on-disk value.
What am I doing wrong? Please help.
@login_required
def lease_edit(request, pk, uri):
    logger = logging.getLogger(__name__)
    title = 'Edit Lease'
    uri = _get_redirect_url(request, uri)
    with transaction.atomic():
        post = get_object_or_404(Lease.objects.select_for_update(), pk=pk)
        if request.method == "POST":
            form = LeaseForm(request.POST, instance=post)
            if form.is_valid():
                lease = form.save(commit=False)
                on_disc = Lease.objects.get(pk=post.id)
                if on_disc.version > post.version:
                    messages.add_message(request, messages.ERROR, str(lease.id) + "-Error object was modified by another user. Please reopen object.")
                    return redirect(uri)
                lease.save()
                messages.add_message(request, messages.SUCCESS, str(lease.id) + "--SUCCESS Object saved successfully")
                return redirect(uri)
        else:
            form = LeaseForm(instance=post)
            # lease = post.lease
        return render(request, 'object_edit.html', {'form': form, 'title': title, 'extend': EXTEND})
To test, I just open the same record in two browsers and try to edit it in both in parallel (I use the same user).
The problem is with how websites work. Currently your locking works like this:
1. GET request loads data
2. you lock the row, load, send to browser, unlock
3. you edit data on the browser
4. POST data to server
5. you lock the row, update, unlock
There is nothing here that keeps the row locked. Multiple requests can send data to the server, since the row is unlocked after each request. This is the way web systems usually work: each request is separate, and locking has to be done differently.
As for row versioning, if I understand the code correctly you don't send the current version number to the browser; you only read it while updating. That means you always see the latest number. You should send the current version number to the browser and, on POST, compare that number to the current number in the database. This way, if something has changed in between, you detect it.
I may be wrong about this part since I'm not familiar with Django, but this is my understanding of how the code works. I would assume the form might carry the current row version, but it's not compared to anything; only the current number in the database and the one after the update are used (and the update itself may change the number).
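A rough Django sketch of that suggestion (under the assumption that Lease has an integer version field; the form and model names come from the question, everything else is illustrative):
# forms.py: carry the version the user originally loaded as a hidden field
class LeaseForm(forms.ModelForm):
    class Meta:
        model = Lease
        fields = ['version']  # plus the editable fields
        widgets = {'version': forms.HiddenInput()}

# views.py: compare the submitted version with what is in the database now
if form.is_valid():
    submitted_version = form.cleaned_data['version']
    current_version = Lease.objects.get(pk=pk).version
    if current_version != submitted_version:
        messages.error(request, "Object was modified by another user. Please reopen it.")
        return redirect(uri)
    lease = form.save(commit=False)
    lease.version = current_version + 1  # bump the version on every successful save
    lease.save()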
I need to scrape the data of each item from a website using Scrapy (http://example.com/itemview). I have a list of itemIDs and I need to pass each one in a form on example.com.
There is no url change for each item. So for each request in my spider the url will always be the same. But the content will be different.
I don't want a for loop for handling each request, so I followed the steps below.
1. started the spider with the above url
2. added item_scraped and spider_closed signals
3. passed through several functions
4. passed the scraped data to the pipeline
5. triggered the item_scraped signal
After this the spider_closed signal is called automatically, but I want the above steps to continue until all the itemIDs are finished.
class ExampleSpider(scrapy.Spider):
    name = "example"
    allowed_domains = ["example.com"]
    itemIDs = [11111, 22222, 33333]
    current_item_num = 0

    def __init__(self, itemids=None, *args, **kwargs):
        super(ExampleSpider, self).__init__(*args, **kwargs)
        dispatcher.connect(self.item_scraped, signals.item_scraped)
        dispatcher.connect(self.spider_closed, signals.spider_closed)

    def spider_closed(self, spider):
        self.driver.quit()

    def start_requests(self):
        request = self.make_requests_from_url('http://example.com/itemview')
        yield request

    def parse(self, response):
        self.driver = webdriver.PhantomJS()
        self.driver.get(response.url)
        first_data = self.driver.find_element_by_xpath('//div[@id="itemview"]').text.strip()
        yield Request(response.url, meta={'first_data': first_data}, callback=self.processDetails, dont_filter=True)

    def processDetails(self, response):
        itemID = self.itemIDs[self.current_item_num]
        # ...form submission with the current itemID goes here...
        # ...the content of the page is updated with the given itemID...
        yield Request(response.url, meta={'first_data': response.meta['first_data']}, callback=self.processData, dont_filter=True)

    def processData(self, response):
        # ...some more scraping goes here...
        item = ExamplecrawlerItem()
        item['first_data'] = response.meta['first_data']
        yield item

    def item_scraped(self, item, response, spider):
        self.current_item_num += 1
        # I need to call the processDetails function here for the next itemID,
        # and the process needs to continue till the itemIDs finish
        self.parse(response)
My pipeline:
class ExampleDBPipeline(object):
    def process_item(self, item, spider):
        MYCOLLECTION.insert(dict(item))
        return item
I wish I had an elegant solution to this; instead, here is a hackish way of calling into the underlying classes.
self.crawler.engine.slot.scheduler.enqueue_request(scrapy.Request(url,self.yourCallBack))
However, you can yield a request after you yield the item and have it callback to self.processDetails. Simply add this to your processData function:
yield item
self.counter += 1
yield scrapy.Request(response.url, callback=self.processDetails, dont_filter=True, meta={"your": "Dictionary"})
Also, PhantomJS can be nice and make your life easy, but it is slower than regular connections. If possible, find the request for json data or whatever makes the page unparseable without JS. To do so, open up chrome, right click, click inspect, go to the network tab, then enter the ID into the form, then look at the XHR or JS tabs for a JSON that has the data or next url you want. Most of the time, there will be some url made by adding the ID, if you can find it, you can just concatenate your urls and call that directly without having the cost of JS rendering. Sometimes it is randomized, or not there, but I've had fair success with it. You can then also use that to yield many requests at the same time without having to worry about phantomJS trying to do two things at once or having to initialize many instances of it. You could use tabs, but that is a pain.
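For example, if such an endpoint turns out to exist (purely illustrative; the URL pattern and the JSON field names are assumptions about what you might find in the network tab, not something from the question):
import json
import scrapy
# ExamplecrawlerItem would be imported from your items module

class ExampleSpider(scrapy.Spider):
    name = "example"
    itemIDs = [11111, 22222, 33333]

    def start_requests(self):
        # build one request per itemID directly, no PhantomJS needed
        for itemID in self.itemIDs:
            url = 'http://example.com/api/itemview?id={}'.format(itemID)
            yield scrapy.Request(url, callback=self.parse_json, dont_filter=True)

    def parse_json(self, response):
        data = json.loads(response.body)
        item = ExamplecrawlerItem()
        item['first_data'] = data.get('first_data')  # assumed field name
        yield item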
Also, I would use a Queue of your IDs to ensure thread safety. Otherwise, you could have processDetails called twice on the same ID, though in the logic of your program everything seems to go linearly, which means you aren't using the concurrency capabilities of Scrapy and your program will go more slowly. To use Queue add:
import Queue

# inside the class definition add
itemIDQueue = Queue.Queue()

# within __init__ add
[self.itemIDQueue.put(ID) for ID in self.itemIDs]

# within processDetails, replace itemID = self.itemIDs[self.current_item_num] with
itemID = self.itemIDQueue.get()
And then there is no need to increment the counter and your program is thread safe.