How do I check if a user has entered the URL from another website in Django? - django

I want an effect to be applied when a user is entering my website. So therefore I want to check for when a user is coming from outside my website so the effect isnt getting applied when the user is surfing through different urls inside the website, but only when the user is coming from outside my website

You can't really check for where a user has come from specifically. You can check if the user has just arrived on your site by setting a session variable when they load one of your pages. You can check for it before you set it, and if they don't have it, then they have just arrived and you can apply your effect. There's some good examples of how sessions work here: https://developer.mozilla.org/en-US/docs/Learn/Server-side/Django/Sessions
There's a couple of ways to handle this. If you are using function based views, you can just create a separate util function and include it at the top of every page, eg,
utils.py
def first_visit(request):
"""returns the answer to the question 'first visit for session?'
make sure SESSION_EXPIRE_AT_BROWSER_CLOSE set to False in settings for persistance"""
if request.session['first_visit']:
#this is not the first session because the session variable is used.
return False
else:
#This is the first visit
...#do something
#set the session variable so you only do the above once
request.session[first_visit'] = True
return True
views.py
from utils.py import first_visit
def show_page(request):
first_visit = first_visit(request)
This approach gives you some control. For example, you may not want to run it on pages that require login, because you will already have run it on the login page.
Otherwise, the best approach depends on what will happen on the first visit. If you want just to update a template (eg, perhaps to show a message or run a script on th epage) you can use a context processor which gives you extra context for your templates. If you want to interrupt the request, perhaps to redirect it to a separate page, you can create a simple piece of middleware.
docs for middleware
docs for context processors
You may also be able to handle this entirely by javascript. This uses localStorage to store whether or not this is the user's first visit to the site and displays the loading area for 5 seconds if there is nothing in localStorage. You can include this in your base template so it runs on every page.
function showMain() {
document.getElementByID("loading").style.display = "none";
document.getElementByID("main").style.display = "block";
}
const secondVisit = localStorage.getItem("secondVisit");
if (!secondVisit) {
//show loading screen
document.getElementByID("loading").style.display = "block";
document.getElementByID("main").style.display = "none";
setTimeout(5000, showMain)
localStorage.setItem("secondVisit", "true" );
} else {
showMain()
}

Related

Flask session cookie reverting

I am sure I am probably being stupid but struggling to wrap my head around this one.
I have a flask website and I am setting up a checkout page for it so users can add their items to the cart etc. Everything was going great, I was able to add items to the cart, get a total etc (using sessions) however when I have tried to implement the ability for users to update the cart on the checkout page, when my form posts, the session data only survives the initial load. The print statement shows the data I am collecting is fine, and the session cookie is set initially, as everything updates, however the moment I then change page, it reverts to whatever it was before I made the update.
#views.route("/shopping-cart", methods=['GET','POST'])
def to_cart():
clear_cart = 'views.to_clear_cart'
if 'shopping' in session:
shopping_list = session['shopping']
sum_list = []
for quantity, price in shopping_list.values():
sum_list.append(quantity * price)
total = sum(sum_list)
if request.method == "POST":
new_quantity = int(request.form.get('quantity'))
product_id = request.form.get('issue')
unit_price = int(request.form.get('price'))
print(new_quantity, product_id, unit_price)
shopping_list[f'{product_id}'] = [new_quantity, unit_price]
return redirect(url_for('views.to_cart'))
return render_template("cart.html",
shopping_list=shopping_list,
total=total,
clear_cart=clear_cart,
)
else:
return render_template("cart.html",
clear_cart=clear_cart
)
I just do not really understand why it is not updating as from what I can tell, the code is running fine, and it does update, but then the session cookie just reverts itself to whatever it was before (using browser side cookies for this for testing).
Any help appreciated!!
After much confusion as everything seemed to be working absolutely fine after I rewrote this in about 5 different ways and printed half the app in the console, I finally found the answer and it is indeed me being an idiot.
It turns out if you modify a value in place rather than creating or deleting it does not automatically save the session state and you just need to state explicitly that it has been modified.
Turns out the answer was as simple as this line of code.
session.modified = True

Storing in Django Sessions

I have a ReactJS component inside a Django template, where a user clicks on a checkout button, posts the item_code and gets redirected to checkout:
onCheckout = () => {
fetch("/onCheckout/", {
method: "POST",
body: JSON.stringify({'item': this.props.item_info.code})
}).then(window.location.replace("/checkout"))
}
A Django view receives the request and stores it in a session.
def onCheckout(request):
if request.method == "POST":
items = request.session.get('items', [])
new_item = json.loads(request.body.decode('utf-8'))['item']
items.append(new_item)
request.session['items'] = items
I am having a issue with storing data in the session. After the first item gets stored correctly in the array, and I then checkout on a second item, the items array starts acting up:
(Pdb) items
['15130BC.ZZ.8042BC.01']
(Pdb) new_item
'5213G-001'
(Pdb) items
['15130BC.ZZ.8042BC.01']
(Pdb) items
['5213G-001']
If I try to access request.session['item'] from any other view function, I get a KeyError.
I am fairly new to Django, any help would be appreciated. Also, I would like to know if there are better alternatives to accomplish the above.
Sessions Config
settings.SESSION_ENGINE = 'django.contrib.sessions.backends.db'
settings.SESSION_CACHE_ALIAS = 'default'
settings.CACHES = {'default': {'BACKEND': 'django.core.cache.backends.locmem.LocMemCache'}}
Some reading on change detection for Django sessions: https://docs.djangoproject.com/en/2.0/topics/http/sessions/#when-sessions-are-saved
Based on your code, it appears to me that the change detection should happen. However, let's try to brute force this, can you add the following line as the last line of your code: request.session.modified = True - see if this fixes your issue?
Update: some basic checks
Can you verify the following
Check if your db backend is configured priestly
If you want to use a database-backed session, you need to add 'django.contrib.sessions' to your INSTALLED_APPS setting. Once you have configured your installation, run manage.py migrate to install the single database table that stores session data.
Check if your session Middleware is enabled
Sessions are implemented via a piece of middleware. The default settings.py created by django-admin startproject has SessionMiddleware activated. To enable session functionality, edit the MIDDLEWARE_CLASSES setting and make sure it contains 'django.contrib.sessions.middleware.SessionMiddleware'.
Update 2: Test the session
Maybe modify a style existing endpoint as follows and see if you are able to store values and persist them in session :
test_keys = request.session.get('test_keys', [])
test_keys.append(random.randint())
request.session['test_keys'] = test_keys
return Response(request.session.get('test_keys', []))
You should see that each time you hit the api, you get a list with one new integer in it in addition to all past values. Lmk how this goes.

Scraping data off flipkart using scrapy

I am trying to scrape some information from flipkart.com for this purpose I am using Scrapy. The information I need is for every product on flipkart.
I have used the following code for my spider
from scrapy.contrib.spiders import CrawlSpider, Rule
from scrapy.contrib.linkextractors import LinkExtractor
from scrapy.selector import HtmlXPathSelector
from tutorial.items import TutorialItem
class WebCrawler(CrawlSpider):
name = "flipkart"
allowed_domains = ['flipkart.com']
start_urls = ['http://www.flipkart.com/store-directory']
rules = [
Rule(LinkExtractor(allow=['/(.*?)/p/(.*?)']), 'parse_flipkart', cb_kwargs=None, follow=True),
Rule(LinkExtractor(allow=['/(.*?)/pr?(.*?)']), follow=True)
]
#staticmethod
def parse_flipkart(response):
hxs = HtmlXPathSelector(response)
item = FlipkartItem()
item['featureKey'] = hxs.select('//td[#class="specsKey"]/text()').extract()
yield item
What my intent is to crawl through every product category page(specified by the second rule) and follow the product page(first rule) within the category page to scrape data from the products page.
One problem is that I cannot find a way to control the crawling and scrapping.
Second flipkart uses ajax on its category page and displays more products when a user scrolls to the bottom.
I have read other answers and assessed that selenium might help solve the issue. But I cannot find a proper way to implement it into this structure.
Suggestions are welcome..:)
ADDITIONAL DETAILS
I had earlier used a similar approach
the second rule I used was
Rule(LinkExtractor(allow=['/(.?)/pr?(.?)']),'parse_category', follow=True)
#staticmethod
def parse_category(response):
hxs = HtmlXPathSelector(response)
count = hxs.select('//td[#class="no_of_items"]/text()').extract()
for page num in range(1,count,15):
ajax_url = response.url+"&start="+num+"&ajax=true"
return Request(ajax_url,callback="parse_category")
Now i was confused on what to use for callback "parse_category" or "parse_flipkart"
Thank you for your patience
Not sure what you mean when you say that you can't find a way to control the crawling and scraping. Creating a spider for this purpose is already taking it under control, isn't it? If you create proper rules and parse the responses properly, that is all you need. In case you are referring to the actual order in which the pages are scraped, you most likely don't need to do this. You can just parse all the items in whichever order, but gather their location in the category hierarchy by parsing the breadcrumb information above the item title. You can use something like this to get the breadcrumb in a list:
response.css(".clp-breadcrumb").xpath('./ul/li//text()').extract()
You don't actually need Selenium, and I believe it would be an overkill for this simple issue. Using your browser (I'm using Chrome currently), press F12 to open the developer tools. Go to one of the category pages, and open the Network tab in the developer window. If there is anything here, click the Clear button to clear things up a bit. Now scroll down until you see that additional items are being loaded, and you will see additional requests listed in the Network panel. Filter them by Documents (1) and click on the request in the left pane (2). You can see the URL for the request (3) and the query parameters that you need to send (4). Note the start parameter which will be the most important since you will have to call this request multiple times while increasing this value to get new items. You can check the response in the Preview pane (5), and you will see that the request from the server is exactly what you need, more items. The rule you use for the items should pick up those links too.
For a more detail overview of scraping with Firebug, you can check out the official documentation.
Since there is no need to use Selenium for your purpose, I shall not cover this point more than adding a few links that show how to use Selenium with Scrapy, if the need ever occurs:
https://gist.github.com/cheekybastard/4944914
https://gist.github.com/irfani/1045108
http://snipplr.com/view/66998/

How to check user is authenticated properly in Django and Backbone.js

I want to build single page application using Backbone.js and Django.
For checking user is authenticated or not,
I wrote a method get_identity method in django side.
If request.user.is_authenticated is true it returns request.user.id otherwise it returns Http404
In backbone side, I defined a User model and periodically make ajax call to get_identity.
I think it is the most straightforward way to check user is authenticated or not.
For learning single page application, I want to do this operation more sensible and efficient than this way if it is possible.
So what is your advice about this? When I search Django+Backbone.js + User Authentication, I couldn't find any satisfactory result and I really wonder how people do this simple operation.
Any help or idea will be appreciated.
(By the way I tried to read cookie periodically but HttpOnly True flagged cookies are not reacheable in client side.)
Django views.py
def get_identity(request):
if not request.user.is_authenticated():
raise Http404
return HttpResponse(json.dumps({'identity':request.user.id}), mimetype="application/json")
Backbone.js side.
updateUser:function(){
var $self=this;
$.ajaxSetup({async:false});
$.get(
'/get_identity',
function(response){
// update model...
$self.user.id =response.identity;
//check user every five minutes...
$self.user.fetch({success: function() {
$self.user.set('is_authenticated',true);
setTimeout($self.updateUser, 1000*60*1);
}
},this);
}).fail(function(){
//clear model
$self.user.clear().set($self.user.defaults);
setTimeout($self.updateUser, 1000*60*1);
});
$.ajaxSetup({async:true});
}
I had var is_authenticated = {{request.user.is_authenticated}}; in my base.html
and used the global variable to check.
I'm in pursuit of better solution (because this breaks when you start caching).
But you might find it useful.

Same random record across website

I am building a website that shows a different background every time a user enters it. This background is then used across the website with the same user and session.
So basically, a user entres the homepage, gets a background and that image won't change until the user closes the website or opens up a new page. I think you understand what I mean.
I know how to get a random record from the database using Django, but I'm not sure how to keep that record persistent across the website, because if I pull it on every view, I'll get a different image on different pages.
So my "index" view could be calling
bgimage = BackgroundImage.objects.random()
But then I have a problem. How can I get this random record unchanged across all the other views. Is that possible? Should I be looking into sessions, cookies?
Thank you!
you could use sessions - something like
if 'bgimage' not in request.session:
bgimage = BackgroundImage.objects.random()
request.session['bgimage'] = bgimage.pathtoimage
Context processor:
def bgimage(request):
if 'bg_image' not in request.session:
image = BackgroundImage.objects.random()
request.session['bg_image'] = image.file
return {'background_image' : request.session['image']}