When cookies expire at the end of the session, how to scrape data? - cookies

When cookies keep expiring, how do I scrape a website?
I want to scrape my timetable data from my university's website. However, the cookie I get after logging in expires immediately if the session ends OR a new session is started. This makes scraping with a saved cookie impossible. I wonder whether a username and password could directly replace the cookie during scraping, or whether I could just keep the session alive.
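One possible approach, sketched below, is to log in with the credentials at the start of every run and let a cookie jar carry the fresh session cookie for the rest of that run, so no stored cookie ever has to survive between runs. This is only a sketch using the standard library; the URLs and the form field names `username` and `password` are assumptions and have to be taken from the real login form.

```python
import urllib.parse
import urllib.request
from http.cookiejar import CookieJar


def make_session():
    """Return (opener, jar); the opener stores cookies and resends them."""
    jar = CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def fetch_timetable(login_url, timetable_url, username, password):
    """Log in with credentials, then fetch a page within the same session."""
    opener, _jar = make_session()
    # Hypothetical field names; inspect the site's login form for the real ones.
    form = urllib.parse.urlencode({"username": username, "password": password})
    # Passing data makes this a POST; any Set-Cookie in the reply lands in the jar.
    with opener.open(login_url, data=form.encode("utf-8")) as resp:
        resp.read()
    # The same opener sends the session cookie back automatically.
    with opener.open(timetable_url) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

Calling `fetch_timetable(...)` once per scraping run means each run gets its own fresh session. Sites that add a CSRF token or a JavaScript-driven login need extra steps that this sketch does not cover.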

Related

Send Ajax request with cookie from 3rd Party Iframe - Safari 14+

I have a server side application that uses cookies for session management. The browser has some script that sends an ajax request to add information to the session. This is working well and in production.
The business wants to be able to insert this application in other companies' websites via iframes. ie myapp.com is in an iframe in otherbusiness.com and when the user clicks a button in the application in the iframe launched from myapp.com, it sends a request with a cookie that contains the session id to update the user's session on the myapp.com server.
For the browser to be able to send the cookie, 3rd party cookies need to be enabled by setting the cookie attributes SameSite=None and Secure. This works for all browsers except Safari.
Safari no longer accepts 3rd party cookies.
The only solution I can come up with is to use session ids in the URL but this is a little cumbersome.
Can anyone suggest a better option or perhaps a good implementation of session ids in the url?
I used hidden html fields to pass the session id and expiration.
My server-side code checks for a cookie; if it cannot find one, it looks for the session id and expiration in the hidden fields.
This avoids security issues with passing the id in the url. It is a little clumsy to implement but it works.

How do browsers manage session cookies?

I know there are session cookies and persistent cookies. As far as I can understand, session cookies are managed by browsers (e.g. ended when closing the browser). So my questions are: How do browsers end session cookies? Do they send some sort of request to a server that you technically could also do manually?
Some browsers, like Chrome, have the option to "start from where you left off" by NOT ending the sessions. How does this technically work? How are the sessions kept alive? Even after restarting the OS, the sessions are still alive, just as if they were converted to persistent cookies.
From a technical perspective, the definition of a Cookie can be found here. Loosely, think of a cookie as a piece of data returned by the Web Server you connected to in the past. This data is associated with the host that returned that data and can never be seen by other hosts. When you subsequently connect to the host in the future, the previously returned value (the cookie value) is sent back to the server. This allows the server to generate some data that can be used to "remember you" when you subsequently come back.
A session cookie is still "just a cookie" but is used to maintain the "state of the session". For example, imagine a shopping cart. If you place items in your cart, the server will send back a cookie value that is a key used to find your cart again. If you place items in your cart today and come back tomorrow, the server can use the cookie value to look up your cart.
As for "ending a session" ... this can be done at the browser by asking the browser to "forget" the cookie, so that when you subsequently visit the web site there is no cookie and hence the site has no knowledge of your past interactions. Alternatively, the server can choose to ignore any cookie value you send. A cookie can also carry an expiry time, so that once that time has passed the cookie evaporates. Finally, the server can ask for the cookie value to be replaced or deleted when you next visit it.
I would suggest searching for how cookies work in general, as there are a lot of good references on how they work and how they are used.
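To make the difference concrete, here is a small sketch that classifies a `Set-Cookie` header value. It assumes the usual convention: a cookie with no `Expires` or `Max-Age` attribute is a session cookie that the browser drops on its own, and a `Max-Age` of zero or less is the server asking for deletion. The function name `describe_cookie` is made up for illustration.

```python
from http.cookies import SimpleCookie


def describe_cookie(set_cookie_header):
    """Classify one Set-Cookie header value as session, persistent or deletion."""
    jar = SimpleCookie()
    jar.load(set_cookie_header)
    morsel = next(iter(jar.values()))  # the single cookie in the header
    max_age = morsel["max-age"]
    # Max-Age <= 0 tells the browser to discard the cookie right away.
    if max_age != "" and int(max_age) <= 0:
        return "deletion"
    # Any expiry attribute makes the cookie outlive the browser session.
    if max_age or morsel["expires"]:
        return "persistent"
    # No expiry information at all: the browser forgets it when the session ends.
    return "session"
```

In other words, no request is sent to the server when a session cookie ends; the browser simply stops sending it.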

Avoid updating session cookie expire time on request to django

I'm trying to ping Django from a javascript frontend to find out when a user's session will expire. I'm doing this so I can proactively notify a user when their session has expired.
Unfortunately, the session expire time is updated because I'm hitting the Django app. I've tried reading the session cookie from javascript, but it is not accessible (nor recommended to be accessible) from javascript.
How can I ping my Django app from javascript to get when the session will end?
What about passing the number of seconds until the session expires directly to your template/javascript? For example, you can get it using this method in your view function and pass it on.
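A minimal sketch of that idea, assuming the method meant is Django's `request.session.get_expiry_age()`, which returns the number of seconds until the session expires. The helper name `session_expiry_context` and the context key `session_seconds_left` are made up for illustration.

```python
def session_expiry_context(request):
    """Build template context holding the seconds left in the session.

    Assumes `request.session` is a Django session object, whose
    get_expiry_age() returns the number of seconds until expiry.
    """
    return {"session_seconds_left": request.session.get_expiry_age()}
```

In a view you would pass this dict to `render(request, "page.html", context)` (the template name is hypothetical) and start a countdown in javascript from that value, so no extra ping to the server is needed.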

When django session is created

I don't really understand when a session is created and per what entity it is created (per ip, per browser, per logged in user). I see in the documentation that sessions by default are created per visitor - but what is a visitor (browser or ip)?
What are HTTP sessions?
To display a webpage your browser sends an HTTP request to the server, and the server sends back an HTTP response. Each time you click a link on a website a new HTTP transaction takes place, i.e. it is not a connection that is persistent over time (like a phone call). Your communication with a website consists of many monolithic HTTP transactions (tens or hundreds of phone calls, each phone call being a few words).
So how can the server remember information about a user, for instance that a user is logged in (ip addresses are not reliable)? The first time you visit a website, the server creates a random string, and in the HTTP response it asks the browser to create a so called HTTP cookie with that value. A cookie is really just a name (of the cookie) and a value. If you go to a simple session-enabled Django site, the server will ask your browser to set a cookie named 'sessionid' with such a randomly generated value.
The subsequent times your browser will make HTTP requests to that domain, it will include the cookie in the HTTP request.
The server saves these session ids (for django the default is to save in the database) and it saves them together with so called session variables. So based on the session id sent along with an HTTP request it can dig out previously set session variables as well as modify or add session variables. If you delete your cookies (ctrl+shift+delete in Firefox), you will realize that no website remembers you anymore (Gmail, Facebook, Django sites, etc.) and you have to log in again. Most browsers will allow you to disable cookies in general or for specific sites (for privacy reasons) but this means that you can not log into those websites.
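The mechanism described above can be sketched in a few lines. This is a toy in-memory version for illustration only, not how Django implements it (Django's default stores sessions in the database); the class name `SessionStore` is made up.

```python
import secrets


class SessionStore:
    """Toy server-side store mapping session ids to session variables."""

    def __init__(self):
        self._sessions = {}

    def create(self):
        """Create a session and return its id, to be sent as a cookie value."""
        session_id = secrets.token_urlsafe(32)  # hard-to-guess random string
        self._sessions[session_id] = {}
        return session_id

    def get(self, session_id):
        """Return the variables for the id from the cookie, or None if unknown."""
        return self._sessions.get(session_id)

    def delete(self, session_id):
        """Forget a session, e.g. on logout; the old cookie then matches nothing."""
        self._sessions.pop(session_id, None)
```

The browser only ever holds the id. Everything the server "remembers" lives in the store, which is why deleting the cookie (or the server-side entry) logs you out.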
Per browser, per window, per tab, per ip?
It is not possible to log into different GMail accounts within the same browser, not even from different windows. But it is possible to log in to one account with Firefox and another with Chrome. So the answer is: per browser. However, it is not always that simple. You can use different profiles in Firefox, and each can keep different cookies and thus you can log into different accounts simultaneously. There are also Firefox plugins for keeping multiple sessions, e.g. MultiFox.
The session all depends on which session cookie your browser sends in its HTTP request.
Play around
To get the full understanding of what is going on, I recommend installing the FireBug and FireCookie plugins for Firefox. The above screenshots are taken from FireBug's net panel. FireCookie will give you an overview of when and which cookies are set when you visit a site, and will let you regulate which cookies are allowed.
If there is a server-side error and you have DEBUG=True, then the Django error message will show you information about the HTTP request, including the cookies sent.
It's browser (not IP). A session is basically data stored on your server that is identified by a session id sent as a cookie to the browser. The browser will send the cookie back containing the session id on all subsequent requests either until the browser is closed or the cookie expires (depending on the expires value that is sent with the cookie header, which you can control from Django with set_expiry).
The server can also expire sessions by basically ignoring the (unexpired) cookie that the browser sends and requiring a new session to be started.
There is a great description on how sessions work here.

Get cookie according to domain

I want to reset all previous cookies for a particular domain.
Is there any way I can get all the cookies for a particular domain? Right now I have cookies for google and my site. I want cookies only for my site.
Expiring (removing) a cookie uses the same command as creating a cookie. The cookie value is left blank and the expiration time needs to be in the past.
To expire the cookie ‘mycookie’ use:
setcookie('mycookie','',1);
To retrieve cookie information, use:
// Print a cookie
echo $_COOKIE["mycookie"];
// View all cookies
print_r($_COOKIE);
You cannot get any more information than the information you store in the cookie. The cookie is not stored on the server, but on the client computer, that is the immediate reason why you can't get more information about the cookie.
I hope this is sufficient information to be an answer to you.