I'm wondering what is the default Django policy for CSRF generation? Are they created per page or per session? And if it is per session, why is it chosen? Isn't it less secure than per-page CSRF?
Are they created per page or per session?
From Django's official documentation:
A CSRF cookie that is based on a random secret value, which other sites will not have access to.
This cookie is set by CsrfViewMiddleware. It is sent with every response that has called django.middleware.csrf.get_token() (the function used internally to retrieve the CSRF token), if it wasn’t already set on the request.
In order to protect against BREACH attacks, the token is not simply the secret; a random salt is prepended to the secret and used to scramble it.
For security reasons, the value of the secret is changed each time a user logs in
That means the secret used to generate the CSRF tokens is generated per-session (kind of).
When validating the ‘csrfmiddlewaretoken’ field value, only the secret, not the full token, is compared with the secret in the cookie value. This allows the use of ever-changing tokens. While each request may use its own token, the secret remains common to all.
This check is done by CsrfViewMiddleware.
That means if we want, we can generate different CSRF token according to our needs (e.g. per-page) but the secret will remain the same.
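To make the salt-and-secret idea concrete, here is a minimal sketch of how a fixed secret can yield a different token every time. It is modeled on the description quoted above, not copied from Django's source; the names `new_secret`, `mask`, `unmask` and the 32-character length are assumptions made for illustration.

```python
import secrets
import string

CHARS = string.ascii_letters + string.digits
SECRET_LENGTH = 32  # assumed length, chosen for illustration


def new_secret():
    """Return a random secret; this plays the role of the per-user secret."""
    return "".join(secrets.choice(CHARS) for _ in range(SECRET_LENGTH))


def mask(secret):
    """Return salt + scrambled secret; a fresh salt gives a fresh token."""
    salt = new_secret()
    scrambled = "".join(
        CHARS[(CHARS.index(s) + CHARS.index(k)) % len(CHARS)]
        for s, k in zip(secret, salt)
    )
    return salt + scrambled


def unmask(token):
    """Recover the secret from a token produced by mask()."""
    salt, scrambled = token[:SECRET_LENGTH], token[SECRET_LENGTH:]
    return "".join(
        CHARS[(CHARS.index(c) - CHARS.index(k)) % len(CHARS)]
        for c, k in zip(scrambled, salt)
    )


secret = new_secret()
token_a, token_b = mask(secret), mask(secret)
```

Both tokens look unrelated, yet unmasking either one gives back the same secret, which is why only the secret has to be compared during validation.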
You might want to read Radwan's answer as well.
Reading through Django's Official Documentation about CSRF is really helpful
It explains how the protection works internally as middleware; see point 3 of this section:
For all incoming requests that are not using HTTP GET, HEAD, OPTIONS or TRACE, a CSRF cookie must be present, and the ‘csrfmiddlewaretoken’ field must be present and correct. If it isn’t, the user will get a 403 error.
When validating the ‘csrfmiddlewaretoken’ field value, only the secret, not the full token, is compared with the secret in the cookie value. This allows the use of ever-changing tokens. While each request may use its own token, the secret remains common to all.
This check is done by CsrfViewMiddleware.
This means that you will have the flexibility to generate tokens Per-Page.
If your requirements call for generating a new CSRF token on each request or page, check out this question and its answers; they are really helpful.
I am using Spring Security OAuth2 and JWT tokens. My question is: How can I revoke a JWT token?
As mentioned here
http://projects.spring.io/spring-security-oauth/docs/oauth2.html, revocation is done by refresh token. But it does not seem to work.
In general the easiest answer would be to say that you cannot revoke a JWT token, but that's simply not true. The honest answer is that the cost of supporting JWT revocation is high enough that most of the time it is not worth it, or that you should simply reconsider an alternative to JWT.
Having said that, in some scenarios you might need both JWT and immediate token revocation, so let's go through what it would take, but first we'll cover some concepts.
JWT (Learn JSON Web Tokens) just specifies a token format, this revocation problem would also apply to any format used in what's usually known as a self-contained or by-value token. I like the latter terminology, because it makes a good contrast with by-reference tokens.
by-value token - associated information, including token lifetime, is contained in the token itself and the information can be verified as originating from a trusted source (digital signatures to the rescue)
by-reference token - associated information is kept on server-side storage that is then obtained using the token value as the key; being server-side storage the associated information is implicitly trusted
Before the JWT Big Bang we already dealt with tokens in our authentication systems; it was common for an application to create a session identifier upon user login that would then be used so that the user did not have to repeat the login process each time. These session identifiers were used as keys into server-side storage, and if this sounds similar to something you recently read, you're right: this indeed classifies as a by-reference token.
Using the same analogy, understanding revocation for by-reference tokens is trivial; we just delete the server-side storage mapped to that key and the next time the key is provided it will be invalid.
For by-value tokens we just need to implement the opposite. When you request the revocation of the token, you store something that allows you to uniquely identify that token, so that the next time you receive it you can additionally check whether it was revoked. If you're already thinking that something like this will not scale, keep in mind that you only need to store the data until the time the token would expire, and in most cases you could probably just store a hash of the token so it would always be something of a known size.
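A minimal sketch of that idea, assuming an in-memory dictionary as a stand-in for whatever shared storage you would really use. The class name `RevocationList`, its methods, and the explicit `now` parameter are hypothetical and exist only for illustration.

```python
import hashlib
import time


class RevocationList:
    """In-memory stand-in for shared storage of revoked token hashes."""

    def __init__(self):
        self._revoked = {}  # token hash -> expiry (Unix seconds)

    @staticmethod
    def _digest(token):
        # A fixed-size hash is stored instead of the token itself.
        return hashlib.sha256(token.encode("utf-8")).hexdigest()

    def revoke(self, token, expires_at):
        self._revoked[self._digest(token)] = expires_at

    def is_revoked(self, token, now=None):
        now = time.time() if now is None else now
        # Entries for tokens that have expired anyway can be dropped.
        self._revoked = {h: exp for h, exp in self._revoked.items() if exp > now}
        return self._digest(token) in self._revoked


revocations = RevocationList()
revocations.revoke("header.payload.signature", expires_at=1000)
```

The resource server would call `is_revoked()` after the normal signature and expiry checks, so the list only ever needs to hold tokens that are still within their lifetime.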
As a last note and to center this on OAuth 2.0, the revocation of by-value access tokens is currently not standardized. Nonetheless, the OAuth 2.0 Token revocation specifically states that it can still be achieved as long as both the authorization server and resource server agree to a custom way of handling this:
In the former case (self-contained tokens), some (currently non-standardized) backend interaction between the authorization server and the resource server may be used when immediate access token revocation is desired.
If you control both the authorization server and resource server this is very easy to achieve. On the other hand if you delegate the authorization server role to a cloud provider like Auth0 or a third-party component like Spring OAuth 2.0 you most likely need to approach things differently as you'll probably only get what's already standardized.
An interesting reference
This article explains another way to do that: Blacklist JWT
It contains some interesting practices and patterns that follow RFC7523
A JWT can't be revoked.
But here is an alternative solution, called the JWT old-for-new exchange scheme.
Because we can't invalidate an issued token before its expiry time, we always use short-lived tokens, for example 30 minutes.
When the token expires, we exchange the old token for a new one. The critical point is that one old token can be exchanged for only one new token.
In center auth server, we maintain a table like this:
table auth_tokens(
user_id,
jwt_hash,
expire
)
user_id is contained in the JWT string.
jwt_hash is a hash of the whole JWT string, for example SHA-256.
The expire field is optional.
The workflow is as follows:
The user calls the login API with a username and password; the auth server issues one token and registers it (adds one row to the table).
When the token expires, the user calls the exchange API with the old token. The auth server first validates the old token as usual, except for the expiry check, then computes the token's hash and looks up the table by user ID:
If a record is found and its user_id and jwt_hash match, issue a new token and update the table.
If a record is found but its jwt_hash does not match, someone has already exchanged this token for a new one. The token has been compromised: delete the records for that user_id and respond with an alert.
If no record is found, the user needs to log in again, or just enter the password.
When the user changes the password or logs out, delete the record for that user ID.
To keep using a token, both the legitimate user and the attacker must keep exchanging it for new ones, but only one of them can succeed; when one fails, both need to log in again at the next exchange.
So if an attacker obtains the token, it can be used only for a short time, because the token's validity period is short, and it cannot be exchanged for a new one once the legitimate user has exchanged it. This is more secure.
If there is no attacker, a normal user still needs to exchange the token periodically, for example every 30 minutes, which works like an automatic login. The extra load is not high, and the expiry time can be adjusted for the application.
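The exchange workflow above can be sketched roughly as follows. This is a toy model under loud assumptions: opaque random strings stand in for signed JWTs, signature and expiry validation are left out, and the class `ExchangeServer` with its methods is hypothetical.

```python
import hashlib
import secrets


class ExchangeServer:
    """Toy auth server; opaque random strings stand in for signed JWTs."""

    def __init__(self):
        self.auth_tokens = {}  # user_id -> jwt_hash of the latest token

    @staticmethod
    def _hash(token):
        return hashlib.sha256(token.encode("utf-8")).hexdigest()

    def _issue(self, user_id):
        token = f"{user_id}.{secrets.token_urlsafe(16)}"
        self.auth_tokens[user_id] = self._hash(token)  # register the new token
        return token

    def login(self, user_id):
        # Password checking is omitted in this sketch.
        return self._issue(user_id)

    def exchange(self, user_id, old_token):
        stored = self.auth_tokens.get(user_id)
        if stored is None:
            return None  # no record: the user must log in again
        if stored != self._hash(old_token):
            del self.auth_tokens[user_id]  # reuse detected: drop the record
            return None
        return self._issue(user_id)

    def logout(self, user_id):
        self.auth_tokens.pop(user_id, None)


server = ExchangeServer()
first = server.login("alice")
second = server.exchange("alice", first)        # legitimate exchange
replayed = server.exchange("alice", first)      # old token presented again
after_alarm = server.exchange("alice", second)  # record was deleted above
```

Here the replayed token is refused and, because the record is deleted on the mismatch, the holder of the newest token also has to log in again at the next exchange.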
source: http://www.jianshu.com/p/b11accc40ba7
This doesn't exactly answer your question in regards to the Spring framework, but here's an article that talks about why, if you need the ability to revoke JWTs, you might not want to go with JWTs in the first place, and instead use regular, opaque Bearer tokens.
https://www.dinochiesa.net/?p=1388
One way to revoke a JWT is by leveraging a distributed event system that notifies services when refresh tokens have been revoked. The identity provider broadcasts an event when a refresh token is revoked and other backends/services listen for the event. When an event is received the backends/services update a local cache that maintains a set of users whose refresh tokens have been revoked.
This cache is then checked whenever a JWT is verified to determine if the JWT should be revoked or not. This is all based on the duration of JWTs and expiration instant of individual JWTs.
This article, Revoking JWTs, illustrates this concept and has a sample app on Github.
For Googlers:
If you implement pure stateless authentication there is no way to revoke the token as the token itself is the sole source of truth
If you save a list of revoked token IDs on the server and check every request against the list, then it is essentially a variant of stateful authentication
OAuth2 providers like Cognito provide a way to "sign out" a user; however, this only really revokes the refresh token, which is usually long-lived and could be used multiple times to generate new access tokens and thus has to be revoked; the existing access tokens are still valid until they expire
What about storing the JWT token and referencing it to the user in the database? By extending the Guards/Security Systems in your backend application with an additional DB join after performing the JWT comparison, you would be able to practically 'revoke' it by removing or soft-deleting it from the DB.
In general, the answer about tokens by reference vs. tokens by value has nailed it. For those that stumble upon this space in future.
How to implement revocation on RS side:
TL;DR:
Take a cache or db that is visible to all your backend service instances that are verifying tokens. When a new token arrives for revocation, if it's a valid one, (i.e. verifies against your jwt verification algo), take the exp and jti claims, and save jti to cache until exp is reached. Then expire jti in cache once unixNow becomes > exp.
Then, on authorization at other endpoints, you check every time whether the given jti matches something in this cache, and if it does, you return a 403 error saying the token was revoked. Once it expires, the regular Token Expired error kicks in from your verification algo.
P.S. By saving only jti in cache, you make this data useless to anyone since it's just a unique token identifier.
The best solution for JWT revocation, is short exp window, refresh and keeping issued JWT tokens in a shared nearline cache. With Redis for example, this is particularly easy as you can set the cache key as the token itself (or a hash of the token), and specify expiry so that the tokens get automatically evicted.
I found one way of resolving the issue, How to expire already generated existing JWT token using Java?
In this case, we need to use any DB or in-memory where,
Step 1: As soon as the token is generated for the first time for a user, store it in a DB together with its "issuedAt()" time.
I stored it in DB in this JSON format,
Ex: {"username" : "username",
"token" : "token",
"issuedAt" : "issuedAt" }
Step 2: When a web service request arrives for the same user with a token to validate, fetch the "issuedAt()" timestamp from the token and compare it with the stored (DB/in-memory) issued timestamp.
Step 3: If the stored issued timestamp is newer (using the after()/before() methods), report the token as invalid (in this case we are not actually expiring the token; we simply stop granting access with it).
This is how I resolved the issue.
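Steps 1 to 3 can be sketched like this, with a plain dictionary standing in for the DB or in-memory store. The function names are hypothetical, and rejecting users with no stored record is an assumption of the sketch, not something the steps above specify.

```python
from datetime import datetime, timezone

latest_issued_at = {}  # username -> issuedAt of the most recent token


def register_token(username, issued_at):
    """Step 1: remember when the newest token for this user was issued."""
    latest_issued_at[username] = issued_at


def is_token_allowed(username, token_issued_at):
    """Steps 2 and 3: refuse tokens issued before the stored timestamp."""
    stored = latest_issued_at.get(username)
    if stored is None:
        return False  # assumption: unknown users are refused
    return token_issued_at >= stored


old = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
new = datetime(2024, 1, 1, 12, 30, tzinfo=timezone.utc)
register_token("alice", old)
register_token("alice", new)  # a newer token supersedes the old one
```

A token carrying the older timestamp is then refused even though it has not expired, which is the "stop granting access" behaviour described in Step 3.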
I am trying to enforce CSRF for a Django Rest API which is open to anonymous users.
For that matter, I've tried two different approaches:
Extending the selected API views from one CSRFAPIView base view, which has an @ensure_csrf_cookie decorator on the dispatch method.
Using a custom Authentication class based on SessionAuthentication, which applies enforce_csrf() regardless of whether the user is logged in or not.
In both approaches the CSRF check seems to work superficially. In case the CSRF token is missing from the cookie or in case the length of the token is incorrect, the endpoint returns a 403 - Forbidden.
However, if I edit the value of the CSRF token in the cookie, the request is accepted without issue. So I can use a random value for CSRF, as long as it's the correct length.
This behaviour seems to deviate from the regular Django login view, in which the contents of the CSRF do matter. I am testing in local setup with debug/test_environment flags on.
What could be the reason my custom CSRF checks in DRF are not validated in-depth?
Code fragment of the custom Authentication:
class RestCsrfAuthentication(SessionAuthentication):
    def authenticate(self, request):
        self.enforce_csrf(request)
        rotate_token(request)
        return None
And in settings:
REST_FRAMEWORK = {
    'DEFAULT_AUTHENTICATION_CLASSES': [
        'csrfexample.authentication.RestCsrfAuthentication',
    ]
}
The specific contents of CSRF tokens in Django never matter, actually.
This reply by a Django security team member to a question similar to yours says this:
The way our CSRF tokens work is pretty simple. Each form contains a CSRF token, which matches the CSRF cookie. Before we process the protected form, we make sure that the submitted token matches the cookie. This is a server-side check, but it's not validating against a stored server-side value. Since a remote attacker should not be able to read or set arbitrary cookies on your domain, this protects you.
Since we're just matching the cookie with the posted token, the data is not sensitive (in fact it's completely arbitrary - a cookie of "zzzz" works just fine), and so the rotation/expiration recommendations don't make any difference. If an attacker can read or set arbitrary cookies on your domain, all forms of cookie-based CSRF protection are broken, full stop.
(Actually "zzzz" won't work because of length requirements, but more on that later.) I recommend reading the entire mailing list message for a fuller understanding. There are explanations there about how Django is peculiar among frameworks because CSRF protections are independent of sessions.
I found that mailing list message via this FAQ item on the Django docs:
Is posting an arbitrary CSRF token pair (cookie and POST data) a vulnerability?
No, this is by design. Without a man-in-the-middle attack, there is no way for an attacker to send a CSRF token cookie to a victim’s browser, so a successful attack would need to obtain the victim’s browser’s cookie via XSS or similar, in which case an attacker usually doesn’t need CSRF attacks.
Some security audit tools flag this as a problem but as mentioned before, an attacker cannot steal a user’s browser’s CSRF cookie. “Stealing” or modifying your own token using Firebug, Chrome dev tools, etc. isn’t a vulnerability.
(Emphasis mine.)
The message is from 2011, but it's still valid, and to prove it let's look at the code. Both Django REST Framework's SessionAuthentication and the ensure_csrf_cookie decorator use core Django's CsrfViewMiddleware (source). In that middleware class's process_view() method, you'll see that it fetches the CSRF cookie (a cookie named csrftoken by default), and then the posted CSRF token (part of the POSTed data, with a fallback to reading the X-CSRFToken header). After that, it runs _sanitize_token() on the POSTed/X-CSRFToken value. This sanitization step is where the check for the correct token length happens; this is why you're getting 403s as expected when you provide shorter or longer tokens.
After that, the method proceeds to compare the two values using the function _compare_salted_tokens(). If you read that function, and all the further calls that it makes, you'll see that it boils down to checking if the two strings match, basically without regard to the values of the strings.
This behaviour seems to deviate from the regular Django login view, in which the contents of the CSRF do matter.
No, it doesn't matter even in the built-in login views. I ran this curl command (Windows cmd format) against a mostly default Django project:
curl -v ^
  -H "Cookie: csrftoken=abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijkl" ^
  -H "X-CSRFToken: abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijkl" ^
  -F "username=admin" -F "password=1234" http://localhost:8000/admin/login/
and Django returned a session cookie (plus a CSRF cookie, of course).
Just a note on the way you're overriding SessionAuthentication.authenticate(): you probably already know this, but according to the DRF docs that method should return a (User, auth) tuple instead of None if the request has session data, i.e. if the request is from a logged-in user. Also, I think rotate_token() is unnecessary, because this code only checks for authentication status, and is not concerned with actually authenticating users. (The Django source says rotate_token() “should be done on login”.)
First, I want to make sure I got the CSRF token workflow right.
The server sets a cookie on my machine, on the site's domain. The browser prevents access to this cookie from other domains. When a POST request is made, I send the CSRF token to the server that then compares it to my cookie. If they're not the same, a 403 Forbidden page is returned.
Now, if I manually change the value of the token in the cookie and send that new value in the POST request, should the server return a 403 or not? Does the server need to validate the token against a value stored on the server or on the cookie?
I am using the default implementation of CSRF protection on Django 1.3 (https://docs.djangoproject.com/en/1.3/ref/contrib/csrf/) and it validates the token sent in the request against the token only.
How do you send the token?
Usually, the token should be some function of the cookie (computed with a secret key known only to the server, e.g., a MAC), not the cookie itself.
Then the flow is as follows:
1. Client sends the server request with a cookie.
2. Server returns a web page with CSRF token(s) for different purposes (e.g., forms or just a simple get requests via the URL).
3. The client performs some action (via POST or GET) and sends request with the token (in the request body or in the URL) and with the cookie.
4. The server is stateless, but it can verify that the request was sent by the same client by calculating the function (with the secret key that the server knows) on the cookie (or on part of it), and comparing the output with the token.
In the case of CSRF, the cookie is automatically attached to the request by the browser, but the attacker (who probably doesn't even know the cookie) cannot add the corresponding token.
I believe you should do something like this.
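As a rough sketch of that flow, the token can be an HMAC over the cookie value, so the server only needs its key and keeps no per-token state. The key value and the function names `make_token` and `verify` are placeholders invented for this example.

```python
import hashlib
import hmac

SERVER_KEY = b"replace-with-a-real-secret"  # hypothetical key, server-side only


def make_token(cookie_value):
    """Derive the CSRF token as a MAC over the cookie value."""
    return hmac.new(
        SERVER_KEY, cookie_value.encode("utf-8"), hashlib.sha256
    ).hexdigest()


def verify(cookie_value, token):
    """Recompute the MAC and compare in constant time; no stored state needed."""
    return hmac.compare_digest(make_token(cookie_value), token)


cookie = "session-cookie-value"
token = make_token(cookie)
```

Someone who changes the cookie, or guesses at a token, cannot produce a matching pair without the server's key.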
Now, if I manually change the value of the token in the cookie and
send that new value in the POST request, should the server return a
403 or not? Does the server need to validate the token against a value
stored on the server or on the cookie?
The server should be stateless (usually). You don't want to verify the token every request against some value in a database or something like that. It is better to verify against the cookie.
In that case, if you change the token, then it probably won't match the cookie, and you should send a 403.
TL;DR: Yes, either you, or the framework you are using, needs to have server-side logic to validate a CSRF token. It cannot be a cookie, it has to be something that requires the user to be on your page, versus click on a link an attacker provides.
You've got the workflow pretty much correct. The first step is to generate a cryptographically random string that cannot be predicted by an attacker. Every programming language has its own construct to do this, but a 24 - 32 character string should be good to serve the purpose.
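In Python, for instance, the standard library's `secrets` module is meant for this kind of value. The variable name below is just an example.

```python
import secrets

# 24 random bytes, base64url-encoded, give a 32-character string.
csrf_token = secrets.token_urlsafe(24)
```

This uses the operating system's secure random source, unlike the general-purpose `random` module, which should not be used for tokens.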
Before we get to the next step, let's make sure we know what threat we're dealing with - we don't want an attacker to make a request on behalf of the user, so there should be something that is accessible to the browser that requires the user to perform an action to send the token, BUT, if the user clicks on something the attacker has set up, the token should not be sent.
Given this, the one way this should NOT be done is using cookies. The browser will automatically send cookies every single time a request is made to the domain the cookie is set on, so this automatically defeats our defense.
That said, let's go to the next step, which is to set this token in a way that is verifiable by you on the server side, but not accessible to the attacker. There's multiple ways to do this:
1) A CSRF Header: This is done in many node.js/Express installations - the CSRF token is sent as a header, to be specific, an X-CSRF-Token header. After generating this token, the server stores this in the session store for that particular cookie. On the front end, the token is stored as a JavaScript variable, which means only requests generated on that particular page can have the header. Whenever a request is made, both the session cookie (in the case of node.js, connect.sid) and the X-CSRF-Token are required for all POST/PUT/DELETE requests. If the wrong token is sent, the server sends a 401 Unauthorized, and regenerates the token, requesting login from the user.
<script type="text/javascript">
window.NODE_ENV = {};
window.NODE_ENV.csrf = "q8t4gLkMFSxFupWO7vqkXXqD";
window.NODE_ENV.isDevelopment = "true";
</script>
2) A Hidden Form Value: A lot of PHP installations use this as the CSRF defense mechanism. Depending on the configuration, either a session specific or a request specific (latter is overkill unless the application needs it) token is embedded in a hidden form field. This way, it is sent every time a form is submitted. The method of verification varies - it can be via verifying it against the database, or it can be a server-specific session store.
3) Double Submit Cookies: This is a mechanism suggested by OWASP, where in addition to sending the session cookies via the header, you also include it in the forms submitted. This way, once you verify that the session is valid, you can verify that the form contains the session variables also. If you use this mechanism, it is critical to make sure that you validate the user's session before validating CSRF; otherwise, it introduces flaws.
While building/testing this mechanism, it is important to note that while a lot of implementations limit it to POST/DELETE/PUT transactions, this is because it is automatically assumed that all sensitive transactions happen through these verbs. If your application performs sensitive transactions (such as activations) using GET, then you need this mechanism for GET/HEAD also.
On django website, https://docs.djangoproject.com/en/dev/ref/contrib/csrf/ it states:
The CSRF protection is based on the following things:
1. A CSRF cookie that is set to a random value (a session independent nonce, as it is called), which other sites will not have access to.
2. ...
Then, it also states the csrf token can be obtained from cookie by javascript:
var csrftoken = $.cookie('csrftoken');
Aren't these two statements conflicting? Say there is a Cross Origin attack, then the attacker can just obtain the CSRF token from cookie, and then make a POST request with the CSRF token in the header? Can someone explain this please?
UPDATE
I realize now that, only the javascript from the same origin is allowed to access the cookie. A follow-up question is:
If a POST request automatically adds the cookie as part of the request, and django's csrf cookie value is the same as csrf token, then a malicious cross source request will still have the correct CSRF token anyways? (in cookie)
I believe that this post answers your updated question:
Because of the same-origin policy, the attacker cannot access the cookie indeed. But the browser will add the cookie to the POST request anyway, as you mentioned. For this reason, one must post the CSRF token from the code as well (e.g. in a hidden field). In this case, the attacker must know the value of the CSRF token as stored in the victim's cookie at the time she creates the malicious form. Since she cannot access the cookie, then she cannot replicate the token in her malicious code, and the attack fails.
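A small simulation of that argument, using plain dictionaries in place of real requests. The helper `is_request_allowed` is hypothetical; the cookie and field names follow Django's defaults mentioned elsewhere in this thread.

```python
import hmac
import secrets


def is_request_allowed(cookies, form):
    """Accept a POST only if the form token matches the CSRF cookie."""
    cookie_token = cookies.get("csrftoken", "")
    form_token = form.get("csrfmiddlewaretoken", "")
    return bool(cookie_token) and hmac.compare_digest(cookie_token, form_token)


# The victim's browser holds the cookie and attaches it to every request.
victim_cookies = {"csrftoken": secrets.token_urlsafe(24)}

# The site's own page can read the cookie and copy it into the form.
legitimate_form = {
    "csrfmiddlewaretoken": victim_cookies["csrftoken"],
    "amount": "10",
}

# The attacker's page cannot read the cookie, so it has to guess.
forged_form = {"csrfmiddlewaretoken": "guess", "amount": "1000"}
```

The forged request still carries the victim's cookie, but the form value does not match it, so the check fails.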
Now, one might imagine other ways of storing the token than in the cookie. The point is that the attacker must not be able to get it. And the server must have a way to verify it. You could imagine saving the token together with the session on the server-side, and storing the token in some "safe" way on the client side ("safe" meaning that the attacker cannot access it).
Here is a quote from OWASP:
In general, developers need only generate this token once for the current session. After initial generation of this token, the value is stored in the session and is utilized for each subsequent request until the session expires. When a request is issued by the end-user, the server-side component must verify the existence and validity of the token in the request as compared to the token found in the session. If the token was not found within the request or the value provided does not match the value within the session, then the request should be aborted, token should be reset and the event logged as a potential CSRF attack in progress.
In the end, the security needs two things:
The CSRF token must be sent from the code, which means that the malicious code must know it.
The CSRF token must be stored in some "safe" place for comparison (the cookie is convenient for this).
I am not a specialist, but this is my understanding of the problem. Hope it helps.
From the name CSRF (Cross Site Request Forgery), you can already guess the attacker must perform the request from "cross site" (other site).
"The key to understanding CSRF attacks is to recognize that websites typically don't verify that a request came from an authorized user. Instead they verify only that the request came from the browser of an authorized user." - quoted here
So for sites that don't prevent CSRF attacks, the attacker can send the malicious request from anywhere: browsers, emails, terminal... Since the website doesn't check the origin of the request, it believes that the authorized user made the request.
In this case, in every Django form, you have a hidden input called "CSRF token". This value is randomly and uniquely generated at the time the form is rendered, and is compared after the request has been made. So the request can only be sent from the authorized user's browser. There is no way (that I know of) for an attacker to get this token and perform a malicious request that would be accepted by the Django backend.
Clear enough?
I am writing down my understanding of the CSRF protection mechanism in Django. Please correct me if it is faulty.
The CsrfViewMiddleware creates a unique string and stores it in a hidden field 'csrfmiddlewaretoken' of a form originating from the host. Since a malicious website mimicking this form will not know about the value of this field, it cannot use it.
When someone tries to POST the form, the website checks the 'csrfmiddlewaretoken' field and its value. If it is wrong or not set, then a csrf attack is detected.
But then, what exactly is the CSRFCookie? The doc says the unique value is set in CSRFCookie and also in the hidden field. This is where I am confused. Does a cookie get sent to the browser with the unique string embedded?
Django assigns an authenticated user a CSRF token that is stored in a cookie. The value in this cookie is read every time a user makes a request that is considered "unsafe" (namely POST, PUT, DELETE) in order to validate that the user, not a malicious third-party, is making the request.
The CSRF tag you place in a form actually grabs the CSRF token from the cookie and then passes it in as a POST variable when you submit a form.
With my current understanding, I am not entirely satisfied with the validated answer.
You can find my version here.
To summarize, the CSRFCookie is "safe", in the sense that the attacker cannot access it because of the same-origin policy. The browser will send this value automatically. Now, your form must also send this value (e.g. in a hidden field). This means that your form must know this value, and it can get it from the cookie.
The attacker cannot get the token from the cookie, and therefore cannot forge a malicious code that contains the token.
What is important, in the end, is that the user can send a csrf token, and that the server can verify it. Using a cookie is a convenient way of doing this, but this could be implemented differently (e.g. the server could save the CSRF tokens for each session, for instance).
I am not a specialist, but this is how I understand it. Hope it helps.