Understanding CSRF in Django; hidden field in form and the CSRFCookie

I am writing down my understanding of the CSRF protection mechanism in Django. Please correct me if it is faulty.
The CsrfViewMiddleware creates a unique string and stores it in a hidden field 'csrfmiddlewaretoken' of a form originating from the host. Since a malicious website mimicking this form does not know the value of this field, it cannot supply it.
When someone tries to POST the form, the website checks the 'csrfmiddlewaretoken' field and its value. If it is wrong or not set, then a CSRF attack is detected.
But then, what exactly is the CSRFCookie? The doc says the unique value is set in CSRFCookie and also in the hidden field. This is where I am confused. Does a cookie get sent to the browser with the unique string embedded?

Django issues a CSRF token that is stored in a cookie (the visitor does not have to be authenticated for this). The value in this cookie is read every time a request is made with a method that is considered "unsafe" (namely POST, PUT, PATCH, DELETE) in order to validate that the user, not a malicious third party, is making the request.
The CSRF tag you place in a form actually grabs the CSRF token from the cookie and then passes it in as a POST variable when you submit a form.

With my current understanding, I am not entirely satisfied with the validated answer.
You can find my version here.
To summarize, the CSRFCookie is "safe", in the sense that the attacker cannot access it because of the same-origin policy. The browser will send this value automatically. Now, your form must also send this value (e.g. in a hidden field). This means that your form must know this value, and it can get it from the cookie.
The attacker cannot get the token from the cookie, and therefore cannot forge malicious code that contains the token.
What is important, in the end, is that the user can send a CSRF token, and that the server can verify it. Using a cookie is a convenient way of doing this, but it could be implemented differently (e.g. the server could save the CSRF token for each session).
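The check described above can be sketched in a few lines of plain Python. This is a simplified illustration under my own assumptions, not Django's actual code: `issue_token` and `is_request_allowed` are hypothetical names, and the dicts stand in for the cookies and POST data of a request.

```python
import hmac
import secrets


def issue_token() -> str:
    """Create an unpredictable token. The server would set it as a cookie
    and also render it into the form's hidden field."""
    return secrets.token_urlsafe(32)


def is_request_allowed(cookies: dict, post_data: dict) -> bool:
    """Accept the POST only if the hidden field matches the cookie."""
    cookie_token = cookies.get("csrftoken", "")
    form_token = post_data.get("csrfmiddlewaretoken", "")
    if not cookie_token or not form_token:
        return False
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(cookie_token.encode(), form_token.encode())


token = issue_token()
browser_cookies = {"csrftoken": token}

# Legitimate form: rendered by the site, so it carries the token.
legit_post = {"csrfmiddlewaretoken": token, "amount": "10"}
# Forged form: the browser still sends the cookie, but the attacker
# could not read it and therefore cannot fill in the hidden field.
forged_post = {"amount": "1000"}

print(is_request_allowed(browser_cookies, legit_post))   # True
print(is_request_allowed(browser_cookies, forged_post))  # False
```

Note that the server stores nothing here: it only compares two values that both arrive with the request.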
I am not a specialist, but this is how I understand it. Hope it helps.

Related

Why does Django/Django REST Framework not validate CSRF tokens in-depth, even with enforce-CSRF?

I am trying to enforce CSRF for a Django Rest API which is open to anonymous users.
For that matter, I've tried two different approaches:
Extending the selected API views from one CSRFAPIView base view, which has an @ensure_csrf_cookie decorator on the dispatch method.
Using a custom Authentication class based on SessionAuthentication, which applies enforce_csrf() regardless of whether the user is logged in or not.
In both approaches the CSRF check seems to work superficially. In case the CSRF token is missing from the cookie or in case the length of the token is incorrect, the endpoint returns a 403 - Forbidden.
However, if I edit the value of the CSRF token in the cookie, the request is accepted without issue. So I can use a random value for CSRF, as long as it's the correct length.
This behaviour seems to deviate from the regular Django login view, in which the contents of the CSRF do matter. I am testing in local setup with debug/test_environment flags on.
What could be the reason my custom CSRF checks in DRF are not validated in-depth?
Code fragment of the custom Authentication:
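    from django.middleware.csrf import rotate_token
    from rest_framework.authentication import SessionAuthentication

    class RestCsrfAuthentication(SessionAuthentication):
        def authenticate(self, request):
            # Run the CSRF check for every request, logged in or not.
            self.enforce_csrf(request)
            rotate_token(request)
            return None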
And in settings:
    REST_FRAMEWORK = {
        'DEFAULT_AUTHENTICATION_CLASSES': [
            'csrfexample.authentication.RestCsrfAuthentication',
        ]
    }

The specific contents of CSRF tokens in Django never matter, actually.
This reply by a Django security team member to a question similar to yours says this:
The way our CSRF tokens work is pretty simple. Each form contains a CSRF token, which matches the CSRF cookie. Before we process the protected form, we make sure that the submitted token matches the cookie. This is a server-side check, but it's not validating against a stored server-side value. Since a remote attacker should not be able to read or set arbitrary cookies on your domain, this protects you.
Since we're just matching the cookie with the posted token, the data is not sensitive (in fact it's completely arbitrary - a cookie of "zzzz" works just fine), and so the rotation/expiration recommendations don't make any difference. If an attacker can read or set arbitrary cookies on your domain, all forms of cookie-based CSRF protection are broken, full stop.
(Actually "zzzz" won't work because of length requirements, but more on that later.) I recommend reading the entire mailing list message for a fuller understanding. There are explanations there about how Django is peculiar among frameworks because CSRF protections are independent of sessions.
I found that mailing list message via this FAQ item on the Django docs:
Is posting an arbitrary CSRF token pair (cookie and POST data) a vulnerability?
No, this is by design. Without a man-in-the-middle attack, there is no way for an attacker to send a CSRF token cookie to a victim’s browser, so a successful attack would need to obtain the victim’s browser’s cookie via XSS or similar, in which case an attacker usually doesn’t need CSRF attacks.
Some security audit tools flag this as a problem but as mentioned before, an attacker cannot steal a user’s browser’s CSRF cookie. “Stealing” or modifying your own token using Firebug, Chrome dev tools, etc. isn’t a vulnerability.
(Emphasis mine.)
The message is from 2011, but it's still valid, and to prove it let's look at the code. Both Django REST Framework's SessionAuthentication and the ensure_csrf_cookie decorator use core Django's CsrfViewMiddleware (source). In that middleware class's process_view() method, you'll see that it fetches the CSRF cookie (a cookie named csrftoken by default), and then the posted CSRF token (part of the POSTed data, with a fallback to reading the X-CSRFToken header). After that, it runs _sanitize_token() on the POSTed/X-CSRFToken value. This sanitization step is where the check for the correct token length happens; this is why you're getting 403s as expected when you provide shorter or longer tokens.
After that, the method proceeds to compare the two values using the function _compare_salted_tokens(). If you read that function, and all the further calls that it makes, you'll see that it boils down to checking if the two strings match, basically without regard to the values of the strings.
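To make the two steps concrete, here is a rough stand-alone sketch of "format check, then compare". It is not the middleware's code: the function names are made up, the 64-character length and the letters-and-digits alphabet are assumptions taken from the curl example in this answer, and the salting is left out on purpose.

```python
import hmac
import string

# Assumptions: 64-character tokens drawn from letters and digits.
TOKEN_LENGTH = 64
ALLOWED_CHARS = set(string.ascii_letters + string.digits)


def is_well_formed(token: str) -> bool:
    """Format check only: right length, permitted characters."""
    return len(token) == TOKEN_LENGTH and set(token) <= ALLOWED_CHARS


def check_csrf(cookie_token: str, submitted_token: str) -> int:
    """Return an HTTP-like status: 200 if accepted, 403 if rejected."""
    if not is_well_formed(cookie_token) or not is_well_formed(submitted_token):
        return 403
    same = hmac.compare_digest(cookie_token.encode(), submitted_token.encode())
    return 200 if same else 403


made_up = "abcdefghijklmnopqrstuvwxyz" * 2 + "abcdefghijkl"  # 64 characters

print(check_csrf(made_up, made_up))        # 200: arbitrary but matching
print(check_csrf("zzzz", "zzzz"))          # 403: wrong length
print(check_csrf(made_up, made_up[::-1]))  # 403: the two values differ
```

Nothing in the check depends on a value the server remembers, which is why an invented pair of the right shape is accepted.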
This behaviour seems to deviate from the regular Django login view, in which the contents of the CSRF do matter.
No, it doesn't matter even in the built-in login views. I ran this curl command (Windows cmd format) against a mostly default Django project:
curl -v
-H "Cookie: csrftoken=abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijkl"
-H "X-CSRFToken: abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijkl"
-F "username=admin" -F "password=1234" http://localhost:8000/admin/login/
and Django returned a session cookie (plus a CSRF cookie, of course).
Just a note on the way you're overriding SessionAuthentication.authenticate(): you probably already know this, but according to the DRF docs that method should return a (User, auth) tuple instead of None if the request has session data, i.e. if the request is from a logged-in user. Also, I think rotate_token() is unnecessary, because this code only checks for authentication status, and is not concerned with actually authenticating users. (The Django source says rotate_token() “should be done on login”.)

Does Django use per-page or per-session CSRF?

I'm wondering what is the default Django policy for CSRF generation? Are they created per page or per session? And if it is per session, why is it chosen? Isn't it less secure than per-page CSRF?

Are they created per page or per session?
From Django's official documentation:
A CSRF cookie that is based on a random secret value, which other sites will not have access to.
This cookie is set by CsrfViewMiddleware. It is sent with every response that has called django.middleware.csrf.get_token() (the function used internally to retrieve the CSRF token), if it wasn’t already set on the request.
In order to protect against BREACH attacks, the token is not simply the secret; a random salt is prepended to the secret and used to scramble it.
For security reasons, the value of the secret is changed each time a user logs in
That means the secret used to generate the CSRF tokens is generated per-session (kind of).
When validating the ‘csrfmiddlewaretoken’ field value, only the secret, not the full token, is compared with the secret in the cookie value. This allows the use of ever-changing tokens. While each request may use its own token, the secret remains common to all.
This check is done by CsrfViewMiddleware.
That means that, if we want, we can generate different CSRF tokens according to our needs (e.g. per page), but the secret will remain the same.
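As an illustration of "ever-changing tokens, one common secret", here is a small sketch of a salt-and-scramble scheme in the spirit of the documentation quoted above. The alphabet, the lengths and the function names (`mask`, `unmask`, `random_string`) are my assumptions; Django's real implementation may differ in detail.

```python
import secrets
import string

CHARS = string.ascii_letters + string.digits
SECRET_LENGTH = 32  # assumed length, chosen for illustration


def random_string(length: int = SECRET_LENGTH) -> str:
    return "".join(secrets.choice(CHARS) for _ in range(length))


def mask(secret: str) -> str:
    """Return salt + scrambled secret; a fresh salt gives a fresh token."""
    salt = random_string(len(secret))
    scrambled = "".join(
        CHARS[(CHARS.index(s) + CHARS.index(m)) % len(CHARS)]
        for s, m in zip(secret, salt)
    )
    return salt + scrambled


def unmask(token: str) -> str:
    """Recover the secret by reversing the per-character shift."""
    half = len(token) // 2
    salt, scrambled = token[:half], token[half:]
    return "".join(
        CHARS[(CHARS.index(c) - CHARS.index(m)) % len(CHARS)]
        for c, m in zip(scrambled, salt)
    )


secret = random_string()
page_one_token = mask(secret)
page_two_token = mask(secret)

# The two tokens almost certainly differ, yet both unmask to the same secret.
print(page_one_token, page_two_token, sep="\n")
print(unmask(page_one_token) == unmask(page_two_token) == secret)  # True
```

Validation then compares the unmasked secrets rather than the raw tokens, so a token rendered on one page stays valid alongside a different-looking token rendered on another.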
You might want to read Radwan's answer as well.

Reading through Django's official documentation about CSRF is really helpful.
It explains how the protection works internally as a middleware; see point 3 of this section:
For all incoming requests that are not using HTTP GET, HEAD, OPTIONS or TRACE, a CSRF cookie must be present, and the ‘csrfmiddlewaretoken’ field must be present and correct. If it isn’t, the user will get a 403 error.
When validating the ‘csrfmiddlewaretoken’ field value, only the secret, not the full token, is compared with the secret in the cookie value. This allows the use of ever-changing tokens. While each request may use its own token, the secret remains common to all.
This check is done by CsrfViewMiddleware.
This means that you have the flexibility to generate tokens per page.
If your requirements call for generating a new CSRF token on each request/page, check out this question and its answers; they are really helpful.

Should CSRF tokens be server-side validated?

First, I want to make sure I got the CSRF token workflow right.
The server sets a cookie on my machine, on the site's domain. The browser prevents access to this cookie from other domains. When a POST request is made, I send the CSRF token to the server, which then compares it to my cookie. If they're not the same, a 403 Forbidden page is returned.
Now, if I manually change the value of the token in the cookie and send that new value in the POST request, should the server return a 403 or not? Does the server need to validate the token against a value stored on the server or on the cookie?
I am using the default implementation of CSRF protection on Django 1.3 (https://docs.djangoproject.com/en/1.3/ref/contrib/csrf/) and it validates the token sent in the request against the token in the cookie only.

How do you send the token?
Usually, the token should be some function of the cookie (computed with a secret key known only to the server, e.g. a MAC), not the cookie itself.
Then the flow is as follows:
1. Client sends the server request with a cookie.
2. Server returns a web page with CSRF token(s) for different purposes (e.g., forms or just a simple get requests via the URL).
3. The client performs some action (via POST or GET) and sends request with the token (in the request body or in the URL) and with the cookie.
4. The server is stateless, but it can verify that the request was sent by the same client by calculating the function (with the secret key that the server knows) on the cookie (or on part of it), and comparing the output with the token.
In the case of CSRF, the cookie is automatically appended to the request by the browser, but the attacker (who probably doesn't even know the cookie) cannot add the corresponding tokens.
I believe you should do something like this.
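A minimal sketch of that flow, using an HMAC as the keyed function. The names `SERVER_KEY`, `token_for` and `verify` are hypothetical, and the key is generated on the spot only for illustration; a real server would load it from configuration.

```python
import hashlib
import hmac
import secrets

# Hypothetical server-side key, known only to the server.
SERVER_KEY = secrets.token_bytes(32)


def token_for(session_cookie: str) -> str:
    """Derive the CSRF token from the cookie with a keyed MAC."""
    mac = hmac.new(SERVER_KEY, session_cookie.encode(), hashlib.sha256)
    return mac.hexdigest()


def verify(session_cookie: str, submitted_token: str) -> bool:
    """Recompute the MAC and compare; no per-user state is stored."""
    expected = token_for(session_cookie)
    return hmac.compare_digest(expected.encode(), submitted_token.encode())


cookie = secrets.token_urlsafe(16)   # step 1: the client already has a cookie
token = token_for(cookie)            # step 2: the server embeds this in the page

print(verify(cookie, token))          # True: step 3 and 4, token fits the cookie
print(verify(cookie, "0" * 64))       # False: a guessed token
print(verify("other-cookie", token))  # False: token bound to a different cookie
```

Without the key the attacker cannot compute a matching token even if the cookie value were known, which is the difference from simply echoing the cookie.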
Now, if I manually change the value of the token in the cookie and send that new value in the POST request, should the server return a 403 or not? Does the server need to validate the token against a value stored on the server or on the cookie?
The server should be stateless (usually). You don't want to verify the token every request against some value in a database or something like that. It is better to verify against the cookie.
In that case, if you change the token, then it probably won't match the cookie, and you should send a 403.

TL;DR: Yes, either you, or the framework you are using, needs to have server-side logic to validate a CSRF token. It cannot be a cookie, it has to be something that requires the user to be on your page, versus click on a link an attacker provides.
You've got the workflow pretty much correct. The first step is to generate a cryptographically random string that cannot be predicted by an attacker. Every programming language has its own construct to do this, but a 24 - 32 character string should be good to serve the purpose.
Before we get to the next step, let's make sure we know what threat we're dealing with - we don't want an attacker to make a request on behalf of the user, so there should be something that is accessible to the browser that requires the user to perform an action to send the token, BUT, if the user clicks on something the attacker has set up, the token should not be sent.
Given this, the one way this should NOT be done is using cookies. The browser will automatically send cookies every single time a request is made to the domain the cookie is set on, so this automatically defeats our defense.
That said, let's go to the next step, which is to set this token in a way that is verifiable by you on the server side, but not accessible to the attacker. There are multiple ways to do this:
1) A CSRF Header: This is done in many node.js/Express installations - the CSRF token is sent as a header, to be specific, an X-CSRF-Token header. After generating this token, the server stores it in the session store for that particular cookie. On the front end, the token is stored as a JavaScript variable, which means only requests generated on that particular page can have the header. Whenever a request is made, both the session cookie (in the case of node.js, connect.sid) and the X-CSRF-Token header are required for all POST/PUT/DELETE requests. If the wrong token is sent, the server sends a 401 Unauthorized, and regenerates the token, requesting login from the user.
<script type="text/javascript">
window.NODE_ENV = {};
window.NODE_ENV.csrf = "q8t4gLkMFSxFupWO7vqkXXqD";
window.NODE_ENV.isDevelopment = "true";
</script>
2) A Hidden Form Value: A lot of PHP installations use this as the CSRF defense mechanism. Depending on the configuration, either a session specific or a request specific (latter is overkill unless the application needs it) token is embedded in a hidden form field. This way, it is sent every time a form is submitted. The method of verification varies - it can be via verifying it against the database, or it can be a server-specific session store.
3) Double Submit Cookies: This is a mechanism suggested by OWASP, where in addition to sending the session cookies via the header, you also include it in the forms submitted. This way, once you verify that the session is valid, you can verify that the form contains the session variables also. If you use this mechanism, it is critical to make sure that you validate the user's session before validating CSRF; otherwise, it introduces flaws.
While building/testing this mechanism, it is important to note that while a lot of implementations limit it to POST/DELETE/PUT transactions, this is because it is automatically assumed that all sensitive transactions happen through these verbs. If your application performs sensitive transactions (such as activations) using GET, then you need this mechanism for GET/HEAD also.
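The point about verbs can be expressed as a small decision function. This is only a sketch: `needs_csrf_check` and the `/account/activate/` path are invented for the example, and the set of "safe" methods is the usual one (GET, HEAD, OPTIONS, TRACE).

```python
# Methods that a typical CSRF check lets through without a token.
SAFE_METHODS = frozenset({"GET", "HEAD", "OPTIONS", "TRACE"})


def needs_csrf_check(method: str, path: str, sensitive_paths=frozenset()) -> bool:
    """True when the request must carry a valid CSRF token."""
    if method.upper() not in SAFE_METHODS:
        return True
    # Opt-in protection for state-changing endpoints reached via GET/HEAD.
    return path in sensitive_paths


SENSITIVE = frozenset({"/account/activate/"})  # hypothetical URL

print(needs_csrf_check("POST", "/transfer/"))                    # True
print(needs_csrf_check("GET", "/articles/"))                     # False
print(needs_csrf_check("GET", "/account/activate/", SENSITIVE))  # True
```

Moving the sensitive action to POST is usually the cleaner fix; the opt-in list is for cases where that is not possible.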

Django CSRF cookie accessible by javascript?

On the Django website, https://docs.djangoproject.com/en/dev/ref/contrib/csrf/, it states:
The CSRF protection is based on the following things:
1. A CSRF cookie that is set to a random value (a session independent nonce, as it is called), which other sites will not have access to.
2. ...
Then, it also states the csrf token can be obtained from cookie by javascript:
var csrftoken = $.cookie('csrftoken');
Aren't these two statements conflicting? Say there is a Cross Origin attack, then the attacker can just obtain the CSRF token from cookie, and then make a POST request with the CSRF token in the header? Can someone explain this please?
UPDATE
I realize now that, only the javascript from the same origin is allowed to access the cookie. A follow-up question is:
If a POST request automatically adds the cookie as part of the request, and Django's CSRF cookie value is the same as the CSRF token, then won't a malicious cross-origin request have the correct CSRF token anyway (in the cookie)?

I believe that this post answers your updated question:
Because of the same-origin policy, the attacker cannot access the cookie indeed. But the browser will add the cookie to the POST request anyway, as you mentioned. For this reason, one must post the CSRF token from the code as well (e.g. in a hidden field). In this case, the attacker must know the value of the CSRF token as stored in the victim's cookie at the time she creates the malicious form. Since she cannot access the cookie, then she cannot replicate the token in her malicious code, and the attack fails.
Now, one might imagine other ways of storing the token than in the cookie. The point is that the attacker must not be able to get it. And the server must have a way to verify it. You could imagine saving the token together with the session on the server-side, and storing the token in some "safe" way on the client side ("safe" meaning that the attacker cannot access it).
Here is a quote from OWASP:
In general, developers need only generate this token once for the current session. After initial generation of this token, the value is stored in the session and is utilized for each subsequent request until the session expires. When a request is issued by the end-user, the server-side component must verify the existence and validity of the token in the request as compared to the token found in the session. If the token was not found within the request or the value provided does not match the value within the session, then the request should be aborted, token should be reset and the event logged as a potential CSRF attack in progress.
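The session-based variant from the quote can be sketched as follows. Everything here is illustrative: the in-memory `SESSIONS` dict stands in for a real session store, and `start_session` and `handle_request` are hypothetical names.

```python
import hmac
import logging
import secrets
from typing import Optional

logger = logging.getLogger("csrf_sketch")

# Hypothetical in-memory session store: session id -> session data.
SESSIONS = {}


def start_session() -> str:
    """Create a session and generate its CSRF token once."""
    session_id = secrets.token_urlsafe(16)
    SESSIONS[session_id] = {"csrf_token": secrets.token_urlsafe(32)}
    return session_id


def handle_request(session_id: str, submitted_token: Optional[str]) -> int:
    """Return 200 if the token matches the one stored in the session, else 403."""
    session = SESSIONS.get(session_id)
    if session is None:
        return 403
    expected = session["csrf_token"]
    if submitted_token is None or not hmac.compare_digest(
        expected.encode(), submitted_token.encode()
    ):
        # Abort, reset the token and log the event, as the quote suggests.
        session["csrf_token"] = secrets.token_urlsafe(32)
        logger.warning("possible CSRF attempt for session %s", session_id)
        return 403
    return 200


sid = start_session()
good = SESSIONS[sid]["csrf_token"]

print(handle_request(sid, good))  # 200: token matches the session
print(handle_request(sid, None))  # 403: token missing, so it is reset
print(handle_request(sid, good))  # 403: the old token no longer matches
```

Unlike the cookie comparison, this approach makes the server remember a value per session, so an invented token can never pass.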
In the end, the security needs two things:
The CSRF token must be sent from the code, which means that the malicious code must know it.
The CSRF token must be stored in some "safe" place for comparison (the cookie is convenient for this).
I am not a specialist, but this is my understanding of the problem. Hope it helps.

From the name CSRF (Cross Site Request Forgery), you can already guess the attacker must perform the request from "cross site" (other site).
"The key to understanding CSRF attacks is to recognize that websites typically don't verify that a request came from an authorized user. Instead they verify only that the request came from the browser of an authorized user." - quoted here
So for sites that don't prevent CSRF attacks, the attacker can send the malicious request from anywhere: browsers, emails, terminal... Since the website doesn't check the origin of the request, it believes that the authorized user made the request.
In this case, in every Django form, you have a hidden input called "CSRF token". This value is randomly and uniquely generated at the time the form is rendered, and it is compared after the request has been made. So the request can only be sent from the authorized user's browser. There is no way (that I know of) for an attacker to get this token and perform a malicious request that the Django backend would accept.
Clear enough?

Is this how Django's CSRF protection works?

Being a beginner at cookies, CSRF and Django (using 1.4), from what I can make out this is how it works, please correct me where I go wrong...
The following applies where django.middleware.csrf.CsrfViewMiddleware is included in the MIDDLEWARE_CLASSES tuple.
Where a POST form includes the csrf_token tag, and the view concerned passes RequestContext to the template, requesting the page means Django includes a hidden form field which contains an alphanumeric string. Django also returns to the browser a cookie with the name set to csrftoken and value set to the same alphanumeric string.
When receiving the form submission, Django checks that the alphanumeric string value from the hidden form field matches the csrftoken cookie received from the browser. If they don't match, a 403 response is issued.
A CSRF attack might come in the form of a malicious web site that includes an iframe. The iframe includes a POST form and some JavaScript. The form's action attribute points to my Django site. The form is designed to do something nasty at my site, and the JS submits the form when the iframe is loaded.
The browser would include the csrftoken cookie in the header of the form submission. However, the form would not include the hidden field with the matching alphanumeric string, so a 403 is returned and the attack fails. If the iframe JS tried to access the cookie, so as to create the correct hidden form field, the browser would prevent it from doing so.
Is this correct?

I would say that you are right. You will find here my own formulation of it.
To summarize:
The CSRF token is sent from the code, which means that the malicious code must know it.
The CSRF token is stored in a cookie and sent by the browser.
The attacker cannot access the cookie because of the same-origin policy.
The server can simply verify that the "safe" value coming from the cookie is the same as the one coming from the code.
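The four points above can be played through with a toy model of the browser. This is a deliberately crude sketch: the `Browser` class, the `django_like_check` function and both example origins are invented, and real cookie scoping rules are more involved than a simple origin comparison.

```python
class Browser:
    """Toy browser: one cookie jar per origin, same-origin cookie reads."""

    def __init__(self):
        self.jars = {}  # origin -> {cookie name: value}

    def set_cookie(self, origin, name, value):
        self.jars.setdefault(origin, {})[name] = value

    def read_cookie_from_script(self, script_origin, target_origin, name):
        # Same-origin policy: a script only sees its own origin's cookies.
        if script_origin != target_origin:
            raise PermissionError("cross-origin cookie read blocked")
        return self.jars.get(target_origin, {}).get(name)

    def submit_form(self, target_origin, fields):
        # Cookies for the target are attached no matter which page
        # hosted the form.
        return {"cookies": dict(self.jars.get(target_origin, {})), "post": fields}


def django_like_check(request):
    """403 unless the hidden field matches the csrftoken cookie."""
    cookie = request["cookies"].get("csrftoken")
    field = request["post"].get("csrfmiddlewaretoken")
    return 200 if cookie and cookie == field else 403


SITE, EVIL = "https://mysite.example", "https://evil.example"
browser = Browser()
browser.set_cookie(SITE, "csrftoken", "s3cr3t-token")

# Legitimate page: code running on the site's own origin knows the token.
token = browser.read_cookie_from_script(SITE, SITE, "csrftoken")
print(django_like_check(browser.submit_form(SITE, {"csrfmiddlewaretoken": token})))  # 200

# Malicious iframe: the cookie is sent, but the hidden field is missing.
print(django_like_check(browser.submit_form(SITE, {"action": "delete"})))  # 403

# Reading the cookie from the malicious origin is blocked.
try:
    browser.read_cookie_from_script(EVIL, SITE, "csrftoken")
except PermissionError as exc:
    print(exc)  # cross-origin cookie read blocked
```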

I think what you want is described here in the official Django Documentation.
https://docs.djangoproject.com/en/dev/ref/contrib/csrf/#how-it-works
Above link was broken when I tried, but for version 1.7 this works:
https://docs.djangoproject.com/en/1.7/ref/contrib/csrf/
