How to prevent XSS attacks in Django

I'm trying to prevent XSS attacks in Django. Is it OK to use {{ body|escape }} in my HTML template only? How can I do some filtering on the backend?

Not every case is the same security-wise, so it's hard to give complete advice without seeing your application, its use cases, and the version of Django you use.
If you use Django's template system and make sure auto-escaping is enabled (it is by default in recent versions), you're 90-something percent safe. Django provides an auto-escaping mechanism for stopping XSS: it automatically escapes data that is dynamically inserted into the template. You still have to be aware of some issues:
Quote all attributes where dynamic data is inserted: do <img alt="{{somevar}}"> instead of <img alt={{somevar}}>. Django's auto-escaping will not cover unquoted attribute values.
For data inserted into CSS (style tags and attributes) or JavaScript (script blocks and event-handler attributes such as onclick), you must manually escape the data using escaping rules appropriate for CSS or JavaScript (probably via a custom filter on the Python side).
For data inserted into a URL attribute (href, img src), you must manually validate the URL to make sure it is safe, by checking the protocol against a whitelist of allowed protocols (e.g. https:, mailto:, ... but never javascript:); see the sketch after this list.
Avoid setting HTML attributes from user input.
If you use mark_safe, make sure you know what you are doing and that the data is really "safe".
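To make the URL and JavaScript points concrete, here is a minimal Python sketch of scheme whitelisting plus JSON-encoding data for a script block. This is illustrative only; the helper names are my own, and Django itself also ships an escapejs template filter and django.utils.http.url_has_allowed_host_and_scheme for related jobs:

import json
from urllib.parse import urlparse

ALLOWED_SCHEMES = {"http", "https", "mailto"}  # never "javascript"

def is_safe_url(url):
    # Accept only URLs whose scheme is explicitly whitelisted.
    # (A real check would also handle scheme-relative URLs and other edge cases.)
    return urlparse(url.strip()).scheme.lower() in ALLOWED_SCHEMES

def to_js_value(data):
    # Serialize data for embedding inside a <script> block; the extra
    # replacements stop a payload from closing the surrounding tag.
    return (json.dumps(data)
            .replace("<", "\\u003c")
            .replace(">", "\\u003e")
            .replace("&", "\\u0026"))

assert not is_safe_url("javascript:alert(1)")
assert to_js_value("</script>") == '"\\u003c/script\\u003e"'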
There is always more, but this covers the best-known issues. Always refer to OWASP to understand the different XSS attacks and how they apply to your specific application:
https://owasp.org/www-community/attacks/xss/
https://cheatsheetseries.owasp.org/cheatsheets/Cross_Site_Scripting_Prevention_Cheat_Sheet.html

Related

Does Django cache custom tags and filters?

I have a considerable number of custom template tags that perform a variety of functions, including:
simple string transformation
display of complex ui elements
timestamp manipulation and formatting
handling and display of user avatars
etc...
All of these functions reside in a single file: app/templatetags/custom_tags.py. When I want to use one of these tags in a template, I import all of them using {% load custom_tags %}.
However, only a small subset of the available tags are actually used in any given template. In other words, all these functions are being 'loaded' into the template, yet only a few of them are called in a specific web request.
Is this inefficient, in terms of performance? Should I be loading code more conservatively -- i.e., splitting up my custom tags into separate files and only loading the subset that I need?
Or does this not matter, because all tags are loaded in memory -- i.e., subsequent calls to {% load custom_tags %} elsewhere in the application won't result in any additional overhead?
I apologize if there are incorrect assumptions or premises in this question. I'd love to have a better understanding of the implications of importing Python code in general, or in a Django environment specifically.
For Django <= 1.8:
The load tag is defined here and actually does the loading here and here. Both places call get_library, defined here. According to the docstrings there, yes, it caches template tag/filter libraries within the same process in the dictionary initialized here.
For Django 1.9:
The modules for template tags are now being loaded even earlier, when the parser is instantiated, and the libraries are being stored directly on the parser. Load tags call out to find_library here and here, which just gets the already-loaded tag directly from the parser.
Aside from the actual loading activity
As @spectras points out below, regardless of the Django version, the tag's loading behavior is, strictly speaking, a side effect; the tag itself returns a no-op node (in both <=1.8 and 1.9), which renders no content, so there's not really a performance consideration as far as that goes.
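For intuition, the caching behavior described above boils down to a module-level memo dictionary. A simplified illustration of the pattern (this is not Django's actual code):

from importlib import import_module

_libraries = {}  # module path -> already-imported tag library

def get_library(module_path):
    # The import happens at most once per process; later {% load %}
    # calls for the same module are plain dictionary lookups.
    library = _libraries.get(module_path)
    if library is None:
        library = import_module(module_path)
        _libraries[module_path] = library
    return library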

Is Mustache XSS-proof?

I was thinking about my app's XSS vulnerability. On the server side I don't sanitize either input or output, so
<script>alert(document.cookie)</script>
is stored in the database exactly as-is. To view this value on the client side I use Mustache. If this script were executed by an admin, it would of course be easy to hijack his session. However, I've noticed that Mustache by default escapes characters such as & " < > when you use the {{ }} syntax. Do I need to worry about XSS when the value from the database is inserted into
<p>{{value}}</p>
or even
<p data-id='{{value}}'>something</p>
? Should I perhaps review my Mustache templates to look for any vulnerable code, or am I safe unless I use
<script>{{value}}</script>
somewhere?
Well, you should always worry :) But yes, Mustache accomplishes the goal you are talking about here, protecting your examples from XSS (except where you're outputting the value directly into a <script> tag).
Note: check that the Mustache implementation you're using escapes single quotes. It's apparently not in the spec to do so (https://github.com/mustache/spec/issues/69) but the major implementations thankfully escape it anyway.
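The single-quote caveat is easy to demonstrate with Python's stdlib escaper, which behaves like a Mustache implementation that does escape quotes (illustrative only; this is not Mustache itself):

from html import escape

payload = "' onmouseover='alert(1)"

# quote=True escapes both " and ', so the payload cannot break out of
# the single-quoted attribute in <p data-id='{{value}}'>:
print(escape(payload, quote=True))
# &#x27; onmouseover=&#x27;alert(1)

# An escaper that leaves ' alone would let this payload terminate the
# attribute and inject an event handler -- exactly the spec gap noted above.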

Measures to prevent XSS vulnerabilities (like Twitter's a few days ago)

Even famous sites like Twitter suffer from XSS vulnerabilities. What should we do to prevent this kind of attack?
The #1 thing you can do is set your cookies to HttpOnly, which at least protects against session-cookie hijacking, e.g. someone stealing your cookie when you're logged in as admin of your own site.
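In Django, for example, the HttpOnly advice comes down to a couple of settings (SESSION_COOKIE_HTTPONLY is already on by default in modern versions); in other stacks it is a flag on the Set-Cookie header:

# settings.py
SESSION_COOKIE_HTTPONLY = True  # keep JavaScript away from the session cookie
CSRF_COOKIE_HTTPONLY = True

# Or per cookie, on a response object:
# response.set_cookie("name", "value", httponly=True, secure=True)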
The rest comes down to validating all user input.
RULE #0 - Never Insert Untrusted Data Except in Allowed Locations
RULE #1 - HTML Escape Before Inserting Untrusted Data into HTML Element Content
RULE #2 - Attribute Escape Before Inserting Untrusted Data into HTML Common Attributes
RULE #3 - JavaScript Escape Before Inserting Untrusted Data into HTML JavaScript Data Values
RULE #4 - CSS Escape Before Inserting Untrusted Data into HTML Style Property Values
RULE #5 - URL Escape Before Inserting Untrusted Data into HTML URL Attributes
Very lengthy subject discussed in detail here:
http://www.owasp.org/index.php/XSS_(Cross_Site_Scripting)_Prevention_Cheat_Sheet
http://www.owasp.org/index.php/Cross_site_scripting
XSS is only one of many exploits, and every web dev should learn the OWASP Top 10 by heart, imho:
http://www.owasp.org/index.php/Top_10_2007
Just like you can make SQL injection a non-issue by using prepared statements, you can make XSS a non-issue by using a templating engine (DOM serializer) that does a similar thing.
Design your application so that all output goes through the templating engine. Make that templating engine HTML-escape all data by default. This way you'll have a system that's secure by default and does not rely on humans (and the rest of a large system) being diligent about escaping HTML.
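A minimal sketch of that "secure by default" setup, using Jinja2 as the templating engine (assumes Jinja2 is installed):

from jinja2 import Environment

env = Environment(autoescape=True)  # escape ALL interpolated data by default
template = env.from_string("<p>{{ comment }}</p>")

print(template.render(comment="<script>alert(1)</script>"))
# <p>&lt;script&gt;alert(1)&lt;/script&gt;</p>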
I don't know what you write your code with, but if you use ASP.NET, you are partly covered.
ASP.NET has what it calls request validation: when enabled, it prevents malicious script from being introduced via user input.
But sometimes you'll have to allow some kind of text editor, like the one you typed this question into. In that case, you'll have to partly disable request validation to allow "rich text" HTML to be input by the end user, and you'll have to build some kind of whitelist filtering mechanism; a sketch of one follows.
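In Python land, a whitelist filter of that kind might look like this, using the bleach library (assumed installed; the tag and attribute lists are examples, not a recommendation):

import bleach

ALLOWED_TAGS = ["b", "i", "em", "strong", "p", "a"]
ALLOWED_ATTRS = {"a": ["href", "title"]}

def clean_rich_text(html):
    # Anything not on the whitelist is escaped rather than rendered.
    return bleach.clean(html, tags=ALLOWED_TAGS, attributes=ALLOWED_ATTRS)

print(clean_rich_text("<b>ok</b><script>alert(1)</script>"))
# <b>ok</b>&lt;script&gt;alert(1)&lt;/script&gt;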
FYI, I don't know about others, but Microsoft has a library called AntiXSS.

Best way to decide on XML or HTML response?

I have a resource at a URL that both humans and machines should be able to read:
http://example.com/foo-collection/foo001
What is the best way to distinguish between human browsers and machines, and return either HTML or a domain-specific XML response?
(1) The Accept type field in the request?
(2) An additional bit of URL? eg:
http://example.com/foo-collection/foo001 -> returns HTML
http://example.com/foo-collection/foo001?xml -> returns, er, XML
I do not wish to oblige machines reading the resource to parse HTML (or XHTML for that matter). Machines like the googlebot should receive the HTML response.
It is reasonable to assume I control the machine readers.
If this is under your control, rather than adding a query parameter, why not add a file extension:
http://example.com/foo-collection/foo001.html - return HTML
http://example.com/foo-collection/foo001.xml - return XML
Apart from anything else, that means if someone fetches it with wget or saves it from their browser, it'll have an appropriate filename without any fuss.
My preference is to make it a first-class part of the URI. This is debatable, since there are, in a sense, multiple URIs for the same resource. And is "format" really part of the URI?
http://example.com/foo-collection/html/foo001
http://example.com/foo-collection/xml/foo001
These are very easy to deal with in a web framework that has URI parsing to direct the request to the proper application.
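In Django, for instance, that routing is one line in the URLconf (names here are hypothetical):

# urls.py
from django.urls import path
from . import views

urlpatterns = [
    # /foo-collection/html/foo001 and /foo-collection/xml/foo001 both
    # hit the same view, which picks a renderer based on fmt.
    path("foo-collection/<str:fmt>/<str:foo_id>", views.foo_detail),
]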
If this is indeed the same resource with two different representations, HTTP invites you to use the Accept header, as you suggest. This is probably a very reliable way to distinguish between the two scenarios. You can be fairly sure that user agents (including search-engine spiders) send the Accept header properly.
As for the machine agents you are going to give XML to: are they under your control? If so, you can be doubly sure that Accept will work. If they do not set the header properly, you can serve XML as the default, since real user agents DO set the header properly.
I would try to use the Accept header for this, because that is exactly what the Accept header is there for.
The problem with having two different URLs is that it is not automatically apparent that they represent the same underlying resource. This can be bad if a user finds a URL in one program, which renders HTML, and pastes it into another, which needs XML. At that point a smart user could probably change the URL appropriately, but this is just a source of error that you don't need.
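A minimal sketch of Accept-header negotiation in a Django view (the render_foo_* helpers are hypothetical, and real negotiation would also honor q-values):

from django.http import HttpResponse

def foo_detail(request, foo_id):
    accept = request.META.get("HTTP_ACCEPT", "")
    if "application/xml" in accept:
        return HttpResponse(render_foo_xml(foo_id),   # hypothetical helper
                            content_type="application/xml")
    return HttpResponse(render_foo_html(foo_id),      # hypothetical helper
                        content_type="text/html")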
I would say adding a query-string parameter is your best bet. The only way to automatically detect whether your client is a browser (human) or an application would be to read the User-Agent string from the HTTP request. But since this is easily set by any application to mimic a browser, you're not guaranteed that it will work.

How do you allow the usage of an <img> while preventing XSS?

I'm using ASP.NET Web Forms for blog style comments.
Edit 1: This looks way more complicated than I first thought. How do you filter the src?
I would prefer to still use real HTML tags, but if things get too complicated that way, I might go a custom route. I haven't done any XML yet; do I need to learn more about that?
If <img> is the only thing you'd allow, I'd suggest you use a simple square-bracket syntax. This eliminates the need for a parser and removes a load of other dangerous parser edge cases along with it. Say, something like:
Look at this! [http://a.b.c/m.jpg]
Which would get converted to
Look at this! <img src="http://a.b.c/m.jpg" />
You should filter the src address too, so that nothing malicious gets passed in the src part. Like maybe:
Look at this! [javascript:alert('pwned!')]
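A sketch of that conversion in Python, including the src filtering (the regex and helper names are my own; escaping happens first so everything outside the brackets stays inert):

import re
from html import escape
from urllib.parse import urlparse

IMG_PATTERN = re.compile(r"\[([^\]\s]+)\]")
ALLOWED_EXTENSIONS = (".jpg", ".jpeg", ".gif", ".png")

def replace_images(comment):
    escaped = escape(comment, quote=True)  # neutralize all HTML first

    def to_img(match):
        url = match.group(1)
        parts = urlparse(url)
        if parts.scheme in ("http", "https") and parts.path.lower().endswith(ALLOWED_EXTENSIONS):
            return '<img src="%s" />' % url
        return match.group(0)  # not a safe image URL: leave it as plain text

    return IMG_PATTERN.sub(to_img, escaped)

print(replace_images("Look at this! [http://a.b.c/m.jpg] [javascript:alert('pwned!')]"))
# Look at this! <img src="http://a.b.c/m.jpg" /> [javascript:alert(&#x27;pwned!&#x27;)]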
Use an XML parser to validate your input, and drop or encode all elements and attributes that you do not want to allow. In this case, delete or encode all tags except the <img> tag, and all attributes on it except src, alt and title.
If you end up going with a non-HTML format (which makes things easier because you can literally escape all HTML), use a standard syntax like Markdown. The Markdown image syntax is ![alt text](/path/to/image.jpg)
There are others also, like Textile. Its syntax for images is !imageurl!
@chakrit suggested using a custom syntax, e.g. bracketed URLs; this might very well be the best solution. You DEFINITELY don't want to start messing with parsing, etc.
Just make sure you properly encode the entire comment (according to the context; see my answer on this here: Will HTML Encoding prevent all kinds of XSS attacks?)
(btw I just discovered a good example of custom syntax right there... ;-) )
As also mentioned, restrict the file extension to jpg/gif/etc. (even though this can be bypassed), and also restrict the protocol (e.g. http://).
Another issue to consider besides XSS is CSRF (http://www.owasp.org/index.php/Cross-Site_Request_Forgery). If you're not familiar with this security issue, it basically allows the attacker to force my browser to submit a valid authenticated request to your application, for instance to transfer money or to change my password. If his payload is hosted on your site, he can anonymously attack any vulnerable application, including yours. (Note that even if other applications are vulnerable, it's not your fault they get attacked, but you still don't want to be the exploit host or the source of the attack...) As far as your own site goes, it's that much easier for the attacker to change a user's password on your site, for instance.