I'm working on a website where there is going to be a lot of user generated content. As an WYSIWYG editor I'm using tinyMCE. As a template engine, I'm using ejs.
In order to prevent XSS I decided to use xss npm package.
I'm using these custom rulles:
const strict = {
whiteList: {},
}
const withTags = {
whiteList: {
div: [],
strong: [],
em: [],
br: [],
ul: [],
li: [],
ol: [],
blockquote: [],
},
}
Let's say a user uses text area and submits this code:
<script>alert("hi")</script>
In my DB it's saved this way:
<script>alert("hi")</script>
Now, when the content is rendered by ejs some content is rendered with escaped output (<%= %>). Rendered escaped output (html page) looks like this:
<script>console.log("test")</script>
However, some content is rendered with ejs unescaped raw output (<%- %>). Html will look this way:
<script>console.log("test")</script>
I have 2 questions:
Because < and > are escaped as < and > is it safe to render ejs unescaped output?
Is what I'm doing generally enough for XSS protection?
Thank you very much!
First of all your data probably isn't saved in database the way you presented. In all likelihood it's stored there without any encoding (as it should be).
EJS in itself, when used correctly, takes care of encoding output for you so that you can safely construct parameterized HTML. But in your case you want to disable this protection to render raw HTML, so yes, you must be careful. There are a couple of security controls at your disposal.
1. DOMPurify
I haven't used the xss library personally, it seems to have a lot of downloads and probably it's not a bad option. But DOMPurify is probably better. It also doesn't require configuration and has built-in support for trusted-types (I'll get to that in a minute).
You would use it twice. First on server-side when the HTML is submitted by the user, and second on client-side when the HTML is rendered by EJS.
If you are serious about security then you will connect anomaly alerts from the server-side purification to your SIEM/SOC etc. Then you will know when someone has attempted an XSS attack on your website.
2. Sandboxed Iframes
Another client-side control that you can implement is sandboxed iframes. Instead of just rendering the HTML on the page, you create an IFRAME, give it a properly configured sandbox attribute, and then set the purified HTML as the content. Now even if something goes wrong with the purification, the malicious HTML would be isolated in its own world.
3. Content Security Policy
The coolest and (when used properly) most effective defence against XSS is CSP. How it works is that you give your website restrictions such as "do not execute scripts", "do not load images", etc. And then you allow the scripts that you do want to execute, and nothing else. Now if an attacker manages to inject a script, link, form, etc. on the page, it will not work because it hasn't been specifically allowed.
I've written about CSP at length here, you will even find specific examples for your case (NodeJS and EJS) with CodeSandbox examples on that article. And in general about XSS protection you can read more here.
Hope this helps!
Related
I'm trying to prevent XSS attack in Django, is it ok to use {{ body|escape }} in my HTML file only? How can I do some filtration in the backend?
Not every case is the same security-wise. So hard to give complete advice without seeing your application and the use cases and the Version of Django you use.
If you use the Django's template system and make sure that auto-escaping is enabled (it is enabled by default in recent versions), you're 9x% percent safe. Django provides an auto-escaping mechanism for stopping XSS: it'll automatically escape data that are dynamically inserted into the template. You still have to be aware of some issues:
All the attributes where dynamic data is inserted. Do <img alt="{{somevar}}"> instead of . Django's auto-escaping will not cover your unquoted attribute values.
Data inserted into CSS (style tags and attributes) or Javascript (script blocks, event handlers, and onclick attributes), you must manually escape the data using escaping rules that are appropriate for CSS or Javascript (Probably using a custom filter on the Python side).
Data inserted into a URL attribute (href, img src), you must manually validate the URL to make sure it is safe by checking the protocol against a whitelist of allowed protocols (e.g. https:, mailto:, ... but never javascript:).
Avoid setting html attributes from user input.
If you use mark_safe, make sure you know what you are doing and the data is really "safe".
There is always more but this covers the most known issues. Always make sure to refer to OWASP to understand the different XSS attacks and how they apply to your specific application:
https://owasp.org/www-community/attacks/xss/
https://cheatsheetseries.owasp.org/cheatsheets/Cross_Site_Scripting_Prevention_Cheat_Sheet.html
I have 500 or so spambots and about 5 actual registered users on my wiki. I have used nuke to delete their pages but they just keep reposting. I have spambot registration under control using reCaptcha. Now, I just need a way to delete/block/merge about 500 users at once.
You could just delete the accounts from the user table manually, or at least disable their authentication info with a query such as:
UPDATE /*_*/user SET
user_password = '',
user_newpassword = '',
user_email = '',
user_token = ''
WHERE
/* condition to select the users you want to nuke */
(Replace /*_*/ with your $wgDBprefix, if any. Oh, and do make a backup first.)
Wiping out the user_password and user_newpassword fields prevents the user from logging in. Also wiping out user_email prevents them from requesting a new password via email, and wiping out user_token drops any active sessions they may have.
Update: Since I first posted this, I've had further experience of cleaning up large numbers of spam users and content from a MediaWiki installation. I've documented the method I used (which basically involves first deleting the users from the database, then wiping out up all the now-orphaned revisions, and finally running rebuildall.php to fix the link tables) in this answer on Webmasters Stack Exchange.
Alternatively, you might also find Extension:RegexBlock useful:
"RegexBlock is an extension that adds special page with the interface for blocking, viewing and unblocking user names and IP addresses using regular expressions."
There are risks involved in applying the solution in the accepted answer. The approach may damage your database! It incompletely removes users, doing nothing to preserve referential integrity, and will almost certainly cause display errors.
Here a much better solution is presented (a prerequisite is that you have installed the User merge extension):
I have a little awkward way to accomplish the bulk merge through a
work-around. Hope someone would find it useful! (Must have a little
string concatenation skills in spreadsheets; or one may use a python
or similar script; or use a text editor with bulk replacement
features)
Prepare a list of all SPAMuserIDs, store them in a spreadsheet or textfile. The list may be
prepared from the user creation logs. If you do have the
dB access, the Wiki_user table can be imported into a local list.
The post method used for submitting the Merge & Delete User form (by clicking the button) should be converted to a get method. This
will get us a long URL. See the second comment (by Matthew Simoneau)
dated 13/Jan/2009) at
http://www.mathworks.com/matlabcentral/newsreader/view_thread/242300
for the method.
The resulting URL string should be something like below:
http: //(Your Wiki domain)/Special:UserMerge?olduser=(OldUserNameHere)&newuser=(NewUserNameHere)&deleteuser=1&token=0d30d8b4033a9a523b9574ccf73abad8%2B\
Now, divide this URL into four sections:
A: http: //(Your Wiki domain)/Special:UserMerge?olduser=
B: (OldUserNameHere)
C: &newuser=(NewUserNameHere)&deleteuser=1
D: &token=0d30d8b4033a9a523b9574ccf73abad8%2B\
Now using a text editor or spreadsheet, prefix each spam userIDs with part A and Suffix each with Part C and D. Part C will include the
NewUser(which is a specially created single dummy userID). The Part D,
the Token string is a session-dependent token that will be changed per
user per session. So you will need to get a new token every time a new
session/batch of work is required.
With the above step, you should get a long list of URLs, each good to do a Merge&Delete operation for one user. We can now create a
simple HTML file, view it and use a batch downloader like DownThemAll
in Firefox.
Add two more pieces " Linktext" to each line at
beginning and end. Also add at top and at
bottom and save the file as (for eg:) userlist.html
Open the file in Firefox, use DownThemAll add-on and download all the files! Effectively, you are visiting the Merge&Delete page for
each user and clicking the button!
Although this might look a lengthy and tricky job at first, once you
follow this method, you can remove tens of thousands of users without
much manual efforts.
You can verify if the operation is going well by opening some of the
downloaded html files (or by looking through the recent changes in
another window).
One advantage is that it does not directly edit the
MySQL pages. Nor does it require direct database access.
I did a bit of rewriting to the quoted text, since the original text contains some flaws.
On my site, I want to allow users to add reference to images which are hosted anywhere on the internet. These images can then be seen by all users of my site. As far as I understand, this could open the risk of cross site scripting, as in the following scenario:
User A adds a link to a gif which he hosts on his own webserver. This webserver is configured in such a way, that it returns javascript instead of the image.
User B opens the page containg the image. Instead of seeing the image, javascript is executed.
My current security messures are currently such, that both on save and open, all content is encoded.
I am using asp.net(c#) on the server and a lot of jquery on the client to build ui elements, including the generation of image tags.
Is this fear of mine correct? Am I missing any other important security loopholes here? And most important of all, how do I prevent this attack? The only secure way I can think of right now, is to webrequest the image url on the server and check if it contains anything else than binary data...
Checking the file is indeed an image won't help. An attacker could return one thing when the server requests and another when a potential victim makes the same request.
Having said that, as long as you restrict the URL to only ever be printed inside the src attribute of an img tag, then you have a CSRF flaw, but not an XSS one.
Someone could for instance create an "image" URL along the lines of:
http://yoursite.com/admin/?action=create_user&un=bob&pw=alice
Or, more realistically but more annoyingly; http://yoursite.com/logout/
If all sensitive actions (logging out, editing profiles, creating posts, changing language/theme) have tokens, then an attack vector like this wouldn't give the user any benefit.
But going back to your question; unless there's some current browser bug I can't think of you won't have XSS. Oh, remember to ensure their image URL doesn't include odd characters. ie: an image URL of "><script>alert(1)</script><!-- may obviously have bad effects. I presumed you know to escape that.
Your approach to security is incorrect.
Don't approach the topic as "I have a user input, so how can I prevent XSS". Rather approach it like it this: "I have user input - it should be restrictive as possible - i.e. allowing nothing through". Then based on that allow only what's absolutely essential - plain-text strings thoroughly sanitized to prevent anything but a URL, and the specific, necessary characters for URLS only. Then Once it is sanitized I should only allow images. Testing for that is hard because it can be easily tricked. However, it should still be tested for. Then because you're using an input field you should make sure that everything from javascript scripts and escape characters, HTML, XML and SQL injections are all converted to plaintext and rendered harmless and useless. Consider your users as being both idiots and hackers - that they'll input everything incorrectly and try to hack something into your input space.
Aside from that you may run into som legal issues with regard to copyright. Copyrighted images generally may not be used on other people's sites without the copyright owner's consent and permission - usually obtained in writing (or email). So allowing users the opportunity to simply lift images from a site could run the risk of allowing them to take copyrighted material and reposting it on your site without permission which is illegal. Some sites are okay with citing the source, others require a fee to be paid, and others will sue you and bring your whole domain down for copyright infringement.
Even famous sites like Twitter are suffering from XSS vulnerability, what should we do to prevent this kind of attack?
The #1 Thing you can do is set your cookies to HTTP Only ... which at least protects against session cookie hijacking. Like someone stealing your cookie when you are likely admin of your own site.
The rest comes down to validating all user input.
RULE #0 - Never Insert Untrusted Data Except in Allowed Locations
RULE #1 - HTML Escape Before Inserting Untrusted Data into HTML Element Content
RULE #2 - Attribute Escape Before Inserting Untrusted Data into HTML Common Attributes
RULE #3 - JavaScript Escape Before Inserting Untrusted Data into HTML JavaScript Data Values
RULE #4 - CSS Escape Before Inserting Untrusted Data into HTML Style Property Values
RULE #5 - URL Escape Before Inserting Untrusted Data into HTML URL Attributes
Very lengthy subject discussed in detail here:
http://www.owasp.org/index.php/XSS_(Cross_Site_Scripting)_Prevention_Cheat_Sheet
http://www.owasp.org/index.php/Cross_site_scripting
XSS is only one of many exploits and every web dev should learn the top 10 OWASP by heart imho
http://www.owasp.org/index.php/Top_10_2007
Just like you can make SQL injection a non-issue by using prepared statements, you can make XSS non-issue by using templating engine (DOM serializer) that does similar thing.
Design your application so that all output goes via templating engine. Make that templating engine HTML-escapes all data by default. This way you'll have system that's secure by default and does not rely on humans (and rest of the large system) being diligent in escaping of HTML.
I don't what you write your code with, but if your use asp.net, you are partly covered.
asp.net has what they call request validation that when enabled, it prevent malicious script to be introduced via user input.
But sometimes, you'll have to allow some kind of text editor like the one you typed in this question. In this case, you'll have to partly disable request validation to allow some "rich text" html to be input by the end user. In this case you will have to build some kind of white list filtering mechanism.
FYI, I don't know about others but Microsft has library called Anti-Xss.
I'm using ASP.NET Web Forms for blog style comments.
Edit 1: This looks way more complicated then I first thought. How do you filter the src?
I would prefer to still use real html tags but if things get too complicated that way, I might go a custom route. I haven't done any XML yet, so do I need to learn more about that?
If IMG is the only thing you'd allow, I'd suggest you use a simple square-bracket syntax to allow it. This would eliminate the need for a parser and reduce a load of other dangerous edge cases with the parser as well. Say, something like:
Look at this! [http://a.b.c/m.jpg]
Which would get converted to
Look at this! <img src="http://a.b.c/m.jpg" />
You should filter the SRC address so that no malicious things get passed in the SRC part too. Like maybe
Look at this! [javascript:alert('pwned!')]
Use an XML parser to validate your input, and drop or encode all elements, and attributes, that you do not want to allow. In this case, delete or encode all tags except the <img> tag, and all attributes from that except src, alt and title.
If you end up going with a non-HTML format (which makes things easier b/c you can literally escape all HTML), use a standard syntax like markdown. The markdown image syntax is ![alt text](/path/to/image.jpg)
There are others also, like Textile. Its syntax for images is !imageurl!
#chakrit suggested using a custom syntax, e.g. bracketed URLs - This might very well be the best solution. You DEFINITELY dont want to start messing with parsing etc.
Just make sure you properly encode the entire comment (according to the context - see my answer on this here Will HTML Encoding prevent all kinds of XSS attacks?)
(btw I just discovered a good example of custom syntax right there... ;-) )
As also mentioned, restrict the file extension to jpg/gif/etc - even though this can be bypassed, and also restrict the protocol (e.g. http://).
Another issue to be considered besides XSS - is CSRF (http://www.owasp.org/index.php/Cross-Site_Request_Forgery). If you're not familiar with this security issue, it basically allows the attacker to force my browser to submit a valid authenticated request to your application, for instance to transfer money or to change my password. If this is hosted on your site, he can anonymously attack any vulnerable application - including yours. (Note that even if other applications are vulnerable, its not your fault they get attacked, but you still dont want to be the exploit host or the source of the attack...). As far as your own site goes, it's that much easier for the attacker to change the users password on your site, for instance.