This link describes an exploit into my app using fckEditor:
http://knitinr.blogspot.com/2008/07/script-exploit-via-fckeditor.html
How do I make my app secure while still using fckEditor? Is it an fckEditor configuration? Is it some processing I'm supposed to do server-side after I grab the text from fckEditor?
It's a puzzle because fckEditor USES html tags for its formatting, so I can't just HTML encode when I display back the text.
Sanitize the HTML server-side; there is no other choice. For PHP that would be HTML Purifier; for .NET I don't know. Sanitizing HTML is tricky: it's not sufficient to strip script tags, you also have to watch out for on* event handlers and more, thanks to IE's quirks for example.
Also, with custom HTML and CSS it's easy to hijack the look and layout of your site, for example with an absolutely positioned overlay that covers the whole screen. Be prepared for that.
The bug is not actually FCKeditor's fault. As long as you let users edit HTML that will be displayed on your web site, they will always have the possibility to do harm unless you check the data before you output it.
Some people HTML-encode the output to prevent this, but that will destroy all the formatting done by FCKeditor, which is not what you want.
Maybe you can use the Microsoft Anti-Cross Site Scripting Library. Samples on MSDN
Is it some processing I'm supposed to do server-side after I grab the text from fckEditor?
Precisely. StackOverflow had some early issues related to this as well. The easiest way to solve it is to use an HTML library to parse the user's input, and then escape any tags you don't want in the output. Do this as a post-processing step when printing to the page -- the data in the database should be exactly what the user typed in.
For example, if the user enters <b><script>evil here</script></b>, your code would translate it to <b>&lt;script&gt;evil here&lt;/script&gt;</b> before rendering the page.
And do not use regular expressions for solving this, that's just an invitation for somebody clever to break it again.
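A minimal sketch of the parse-then-escape idea above, in Python using only the standard library's html.parser. The allowlist contents and the names ALLOWED_TAGS, EscapingSanitizer and sanitize are made up for illustration; a maintained sanitizer library is still the safer choice for production use.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical allowlist for illustration; adjust to your own needs.
ALLOWED_TAGS = {"b", "i", "em", "strong", "p"}


class EscapingSanitizer(HTMLParser):
    """Re-emit allowlisted tags (without attributes); escape everything else.

    Comments, doctypes and processing instructions are silently dropped,
    because their handlers are left as the default no-ops.
    """

    def __init__(self):
        # convert_charrefs=False so entities reach handle_entityref/charref.
        super().__init__(convert_charrefs=False)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag in ALLOWED_TAGS:
            self.out.append(f"<{tag}>")  # all attributes are dropped
        else:
            self.out.append(escape(self.get_starttag_text()))

    def handle_startendtag(self, tag, attrs):
        # Self-closing form such as <br/>; treat it like a start tag.
        self.handle_starttag(tag, attrs)

    def handle_endtag(self, tag):
        closing = f"</{tag}>"
        self.out.append(closing if tag in ALLOWED_TAGS else escape(closing))

    def handle_data(self, data):
        self.out.append(escape(data))

    def handle_entityref(self, name):
        self.out.append(f"&{name};")  # keep existing entities as they are

    def handle_charref(self, name):
        self.out.append(f"&#{name};")


def sanitize(user_html):
    parser = EscapingSanitizer()
    parser.feed(user_html)
    parser.close()
    return "".join(parser.out)


print(sanitize("<b><script>evil here</script></b>"))
# <b>&lt;script&gt;evil here&lt;/script&gt;</b>
```

Because the stored text is never modified, the allowlist can be tightened or loosened later without touching the database.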
FCKEditor can be configured to use only a few tags. You will need to encode everything except for those few tags.
Those tags are: <strong> <em> <u> <ol> <ul> <li> <p> <blockquote> <font> <span>.
The font tag only should have face and size attributes.
The span tag should only have a class attribute.
No other attributes should be allowed for these tags.
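The allowlist described above could be written down as data, roughly like this Python sketch. The name ALLOWED_ATTRS and the helper render_start_tag are hypothetical; the (name, value) pairs are assumed to come from whatever HTML parser you use, and encoding the disallowed tags is left to the caller.

```python
from html import escape

# Tags from the list above, mapped to the attributes each one may keep.
ALLOWED_ATTRS = {
    "strong": set(), "em": set(), "u": set(),
    "ol": set(), "ul": set(), "li": set(),
    "p": set(), "blockquote": set(),
    "font": {"face", "size"},
    "span": {"class"},
}


def render_start_tag(tag, attrs):
    """Return a cleaned start tag, or None if the tag is not allowlisted.

    attrs is a list of (name, value) pairs; value may be None for
    attributes written without a value.
    """
    tag = tag.lower()
    if tag not in ALLOWED_ATTRS:
        return None  # caller should encode the original tag instead
    kept = [
        (name.lower(), value or "")
        for name, value in attrs
        if name.lower() in ALLOWED_ATTRS[tag]
    ]
    # Attribute values are re-escaped so they cannot break out of the quotes.
    rendered = "".join(f' {name}="{escape(value)}"' for name, value in kept)
    return f"<{tag}{rendered}>"


print(render_start_tag("font", [("face", "Arial"), ("onclick", "evil()")]))
# <font face="Arial">
```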
I understand the DONTS. I'm lacking a DO.
Is use of FCKEditor a requirement, or can you use a different editor/markup language? I advise using Markdown and WMD Editor, the same language used by StackOverflow. The Markdown library for .NET should have an option to escape all HTML tags -- be sure to turn it on.
XSS is a tricky thing. I suggest some reading:
Is HTML a Humane Markup Language?
Safe HTML and XSS
Anyway, my summary is that, when it comes down to it, you have to allow in only strictly accepted items; you can't just reject known exploit vectors, or you'll always be a step behind in an eternal struggle.
I think the issue raised by some is not that FCKeditor only emits a few tags. It is naive to assume that a malicious user will use FCKeditor to write his payload; the tools that allow input to be changed manually are legion.
I treat all user data as tainted; and use Markdown to convert text to HTML. It sanitizes any HTML found in the text, which reduces malice.
Related
I'm using pdfkit to generate a PDF of a Django template (doing this by getting an HTML string of the page from Django's get_template and render functions and passing that string to pdfkit... see post).
On this page, I have some TextAreas that can contain many lines of text, and by default, they just get cut off when generating the PDF.
I've tried to fix this by using some JavaScript libraries (I've tried several) to automatically expand the TextAreas on page load. I can get these to work perfectly on normal pages, but when I try to include one on the PDF template, I get various errors ranging from not working at all to expanding the TextArea way too much. My first assumption was that some styling differences were causing the issues, but I'm fairly certain I've ruled that out. I tried to load the PDF template directly as a view, and the TextAreas resized correctly, leading me to believe that something in pdfkit's generation isn't playing nicely with the resizing.
Given this, I tried to look if pdfkit has any suggestions for issues like this and couldn't find any, and I also tried to use different input types other than TextAreas, none of which were able to display newlines correctly.
I can't think of any other potential solutions at this point, and I'm open to suggestions. Please let me know if you feel I should provide additional information, and thank you in advance.
I ended up finding a relatively simple fix. Because I was using django forms, I was pretty easily able to change from displaying the form Textarea:
{{ form.paragraph_data }}
to displaying just the plain text:
{{ form.paragraph_data.initial }}
However, this initially caused the newlines to not display correctly, because HTML doesn't process them in a plain string. So I added some processing in the creation of the form to replace the newlines with <br />s:
form.fields['paragraph_data'].initial = form.fields['paragraph_data'].initial.replace('\n', '<br />')
Finally, I had to add the safe filter to the Django template line to tell it to actually render the HTML rather than escaping it:
{{ form.paragraph_data.initial|safe }}
Again, this was partially easy because of Django forms, but it should translate relatively easily to a more standard javascript/html solution.
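One caveat with the safe filter: it switches off Django's escaping for that value, so any markup a user typed into the textarea is rendered as-is. A hedged sketch of doing the newline replacement with escaping first, in plain Python (the function name text_to_html is made up for illustration):

```python
from html import escape


def text_to_html(text):
    """Escape user text, then join its lines with <br /> tags."""
    # splitlines() copes with both "\n" and the "\r\n" that browsers submit.
    return "<br />".join(escape(line) for line in text.splitlines())


print(text_to_html("first line\r\n<script>alert(1)</script>"))
# first line<br />&lt;script&gt;alert(1)&lt;/script&gt;
```

Output produced this way is reasonable to pass through safe. Django's built-in linebreaksbr template filter does a similar escape-then-convert when autoescaping is on, so applying it to the unmodified initial value, as in {{ form.paragraph_data.initial|linebreaksbr }}, may be all that is needed.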
I just want text and hyperlinks, and not <p> tags. I have also had an issue when putting in a list, and each <li> gets a &nbsp; put in front of it which is being recognized as a new paragraph.
Is there any way to stop the rich text editor from adding these in?
The Sitecore Rich Text Editor is configurable in a variety of ways. Internally it's an instance of the Telerik RAD Editor. Hence you can apply many of the same configuration strategies documented on Telerik's site to it.
A while back I wrote a blog post about how you can stop the editor from messing about with your HTML:
https://blog.jermdavis.dev/posts/2014/ever-wished-the-rich-text-field-didnt-mess-with-your-html
While that's not addressing your exact issue, the general strategy there for configuring the internal behaviour of the editor can probably be used to meet your requirements. The underlying editor has a series of filtering behaviours that you can enable and disable to help with your requirements. The "FixEnclosingP" and "ConvertCharactersToEntities" options might be of help here? They're documented on Telerik's website:
http://www.telerik.com/help/aspnet-ajax/t_telerik_web_ui_editorfilters.html
There are also other strategies, such as post-processing the HTML that's saved by the editor. Sitecore's SaveRichTextContent pipeline might be of help here? This blog post might offer you some ideas about how that can be used:
https://techmusingz.wordpress.com/2014/06/14/wrapping-rich-text-value-in-paragraph-tag-in-sitecore/
I have a textbox in my MVC view that allows the user to enter HTML tags, but only a few tags (such as B, I, U, and A).
For this, I have set the ValidateInput attribute on my POST action to False, so it allows users to enter HTML tags.
But now I want to prevent users from entering other HTML tags (such as INPUT, SCRIPT, etc.), meaning anything except the ones I want to allow.
I guess one way is to use a regex, but I am unable to find a proper regex for this.
Any idea how to achieve this? Any help on this is much appreciated.
Thanks and Regards
That's dangerous, man. Your users could still insert undesired tags using some tricks, for example encoding data. Even if you try to think of all the possible ways a user can employ to enter "dangerous" tags in your code, he'll find an additional one.
So you should look for some kind of proven solution to your problem. Search for an HTML sanitizer, for example Google "ASP.NET MVC sanitize html input", and you'll find several solutions. The AntiXSS library could be a good one; it's now called the Microsoft Web Protection Library. You can include it in your solution as a NuGet package:
Install-Package AntiXSS
I recommend reading this article to get a deeper view of the problem and its solutions:
.NET HTML Sanitation for rich HTML Input
In this article you'll find AntiXSS and a less restrictive solution, with a full explanation of the pros, cons, and how it all works. Don't miss the references in the comments.
I have a WYSIWYG editor in my ColdFusion app and need to prevent XSS attacks. Are there any ColdFusion ways to strip out all script-type attacks?
http://blog.pengoworks.com/index.cfm/2008/1/3/Using-AntiSamy-to-protect-your-CFM-pages-from-XSS-hacks
http://code.google.com/p/owaspantisamy/downloads/list
The main question I would ask is what this WYSIWYG is for. Many WYSIWYG editors allow you to define specific tags to be stripped out of the code.
For instance you can have TinyMCE strip out the script tags with
http://wiki.moxiecode.com/index.php/TinyMCE:Configuration/invalid_elements
This unfortunately does not solve your problem, since all client-side form submissions can be circumvented. If you must use a WYSIWYG, then what you really need to do is cover all your bases in the form's validation and display. You can strip out all script tags and make sure to remove any event attributes and any JavaScript code in links' href attributes.
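For the href part of that checklist, a small Python sketch of a scheme allowlist. The name is_safe_href and the set of allowed schemes are assumptions for illustration, and it assumes the attribute value has already been entity-decoded by an HTML parser:

```python
from urllib.parse import urlsplit

# Hypothetical allowlist; anything else (javascript:, data:, ...) is rejected.
SAFE_SCHEMES = {"http", "https", "mailto"}


def is_safe_href(href):
    """Return True if href is relative or uses an allowlisted scheme."""
    # Drop whitespace and control characters, which browsers may ignore
    # inside the scheme (e.g. "java\tscript:").
    cleaned = "".join(ch for ch in href if ch > " ")
    try:
        scheme = urlsplit(cleaned).scheme
    except ValueError:
        return False  # malformed URL: reject rather than guess
    return scheme == "" or scheme.lower() in SAFE_SCHEMES
```

Checking against the schemes you accept, rather than searching for the string "javascript", follows the same allowlist reasoning as for tags.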
If it is acceptable to only allow a specific subset of tags I would suggest either using BBML, BBCode, or Markdown.
http://www.depressedpress.com/Content/Development/ColdFusion/Extensions/DP_ParseBBML/Index.cfm
http://en.wikipedia.org/wiki/BBCode
http://sebduggan.com/projects/cfxmarkdown
You can use TinyMCE as a WYSIWYG for BBCode http://tinymce.moxiecode.com/examples/example_09.php and StackOverflow uses a great markdown editor http://github.com/cky/wmd
Here is some good info if you would like to render BBCode in ColdFusion:
http://www.sitepoint.com/forums/showthread.php?t=248040
Something to consider is that while stripping the tags out in the browser with TinyMCE is a good idea, it makes a fatal assumption that the user is going to be submitting content via the browser. Anything that you do in the browser needs to be duplicated on the server because attackers can bypass any validation that happens in the browser.
With that said check this article: http://www.fusionauthority.com/techniques/3908-how-to-strip-tags-in-three-easy-lessons.htm which spells this out in more detail than I could here. Basically it discusses using regex and UDFs to strip tags out easily. The last example is particularly important... check it out.
To convert the < and > of these tags into &lt; and &gt;, use the HTMLEditFormat function.
I've found a similar question here, but I'm looking for more general solutions.
As it is now, when Django generates any kind of HTML for you (this mainly happens when generating forms), it uses self-closing tags by default, i.e. <br /> instead of <br>. <br /> is valid XHTML and I think HTML5 also, but it's not valid HTML4.
Is there any clean way to override this? Or is it better to write django sites in XHTML or HTML5 instead?
There was a whole series of discussions on this when development for 1.2 kicked off, with a range of solutions proposed, but no general way forward was agreed.
But see Simon Willison's Django-HTML project for one possible solution.
You can entirely rewrite the way Django outputs HTML for you. E.g., for a form, you can:
choose between output using table, p or li by using the "as_xxx" methods (as_table, as_p, as_ul).
print the form label by label, choosing the tag wrappers.
use widgets to define how each form piece is rendered to HTML.
Of course you need the new forms to do so, and therefore Django 1.x.