Django text file upload and security when using 'mark_safe' - django

I'm working on a Django app where the user uploads a space/tab/comma delimited text file. I display the text in a browser and the user can then interactively parse columns of delimited values which get highlighted with css as they change the settings. (Only a sample is displayed not the whole file!)
To highlight the selections I insert HTML/CSS in and around the text, but I have to 'mark_safe' the text to get the HTML/CSS to render. I assume this opens security issues, since even I, a complete noob, could insert HTML in my input file and get it to render.
My Question:
Is there something I can use to strip HTML out of the text file immediately after I've uploaded it and before I render it in the browser? Would stripping '<' and '>' out be enough? What about something to disable JavaScript if required?
I understand there are other well-documented security measures I can take regarding file uploads. However, I'm after a solution to my specific issue: I call 'mark_safe' on the input text that I then render in the browser.

Django already has automatic HTML escaping for this. Take a look at the "Automatic HTML escaping" section of the Django template docs. Hope this helps.
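One way to combine escaping with your own highlight markup is to escape each piece of uploaded text first and only then wrap it in the tags you control, so the only live HTML in the result is what the app itself added. The sketch below is a minimal illustration using the standard library's html.escape; the function name highlight_columns and the CSS class "selected" are hypothetical. In a Django view, django.utils.html.escape and django.utils.safestring.mark_safe would presumably play the same roles, with mark_safe applied only to the final, already-escaped string.

```python
from html import escape


def highlight_columns(line, delimiter, selected):
    """Escape every field, then wrap the selected column indexes in a span.

    `highlight_columns` and the "selected" class are illustrative names,
    not part of Django.
    """
    parts = []
    for index, field in enumerate(line.split(delimiter)):
        # User-supplied text is escaped before any markup is added around it.
        safe_field = escape(field)
        if index in selected:
            safe_field = '<span class="selected">' + safe_field + '</span>'
        parts.append(safe_field)
    # The delimiter also comes from user-controlled settings, so escape it too.
    return escape(delimiter).join(parts)


sample = 'id,<script>alert(1)</script>,name'
rendered = highlight_columns(sample, ',', {1})
print(rendered)
# id,<span class="selected">&lt;script&gt;alert(1)&lt;/script&gt;</span>,name
```

Escaping, rather than stripping '<' and '>', keeps the file's content visible to the user exactly as uploaded while preventing it from being parsed as markup.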

Related

Displaying Text on HTML

I made a website on Big Cartel and I need to add some text under the slideshow I am using on my homepage.
From what I gather, the only way for me to do this is by editing the HTML code. I know nothing about coding, so I tried looking up tutorials on how to add text, but I could never get the text to display on the page.
How can I go about adding the text?
I tried using code I found online, just as an example so I can see how it's displayed. I couldn't get any text to display anywhere on the page.

How to restrict Shiny fileInput to text files only?

I want to restrict my file browser to displaying only the types of file I specify, e.g. .txt files only. The only relevant snippets of code I found are like the following:
fileInput("in_file",
          "Input file:",
          accept = c("text/csv", "text/comma-separated-values,text/plain", ".csv"))
However, this doesn't filter the files showing in the browser to just .txt and .csv. Any ideas?
As far as I understand, that's the right way to do it. If you view your app in the RStudio viewer it won't do anything, but in a browser it should. I'm using Chrome, and when I ran that code it did in fact only show me txt and csv files.
Of course the user can still choose to view other files by going to the little select box and choosing to view all files, but they have to consciously choose to do that. By default only csv and txt files will be shown.

Should I Html Encode the Html input from user?

We are developing an application which takes user input as HTML and renders the same HTML as output on a different page. The input should never have any dynamic behaviour in it, such as script tags.
We HTML-encode the value in JavaScript and save the encoded value in the DB. We HTML-decode the saved value and render it on the new page to get the expected result (see the example below).
From what I have read so far, I should HTML-encode the input before rendering it as output on a different page. The problem I am facing is that whatever HTML the user added is then displayed literally, tags and all, on the new page.
Example:
User Input:
<div><h2>Header</h2><p>this is the body text</p></div>
Output in the new page when Html encoded and assigned it to another div:
<div><h2>Header</h2><p>this is the body text</p></div>
Expected:
Header
this is the body text
The only way I was able to achieve the expected result was to HTML-decode the saved value and assign it to another container control.
Am I missing something? I have tried every way I know of HTML-encoding the user input, and rendering it back does not give me the expected result. Any idea how to achieve this?
If there is no other solution, is there any validation framework available in .NET to avoid XSS attacks? I have gone through the AntiXSS framework from Microsoft, but it is more for stripping harmful HTML and encoding. It does not help in letting the user know that they should not be entering certain tags.
Thanks for any help in advance.
If the user input is HTML, and you encode it before saving it, then when you display it, you should decode it.
The recommendation to encode before displaying applies when the user input is expected to be text: encoding makes it display correctly (so that an ampersand actually displays as &) and prevents potentially malicious input from being rendered on the page and interpreted by the browser (e.g. <script> tags).
Please be careful: if you intend to display HTML that is provided by a user, sanitize the input as much as possible. Make sure they aren't trying to do anything malicious, and make sure they don't make a simple mistake that could wreck the entire layout of a web page (e.g. an opening tag without a closing tag). This type of sanitization is no simple task, and it is one of the major reasons other flavors of markup exist in the first place (e.g. Markdown, BBCode, etc.).
@Brian Ball has answered the question, but I feel some further explanation is warranted.
The many and varied encoding protocols are context-specific.
As I understand it, the only point of HTMLencoding (as opposed to other encoding protocols like URIencoding etc.) is to allow text to be rendered by a browser 'as is' if it contains characters that would otherwise be parsed as HTML (e.g. the characters & < > / and double and single quotes). The encoding 'hides' these characters from the browser's HTML parser.
So really, the only place HTMLencoding serves any purpose is at the point of preparing the text to be rendered by a browser. There is no purpose served by HTMLencoding user-entered text that is heading for a database. You may need to use other encodings for transmission, for ensuring appropriate handling by server-side languages, etc., but HTMLencoding has no place in these contexts.
In your situation, it is the very fact that you previously HTMLencoded the content that is preventing it from being rendered as HTML when you later retrieve it from the database. The encoding is doing exactly what it is meant to.
So the simple answer is,
a. there's no point HTMLencoding the user-entered data before saving it to your database, and
b. if you want it rendered as HTML rather than printed to the screen 'as is', do not HTMLencode it at the point of displaying it on another page.
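The behaviour described above can be reproduced in a few lines. The original question concerns .NET, so this is only a language-neutral illustration using Python's standard html module: encoding turns the markup into literal text, and decoding the stored value gives back exactly the original HTML, which is why the encode-on-save, decode-on-display round trip adds nothing.

```python
from html import escape, unescape

# The user's input from the example in the question.
user_input = '<div><h2>Header</h2><p>this is the body text</p></div>'

# Encoding: the browser will now print the tags instead of parsing them.
encoded = escape(user_input)
print(encoded)
# &lt;div&gt;&lt;h2&gt;Header&lt;/h2&gt;&lt;p&gt;this is the body text&lt;/p&gt;&lt;/div&gt;

# Decoding simply undoes the encoding, so the round trip is a no-op.
decoded = unescape(encoded)
print(decoded == user_input)
# True
```

If the HTML is meant to be rendered as HTML, the protection has to come from sanitizing what the user is allowed to submit, not from encoding it.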

sitecore - dot in rich text editor

I'm finding that some strange things are happening with my rich text editor in Sitecore 6.6. Some users have been entering content that is breaking items. Looking at the text it seems fine, but when I copy the text into Notepad I can see this dot character "·"
Any time this is in the text editor, clicking Show Editor does nothing and I can't view raw values either. When I click Edit Html the edit box comes up but my changes never save. The only way I can seem to fix these pages is to update the content directly in the database.
Is there a way I can:
Prevent these characters from getting into the database,
Update the content without having to run update statements on the database?
Thanks in advance.

Importing HTML into TinyMCE using ColdFusion

Hey everyone, I would appreciate a pointer in the right direction with the problem I'm having. In short, I'm working on an application that will create PDFs using TinyMCE and ColdFusion 8. I have the ability to create a PDF by just entering text, pictures, etc. However, I want to be able to import an HTML template and insert it into TinyMCE.
Basically, I have a file directory code snippet that lets me browse through my 'HTMLTemplates' folder and select an HTML document. Now I want to take all the code from that selected HTML document and insert it into my TinyMCE box. Any tips on how I might do this?
Thanks!
If I understood you correctly, you already have a TinyMCE plugin which pops up a window and lets you browse a certain directory using an existing cfm page which you render within the popup window. Right?
If not, you should start with this. I'm not sure how easily it is done in the current version, but in an older TinyMCE I created a custom upload plugin (needed to track the site security permissions for the current user) pretty quickly.
Next, I can see two quick ways to pass the server file contents to the client-side:
Make it available via HTTP so you can make the GET request and read contents into the variable.
Output it on the page using CF (say, on form submit when file selected) and grab using JavaScript.
I'd personally try the second option. After you grab the text into a variable you can put it into TinyMCE using its API.
It can be as simple as outputting escaped text into a hidden div with a known ID and reading it using DOM operations (assuming that there is a cfoutput around it):
<div id="myTemplate">#HTMLEditFormat(myFileContents)#</div>
You can also output the text directly into a JavaScript variable (of course, with accurate escaping), maybe like this:
<script type="text/javascript">
var text = '#HTMLEditFormat(myFileContents)#';
</script>
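HTML escaping and JavaScript string escaping are different contexts, so "accurate escaping" for the inline variable above means producing a valid JavaScript string literal that also cannot close the surrounding script tag. The sketch below shows the idea in Python rather than ColdFusion, purely as an illustration; the helper name js_string_literal is hypothetical, and a ColdFusion version would need whatever equivalent escaping function is available there.

```python
import json


def js_string_literal(text):
    """Return `text` as a quoted JavaScript string literal.

    Hypothetical helper: json.dumps handles quotes, backslashes and
    newlines; the extra replacements keep a literal "</script>" in the
    data from ending the enclosing script element.
    """
    literal = json.dumps(text)
    return (literal.replace('<', '\\u003c')
                   .replace('>', '\\u003e')
                   .replace('&', '\\u0026'))


template_html = '<p>It\'s "quoted"</p></script>'
literal = js_string_literal(template_html)
print('var text = ' + literal + ';')
```

The browser's JavaScript parser turns the \u003c style escapes back into the original characters, so the variable ends up holding the unmodified template.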
The most advanced way, possibly better for performance (and definitely "cooler"), is to use script tags as data containers, like this:
<script type="text/plain">
#HTMLEditFormat(myFileContents)#
</script>
The last time I saw this was in Nadel's blog, I think. Read it, it's pretty interesting.
Hope this helps.