Pyramid Chameleon template security for HTML and Javascript - xss

Do Chameleon templates escape/strip XSS and HTML tags for variables? Would the following be safe?
<script type="text/javascript">
var initialComments = ${comments};
for (var i = 0; i < initialComments.length; i++) {
initialComments[i].userId = initialComments[i].user_id;
}
var post = ${post}
// ...
</script>

This is actually harder than it seems, and depends upon circumstances. If the output format is HTML, then the script contents are a special CDATA section, wherein the XML/SGML escapes are not interpreted. If the output format is XML, with XML content type, then the escapes work as expected. Even though it could be possible to patch Chameleon so that it understood the <script></script> element as special, currently it does not.
When embedded in <script> tag, depending on a browser version, a sole </ or </script will end the script tag, making it vulnerable to XSS. Furthermore, a < can start a HTML comment as in <!--, and legacy considerations require that <script></script> can be written balanced WITHIN the script tag, thus the safest is to escape the < altogether in all strings.
Another problem is that JSON is not a strict subset of JavaScript, allowing character strings like U+2028 and U+2029 break the script, if you use ensure_ascii=False:
Thus the proper way to do this on python/chameleon, to embed within a <script> tag in HTML is to use:
post_json = (json.dumps(foo, ensure_ascii=False)
.replace('\u2028', r'\u2028')
.replace('\u2029', r'\u2029')
.replace('<', r'\u003c'))
This is true for only content embedded in <script> tags.
or if you just want to get unreadable escapes for pretty much all characters you can be content with:
post_json = json.dumps(foo, ensure_ascii=True)\
.replace('<', r'\u003c')
and then embed with
${structure: post_json}
I posted a feature request in the Python bug tracker to have a keyword argument to do this automatically, but it was rejected.
The safer way to embed stuff is to store it in data-* attributes.
UPDATE escaping </ alone is not enough.

From the Chameleon introduction page:
By default, the string is escaped before insertion. To avoid this, use the structure: prefix
However, that won't escape JavaScript values. Use JSON for that; in your view, use:
post_json = (json.dumps(post)
.replace(u'<', u'\\u003c')
.replace(u'>', u'\\u003e')
.replace(u'&', u'\\u0026')
.replace(u"'", u'\\u0027'))
to produce a HTML-safe JSON string; JSON (as produced by json.dumps() at least 1) is a Javascript subset, here with any HTML-dangerous characters escaped (with thanks to the Flask json.htmlsafe_dumps() function).
Interpolate that into your template using structure:
<script type="text/javascript">
var initialComments = ${comments};
for (var i = 0; i < initialComments.length; i++) {
initialComments[i].userId = initialComments[i].user_id;
}
var post = ${structure:post_json};
// ...
</script>
1JSON allows for U+2028 and U+2029 characters but the json.dumps() function escapes all non-ASCII codepoints by default.

Related

Hyperlinks inside object fields

I have an object inside my Ember app, with a description field. This description field may contain hyperlinks, like this
My fancy text <a href='http://other.site.com' target='_blank'>My link</a> My fancy text continues...
However, when i output it normally, like {{ description }} my hyperlinks are displayed as a plain text. Why is this happening and how can i fix this?
Handlebars escapes any HTML within output by default. For unescaped text in markup use triple-stashes:
{{{ description }}}
There's an alternative when one controls the property: Handlebars.SafeString. SafeStrings are assumed to be safe and are not escaped either. From the documentation:
Handlebars.registerHelper('link', function(text, url) {
text = Handlebars.Utils.escapeExpression(text);
url = Handlebars.Utils.escapeExpression(url);
var result = '' + text + '';
return new Handlebars.SafeString(result);
});
Note - please be careful with this. There are security concerns with rendering unescaped text that has come from user input; an attacker could inject a malicious script into the description and hijack your page, for example.

actionscript htmltext. removing a table tag from dynamic html text

AS 3.0 / Flash
I am consuming XML which I have no control over.
the XML has HTML in it which i am styling and displaying in a HTML text field.
I want to remove all the html except the links.
Strip all HTML tags except links
is not working for me.
does any one have any tips? regEx?
the following removes tables.
var reTable:RegExp = /<table\s+[^>]*>.*?<\/table>/s;
but now i realize i need to keep content that is the tables and I also need the links.
thanks!!!
cp
Probably shouldn't use regex to parse html, but if you don't care, something simple like this:
find /<table\s+[^>]*>.*?<\/table\s+>/
replace ""
ActionScript has a pretty neat tool for handling XML: E4X. Rather than relying on RegEx, which I find often messes things up with XML, just modify the actual XML tree, and from within AS:
var xml : XML = <page>
<p>Other elements</p>
<table><tr><td>1</td></tr></table>
<p>won't</p>
<div>
<table><tr><td>2</td></tr></table>
</div>
<p>be</p>
<table><tr><td>3</td></tr></table>
<p>removed</p>
<table><tr><td>4</td></tr></table>
</page>;
clearTables (xml);
trace (xml.toXMLString()); // will output everything but the tables
function removeTables (xml : XML ) : void {
xml.replace( "table", "");
for each (var child:XML in xml.elements("*")) clearTables(child);
}

Simple Django question: how to allow html as output from a variable?

this template variable {{object.video.description}} is outputing this text:
Welcome to Saint Francis Academy in the heart of Washington.
How can I get the link to show as an actual link instead of being replaced with html entities. I tried filtering it as safe but no luck: {{object.video.description|safe}}
Can you go to the django shell and see what text is recorded in object.video.description?
How/where does video.description get defined as an html string (what I'm guessing is that a < is already be escaped into < at that point and hence safe won't help). Marking as safe prevents django from converting < to < right before rendering in the template; but won't convert a string containing < into a <.
If the string is originally saved with <s and &gts you can convert them to < and > by a simple python replacement somewhere in your string processing. E.g., in your view do something like:
htmlCodes = (('&', '&'),
('<', '<'),
('>', '>'),
('"', '"'),
("'", '''),)
def unescape(some_html_str):
for c, html_code in htmlCodes:
some_html_str = some_html_str.replace(html_code, c)
return some_html_str
and then remember to unescape your string in your view before putting it in the context (and still remember to mark it safe).
See How do I perform HTML decoding/encoding using Python/Django?
Also it may be better/easier for you to use mark_safe (from django.utils.safestring import mark_safe) in your views to make sure only safe strings are marked safe rather than have your template always render something safe.
{% load markup %}
{{ object.video.description|markdown }}

Inserting a { in an ExpressionEngine template file

I am trying to insert some analytics code into my ExpressionEngine template's footer files, but it treats the {}'s as a function call or something. Is there any way to make it so it understands that EE shouldn't execute what's inside the braces?
I've already tried inserting backslashes and it doesn't seem to work.
Any help would be much appreciated.
ExpressionEngine's Template Class parses curly braces {} as template variables.
Because many programming languages use curly braces, this can cause problems by ExpressionEngine replacing JavaScript curly braces as Template Variables.
For example, the following JavaScript with curly braces all on one line:
<script>var addthis_config = { 'ui_click': true };</script>
Will be parsed by ExpressionEngine as a template variable and rendered as:
<script>var addthis_config = ;</script>
You'll notice everything starting at the opening { and ending with the closing } curly brace gets parsed and replaced! As a workaround, you can place the braces on separate lines and avoid this problem:
<script>
var addthis_config = {
'ui_click': true,
'data_track_clickback': true
};
</script>
If you've written a JavaScript function that expects values from ExpressionEngine, just place your braces on separate lines — which is a good coding convention and is optimal for readability.
What is your Debug preference in EE? It should be set to "1" (recommended). If it's currently at "0" try changing the setting value to "1". In some cases there are possible issues with non-EE {} characters used while debug is set to "0".
You can change the Debug Preference from
CP => Admin => System Administration => Output and Debugging => Debug Preference.
Putting the {} braces on separate lines would also work, but that Debug setting ("1") is highly recommended, and maybe even why this "bug" isn't fixed.
Separate your analytics code into a separate template.
It's probably because you have the analytics code INSIDE another EE loop and so it's trying to parse it as a template variable.
So isolate the code if you need it within the loop and create an embedded template to include.
Thus, create an include called .analytics.
In the .analytics template, do the following (I'm using Google Analytics as an example):
<script type="text/javascript">
var _gaq = _gaq || [];
_gaq.push(['_setAccount', 'UA-xxxxxx-1']);
_gaq.push(['_trackPageview']);
(function() {
var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
})();
</script>
NOTE: Using this method, keep the template as a normal template, do NOT change it to a javascript template because you are using the <script type="text/javascript"> tags inside the template.
Then, in your main template, do a simple:
{embed="template_group/.analytics"}
And you will be good to go.
Try the Protect Javascript config variable. I've used it to mix/match EE vars and JS several times.
EE 1.x
$conf['protect_javascript'] = 'y';
Reference
EE 2
$config['protect_javascript'] = 'y';
Reference
You should be using the hidden config varable protect javascript
$config['protect_javascript'] = 'y';
Have you tried commenting out the whole block of Analtics code using EE template comment tags? i.e.
{!--
Your comments will go in here.
You can even span it across multiple lines.
--}
From here http://expressionengine.com/user_guide/templates/commenting.html
I recommend you to avoid inserting (or trying to insert) raw JS into HTML templates. You can create a different template, with type JavaScript instead of HTML, then you can either add it at the head with a script tag, or {embed="js/piece-of-raw-javascript"}

Replacing a substring using regular expressions

I want to add a call to a onclick event in any href that includes a mailto: tag.
For instance, I want to take any instance of:
<a href="mailto:user#domain.com">
And change it into:
<a href="mailto:user#domain.com" onclick="return function();">
The problem that I'm having is that the value of the mailto string is not consistent.
I need to say something like replace all instances of the '>' character with 'onclick="return function();">' in strings that match '<a href="mailto:*">' .
I am doing this in ColdFusion using the REreplacenocase() function but general RegEx suggestions are welcome.
The following will add your onclick to all mailto links contained withing a string str:
REReplaceNoCase(
str,
"(<a[^>]*href=""mailto:[^""]*""[^>]*)>",
"\1 onclick=""return function();"">",
"all"
)
What this regular expression will do is find any <a ...> tag that looks like it's an email link (ie. has an href attribute using the mailto protocol), and add the onclick attribute to it. Everything up to the end of the tag will be stored into the first backreferrence (referred to by \1 in the replacement string) so that any other attributes in the <a> will be preserved.
If the only purpose of this is to add a JavaScript event handler, I don't think Regex is the best choice. If you use JavaScript to wire up your JavaScript events, you'll get more graceful degradation if JS is not available (e.g. nothing will happen, instead of having onclick cruft scattered throughout your markup).
Plus, using the DOM eliminates the possibility of missing matches or false positives that can occur from a Regex that doesn't perfectly anticipate every possible markup formation:
function myClickHandler() {
//do stuff
return false;
}
var links = document.getElementsByTagName('a');
for(var link in links) {
if(link.href.indexOf('mailto:') == 0) {
link.onclick = myClickHandler;
}
}
Why wouldn't you do this on the frontend with a library like jQuery?
$(function(){
$("a[href^=mailto]").click(function(){
// place the code you want to execute here
})
});