Serializing Django models with non-model values - django

So I have this situation where i want to serialize models with non-model values. I got the serialization of the models [both queryset and single model itself] but trying to combine that with non-model values seem problematic.
for e.g. I want to JSONify User with some status of the request.
Assume model_to_JSON does model=>JSON, so
# it messes the 'user' json by further escaping it,
#which becomes unparseble on client since its a string now
dumps({ 'user': model_to_JSON(user_obj), 'status': 'ok'})
I could do couple of hacks, by first doing loads on the converted model-json. but thats such a hacky way and so much time is just wasted in dumps=>loads=>dumps
second option is string concatenation by doing individuals dumps and then concatenate the strings by stripping '}' of the leftmost string and '{' of the rightmost string with ','.
The Django serializers are very specifically written towards models/queryset, so I can't really override them.
So has anyone faced this problem before and any solutions you could share

You can look at Django Full Serializers, which was the approach I used a long, long time ago.
Another alternative would be to simply write your own serialisation function, that allows you to pass in attribute names (that will be looked up at serialisation time). I've done this, too. If you go down this approach, be aware there is already a django function model_to_dict, which does the pre-serialisation.
A third method might be to use django's forms as an intermediate to serialisation: this might be useful if you may be deserialising data back from the user, too.

Related

Validation with cleaned_data()

I'd like to ask you if you can briefly and in plain English explain to me
how cleaned_data() function validates data ?
Reason for asking is that I'm designing a web app powered by Django
and initially I thought cleaned_data() is smart enough to block user's input that contains potentially harmful characters. Such as ' ; < > and alike. Characters that can be used for SQL injection attacks.
To my surprise, when I deliberately slipped few of those characters into form field, the input made it to database. I was quite shocked.
So then ... what the cleaned_data() function is good for ?
I read about this function in docs, however I couldn't find necessarily answer to this.
cleaned_data is for validated form data. If you have a required CharField, for example, it will validate whether it is present, and whether it has enough characters. If you have an EmailField, then it will validate that it includes an email address.
Take a look at some of the build in form fields for a better idea of what you can do.
It is not intended to prevent XSS or SQL injection. It simply confirms that your form follows basic rules that you have set for it.
You missunderstood cleaned_data. The simplest definition of cleaned_data is something like:
A dict that contains data entered by the user after various validation
(built-in or custom)
Now, that being said, to understand every steps to form validation refer to this link (re-inventing the wheel would be silly since it is greatly explained.)
As for the SQL injection, this is another problem. But again, Django as a built-in way of handling it, this is from the documentation:
By using Django’s querysets, the resulting SQL will be properly
escaped by the underlying database driver. However, Django also gives
developers power to write raw queries or execute custom sql. These
capabilities should be used sparingly and you should always be careful
to properly escape any parameters that the user can control. In
addition, you should exercise caution when using extra() and RawSQL..
I can totally see your confusion, but remember that they are two different things.

Ember Data Model attribute sanitization/escaping to prevent XSS?

How can I perform sanitization on string attributes to prevent XSS? Right now my thoughts are to override my base model's save method and iterate over all the strings in the model and set all the string inputs to safe strings. Would this be a good way to approach this problem or is there a better way?
EDIT:
Problem occurs when saving a name attribute ( alert('xss')) for a person in the app. It saves it in a non-sanitized manner into the database. Then that name is loaded in our other site which does not sanitize the output and that's where the script injection occurs! I'd like to sanitize it before saving it to the DB
Handlebars automatically sanitizes strings. If you want to avoid this, you must explicitly use the triple-brace syntax:
{{{myHtmlString}}}
Rather than trying to sanitise the input, you really ought to change that other site to make sure it html-escapes the data it is presenting from the database. Even if you would "sanitise" things on the Ember side, can you guarantee there are no other vulnerabilities which allow someone to inject HTML in the database?
Always escaping anything being presented is really the only safe way to deal with XSS. If you're filtering input you are very likely to not catch every possible way of injecting unexpected input.

packing objects as json with django?

I've run into a snag in my views.
Here "filtered_posts" is array of Django objects coming back from the model.
I am having a little trouble figuring out how to get as text data that I can
later pack into json instead of using serializers.serialize...
What results is that the data comes double-escaped (escaped once by serializers.serialize and a second time by json.dumps).
I can't figure out how to return the data from the db in the same way that it would come back if I were using the MySQLdb lib directly, in other words, as strings, instead of references to objects. As it stands if I take out the serializers.serialize, I get a list of these django objects, and it doesn't even list them all (abbreviates them with '...(remaining elements truncated)...'.
I don't think I should, but should I be using the __unicode__() method for this? (and if so, how should I be evoking it?)
JSONtoReturn = json.dumps({
'lowest_id': user_posts[limit - 1].id,
'user_posts': serializers.serialize("json", list(filtered_posts)),
})
The Django Rest Framework looks pretty neat. I've used Tastypie before, too.
I've also done RESTful APIs that don't include a framework. When I do that, I define toJSON methods on my objects, that return dictionaries, and cascade the call to related elements. Then I call json.dumps() on that. It's a lot of work, which is why the frameworks are worth looking at.
What you're looking for is Django Rest Framework. It handles related objects in exactly thew way you're expecting it to (you can include a nested object, like in your example, or simply have it output the PK of the related object for the key).

Django - How to pass dynamic models between pages

I have made a django app that creates models and database tables on the fly. This is, as far as I can tell, the only viable way of doing what I need. The problem arises of how to pass a dynamically created model between pages.
I can think of a few ways of doing such but they all sound horrible. The methods I can think of are:
Use global variables within views.py. This seems like a horrible hack and likely to cause conflicts if there are multiple simultaneous users.
Pass a reference in the URL and use some eval hackery to try and refind the model. This is probably stupid as the model could potentially be garbage collected en route.
Use a place-holder app. This seems like a bad idea due to conflicts between multiple users.
Having an invisible form that posts the model when a link is clicked. Again very hacky.
Is there a good way of doing this, and if not, is one of these methods more viable than the others?
P.S. In case it helps my app receives data (as a json string) from a pre-existing database, and then caches it locally (i.e. on the webserver) creating an appropriate model and table on the fly. The idea is then to present this data and do various filtering and drill downs on it with-out placing undue strain on the main database (as each query returns a few hundred results out of a database of hundreds of millions of data points.) W.R.T. 3, the tables are named based on a hash of the query and time stamp, however a place-holder app would have a predetermined name.
Thanks,
jhoyla
EDITED TO ADD: Thanks guys, I have now solved this problem. I ended up using both answers together to give a complete answer. As I can only accept one I am going to accept the contenttypes one, sadly I don't have the reputation to give upvotes yet, however if/when I ever do I will endeavor to return and upvote appropriately.
The solution in it's totality,
from django.contrib.contenttypes.models import ContentType
view_a(request):
model = create_model(...)
request.session['model'] = ContentType.objects.get_for_model(model)
...
view_b(request):
ctmodel = request.session.get('model', None)
if not ctmodel:
return Http404
model = ctmodel.model_class()
...
My first thought would be to use content types and to pass the type/model information via the url.
You could also use Django's sessions framework, e.g.
def view_a(request):
your_model = request.session.get('your_model', None)
if type(your_model) == YourModel
your_model.name = 'something_else'
request.session['your_model'] = your_model
...
def view_b(request):
your_model = request.session.get('your_model', None)
...
You can store almost anything in the session dictionary, and managing it is also easy:
del request.session['your_model']

Pitfalls of generating JSON in Django templates

I've found myself unsatisfied with Django's ability to render JSON data. If I use built in serializes then database foreign key relationships are not included in the data (only the keys). Also, it seems to be impossible to include custom data in the json feed that isn't part of the model being serialized.
As a test I implemented a template that rendered some JSON for the resultset of a particular model. I was able to include/exclude whatever parts of the model I wanted and was able to include custom data as well.
The test seemed to work well and wasn't slower than the recommended serialization methods.
Are there any pitfalls to this using this method of serialization?
While it's hard to say definitively whether this method has any pitfalls, it's the method we use in production as you control everything that is serialized, even if the underlying model is changed. We've been running a high traffic application in for almost two years using this method.
Hope this helps.
One problem might be escaping metacharacters like ". Django's template system automatically escapes dangerous characters, but it's set up to do that for HTML. You should look up exactly what the template escaping does, and compare that to what's dangerous in JSON. Otherwise, you could cause XSS problems.
You could think about constructing a data structure of dicts and lists, and then running a JSON serializer on that, rather than directly on your database model.
I don't understand why you see the choice as being either 'use Django serializers' or 'write JSON in templates'. The middle way, which to my mind is much more robust and fits your use case well, is to build up your data as Python lists/dictionaries and then simply use simplejson.dumps() to convert it to a JSON string.
We use this method to get custom JSON format consumed by datatables.net
It was the easiest method we find to accomplish this task and it looks very fine with no problems so far.
You can find details here: http://datatables.net/development/server-side/django
So far, generating JSON from templates, we've run into the need to escape newlines. Looking at doing simplejson.dumps() next.