I am using James Bennett's code to create a dynamic form. Everything is working, but I have now reached the point where I need to save the data and have become a bit stuck. I know I can access the data returned by the form and simply save it to the database as a string, but what I'd really like to do is save the type of the data (e.g. date, integer, varchar) along with the value, so that when it comes to viewing the data I can process it according to its type, e.g. get dates later than last week.
So my question is how do I access what database type the form element is based on what type of form element it is e.g. a django.forms.IntegerField has a database field type of int, django.forms.DateField would be a date field and django.forms.ChoiceField would be a varchar field?
Why do you want to know what kind of database field you are using? Are you storing the form data through raw SQL? You should have a model that stores the form data, and it will do all the work for you.
Maybe you could show some form code? Right now it's hard to determine what exactly you are trying to do.
I can not understand the exact problem, so forgive me if I get things wrong.
If you are using models, then you don't need to know about database-level data types. They are defined by django according to your model fields.
However, since you are talking about dynamic forms (I've read the article), you are probably not working with models, at least not directly. In that case it should not matter either, because you are using form validation, so you can be sure that an integer comes out of a forms.IntegerField field, unicode comes out of forms.CharField and so on.
If you are writing your database-interaction routines by hand (raw SQL), then you have to map Python types to database types yourself: for example, <type 'int'> goes to a column of type integer (or something similar), <type 'datetime.datetime'> goes to a datetime column (or not, this example is arbitrary) and so on. When you are using models, Django does this type of mapping for you in a database-engine-independent way.
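The hand-written mapping described above can be sketched in plain Python. This is only a minimal sketch, assuming the validated form data arrives as ordinary Python values (as a form's cleaned_data does); the tag names and the tag_value and restore_value helpers are hypothetical and not part of Django.

```python
import datetime
from decimal import Decimal

# Hypothetical mapping from Python types to a stored type tag.
# Order matters: bool is a subclass of int, datetime is a subclass of date.
TYPE_TAGS = [
    (bool, "boolean"),
    (int, "integer"),
    (Decimal, "decimal"),
    (datetime.datetime, "datetime"),
    (datetime.date, "date"),
    (str, "varchar"),
]


def tag_value(value):
    """Return (type_tag, text) for a validated form value."""
    for py_type, tag in TYPE_TAGS:
        if isinstance(value, py_type):
            if isinstance(value, datetime.date):
                # Covers both date and datetime values.
                return tag, value.isoformat()
            return tag, str(value)
    raise TypeError(f"no type tag for {type(value).__name__}")


def restore_value(tag, text):
    """Convert stored text back to a Python value using its tag."""
    converters = {
        "boolean": lambda s: s == "True",
        "integer": int,
        "decimal": Decimal,
        "datetime": datetime.datetime.fromisoformat,
        "date": datetime.date.fromisoformat,
        "varchar": str,
    }
    return converters[tag](text)
```

Storing the tag next to the text value is what makes a later comparison such as "dates later than last week" possible, because the text can be converted back before comparing.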
Either way, you, yourself are defining the datatypes on the python side and you or django must also define the datatypes on the db side. The choice of those types is, at times, not an automatic 1:1 type of decision, but, rather, a design decision, based on what this data is used for in your application.
Sorry, if this makes little sense, but, I must admit, that I don't quite understand the problem behind your question.
If you're using James' code, then you don't get a Model out of the form per se, rather a list of form field elements. That means that you can't save the data as a Model instance.
I think you have two choices: bundle the whole form into a JSON object and save that into a LONGTEXT column in your database, or save each form element into its own row, in a BLOB column. In the latter case, you'll need to 'pickle' the object before saving it. If you pickle the object and save it into the database, then when you retrieve and unpickle it you'll have all the Python class information associated with the object.
Trying to make this clearer - if you have the bytes; 2009-11-28 21:34:36.516176, is this a str or a datetime object? You can't tell if it's stored in the database as a VARCHAR or LONGTEXT. Which is the core of your question - you do get object information if you save it as a pickled object though.
By extension, you could save your whole Form object into the database, either as a JSON object, or pickle the object and save that.
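The difference between storing text and storing a pickled object can be seen in a short sketch that uses only the standard library. The sample value is made up for illustration, and the "type"/"value" layout for the JSON variant is an assumption, not an established format. Note that unpickling data from an untrusted source is unsafe, so pickling only suits data your own application wrote.

```python
import datetime
import json
import pickle

moment = datetime.datetime(2009, 11, 28, 21, 34, 36, 516176)

# Stored as plain text, the type information is gone: all that is left
# is a string that happens to look like a timestamp.
as_text = str(moment)

# Pickling keeps the Python type, so the value comes back as a datetime.
blob = pickle.dumps(moment)
restored = pickle.loads(blob)

# JSON has no datetime type, so the type has to be recorded explicitly.
as_json = json.dumps({"type": "datetime", "value": moment.isoformat()})
decoded = json.loads(as_json)
```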
I'm struggling with something very similar at the moment, as I'm trying to put together a dynamic form system and am thinking of going the 'individual form field element, pickled' route and then saving that into the database. So I'll be watching how this question works out! :)
Related
I'm thinking of using a raw query to quickly get around limitations with either my brain or the Django ORM, but I don't want to redevelop the infrastructure required to support the existing ORM code such as filters. Right now I'm stuck with two dead ends:
Writing an inner raw query and reusing that like any other query set. Even though my raw query selects the correct columns, I can't filter on it:
AttributeError: 'RawQuerySet' object has no attribute 'filter'
This is corroborated by another answer, but I'm still hoping that that information is out of date.
Getting the SQL and parameters from the query set and wrapping that in a raw query. It seems the raw SQL should be retrievable using queryset.query.get_compiler(DEFAULT_DB_ALIAS).as_sql() - how would I get the parameters as well (obviously without actually running the query)?
One option for dealing with complex queries is to write a VIEW that encapsulates the query, and then stick a model in front of that. You will still be able to filter (and depending upon your view, you may even get push-down of parameters to improve query performance).
All you need to do to get a model that is backed by a view is have it as "unmanaged", and then have the view created by a migration operation.
It's better to try to write a QuerySet if you can, but at times it is not possible (because you are using something that cannot be expressed using the ORM, for instance, or you need to do something like a LATERAL JOIN).
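The database side of this idea can be sketched with the standard-library sqlite3 module. The table, view and column names here are hypothetical. In Django, you would then put an unmanaged model (Meta.managed = False, with db_table set to the view name) in front of such a view and filter it as usual.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL);
    INSERT INTO orders (customer, amount) VALUES
        ('alice', 10.0), ('alice', 25.0), ('bob', 5.0);

    -- The complex query lives in the view.
    CREATE VIEW customer_totals AS
        SELECT customer, SUM(amount) AS total, COUNT(*) AS order_count
        FROM orders
        GROUP BY customer;
""")

# The view can be filtered like an ordinary table.
rows = conn.execute(
    "SELECT customer, total FROM customer_totals WHERE total > ?", (20,)
).fetchall()
```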
I am trying to save query result obtained in one view to session, and retrieve it in another view, so I tried something like below:
def default(request):
    equipment_list = Equipment.objects.all()
    request.session['export_querset'] = equipment_list
However, this gives me
TypeError at /calbase/
<QuerySet [<Equipment: A>, <Equipment: B>, <Equipment: C>]> is not JSON serializable
What exactly does this mean, and how should I go about fixing it? Or is there an alternative way of doing what I want without using the session?
If this is what you are saving:
equipment_list = Equipment.objects.all()
You shouldn't, and wouldn't need to, use sessions. Why? Because this is a simple query without any filtering, so equipment_list would be common to all users. It can quite easily be saved in the cache:
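from django.core.cache import cache

# Reuse the cached result if present; otherwise run the query and cache it.
# Test against None so that an empty (falsy) result is not re-queried every time.
equipment_list = cache.get('equipment_list')
if equipment_list is None:
    equipment_list = Equipment.objects.all()
    cache.set('equipment_list', equipment_list)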
Note that a queryset can be saved in the cache without it having to be converted to values first.
Update:
One of the other answers mentions that querysets are not JSON serializable. That only applies when you are trying to pass one off as a JSON response. It does not apply when you are caching it, because django.core.cache does not use JSON serialization; it uses pickling.
'e4c5' raises a concern which is perfectly valid. From the limited code we can see, it makes no sense to put the results of that query into the session, unless of course you have some other plans which we can't quite see here. I'll ignore this and assume you ABSOLUTELY MUST save the query results into the session.
With this assumption, you must understand that the queryset instance which Django is giving you is a python object. You can move this around WITHIN your Django application without any hassles. However, whenever you attempt to send such an entity over the wire into some other data store/application (in your case, saving it into the session, which involves sending this data over to your configured session store), it must be serializable to some format which:
your application knows how to serialize objects into
the data store at the other end knows how to de-serialize. In this case, the accepted format seems to be JSON. (this is optional, the JSON string can be stored directly)
The problem is, the queryset instance not only contains the rows returned from the table, it also contains a bunch of other attributes and meta attributes which come in handy to you when you use the Django ORM API. When you try to send the queryset instance over the wire to your session store, the system knows no better and tries to serialize all these attributes into JSON. This fails because there are attributes within the queryset that are not serializable into JSON.
As far as a solution is concerned, if you must save data into the session, then simply performing objects.all().values() and saving the result into your session, as some people have suggested, may not always work. A simple case is when your table returns datetime objects, which are not JSON serializable by default.
So what should you do? What you need is some sort of serializer which accepts a queryset, and safely iterates over the returned rows converting each python native datatype into a JSON safe equivalent, and then returning that. In case of datetime.datetime objects, you would need to call obj.isoformat() to transform it into an ISO format datetime string.
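A serializer of that kind can be sketched with the standard json module and its default hook. This is a minimal sketch, not a complete solution: the json_safe helper and the sample rows are hypothetical, with the rows shaped like the dictionaries that list(queryset.values(...)) would produce.

```python
import datetime
import decimal
import json


def json_safe(value):
    """Fallback for json.dumps: convert types JSON cannot represent."""
    if isinstance(value, (datetime.datetime, datetime.date, datetime.time)):
        return value.isoformat()
    if isinstance(value, decimal.Decimal):
        return str(value)
    raise TypeError(f"{type(value).__name__} is not JSON serializable")


# Hypothetical rows, shaped like list(queryset.values(...)) output.
rows = [
    {"id": 1, "name": "A", "calibrated": datetime.datetime(2016, 7, 1, 9, 30)},
    {"id": 2, "name": "B", "calibrated": datetime.datetime(2016, 7, 2, 14, 0)},
]

# json.dumps calls json_safe only for values it cannot encode itself.
payload = json.dumps(rows, default=json_safe)
```

Django also ships django.core.serializers.json.DjangoJSONEncoder, which handles dates and decimals in a similar way and can be passed to json.dumps through the cls argument.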
You cannot save a QuerySet instance in the session because, as you said, it is not JSON serializable.
To save your queryset, you can use the values and values_list methods to get the fields you want, cast the result to a list, and then save that list into the session (most of the time saving only the PKs does the job, though).
so basically:
qset = Model.objects.values_list("pk", "field_one", "field_two") # Gives you a ValuesListQuerySet object which's still not serializable.
cache_results = list(qset)
# Now you cache the cache_results variable however you want.
redis.setex("cached:user_id:querytype", 10 * 60, json.dumps(cache_results))
It's also better to change the way you save this special result (values_list) so you can have better lookups; a dictionary might be a good choice.
Saving querysets in Django sessions requires them to be serialized, and that is what causes the error. An easy way to carry a queryset through the session is to store a list of the ids of the Equipment objects (or any other field that serves as the model's primary key), like:
equipments = [equipment.id for equipment in Equipment.objects.all()]
request.session['export_querset'] = equipments
And then whenever you need the Equipments, traverse this list and get the corresponding Equipment.
equipments = [Equipment.objects.get(id=id) for id in request.session['export_querset']]
Note: This method is inefficient and is not recommended for large query sets, but for small query sets, it can be used without worries.
I have a model called "Story" that has two integer fields called "views" and "votes". When I retrieve all the Story objects I would like to annotate the returned QuerySet with a "ranking" field that is simply "views"/"votes". Then I would like to sort the QuerySet by "ranking". Something along the lines of...
Story.objects.annotate( ranking=CalcRanking('views','votes') ).sort_by(ranking)
How can I do this in Django? Or should it be done after the QuerySet is retrieved in Python (like creating a list that contains the ranking for each object in the QuerySet)?
Thanks!
PS: In my actual program, the ranking calculation isn't as simple as above and depends on other filters to the initial QuerySet, so I can't store it as another field in the Story model.
In Django, the things you can pass to annotate (and aggregate) must be subclasses of django.db.models.aggregates.Aggregate. You can't just pass arbitrary Python objects to it, since the aggregation/annotation actually happens inside the database (that's the whole point of aggregate and annotate). Note that writing custom aggregations is not supported in Django (there is no documentation for it). All information available on it is this minimal source code: https://code.djangoproject.com/browser/django/trunk/django/db/models/aggregates.py
This means you either have to store the calculations in the database somehow, figure out how the aggregation API works or use raw sql (raw method on the Manager) to do what you do.
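For the raw SQL route, the calculation itself is ordinary SQL, which a short sqlite3 sketch can show. The story table and its rows are made up for illustration, and the real ranking formula is assumed to be more involved than views/votes. In Django, a SELECT like this could be passed to the manager's raw() method, provided it also returns the primary key column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE story (
        id INTEGER PRIMARY KEY, title TEXT, views INTEGER, votes INTEGER
    );
    INSERT INTO story (title, views, votes) VALUES
        ('first', 100, 4), ('second', 90, 30), ('third', 50, 0);
""")

# The ranking is computed and sorted by the database, not in Python.
# Multiplying by 1.0 forces floating-point division; NULLIF avoids a
# division by zero, so stories without votes get a NULL ranking.
ranked = conn.execute("""
    SELECT title, views * 1.0 / NULLIF(votes, 0) AS ranking
    FROM story
    ORDER BY ranking DESC
""").fetchall()
```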
That seems simple enough, but all Django queries seem to be 'SELECT *'.
How do I build a query returning only a subset of fields?
In Django 1.1 onwards, you can use defer('col1', 'col2') to exclude columns from the query, or only('col1', 'col2') to only get a specific set of columns. See the documentation.
values does something slightly different - it only gets the columns you specify, but it returns a list of dictionaries rather than a set of model instances.
Append a .values("column1", "column2", ...) to your query
The accepted answer advises defer and only, which the docs discourage in most cases:
only use defer() when you cannot, at queryset load time, determine if you will need the extra fields or not. If you are frequently loading and using a particular subset of your data, the best choice you can make is to normalize your models and put the non-loaded data into a separate model (and database table). If the columns must stay in the one table for some reason, create a model with Meta.managed = False (see the managed attribute documentation) containing just the fields you normally need to load and use that where you might otherwise call defer(). This makes your code more explicit to the reader, is slightly faster and consumes a little less memory in the Python process.
My Django application retrieves an RSS feed every day. I would like to persist the time the feed was last updated somewhere in the app. I'm only retrieving one feed, it will never grow to be multiple feeds. How can I persist the last updated time?
My ideas so far
Create a model and add a datetime field to it. This seems like overkill as it adds another table to the database, in which there will only ever be one row. Other than that, it's the most obvious and straight-forward solution.
Create a settings object which just stores key/value mappings. The last updated date would just be a row in this table. This is essentially a generic version of the previous solution.
Use dbsettings/django-values, which allows you to store settings in the database. The last updated date would just be a 'setting'.
Any other ideas that I'm missing?
In spite of the fact databases regularly store many rows in any given table, having a table with only one row is not especially costly, so long as you don't have (m)any indexes, which would waste space. In fact most databases create many single row tables to implement some features, like monotonic sequences used for generating primary keys. I encourage you to create a regular model for this.
RAM is volatile, thus not persistent: memcached is not what you asked for.
XML is not the right technology to store a single value.
An RDBMS is not the right technology to store a single value.
The Django cache framework will answer your question if CACHE_BACKEND is set to anything other than file://...
The filesystem is the right technology to "persist a single value".
In settings.py:
import os

# File next to settings.py that holds the last fetch time.
RSS_FETCH_DATETIME_PATH = os.path.join(
    os.path.abspath(os.path.dirname(__file__)),
    'rss_fetch_datetime'
)
In your rss fetch script:
import time
from django.conf import settings

# Store the current Unix timestamp as text (write() needs a str, not an int).
with open(settings.RSS_FETCH_DATETIME_PATH, 'w') as handler:
    handler.write(str(int(time.time())))
Wherever you need to read it:
from django.conf import settings

# Read the stored Unix timestamp back as an int.
with open(settings.RSS_FETCH_DATETIME_PATH) as handler:
    timestamp = int(handler.read())
But cron is the right tool if you want to "run a command every day", for example at 5AM:
0 5 * * * /path/to/manage.py runscript /path/to/retreive/script
Of course, you can still write the last update timestamp to a file at the end of the retrieval script, and use it somewhere else, if that makes sense to you.
Concluding by quoting Ken Thompson:
One of my most productive days was throwing away 1000 lines of code.
One solution I've used in the past is to use Django's cache feature. You set a value to True with an expiration time of one day (in your case.) If the value is not set, you fetch the feed, otherwise you don't do anything.
You can see my solution here: Importing your Flickr photos with Django
If you need it only for caching purposes, why not store it in the memcached?
On the other hand, if you use this data for other purposes (e.g. display it on the page, or to make some calculation, etc.), then I would store it in a new model - in Django, all persistence is built on top of the database, via models, and I would not try to use other "clever" solutions.
One thing I used to do when I was developing in PHP was to store the XML somewhere, with a new tag inserted to hold the timestamp of the latest retrieval. It wasn't great, but it was quick and simple.
Keeping it simple would lead to the idea of just storing it in the file system ... why can't you do that? You could, for example, have a siteconfig module in one of your apps which held these sorts of data. This could load up data from a specific file, which could be text, JSON, ConfigParser, pickle or any suitable format. Just import siteconfig somewhere, and it can load the data and make it available to the other modules in your site. You could easily extend this to hold a dict-like object with a number of settings (e.g., if you ever have multiple feeds, but don't want to have a model just for 2-3 rows, you could easily hold the last-retrieved time for each feed in a dict keyed by feed URL).
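Such a siteconfig module could look roughly like the sketch below, which uses JSON as the file format and keeps a dict of last-fetched times keyed by feed URL. All names here (CONFIG_PATH, load_config, save_last_fetched, get_last_fetched) are hypothetical, and the temporary-directory path is only a stand-in for wherever your project keeps such files.

```python
import json
import os
import tempfile
import time

# Hypothetical location; a real project would use a fixed, writable path.
CONFIG_PATH = os.path.join(tempfile.gettempdir(), "siteconfig_example.json")


def load_config(path=CONFIG_PATH):
    """Return the stored settings dict, or an empty dict if no file exists."""
    try:
        with open(path) as handle:
            return json.load(handle)
    except FileNotFoundError:
        return {}


def save_last_fetched(feed_url, timestamp=None, path=CONFIG_PATH):
    """Record when feed_url was last fetched (seconds since the epoch)."""
    if timestamp is None:
        timestamp = time.time()
    config = load_config(path)
    config.setdefault("last_fetched", {})[feed_url] = int(timestamp)
    with open(path, "w") as handle:
        json.dump(config, handle)


def get_last_fetched(feed_url, path=CONFIG_PATH):
    """Return the stored timestamp for feed_url, or None if never fetched."""
    return load_config(path).get("last_fetched", {}).get(feed_url)
```

The fetch script would call save_last_fetched(url) after a successful retrieval, and any other module could call get_last_fetched(url) to display or compare the time. This sketch does no file locking, so it assumes a single writer.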
Create a session key, which persists forever and update the feed timestamp every time you access it.