How do I configure mongoDB indexes in django-nonrel without using Models? - django

I'm building a site using mongodb and django-nonrel. I've read in various places that for mongo, it's better to use straight pymongo than the django ORM. This jibes with my experience as well -- django's ORM is awesome for relational databases, but for mongo it doesn't give you much that pymongo doesn't do already.
My problem is that I don't know how to set up the database tables (err... "collections") initially without using django's ORM. What do I need to do to cast off the shackles of models.py and syncdb, and just write the code myself?
Seems like somebody should have created a guide for this already, but I can't find one.
A little more detail:
Right now, I'm building models and running syncdb to configure the DB. So far, django's ORM magic has made it work. But I need to do some slightly fancier stuff, like indexing on sub-elements, so I don't think the ORM is going to work for me anymore.
On top of that, I don't use models (other than auth_users and sessions) anywhere else in the project. The real schemas are defined elsewhere in json. I don't want to maintain the model classes and the json schemas at the same time -- it's just bad practice.
Finally, I have a "loadfixtures" management command that I use to flush, syncdb, and load fixtures. It seems like this would be a very good place for the new ORM-replacing code to live; I just don't know what that code should look like.

With MongoDB you don't need an extra step to predeclare the schema to "set up" collections. The document-oriented nature of MongoDB actually does not enforce a strict schema; documents within a collection may have different fields as needed. It's a different concept to get used to, but the collection will be created as soon as you start saving data to it.
Indexes can be added using pymongo's ensure_index on a collection (ensureIndex is the mongo shell spelling; newer PyMongo releases use create_index instead).
Similar to the collection creation on data insertion, a collection will also be created if it does not exist when an index is added.
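As a rough sketch of how index setup could live in something like the "loadfixtures" command from the question, the function below reads index declarations from JSON and applies them to a pymongo-style database object. The JSON layout, the collection name "profiles", the field names, and the function name ensure_indexes are all made up for illustration; the only pymongo call assumed is Collection.create_index, with 1 meaning ascending order. The dotted key shows an index on a sub-element.

```python
import json

# Hypothetical index declarations, e.g. stored next to the JSON schemas.
INDEX_SPECS_JSON = """
{
  "profiles": [
    {"keys": [["username", 1]], "unique": true},
    {"keys": [["address.zip", 1]]}
  ]
}
"""


def ensure_indexes(db, specs):
    """Create every declared index on a pymongo-style database object.

    `db` is expected to behave like a pymongo Database: db[name] returns a
    collection that has a create_index(keys, **options) method.
    Returns the values that create_index returned, in declaration order.
    """
    created = []
    for collection_name, indexes in specs.items():
        collection = db[collection_name]
        for index in indexes:
            # JSON gives lists; create_index wants (field, direction) tuples.
            keys = [(field, direction) for field, direction in index["keys"]]
            # Everything except "keys" is passed through as an index option.
            options = {k: v for k, v in index.items() if k != "keys"}
            created.append(collection.create_index(keys, **options))
    return created
```

Against a real server this would be called as ensure_indexes(MongoClient()["mydb"], json.loads(INDEX_SPECS_JSON)), where "mydb" is a placeholder database name. Creating an index that already exists with the same options should be harmless, so the call can run on every fixture load.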
An article that should help you get started: Using MongoDB with Django.
If you're new to MongoDB, you also might want to try the short online tutorial.

Related

Declarative mechanism for Django model rows

With some frequency, I end up with models whose contents are approximately constant. For example, I might have a set of plans that users can sign up for, with various attributes -- ID, name, order in the list of plans, cost, whether the purchaser needs to be a student/FOSS project/etc. I'm going to rarely add/remove/change rows, and when I do there's likely going to be code changes too (e.g., to change landing pages), so I'd like the contents of the model (not just the schema) to be managed in the code rather than through the Django admin (and consequently use pull requests to manage them, make sure test deploys are in sync, etc.). I'd also like it to be in the database, though, so I can select them using any column of the model, filter for things like "show me any project owned by a paid account", etc.
What are good ways of handling this?
I think my ideal would be something like "have a list of model instances in my code, and either Django's ORM magically pretends they're actually in the database, or makemigrations makes data migrations for me", but I don't think that exists?
The two workable approaches that come to mind are to either write data migrations by hand or use regular classes (or dicts) and write whatever getters and filters I actually want by hand.
Data migrations give me all the Django ORM functionality I might want, but any time I change a row I need to write a migration by hand, and figuring out the actual state involves either looking in the Django admin or looking through all the data migrations to figure out their combined impact. (Oh, and if somebody accidentally deletes the objects in the database (most likely on a test install...) or a migration is screwy, recovering will be a mess.)
Just using non-ORM classes in the source is a clearer, more declarative approach, but I need replacements for many things I might normally do in the ORM (.objects.get(...), __plan__is_student, .values(plan__is_student).annotate(...), etc.).
Which of these approaches is better presumably depends on how much ORM functionality I actually want and how often I expect to be changing things.
Are there other good approaches for this? Am I missing some Django feature (or add-on) that makes this easier?
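For reference, a minimal sketch of the second approach (plain classes in the source plus hand-written getters and filters) might look like the following. The Plan fields, the sample rows, and the helper names filter_plans and get_plan are invented for illustration and only mimic simple equality lookups, not joins or annotations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    id: int
    name: str
    order: int
    cost: int  # assumed to be in cents
    is_student: bool = False


# The "rows" live in code, so changes go through normal code review.
PLANS = (
    Plan(id=1, name="Free", order=1, cost=0),
    Plan(id=2, name="Student", order=2, cost=500, is_student=True),
    Plan(id=3, name="Pro", order=3, cost=2000),
)


def filter_plans(**criteria):
    """Return all plans whose attributes equal the given keyword values."""
    return [
        plan for plan in PLANS
        if all(getattr(plan, key) == value for key, value in criteria.items())
    ]


def get_plan(**criteria):
    """Return exactly one matching plan, loosely like .objects.get(...)."""
    matches = filter_plans(**criteria)
    if len(matches) != 1:
        raise LookupError("expected 1 plan, found %d" % len(matches))
    return matches[0]
```

Cross-model queries such as "any project owned by a paid account" would still need hand-written code on top of this, which is the trade-off described above.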

If I use django, I should not care about any MySQL optimization like indexing?

If I use django, I should not care about any MySQL optimization like indexing or something? Django automatically creates everything? Explain please
Answer: you should care, but not at first (usually).
Django doesn't take care of everything for you, only the basic/initial indexes (if you use a migration tool rather than creating the tables manually).
The right indexes depend on the actual data that will be stored in the table, not only on the table structure, and that is something Django cannot know when it creates the tables. Some indexes you get for "free", by creating unique constraints etc. (assuming you use a tool like South that does that for you), but these are not real optimizations; those you'll have to do on your own when the time comes.
Some optimizations can be done by "telling" Django which indexes should be added, but you will have to specify it yourself.
No, Django only creates indices required by the database system, for example when creating foreign keys. You will have to optimise your database yourself, more on that in the documentation.
You can tell Django that certain model fields should be used as an index:
https://docs.djangoproject.com/en/1.10/ref/models/fields/#django.db.models.Field.db_index
You can also index columns together:
https://docs.djangoproject.com/en/1.10/ref/models/options/#django.db.models.Options.index_together
Yes, you are right... with Django you can generate your database from your model classes. You have to use specific base classes and field types, but it's pretty easy to handle and of course it's as optimal as could be needed.

Django MongoDB Embedded Models

When I create a class/model specifically for the purposes of being embedded into another class/model, a collection is still written for the former in my mongodb database. This does not cause any trouble other than the inconvenience of being there, but I'm still wondering if there is any way for a collection not to be written?
I have a nonrel django project as well. It's just a thing django does (and that the nonrel fork has not specifically addressed): when you define a model that is not abstract or proxy, it is going to generate a collection (table) during a syncdb. Whether you save anything to that collection is further dependent on your code, obviously.
If there is some trick to having a concrete model not create a collection in nonrel django, then I am missing something as well.
It's possible if you set abstract = True in that model's Meta class.
However, you can't use lazy lookup (aka EmbeddedModelField('SomeModelThatsNotYetDefined')) yet (https://github.com/django-nonrel/djangotoolbox/issues/15).

Django -- add model to database without losing data

I have a simple Django website (just a form, really) which asks a few questions and saves the data in a SQL database using Model.save(). Pretty simple. I want to add a model to do page counting, though -- it'll just be a single object with a field that gets incremented each time the page's view function is called.
Now, I know little to nothing about SQL. I imagine this is not terribly difficult to do, but I would like to avoid losing or breaking all my data because of a slight misunderstanding of how the database works. So how can I go about doing this? I've heard of some third-party apps that will implement such functionality, but I'd like to do it myself just for learning purposes.
I don't understand why your existing data would be affected at all. You're talking about adding a completely new table to the database, which is supported within Django by simply running manage.py syncdb. The case where that doesn't work is when you're modifying existing tables, but you're not doing that here.
I must say though that learning and using South would be of benefit in any case. It's good practice to have a tool that can maintain your model tables.
(Plus, of course, you would never lose any data, because your database is backed up, right? Right?)
Since you're adding a new model, you can just run syncdb and it will create a new table for your model. If you were to change an existing model, then you'd need to manually update the database schema using "ALTER TABLE" statements or use South instead.

Expando Model in Django

Is it possible to implement 'expando' model in Django, much like Google App Engine has? I found a django app named django-expando on github but it's still in early phase.
It's possible, but it would be a kludge of epic proportions. GAE uses a different database design known as a column-based database, and the Django ORM is designed to link with relational databases. Since technically everything in GAE is stored in one really big table with no schema (that's why you don't have to syncdb for GAE applications), adding arbitrary fields is easy. With relational databases, where each table stores exactly one kind of data (generally) and has a fixed schema, arbitrary fields aren't so easy.
One possible way you could implement this is to create a new model or table for expando properties that stores a table name, object ID, and a TextField for pickled data, and then have all expando models inherit from a subclass that overrides the __setattr__ and __getattr__ methods that will automatically create a new row in this table. However, there are a few major problems with this:
First off, it's a cheap hack and is contrary to the principles of relational databases.
Second, it is not possible to query these expando fields without even more hacks, and even so it would be ludicrously slow.
My recommendation is to find a way to design your database structure so that you don't need expando models.
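To make the side-table idea and its querying problem concrete, here is a rough sketch without Django. A module-level dict stands in for the extra table, and the names ExpandoBase, EXPANDO_ROWS, and find_by are invented for this example; a real version would read and write a database table instead.

```python
import pickle

# Stand-in for the side table described above: maps
# (table_name, object_id) -> pickled dict of extra attributes.
# A real version would be a database table; this dict is an assumption.
EXPANDO_ROWS = {}


class ExpandoBase:
    """Keeps undeclared attributes in EXPANDO_ROWS rather than in columns."""

    declared_fields = ("id",)  # attributes treated as real columns

    def __init__(self, object_id):
        self.id = object_id

    def _row_key(self):
        return (type(self).__name__, self.id)

    def _load_extras(self):
        blob = EXPANDO_ROWS.get(self._row_key())
        return pickle.loads(blob) if blob is not None else {}

    def __setattr__(self, name, value):
        if name in self.declared_fields:
            object.__setattr__(self, name, value)
            return
        # Undeclared attribute: rewrite the pickled blob in the side table.
        extras = self._load_extras()
        extras[name] = value
        EXPANDO_ROWS[self._row_key()] = pickle.dumps(extras)

    def __getattr__(self, name):
        # Reached only when normal lookup fails. Declared fields must not
        # fall through to the side table, or _row_key() would recurse.
        if name in type(self).declared_fields:
            raise AttributeError(name)
        extras = self._load_extras()
        if name not in extras:
            raise AttributeError(name)
        return extras[name]


def find_by(model, name, value):
    """Return ids of `model` objects whose extra attribute equals `value`.

    Every blob for the model has to be unpickled, which is why querying
    expando fields this way is slow.
    """
    return [
        object_id
        for (table, object_id), blob in EXPANDO_ROWS.items()
        if table == model.__name__ and pickle.loads(blob).get(name) == value
    ]
```

Note that find_by cannot use an index at all: the values are hidden inside pickled blobs, so each lookup is a full scan.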