I have a simple Django website (just a form, really) which asks a few questions and saves the data in a SQL database using Model.save(). Pretty simple. I want to add a model to do page counting, though -- it'll just be a single object with a field that gets incremented each time the page's view function is called.
Now, I know little to nothing about SQL. I imagine this is not terribly difficult to do, but I would like to avoid losing or breaking all my data because of a slight misunderstanding of how the database works. So how can I go about doing this? I've heard of some third-party apps that will implement such functionality, but I'd like to do it myself just for learning purposes.
I don't understand why your existing data would be affected at all. You're talking about adding a completely new table to the database, which is supported within Django by simply running manage.py syncdb. The case where that doesn't work is when you're modifying existing tables, but you're not doing that here.
I must say though that learning and using South would be of benefit in any case. It's good practice to have a tool that can maintain your model tables.
(Plus, of course, you would never lose any data, because your database is backed up, right? Right?)
Since you're adding new model, you can just run syncdb and it will create new table for your model. If you were to change existing model, then you'd need to manually update database schema using "ALTER TABLE" statements or use South instead.
Related
With some frequency, I end up with models who contents are approximately constant. For example, I might have a set of plans that users can sign up for, with various attributes -- ID, name, order in the list of plans, cost, whether the purchaser needs to be a student/FOSS project/etc.. I'm going to rarely add/remove/change rows, and when I do there's likely going to be code changes too (eg, to change landing pages), so I'd like the contents of the model (not just the schema) to be managed in the code rather than through the Django admin (and consequently use pull requests to manage them, make sure test deploys are in sync, etc.). I'd also like it to be in the database, though, so I can select them using any column of the model, filter for things like "show me any project owned by a paid account", etc..
What are good ways of handling this?
I think my ideal would be something like "have a list of model instances in my code, and either Django's ORM magically pretends they're actually in the database, or makemigrations makes data migrations for me", but I don't think that exists?
The two workable approaches that come to mind are to either write data migrations by hand or use regular classes (or dicts) and write whatever getters and filters I actually want by hand.
Data migrations give me all the Django ORM functionality I might want, but any time I change a row I need to write a migration by hand, and figuring out the actual state involves either looking in the Django admin or looking through all the data migrations to figure out their combined impact. (Oh, and if somebody accidentally deletes the objects in the database (most like on a test install...) or a migration is screwy recovering will be a mess.)
Just using non-ORM classes in the source is a clearer, more declarative approach, but I need to replacements for many things I might normally do in the ORM (.objects.get(...), __plan__is_student, .values(plan__is_student).annotate(...), etc.).
Which of these approaches is better presumably depends on how much ORM functionality I actually want and how often I expect to be changing things.
Are there other good approaches for this? Am I missing some Django feature (or add-on) that makes this easier?
I am using django south data-migrations to update data in new tables after schema-migrations that create some new tables. I have written some post-save methods on existing tables to update data on newly created tables. For this, I am using data-migrations and in forward method, I am saving existing table rows so that in post-saves, new tables will get data populated.
However, the post saves dont run after running the data migrations.
One way is to call the post-save function directly from forward method in datamigrations. But south documentations recommends that you should use orm objects to freeze the state. But in post-save methods, we will be using these models normally. Another way is to copy same code in migration but that way, everytime I make changes in post-save method, I also need to update code in datamigration forward method.
What is the best way to achieve this?
Migration files are just Python modules. You can always import code from other parts of the project hence reusing logic.
That however is a slippery slope because:
Lets say you have a migration which migrates database from state X to state Y (and backwards). Now lets say this migration reuses some code from foo.py. Now after some time you need to change your db state from state Y to state Z. If again you reuse some logic from foo.py, then you can no longer execute your first migration on a fresh db since some logic is shared between 2 migrations.
If you leave the migration code completely independent, it will make it much easier to maintain many migrations in a long run, even though it can introduce some code duplication.
I'm building a site using mongodb and django-nonrel. I've read in various places that for mongo, it's better to use straight pymongo than the django ORM. This jives with my experience as well -- django's ORM is awesome for relational databases, but for doesn't give you much that pymongo doesn't do already.
My problem is that I don't know how to set up the database tables (err... "collections") initially without using django's ORM. What do I need to do to cast off the shackles of models.py and syncdb, and just write the code myself?
Seems like somebody should have created a guide for this already, but I can't find one.
A little more detail:
Right now, I'm building models and running syncdb to configure the DB. So far, django's ORM magic has made it work. But I need to do some slightly fancier stuff, like indexing on sub-elements, so I don't think the ORM is going to work for me anymore.
On top of that, I don't use models (other than auth_users and sessions) anywhere else in the project. The real schemas are defined elsewhere in json. I don't want to maintain the model classes when and the json schemas at the same time -- it's just bad practice.
Finally, I have a "loadfixtures" management command that I use to flush, syncdb, and load fixtures. It seems like this would be a very good place for the new ORM-replacing code to live, I just don't know what that code should look like....
With MongoDB you don't need an extra step to predeclare the schema to "set up" collections. The document-oriented nature of MongoDB actually does not enforce a strict schema; documents within a collection may have different fields as needed. It's a different concept to get used to, but the collection will be created as soon as you start saving data to it.
Indexes can be added using pymongo's ensureIndex on a collection.
Similar to the collection creation on data insertion, a collection will also be created if it does not exist when an index is added.
An article that should help you get started: Using MongoDB with Django.
If you're new to MongoDB, you also might want to try the short online tutorial.
How does Django handle changes to my Model? Or, what help does it offer me to do this?
I am thinking of a situation where I have already have published data to my DB which I don't want to lose, but I need to make changes to my data model - for example, adding extra fields to a particular class, changing the types of fields, etc. My understanding is that syncdb won't ever alter tables that already exist in the DB.
For example, let's say I have the following model:
class Person(models.Model):
name = models.CharField(max_length=200)
phone_number=models.CharField(max_length=200)
hair_colour=CharField(max_length=50)
Things I might want to do to Person off the top of my head:
I wish to add an 'age' field.
I realise I want to use IntegerField instead of CharField for phone_number (whether this is a good idea or not, is out of scope...) - assuming it's possible.
I realise that I no longer wish to define hair_colour 'inline' within Person, because several people share the same hair colour - I wish instead to change this to be a foreign key to some other model.
Whilst I can imagine some of these are tough/impossible for the framework to 'guess' exactly what needs to be done to my data if all I do is update the models.py, I can imagine that there might still be some tooling to help enable it - does it exist?
In particular I imagine there must be some good patterns for option 1.
I'm very new to Django and have no experience with any other ORM-type stuff, which I think this is - I've always been a bit suspicious of ORMs, mainly for the reasons above :)
Django itself will not attempt to modify an already created database table. What you are trying to do is typically called "Migration" and there are a couple of different Database Migration tools available for Django.
South
Schema Migrations
Data Migrations
Backwards Migrations
Nash Vegas
Schema Migrations
Data Migrations
Django Evolution
Schema Migrations
Data Migrations (Unknown)
Backwards Migrations (Unknown)
Of the three South is probably the most widely used but they each have different ways of dealing with migrations. You can see more details on the comparison on Django Packages.
Much of what you're asking about can be done with the django project South. You add it as an INSTALLED_APP. Create a baseline, then as your model changes it creates SQL statements to convert your tables and the rows with-in the tables to the new model format.
Is it possible to implement 'expando' model in Django, much like Google App Engine has? I found a django app named django-expando on github but it's still in early phase.
It's possible, but it would be a kludge of epic proportions. GAE uses a different database design known as a column-based database, and the Django ORM is designed to link with relational databases. Since technically everything in GAE is stored in one really big table with no schema (that's why you don't have to syncdb for GAE applications), adding arbitrary fields is easy. With relational databases, where each table stores exactly one kind of data (generally) and has a fixed schema, arbitrary fields aren't so easy.
One possible way you could implement this is to create a new model or table for expando properties that stores a table name, object ID, and a TextField for pickled data, and then have all expando models inherit from a subclass that overrides the __setattr__ and __getattr__ methods that will automatically create a new row in this table. However, there are a few major problems with this:
First off, it's a cheap hack and is contrary to the principles of relational databases.
Second, it is not possible to query these expando fields without even more hacks, and even so it would be ludicrously slow.
My recommendation is to find a way to design your database structure so that you don't need expando models.