I've been looking for a way to define database tables and alter them via a Django API.
For example, I'd like to write some code which directly manipulates table DDL and allows me to define tables or add columns to a table on demand, programmatically (without running a syncdb). I realize that django-south and django-evolution may come to mind, but I don't really think of these as tools meant to be integrated into an application and used by an end user; rather, they are utilities for upgrading your database tables. I'm looking for something where I can do something like:
class MyModel(models.Model):  # wouldn't run syncdb; instead do something like below
    a = models.CharField()
    b = models.CharField()

model = MyModel()
model.create()                           # this runs the CREATE TABLE (instead of a syncdb)
model.add_column(c=models.CharField())   # this would set a column to be added
model.alter()                            # and this would apply the ALTER statement
model.del_column('a')                    # this would set column 'a' for removal
model.alter()                            # and this would apply the removal
This is just a toy example of how such an API might work, but the point is that I'd be very interested to find out whether there is a way to programmatically create and change tables like this. This could be useful for things such as content management systems, where one might want to dynamically create a new table. Another example would be a site that stores datasets of arbitrary width, for which tables need to be generated dynamically by the interface or by data imports. Does anyone know any good ways to dynamically create and alter tables like this?
(Granted, I know one can issue direct SQL statements against the database, but that approach loses the ability to treat the tables as objects.)
Just curious whether people have any suggestions or approaches to this...
You can try interfacing with Django's code that manages changes in the database. It is a bit limited (no ALTER, for example, as far as I can see), but you may be able to extend it. Here's a snippet from django.core.management.commands.syncdb:
for app in models.get_apps():
    app_name = app.__name__.split('.')[-2]
    model_list = models.get_models(app)
    for model in model_list:
        # Create the model's database table, if it doesn't already exist.
        if verbosity >= 2:
            print "Processing %s.%s model" % (app_name, model._meta.object_name)
        if connection.introspection.table_name_converter(model._meta.db_table) in tables:
            continue
        sql, references = connection.creation.sql_create_model(model, self.style, seen_models)
        seen_models.add(model)
        created_models.add(model)
        for refto, refs in references.items():
            pending_references.setdefault(refto, []).extend(refs)
            if refto in seen_models:
                sql.extend(connection.creation.sql_for_pending_references(refto, self.style, pending_references))
        sql.extend(connection.creation.sql_for_pending_references(model, self.style, pending_references))
        if verbosity >= 1 and sql:
            print "Creating table %s" % model._meta.db_table
        for statement in sql:
            cursor.execute(statement)
        tables.append(connection.introspection.table_name_converter(model._meta.db_table))
Take a look at connection.creation.sql_create_model. The creation object comes from the database backend that matches the database configured in your settings.py; all of the backends live under django.db.backends.
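For illustration, here is a rough sketch of driving sql_create_model yourself on an older Django (pre-migrations, where connection.creation is still the creation module); the model, its fields, and the app label are hypothetical:

# Rough sketch for older Django versions where connection.creation.sql_create_model exists.
from django.core.management.color import no_style
from django.db import connection, models

class DataSet(models.Model):          # hypothetical model
    a = models.CharField(max_length=100)
    b = models.CharField(max_length=100)

    class Meta:
        app_label = 'myapp'           # hypothetical app label

def create_table(model):
    # Generate and execute the CREATE TABLE statements for one model, without syncdb.
    sql, references = connection.creation.sql_create_model(model, no_style())
    cursor = connection.cursor()
    for statement in sql:
        cursor.execute(statement)

create_table(DataSet)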
If you must have ALTER TABLE, I think you can create your own custom backend that extends an existing one and adds this functionality. Then you can interface with it directly through an ExtendedModelManager you create.
Quickly off the top of my head..
Create a Custom Manager with the Create/Alter methods.
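A very rough sketch of what such a manager might look like; the method names, the raw ALTER statements, and the default column type are made up for illustration and would need adapting to your backend:

# Hypothetical manager that issues DDL directly (older, pre-migrations Django).
from django.core.management.color import no_style
from django.db import connection, models

class DDLManager(models.Manager):
    def create_table(self):
        # Reuse the backend's SQL generation for CREATE TABLE.
        sql, _ = connection.creation.sql_create_model(self.model, no_style())
        cursor = connection.cursor()
        for statement in sql:
            cursor.execute(statement)

    def add_column(self, column_name, column_type='varchar(100)'):
        cursor = connection.cursor()
        cursor.execute('ALTER TABLE %s ADD COLUMN %s %s' %
                       (self.model._meta.db_table, column_name, column_type))

    def drop_column(self, column_name):
        cursor = connection.cursor()
        cursor.execute('ALTER TABLE %s DROP COLUMN %s' %
                       (self.model._meta.db_table, column_name))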
I have a concern with Django subqueries when using the Django ORM. When I fetch a queryset or perform a DB operation, I have the option of bypassing whatever assumptions Django might make about which database to use, by forcing it to use the specific database that I want:
b_det = Book.objects.using('some_db').filter(book_name = 'Mark')
The above disregards any database routers I might have set and goes straight to 'some_db'.
But suppose my models look approximately like this:
class Author(models.Model):
    author_name = models.CharField(max_length=255)
    author_address = models.CharField(max_length=255)

class Book(models.Model):
    book_name = models.CharField(max_length=255)
    author = models.ForeignKey(Author, null=True)
And I fetch a QuerySet representing all books that are called Mark like so:
b_det = Book.objects.using('some_db').filter(book_name = 'Mark')
Then later, if somewhere in the code I trigger a subquery by doing something like:
if b_det:
    auth_address = b_det[0].author.author_address
Then this does not make use of the original database 'some_db' that I had specified early on for the main query. This again goes through the routers and picks up (possibly) the incorrect database.
Why does Django do this? IMHO, if I forced the use of a specific database for the original query, then the same database should be used for the subquery as well. Why must the database routers come into the picture for this at all?
This is not a subquery in the strict SQL sense of the word. What you are actually doing here is executing one query and then using its result to find related items.
You can chain filters and do lots of other operations on a queryset, but it will not be executed until you evaluate it, for example by taking a slice of it. Here you are actually taking a slice:
auth_address = b_det[0].#rest of code
So you have a materialized query, and you are now trying to find the address of the related author. That requires another query, but you are not specifying a database for it, so Django is free to choose which one to use. You can overcome this by using select_related.
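For example, something along these lines keeps the related lookup on the forced database, because the author row is fetched by the same query:

# Fetch the related Author in the same query, so 'some_db' is used for the join as well.
b_det = Book.objects.using('some_db').select_related('author').filter(book_name='Mark')
if b_det:
    auth_address = b_det[0].author.author_address  # no second query, no router involved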
I'm building a Flask app which, at startup, should read some number of TSV files, each of which has the same schema, and put them in tables (one table for each file); users will then specify which table/file they want to query, along with some number of keys.
I'm not sure how to do this, but the best approach seems to be to specify one schema and then, once the app starts, read the files and dynamically create a table for each one. I can't find any mention in the SQLAlchemy docs of how to use the same schema multiple times. Perhaps I need to extend my schema class, but I'm not sure how to do that at startup.
Thanks in advance!
-- EDIT --
It looks like this answers half of my question:
Flask-SQLAlchemy. Create several tables with all fields identical
So my question now is: Can you do the above in Flask, and can you do it as the app starts?
You can take 2 approaches.
Sub-classing - You create a base mixin for the schema and subclass it for each concrete table. This approach is useful if you expect that, in the future, the schema for different tables might diverge: if a new field needs to be added to only one table, you can add it in just that subclass. (The variables db, Model, etc. are as in the Flask-SQLAlchemy quickstart.)
class BaseMixin(object):
    name = db.Column(db.String(80), unique=True)
    field2 = db.Column(...)  # remaining shared columns

class SubClass1(BaseMixin, db.Model):
    pass

class Subclass2(BaseMixin, db.Model):
    additional_field_for_subclass2 = db.Column(...)
Common table for all - If you are confident that the schema will remain the same for all tables, I would suggest you create one table for all your data, with an additional field data_source which indicates where the row/data came from.
class CommonTable(db.Model):
    data_source = db.Column(db.String(100))
    field1 = ...
    field2 = ...
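To make the second approach concrete, here is a rough, self-contained sketch; the column names (field1, field2), the data/ directory, and the SQLite URI are assumptions about how your app is set up:

import csv
import os

from flask import Flask
from flask_sqlalchemy import SQLAlchemy

app = Flask(__name__)
app.config['SQLALCHEMY_DATABASE_URI'] = 'sqlite:///data.db'
db = SQLAlchemy(app)

class CommonTable(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    data_source = db.Column(db.String(100))   # which TSV file the row came from
    field1 = db.Column(db.String(100))
    field2 = db.Column(db.String(100))

def load_all_tsv(directory='data'):
    # Run once at startup: create the table and load every TSV file into it.
    db.create_all()
    for name in os.listdir(directory):
        if not name.endswith('.tsv'):
            continue
        with open(os.path.join(directory, name)) as fh:
            for row in csv.reader(fh, delimiter='\t'):
                db.session.add(CommonTable(data_source=name, field1=row[0], field2=row[1]))
    db.session.commit()

load_all_tsv()

Queries then just filter on data_source, e.g. CommonTable.query.filter_by(data_source='foo.tsv', field1=key).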
I need to merge two databases for two different apps. How can I add a prefix to all Django tables to avoid any conflicts?
For example, the option should look something like:
DB_PREFIX = 'my_prefix_'
You can use Meta options on the model:
class ModelHere(models.Model):
    class Meta:
        db_table = "tablenamehere"
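If you only need this for your own models, you can centralize the prefix yourself; the prefix value and the Customer model below are just placeholders:

# Hypothetical per-model prefixing via Meta.db_table.
from django.db import models

DB_PREFIX = 'my_prefix_'

class Customer(models.Model):
    name = models.CharField(max_length=100)

    class Meta:
        db_table = DB_PREFIX + 'customer'   # table becomes my_prefix_customer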
Edit
If you want to add a prefix to all of your tables, including auth_user, auth_group, etc., then you are looking for something like django-table-prefix. Just install it, add a couple of settings to your settings file, and you are done.
Add 'table_prefix' to INSTALLED_APPS,
Set the table prefix: DB_PREFIX = 'nifty_prefix'
Then run syncdb and the output will be,
Creating tables ...
Creating table nifty_prefix_auth_permission
Creating table nifty_prefix_auth_group_permissions
Creating table nifty_prefix_auth_group
Creating table nifty_prefix_auth_user_groups
Creating table nifty_prefix_auth_user_user_permissions
Creating table nifty_prefix_auth_user
Creating table nifty_prefix_django_content_type
Creating table nifty_prefix_django_session
Creating table nifty_prefix_django_site
An alternative to prefixing all the names is to put one of the two DBs into a different schema (multiple schemas can coexist in the same database, even if the objects have the same names). This will also take care of objects other than tables, such as indexes, views, functions, etc.
So on one of the databases, just do
ALTER SCHEMA public RENAME TO myname;
After that, you can dump it (pg_dump -n myname to dump only one schema), and import it into the other database, without the chance of collisions.
You refer to tables or other objects in the new schema as myname.tablename, or by setting the search_path (this can be done on a per-user basis, e.g. via ALTER USER someuser SET search_path = myname, pg_catalog;).
Note: there may be a problem with frameworks and clients not being schema-aware, so you might need some additional tweaking. YMMV.
http://www.postgresql.org/docs/9.4/static/sql-alterschema.html
I am using django and have three objects: Customer, Location and Department. Each has a related Setting object.
Is it better form to create a single table with optional/null foreign keys?
Or to create a different setting object/table for each of the 3 entities?
There are a few options
Create a separate Settings table and have a nullable ForeignKey from all of your objects to the Settings table. If you choose this option, you should create an abstract base class that has a ForeignKey to the Settings table and inherit from it, so you don't have to add the ForeignKey every time you create a new model (a minimal sketch of this is shown after these options).
Create a separate Settings table and use GenericForeignKeys from the Settings table to reference your objects (Customer, Location, and Department). This has the advantage of not adding an extra column to every table that needs settings. However, you can't do DB joins through GenericForeignKeys via the Django ORM's normal API; you'd have to use raw SQL. Also, select_related doesn't work on GenericForeignKeys, so you'd have to use prefetch_related instead.
Store the settings in a column in the database. You could keep the data in some structured format (I like JSON) and serialize it to a string to store in the DB; to read the settings, you deserialize the string back into JSON and work with that. With this method you don't need to join with another table to get settings, you don't need to run migrations every time you add a new setting, and you don't need a separate Settings table at all. However, constructing a query to find objects with certain settings would be a pain, and the query would probably be slow as well.
Each option has its pros and cons; so, pick your poison ;)
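A minimal sketch of the first option; the Settings fields and the abstract base class name are assumptions, not anything from the question:

# Hypothetical abstract base carrying the ForeignKey to Settings, shared by all three models.
from django.db import models

class Settings(models.Model):
    notifications_enabled = models.BooleanField(default=True)   # example setting

class HasSettings(models.Model):
    settings = models.ForeignKey(Settings, null=True, blank=True)

    class Meta:
        abstract = True

class Customer(HasSettings):
    name = models.CharField(max_length=100)

class Location(HasSettings):
    address = models.CharField(max_length=255)

class Department(HasSettings):
    title = models.CharField(max_length=100)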
What is the fastest way to truncate a table through the Django ORM, based on the database type, in a view? I know you can do this, for example:
Books.objects.all().delete()
but with tables containing millions of rows it is very slow. I know it is also possible to use the cursor and some custom SQL
from django.db import connection
cursor = connection.cursor()
cursor.execute("TRUNCATE TABLE `books`")
However, the TRUNCATE command does not work with SQLite. And if the database moves to another db type, I need to account for that.
Any ideas? Would it be easier to just drop the table and recreate in my view?
Django's .delete() method is indeed very slow, as it loads each object being deleted so that delete signals can be emitted and cascades handled.
This means that a simple cursor.execute("DELETE FROM foo") will be significantly faster than Foo.objects.all().delete().
If that's still too slow, a truncate or drop-and-recreate is definitely the way to go. You can get the SQL used to create a table with output, references = connection.creation.sql_create_model(model, style), where style = django.core.management.color.color_style() (this is taken from https://github.com/django/django/blob/master/django/core/management/sql.py#L14).
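If you'd rather stay in Python and adapt to the backend at runtime, a hedged sketch along these lines is one way to branch on connection.vendor; verify the exact statements against the databases you actually use:

# Rough sketch: pick a fast table-clearing statement per backend.
from django.db import connection

def fast_clear(model):
    table = connection.ops.quote_name(model._meta.db_table)
    cursor = connection.cursor()
    if connection.vendor == 'sqlite':
        # SQLite has no TRUNCATE; a bare DELETE is the usual fallback.
        cursor.execute('DELETE FROM %s' % table)
    elif connection.vendor == 'postgresql':
        cursor.execute('TRUNCATE TABLE %s CASCADE' % table)
    else:  # e.g. mysql
        cursor.execute('TRUNCATE TABLE %s' % table)

fast_clear(Books)  # the Books model from the question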