How to use Django in order to present an old database without ruining it when using syncdb?

I have a database in sqlite and I want to use Django to present it and make queries on it.
I know how to create a new database by creating new classes in models.py, but what is the best way to use Django to access an existing database?

This seems to be a question in two parts: first, how can one write django model classes to represent an existing database; and second, how does that interact with syncdb?
The answer to the first of these is that django models are not expressive enough to describe every possible SQL database schema; instead they support a subset that works well with the ORM's usage patterns. Therefore you may need to accept some adjustments to your schema in order to describe it with django models. In particular:
Django does not support composite primary keys. That is, you can't have a primary key that spans multiple columns.
Django expects tables to be named appname_modelname, because this convention allows the tables from many apps to easily co-exist in the same database. You can override this on a per-model basis with db_table in the model's Meta.
If your schema happens to match the subset that django models support, or you are willing to adapt it so that it does, then your task is simply to write models that match the schema. The inspectdb tool may provide a useful starting point.
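For illustration, here is roughly what a cleaned-up inspectdb result looks like; the table and column names are hypothetical:

    # Generate a starting point from the existing schema:
    #   ./manage.py inspectdb > myapp/models.py
    from django.db import models

    class Person(models.Model):
        # inspectdb maps the existing columns onto the closest Django field types
        name = models.CharField(max_length=100)
        birth_date = models.DateField(null=True, blank=True)

        class Meta:
            db_table = 'people'   # point at the existing table name
            managed = False       # tell Django never to create or drop this table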
You can test whether you've described your database successfully by temporarily reconfiguring your project to use a different, empty database, running manage.py syncdb, and then comparing the schema Django created with the schema that already existed. If they are the same (or at least close enough) then you got it right.
If your existing database is not a good match for the Django ORM's assumptions then a more flexible alternative is SQLAlchemy. It doesn't natively integrate into django's application system, but it does provide a more complete database interface that can work with almost any database; some schemas will map easily, others will require more manual mapping work, but almost all cases should be possible with some creativity.
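For comparison, a minimal sketch of SQLAlchemy reflecting an existing schema (the file and table names are hypothetical); note that reflection handles things like composite primary keys that Django models can't express:

    from sqlalchemy import create_engine, MetaData, Table

    engine = create_engine('sqlite:///legacy.db')
    metadata = MetaData()

    # Reflect the table definition from the database itself,
    # composite keys and all, instead of describing it by hand.
    orders = Table('orders', metadata, autoload=True, autoload_with=engine)

    for row in engine.connect().execute(orders.select()):
        print(row)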
As for the interaction with syncdb: the default behavior for this command is to skip over any models that already seem to have tables in the database. Therefore if you've defined models that do indeed match with your existing database tables it should leave them alone. It will, however, create the additional tables required for other apps in your project, including Django's own tables.
Modern Django has support for multiple databases, which could provide you with a further approach: configure your existing database as a second database source in your project and use a database router to ensure that the appropriate models are loaded from that second database, and further to ensure that django won't attempt to run syncdb on this database. This provides true separation at the expense of some additional complexity, but it still requires that your schema be compatible with the ORM's assumptions. It also has some limitations, largely pertaining to relationships between objects that are persisted in different databases.
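A minimal sketch of that setup, assuming the inherited tables are described by models in an app called legacyapp and the second database is aliased legacy (both names hypothetical):

    # settings.py
    DATABASES = {
        'default': {'ENGINE': 'django.db.backends.sqlite3', 'NAME': 'app.db'},
        'legacy':  {'ENGINE': 'django.db.backends.sqlite3', 'NAME': 'legacy.db'},
    }
    DATABASE_ROUTERS = ['myproject.routers.LegacyRouter']

    # myproject/routers.py
    class LegacyRouter(object):
        """Read and write legacyapp models on the legacy database,
        and keep syncdb from ever touching that database."""

        def db_for_read(self, model, **hints):
            return 'legacy' if model._meta.app_label == 'legacyapp' else None

        def db_for_write(self, model, **hints):
            return 'legacy' if model._meta.app_label == 'legacyapp' else None

        def allow_syncdb(self, db, model):
            # syncdb-era router hook; never create tables on 'legacy'.
            return False if db == 'legacy' else None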
If you'd like to be able to make versioned changes to the database Django uses, starting with the schema you've inherited from the existing database, then South provides a more flexible and more complete alternative to the builtin syncdb mechanism that supports running arbitrary SQL data definition language statements to make changes to your database schema.
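Under South, the usual way to adopt an inherited schema is to record an initial migration and fake-apply it, then evolve from there; the app name here is hypothetical:

    ./manage.py schemamigration legacyapp --initial
    ./manage.py migrate legacyapp 0001 --fake   # record as applied without running any SQL
    # after later model changes:
    ./manage.py schemamigration legacyapp --auto
    ./manage.py migrate legacyapp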

It sounds like you need something like South, which will allow you to version and revert changes to your models.

You just need ./manage.py inspectdb.

Related

Using different databases depending on a parameter in the URL

Is there a way I can tell Django2 to use a different database (and cache/session store) depending on a parameter in the URL?
Note that I have read the docs on multiple databases in Django (https://docs.djangoproject.com/en/2.1/topics/db/multi-db/#automatic-database-routing), and that is not what I'm asking.
The docs show an example of how to use DATABASE_ROUTERS, which is a way of choosing which database should be used programmatically when using a model.
What I'm asking is how can I make Django2 use different databases automatically depending on a parameter in the URL. Example:
http://foo.bar/usa <-- use USA database
http://foo.bar/europe <-- use Europe database
Edit: to whoever is marking this question as duplicate. Please read carefully what I'm asking.
First of all, I'm asking to do this automatically, versus the programmatic solution that was provided as an answer in Django - Runtime database switching
Second, I'm asking for database, session/cookies and cache storage, which is quite different than just changing the database for model queries.
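For what it's worth, the pattern this usually comes down to is a middleware that records the URL prefix in thread-local state plus a router that reads it back. Everything below is a hypothetical sketch: it assumes DATABASES aliases named usa and europe, and it covers only model queries, not sessions or cache.

    import threading

    _local = threading.local()

    class RegionMiddleware:
        """Remember the region prefix of the current request.
        Add to MIDDLEWARE in settings."""
        def __init__(self, get_response):
            self.get_response = get_response

        def __call__(self, request):
            first = request.path.strip('/').split('/')[0]
            _local.region = first if first in ('usa', 'europe') else None
            return self.get_response(request)

    class RegionRouter:
        """Route reads and writes to the current region's database.
        Add to DATABASE_ROUTERS in settings."""
        def db_for_read(self, model, **hints):
            return getattr(_local, 'region', None)

        def db_for_write(self, model, **hints):
            return getattr(_local, 'region', None)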

Mixing SQLAlchemy and Django in a Python code base and maintaining transaction integrity

I have a project where I would want to (need to?) mix SQLAlchemy models, Django models and respective ORMs in the same web codebase. I'd like to have atomic requests by default and tie SQLAlchemy transaction lifecycle to Django request and Django transaction manager.
Is there prior art on making SQLAlchemy use Django's connection and transaction machinery, or vice versa?
What would be a good starting point for such integration work? What limitations are there, e.g. if you try to reuse the same database connection?
To narrow down the problem:
Django ORM and SQLAlchemy ORM won't touch the same tables
As a first step, all I care about is that when the HTTP request ends both transaction managers commit in a somewhat coherent manner, e.g. if Django commits the transaction then SQLAlchemy does also
How can I tell SQLAlchemy to use the database connection configured for Django?
Can I bind the SQLAlchemy session to the Django transaction manager? When Django opens the database connection I could open a new SQLAlchemy session and bind it to the opened Django transaction, and when Django commits I could signal SQLAlchemy to flush its changes so they go along with the same commit. Django 1.6 introduced new semantics for atomic transactions, so this might help.
That's really not going to be easy. I wonder if the effort is worth it. SQLAlchemy and Django use very different abstractions and patterns to deal with object persistence and transactions.
The Django ORM follows the Active Record pattern, in which the object maps more directly to a database table and encapsulates all access and logic. Changes in the object translate directly into a row being changed by SQL code when you call the save() method. You can manage transactions on your own, but everything is basically just syntactic sugar for dealing with the underlying database.
SQLAlchemy follows the Data Mapper pattern, where there's another layer of abstraction responsible for moving data between the active objects and the database, independent of each other. The objects don't even know there's a database present, and the mapping between object and database table is very, very flexible. Also, SQLAlchemy has another transaction layer on the Python side, following the Unit of Work pattern, which basically encapsulates the whole SQL transaction as an atomic entity. Objects are tracked in the session by primary key, and changes are saved atomically, in correct order.
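To make the contrast concrete, a rough side-by-side; the Article model and Session factory here are hypothetical:

    # Django (Active Record): the object writes itself to its table.
    article = Article.objects.get(pk=1)
    article.title = 'New title'
    article.save()  # issues the UPDATE right away

    # SQLAlchemy (Data Mapper / Unit of Work): the session tracks changes
    # and flushes them together, in dependency order, when you commit.
    session = Session()
    article = session.query(Article).get(1)
    article.title = 'New title'
    session.commit()  # flush + COMMIT as one atomic unit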
Coming from Django, the first time I worked with Flask and SQLAlchemy I made a mistake a lot of people make, which is to try to mimic the usage patterns of the Django ORM in SQLAlchemy. For instance, creating a save() method that commits the transaction looks like the obvious thing to do when you're used to the Django ORM, but it's a terrible idea in SQLAlchemy. I learned the hard way that they don't mix very well.
The SQLAlchemy declarative base method encapsulates the class-mapper-table relationship and makes it look more like the Active Record pattern, but that can be very misleading, because you start to think the object itself has knowledge of the database.
If you REALLY need to do that, considering how SQLAlchemy's semantics map more cleanly to the database, I think the best bet is treating SQLAlchemy itself as a database: create an SQLAlchemy backend that knows how to map Django models and queries onto an SQLAlchemy model-mapper-table. Maybe you could even use the Django model itself with an SQLAlchemy mapper-table.
So, for instance, when you run the save() method in Django, instead of generating and running SQL on the database, it should retrieve and change the equivalent SQLAlchemy object from the current session, so anyone dealing with the object on the SQLAlchemy layer sees everything as if it were the database. When you commit the transaction in Django, you commit the SQLAlchemy session.
That might be an interesting exercise, but I really don't see much point in doing that for real-world use cases.
Updated Answer
It looks like all you want is to synchronize Django transactions with SQLAlchemy sessions. You don't need to share the connection instance for that. You can use something like django-transaction-hooks to trigger a callback responsible for committing the SQLAlchemy session. If you need the opposite, committing the Django transaction when the SQLAlchemy session is committed, you can use the after_commit event.
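A minimal sketch of that one-way synchronization, assuming django-transaction-hooks is installed as the database backend and sa_session is your SQLAlchemy session (both names hypothetical):

    from django.db import connection
    from sqlalchemy import event

    from myapp.db import sa_session  # hypothetical SQLAlchemy session

    # Django side: commit the SQLAlchemy session only once the
    # Django transaction has actually committed.
    connection.on_commit(lambda: sa_session.commit())

    # SQLAlchemy side: react after the SQLAlchemy session commits.
    @event.listens_for(sa_session, 'after_commit')
    def on_sa_commit(session):
        pass  # e.g. log, or trigger the Django-side work here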
Be aware that you won't have atomicity between the two engines. If something goes wrong in the SQLAlchemy commit, you can't roll back the Django commit, and vice-versa.

How to move from one database backend to another on a production Django project?

I would like to move a database in a Django project from one backend to another (in this case Azure SQL to PostgreSQL, but I want to think of it as a generic situation). I can't use a dump since the databases are different.
I was thinking of something at the Django level, like dumpdata, but depending on the amount of available memory and the size of the db it can be unreliable and crash.
I have seen solutions that try to break the process into smaller parts that the memory can handle but it was a few years ago, so I was hoping to find other solutions.
So far my searches have failed since they always lead to 'south', which refers to schema migration and not moving data.
I have not implemented this before, but what about the following:
Django supports multiple databases, so just configure DATABASES in your settings file to include both the old Azure SQL database and the new PostgreSQL database. Then create a small script that makes use of bulk_create, reading the data from one DB and writing it to the other.
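A rough sketch of such a script, assuming DATABASES defines old and new aliases and a Client model (all names hypothetical):

    from myapp.models import Client

    BATCH = 1000
    batch = []
    # iterator() streams rows instead of loading the whole queryset
    # into memory, which is what makes dumpdata-style approaches crash.
    for obj in Client.objects.using('old').iterator():
        batch.append(obj)
        if len(batch) >= BATCH:
            # Primary keys are preserved, so foreign keys between
            # copied rows keep pointing at the right places.
            Client.objects.using('new').bulk_create(batch)
            batch = []
    if batch:
        Client.objects.using('new').bulk_create(batch)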

Mongodb vs PostgreSQL in django

I am not that experienced with django yet, but we are creating a project soon and we were wondering which database to use for our backend (Mongodb or PostgreSQL).
I've read a lot of posts describing the differences between them, but I still can't make the decision of which to go for, taking into consideration that I have never worked with MongoDB before.
So which should I go for?
Thanks a lot in advance
MongoDB is non-relational, and as such you cannot do things like joins, etc.
For this reason, many of the django.contrib apps, and other third-party apps, are likely not to work with MongoDB.
But MongoDB might be very useful if you need to store schemaless complex objects that won't go straight into PostgreSQL (of course you could JSON-serialize them and put them in a text field, but using MongoDB instead is just way better, as it allows you to run searches on them, etc.).
So, the best suggestion is to use two databases:
PostgreSQL for the standard applications, such as django core, authentication, ...
MongoDB only for your application, when you have to store non-relational, complex objects
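A sketch of that split, assuming the django-nonrel stack with django_mongodb_engine on the MongoDB side (database names hypothetical):

    DATABASES = {
        # Relational side: django core, auth, sessions, admin, ...
        'default': {
            'ENGINE': 'django.db.backends.postgresql_psycopg2',
            'NAME': 'myapp',
        },
        # Non-relational side: the schemaless documents.
        'mongo': {
            'ENGINE': 'django_mongodb_engine',
            'NAME': 'myapp_docs',
        },
    }
    # A database router (not shown) then sends the document-style
    # models to 'mongo' and everything else to 'default'.
    DATABASE_ROUTERS = ['myproject.routers.MongoRouter']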
You also might want to use the raw_* methods that skip lots of (mostly unnecessary) validation by the django orm.
Just remember that databases, especially SQL vs NoSQL, are not drop-in replacements for each other; they each have their own features, pros and cons, so you have to find out which one best suits your needs in each case, not just pick one and use it for everything.
UPDATE
I forgot to say: remember that you have to use the django-nonrel fork in order to make django support non-relational databases. It is currently a fork of django 1.3, but a 1.4-based version is work-in-progress.

sharing database table between two django projects

I have two different Django projects that are meant to run in parallel and do pretty different things.
However they need to share a common database table, the Client table.
Both projects contain multiple apps that need foreign keys mapped to that Client model.
I'm not sure what would be the best approach.
Assuming both projects are working on the same db, just import the model you want to reference:

    from django.db import models
    from first_project.some_app.models import Client, OtherSharedModel

    class SomeModelInSecondProject(models.Model):
        client = models.ForeignKey(Client)
Unfortunately, Django's support for multiple databases does not support cross-database relations. You could fake this on one of the systems (i.e. have the table referenced, but handle the key references yourself), but you would need to be very careful to document what you are doing, to make sure you maintain referential integrity in the app that is 'faking' it.
I haven't tested it, but another alternative, if you're sharing the same database and hosting both projects on the same server, is to just merge them into one project, organize their apps in different directories, and if you must, use two different settings files. Please see this related question: How to keep all my django applications in specific folder. It's just a different approach that doesn't require you to reference a different project (I'm not sure how recommendable that is).