How to know when the database is ready in Django?

I need to do stuff as soon as the database is ready in Django. Specifically, I need to perform some calculations on values from db and fill the results into cache.
Since Django 1.7, the application registry makes it easy to know when an app or its models are ready to be used. You can write:
from django.apps import apps
if apps.ready:
    do_some_stuff()
But I found out that the models being ready does not mean the database can be queried. The Django docs say:
Although you can access model classes as described above, avoid
interacting with the database in your ready() implementation
I tried to hook up to the post_migrate signal. It works if I'm rebuilding the database (e.g. launching the test suite), but not if I'm just using an existing db (e.g. using runserver).
Is there a way to know if the database is fully available in Django >= 1.7?

I also use the post_migrate signal (as in https://github.com/mrjmad/django_badgificator/blob/master/badgificator/apps.py).
Reading your question, I realize it does not work with 'runserver'...

You can try hooking up a receiver for the connection_created signal.
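A minimal sketch of that idea, assuming the goal from the question (warming a cache from the DB); the app and function names here are illustrative, not from the question:

```python
# apps.py of a hypothetical app -- names are illustrative.
from django.apps import AppConfig
from django.db.backends.signals import connection_created


def warm_cache(sender, connection, **kwargs):
    # Fires once a database connection has actually been established,
    # so it should be safe to query the database here.
    from myapp.models import MyModel  # lazy import, avoids AppRegistryNotReady
    # ... perform calculations on MyModel and fill the cache ...
    # Disconnect so this runs only for the first connection.
    connection_created.disconnect(warm_cache)


class MyAppConfig(AppConfig):
    name = "myapp"

    def ready(self):
        connection_created.connect(warm_cache)
```

Note this fires per connection, hence the disconnect after the first run.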

If I understand correctly, you want to fill the cache with data from the DB when you start runserver. Since the server won't reload in production, you will fill your cache only once until you restart it (and I'm not even sure that gunicorn would behave the same way as runserver here).
So you probably already have another way to update your cache after startup, using Celery or something similar? Why not just use that same mechanism to perform the first run?

You could set up your code in the wsgi.py file after the application is imported and called, like so:
from django.core.wsgi import get_wsgi_application
application = get_wsgi_application()

from myapp.models import MyModel
print(MyModel.objects.all()[0:5])
# Set up your startup code here since you already have access to your models
I found this answer based on this link:
Entry point hook for Django projects

Any code in project/__init__.py will run on startup, after the database is ready but before any views/URLs can be accessed, so just put some code in __init__.py and it will run as you expect. post_migrate might be redundant because, as far as I'm aware, you can't run migrations while the app is running. If you absolutely need it, just have a function that runs both on startup and when the signal is fired.
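A minimal sketch of that combination, with the caveat that the module, app, and function names are assumptions:

```python
# project/__init__.py -- names here are illustrative.
def refresh_cache(**kwargs):
    # Import lazily so this module can be loaded before Django's
    # app registry is fully populated.
    from myapp.models import MyModel
    # ... compute values from the DB and fill the cache ...


# In an app's AppConfig.ready() you could reuse the same function
# as a post_migrate receiver:
#
#     from django.db.models.signals import post_migrate
#     post_migrate.connect(refresh_cache)
```

The same `refresh_cache` then serves both the startup path and the post_migrate path, so the two cases stay in sync.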

Related

Is it possible to get an interactive django shell using the test database?

When running tests, you can do:
./manage.py test --keepdb
to run your tests and keep the test database.
Is it possible to have the django shell actually connect to it, so we can interactively access the test database the same way the Django shell can normally work with the production database?
Note that the answer and its comments here imply that you can access it by doing something like:
from django import test
test.utils.setup_test_environment()
from django.db import connection
db = connection.creation.create_test_db(keepdb=True)
But when I do that, my database appears to be empty when I do queries.
I ran into this too. At first I thought it was because the codebase I'm working on has a flush call in the teardown function, but my DB was still empty after removing those; maybe there were more flushes somewhere I didn't catch.
I ended up working around it by sleeping at the end of the test, so it doesn't exit and doesn't clean up.

Access to Django ORM from remote Celery worker

I have a Django application and a Celery worker, each running on its own server.
Currently, the Django app uses SQLite to store its data.
I'd like to access the database using Django's ORM from the worker.
Unfortunately, it is not completely clear to me; thus I have some questions.
Is it possible without hacks/workarounds? I'd like a simple solution (I would not like to implement a REST interface for object access). I imagine this could be achieved if I started using a PostgreSQL instance that is accessible from both servers.
Which project files (there's just Django + tasks.py file) are required on the worker's machine?
Could you provide me with an example or tutorial? I tried looking it up but found just tutorials/answers bound to a problem of local Celery workers.
I have been searching for ways to do this simply, but... Your best option is to attach a kind of callback to the task function that will call another function on the Django server to carry out the database update.
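Alternatively, going the shared-PostgreSQL route the question itself suggests: the worker machine needs the same settings module and the app's models, and both machines point at one database. A sketch of the shared settings, where the hostname, credentials, and database name are placeholders:

```python
# settings.py shared by both the web server and the Celery worker.
# Hostname, credentials, and database name below are placeholders.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql_psycopg2",
        "NAME": "myproject",
        "USER": "myproject",
        "PASSWORD": "secret",
        "HOST": "db.internal.example.com",  # must be reachable from both servers
        "PORT": "5432",
    }
}
```

With this in place, the worker can use the ORM directly instead of calling back into the web server, at the cost of distributing the project's settings and models to the worker machine.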

High response time when setting value for Django settings module inside a middleware

In a Django project of mine, I've written a middleware that performs an operation for every app user.
I've noticed that the response time balloons up if I write the following at the start of the middleware module:
import os
os.environ.setdefault("DJANGO_SETTINGS_MODULE","myproject.settings")
The response time is about 10 times lower if I omit these lines. Being a beginner, I'm trying to understand why there's such a large difference between the respective response times. Can an expert explain it? Have you seen something like this before?
p.s. I already know why I shouldn't modify the environment variable for Django settings inside a middleware, so don't worry about that.
The reason likely has something to do with Django reloading your settings configuration for every request rather than once per server thread/process (and thus also re-instantiating/connecting to your database, cache, etc.). You will want to confirm this with profiling. This behavior is also very likely dependent on which app server you are running.
If you really want this level of control for your settings, it is much easier for you to add this line to manage.py, wsgi.py or whatever file/script you use to launch your app server.
P.S. If you already know you shouldn’t do it, why are you doing it?

Django Directory Structure - Non-website code

I have a Django project that includes code for processes that are scheduled to run (via cron) independently from the website. The processes update the database using the models from one of my apps so I guess the code for these processes could be considered part of that app even though it's not part of the website. Should I create a package inside the app directory to hold these modules?
If the code you're supposed to run is tied to models in a certain app, you can write a custom management command for it.
The code lives inside your app (in myapp/management/commands/command_name.py) and you'll be able to call it using manage.py or django-admin.py, which allows you to add an entry to cron very easily.
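A minimal sketch of such a command, where the app name, command name, and model are assumptions:

```python
# myapp/management/commands/update_records.py -- names are illustrative.
from django.core.management.base import BaseCommand


class Command(BaseCommand):
    help = "Runs the scheduled process that updates records via the app's models."

    def handle(self, *args, **options):
        # Imported here so the command module loads cheaply.
        from myapp.models import MyModel
        # ... update the database using your app's models ...
        self.stdout.write("Done.")
```

The cron entry then just invokes `manage.py update_records`, and the scheduled code keeps living alongside the models it depends on.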

Django - Johnny Cache for multiple processes

I've configured johnny cache with one of my applications that is hosted on apache. It is configured with memcached as the backend which runs on the same machine on the default port.
The caching works fine when multiple web clients go through apache: they all read from the cache, and any update invalidates it. But when a Python program/script reads from the DB using Django (same settings.py that has the Johnny configuration), it doesn't read from the cache, and hence any updates made by that program won't affect the cache, which leaves the web clients reading stale data.
I haven't found anything in johnny cache's documentation related to this. Any thoughts on this situation?
I'm using johnny cache 0.3.3, django 1.2.5 and python 2.7.
Edit:
To answer one of the questions in the comments, I read from the DB in the script this way:
>>> cmp = MyModelClass.objects.get(id=1)
>>> cmp.cust_field_2
u'aaaa'
I know it doesn't read from the cache because I update the table directly by firing an update sql statement and the updated value is not reflected in my web client as it still reads from the cache. Whereas my script shows the updated value when I re-fetch the object using MyModelClass.objects.get(id=1)
It appears that middleware is not called when you run scripts/management commands, which is why you are seeing the difference. This makes sense when reading the documentation on middleware, because it processes things like requests and views, which don't exist in a custom script.
I found a way around this, and there is an issue regarding it in the Johnny Cache bitbucket repo. In your script, put the following before you do anything with the database:
from johnny.middleware import QueryCacheMiddleware
qcm = QueryCacheMiddleware()
# put the code for your script here
qcm.unpatch()
You can see more on that here:
https://bitbucket.org/jmoiron/johnny-cache/issue/49/offline-caching
and here:
https://bitbucket.org/jmoiron/johnny-cache/issue/50/johhny-cache-not-active-in-management
That is the recommended way from the documentation:
from johnny.cache import enable
enable()
Update:
What I observed: if your tasks.py files have this at the beginning, you can no longer disable Johnny Cache via settings.py.
I have reported the issue: https://github.com/jmoiron/johnny-cache/issues/27