How to use Django 3.0 ORM in a Jupyter Notebook without triggering the async context check?

Django 3.0 is adding asgi / async support and with it a guard around making synchronous requests in an async context. Concurrently, IPython just added top level async/await support, which seems to be running the whole interpreter session inside of a default event loop.
Unfortunately, the combination of these two great additions means that any Django ORM operation in a Jupyter notebook raises a SynchronousOnlyOperation exception:
SynchronousOnlyOperation: You cannot call this from an async context - use a thread or sync_to_async.
As the exception message says, it's possible to wrap each ORM call in a sync_to_async() like:
images = await sync_to_async(list)(Image.objects.all())
but it's not very convenient, especially for related fields which would usually be implicitly resolved on attribute lookup.
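For related fields, one workaround is to wrap the attribute access itself in a callable; a minimal sketch, assuming a hypothetical owner foreign key on Image:
from asgiref.sync import sync_to_async

image = await sync_to_async(Image.objects.get)(pk=1)
owner = await sync_to_async(lambda: image.owner)()  # forces the lazy related-field lookup to run in a worker thread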
(I tried the %autoawait off magic but it didn't work; from a quick glance at the docs I'm assuming that's because ipykernel always runs inside an asyncio loop.)
So is there a way to either disable the sync in async context check in django or run an ipykernel in a synchronous context?
For context: I wrote a data science package that uses django as a backend server but also exposes a jupyter based interface on top of the ORM that allows you to clean/annotate data, track machine learning experiments and run training jobs all in a jupyter notebook.

This works for me:
import os
os.environ["DJANGO_ALLOW_ASYNC_UNSAFE"] = "true"
BTW, I start my notebook using the command
./manage.py shell_plus --notebook
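The flag has to be set before the first query runs, for example as the first notebook cell; a sketch, where myapp.models.Image is a placeholder model:
import os
os.environ["DJANGO_ALLOW_ASYNC_UNSAFE"] = "true"  # must run before any ORM call

from myapp.models import Image  # placeholder model
Image.objects.count()  # no longer raises SynchronousOnlyOperation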

Contrary to other answers I'd suggest just running from the shell as:
env DJANGO_ALLOW_ASYNC_UNSAFE=true ./manage.py shell_plus --notebook
and not modifying any config files or startup scripts.
The advantage of doing it this way is that the checks stay enabled almost everywhere else, where they are still useful (e.g. when debugging locally via runserver or when running tests). Disabling them via config files would easily disable them in too many places, negating their advantage.
Note that most shells provide easy ways of recalling previously invoked command lines, e.g. in Bash or Zsh Ctrl+R followed by notebook would find the last time you ran something that had "notebook" in. Under the fish shell just type notebook and press the up arrow key to start a reverse search.

For now I plan on just using a forked version of django with a new setting to skip the async_unsafe check. Once the ORM gets async support I'll probably have to rewrite my project to support it and drop the flag.
EDIT: there's now a PR to add an env variable (DJANGO_ALLOW_ASYNC_UNSAFE) to disable the check (https://github.com/django/django/pull/12172)

I added
import os
os.environ["DJANGO_ALLOW_ASYNC_UNSAFE"] = "true"
at the very bottom of my project's settings.py and then ran the command
python3 manage.py shell_plus --notebook
(use python instead of python3, depending on your installation).
That's it, it worked for me.

Related

Deploying Django to production: correct way to do it?

I am developing Django Wagtail application on my local machine connected to a local postgres server.
I have a test server and a production server.
However, when I develop locally and try to upload, there is always some issue with makemigrations and migrate, e.g. a KeyError.
What are the best practices of ensuring I do not get run into these issues? What files do I need to port across?
So I'll tell you what I do, and what most of the companies I've worked at as a Django developer did; I can tell you from experience that it worked pretty well.
First, containerize your application. This will make your life much easier, remove external influences from your code, and give you an easy way to reproduce your environment.
Your Dockerfile should be based on some Python image and should do three basic things (a sketch follows this list):
Install the dependencies from your requirements file
Run the python manage.py migrate --noinput command
Run a http server such as gunicorn with gunicorn -c /gunicorn.py wsgi:application
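A minimal sketch of such a Dockerfile; the base image tag, paths, and the wsgi module name are assumptions:
FROM python:3.9
WORKDIR /app

# 1. Install the dependencies.
COPY requirements.txt .
RUN pip install -r requirements.txt

# Copy the project, including the gunicorn config referenced below.
COPY . .
COPY gunicorn.py /gunicorn.py

# 2. Apply migrations at container start, then 3. run the http server.
CMD python manage.py migrate --noinput && gunicorn -c /gunicorn.py wsgi:application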
You will run makemigrations on your local machine and make sure everything works before committing the migrations to the repo.
In your gunicorn.py you will put the settings to run the app, such as the number of workers (based on CPU count), the binding port, and the folder your app lives in, something like:
import multiprocessing

# Chdir to the specified directory before apps loading.
# https://docs.gunicorn.org/en/stable/settings.html#chdir
chdir = '/app/'

# Bind the application on all IPv4 interfaces on port 8000.
# https://docs.gunicorn.org/en/stable/settings.html#bind
bind = '0.0.0.0:8000'

# Scale the worker count with the available CPUs.
# https://docs.gunicorn.org/en/stable/settings.html#workers
workers = multiprocessing.cpu_count() * 2 + 1
Second, containerize your other stuff: for example the Postgres database, Redis (for cache), and a connection pooler for the database, depending on the size of your application.
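For example (image versions, container names, volumes, and credentials are placeholders):
docker run -d --name db -e POSTGRES_PASSWORD=secret -v pgdata:/var/lib/postgresql/data postgres:13
docker run -d --name cache redis:6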
It's highly recommended that you have a step in the pipeline to run tests; they need to run before everything else, maybe just after lint.
OK, what now? Now you need a way to deploy that stuff. The best option for this scenario is to push your image to the GitHub registry, adding a tag to it, for example:
IMAGE_ID=ghcr.io/${{ github.repository_owner }}/$IMAGE_NAME
# Change all uppercase to lowercase
IMAGE_ID=$(echo $IMAGE_ID | tr '[A-Z]' '[a-z]')
docker tag $IMAGE_NAME $IMAGE_ID:staging
docker push $IMAGE_ID:staging
This can be added in a GitHub Action in the build step, for example.
Once your new code is in a new image inside GitHub, you just need to update the current one. This can be done by creating a script on the server and running it from a GitHub Action, something like:
docker pull ghcr.io/${{ github.repository_owner }}/$IMAGE_NAME
echo 'Restarting Application...'
docker stop {YOUR_CONTAINER} && docker rm {YOUR_CONTAINER}
docker run -d --name {YOUR_CONTAINER} ghcr.io/${{ github.repository_owner }}/$IMAGE_NAME
sudo systemctl restart nginx
echo 'Cleaning old images'
sudo docker system prune -af
You can see that I create the image with a staging tag. You can create a rule in GitHub Actions to trigger that action when you create a new release, for example, and create another action triggered on every new commit that builds/deploys a dev tag.
For the migration problem: the first thing is, when your application goes live, squash every migration into the first one (you can drop the database and all the migrations, then recreate the database and run the makemigrations command again to achieve this), so you have a clean migration history on the server; a sketch of that reset follows below. Never create unnecessary relations between tables, prefer cached properties over adding new columns, use UUIDs for unique ids, and try not to make breaking changes to the database. It's hard, but if you plan the database beforehand it's not so difficult.
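A sketch of that migration reset, assuming a local Postgres database named mydb; never run this against production data:
# Delete every migration file except the package markers.
find . -path "*/migrations/*.py" -not -name "__init__.py" -delete
find . -path "*/migrations/*.pyc" -delete
# Recreate the local database and regenerate a single clean initial migration.
dropdb mydb && createdb mydb
python manage.py makemigrations
python manage.py migrate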
Hit me up if you have any questions. A lot of the stuff I described can be done on other platforms such as GitLab, Travis, or CircleCI, but I used GitHub Actions in the example because I think it's simpler to picture.
EDIT:
I forgot to tell you to have a cron job on your server doing backups of your databases. The migrate command will apply changes only after verification, but if something else breaks the database, backups can save your life.
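For example, a hedged crontab entry for a nightly Postgres dump (paths and database name are assumptions; note that % must be escaped in crontab):
0 3 * * * pg_dump mydb | gzip > /backups/mydb_$(date +\%F).sql.gz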

Wagtail "schedule_published_pages" Management Command

I was wondering why my scheduled posts were not working automatically in Wagtail, but I see in the documentation that a management command is needed to make this happen. I am unfamiliar with writing custom management commands, and I am wondering how to make the python manage.py publish_scheduled_pages command fire off automatically every hour?
Where would this code go in the document tree? Is there code that I just need to drop in and it runs from there? Or is something required on the server to run these commands on a schedule?
Any help would be appreciated. I couldn't find any existing code anywhere for this functionality in Wagtail and I'm wondering why the button is in the admin to schedule a post, but the functionality is not already built in?
You are probably familiar with management commands since python manage.py runserver and makemigrations and migrate are management commands.
You can see all available commands with python manage.py -h
publish_scheduled_pages should be called periodically. From the Wagtail docs:
This command publishes, updates or unpublishes pages that have had these actions scheduled by an editor. We recommend running this command once an hour.
Periodically executing a command can be done in various ways. Via crontab is probably the most common. To edit the crontab:
$ crontab -e
Add (to run at minute zero of every hour):
0 * * * * python /path/to/your/manage.py publish_scheduled_pages --settings=your.settings
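If your project lives in a virtualenv, remember that cron does not activate it; use the environment's interpreter explicitly (paths are assumptions):
0 * * * * /path/to/venv/bin/python /path/to/your/manage.py publish_scheduled_pages --settings=your.settings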

Running unix commands from Django

I have a Django site that is up and running. I need to add a feature to call wget in response to a user action. How should I do this from the Django application?
Since Django is written in Python you can use Python's subprocess module to call wget in one of your views. However, if you merely want to download a file with wget (and not use one of its advanced features), you can emulate its behavior more easily with urllib2.
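A minimal sketch of the urllib2 route; the URL and filename are placeholders:
import urllib2  # Python 2; on Python 3 use: from urllib import request as urllib2

response = urllib2.urlopen("http://myurl.com/file.zip")
with open("file.zip", "wb") as f:
    f.write(response.read())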
Is there a reason why you're resorting to a unix command, rather than using something like urllib2?
If there is, you can always use this within your view:
from subprocess import call
call(["wget", "http://myurl.com"])
Here's a pretty comprehensive thread on the matter:
Calling an external command in Python
Use Celery

Does django's runserver option provide a hook for running other restart scripts?

I've recently been playing around with django and celery. One annoying thing during development is the fact that I have to restart the celery daemon each time I modify a task. When I'm developing, I usually like to use 'manage.py runserver' which automatically reloads the django framework on modifications to my apps.
Is there a way to add a hook to the reloading process that runserver does so that it automatically restarts the celery daemon I have running?
Alternatively, does celery have a similar monitor-and-reload-on-change mode that I should be using for development?
Django-supervisor works very well for this purpose. You can have it start the Django server, Celery, and anything else you need, and have different configurations for development and production servers. It also knows to reload the celery daemon when your code changes.
https://github.com/rfk/django-supervisor
I believe you can set CELERY_ALWAYS_EAGER to true.
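With that Django setting, tasks execute synchronously in the calling process, so there is no separate worker to restart (in newer Celery versions the equivalent lowercase setting is task_always_eager):
# settings.py
CELERY_ALWAYS_EAGER = True  # tasks run inline instead of being sent to a worker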
Yes. Django provides an autoreload hook, which can be used to restart other scripts.
Here is a simple management command that prints a message on reload:
import subprocess

from django.core.management.base import BaseCommand
from django.utils import autoreload

def reload():
    print('Code changed. Auto reloading...')

class Command(BaseCommand):
    def handle(self, *args, **options):
        # autoreload.main() is the pre-Django-2.2 API; newer versions
        # use autoreload.run_with_reloader() instead.
        autoreload.main(reload)
Now you can save this as management/commands/reload.py inside one of your apps and run it with python manage.py reload. A management command to reload celery workers is available here.
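For example, reload() could shell out to restart the workers; a sketch, where the worker name and the app module are assumptions:
def reload():
    print('Code changed. Restarting celery workers...')
    subprocess.call(['celery', 'multi', 'restart', 'worker1', '-A', 'myproject'])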
Celery doesn't have any feature to reload code or to auto restart when the code changes; you have to restart it manually.
There isn't a way to add such a hook, and I don't think it's worthwhile to edit Django's source code just to perform a restart.
Personally, while developing I prefer to watch celery's shell output, which is decorated with color, instead of tailing the logs; it's more readable.
Celery 2.5 has an experimental runtime option --autoreload that could be used for this purpose, too. Here's more detail in the release notes. That being said, I think django-supervisor (via @Lee Semel) looks like the better way of doing things. I thought I would post this alternative here in case other readers do not want to have to configure another app for asynchronous processing.

What is the correct configuration for %autoreload in a Django ipython shell?

Ipython has a plugin called autoreload that will presumably reload all your modules after every command, so you can change the source and not have to quit the shell and reenter all your commands. See http://dsnra.jpl.nasa.gov/software/Python/tips-ipython.html for example.
However, this seems flaky at best when using it with Django, e.g.
python manage.py shell
gives me an IPython shell with Django context, but the autoreloading does NOT seem to work reliably at all.
Here's what I have added to my ipy_user_conf.py file:
def main():
    ...  # rest of the fn here
    import ipy_autoreload
    ip.magic('%autoreload 2')
The autoreloading works in limited cases, maybe 10-20% of the time.
Has anyone successfully configured this to work with Django?
This answer might also be applicable to your situation: Django keeps its own cache of all models, so if you want to reload everything, you have to clear this cache manually.