How to hook a debugger to Python code running via django-crontab?

I have a Django-based web application, and some of its functionality is scheduled to run as cron jobs using django-crontab. I want to hook up a debugger so that I can inspect some odd behaviour in my code. I normally use Visual Studio Code. Is it possible to attach a debugger, given that cron jobs run independently of the server?

You can set a breakpoint in the code using pdb or ipdb, like this:
def some_function():
    # some code
    import pdb; pdb.set_trace()  # or use ipdb
    # rest of the code
Then, in a shell, run python manage.py crontab show to list the cron jobs with their ids, and run python manage.py crontab run <id> to execute the one you are interested in. When execution reaches set_trace(), it will stop at the breakpoint and drop you into the debugger, where you can inspect the code.
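Since the question mentions Visual Studio Code specifically, another option is to attach the VS Code debugger to the cron job's process with debugpy. This is only a rough sketch, assuming debugpy is installed and port 5678 is free:
import debugpy

def some_function():
    # Start a debug server in the cron job's process and block until VS Code
    # attaches (Run and Debug > "Remote Attach" on localhost:5678).
    debugpy.listen(("localhost", 5678))
    debugpy.wait_for_client()
    debugpy.breakpoint()  # execution pauses here once the IDE is attached
    # rest of the code
Then trigger the job with python manage.py crontab run <id> as above and attach from VS Code while it is waiting.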

Related

How to have a Django app running on Heroku doing a scheduled job using Heroku Scheduler

I am developing a Django app running on Heroku.
In order to update my database with some data coming from a certain API service, I need to periodically run a certain script (let's say myscript).
How can I use Heroku Scheduler to do it?
As already explained here, a quick and simple way to answer this question is to ask yourself how you would run that script periodically if you were the scheduler yourself.
Now, the best way to run a script in your Django app at any moment is to create a custom management command and run it from your command prompt when you need it, like this:
python manage.py some_custom_command
Then, if you were the scheduler, you would run that command from your command prompt at every time written in the schedule.
So, a good idea would be to make Heroku Scheduler behave the same. Thus, the aim here is to have Heroku Scheduler run python manage.py some_custom_command at scheduled times.
Here is how you can do it:
In the your_app directory, create a folder named management, inside it another folder named commands, and finally, inside that, a file named some_custom_command.py.
So, just to be clear
your_app/management/commands/some_custom_command.py
Then, inside some_custom_command.py insert:
from django.core.management.base import BaseCommand
from your_app.path_to_myscript_file import myscript


class Command(BaseCommand):
    def handle(self, *args, **options):
        # Put here the code that gets the data from the API service and stores it in your models.
        myscript()
Then go to Heroku > your_app > Resources.
In the Add-ons section, select Heroku Scheduler and click on it so that its window opens. Click Add Job, select the schedule you want, enter the command python manage.py some_custom_command, and save.
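For completeness, here is a minimal sketch of what myscript could look like. The endpoint URL and the Item model are placeholders, not part of the original question:
import requests
from your_app.models import Item  # hypothetical model

API_URL = "https://api.example.com/items"  # placeholder endpoint

def myscript():
    # Fetch the latest records from the external API service.
    response = requests.get(API_URL, timeout=30)
    response.raise_for_status()
    # Store (or update) each record in the database.
    for record in response.json():
        Item.objects.update_or_create(
            external_id=record["id"],
            defaults={"name": record["name"]},
        )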

How to use django 3.0 ORM in a Jupyter Notebook without triggering the async context check?

Django 3.0 adds ASGI/async support and, with it, a guard against making synchronous requests from an async context. Concurrently, IPython recently added top-level async/await support, which appears to run the whole interpreter session inside a default event loop.
Unfortunately, the combination of these two great additions means that any Django ORM operation in a Jupyter notebook raises a SynchronousOnlyOperation exception:
SynchronousOnlyOperation: You cannot call this from an async context - use a thread or sync_to_async.
As the exception message says, it's possible to wrap each ORM call in a sync_to_async() like:
images = await sync_to_async(Image.objects.all)()
but it's not very convenient, especially for related fields which would usually be implicitly resolved on attribute lookup.
(I tried the %autoawait off magic, but it didn't work; from a quick glance at the docs I'm assuming that's because ipykernel always runs in an asyncio loop.)
So is there a way to either disable the sync in async context check in django or run an ipykernel in a synchronous context?
For context: I wrote a data science package that uses django as a backend server but also exposes a jupyter based interface on top of the ORM that allows you to clean/annotate data, track machine learning experiments and run training jobs all in a jupyter notebook.
This works for me:
import os
os.environ["DJANGO_ALLOW_ASYNC_UNSAFE"] = "true"
BTW, I start my notebook using the command
./manage.py shell_plus --notebook
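If you would rather not touch settings.py or your shell configuration, the variable can also be set in the first cell of the notebook itself, since the check appears to read the environment variable at call time. A minimal sketch, with Image standing in for one of your own models:
import os
os.environ["DJANGO_ALLOW_ASYNC_UNSAFE"] = "true"  # must run before any ORM call

from your_app.models import Image  # hypothetical model, adjust to your project
images = list(Image.objects.all())  # no SynchronousOnlyOperation raised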
Contrary to other answers I'd suggest just running from the shell as:
env DJANGO_ALLOW_ASYNC_UNSAFE=true ./manage.py shell_plus --notebook
and not modifying any config files or startup scripts.
The advantage of doing it like this is that these checks still seem useful to have enabled almost everywhere else (e.g. when debugging locally via runserver or when running tests). Disabling them via config files would easily disable them in too many places, negating their advantage.
Note that most shells provide easy ways of recalling previously invoked command lines, e.g. in Bash or Zsh Ctrl+R followed by notebook would find the last time you ran something that had "notebook" in. Under the fish shell just type notebook and press the up arrow key to start a reverse search.
For now I plan on just using a forked version of django with a new setting to skip the async_unsafe check. Once the ORM gets async support I'll probably have to rewrite my project to support it and drop the flag.
EDIT: there's now a PR to add an env variable (DJANGO_ALLOW_ASYNC_UNSAFE) to disable the check (https://github.com/django/django/pull/12172)
I added
os.environ["DJANGO_ALLOW_ASYNC_UNSAFE"] = "true"
at the very bottom of my project's settings.py and then ran
python3 manage.py shell_plus --notebook
(use python instead of python3 depending on your installation). That's it; it worked for me.

Wagtail "schedule_published_pages" Management Command

I was wondering why my scheduled posts were not working automatically in Wagtail, but I see in the documentation that a management command is needed to make this happen. I am unfamiliar with writing custom management commands, and I am wondering how to make the python manage.py publish_scheduled_pages command fire off automatically every hour?
Where would this code go in the document tree? Is there code that I just need to drop in and it runs from there? Or is something required on the server to run these commands on a schedule?
Any help would be appreciated. I couldn't find any existing code anywhere for this functionality in Wagtail and I'm wondering why the button is in the admin to schedule a post, but the functionality is not already built in?
You are probably already familiar with management commands: python manage.py runserver, makemigrations, and migrate are all management commands.
You can see all available commands with python manage.py -h
publish_scheduled_pages should be called periodically. From the Wagtail docs:
This command publishes, updates or unpublishes pages that have had these actions scheduled by an editor. We recommend running this command once an hour.
Periodically executing a command can be done in various ways. Via crontab is probably the most common. To edit the crontab:
$ crontab -e
Add the following line (to run at the first minute of every hour):
0 * * * * python /path/to/your/manage.py publish_scheduled_pages --settings=your.settings
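Note that cron runs with a minimal environment, so in practice you will usually want the full path to your virtualenv's Python interpreter and some logging. A variant of the same entry, with all paths as placeholders:
0 * * * * /path/to/venv/bin/python /path/to/your/manage.py publish_scheduled_pages --settings=your.settings >> /var/log/wagtail_publish.log 2>&1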

Does django's runserver option provide a hook for running other restart scripts?

I've recently been playing around with django and celery. One annoying thing during development is the fact that I have to restart the celery daemon each time I modify a task. When I'm developing, I usually like to use 'manage.py runserver' which automatically reloads the django framework on modifications to my apps.
Is there a way to add a hook to the reloading process that runserver does so that it automatically restarts the celery daemon I have running?
Alternatively, does celery have a similar monitor-and-reload-on-change mode that I should be using for development?
Django-supervisor works very well for this purpose. You can have it start the Django server, Celery, and anything else you need, and have different configurations for development and production servers. It also knows to reload the celery daemon when your code changes.
https://github.com/rfk/django-supervisor
I believe you can set CELERY_ALWAYS_EAGER to true.
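With that setting, Celery executes tasks synchronously in the calling process instead of sending them to a worker, so during development changed task code is picked up by runserver's reloader. A minimal sketch (newer Celery versions use the lowercase task_always_eager key for the same behaviour):
# settings.py -- development only: tasks run inline instead of being queued
CELERY_ALWAYS_EAGER = True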
Yes. Django provides an auto-reload hook, which can be used to restart other scripts.
Here is a simple management command that prints a message on reload:
import subprocess  # can be used inside reload() to restart other processes (see the Celery sketch below)
from django.core.management.base import BaseCommand
from django.utils import autoreload


def reload():
    print('Code changed. Auto reloading...')


class Command(BaseCommand):
    def handle(self, *args, **options):
        # Note: in newer Django (2.2+), autoreload.main was removed;
        # use autoreload.run_with_reloader(reload) there instead.
        autoreload.main(reload)
Save it as management/commands/reload.py inside one of your apps and run it with python manage.py reload. A management command to reload celery workers is available here.
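To adapt this to the Celery use case, the reload callback could restart the worker via subprocess. This is only a rough sketch; your_project stands in for your actual Celery app module, and pkill is a blunt way to stop the previous worker:
import subprocess

def reload():
    print('Code changed. Restarting Celery worker...')
    # Stop any worker started on a previous reload, then launch a fresh one
    # so that it picks up the modified task code.
    subprocess.run(['pkill', '-f', 'celery.+worker'])
    subprocess.Popen(['celery', '-A', 'your_project', 'worker', '--loglevel=info'])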
Celery doesn't have a feature for reloading code or auto-restarting when the code changes, so you have to restart it manually.
There isn't a built-in way to add such a hook, and I don't think it's worth editing Django's source code just to perform a restart.
Personally, while developing I prefer to watch Celery's colorized shell output rather than tailing the logs; it's more readable.
Celery 2.5 has an experimental runtime option --autoreload that could be used for this purpose, too. Here's more detail in the release notes. That being said, I think django-supervisor (via @Lee Semel) looks like the better way of doing things. I thought I would post this alternative here in case other readers do not want to configure another app for asynchronous processing.

What is the correct configuration for %autoreload in a Django ipython shell?

IPython has an extension called autoreload that will presumably reload all your modules after every command, so you can change the source and not have to quit the shell and re-enter all your commands. See http://dsnra.jpl.nasa.gov/software/Python/tips-ipython.html for example.
However, this seems flaky at best when using it with Django, e.g.
python manage.py shell
gives me an IPython shell with Django context, but the autoreloading does NOT seem to work reliably at all.
Here's what I have added to my ipy_user_conf.py file:
def main():
    ...  # rest of the fn here
    import ipy_autoreload
    ip.magic('%autoreload 2')
The autoreloading works in limited cases, maybe 10-20% of the time.
Has anyone successfully configured this to work with Django?
This answer might also be applicable to your situation. Django keeps its own cache of all models, so if you want to reload everything, you have to clean this cache manually.
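For reference, recent IPython versions ship autoreload as a built-in extension, so the same behaviour can usually be enabled directly inside the Django shell session instead of through ipy_user_conf.py:
# Inside the IPython session started by `python manage.py shell`
%load_ext autoreload
%autoreload 2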