Wagtail "schedule_published_pages" Management Command - django

I was wondering why my scheduled posts were not working automatically in Wagtail, but I see in the documentation that a management command is needed to make this happen. I am unfamiliar with writing custom management commands, and I am wondering how to make the python manage.py publish_scheduled_pages command fire off automatically every hour?
Where would this code go in the document tree? Is there code that I just need to drop in and it runs from there? Or is something required on the server to run these commands on a schedule?
Any help would be appreciated. I couldn't find any existing code anywhere for this functionality in Wagtail and I'm wondering why the button is in the admin to schedule a post, but the functionality is not already built in?

You are probably familiar with management commands since python manage.py runserver and makemigrations and migrate are management commands.
You can see all available commands with python manage.py -h
publish_scheduled_pages should be called periodically. Form the Wagtail docs:
This command publishes, updates or unpublishes pages that have had these actions scheduled by an editor. We recommend running this command once an hour.
Periodically executing a command can be done in various ways. Via crontab is probably the most common. To edit the crontab:
$ crontab -e
Add (for every fist minute of the hour):
0 * * * * python /path/to/your/manage.py publish_scheduled_pages --settings=your.settings

Related

Deploying Django to production correct way to do it?

I am developing Django Wagtail application on my local machine connected to a local postgres server.
I have a test server and a production server.
However when I develop locally and try upload it there is always some issue with makemigration and migrate e.g. KeyError etc.
What are the best practices of ensuring I do not get run into these issues? What files do I need to port across?
so ill tell you what i do and what most of the companies that i worked as a django developer did and i can tell you by experience that worked pretty well.
First containerize your application, this will make your life much more easy and you will remove external influence in your code, also will get you an easy way to reproduce your environment.
Your Dockerfile should be from some python image and should do 3 basically things:
Install your requirements dependencies
Run the python manage.py migrate --noinput command
Run a http server such as gunicorn with gunicorn -c /gunicorn.py wsgi:application
You ill do the makemigration in your local machine and make sure that everything is working before commit then to the repo.
In your gunicorn.py you ill put your settings to run the app such as the number of CPU, the binding port, the folder that your app is, something like:
import os
import multiprocessing
# Chdir to specified directory before apps loading.
# https://docs.gunicorn.org/en/stable/settings.html#chdir
chdir = '/app/'
# Bind the application on localhost both on ipv6 and ipv4 interfaces.
# https://docs.gunicorn.org/en/stable/settings.html#bind
bind = '0.0.0.0:8000'
Second containerize your other stuff, for example the postgres database, the redis (for cache), a connection pooler for the database depending on the size of your application.
Its highly recommend that you have a step in the pipeline to do tests, they need to run before everything, maybe just after lint
Ok what now? now you need a way to deploy that stuff, the best for that scenario is: pull your image to github registry, and you can add a tag to that for example:
IMAGE_ID=ghcr.io/${{ github.repository_owner }}/$IMAGE_NAME
# Change all uppercase to lowercase
IMAGE_ID=$(echo $IMAGE_ID | tr '[A-Z]' '[a-z]')
docker tag $IMAGE_NAME $IMAGE_ID:staging
docker push $IMAGE_ID:staging
This can be add in a github action in the build step for example.
After having your new code in a new image inside github you just need to update the current one, this can be done by creaaing a script to do it in the server and running that script from github action, is something like:
docker pull ghcr.io/${{ github.repository_owner }}/$IMAGE_NAME
echo 'Restarting Application...'
docker stop {YOUR_CONTAINER} && docker up -d
sudo systemctl restart nginx
echo 'Cleaning old images'
sudo docker system prune -af
You can see that i create the image with a staging tag, you can create a rule in github actions for example to trigger that action when you create a new release for example, and create another action to be trigger in every new commit and build/deploy for a dev tag.
For the migration problem, the first thing is, when your application go live squash every migration to the first one (you can drop the database and all the migration then create the database and run the makemigration command again to reach this), so you can have a clean migration in the server. Never creates unnecessary relation between the tables, prefer always doing cached properties instead of add new columns, use UUID for unique ids, and try to not do breaking changes in the database, its hard but if you plan the database before is not so difficult to do.
Hit me if you have any questions. A lot of the stuff that i said can be done in a lot of other platforms such as gitlab, travis, circle ci, but i use the github action in the example because i think is more simple to picture.
EDIT:
I forgot to tell you to have a cron in your server doing backups of your databases, the migrate command ill apply the changes only after the verification but if something else break the database this can save your life.

How to have a Django app running on Heroku doing a scheduled job using Heroku Scheduler

I am developing a Django app running on Heroku.
In order to update my database with some data coming from a certain API service, I need to periodically run a certain script (let's say myscript).
How can I use Heroku Scheduler to do it?
As already explained here, quick and simple way to answer this question is asking yourself how would you do to run that script periodically, as if you were the scheduler yourself.
Now, the best way to run a script in your Django app at any moment, it is to create a custom management command and to run it from your command prompt when you need it, like this:
python manage.py some_custom_command
Then, if you were the scheduler, you would run that command from your command prompt at every time written in the schedule.
So, a good idea would be to make Heroku Scheduler behave the same. Thus, the aim here is to have Heroku Scheduler run python manage.py some_custom_command at scheduled times.
Here is how you can do it:
In your_app directory, create a folder management and then inside it create another folder commands and finally, inside it, create a file some_custom_command.py
So, just to be clear
your_app/management/commands/some_custom_command.py
Then, inside some_custom_command.py insert:
from django.core.management.base import BaseCommand
from your_app.path_to_myscript_file import myscript
class Command(BaseCommand):
def handle(self, *args, **options):
# Put here some script to get the data from api service and store it into your models.
myscript()
Then go on Heroku > your_app > resources
In add-ons section select Heroku Scheduler, click on it so that its window opens, then click on add job, select the time you want, insert the command python manage.py some_custom_command and save.

How to use django 3.0 ORM in a Jupyter Notebook without triggering the async context check?

Django 3.0 is adding asgi / async support and with it a guard around making synchronous requests in an async context. Concurrently, IPython just added top level async/await support, which seems to be running the whole interpreter session inside of a default event loop.
Unfortunately the combination of these two great addition means that any django ORM operation in a jupyter notebook causes a SynchronousOnlyOperation exception:
SynchronousOnlyOperation: You cannot call this from an async context - use a thread or sync_to_async.
As the exception message says, it's possible to wrap each ORM call in a sync_to_async() like:
images = await sync_to_async(Image.objects.all)()
but it's not very convenient, especially for related fields which would usually be implicitly resolved on attribute lookup.
(I tried %autoawait off magic but it didn't work, from a quick glance at the docs I'm assuming it's because ipykernels always run in an asyncio loop)
So is there a way to either disable the sync in async context check in django or run an ipykernel in a synchronous context?
For context: I wrote a data science package that uses django as a backend server but also exposes a jupyter based interface on top of the ORM that allows you to clean/annotate data, track machine learning experiments and run training jobs all in a jupyter notebook.
It works for me
os.environ["DJANGO_ALLOW_ASYNC_UNSAFE"] = "true"
BTW, I start my notebook using the command
./manage.py shell_plus --notebook
Contrary to other answers I'd suggest just running from the shell as:
env DJANGO_ALLOW_ASYNC_UNSAFE=true ./manage.py shell_plus --notebook
and not modifying any config files or startup scripts.
The advantage of doing it like this is that these checks still seem useful to have enabled almost everywhere else (e.g. when debugging locally via runserver or when running tests). Disabling via files would easily disable these in too many places negating their advantage.
Note that most shells provide easy ways of recalling previously invoked command lines, e.g. in Bash or Zsh Ctrl+R followed by notebook would find the last time you ran something that had "notebook" in. Under the fish shell just type notebook and press the up arrow key to start a reverse search.
For now I plan on just using a forked version of django with a new setting to skip the async_unsafe check. Once the ORM gets async support I'll probably have to rewrite my project to support it and drop the flag.
EDIT: there's now a PR to add an env variable (DJANGO_ALLOW_ASYNC_UNSAFE) to disable the check (https://github.com/django/django/pull/12172)
I added
os.environ["DJANGO_ALLOW_ASYNC_UNSAFE"] = "true" code in the setting.py of your project all the way down to the file and then
command python3 manage.py shell_plus --notebook
if you are using python3 or use python
That's it. worked for me.

How to hook a debugger to a python code running via django crontab?

I have a Django based web application, some functionality of the application is scheduled to be run as a part of cron jobs using django-crontab. I want to hook a debugger so that I can inspect some odd behaviours of my code. I normally use visual studio code. Is it possible to hook a debugger, since cron jobs basically run independently apart from server?
You can put a breaking point debugger in the code using pdb or ipdb. Like this:
def some_function():
# some code
import pdb;pdb.set_trace() # or use ipdb
# rest of the code
Then in shell, run python manage.py crontab show to show cronjobs with ids, then run python manage.py crontab run <id>. It will hit the debugger, then you will hit the breaking point. Thus you can use debugger here.

Cron Job that access django Models

What i have done?
I have written a small application in django with few models with sqlite3 as backend. Now i want to write a python code that clears the database elements based on certain condition.
Question:
How can i achieve the above requirement?
I think the cleanest way to do this is to write your own django-admin command.
You can then run the command using manage.py:
python manage.py your_command
Having a command that can be run in shell, you can easily put in into your crontab. Optionally django-admin commands can receive command line arguments, if needed.