I am trying to combine a custom Django logging handler with a Celery task to capture certain application log messages and write them to DynamoDB asynchronously. I have created a Celery task that takes a log message and transfers it to DynamoDB, and I tried to call this task from my custom handler.
However, Django custom logger does not allow me to import:
from celery.task import task, Task, PeriodicTask, periodic_task
My server crashes with the below error:
ValueError: Unable to configure handler 'custom_handler': Cannot resolve 'myApp.analytics.tasks.LogHandler': cannot import name cache
I know that the Django logging docs warn against circular imports if the file defining the custom handler imports settings.py, but I have made sure that's not the case. It still gives me the same kind of error as a circular import.
Am I doing something wrong or is there any other way to achieve asynchronous data transfer to DynamoDB using Django custom logger and DjCelery?
Thanks for any help.
I found the solution.
The problem was "If your settings.py specifies a custom handler class and the file defining that class also imports settings.py a circular import will occur."
To resolve this, do the import inside the method body instead of at the top of the file that defines the class.
Here's my custom LogHandler:
    import logging

    # Do not import settings here, as this would lead to a circular import.

    # This custom log handler parses the message and inserts the entry into the DynamoDB tables.
    class LogHandler(logging.Handler):
        def __init__(self):
            logging.Handler.__init__(self)
            self.report_logger = logging.getLogger('reporting')
            self.report_logger.setLevel(logging.INFO)

        def emit(self, record):
            # Submit the task to the "reporting" queue to be picked up and processed by the worker lazily.
            # myApp.analytics.tasks imports celery.task
            from myApp.analytics import tasks
            tasks.push_row_to_dynamodb.apply_async(args=[record])
            return
Hope it helps someone.
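One caveat with the handler above: apply_async(args=[record]) hands Celery a logging.LogRecord, and the JSON serializer (Celery's default since 4.0) cannot encode that object. A minimal sketch, assuming you are free to change what the task receives, is to flatten the record into plain types first. The `enqueue` callable and the `QueueingLogHandler` name are hypothetical; `enqueue` stands in for something like tasks.push_row_to_dynamodb.delay so that the snippet runs without Celery or Django.

```python
import json
import logging


def record_to_payload(record):
    """Flatten a LogRecord into JSON-friendly built-in types."""
    return {
        "logger": record.name,
        "level": record.levelname,
        "message": record.getMessage(),
        "created": record.created,
    }


class QueueingLogHandler(logging.Handler):
    """Hypothetical handler: passes each record to `enqueue` as a plain dict."""

    def __init__(self, enqueue):
        super().__init__()
        # In the real handler the task would be imported lazily inside emit()
        # (from myApp.analytics import tasks) to avoid the circular import.
        self._enqueue = enqueue

    def emit(self, record):
        try:
            payload = record_to_payload(record)
            json.dumps(payload)  # fail early if something is not serializable
            self._enqueue(payload)
        except Exception:
            self.handleError(record)


# Demo: collect payloads in a list instead of sending them to a queue.
captured = []
demo_logger = logging.getLogger("reporting.demo")
demo_logger.setLevel(logging.INFO)
demo_logger.propagate = False
demo_logger.addHandler(QueueingLogHandler(captured.append))
demo_logger.info("user %s signed up", "alice")
```

The worker then receives a small dict it can write to DynamoDB directly, and nothing in the message depends on pickling.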
Related
Normally, we deploy Celery for asynchronous work. Can Celery be used for asynchronous file uploading, so that the client can keep working on the website while a large file is being uploaded? I passed the form to a Celery task and got an error like 'Object of type module is not JSON serializable'. Is there any way to do async file uploading?
I'm pretty sure it's not possible. What you need to do is something more like opening a popup page and doing the job inside it.
'Object of type module is not JSON serializable'
One Celery best practice is to store the data in the database (especially when there is a lot of it) and create the Celery task with IDs only. The objects you pass to Celery need to be JSON-serializable.
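The error can be reproduced with the standard json module alone, which is roughly what happens to task arguments under a JSON task serializer. The sketch below is only an illustration: `UPLOADS` is a hypothetical in-memory stand-in for a database table, and `save_upload`, `serialize_task_args`, and `process_upload` are made-up names, not Celery APIs.

```python
import json
import os

# Hypothetical stand-in for a database table of saved uploads.
UPLOADS = {}


def save_upload(filename, content):
    """Persist the upload and return its primary key."""
    upload_id = len(UPLOADS) + 1
    UPLOADS[upload_id] = {"filename": filename, "content": content}
    return upload_id


def serialize_task_args(*args):
    """Mimic what a JSON task serializer does to positional arguments."""
    return json.dumps(list(args))


def process_upload(upload_id):
    """Hypothetical task body: reload the data by ID inside the worker."""
    upload = UPLOADS[upload_id]
    return len(upload["content"])


# Passing a rich object (here a module) fails before anything is queued.
try:
    serialize_task_args(os)
    module_error = None
except TypeError as exc:
    module_error = str(exc)

# Passing only the ID works; the worker looks the row up again.
upload_id = save_upload("report.csv", "a,b,c\n1,2,3\n")
message = serialize_task_args(upload_id)
processed_size = process_upload(*json.loads(message))
```

In a real project the view would save the uploaded file through the model first and then queue the task with the new row's primary key.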
I have a set of features that use Django's management/commands modules to run a bunch of cron jobs that update the models. However, I also need these to execute as all-or-none transactions. Does Django provide a way to define transactions?
If you're trying to wrap a chunk of code in a transaction you can use transaction.atomic as a decorator or context manager, e.g.,
    from django.db import transaction

    @transaction.atomic
    def management_command(args):
        # This code executes inside a transaction.
        do_stuff()

or

    def management_command(args):
        # This code executes in autocommit mode (Django's default).
        do_stuff()
        with transaction.atomic():
            # This code executes inside a transaction.
            do_more_stuff()
See https://docs.djangoproject.com/en/2.2/topics/db/transactions/#controlling-transactions-explicitly for more details.
I'm using django-celery-beat in a Django app (this stores the schedule in the database instead of a local file). I've defined my schedule in a celery_beat dictionary that Celery is initialized with via app.config_from_object(...)
I recently renamed/removed a few tasks and restarted the app. The new tasks showed up, but the tasks removed from the celery_beat dictionary didn't get removed from the database.
Is this expected workflow -- requiring manual removal of tasks from the database? Is there a workaround to automatically reconcile the schedule at Django startup?
I tried a PeriodicTask.objects.all().delete() in celery/__init__.py
    def _clean_schedule():
        from django.db import transaction
        from django_celery_beat.models import PeriodicTask
        from django_celery_beat.models import PeriodicTasks
        with transaction.atomic():
            PeriodicTask.objects.\
                exclude(task__startswith='celery.').\
                exclude(name__in=settings.CELERY_CONFIG.celery_beat.keys()).\
                delete()
            PeriodicTasks.update_changed()

    _clean_schedule()
but that is not allowed because Django isn't properly started up yet:
django.core.exceptions.AppRegistryNotReady: Apps aren't loaded yet.
You also can't use Django's AppConfig.ready() because making queries / db connections in ready() is not supported.
Looking at how django-celery-beat actually installs the schedules, I thought maybe I could hook into that process.
It doesn't happen when Django starts -- it happens when beat starts. It calls setup_schedule() against the class passed on the beat command line.
Therefore, we can just override the scheduler with
--scheduler=myproject.lib.scheduler:DatabaseSchedulerWithCleanup
to do cleanup:
    import logging

    from django_celery_beat.models import PeriodicTask
    from django_celery_beat.models import PeriodicTasks
    from django_celery_beat.schedulers import DatabaseScheduler
    from django.db import transaction

    class DatabaseSchedulerWithCleanup(DatabaseScheduler):

        def setup_schedule(self):
            schedule = self.app.conf.beat_schedule
            with transaction.atomic():
                # Delete rows that are neither built-in Celery entries nor in the configured schedule.
                num, info = PeriodicTask.objects.\
                    exclude(task__startswith='celery.').\
                    exclude(name__in=schedule.keys()).\
                    delete()
                logging.info("Removed %d obsolete periodic tasks.", num)
                if num > 0:
                    PeriodicTasks.update_changed()
            super(DatabaseSchedulerWithCleanup, self).setup_schedule()
Note that you only want this if you manage tasks exclusively with beat_schedule. If you add tasks via the Django admin or programmatically, they will also be deleted.
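To see which rows that query would remove, here is a plain-Python mirror of the two exclude() calls. It is only an illustration: `obsolete_task_names` and the sample task names are hypothetical, and the real cleanup runs through the ORM as shown above.

```python
def obsolete_task_names(stored_tasks, beat_schedule):
    """Return names of stored periodic tasks the cleanup would delete.

    stored_tasks: iterable of (name, task) pairs, mirroring PeriodicTask rows.
    beat_schedule: dict keyed by entry name, like app.conf.beat_schedule.
    """
    return [
        name
        for name, task in stored_tasks
        if not task.startswith("celery.")  # keep Celery's built-in entries
        and name not in beat_schedule      # keep entries still configured
    ]


# Hypothetical database contents.
stored = [
    ("celery.backend_cleanup", "celery.backend_cleanup"),
    ("nightly-report", "myproject.tasks.nightly_report"),
    ("old-sync", "myproject.tasks.old_sync"),
    ("added-in-admin", "myproject.tasks.adhoc"),
]
# Hypothetical beat_schedule with a single remaining entry.
schedule = {"nightly-report": {"task": "myproject.tasks.nightly_report"}}

to_delete = obsolete_task_names(stored, schedule)
```

Here "added-in-admin" ends up in the delete list alongside "old-sync", which is exactly the caveat above: anything not present in beat_schedule is treated as obsolete.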
I store rows in a database table, and the table has a field named expire_time. Once the current time passes expire_time, I want the row to be deleted.
One way is to query the table every time, go through every row, and delete the ones that have expired.
But if I never query the table, that approach cannot meet the requirement.
So, is there a way to do this automatically?
I use Python and Django; the database is MariaDB.
You can write a custom management command to do this for you. Save this in myapp/management/commands/delete_expired.py for example:
    from django.core.management.base import BaseCommand
    from django.utils import timezone
    from myapp.models import MyModel

    class Command(BaseCommand):
        help = 'Deletes expired rows'

        def handle(self, *args, **options):
            now = timezone.now()
            MyModel.objects.filter(expire_time__lt=now).delete()
Then either call that command from a cron task or a queue. To do it on the command line you can call:
python manage.py delete_expired
I am not sure what you mean by:
I can not realize the requirement.
But I think you might want to consider one of these:
a custom manage.py command, run from cron with your venv's Python
django-cron, to routinely check for expired data and delete it
Celery, as an alternative to cron, though it could be too complicated for your case
a MariaDB event, scheduled to run periodically
The drawback of the custom manage.py command and the event is that if you migrate servers, you have to remember to add the cron job/event again on the new one to clean the db periodically.
I don't know a database-level approach to do that (maybe you want to add the mariadb tag if you are looking for a database-specific solution).
At the application level, one approach comes to mind. You may use Celery and, whenever you store a row, schedule a task to delete it. The Celery task should check that expire_time has actually passed before deleting (can that field be modified or updated?).
You can also (in addition or as an alternative) have a Celery beat job that periodically fetches the row with the smallest expire_time. If it should be removed, the job removes it and calls itself again. Otherwise, it waits for the next beat.
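The re-check inside the task is the important part, since expire_time may have changed between scheduling and execution. A minimal sketch of that logic, using a hypothetical in-memory `ROWS` dict in place of the model and a made-up `delete_if_expired` function in place of the Celery task:

```python
from datetime import datetime, timedelta, timezone

# Hypothetical in-memory table: primary key -> row with an expire_time.
ROWS = {}


def delete_if_expired(row_id, now=None):
    """Task body sketch: delete the row only if it has really expired.

    Returns True when the row was deleted.
    """
    now = now or datetime.now(timezone.utc)
    row = ROWS.get(row_id)
    if row is None:
        return False  # already deleted elsewhere
    if row["expire_time"] > now:
        return False  # expiry was extended after the task was scheduled
    del ROWS[row_id]
    return True


start = datetime(2024, 1, 1, tzinfo=timezone.utc)
ROWS[1] = {"expire_time": start + timedelta(minutes=5)}
ROWS[2] = {"expire_time": start + timedelta(minutes=5)}
# Row 2 has its expiry pushed back after its delete task was scheduled.
ROWS[2]["expire_time"] = start + timedelta(hours=1)

run_at = start + timedelta(minutes=5)
deleted_1 = delete_if_expired(1, now=run_at)
deleted_2 = delete_if_expired(2, now=run_at)
```

With Celery, the equivalent task could be queued with apply_async(args=[row_id], eta=expire_time), and a task that finds the expiry extended could simply reschedule itself for the new time.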
Every now and then, you have the need to rename a model in Django (or, in one recent case I encountered, split one model into two, with new/different names). (Yes, proper planning helps to avoid this situation).
After renaming the corresponding tables in the db and fixing the affected code, one problem remains: any permissions granted to Users or Groups to operate on those models still reference the old model names. Is there any automated or semi-automated way to fix this, or is it just a matter of manual db surgery? (In development you can drop the auth_permission table and run syncdb to recreate it, but production isn't so simple.)
Here's a snippet that fills in missing contenttypes and permissions. I wonder if it could be extended to at least do some of the donkey work for cleaning up auth_permission.
If you happened to have used a South schema migration to rename the table, the following line in the forward migration would have done this automatically:
db.send_create_signal('appname', ['modelname'])
I got about half-way through a long answer that detailed the plan of attack I would take in this situation, but as I was writing I realized there probably isn't any way around having to do a maintenance downtime in this situation.
You can minimize the downtime by having a prepared loaddata script, of course, although care needs to be taken to make sure the auth_permission primary keys stay in sync.
Short answer: there is no automated way to do this that I'm aware of.
I recently had this issue and wrote a function to solve it. You'll typically have a discrepancy in both the ContentType and Permission tables if you rename a model/table. Django has built-in helper functions to resolve the issue, and you can use them as follows:
    # Note: this targets older Django versions; get_apps() was removed in Django 1.9.
    from django.contrib.auth.management import create_permissions
    from django.contrib.contenttypes.management import update_all_contenttypes
    from django.db.models import get_apps

    def update_all_content_types_and_permissions():
        for app in get_apps():
            create_permissions(app, None, 2)
        update_all_contenttypes()
I changed verbose names in my application, and in Django 2.2.7 this is the only way I found to fix permissions:
    from django.core.management.base import BaseCommand, CommandError
    from django.contrib.auth.models import Permission

    class Command(BaseCommand):
        help = 'Fixes permissions names'

        def handle(self, *args, **options):
            for p in Permission.objects.filter(content_type__app_label="your_app_label_here"):
                p.name = "Can %s %s" % (p.codename.split('_')[0], p.content_type.model_class()._meta.verbose_name)
                p.save()
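One thing to watch with this command: it rebuilds every name from the first segment of the codename, so it suits the default add/change/delete/view permissions but will also rewrite any custom permission in the app. The string logic on its own, as a small sketch (`permission_name` and the sample codenames are hypothetical):

```python
def permission_name(codename, verbose_name):
    """Rebuild a permission's display name the way the command above does."""
    action = codename.split("_")[0]
    return "Can %s %s" % (action, verbose_name)


# A default-style codename gives the expected result.
default_name = permission_name("change_blogpost", "blog post")

# A custom codename that does not follow <action>_<model> gets mangled.
custom_name = permission_name("can_publish", "blog post")
```

If the app defines custom permissions, filtering the queryset to codenames that start with add_, change_, delete_, or view_ would leave those alone.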