sending periodic emails over django using celery tasks - django

I have a Group model:
class Group(models.Model):
    leader = models.ForeignKey(User, on_delete=models.CASCADE)
    name = models.CharField(max_length=55)
    description = models.TextField()
    joined = models.ManyToManyField(User, blank=True)
    start_time = models.TimeField(null=True)
    end_time = models.TimeField(null=True)
    email_list = ArrayField(
        models.CharField(max_length=255, blank=True),
        blank=True,
        default=list,
    )
and I want to send an email to all Users who have joined a particular Group 30 minutes before the start_time. For example: if a Group has a start_time of 1:00 PM, I want to send an email to all the joined Users at 12:30 PM, letting them know the group will be meeting soon.
I currently have a bunch of celery tasks that run without error, but they are all called within views by the User (creating, updating, joining, leaving, and deleting groups will trigger a celery task to send an email notification to the User).
The scheduled email I am trying to accomplish here will be a periodic task, I assume, and not in the control of the User. However, it isn't like other periodic tasks I've seen because the time it relies on is based on the start_time of a specific Group.
@Brian in the comments pointed out that it can be a regular celery task that is called by the periodic task every minute. Here's my celery task:
from celery import shared_task
from celery.utils.log import get_task_logger
from django.core.mail import send_mail
from my_chaburah.settings import NOTIFICATION_EMAIL
from django.template.loader import render_to_string
# The logger must be defined before the task uses it.
logger = get_task_logger(__name__)
@shared_task(name='start_group_notification_task')
def start_group_notification_task(recipients):
    logger.info('sent email to whole group that group is starting')
    # One message per recipient, so addresses are not shared between members.
    for recipient in recipients:
        send_mail(
            'group starting',
            'group starting',
            NOTIFICATION_EMAIL,
            [recipient],
            fail_silently=False
        )
I'm still not sure exactly how to call this task using a periodic task or how to query my groups and find when groups start_time == now + 30mins. I've read the docs, but I'm new to celery and celery beat and a bit confused by how to move forward.
I'm also not sure where exactly to call the task.
my myapp/celery.py file:
import os
from celery import Celery
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'my_group.settings')
app = Celery('my_group')
app.config_from_object('django.conf:settings', namespace='CELERY')
app.autodiscover_tasks()
@app.task(bind=True, ignore_result=True)
def debug_task(self):
    print(f'Request: {self.request!r}')
my group/tasks.py file:
from celery import shared_task
from celery.utils.log import get_task_logger
from django.core.mail import send_mail
from my_chaburah.settings import NOTIFICATION_EMAIL
from django.template.loader import render_to_string
logger = get_task_logger(__name__)
I have a bunch of tasks that I didn't include, but I'm assuming any task regarding my Group model would go here. Still not sure though.
I'd also like to let the leader of the Group set how long before start_time the email is sent (for example 10 minutes, 30 minutes, or 1 hour before the meeting), but that has more to do with the model.

You can follow the steps here to configure celery for periodic tasks.
In line with that you can do something roughly similar to this:
import datetime
from celery import Celery
from django.core.mail import send_mail
from my_chaburah.settings import NOTIFICATION_EMAIL
from myapp.models import Group
app = Celery()
@app.on_after_configure.connect
def setup_periodic_tasks(sender, **kwargs):
    # Call send_reminders() every 60 seconds; add_periodic_task expects a signature, hence .s().
    sender.add_periodic_task(60.0, send_reminders.s(), name='check reminders to be sent every minute')
@app.task
def send_reminders():
    # Find all groups that start 30 minutes from now and queue one email task per member.
    thirty_minutes_from_now = datetime.datetime.now() + datetime.timedelta(minutes=30)
    groups = Group.objects.filter(
        start_time__hour=thirty_minutes_from_now.hour,
        start_time__minute=thirty_minutes_from_now.minute
    ).prefetch_related("joined")
    for group in groups:
        for member in group.joined.all():
            send_email_task.delay(member.email)
@app.task
def send_email_task(recipient):
    # Celery task that sends a single email.
    send_mail(
        'group starting',
        'group starting',
        NOTIFICATION_EMAIL,
        [recipient],
        fail_silently=False
    )
Disclaimer: This is not tested and optimised ;)

Why don't periodic tasks work?
There is a bug in Celery <= 5.2.7 (the stable version) that prevents periodic tasks from working.
I fixed it in this PR; you can patch your Celery source code the same way, or try the Celery dev version.
Solution 1
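# your_app/task.py
from datetime import date, datetime, timedelta
@app.on_after_finalize.connect
def setup_periodic_tasks(sender, **kwargs):
    for group in Group.objects.all():
        # start_time is a TimeField, so attach today's date before subtracting.
        start = datetime.combine(date.today(), group.start_time)
        notification_time = start - timedelta(minutes=30)
        sender.add_periodic_task(
            clocked(notification_time),
            start_group_notification_task.s(),
            # email_list is the address field on the Group model in the question.
            kwargs={'recipients': group.email_list},
            # The name doubles as the schedule key, so it must be unique per group.
            name='send mail when group {} starts'.format(group.pk),
        )
    # You need to connect the `Group.post_save` and `Group.post_delete` signal here,
    # to setup/revoke your periodic tasks when `Group` changed.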
Here `clocked` serves as a custom schedule class; it is taken from django-celery-beat.
Solution 2
Alternatively, you can use django-celery-beat and create a many-to-many relation between your Group and PeriodicTask.
That looks easier.
Disclaimer: This is not tested or optimised either.
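If you go the django-celery-beat route, each group's reminder needs an hour and a minute for its schedule. Here is a minimal sketch of that arithmetic, assuming a daily reminder. The helper name `reminder_hour_minute` is hypothetical and not part of Celery or django-celery-beat; its result could then be used for the hour and minute fields of a `CrontabSchedule`:

```python
from datetime import date, datetime, time, timedelta


def reminder_hour_minute(start_time: time, lead_minutes: int = 30) -> tuple[int, int]:
    """Return (hour, minute) at which a reminder should fire.

    Hypothetical helper: subtracts lead_minutes from a time-of-day value,
    wrapping around midnight when necessary.
    """
    # datetime.time does not support arithmetic, so anchor it to an arbitrary date.
    anchored = datetime.combine(date(2000, 1, 1), start_time)
    reminder = anchored - timedelta(minutes=lead_minutes)
    return reminder.hour, reminder.minute


print(reminder_hour_minute(time(13, 0)))      # (12, 30)
print(reminder_hour_minute(time(0, 10), 30))  # (23, 40)
```

Making `lead_minutes` a parameter also covers the case where the group leader chooses how far ahead the email goes out.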

I was able to figure out how to run the task based on start_time but am concerned about issues with runtime.
I added this to my celery.py file:
from celery.schedules import crontab
app.conf.beat_schedule = {
    'start_group_notification': {
        'task': 'start_group_notification_task',
        # crontab() with no arguments fires every minute.
        'schedule': crontab(),
    }
}
Which runs the task every minute. The task then checks to see if a Group has a start time within 30 minutes. In group/tasks.py
@shared_task(name='start_group_notification_task')
def start_group_notification_task():
    logger.info('sent email to whole group that group is starting')
    thirty_minutes_from_now = datetime.datetime.now() + datetime.timedelta(minutes=30)
    groups = Group.objects.filter(
        start_time__hour=thirty_minutes_from_now.hour,
        start_time__minute=thirty_minutes_from_now.minute
    ).prefetch_related("joined")
    for group in groups:
        for email in group.email_list:
            send_mail(
                'group starting in 30 minutes',
                group.name,
                NOTIFICATION_EMAIL,
                [email],
                fail_silently=False
            )
Now, this works, but I'm concerned about having nested for loops. Is there maybe a better way to do this to have runtime be as little as possible? Or are celery tasks executed fast enough and easy enough that it's not an issue.
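On the nested loops: the work is proportional to the number of recipients either way, so the loops themselves are unlikely to be the bottleneck. Opening a mail connection per message usually costs more. One option is to build all the messages first and hand them to Django's `send_mass_mail`, which sends them over a single connection. Below is a sketch of the message-building step in plain Python. `build_messages` and the sample data are hypothetical, and the final `send_mass_mail` call appears only as a comment because it needs a configured Django project:

```python
def build_messages(groups, from_email):
    """Build one (subject, body, sender, recipients) tuple per recipient.

    `groups` is any iterable of (name, email_list) pairs; the tuples follow
    the datatuple format that django.core.mail.send_mass_mail expects.
    """
    return [
        ('group starting in 30 minutes', name, from_email, [email])
        for name, email_list in groups
        for email in email_list
    ]


# Hypothetical sample data standing in for the queryset results.
sample = [('Chess', ['a@example.com', 'b@example.com']), ('Go', [])]
messages = build_messages(sample, 'noreply@example.com')
print(len(messages))  # 2

# Inside the Celery task this could become:
#     send_mass_mail(messages, fail_silently=False)
```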
As @Brian mentioned, this doesn't take into account Users joining within the 30 minute period before a Group starts. The fix is for me to check whether a User joins the group within that 30 minute period and call a different task to tell them the group is starting soon.
EDIT:
If a User joins a Group within the 30 minute window, I added this variable and conditional:
time_left = int(chaburah.start_time.strftime("%H%M")) - int(current_date.strftime("%H%M"))
if time_left <= 30 and time_left >= 0:
    celery_task.delay()
That works if a User joins within the 30, but if the Group has already started I have to implement a new task to let the User know the Group has started.
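One caveat with subtracting HHMM integers: the result is not a difference in minutes once the two times fall in different hours. For 12:45 and 13:00 it gives 1300 - 1245 = 55 rather than 15, so the reminder would be skipped. Below is a sketch that works in minutes instead. The helpers `minutes_until` and `in_reminder_window` are hypothetical, and groups that start just after midnight are ignored:

```python
from datetime import time


def minutes_until(start_time: time, now: time) -> int:
    """Minutes from `now` until `start_time` on the same day (negative if already started)."""
    return (start_time.hour * 60 + start_time.minute) - (now.hour * 60 + now.minute)


def in_reminder_window(start_time: time, now: time, window: int = 30) -> bool:
    """True when the group starts within the next `window` minutes."""
    return 0 <= minutes_until(start_time, now) <= window


print(minutes_until(time(13, 0), time(12, 45)))       # 15
print(in_reminder_window(time(13, 0), time(12, 45)))  # True
print(in_reminder_window(time(13, 0), time(13, 5)))   # False
```

A negative result from `minutes_until` also identifies the "group has already started" case, which could trigger the separate task.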

Related

Celery-beat doesn't work with daily schedule

I am trying to run tasks with celery-beat. When I schedule them by the minute or by the hour, the tasks start correctly, but when I try to run a daily task, it is displayed in the Django admin panel yet does not run on time.
It should work as follows: regular Django code starts a 'start_primaries' task in the Party class:
def setup_task(self):
    schedule, created = IntervalSchedule.objects.get_or_create(every=7, period=IntervalSchedule.DAYS)
    self.task = PeriodicTask.objects.create(
        name=self.title + ', id ' + str(self.pk),
        task='start_primaries',
        interval=schedule,
        args=json.dumps([self.id]),
        start_time=timezone.now()
    )
    self.save()
Is it possible that there are some settings that limit the duration of the task's life? At the moment I have the following among the Django settings:
CELERY_TASK_TRACK_STARTED = True
CELERY_TASK_TIME_LIMIT = 950400
CELERY_BROKER_URL = 'redis://redis:6379/0'
CELERY_RESULT_BACKEND = 'django-db'
CELERY_BEAT_SCHEDULER = 'django_celery_beat.schedulers:DatabaseScheduler'
CELERY_TASK_RESULT_EXPIRES = None
CELERY_BROKER_TRANSPORT_OPTIONS = {'visibility_timeout': 950400}
Finally, I found the answer on GitHub.
First, set 'last_run_at' to start_time - interval. Beat checks when the task was last "executed", so the next execution will then fall on start_time.
Second, update your start_time setting so that the scheduler picks up the updated info.
Good luck.
https://github.com/celery/django-celery-beat/issues/259
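The arithmetic behind the first step, as a plain-Python sketch. `initial_last_run_at` is a hypothetical helper. In a real project the result would be assigned to the `last_run_at` field of the `PeriodicTask` before saving, which is an assumption based on the linked issue rather than something tested here:

```python
from datetime import datetime, timedelta, timezone


def initial_last_run_at(start_time: datetime, interval: timedelta) -> datetime:
    """Pretend the task last ran one interval before start_time,
    so that the next due time works out to start_time itself."""
    return start_time - interval


start = datetime(2024, 1, 8, 0, 0, tzinfo=timezone.utc)
last_run = initial_last_run_at(start, timedelta(days=7))
print(last_run.isoformat())  # 2024-01-01T00:00:00+00:00
```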

Set dynamic scheduling celerybeat

I have a send_time field in my Notification model. I want to send a notification to all mobile clients at that time.
What I am doing right now is this: I have created a task and scheduled it to run every minute.
tasks.py
@app.task(name='app.tasks.send_notification')
def send_notification():
    # here is logic to filter notification that fall inside that 1 minute time span
    cron.push_notification()
settings.py
CELERYBEAT_SCHEDULE = {
    'send-notification-every-1-minute': {
        'task': 'app.tasks.send_notification',
        'schedule': crontab(minute="*/1"),
    },
}
Everything is working as expected.
Question:
Is there any way to schedule the task according to the send_time field, so I don't have to run a task every minute?
More specifically, I want to create a new instance of the task whenever my Notification model gets a new entry, and schedule it according to the send_time field of that record.
Note: I am using the new integration of Celery with Django, not the django-celery package.
To execute a task at a specified date and time, you can use the eta argument of apply_async when calling the task, as mentioned in the docs.
After creating the notification object, you can call your task like this:
# here obj is your notification object, you can send extra information in kwargs
send_notification.apply_async(kwargs={'obj_id':obj.id}, eta=obj.send_time)
Note: send_time should be a datetime.
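If the model stores only a time of day, it has to be turned into a full datetime before it can be passed as `eta`. Here is a sketch assuming UTC and a hypothetical helper named `next_occurrence`:

```python
from datetime import datetime, time, timedelta, timezone


def next_occurrence(send_time: time, now: datetime) -> datetime:
    """Return the next timezone-aware datetime at which `send_time` occurs.

    Uses today's date if the time is still ahead, otherwise tomorrow's.
    """
    candidate = datetime.combine(now.date(), send_time, tzinfo=now.tzinfo)
    if candidate <= now:
        candidate += timedelta(days=1)
    return candidate


now = datetime(2024, 5, 1, 12, 0, tzinfo=timezone.utc)
print(next_occurrence(time(13, 30), now))  # 2024-05-01 13:30:00+00:00
print(next_occurrence(time(9, 0), now))    # 2024-05-02 09:00:00+00:00
```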
To schedule the task you have to use PeriodicTask and CrontabSchedule, which can be imported from djcelery.models.
So the code will look like this:
from djcelery.models import PeriodicTask, CrontabSchedule
crontab, created = CrontabSchedule.objects.get_or_create(minute='*/1')
periodic_task_obj, created = PeriodicTask.objects.get_or_create(name='send_notification', task='send_notification', crontab=crontab, enabled=True)
Note: you have to write full path to the task like 'app.tasks.send_notification'
You can schedule the notification task in a post_save handler of the Notification model like this:
import json
from django.db.models.signals import post_save
from django.dispatch import receiver
@receiver(post_save, sender=Notification)
def schedule_notification(sender, instance, *args, **kwargs):
    """
    instance is notification model object
    """
    # create crontab according to your notification object.
    # there are more options you can pass like day, week_day etc while creating Crontab object.
    crontab, created = CrontabSchedule.objects.get_or_create(minute=instance.send_time.minute, hour=instance.send_time.hour)
    # name must be unique per notification; task is the full path of the registered task.
    periodic_task_obj, created = PeriodicTask.objects.get_or_create(name='send_notification_{}'.format(instance.pk), task='app.tasks.send_notification')
    periodic_task_obj.crontab = crontab
    periodic_task_obj.enabled = True
    # you can also pass kwargs to your task like this
    periodic_task_obj.kwargs = json.dumps({"notification_id": instance.pk})
    periodic_task_obj.save()

Django - How to run a function EVERYDAY?

I want to run this function every day at midnight to check expiry_date_notification. What can I do? I'm new to Django and Python.
def check_expiry_date(request):
    products = Product.objects.all()
    for product in products:
        product_id = product.id
        expiry_date = product.expiry_date
        notification_days = product.notification_days
        check_date = int((expiry_date - datetime.datetime.today()).days)
        if notification_days <= check_date:
            notification = Notification(product_id=product_id)
            notification.save()
As others have said, Celery can schedule tasks to execute at a specific time.
from celery.schedules import crontab
from celery.task import periodic_task
@periodic_task(run_every=crontab(hour=7, minute=30, day_of_week="mon"))
def every_monday_morning():
    print("This is run every Monday morning at 7:30")
Install via pip install django-celery
You can either write a custom management command and schedule its execution using cron, or you can use celery.
Have a look at:
Celery - Distributed Task Queue
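Whichever scheduler you pick, it helps to keep the date check in a small function that does not depend on the request, so that a management command or a Celery task can call it. Here is a sketch with a hypothetical helper, `should_notify`. It assumes a notification is wanted once the days remaining drop to `notification_days` or fewer, which is a guess at the intent and differs from the comparison in the question:

```python
from datetime import date


def should_notify(expiry_date: date, notification_days: int, today: date) -> bool:
    """True when the product expires within `notification_days` days of `today`."""
    days_left = (expiry_date - today).days
    return 0 <= days_left <= notification_days


today = date(2024, 3, 1)
print(should_notify(date(2024, 3, 5), 7, today))   # True: 4 days left
print(should_notify(date(2024, 4, 1), 7, today))   # False: 31 days left
print(should_notify(date(2024, 2, 20), 7, today))  # False: already expired
```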

Django celery task keep global state

I am currently developing a Django application based on django-tenants-schema. You don't need to look into the actual code of the module, but the idea is that it has a global setting for the current database connection defining which schema to use for the application tenant, e.g.
tenant = tenants_schema.get_tenant()
And for setting
tenants_schema.set_tenant(xxx)
For some of the tasks I would like them to remember the current global tenant selected during the instantiation, e.g. in theory:
class AbstractTask(Task):
    '''
    Run this method before returning the task future
    '''
    def before_submit(self):
        self.run_args['tenant'] = tenants_schema.get_tenant()
    '''
    This method is run before related .run() task method
    '''
    def before_run(self):
        tenants_schema.set_tenant(self.run_args['tenant'])
Is there an elegant way of doing it in celery?
Celery (as of 3.1) has signals you can hook into to do this. You can alter the kwargs that were passed in, and on the other side, undo your alterations before they're given to the actual task:
from celery import shared_task
from celery.signals import before_task_publish, task_prerun, task_postrun
from threading import local
current_tenant = local()
@before_task_publish.connect
def add_tenant_to_task(body=None, **unused):
    # Runs in the caller's process, just before the message is sent.
    body['kwargs']['tenant_middleware.tenant'] = getattr(current_tenant, 'id', None)
    print('sending tenant: {t}'.format(t=current_tenant.id))
@task_prerun.connect
def extract_tenant_from_task(kwargs=None, **unused):
    # Runs in the worker; remove the extra kwarg before the task sees it.
    tenant_id = kwargs.pop('tenant_middleware.tenant', None)
    current_tenant.id = tenant_id
    print('current_tenant.id set to {t}'.format(t=tenant_id))
@task_postrun.connect
def cleanup_tenant(**kwargs):
    current_tenant.id = None
    print('cleaned current_tenant.id')
@shared_task
def get_current_tenant():
    # Here is where you would do work that relied on current_tenant.id being set.
    import time
    time.sleep(1)
    return current_tenant.id
And if you run the task (not showing logging from the worker):
In [1]: current_tenant.id = 1234; ct = get_current_tenant.delay(); current_tenant.id = 5678; ct.get()
sending tenant: 1234
Out[1]: 1234
In [2]: current_tenant.id
Out[2]: 5678
The signals are not called if no message is sent (when you call the task function directly, without delay() or apply_async()). If you want to filter on the task name, it is available as body['task'] in the before_task_publish signal handler, and the task object itself is available in the task_prerun and task_postrun handlers.
I am a Celery newbie, so I can't really tell if this is the "blessed" way of doing "middleware"-type stuff in Celery, but I think it will work for me.
I'm not sure what you mean here, is before_submit executed before the task is called by a client?
In that case I would rather use a with statement here:
from contextlib import contextmanager
@contextmanager
def set_tenant_db(tenant):
    prev_tenant = tenants_schema.get_tenant()
    try:
        tenants_schema.set_tenant(tenant)
        yield
    finally:
        # Always restore the previous tenant, even if the task body raises.
        tenants_schema.set_tenant(prev_tenant)
@app.task
def tenant_task(tenant=None):
    with set_tenant_db(tenant):
        do_actions_here()
tenant_task.delay(tenant=tenants_schema.get_tenant())
You can of course create a base task that does this automatically,
you can apply the context in Task.__call__ for example, but I'm not sure
if that saves you much if you can just use the with statement explicitly.
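To see that the context manager restores the previous tenant even when the task body raises, here is a self-contained sketch. The `_state` dict and the `get_tenant`/`set_tenant` functions are stand-ins for the real tenant module, not its actual API:

```python
from contextlib import contextmanager

# Stand-in for the global tenant setting; hypothetical, for illustration only.
_state = {'tenant': 'public'}


def get_tenant():
    return _state['tenant']


def set_tenant(tenant):
    _state['tenant'] = tenant


@contextmanager
def set_tenant_db(tenant):
    prev_tenant = get_tenant()
    try:
        set_tenant(tenant)
        yield
    finally:
        # Runs on normal exit and on exceptions alike.
        set_tenant(prev_tenant)


with set_tenant_db('acme'):
    print(get_tenant())  # acme

try:
    with set_tenant_db('acme'):
        raise RuntimeError('task failed')
except RuntimeError:
    pass

print(get_tenant())  # public
```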

Django Celerybeat PeriodicTask running far more than expected

I'm struggling with Django, Celery, djcelery & PeriodicTasks.
I've created a task to pull a report for Adsense to generate a live stat report. Here is my task:
import datetime
import httplib2
import logging
from apiclient.discovery import build
from celery.task import PeriodicTask
from django.contrib.auth.models import User
from oauth2client.django_orm import Storage
from .models import Credential, Revenue
logger = logging.getLogger(__name__)
class GetReportTask(PeriodicTask):
    run_every = datetime.timedelta(minutes=2)
    def run(self, *args, **kwargs):
        scraper = Scraper()
        scraper.get_report()
class Scraper(object):
    TODAY = datetime.date.today()
    YESTERDAY = TODAY - datetime.timedelta(days=1)
    def get_report(self, start_date=YESTERDAY, end_date=TODAY):
        logger.info('Scraping Adsense report from {0} to {1}.'.format(
            start_date, end_date))
        user = User.objects.get(pk=1)
        storage = Storage(Credential, 'id', user, 'credential')
        credential = storage.get()
        if not credential is None and credential.invalid is False:
            http = httplib2.Http()
            http = credential.authorize(http)
            service = build('adsense', 'v1.2', http=http)
            reports = service.reports()
            report = reports.generate(
                startDate=start_date.strftime('%Y-%m-%d'),
                endDate=end_date.strftime('%Y-%m-%d'),
                dimension='DATE',
                metric='EARNINGS',
            )
            data = report.execute()
            for row in data['rows']:
                date = row[0]
                revenue = row[1]
                try:
                    record = Revenue.objects.get(date=date)
                except Revenue.DoesNotExist:
                    record = Revenue()
                    record.date = date
                record.revenue = revenue
                record.save()
        else:
            logger.error('Invalid Adsense Credentials')
I'm using Celery & RabbitMQ. Here are my settings:
# Celery/RabbitMQ
BROKER_HOST = "localhost"
BROKER_PORT = 5672
BROKER_USER = "myuser"
BROKER_PASSWORD = "****"
BROKER_VHOST = "myvhost"
CELERYD_CONCURRENCY = 1
CELERYD_NODES = "w1"
CELERY_RESULT_BACKEND = "amqp"
CELERY_TIMEZONE = 'America/Denver'
CELERYBEAT_SCHEDULER = 'djcelery.schedulers.DatabaseScheduler'
import djcelery
djcelery.setup_loader()
On first glance everything seems to work, but after turning on the logger and watching it run I have found that it is running the task at least four times in a row - sometimes more. It also seems to be running every minute instead of every two minutes. I've tried changing the run_every to use a crontab but I get the same results.
I'm starting celerybeat using supervisor. Here is the command I use:
python manage.py celeryd -B -E -c 1
Any ideas as to why it's not working as expected?
Oh, and one more thing, after the day changes, it continues to use the date range it first ran with. So as days progress it continues to get stats for the day the task started running - unless I run the task manually at some point then it changes to the date I last ran it manually. Can someone tell me why this happens?
Consider creating a separate queue with one worker process and a fixed rate limit for this type of task, and add the tasks to this new queue instead of running them directly from celerybeat. I hope that helps you figure out what is wrong: whether it is a problem with celerybeat or your tasks are running longer than expected.
@task(queue='create_report', rate_limit='0.5/m')
def create_report():
    scraper = Scraper()
    scraper.get_report()
class GetReportTask(PeriodicTask):
    run_every = datetime.timedelta(minutes=2)
    def run(self, *args, **kwargs):
        create_report.delay()
In settings.py:
CELERY_ROUTES = {
    'myapp.tasks.create_report': {'queue': 'create_report'},
}
Start an additional celery worker that handles the tasks in your queue:
celery worker -c 1 -Q create_report -n create_report.local
Problem 2. Your YESTERDAY and TODAY variables are set at class level, so within one thread they are set only once.
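The same applies to the default arguments of `get_report`: Python evaluates defaults once, when the function is defined, so the worker keeps the dates from the moment it started. A common fix is to default to `None` and compute the dates at call time. Here is a minimal sketch, with the `today` parameter added only to make the example deterministic:

```python
import datetime


class Scraper:
    def get_report(self, start_date=None, end_date=None, today=None):
        # Compute the dates on every call instead of once at import time.
        today = today or datetime.date.today()
        end_date = end_date or today
        start_date = start_date or today - datetime.timedelta(days=1)
        return start_date, end_date


scraper = Scraper()
print(scraper.get_report(today=datetime.date(2024, 1, 2)))
# (datetime.date(2024, 1, 1), datetime.date(2024, 1, 2))
print(scraper.get_report(today=datetime.date(2024, 1, 3)))
# (datetime.date(2024, 1, 2), datetime.date(2024, 1, 3))
```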