async messages in admin - django

I created a task in the admin that is costly and should be carried out asynchronously. I would do something like
import threading

from .models import ThreadTask  # assuming the model below lives in this app's models.py

def costly_task(**kwargs):
    def do_task(id):
        ## do stuff, you know, that is costly
        task = ThreadTask.objects.get(pk=id)
        task.is_done = True
        task.save()

    task = ThreadTask()
    task.save()
    t = threading.Thread(target=do_task, args=[task.id])
    t.setDaemon(True)
    t.start()
    return {"id": task.id}
With this model in models.py:
class ThreadTask(models.Model):
    task = models.CharField(max_length=30, blank=True, null=True)
    is_done = models.BooleanField(blank=False, default=False)
This works well, but I want to inform the admin user that the task is running, and also when it is finished. There is a very old (Django 4.0 incompatible) package called django-async-messages which leverages the normal Django messages framework so it can be used asynchronously. I googled but did not find anything more recent ... any ideas on how to do this? Can I use Django's async machinery to send out two messages:
when task started (that one is easy)
when task finished
I found something similar, but not quite what I am looking for because this would require a lot of implementation effort: Django async send notifications
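One low-effort direction, sketched here purely as an illustration on top of the ThreadTask model above (the view and its URL route are made up for the example): expose a small status endpoint and let the admin page poll it, showing a message once is_done flips to True.

# views.py -- hypothetical status endpoint built on the ThreadTask model above
from django.http import JsonResponse
from .models import ThreadTask

def task_status(request, task_id):
    # The admin page can poll this every few seconds and surface a
    # "task finished" message once is_done becomes True.
    task = ThreadTask.objects.get(pk=task_id)
    return JsonResponse({"id": task.pk, "is_done": task.is_done})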

Related

How to set a timer inside the get request of an APIView?

I am trying to build a timer inside a get method in a DRF view. I have created the timer method inside the GameViewController class, and what I am trying to achieve is that every minute (5 times in a row) a resource object is shown to the user through the GET request and a game round object is created. My view works at the moment; however, the timer doesn't seem to be doing anything.
I know this isn't exactly how things are done in Django, but this is how I need to do it for my game API for game-logic purposes.
How can I make the timer work? Do I need to use something like request.time or such?
Thanks in advance.
views.py
class GameView(APIView):
    def get(self, request, *args, **kwargs):
        ...
        round_number = gametype.rounds
        # time = controller.timer()
        now = datetime.now()
        now_plus_1 = now + timedelta(minutes=1)
        while round_number != 0:
            while now < now_plus_1:
                random_resource = Resource.objects.all().order_by('?').first()
                resource_serializer = ResourceSerializer(random_resource)
                gameround = Gameround.objects.create(
                    id=controller.generate_random_id(Gameround),
                    user_id=current_user_id,
                    gamesession=gamesession,
                    created=datetime.now(),
                    score=current_score
                )
                gameround_serializer = GameroundSerializer(gameround)
            round_number -= 1
        return Response({
            # 'gametype': gametype_serializer.data,
            'resource': resource_serializer.data,
            'gameround': gameround_serializer.data
        })
If you want to jump into this quickly, use huey: https://github.com/coleifer/huey
You will need to install Redis as the backend for your queue. It's not complicated.
Huey can run your code on a cron schedule, with delays, with retries, or in other more complicated ways:
from huey import RedisHuey, crontab

huey = RedisHuey('my-app', host='redis.myapp.com')

@huey.task()
def add_numbers(a, b):
    return a + b

@huey.task(retries=2, retry_delay=60)
def flaky_task(url):
    # This task might fail, in which case it will be retried up to 2 times
    # with a delay of 60s between retries.
    return this_might_fail(url)

@huey.periodic_task(crontab(minute='0', hour='3'))
def nightly_backup():
    sync_all_data()
Huey has a Django extension: https://huey.readthedocs.io/en/latest/contrib.html#django
For me, this was the fastest way to achieve these kinds of tasks, and it has been working in production for ~1 year without needing my attention.
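For reference, a minimal sketch of what the Django integration can look like (my illustration, assuming huey's contrib.djhuey module; the app and model names are placeholders):

# tasks.py inside a Django app; the consumer is started with `python manage.py run_huey`
from huey.contrib.djhuey import db_task

@db_task()
def mark_task_done(task_id):
    # Runs in the huey consumer process, outside the request/response cycle.
    from myapp.models import ThreadTask  # hypothetical app and model
    ThreadTask.objects.filter(pk=task_id).update(is_done=True)

Calling mark_task_done(task_id) from a view enqueues the job and returns immediately without blocking the request.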

Saving a celery task (for re-running) in database

Our workflow is currently built around an old version of Celery, so bear in mind things are already not optimal. We need to run a task and save a record of that task run in the database. If the task fails or hangs (it happens often), we want to re-run it exactly as it was run the first time. This shouldn't happen automatically, though; it needs to be triggered manually depending on the nature of the failure, and the result needs to be logged in the DB so that decision can be made (via a front end).
How can we save a complete record of a task in the DB so that a subsequent process can grab the record and run a new identical task? The current implementation saves the path of the @task-decorated function in the DB as part of a TaskInfo model. When the task needs to be rerun, we have a get_task() method on the TaskInfo model that gets the path from the DB, imports it using import_module and getattr, and another rerun() method that runs the task again with *args, **kwargs (also saved in the DB).
Like so (these are methods on the TaskInfo model instance):
def get_task(self):
    """Returns the task's decorated function, which can be delayed."""
    module_name, object_name = self.path.rsplit('.', 1)
    module = import_module(module_name)
    task = getattr(module, object_name)
    if inspect.isclass(task):
        task = task()
    # task = current_app.tasks[self.path]
    return task

def rerun(self):
    """Re-run the task, and replace this one.

    - A new task is scheduled to run.
    - The new task's TaskInfo has the same parent as this TaskInfo.
    - This TaskInfo is deleted.
    """
    args, kwargs = self.get_arguments()
    celery_task = self.get_task()
    # delay() returns an AsyncResult; its id identifies the new run
    result = celery_task.delay(*args, **kwargs)
    defaults = {
        'path': self.path,
        'status': Status.PENDING,
        'timestamp': timezone.now(),
        'args': args,
        'kwargs': kwargs,
        'parent': self.parent,
    }
    TaskInfo.objects.update_or_create(task_id=result.id, defaults=defaults)
    self.delete()
There must be a cleaner solution for saving a task in the DB to rerun later, right?
Celery 4.4.0 introduced the setting result_extended. If you set it to True, the table in the result backend database (named celery_taskmeta by default) will also store the args and kwargs of the task.
Here is a demo:
import time

from celery import Celery

app = Celery('test_result_backend')
app.conf.update(
    broker_url='redis://localhost:6379/10',
    result_backend='db+mysql://root:passwd@localhost/celery_toys',
    result_extended=True
)

@app.task(bind=True, name='add')
def add(self, x, y):
    self.request.task_name = 'add'  # For saving the task name.
    time.sleep(5)
    return x + y
With the task info recorded in MySQL, you are able to re-run your task easily.
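As a rough sketch of what that re-run could look like (not from the original answer; the availability of these attributes on AsyncResult depends on your Celery and backend versions, so verify before relying on it):

from celery.result import AsyncResult

def rerun_task(app, task_id):
    # With result_extended=True the backend stores name/args/kwargs,
    # which AsyncResult exposes, so the task can simply be re-sent.
    meta = AsyncResult(task_id, app=app)
    return app.send_task(meta.name, args=meta.args, kwargs=meta.kwargs)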

Django - How to track if a user is online/offline in realtime?

I'm considering using django-notifications and WebSockets to send real-time notifications to iOS/Android and web apps, so I'll probably use Django Channels.
Can I use Django Channels to track the online status of a user in real time? If yes, how can I achieve this without constantly polling the server?
I'm looking for a best practice since I wasn't able to find any proper solution.
UPDATE:
What I have tried so far is the following approach:
Using Django Channels, I implemented a WebSocket consumer that, on connect, sets the user status to 'online', and when the socket gets disconnected, sets the user status to 'offline'.
Originally I wanted to include an 'away' status, but my approach cannot provide that kind of information.
Also, my implementation won't work properly when the user uses the application from multiple devices, because a connection can be closed on one device but still open on another; the status would be set to 'offline' even though the user has another open connection.
from channels.consumer import AsyncConsumer
from channels.db import database_sync_to_async

class MyConsumer(AsyncConsumer):
    async def websocket_connect(self, event):
        # Called when a new websocket connection is established
        print("connected", event)
        user = self.scope['user']
        await self.update_user_status(user, 'online')

    async def websocket_receive(self, event):
        # Called when a message is received from the websocket
        # Method NOT used
        print("received", event)

    async def websocket_disconnect(self, event):
        # Called when a websocket is disconnected
        print("disconnected", event)
        user = self.scope['user']
        await self.update_user_status(user, 'offline')

    @database_sync_to_async
    def update_user_status(self, user, status):
        """
        Updates the user `status`.
        `status` can be one of: 'online', 'offline' or 'away'.
        """
        return UserProfile.objects.filter(pk=user.pk).update(status=status)
NOTE:
My current working solution uses the Django REST Framework with an API endpoint that lets client apps send an HTTP POST request with the current status.
For example, the web app tracks mouse events and POSTs the online status every X seconds; when there are no more mouse events it POSTs the away status, and when the tab/window is about to be closed, it sends a POST request with status offline.
THIS IS a working solution; depending on the browser I have issues when sending the offline status, but it works.
What I'm looking for is a better solution that doesn't need to constantly poll the server.
Using WebSockets is definitely the better approach.
Instead of having a binary "online"/"offline" status, you could count connections: when a new WebSocket connects, increase the "online" counter by one; when a WebSocket disconnects, decrease it. When the counter reaches 0, the user is offline on all devices.
Something like this:
from django.db.models import F

@database_sync_to_async
def update_user_incr(self, user):
    UserProfile.objects.filter(pk=user.pk).update(online=F('online') + 1)

@database_sync_to_async
def update_user_decr(self, user):
    UserProfile.objects.filter(pk=user.pk).update(online=F('online') - 1)
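A minimal sketch of how these helpers could be wired into a consumer (my illustration; the class name is made up, and update_user_incr / update_user_decr from above are assumed to be methods on this class, with UserProfile having an integer online field):

from channels.consumer import AsyncConsumer

class PresenceConsumer(AsyncConsumer):  # hypothetical consumer
    async def websocket_connect(self, event):
        await self.send({"type": "websocket.accept"})
        await self.update_user_incr(self.scope['user'])

    async def websocket_disconnect(self, event):
        await self.update_user_decr(self.scope['user'])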
The best approach is using WebSockets.
But I think you should store not just the status, but also a session key or a device identification. If you use just a counter, you lose valuable information, for example which device the user is connected from at a specific moment. That is key in some projects. Besides, if something goes wrong (a disconnection, a server crash, etc.), you will not be able to tell which counter value belongs to which device, and you will probably need to reset the counter in the end.
I recommend storing this information in a separate, related table:
from django.db import models
from django.conf import settings

class ConnectionHistory(models.Model):
    ONLINE = 'online'
    OFFLINE = 'offline'
    STATUS = (
        (ONLINE, 'On-line'),
        (OFFLINE, 'Off-line'),
    )

    user = models.ForeignKey(
        settings.AUTH_USER_MODEL,
        on_delete=models.CASCADE
    )
    device_id = models.CharField(max_length=100)
    status = models.CharField(
        max_length=10, choices=STATUS,
        default=ONLINE
    )
    first_login = models.DateTimeField(auto_now_add=True)
    last_echo = models.DateTimeField(auto_now=True)

    class Meta:
        unique_together = (("user", "device_id"),)
This way you have a record per device to track its status, and maybe some other information like IP address, geolocation, etc. Then you can do something like this (based on your code):
@database_sync_to_async
def update_user_status(self, user, device_id, status):
    # get_or_create returns an (object, created) tuple, so use update_or_create
    # to upsert the row and set the status in one call.
    return ConnectionHistory.objects.update_or_create(
        user=user, device_id=device_id,
        defaults={'status': status},
    )
How to get a device identification
There are plenty of libraries that do it, like https://www.npmjs.com/package/device-uuid. They simply use a bundle of browser parameters to generate a hash key. It is better than using the session id alone, because it changes less frequently.
Tracking away status
After each action, you can simply update last_echo. This way you can figure out who is connected or away, and from which device.
Advantage: in case of a crash, restart, etc., the tracking status can be re-established at any time.
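As an illustrative sketch (not part of the original answer), 'away' could then be derived from last_echo with a simple cutoff query; the five-minute threshold is an arbitrary choice:

from datetime import timedelta
from django.utils import timezone

def away_connections(minutes=5):
    # Devices still marked online whose last activity echo is older than the cutoff.
    cutoff = timezone.now() - timedelta(minutes=minutes)
    return ConnectionHistory.objects.filter(status=ConnectionHistory.ONLINE, last_echo__lt=cutoff)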
My answer is based on C14L's answer. The idea of counting connections is very clever; I just made some improvements, at least for my case. It's quite messy and complicated, but I think it's necessary.
Sometimes a WebSocket connects more often than it disconnects, for example when there are errors, which makes the counter keep increasing. My approach is: instead of increasing the counter when the WebSocket opens, I increase it before the user accesses the page; when the WebSocket disconnects, I decrease the counter.
In views.py:
from asgiref.sync import async_to_sync
from channels.layers import get_channel_layer

def homePageView(request):
    updateOnlineStatusi_goIn(request)
    # continue normal code
    ...

def updateOnlineStatusi_goIn(request):
    useri = request.user
    if not OnlineStatus.objects.filter(user=useri).exists():
        dct = {
            'online': False,
            'connections': 0,
            'user': useri,
        }
        onlineStatusi = OnlineStatus.objects.create(**dct)
    else:
        onlineStatusi = OnlineStatus.objects.get(user=useri)
    onlineStatusi.connections += 1
    onlineStatusi.online = True
    onlineStatusi.save()
    dct = {
        'action': 'updateOnlineStatus',
        'online': onlineStatusi.online,
        'userId': useri.id,
    }
    async_to_sync(get_channel_layer().group_send)(
        'commonRoom', {'type': 'sendd', 'dct': dct})
In models.py
class OnlineStatus(models.Model):
    online = models.BooleanField(null=True, blank=True)
    connections = models.BigIntegerField(null=True, blank=True)
    user = models.OneToOneField(User, on_delete=models.CASCADE, null=True, blank=True)
In consumers.py:
import json

from channels.db import database_sync_to_async
from channels.generic.websocket import AsyncWebsocketConsumer

from .models import OnlineStatus

class Consumer(AsyncWebsocketConsumer):

    async def sendd(self, e):
        await self.send(json.dumps(e["dct"]))

    async def connect(self):
        await self.accept()
        await self.channel_layer.group_add('commonRoom', self.channel_name)

    async def disconnect(self, _):
        await self.channel_layer.group_discard('commonRoom', self.channel_name)
        dct = await self.updateOnlineStatusi_goOut()
        await self.channel_layer.group_send('commonRoom', {"type": "sendd", "dct": dct})

    @database_sync_to_async
    def updateOnlineStatusi_goOut(self):
        useri = self.scope["user"]
        onlineStatusi = OnlineStatus.objects.get(user=useri)
        onlineStatusi.connections -= 1
        if onlineStatusi.connections <= 0:
            onlineStatusi.connections = 0
            onlineStatusi.online = False
        else:
            onlineStatusi.online = True
        onlineStatusi.save()
        dct = {
            'action': 'updateOnlineStatus',
            'online': onlineStatusi.online,
            'userId': useri.id,
        }
        return dct

How do we trigger multiple airflow dags using TriggerDagRunOperator?

I have a scenario wherein a particular DAG, upon completion, needs to trigger multiple DAGs. I have used TriggerDagRunOperator to trigger a single DAG; is it possible to pass multiple DAGs to the TriggerDagRunOperator to trigger multiple DAGs?
And is it possible to trigger them only upon successful completion of the current DAG?
I have faced the same problem. There is no solution out of the box, but we can write a custom operator for it.
Here is the code of a custom operator that takes python_callable and trigger_dag_id as arguments:
# Airflow 1.x-era imports (DagRunOrder was removed in Airflow 2)
from datetime import datetime

from airflow import settings
from airflow.models import DagBag
from airflow.operators.dagrun_operator import DagRunOrder, TriggerDagRunOperator
from airflow.utils.decorators import apply_defaults
from airflow.utils.state import State


class TriggerMultiDagRunOperator(TriggerDagRunOperator):
    @apply_defaults
    def __init__(self, op_args=None, op_kwargs=None, *args, **kwargs):
        super(TriggerMultiDagRunOperator, self).__init__(*args, **kwargs)
        self.op_args = op_args or []
        self.op_kwargs = op_kwargs or {}

    def execute(self, context):
        session = settings.Session()
        created = False
        for dro in self.python_callable(context, *self.op_args, **self.op_kwargs):
            if not dro or not isinstance(dro, DagRunOrder):
                break
            if dro.run_id is None:
                dro.run_id = 'trig__' + datetime.utcnow().isoformat()
            dbag = DagBag(settings.DAGS_FOLDER)
            trigger_dag = dbag.get_dag(self.trigger_dag_id)
            dr = trigger_dag.create_dagrun(
                run_id=dro.run_id,
                state=State.RUNNING,
                conf=dro.payload,
                external_trigger=True
            )
            created = True
            self.log.info("Creating DagRun %s", dr)
        if created is True:
            session.commit()
        else:
            self.log.info("No DagRun created")
        session.close()
trigger_dag_id is the id of the DAG we want to run multiple times.
python_callable is a function; it should return a list of DagRunOrder objects, one object per instance of the DAG with dag_id trigger_dag_id that should be scheduled.
Code and examples on GitHub: https://github.com/mastak/airflow_multi_dagrun
A bit more description of this code: https://medium.com/@igorlubimov/dynamic-scheduling-in-airflow-52979b3e6b13
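A minimal usage sketch under those assumptions (the callable and the target DAG id below are made up for illustration):

def generate_dag_run_orders(context):
    # One DagRunOrder per desired run; the payload becomes the triggered run's conf.
    return [DagRunOrder(payload={'chunk': i}) for i in range(3)]

trigger_many = TriggerMultiDagRunOperator(
    task_id='trigger_many',
    trigger_dag_id='target_dag',  # hypothetical downstream DAG id
    python_callable=generate_dag_run_orders,
    dag=dag,
)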
In Airflow 2, you can use dynamic task mapping. For example:
import random
import uuid
from datetime import datetime

from airflow.decorators import dag, task
from airflow.operators.trigger_dagrun import TriggerDagRunOperator

dag_args = {
    "start_date": datetime(2022, 9, 9),
    "schedule_interval": None,
    "catchup": False,
}

@task
def define_runs():
    num_runs = random.randint(3, 5)
    runs = [str(uuid.uuid4()) for _ in range(num_runs)]
    return runs

@dag(**dag_args)
def dynamic_tasks():
    runs = define_runs()

    run_dags = TriggerDagRunOperator.partial(
        task_id="run_dags",
        trigger_dag_id="hello_world",
        conf=None,
    ).expand(
        trigger_run_id=runs,
    )

    run_dags

dag = dynamic_tasks()
Docs here.
You can try looping it! For example:
for dag_id in dag_ids_to_trigger:
    trigger_dag = TriggerDagRunOperator(
        task_id='trigger_' + dag_id,
        trigger_dag_id=dag_id,
        python_callable=conditionally_trigger_non_indr,
        dag=dag,
    )
Make these dependent on whichever task is required. I have automated something like this for PythonOperator; you could try whether this works for you!
As the API docs state, the method accepts a single dag_id. However, if you want to unconditionally kick off downstream DAGs upon completion, why not just put those tasks in a single DAG and set your dependencies/workflow there? You would then be able to set depends_on_past=True where appropriate.
EDIT: An easy workaround, if you absolutely need them in separate DAGs, is to create multiple TriggerDagRunOperators and make them all depend on the same task.
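A minimal sketch of that workaround (the DAG ids and final_task are placeholders; with the default all_success trigger rule the triggers fire only if final_task succeeds):

from airflow.operators.trigger_dagrun import TriggerDagRunOperator

# One trigger operator per downstream DAG, all depending on the same upstream task.
triggers = [
    TriggerDagRunOperator(
        task_id='trigger_' + dag_id,
        trigger_dag_id=dag_id,
        dag=dag,
    )
    for dag_id in ['downstream_a', 'downstream_b']  # hypothetical DAG ids
]

final_task >> triggers  # both triggers run after final_task completes successfully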

Threads with Django App. Server: Without CRON or Other External Service

I would like to use a thread that starts inside of a Django application.
If we use a standard Python thread, it could be stopped by the web server when the request is finished.
Is there a standard way to do this? Or is there a Django library available that provides this functionality?
I use threads intensively for long processes. A better solution is Celery, of course.
To define the thread:
from threading import Thread

class afegeixThread(Thread):
    def __init__(self, usuari, expandir=None, alumnes=None,
                 impartir=None, matmulla=False):
        Thread.__init__(self)
        self.expandir = expandir
        self.alumnes = alumnes
        self.impartir = impartir
        self.flagPrimerDiaFet = False
        self.usuari = usuari
        self.matmulla = matmulla

    def run(self):
        errors = []
        try:
            ...
            self.flagPrimerDiaFet = ...
            ...
        except Exception as e:
            # (structure restored; the original elides the body and error handling)
            errors.append(e)

    def firstDayDone(self):
        return self.flagPrimerDiaFet
Calling thread:
from presencia.afegeixTreuAlumnesLlista import afegeixThread

afegeix = afegeixThread(expandir=expandir, alumnes=alumnes,
                        impartir=impartir, usuari=user, matmulla=matmulla)
afegeix.start()

# Waiting for first day done before returning html:
import time
while afegeix and not afegeix.firstDayDone():
    time.sleep(0.5)

# return html code
return HttpResponseRedirect('/presencia/passaLlista/%s/' % pk)