How to set a timer inside the get request of an APIView? - django

I am trying to build a timer inside a get method in a DRF view. I have created the timer method inside the GameViewController class. What I am trying to achieve is that every minute (five times in a row) a resource object is shown to the user through the GET request and a game round object is created. My view works at the moment, but the timer doesn't seem to be doing anything.
I know this isn't exactly how things are done in Django, but this is how I need to do it in my game API for game-logic purposes.
How can I make the timer work? Do I need to use something like request.time?
Thanks in advance.
views.py
class GameView(APIView):
    def get(self, request, *args, **kwargs):
        ...
        round_number = gametype.rounds
        # time = controller.timer()
        now = datetime.now()
        now_plus_1 = now + timedelta(minutes=1)
        while round_number != 0:
            while now < now_plus_1:
                random_resource = Resource.objects.all().order_by('?').first()
                resource_serializer = ResourceSerializer(random_resource)
                gameround = Gameround.objects.create(
                    id=controller.generate_random_id(Gameround),
                    user_id=current_user_id,
                    gamesession=gamesession,
                    created=datetime.now(),
                    score=current_score
                )
                gameround_serializer = GameroundSerializer(gameround)
            round_number -= 1
        return Response({
            # 'gametype': gametype_serializer.data,
            'resource': resource_serializer.data,
            'gameround': gameround_serializer.data
        })

If you want to jump into this quickly, use huey: https://github.com/coleifer/huey
You will need to install Redis as the backend for your queue. It's not complicated.
Huey can run your code on a cron schedule, after a delay, or with more complicated rules:
from huey import RedisHuey, crontab

huey = RedisHuey('my-app', host='redis.myapp.com')

@huey.task()
def add_numbers(a, b):
    return a + b

@huey.task(retries=2, retry_delay=60)
def flaky_task(url):
    # This task might fail, in which case it will be retried up to 2 times
    # with a delay of 60s between retries.
    return this_might_fail(url)

@huey.periodic_task(crontab(minute='0', hour='3'))
def nightly_backup():
    sync_all_data()
Huey has a Django extension: https://huey.readthedocs.io/en/latest/contrib.html#django
For me, this was the fastest way to achieve these kinds of tasks, and it has been running in production for about a year without needing my attention.
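Applied to the question, here is a minimal sketch (not the asker's final code) of how the per-minute round creation could move into a huey periodic task via the Django extension linked above. Resource and Gameround are the question's models; the user, gamesession and scoring details from the view are omitted.

# Sketch only: runs once a minute in the huey consumer, outside the request,
# so the GET view only has to serialize whatever the latest round produced.
from datetime import datetime

from huey import crontab
from huey.contrib.djhuey import db_periodic_task

from .models import Resource, Gameround  # assumed app layout

@db_periodic_task(crontab(minute='*/1'))
def create_game_round():
    # Pick the resource for this round; persisting/serving it is left out here.
    random_resource = Resource.objects.order_by('?').first()
    # user_id/gamesession fields from the question are omitted for brevity.
    Gameround.objects.create(created=datetime.now(), score=0)

The GET view can then return the most recently created round instead of looping inside the request.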

Related

How to use session timeout in django rest view?

I am implementing a view for a game using Django REST Framework's APIView. I am very new to Django and have never done this before, so I'm not sure how to implement it.
The main idea is that a game only lasts 5 minutes. I am sending a resource to the user and creating a session object. This view should be unavailable after 5 minutes. Is there such a thing as a view timeout?
Will the session timeout then also cover the POST request, or do I need to implement it there as well?
This is my view:
The commented-out code at the end is what I was thinking of doing. Can I even do this in the view directly? How else can I do this, and how can I test it?
views.py
class GameView(APIView):
    """
    API view that retrieves the game,
    retrieves a game round as well as a random resource per round,
    and allows users to post tags that are verified and saved to either the Tag or Tagging table.
    """
    def get(self, request, *args, **kwargs):
        current_score = 0
        if not isinstance(request.user, CustomUser):
            current_user_id = 1
        else:
            current_user_id = request.user.pk
        random_resource = Resource.objects.all().order_by('?').first()
        resource_serializer = ResourceSerializer(random_resource)
        gameround = Gameround.objects.create(user_id=current_user_id,
                                             gamesession=gamesession,
                                             created=datetime.now(),
                                             score=current_score)
        gameround_serializer = GameroundSerializer(gameround)
        return Response({'resource': resource_serializer.data,
                         'gameround': gameround_serializer.data,
                         })
        # TODO: handle timeout after 5 min!
        # now = timezone.now()
        # end_of_game = start_time + timezone.timedelta(minutes=5)
        # if :
        #     return Response({'resource': resource_serializer.data, 'gameround': gameround_serializer.data,})
        # else:
        #     return Response(status=status.HTTP_408_REQUEST_TIMEOUT)
Testing the commented-out code in Postman always leads to a 408 REQUEST TIMEOUT response.
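For what it's worth, a minimal sketch of the check the commented-out block is aiming at (this assumes the gamesession object has a timezone-aware created field; it is not the asker's final code):

# Sketch only: compare the elapsed time since the game session was created
# against the 5 minute limit.
from datetime import timedelta
from django.utils import timezone

def game_has_expired(gamesession):
    # True once more than 5 minutes have passed since the session was created.
    return timezone.now() >= gamesession.created + timedelta(minutes=5)

Both get() and post() would then start with a guard like if game_has_expired(gamesession): return Response(status=status.HTTP_408_REQUEST_TIMEOUT), which also answers the question about the POST request: it needs the same check rather than a separate mechanism.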

How to turn a Django Rest Framework API View into an async one?

I am trying to build a REST API that will manage some machine learning classification tasks. I have written an API view which, when hit, triggers the start of a classification task (such as training an SVM classifier with the data the user provided previously). However, this is a long-running task, so I would ideally not have the user wait once they have made a request to this view. Instead, I would like to start the task in the background and give them a response immediately. They can later view the results of the classification in a separate view (I haven't implemented that yet).
I am using ASGI_APPLICATION = 'mlxplorebackend.asgi.application' in settings.py.
Here's my API view in views.py
import asyncio
from concurrent.futures import ProcessPoolExecutor
from django import setup as SetupDjango
# ... other imports

loop = asyncio.get_event_loop()

def DummyClassification():
    result = sum(i * i for i in range(10 ** 7))
    print(result)
    return result

# ... other API views

class TaskExecuteView(APIView):
    """
    Once an API call is made to this view, the classification algorithm will start being processed.
    Depends on:
    1. Parser for the classification algorithm type and parameters
    2. Classification algorithm implementation
    """
    def get(self, request, taskId, *args, **kwargs):
        try:
            task = TaskModel.objects.get(taskId=taskId)
        except TaskModel.DoesNotExist:
            raise Http404
        else:
            # this is basically the classification task for now
            # need to turn this to an async view
            with ProcessPoolExecutor(initializer=SetupDjango) as pool:
                loop.run_in_executor(pool, DummyClassification)
            return Response({"message": "The task with id: {} has been started".format(task.taskId)}, status=status.HTTP_200_OK)
The problem I am facing is the following:
When I do not pass the initializer, i.e. with ProcessPoolExecutor() as pool:, I get django.core.exceptions.AppRegistryNotReady: Apps aren't loaded yet. (full traceback at: https://paste.ubuntu.com/p/ctjmFNYMXW/)
When I do use the initializer, the view no longer remains async; it blocks, and the response returns only after the task is completed, which takes about 5 seconds on my machine. I do realize I am not really making use of asyncio.sleep() inside my DummyClassification() function, but I can't figure out how to do so.
I am guessing this is not the way to do it, so any suggestions would be appreciated. I would like to avoid Celery if I can, since that seems a tad too complicated for me.
Edit:
If I get rid of ProcessPoolExecutor() and simply do loop.run_in_executor(None, DummyClassification), it works as expected, but then only one worker thread is working on the task, which doesn't seem remotely ideal for a classification task.
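As an aside, the blocking described above comes from the with statement itself: leaving a with ProcessPoolExecutor(...) block calls shutdown(wait=True), which waits for the submitted job to finish before the response can go out. A minimal sketch that avoids this by keeping one module-level pool alive (DummyClassification and SetupDjango as in the question):

# Sketch only: the pool is created once at import time and never shut down per
# request, so submit() just queues the job and the view can return immediately.
from concurrent.futures import ProcessPoolExecutor
from django import setup as SetupDjango

pool = ProcessPoolExecutor(initializer=SetupDjango)

def start_classification():
    # Returns a Future right away; the work runs in a separate worker process.
    return pool.submit(DummyClassification)

loop.run_in_executor() is only needed if the result is going to be awaited; for fire-and-forget work, executor.submit() is enough.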
This was a ride. I first went through the pain of setting up Celery, only to find that the original problem of the classification task using a single CPU core remained. Then I switched to django-rq with Redis, and it is currently working as expected.
from .tasks import Pipeline

class TaskExecuteView(APIView):
    """
    Once an API call is made to this view, the classification algorithm will start being processed.
    Depends on:
    1. Parser for the classification algorithm type
    2. Classification algorithm implementation
    """
    def get(self, request, taskId, *args, **kwargs):
        try:
            task = TaskModel.objects.get(taskId=taskId)
        except TaskModel.DoesNotExist:
            raise Http404
        else:
            Pipeline.delay(taskId)  # this is async now ✔
            # mark this as an in-progress task
            TaskModel.objects.filter(taskId=taskId).update(inProgress=True)
            return Response({"message": "The task with id: {}, title: {} has been started".format(task.taskId, task.taskTitle)}, status=status.HTTP_200_OK)
tasks.py
from django_rq import job

@job('default', timeout=3600)
def Pipeline(taskId):
    # classification task
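For completeness, the django-rq side of this also needs 'django_rq' in INSTALLED_APPS, a queue definition in settings.py, and a worker process. A minimal sketch of the wiring (host/port are placeholders for your Redis instance):

# settings.py -- queue used by the @job('default', ...) decorator above
RQ_QUEUES = {
    'default': {
        'HOST': 'localhost',  # placeholder
        'PORT': 6379,
        'DB': 0,
    },
}

The queued jobs are then picked up by a worker started with python manage.py rqworker default.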

How do we trigger multiple airflow dags using TriggerDagRunOperator?

I have a scenario in which a particular DAG, upon completion, needs to trigger multiple DAGs. I have used TriggerDagRunOperator to trigger a single DAG; is it possible to pass multiple DAGs to TriggerDagRunOperator so that it triggers them all?
And is it possible to trigger them only upon successful completion of the current DAG?
I have faced the same problem, and there is no solution out of the box, but we can write a custom operator for it.
Here is the code of a custom operator that takes python_callable and trigger_dag_id as arguments:
class TriggerMultiDagRunOperator(TriggerDagRunOperator):
    @apply_defaults
    def __init__(self, op_args=None, op_kwargs=None, *args, **kwargs):
        super(TriggerMultiDagRunOperator, self).__init__(*args, **kwargs)
        self.op_args = op_args or []
        self.op_kwargs = op_kwargs or {}

    def execute(self, context):
        session = settings.Session()
        created = False
        for dro in self.python_callable(context, *self.op_args, **self.op_kwargs):
            if not dro or not isinstance(dro, DagRunOrder):
                break
            if dro.run_id is None:
                dro.run_id = 'trig__' + datetime.utcnow().isoformat()
            dbag = DagBag(settings.DAGS_FOLDER)
            trigger_dag = dbag.get_dag(self.trigger_dag_id)
            dr = trigger_dag.create_dagrun(
                run_id=dro.run_id,
                state=State.RUNNING,
                conf=dro.payload,
                external_trigger=True
            )
            created = True
            self.log.info("Creating DagRun %s", dr)
        if created is True:
            session.commit()
        else:
            self.log.info("No DagRun created")
        session.close()
trigger_dag_id is the id of the DAG we want to run multiple times.
python_callable is a function; it should return a list of DagRunOrder objects, one object per scheduled instance of the DAG with dag_id trigger_dag_id.
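As a hedged usage example (get_records, target_dag and the surrounding dag object are placeholders, not part of the original answer), the callable and the custom operator fit together like this:

def generate_dag_runs(context):
    # One DagRunOrder per desired run of the target DAG; payload becomes its conf.
    return [DagRunOrder(payload={'record': rec}) for rec in get_records()]

trigger = TriggerMultiDagRunOperator(
    task_id='trigger_target_dag_runs',
    trigger_dag_id='target_dag',
    python_callable=generate_dag_runs,
    dag=dag,
)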
Code and examples on GitHub: https://github.com/mastak/airflow_multi_dagrun
A bit more description of this code: https://medium.com/@igorlubimov/dynamic-scheduling-in-airflow-52979b3e6b13
In Airflow 2, you can use dynamic task mapping. For example:
import uuid
import random
from datetime import datetime

from airflow.decorators import dag, task
from airflow.operators.trigger_dagrun import TriggerDagRunOperator

dag_args = {
    "start_date": datetime(2022, 9, 9),
    "schedule_interval": None,
    "catchup": False,
}

@task
def define_runs():
    num_runs = random.randint(3, 5)
    runs = [str(uuid.uuid4()) for _ in range(num_runs)]
    return runs

@dag(**dag_args)
def dynamic_tasks():
    runs = define_runs()

    run_dags = TriggerDagRunOperator.partial(
        task_id="run_dags",
        trigger_dag_id="hello_world",
        conf=None,
    ).expand(
        trigger_run_id=runs,
    )

    run_dags

dag = dynamic_tasks()
Docs here.
You can try looping it! For example:
for i in dag_ids:  # dag_ids is the list of DAG ids you want to trigger
    trigger_dag = TriggerDagRunOperator(task_id='trigger_' + i,
                                        trigger_dag_id=i,
                                        python_callable=conditionally_trigger_non_indr,
                                        dag=dag)
Set this as dependent on whichever upstream task is required. I have automated something like this for PythonOperator; you could try whether it works for you.
As the API docs state, the method accepts a single dag_id. However, if you want to unconditionally kick off downstream DAGs upon completion, why not just put those tasks in a single DAG and set your dependencies/workflow there? You would then be able to set depends_on_past=True where appropriate.
EDIT: Easy workaround if you absolutely need them in separate DAGs is to create multiple TriggerDagRunOperators and set their dependencies to the same task.

Celery group multiple tasks in one design

I am just getting familiar with Celery and have a question. My setup is Django-Redis-Celery.
Let's take the example of a task that sends an email:
TASKS
@task
def send_email(message):
    mailserver.sendOneMessage(message)
VIEWS
class newaccount(APIView):
    def post(self, request, format=None):
        send_email.delay(request.data.email)
This works perfectly: Django sends messages to Redis, they are picked up by Celery, and the task is executed. But I want to improve the system so that Celery picks up all the messages from Redis at certain intervals and executes a single task with multiple messages, because connecting to the email server is slow and sending multiple messages in a single request is much faster.
I want something like this to work:
TASKS
@task
def send_emails(messages):
    mailserver.sendMultipleMessages(messages)
Thoughts?
Since I am already using Redis as a cache (django-redis), I implemented the following workflow:
Step 1. Create a task that adds new emails to cache
@shared_task()
def add_email(user_id):
    cache.set("email#{}".format(user_id), None, timeout=None)
Step 2. Create a periodic task that runs every second and looks for new emails in the cache
class ProcessEmailsTask(PeriodicTask):
    run_every = timedelta(seconds=1)

    def run(self, **kwargs):
        call_email()

def call_email():
    item_exists = True
    ids = []
    while item_exists:
        try:
            key = next(cache.iter_keys("email#*"))
            ids.append(key.split("email#")[1])
            cache.delete_pattern(key)
        except StopIteration:
            # no more queued email keys left in the cache
            item_exists = False
    if len(ids) > 0:
        send_emails_to(ids)
Step 3. Run both celery workers and celery beat and profit!
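send_emails_to() is not shown in the answer; a minimal sketch of what it could look like, reusing the batch task from the question (build_messages_for is a hypothetical lookup helper, not part of the original code):

def send_emails_to(user_ids):
    # Hypothetical helper: turn the collected ids into messages and hand the
    # whole batch to the single Celery task, so one SMTP connection is used.
    messages = build_messages_for(user_ids)
    send_emails.delay(messages)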

How to have django give a HTTP response before continuing on to complete a task associated to the request?

In my Django Piston API, I want to yield/return an HTTP response to the client before calling another function that will take quite some time. How do I make the yield give an HTTP response containing the desired JSON and not a string describing a generator object?
My piston handler method looks like so:
def create(self, request):
    data = request.data

    *other operations......................*

    incident.save()
    response = rc.CREATED
    response.content = {"id": str(incident.id)}
    yield response
    manage_incident(incident)
Instead of the response I want, like:
{"id":"13"}
The client gets a string like this:
"<generator object create at 0x102c50050>"
EDIT:
I realise that using yield was the wrong way to go about this. In essence, what I am trying to achieve is that the client receives a response right away, before the server moves on to the time-costly manage_incident() call.
This doesn't have anything to do with generators or yielding, but I've used the following code and decorator to have things run in the background while returning the client an HTTP response immediately.
Usage:
@postpone
def long_process():
    do things...

def some_view(request):
    long_process()
    return HttpResponse(...)
And here's the code to make it work:
import atexit
import Queue
import threading

from django.core.mail import mail_admins

def _worker():
    while True:
        func, args, kwargs = _queue.get()
        try:
            func(*args, **kwargs)
        except:
            import traceback
            details = traceback.format_exc()
            mail_admins('Background process exception', details)
        finally:
            _queue.task_done()  # so we can join at exit

def postpone(func):
    def decorator(*args, **kwargs):
        _queue.put((func, args, kwargs))
    return decorator

_queue = Queue.Queue()
_thread = threading.Thread(target=_worker)
_thread.daemon = True
_thread.start()

def _cleanup():
    _queue.join()  # so we don't exit too soon

atexit.register(_cleanup)
Perhaps you could do something like this (be careful though):
import threading

def create(self, request):
    data = request.data

    # do stuff...

    t = threading.Thread(target=manage_incident,
                         args=(incident,))
    t.setDaemon(True)
    t.start()
    return response
Has anyone tried this? Is it safe? My guess is that it's not, mostly because of concurrency issues, but also because if you get a lot of requests you might end up with a lot of threads (since they might be running for a while). Still, it might be worth a shot.
Otherwise, you could just add the incident that needs to be managed to your database and handle it later via a cron job or something like that.
I don't think Django is built for either concurrency or very time-consuming operations.
Edit
Someone has tried it; it seems to work.
Edit 2
These kinds of things are often better handled by background jobs. The Django Background Tasks library is nice, but there are others, of course.
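For illustration, a minimal sketch of that background-jobs approach with the Django Background Tasks package (manage_incident_by_id is a hypothetical wrapper around the asker's manage_incident; the queue is drained by a separate python manage.py process_tasks process):

# Sketch only: calling the decorated function enqueues it; a worker running
# `python manage.py process_tasks` executes it later, after the response is sent.
from background_task import background

@background(schedule=0)
def manage_incident_by_id(incident_id):
    incident = Incident.objects.get(pk=incident_id)  # assumed model lookup
    manage_incident(incident)

The view then calls manage_incident_by_id(incident.id) and returns its HTTP response immediately; arguments must be JSON-serializable, which is why the id is passed rather than the object.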
You've turned your view into a generator, thinking that Django will pick up on that fact and handle it appropriately. Well, it won't.
def create(self, request):
    return HttpResponse(real_create(request))
EDIT:
Since you seem to be having trouble... visualizing it...
def stuff():
    print 1
    yield 'foo'
    print 2

for i in stuff():
    print i
output:
1
foo
2