Calling the same Celery task multiple times in Django

I am totally new to Celery and am trying to call the same Celery task multiple times in parallel.
Basically, my task creates some data in an external third-party app through an API and exports the created data along with a third-party-generated id. Now if I run the same task multiple times, the third-party-generated id changes to that of the latest instance of the task.
How can I solve this issue? I tried saving the third-party-generated id with the task id in the result backend, but then how do I access this data from the result backend? Is there any other way to do it?
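One common pattern is to have each task return the third-party id as its result; the result backend then keeps one value per task id, so parallel runs never overwrite each other. A minimal sketch, assuming a Redis result backend (the task name, helper function, and payloads are illustrative, not from the question):

    # tasks.py -- a minimal sketch, assuming a Redis result backend.
    from celery import Celery

    app = Celery("myapp",
                 broker="redis://localhost:6379/0",
                 backend="redis://localhost:6379/1")

    @app.task
    def create_remote_record(payload):
        # Call the third-party API here and return its generated id.
        # Each run's return value is stored in the result backend keyed
        # by its own task id, so parallel runs cannot overwrite each other.
        remote_id = call_third_party_api(payload)  # hypothetical helper
        return remote_id

    # Caller side (e.g. a Django view): every .delay() gets its own id.
    results = [create_remote_record.delay(p) for p in [{"n": 1}, {"n": 2}]]
    ids = [r.get(timeout=30) for r in results]  # one remote id per run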

Related

Options for running on-demand asynchronous / background tasks in Django?

My Django app generates a complex report that can take up to 5 minutes to create. Therefore it runs once a night using a scheduled management command.
That's been ok, except I now want the user to be able to select the date range for the report, which means the report needs to be created while the user waits.
What are my options for running the task in the background? So far I've found these:
Celery - might work but is complex
django-background-tasks looks like the right tool for the job, but it hasn't been updated for years; the last supported Django version is 2.2
The report/background task could be generated by AWS Lambda, basically in a microservice. Django calls the microservice, which executes the background task and then calls the Django app back once finished. This is what I did last time, but I'm not sure it would work now, as I'd need to send the microservice 10 MB of data to process.
Use subprocess.Popen, which someone here said worked for them, but other reports say it doesn't work from Django.
EDIT: It looks like Django 3.1 onwards supports async views, which may be the simple solution for this.
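To illustrate the EDIT: an async view (Django 3.1+, served over ASGI) can push the blocking report into a thread so the event loop stays responsive while this user waits. A minimal sketch, where build_report stands in for the existing report code (it and the query parameters are assumptions):

    # views.py -- a minimal sketch, assuming an ASGI deployment.
    from asgiref.sync import sync_to_async
    from django.http import JsonResponse

    def build_report(start, end):
        ...  # the existing blocking report code (up to ~5 minutes)

    async def report_view(request):
        start = request.GET.get("start")
        end = request.GET.get("end")
        # sync_to_async runs the blocking report in a thread, keeping
        # the event loop free for other requests while this one waits.
        report = await sync_to_async(build_report,
                                     thread_sensitive=False)(start, end)
        return JsonResponse({"report": report})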

Running a background task involving a third-party service in Django

The use case is this: we need to pull in data from a third-party service and update the database with fresh records every week. The different approaches I have been able to explore are:
either creating a custom django-admin command
or running a background task using Celery (and probably ELK for logging)
I just want to know which way is more feasible and simpler, and whether there's another way I can explore. What I want is to monitor the task for the first few runs and then just rely on the logs.
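If you go with Celery, the weekly schedule is usually handled by Celery beat. A minimal sketch, assuming a sync_records task (the names, broker URL, and schedule are illustrative):

    # celery.py -- a minimal sketch; names and schedule are assumptions.
    from celery import Celery
    from celery.schedules import crontab

    app = Celery("myproject", broker="redis://localhost:6379/0")

    app.conf.beat_schedule = {
        "weekly-third-party-sync": {
            "task": "myproject.tasks.sync_records",
            # Every Monday at 03:00; adjust as needed.
            "schedule": crontab(minute=0, hour=3, day_of_week=1),
        },
    }

The management-command route is even simpler: put the same logic in a custom command and trigger it from system cron, writing to whatever log your monitoring reads.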

Access to Django ORM from remote Celery worker

I have a Django application and a Celery worker, each running on its own server.
Currently, the Django app uses SQLite to store the data.
I'd like to access the database using Django's ORM from the worker.
Unfortunately, how to do this is not completely clear to me, so I have some questions.
Is it possible without hacks/workarounds? I'd like a simple solution (I would not like to implement a REST interface for object access). I imagine this could be achieved if I started using a PostgreSQL instance accessible from both servers.
Which project files (there's just Django plus a tasks.py file) are required on the worker's machine?
Could you provide me with an example or tutorial? I tried looking it up but found only tutorials/answers addressing local Celery workers.
I have been searching for ways to do this simply but... Your best option is to attach a kind of callback to the task function that will call another function on the Django server to carry out the database update.
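The shared-database idea from the question is the usual non-hacky answer: point both machines' DATABASES at the same PostgreSQL server and bootstrap Django on the worker before touching any models. A minimal worker-side sketch (the settings module, broker URL, and Report model are illustrative assumptions):

    # tasks.py on the worker machine -- a minimal sketch.
    import os

    import django
    from celery import Celery

    # The worker needs the project's settings (with DATABASES pointing
    # at the shared PostgreSQL server) and the app code on its path.
    os.environ.setdefault("DJANGO_SETTINGS_MODULE", "myproject.settings")
    django.setup()

    from myapp.models import Report  # import only after django.setup()

    app = Celery("myproject", broker="redis://localhost:6379/0")

    @app.task
    def mark_done(report_id):
        # Plain Django ORM calls work here; both servers see the same DB.
        Report.objects.filter(pk=report_id).update(done=True)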

Alternative to django-celery to perform asynchronous task in Django?

In my admin I have a form that lets me upload a file to fill the DB.
Parsing and filling the DB take a long time, so I'd like to do it asynchronously.
As recommended by several SO users, I tried to install python-celery, but I can't manage to do it (I'm on WebFaction).
Is there any simple, easy-to-install alternative?
If WebFaction supports cron jobs, you can create your own pseudo-broker. You could save your long-running tasks to a 'tasks' table in the db, which would let you return a response to the user instantaneously. Then a cron job could run very frequently, look for uncompleted tasks, and process them (see the sketch below the links).
I believe this is what django-mailer does:
https://github.com/jtauber/django-mailer/
https://stackoverflow.com/a/1419640/594589
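A minimal sketch of that pseudo-broker pattern, assuming a simple Task model and a management command fired from cron (all names and the parsing helper are illustrative):

    # models.py -- the 'tasks' table (names are illustrative).
    from django.db import models

    class Task(models.Model):
        payload = models.TextField()  # e.g. path of the uploaded file
        completed = models.BooleanField(default=False)
        created = models.DateTimeField(auto_now_add=True)

    # management/commands/process_tasks.py -- run from cron, e.g.:
    #   * * * * * /path/to/python manage.py process_tasks
    from django.core.management.base import BaseCommand
    from myapp.models import Task

    class Command(BaseCommand):
        help = "Process pending tasks saved by the upload form."

        def handle(self, *args, **options):
            for task in Task.objects.filter(completed=False):
                parse_and_fill_db(task.payload)  # hypothetical parsing step
                task.completed = True
                task.save()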
Try Gearman along with its Python client library.
It's very easy to set up and run Gearman. Try a few examples.
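A minimal sketch with the python-gearman client library, assuming a gearmand server on localhost:4730 (the task name and parsing helper are illustrative):

    # worker.py -- a minimal sketch; assumes `pip install gearman` and a
    # gearmand server listening on localhost:4730.
    import gearman

    def do_import(gearman_worker, gearman_job):
        # job.data holds whatever the client submitted (e.g. a file path).
        parse_and_fill_db(gearman_job.data)  # hypothetical parsing step
        return "done"

    worker = gearman.GearmanWorker(["localhost:4730"])
    worker.register_task("import_file", do_import)
    worker.work()  # blocks, processing jobs as they arrive

    # Client side (e.g. in the Django view handling the upload):
    client = gearman.GearmanClient(["localhost:4730"])
    client.submit_job("import_file", "/tmp/upload.csv", background=True)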

Where is the data provided by django-celery urls stored? How long is the data available? And what is the memory consumption?

I am starting a project using django-celery, and I am making AJAX calls to the task URLs provided by 'djcelery.urls'.
I would like to know a few things about this data:
Where is that information being stored? Is it read from the djcelery tables in my Django project's database, or is it kept on the RabbitMQ server? My understanding of the djcelery tables in my database is that they are only for monitoring usage with the camera.
If it is being stored on the RabbitMQ server, how long will the task status reports be available? How much memory does this data consume?
Do I need to flush the task status reports periodically to prevent a memory leak? How would this be done? By restarting the RabbitMQ server?
Thanks.
The results are stored in the CELERY_RESULT_BACKEND, which is disabled by default.
You can get the result of a task by creating a new celery.result.AsyncResult with the appropriate task_id (see: How do I get the result of a task if I have the ID that points there?).
Unless you set CELERY_AMQP_TASK_RESULT_EXPIRES, task results will never expire. You can manually remove the result of a task using AsyncResult.forget().
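Putting the answer's pieces together, a rough sketch using the (old-style, django-celery-era) setting names mentioned above; the task id shown is just an example value:

    # settings.py -- setting names as used in this answer.
    CELERY_RESULT_BACKEND = "amqp"          # keep task results in RabbitMQ
    CELERY_AMQP_TASK_RESULT_EXPIRES = 3600  # seconds before auto-deletion

    # Elsewhere: look up a result by its task id, then free its storage.
    from celery.result import AsyncResult

    result = AsyncResult("d9078da5-9915-40a0-bfa1-392c7bde42ed")  # example id
    print(result.status)           # e.g. "PENDING" or "SUCCESS"
    print(result.get(timeout=10))  # the task's return value
    result.forget()                # remove the stored result from the backend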