I have a Django app that performs a rather time-consuming statistical model run inside my views.py.
As the computation progresses in the view, I would like to inform the user periodically before the final HttpResponse, e.g.:
Step 1 completed
Step 2 running...
Is there a way to display a message to the front-end while the view is running?
Long-running tasks should be executed asynchronously. You can use django-celery for async tasks: from your view, start the task and redirect the user to a page where you display progress. From your Celery task, you can update a progress value as well, for example on a model field:
class Job(models.Model):
    ...
    progress = models.PositiveSmallIntegerField(default=0)
    ...
If you want to display the progress value dynamically, you need an API, or at least a view that you can hit via AJAX. Something like this:
import json

from django.http import HttpResponse

from .models import Job  # the Job model shown above


def progress_view(request, job_id):
    value = 0
    try:
        job = Job.objects.get(pk=job_id)
    except Job.DoesNotExist:
        job = None
    if job is not None:
        value = job.progress
    response = {"job": job_id, "progress": value}
    return HttpResponse(json.dumps(response), content_type='application/json')
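For the task side, here is a minimal sketch of how the progress field could be updated as the computation advances (the task name, the run_step helper, and the step count are illustrative assumptions, not part of the original answer):

from celery import shared_task

from .models import Job  # the Job model shown above


@shared_task
def run_statistical_model(job_id):
    job = Job.objects.get(pk=job_id)
    total_steps = 4  # however many stages the model run has
    for step in range(1, total_steps + 1):
        run_step(step)  # placeholder for the actual computation of this step
        job.progress = int(step * 100 / total_steps)
        job.save(update_fields=['progress'])

The front-end can then poll the progress view every second or two and render messages like "Step 2 running..." until progress reaches 100.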
Related
In my Django project, I have made a view class using TemplateView. I am also using Django Channels and have made a consumer class. Now, I am trying to use a Celery worker to pull queryset data whenever a user refreshes the page. But the problem is, if the user refreshes the page again before the task finishes, it creates another task, which causes overload.
So I have used revoke to terminate the previously running task. But I see that revoke permanently revokes the task id, and I don't know how to clear this, because I want to run the task again whenever the user calls it.
views.py
class Analytics(LoginRequiredMixin, TemplateView):
    template_name = 'app/analytics.html'
    login_url = '/user/login/'

    def get_context_data(self, **kwargs):
        app.control.terminate(task_id=self.request.user.username + '_analytics')
        print(app.control.inspect().revoked())
        context = super().get_context_data(**kwargs)
        context['sub_title'] = 'Analytics'
        return context
consumers.py
class AppConsumer(AsyncJsonWebsocketConsumer):
    async def connect(self):
        await self.accept()
        analytics_queryset_for_selected_devices.apply_async(
            args=[self.scope['user'].username],
            task_id=self.scope['user'].username + '_analytics'
        )
Right now I am solving the problem in the following way. In consumers.py I added a disconnect function that revokes the task when the WebSocket gets closed.
counter = 0


class AppConsumer(AsyncJsonWebsocketConsumer):
    async def connect(self):
        await self.accept()
        analytics_queryset_for_selected_devices.apply_async(
            args=[self.scope['user'].username],
            task_id=self.scope['user'].username + str(counter)
        )

    async def disconnect(self, close_code):
        global counter
        app.control.terminate(task_id=self.scope['user'].username + str(counter), signal='SIGKILL')
        counter += 1
        await self.close()
counter is used to make a new unique task id. But with this method, every request adds a new task id to the revoked list, which causes memory load. To minimize the issue, I limited the revoked list size to 20:
from celery.utils.collections import LimitedSet
from celery.worker import state
state.revoked = LimitedSet(maxlen=20, expires=3600)
This is a rookie question on web development. I am trying to find a secure and better way for developers to call and run a function that sends emails out from my Django application, one that they can override and trigger manually, and that can also be time-activated so the email is sent periodically at fixed times of the week.
I found an answer suggesting the use of Celery in Running a function periodically in Django.
But I want to change the periods without redeploying my application. I have done some research on AWS tools, and I think a combination of AWS API Gateway, AWS Lambda, and AWS CloudWatch could send a request to a URL/endpoint (a GET request) on my web app to activate the function.
At the moment I have something like below.
views.py
def api_send_email(request):
    # insert send email function
    print("sending")
    return redirect('/')
urls.py
urlpatterns = [
    url(r'^send_email$', api_send_email, name="api_send_email"),
]
So the above can either be triggered manually by going to the URL https://xxx/send_email or by sending a GET request to that URL periodically from AWS. I have thought about using a POST request instead, which would make it more secure, but I am not sure whether the AWS tools can do that, because it requires the CSRF token from my app itself.
Any suggestions on the best way to do this are welcome.
Thank you
I think you can accomplish this with Celery as well. For that, you can add a periodic task. Let's say you have a periodic task that runs every 5 minutes.
Then you can put the logic that decides whether the email should be sent at that time in a model. For example:
import datetime

from django.db import models
from django.utils import timezone


class YourTaskConfig(models.Model):
    SEND_CHOICES = (
        ('minute', 'minute'),
        ('hour', 'hour'),
        ('day', 'day'),
    )
    send_every = models.CharField(max_length=25, choices=SEND_CHOICES)
    interval_amount = models.IntegerField()
    last_executed = models.DateTimeField(auto_now_add=True)
    is_active = models.BooleanField(default=True)

    def should_run(self):
        now = timezone.now()
        if self.send_every == 'minute':
            td = datetime.timedelta(seconds=self.interval_amount * 60)
        elif self.send_every == 'day':
            td = datetime.timedelta(days=self.interval_amount)
        ...  # rest of the logic on the time deltas (e.g. the 'hour' branch)
        if now - self.last_executed >= td:
            self.last_executed = now
            self.save()  # updates the last execution time
            return True
        return False
Your Email model can have a FK to this configuration (if you have one):
class Email(models.Model):
    config = models.ForeignKey(YourTaskConfig, on_delete=models.DO_NOTHING)
And use it in a periodic task:
from celery.task.schedules import crontab
from celery.decorators import periodic_task


@periodic_task(run_every=crontab(minute='*/5'), name="some_task", ignore_result=True)  # runs every 5 minutes
def some_task():
    for i in YourTaskConfig.objects.filter(is_active=True):  # run only active tasks
        should_send_email = i.should_run()
        if should_send_email:
            i.email_set.all()  # here you go, you have the emails you want to send
FYI: this is untested code, but you can get the general idea behind the solution. Hope it helps!
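Note that newer Celery releases no longer ship the periodic_task decorator, so on Celery 5+ the same schedule can be configured through beat_schedule instead. A rough sketch, assuming an existing Celery app instance named app (the task body mirrors the one above):

from celery.schedules import crontab


@app.task(name="some_task", ignore_result=True)
def some_task():
    for config in YourTaskConfig.objects.filter(is_active=True):  # run only active tasks
        if config.should_run():
            config.email_set.all()  # the emails you want to send


app.conf.beat_schedule = {
    'send-emails-every-5-minutes': {
        'task': 'some_task',
        'schedule': crontab(minute='*/5'),
    },
}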
I've written a custom admin action and, through model permissions, have granted it to 2 users.
But now I only want one of them to be able to run it at any one time. So I was thinking that whenever they select the action checkbox and press the action button, it should check whether a request.POST is already being processed.
So my question is: can I check whether any other HTTP requests have been made before it takes the user to the intermediary page, and display a message, without having to mine the server logs?
To take a step back, I think what you're really asking is how to share data across entrypoints to your application, e.g. if you only wanted one person to be able to trigger an action on a button at a time.
One strategy for doing this is to take some deployment-wide accessible datastore (like a cache or a message queue that ALL instances of your deployment have access to) and put a message there that acts like a lock. This relies on that datastore supporting atomic reads and writes. Within Django, something like Redis or memcached works well for this purpose (especially if you're already using it as your cache backend).
You might have something that looks like this (example taken from the Celery docs):
from contextlib import contextmanager
from datetime import datetime, timedelta

from django.core.cache import cache

LOCK_EXPIRE = 600  # let the lock time out in case your code crashes


@contextmanager
def memcache_lock(lock_id):
    timeout_at = datetime.now() + timedelta(seconds=LOCK_EXPIRE)
    # cache.add fails if the key already exists
    status = cache.add(lock_id, 'locked', LOCK_EXPIRE)
    try:
        yield status
    finally:
        if datetime.now() < timeout_at and status:
            # don't release the lock if we exceeded the timeout,
            # to lessen the chance of releasing an expired lock
            # owned by someone else;
            # also don't release the lock if we didn't acquire it
            cache.delete(lock_id)


def my_custom_action(self, *args, **kwargs):
    lock_id = "my-custom-action-lock"
    with memcache_lock(lock_id) as acquired:
        if acquired:
            return do_stuff()
        else:
            do_something_else_if_someone_is_already_doing_stuff()
            return
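For this lock to work across processes (and across all instances of the deployment), the cache backend has to be a shared one; the default LocMemCache is per-process. A minimal settings sketch, assuming Django 3.2+ with a local memcached instance (the LOCATION value is a placeholder):

# settings.py
CACHES = {
    'default': {
        'BACKEND': 'django.core.cache.backends.memcached.PyMemcacheCache',
        'LOCATION': '127.0.0.1:11211',  # placeholder for your memcached host
    }
}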
I have a Django app that uses django-wkhtmltopdf to generate PDFs on Heroku. Some of the responses exceed the 30-second timeout. Because this is a proof of concept running on the free tier, I'd prefer not to tear apart what I have to move to a worker/poll process. My current view looks like this:
def dispatch(self, request, *args, **kwargs):
    do_custom_stuff()
    return super(MyViewClass, self).dispatch(request, *args, **kwargs)
Is there a way I can override the dispatch method of the view class to fake a streaming response like this, or with the Empty Chunking approach mentioned here, to send an empty response until the PDF is rendered? Sending an empty byte restarts the timeout, giving plenty of time to send the PDF.
I solved a similar problem using Celery, something like this.
from django.http import JsonResponse

def start_long_process_view(request, pk):
    task = do_long_processing_stuff.delay()
    return JsonResponse({'task': task.id})
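The task itself can be any ordinary Celery task; a minimal placeholder sketch (the body is an assumption standing in for the real long-running work):

# tasks.py
from celery import shared_task

@shared_task
def do_long_processing_stuff():
    # placeholder for the real work (rendering a PDF, crunching numbers, etc.)
    return 'done'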
Then you can have a second view that can check the task state.
from celery.result import AsyncResult
from django.http import JsonResponse

def check_long_process(request, task_id):
    result = AsyncResult(task_id)
    return JsonResponse({'state': result.state})
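A possible URL wiring for these two views (module path, routes, and names are assumptions, not from the original answer):

# urls.py
from django.urls import path

from . import views

urlpatterns = [
    path('start/<int:pk>/', views.start_long_process_view, name='start_long_process'),
    path('status/<str:task_id>/', views.check_long_process, name='check_long_process'),
]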
Finally, using JavaScript you can fetch the status right after the task is started. Polling every half second is more than enough to give your users good feedback.
If you think Celery is too much, there are lighter alternatives that will work just as well: https://djangopackages.org/grids/g/workers-queues-tasks/
I have a task that gets called from one view. Basically, the task is responsible for fetching some PDF data and saving it into S3 via django-storages.
Here is the view that kicks it off:
@login_required
@minimum_stage(STAGE_SIGN_PAGE)
def page_complete(request):
    if not request.GET['documentKey']:
        logger.error('Document Key was missing', exc_info=True, extra={
            'request': request,
        })
    user = request.user
    speaker = user.get_profile()
    speaker.readyForStage(STAGE_SIGN)
    speaker.save()
    retrieveSpeakerDocument.delay(user.id, documentKey=request.GET['documentKey'], documentType=DOCUMENT_PAGE)
    return render_to_response('speaker_registration/redirect.html', {
        'url': request.build_absolute_uri(reverse('registration_sign_profile'))
    }, context_instance=RequestContext(request))
Here is the task:
@task()
def retrieveSpeakerDocument(userID, documentKey, documentType):
    print('starting task')
    try:
        user = User.objects.get(pk=userID)
    except User.DoesNotExist:
        logger.error('Error selecting user while grabbing document', exc_info=True)
        return

    echosign = EchoSign(user=user)
    fileData = echosign.getDocumentWithKey(documentKey)

    if not fileData:
        logger.error('Error retrieving document', exc_info=True)
    else:
        speaker = user.get_profile()
        print(speaker)
        filename = "%s.%s.%s.pdf" % (user.first_name, user.last_name, documentType)
        if documentType == DOCUMENT_PAGE:
            afile = speaker.page_file
        elif documentType == DOCUMENT_PROFILE:
            afile = speaker.profile_file
        content = ContentFile(fileData)
        afile.save(filename, content)
        print("saving user in task")
        speaker.save()
In the meantime, my next view hits (actually it's an AJAX call, but that doesn't matter). Basically it fetches the code for the next embedded document. Once it gets it, it updates the speaker object and saves it:
@login_required
@minimum_stage(STAGE_SIGN)
def get_profile_document(request):
    user = request.user
    e = EchoSign(request=request, user=user)
    e.createProfile()

    speaker = user.get_profile()
    speaker.profile_js = e.javascript
    speaker.profile_echosign_key = e.documentKey
    speaker.save()

    return HttpResponse(True)
My task works properly and updates the speaker.page_file property correctly. (I can temporarily see this in the admin, and can also watch it happen in the Postgres logs.)
However, it soon gets stamped over, I believe by the call in the get_profile_document view after it updates and saves the profile_js property. In fact, I know this is where it happens based on the SQL statements: it's there before profile_js is updated, then it's gone.
Now, I don't really understand why. The speaker is fetched right before each update and save, and there's no real caching going on here yet, unless get_profile() does something weird. What is going on, and how might I avoid this? (Also, do I need to call save on speaker after running save on the FileField? It seems like there are duplicate calls in the Postgres logs because of this.)
Update
Pretty sure this is due to Django's default view transaction handling. The view begins a transaction, takes a long time to finish, and then commits, overwriting the object I've already updated in a Celery task.
I'm not exactly sure how to solve for it. If I switch the method to manual transactions and commit right after I fetch the EchoSign JS (which takes 5-10 seconds), does it start a new transaction? That didn't seem to work.
Maybe not
I don't have TransactionMiddleware added in. So unless it's happening anyway, that's not the problem.
Solved.
So here's the issue.
Django apparently keeps a cache of objects that it doesn't think have changed anywhere. (Correct me if I'm wrong.) Since Celery was updating my object in the database outside of Django, it had no idea the object had changed and fed me the cached version back when I called user.get_profile().
The solution that forces it to grab from the database is simply to re-fetch it by its own id. It's a bit silly, but it works.
speaker = user.get_profile()
speaker = Speaker.objects.get(pk=speaker.id)
Apparently the Django authors don't want to add any kind of refresh() method onto objects, so this is the next best thing.
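For reference, Django 1.8 and later did add Model.refresh_from_db(), which reloads the instance from the database in place:

speaker = user.get_profile()
speaker.refresh_from_db()  # Django 1.8+: re-reads the row from the database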
Using transactions also MIGHT solve my problem, but that's for another day.
Update
After further digging, it's because the user model has a _profile_cache property on it, so that it doesn't re-fetch the profile every time you call get_profile() on the same object within one request. Since I was using get_profile() in the EchoSign function on the same object, it was being cached.