Is there a simple way to run a number of Python requests asynchronously? - django

There is an API, localhost:8000/api/postdatetime/, which is responsible for changing a driver's online status. Every time a driver hits the API and a response is sent back, the driver's status is set to online; otherwise the driver's status is offline.
Drivers need to hit the API every 10 seconds. If there is no request from a driver within that window, he/she is automatically marked offline.
Currently I am getting all the logged-in drivers and hitting the above-mentioned API one driver at a time.
For simulation purposes, I populated thousands of drivers. Maintaining the online/offline status of thousands of drivers with this one-at-a-time approach leaves many of them offline.
The code for my approach is as described below:
online-offline.py
import requests
import random
import math

from rest_framework.authtoken.models import Token
from ** import get_logged_in_driver_ids
from ** import Driver


def post_date_time():
    url = 'localhost:8000/api/postdatetime/'
    while True:
        # getting all logged in driver ids
        logged_in_driver_ids = get_logged_in_driver_ids()
        # filtering the drivers who are logged in and storing their user ids in driver_query
        driver_query = Driver.objects.only('id', 'user_id').filter(id__in=logged_in_driver_ids).values('user_id')
        # storing the user ids of the drivers in a list
        driver_user_ids = [driver['user_id'] for driver in driver_query]
        # getting the number of drivers who are supposed to hit the API
        drivers_subset_count = math.ceil(len(driver_user_ids) * (random.randint(75, 85) / 100))
        # shuffling the driver id list to randomize the order
        random.shuffle(driver_user_ids)
        # getting the driver list of length drivers_subset_count
        required_drivers = driver_user_ids[:drivers_subset_count]
        for driver in required_drivers:
            token = Token.objects.only('key', 'user_id').get(user_id=driver)
            req_headers = {
                'Content-Type': 'application/json',
                'Accept': 'application/json',
                'Authorization': 'Token ' + str(token)
            }
            response = requests.post(url=url, headers=req_headers)
            print(response.text)


if __name__ == '__main__':
    post_date_time()
Is there any way I can POST to the localhost:8000/api/postdatetime/ API asynchronously and handle about 2-3000 drivers within 10 seconds?
I am really new to writing async code. I have read some articles on the aiohttp library but got confused when trying to implement it.
This online-offline.py will run for the whole duration of the simulation.
Any help will be highly appreciated. Thank you :)

In the case of a non-local API, asyncio can help you. To make requests asynchronously you have to:
use special syntax (async def, await - read here for details)
make requests in a non-blocking way (so you can await them)
use asyncio.gather() or a similar construct to make multiple requests in parallel
start the event loop
While it is possible to make the requests library work with asyncio using threads, it's easier and better to use an already asyncio-compatible library like aiohttp.
Take a look at the code snippets here*: they contain examples of making multiple concurrent requests. Rewrite your code the same way, and it'll look like:
import asyncio


async def driver_request(driver):
    # ...
    # use aiohttp to make a request with custom headers:
    # https://docs.aiohttp.org/en/stable/client_advanced.html#custom-request-headers


async def post_date_time():
    # ...
    await asyncio.gather(*[
        driver_request(driver)
        for driver
        in required_drivers
    ])


asyncio.run(post_date_time())
Locally you won't see any effect by default, so to test it you have to make localhost send its response with a delay, to emulate real-life network latency.
* In the latest Python versions the event loop can be started with just asyncio.run()
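To make that sketch more concrete, here is a minimal, untested version of one round of POSTs using aiohttp, with an asyncio.Semaphore to cap the number of simultaneous connections. The Token lookup mirrors the question's code; the MAX_CONCURRENT value and the build_headers() helper are my own assumptions to tune or rename as needed. The ORM queries are done before entering the event loop, so no Django database calls run inside async code.

# online-offline sketch with aiohttp - handles one batch of required_drivers
import asyncio

import aiohttp
from rest_framework.authtoken.models import Token

URL = 'http://localhost:8000/api/postdatetime/'
MAX_CONCURRENT = 200  # assumed cap on simultaneous requests; tune for your setup


def build_headers(token_key):
    # hypothetical helper: same headers as in the original loop
    return {
        'Content-Type': 'application/json',
        'Accept': 'application/json',
        'Authorization': 'Token ' + token_key,
    }


async def driver_request(session, semaphore, token_key):
    # one POST per driver, limited by the semaphore
    async with semaphore:
        async with session.post(URL, headers=build_headers(token_key)) as response:
            return await response.text()


async def post_all(token_keys):
    semaphore = asyncio.Semaphore(MAX_CONCURRENT)
    async with aiohttp.ClientSession() as session:
        return await asyncio.gather(*[
            driver_request(session, semaphore, key) for key in token_keys
        ])


def post_date_time(required_drivers):
    # fetch the tokens synchronously, before starting the event loop
    token_keys = [
        Token.objects.only('key', 'user_id').get(user_id=driver).key
        for driver in required_drivers
    ]
    return asyncio.run(post_all(token_keys))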

Related

How to stop/cancel a training session (Python)?

I have a function "train_model" which is called via a "train" (Flask) API. Once this API is triggered, training of a model starts; on completion it saves the model. But I want to introduce a "cancel" API, which will stop the training and should return a valid response for the "train" API.
You should probably consider using multiprocessing: run the model training in a separate process and respond with a unique request_id, which is stored in a cache/db. Whenever you want to cancel later, your API takes the request_id and stops that process (or responds 404 if the id is no longer in the cache/db); when training completes, the process removes its id from the cache/db before exiting. Since this is quite a lot of re-inventing the wheel, you could simply consider using Celery instead. A sketch of the matching cancel endpoint follows the sample below.
Sample untested code with a multiprocessing pool:
from multiprocessing import Pool

jobs = {}


@app.route("/api/train", methods=['POST'])
def train_it():
    with Pool() as process_pool:
        # start training in a worker process and remember the job by request id
        job = process_pool.apply_async(train_model_func, args=(your_inputs,))
        jobs['12455ABC'] = job
    return "OK I got you! here is your request id 12455ABC"

Multiple concurrent requests in Django async views

From version 3.1 Django supports async views. I have a Django app running on uvicorn. I'm trying to write an async view, which can handle multiple requests to itself concurrently, but have no success.
Common examples I've seen include making multiple slow I/O operations from inside the view:
async def slow_io(n, result):
    await asyncio.sleep(n)
    return result


async def my_view(request, *args, **kwargs):
    task_1 = asyncio.create_task(slow_io(3, 'Hello'))
    task_2 = asyncio.create_task(slow_io(5, 'World'))
    result = await task_1 + await task_2
    return HttpResponse(result)
This will yield us "HelloWorld" after 5 seconds instead of 8, because the two operations run concurrently.
What I want is to concurrently handle multiple requests TO my_view. E.g. I expect the following code to handle 2 simultaneous requests in 5 seconds, but it takes 10.
async def slow_io(n, result):
    await asyncio.sleep(n)
    return result


async def my_view(request, *args, **kwargs):
    result = await slow_io(5, 'result')
    return HttpResponse(result)
I run uvicorn with this command:
uvicorn --host 0.0.0.0 --port 8000 main.asgi:application --reload
Django doc says:
The main benefits are the ability to service hundreds of connections without using Python threads.
So it's possible.
What am I missing?
UPD:
It seems my testing setup was wrong. I was opening multiple tabs in the browser and refreshing them all at once. See my answer for details.
Here is a sample project on Django 3.2 with multiple async views and tests. I tested it multiple ways:
Requests from Django's test client are handled simultaneously, as expected.
Requests to different views from a single client are handled simultaneously, as expected.
Requests to the same view from different clients are handled simultaneously, as expected.
What doesn't work as expected?
Requests to the same view from a single client are handled one at a time, and I didn't expect that.
There is a warning in Django doc:
You will only get the benefits of a fully-asynchronous request stack if you have no synchronous middleware loaded into your site. If there is a piece of synchronous middleware, then Django must use a thread per request to safely emulate a synchronous environment for it.
Middleware can be built to support both sync and async contexts. Some of Django’s middleware is built like this, but not all. To see what middleware Django has to adapt, you can turn on debug logging for the django.request logger and look for log messages about “Synchronous middleware … adapted”.
So it might be some sync middleware causing the problem even in bare Django. But the doc also states that Django should fall back to threads in that case, and I didn't add any sync middleware, only the standard ones.
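For reference, a minimal LOGGING setting along these lines (my own sketch, not from the original post) turns on the debug output the doc mentions, so you can see whether any middleware gets adapted:

# settings.py - enable debug logging for the django.request logger to surface
# the "Synchronous middleware ... adapted" messages mentioned in the docs
LOGGING = {
    'version': 1,
    'disable_existing_loggers': False,
    'handlers': {
        'console': {'class': 'logging.StreamHandler'},
    },
    'loggers': {
        'django.request': {
            'handlers': ['console'],
            'level': 'DEBUG',
        },
    },
}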
My best guess is that it's related to the client-server connection and not to the sync/async stack. Even though Google says that browsers create new connections for each tab for security reasons, I think:
The browser might keep the same connection for multiple tabs if the URL is the same, for economy reasons.
Django creates async tasks or threads per connection, and not per request as it states.
I need to check this out; a rough test sketch follows.
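One way to check this hypothesis outside the browser is to fire concurrent requests from separate client sessions, so connection reuse is out of the picture. A rough sketch, assuming the slow view from the question is served at http://localhost:8000/slow/ (the URL and request count are made up):

# Fire N concurrent requests, each from its own aiohttp session (its own connection),
# and time how long the whole batch takes.
import asyncio
import time

import aiohttp

URL = 'http://localhost:8000/slow/'  # assumed route for the async view


async def fetch_once():
    # a separate ClientSession per request keeps the connections separate
    async with aiohttp.ClientSession() as session:
        async with session.get(URL) as response:
            return await response.text()


async def main(n=2):
    start = time.monotonic()
    await asyncio.gather(*[fetch_once() for _ in range(n)])
    print('%d requests took %.1fs' % (n, time.monotonic() - start))


if __name__ == '__main__':
    asyncio.run(main())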
Your problem is that you write the code like a sync version: you await each function's result, and only after that do you await the next function.
You simply need to use asyncio functions like gather to run all tasks concurrently:
import asyncio


async def slow_io(n, result):
    await asyncio.sleep(n)
    return result


async def my_view(request, *args, **kwargs):
    all_tasks_result = await asyncio.gather(slow_io(3, 'Hello '), slow_io(5, "World"))
    result = "".join(all_tasks_result)
    return HttpResponse(result)

How to show progress status of requests response download into web page in Django

I built an application in Django, and I have a function to get the attendance log from a fingerprint machine, basically like this:
import requests
from xml.etree import ElementTree


def main(request):
    Key = "xxxxxx"
    url = "http://192.168.2.188:80"
    soapreq = "<GetAttLog><ArgComKey xsi:type=\"xsd:integer\">" + Key + "</ArgComKey><Arg><PIN xsi:type=\"xsd:integer\">All</PIN></Arg></GetAttLog>"
    http_headers = {
        "Accept": "application/soap+xml,multipart/related,text/*",
        "Cache-Control": "no-cache",
        "Pragma": "no-cache",
        "Content-Type": "text/xml; charset=utf-8"
    }
    response = requests.post(url + "/iWsService", data=soapreq, headers=http_headers)
    root = ElementTree.fromstring(response.content)
Now, that process will be repeated for a hundred-plus fingerprint machines, and I need to display some kind of progress status, as well as error messages (if any, e.g. a connection cannot be established), on a page, periodically and sequentially after each event. I mean something like:
....
"machine 1 : download finished."
"downloading data from machine 2 .. Please wait "
"machine 2 : download finished."
...
Thanks.
I'm not really sure what the main hurdle you're facing is. Can you try framing the question more precisely?
From what I understand, you want a page that changes dynamically based on something that's happening on the server. I think Django may not be the best tool for that, as its basic use is to compute a full view in one go and then display it. However, there are a few things that can be done.
The easiest (but not the best in terms of server load) would be to use AJAX requests from the webpage:
Have a "main view" that loads the user-facing page as well as some JS libraries (e.g. jQuery), which will be used to query the server for progress.
Have a "progress view" that displays the current status and that is queried from the main view via AJAX.
This is not the best architecture because you may end up reloading data from the server too often or not often enough. I would suggest having a look at WebSockets, which allow you to keep a client-server connection open and use it only when needed, but there is no native support for them in Django.
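As an illustration of the AJAX-polling option, a minimal "progress view" might store a status string in Django's cache while the download loop runs and return it as JSON. This is a rough sketch: the cache key, the download_all_machines job and the fetch_attendance_log helper are assumptions, not part of the original question.

# views.py - sketch: the long-running loop updates a cache key after each machine,
# and the AJAX-polled view simply returns the latest status as JSON.
import requests

from django.core.cache import cache
from django.http import JsonResponse


def download_all_machines(machines):
    # hypothetical background job (run from a management command, thread or task queue)
    for i, machine in enumerate(machines, start=1):
        cache.set('attlog_progress', 'downloading data from machine %d .. Please wait' % i)
        try:
            fetch_attendance_log(machine)  # wraps the SOAP request from the question
            cache.set('attlog_progress', 'machine %d : download finished.' % i)
        except requests.RequestException as exc:
            cache.set('attlog_progress', 'machine %d : error - %s' % (i, exc))


def progress(request):
    # queried from the main page via AJAX every few seconds
    return JsonResponse({'status': cache.get('attlog_progress', 'idle')})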

HTTP call after celery task has changed state

I need a scheduler for my next project, and since I'm coding using Django I went for Celery.
What I am looking for is a way for a task to tell Django when it is done, so I can update the database and use SSE to tell the user. All this can be done fairly simply by just putting all the logic into the task. But what do I do when I am planning to have several Celery workers?
I found a bunch of info online covering the single-worker case, but not much covering the problem if you have more than one worker.
What I thought about was using HTTP callbacks from the workers to the web server to let it know that the task is done. Looking at celery.task.http looked promising, but it didn't do what I needed.
Is the solution to use signals and hook up manual http calls? Or am I on the wrong path? Isn't this a common problem? How can this be solved more elegantly?
So, what do you mean by "tell Django"? If I understand you right, the Django request which initialized the Celery task is still alive at the time this task finishes? In that case you can check some storage (database, memcached, etc.) and send your SSE.
Look, here is one way to do that:
1. Your Django view sends a task to Celery, after which it goes into a loop (infinite, or with a 60-second timeout?) and waits for the result in memcached.
2. Celery gets the task, executes it, and puts the result into memcached.
3. The Django view sees the new result, exits the loop and sends your SSE.
The next variant is:
1. The Django view sends a task to Celery and returns.
2. Celery executes the task, and after executing it makes a simple HTTP request to your Django app.
3. Django receives the HTTP request from Celery, parses the params and sends the SSE to your user again. (A rough sketch of the first variant follows.)
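A hedged sketch of the first variant, using Celery's own result backend instead of raw memcached; the long_job task, the URL wiring and the 60-second timeout are assumptions, not part of the original answer.

# views.py - first variant: the view blocks (with a timeout) and polls for the
# task result instead of returning immediately.
import time

from django.http import HttpResponse

from myapp.tasks import long_job  # hypothetical Celery task


def start_and_wait(request):
    result = long_job.delay()        # send the task to Celery
    deadline = time.time() + 60      # give up after 60 seconds
    while time.time() < deadline:
        if result.ready():           # the worker has stored the result in the backend
            return HttpResponse('done: %s' % result.get())
        time.sleep(0.5)              # poll twice per second
    return HttpResponse('still running', status=202)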
Here is some code that seems to do what I want:
In Django settings:
CELERY_ANNOTATIONS = {
    "*": {
        "on_failure": celery_handlers.on_failure,
        "on_success": celery_handlers.on_success
    }
}
And in the included celery_handlers.py file:
def on_failure(self, exc, task_id, *args, **kwargs):
    # Use urllib or similar to poke e.g. api-int.mysite.com/task_handler/TASK_ID
    pass


def on_success(self, retval, task_id, *args, **kwargs):
    # Use urllib or similar to poke e.g. api-int.mysite.com/task_handler/TASK_ID
    pass
And then you can just set up api-int to use something like:
from celery.result import AsyncResult

task_obj = AsyncResult(task_id)
# Logic to handle task_obj.result and related goes here....
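For completeness, the api-int endpoint that those hooks poke could be a small view along these lines; the handle_result/handle_failure helpers and the URL wiring are assumptions:

# A sketch of the task_handler endpoint, e.g. routed as /task_handler/<task_id>/
from celery.result import AsyncResult
from django.http import HttpResponse


def task_handler(request, task_id):
    task_obj = AsyncResult(task_id)
    if task_obj.successful():
        handle_result(task_obj.result)   # hypothetical: update the DB / push SSE
    else:
        handle_failure(task_id)          # hypothetical: log / notify about the failure
    return HttpResponse('ok')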

App Engine local datastore content does not persist

I'm running some basic test code, with web.py and GAE (Windows 7, Python27). The form enables messages to be posted to the datastore. When I stop the app and run it again, any data posted previously has disappeared. Adding entities manually using the admin (http://localhost:8080/_ah/admin/datastore) has the same problem.
I tried setting the path in the Application Settings using Extra flags:
--datastore_path=D:/path/to/app/
(Wasn't sure about the syntax there.) It had no effect. I searched my computer for *.datastore and couldn't find any files either, which seems suspect, although the data is obviously being stored somewhere while the app is running.
from google.appengine.ext import db
import web

urls = (
    '/', 'index',
    '/note', 'note',
    '/crash', 'crash'
)

render = web.template.render('templates/')


class Note(db.Model):
    content = db.StringProperty(multiline=True)
    date = db.DateTimeProperty(auto_now_add=True)


class index:
    def GET(self):
        notes = db.GqlQuery("SELECT * FROM Note ORDER BY date DESC LIMIT 10")
        return render.index(notes)


class note:
    def POST(self):
        i = web.input('content')
        note = Note()
        note.content = i.content
        note.put()
        return web.seeother('/')


class crash:
    def GET(self):
        import logging
        logging.error('test')
        crash


app = web.application(urls, globals())


def main():
    app.cgirun()


if __name__ == '__main__':
    main()
UPDATE:
When I run it via command line, I get the following:
WARNING 2012-04-06 19:07:31,266 rdbms_mysqldb.py:74] The rdbms API is not available because the MySQLdb library could not be loaded.
INFO 2012-04-06 19:07:31,778 appengine_rpc.py:160] Server: appengine.google.com
WARNING 2012-04-06 19:07:31,783 datastore_file_stub.py:513] Could not read datastore data from c:\users\amy\appdata\local\temp\dev_appserver.datastore
WARNING 2012-04-06 19:07:31,851 dev_appserver.py:3394] Could not initialize images API; you are likely missing the Python "PIL" module. ImportError: No module named _imaging
INFO 2012-04-06 19:07:32,052 dev_appserver_multiprocess.py:647] Running application dev~palimpsest01 on port 8080: http://localhost:8080
INFO 2012-04-06 19:07:32,052 dev_appserver_multiprocess.py:649] Admin console is available at: http://localhost:8080/_ah/admin
Suggesting that the datastore... didn't install properly?
As of 1.6.4, we stopped saving the datastore after every write. This method did not work when simulating the transactional model found in the High Replication Datastore (you would lose the last couple of writes). It is also horribly inefficient. We changed it so the datastore dev stub flushes all writes and saves its state on shutdown. It sounds like the dev_appserver is not shutting down correctly. You should see:
Applying all pending transactions and saving the datastore
in the logs when shutting down the server (see the source code). If you don't, it means that the dev_appserver is not being shut down cleanly (with a TERM signal or KeyboardInterrupt).
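If the data still does not persist, one thing to try (my assumption based on the old SDK's flags, not something stated in the answer) is to point --datastore_path at an explicit file rather than a directory when starting the dev server from the command line, and always stop it with Ctrl+C so the flush can run, e.g.:
dev_appserver.py --datastore_path=D:/path/to/app/app.datastore D:/path/to/app/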