Best WSGI server for handling webhooks with few resources - Flask

I am currently working on a virtual server with 2 CPUs and 4 GB of RAM, running Flask + uWSGI + nginx to host the web server. The server needs to accept about 10 of the roughly 2,500 webhook requests it receives per day. The requests that don't pass the check average about 2 ms, yet the queue is consistently backed up. The issues I have been encountering lately are both speed and duplication when it does work: the accepted webhooks are forwarded to another server, and I either get duplicates or miss a bunch entirely.
[uwsgi]
module = wsgi
master = true
processes = 4
enable-threads = true
threads = 2
socket = API.sock
chmod-socket = 660
vacuum = true
harakiri = 10
die-on-term = true
This is my current .ini file. I have messed around with harakiri and read through the uWSGI documentation for countless hours trying different things; it is unbelievably frustrating.
(Screenshot of "systemctl status API" output omitted.)
The check for it looks similar to this (I have redacted some info):
from flask import request  # app, SECRET, check_callee and create_json are defined elsewhere (redacted)

@app.route('/api', methods=['POST'])
def handle_callee():
    authorization = request.headers.get('authorization')
    if authorization == SECRET and check_callee(request.json):
        data = request.json
        name = data["payload"]["object"]["caller"]["name"]
        create_json(name, data)
        return 'success', 200
    else:
        return 'failure', 204
The JSON is then parsed by a number of functions. This is my first time deploying a WSGI service and I don't know if my configuration is incorrect; I've poured hours of research into trying to fix this. Should I try switching to Gunicorn? I asked this question differently a couple of days ago but to no avail, so I am trying to add more context in hopes someone can point me in the right direction. I don't even know whether the "req: 12/31" in the systemctl status is how many requests that PID has handled so far versus what's queued for it. Any insight into this situation would make my week. I've been trying different configs for about two weeks: increasing workers and processes, messing with harakiri, disabling logging. But none of this has gotten the requests to process at the speed I need.
Thank you to anyone who took the time to read this. I am still learning and have tried to add as much context as possible; if you need more I will gladly respond. I just can't wrap my head around this issue.

You would need to take a systematic approach to figure out:
How many requests per second you can handle
What your app's bottlenecks and scaling factors are
CloudBees has written a great article on performance tuning for uWSGI + Flask + nginx.
To give an overview of the steps to tune your service, here is what it might look like:
1. First, make sure you have the required tooling, particularly a benchmarking tool like Apache Bench (ab) or k6.
2. Establish a baseline. This means configuring your application with the minimum setup it needs to run, i.e. a single process and a single thread, no multi-threading. Run the benchmark and record the results.
3. Start tweaking the setup. Add threads, processes, etc.
4. Benchmark after each tweak.
5. Repeat steps 3 & 4 until you find the upper limits and understand the service's characteristics: are you CPU or I/O bound?
6. Try changing the hardware/VM, as some offerings come with performance penalties due to CPU shared with other tenants, bandwidth, etc.
Tip: Run the benchmark tool from a different system than the one you are benchmarking, since the tool itself consumes resources and loads the system further.
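For instance, a quick Apache Bench run against the webhook endpoint might look like this (the URL, payload file, header value, and request counts are illustrative placeholders, not taken from your setup):

ab -n 1000 -c 10 -T 'application/json' -H 'authorization: YOUR_SECRET' -p payload.json http://127.0.0.1/api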
In your code sample you have two functions, create_json(name, data) and check_callee(request.json); do you know their performance?
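If not, timing them in isolation is a quick first check. Here is a minimal sketch using time.perf_counter; importing from the wsgi module matches the module = wsgi line in your ini, but the sample payload is made up:

import time

from wsgi import check_callee, create_json  # module name taken from the ini; adjust if needed

def timed(fn, *args):
    # Run fn once and report its wall-clock duration in milliseconds.
    start = time.perf_counter()
    result = fn(*args)
    print(f"{fn.__name__}: {(time.perf_counter() - start) * 1000:.2f} ms")
    return result

# Hypothetical payload shaped like the one the handler parses.
sample = {"payload": {"object": {"caller": {"name": "test"}}}}
timed(check_callee, sample)
timed(create_json, "test", sample)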
Note: I can't comment yet, so I had to write this as an answer.

Related

How can my Heroku Flask web application support N users concurrently downloading an image file?

I am working on a Flask web application using Heroku. As part of the application, users can request to download an image from the server. That calls a function which then has to retrieve multiple images from my cloud storage (about 500 KB in total), apply some formatting, and return a single image (about 60 KB). It looks something like this:
import io

from flask import Flask, Response, request

app = Flask(__name__)

@app.route('/download_image', methods=['POST'])
def download_image():
    # Retrieve about 500 KB of images from cloud storage
    base_images = retrieve_base_images(request.form)
    # Apply image formatting into a single image
    formatted_image = format_images(base_images)
    # Return image of about 60 KB for download
    formatted_image_file = io.BytesIO()
    formatted_image.save(formatted_image_file, format='JPEG')
    formatted_image_data = formatted_image_file.getvalue()
    return Response(formatted_image_data,
                    mimetype='image/jpeg',
                    headers={'Content-Disposition': 'attachment; filename=download.jpg'})
My Procfile is
web: gunicorn my_app:app
How can I design/configure this to support N concurrent users? Let's say, for example, I want to make sure my application can support 100 different users all requesting to download an image at the same time. With several moving parts, I am unsure how to even go about doing this.
Also, if someone requests a download but then loses internet connection before their download is complete, would this cause some sort of lock that could endlessly stall, or would that thread/process automatically timeout after a short period and therefore be handled smoothly?
I currently have 1 dyno (on the Heroku free plan). I am willing to add more dynos if needed.
Run multiple Gunicorn workers:
Gunicorn forks multiple system processes within each dyno to allow a Python app to support multiple concurrent requests without requiring them to be thread-safe. In Gunicorn terminology, these are referred to as worker processes (not to be confused with Heroku worker processes, which run in their own dynos).
…
We recommend setting a configuration variable for this setting. Gunicorn automatically honors the WEB_CONCURRENCY environment variable, if set.
heroku config:set WEB_CONCURRENCY=3
Note that Heroku sets a default WEB_CONCURRENCY for you based on your dyno size. You can probably handle a small number of concurrent requests right now.
However, you're not going to get anywhere close to 100 on a free dyno. This section appears between the previous two in the documentation:
Each forked system process consumes additional memory. This limits how many processes you can run in a single dyno. With a typical Django application memory footprint, you can expect to run 2–4 Gunicorn worker processes on a free, hobby or standard-1x dyno. Your application may allow for a variation of this, depending on your application’s specific memory requirements.
Even if your application is very lightweight, you probably won't be able to go above 6 workers on a single small dyno. Adding more dynos and/or moving to larger dynos will be required.
Do you really need to support 100 concurrent requests? If you have four workers going, four users' requests can be served at the same time. If a fifth user makes a request, it simply won't get a response until one of the workers frees up. That's usually reasonable.
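If you prefer to pin the worker count in version control rather than in a config var, you can also pass it on the Procfile command line; a sketch (the module name follows your question, the count is illustrative):

web: gunicorn my_app:app --workers 4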
If your request takes an unreasonable amount of time to complete you have a few options besides adding more workers:
Can you cache the generated images?
Can you return a response immediately, create the images in a background job, and then notify the user that the images are ready? With some fancy front-end work this can be fairly transparent to the end user.
The right solution will depend on your specific use case. Good luck!

Gunicorn not responding

I'm using Gunicorn to serve a Django application. It was working alright till I changed its timeout from 30 s to 900,000 s; I had to do this because of a use case in which a huge file needs to be uploaded and processed (taking more than 30 minutes in some cases). After this change, Gunicorn goes unresponsive after a few hours. I guess the problem is that all 30 workers end up busy with some requests after this amount of time; the weird thing is it happens even if I don't run that long request at all, just with normal browsing in the Django admin. I want to know if there's a way to monitor requests on Gunicorn and see which requests the workers are busy with, so I can find out what is making them busy. I tried --log-file=- --log-level=debug but it doesn't say anything about requests; I need more detailed logs.
From the latest Gunicorn docs, on the cmd line/script you can use:
--log-file - ("-" means log to stderr)
--log-level debug
or in the config file you can use:
errorlog = '-'
accesslog = '-'
loglevel = 'debug'
but there is no mention of the parameter format you specified in your question:
--log-file=-
--log-level=debug
The default logging for access is 'None' (ref), so this is a shot in the dark, but it may explain why you are not receiving detailed log information. Here is a sample config file from the Gunicorn source, and the latest Gunicorn config docs.
Also, you may want to look into changing your logging configuration in Django.
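For example, a minimal Django LOGGING setting that sends everything to stderr at DEBUG level might look like this (a rough sketch, not tuned for production):

LOGGING = {
    "version": 1,
    "disable_existing_loggers": False,
    "handlers": {
        "console": {"class": "logging.StreamHandler"},
    },
    "root": {"handlers": ["console"], "level": "DEBUG"},
}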
While I am also looking for a good answer on how to see how many workers are busy, you are solving this problem the wrong way. For a task that takes that long you need a background worker, like Celery with RabbitMQ, to do the heavy lifting asynchronously while your request/response cycle remains fast.
I have a script on my site that can take 5+ minutes to complete, and here's a good pattern I'm using:
When the request first comes in, spawn the task and simply return an HTTP 202 Accepted. Its purpose is: "The request has been accepted for processing, but the processing has not been completed."
Have your front-end poll the same endpoint (every 30 seconds should suffice). As long as the task is not complete, return a 202
Eventually when it finishes return a 200, along with any data the front-end might need.
For my site, we want to update data that is more than 10 minutes old. I pass the Accept-Datetime header to indicate how old the data may be. If our local cached copy is older than that, we spawn the task cycle.
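A minimal sketch of this pattern with Celery and Django; every name here (task, views, fields) is hypothetical, not from the original post:

from celery import shared_task
from celery.result import AsyncResult
from django.http import JsonResponse

@shared_task
def process_upload(path):
    # The long-running work happens here, outside the request/response cycle.
    ...

def start_upload(request):
    # Kick off the task and immediately acknowledge with a 202.
    task = process_upload.delay(request.POST["path"])
    return JsonResponse({"task_id": task.id}, status=202)

def upload_status(request, task_id):
    # The front-end polls this endpoint until it gets a 200.
    result = AsyncResult(task_id)
    if result.ready():
        return JsonResponse({"status": "done", "data": result.result})
    return JsonResponse({"status": "pending"}, status=202)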

uWSGI/nginx/Django poor performances after switching from Apache on Amazon EC2 Micro instances

I have just switched my servers away from Apache/mod_wsgi to an nginx/uWSGI stack. However, I am seeing much worse performance than with Apache, even though the server load is the same or even lower over Christmas. Any idea why? I am very new to the uWSGI/nginx stack. Here is my configuration:
[uwsgi]
chdir = /srv/www/poka/app/poka
module = nginx.wsgi
home = /srv/www/poka/app/env/main
env = DJANGO_SETTINGS_MODULE=settings.prod
# master = true
processes = 10
socket = /srv/www/poka/app/poka/nginx/poka.sock
chmod-socket = 666
vacuum = true
pidfile = /tmp/project-master.pid
harakiri = 60
max-requests = 5000
daemonize = /var/log/uwsgi/poka.log
First, you have to identify where the problem is. Assuming you don't do anything fancy, like requests with huge payloads, I would do a few things:
nginx: Log the duration of upstream requests with $upstream_response_time and compare it to the total response time, $request_time. This tells you where the time is lost, i.e. whether nginx has a problem or the upstream components (uwsgi, Django, database, …) do; a sample log_format is sketched after this list. If uwsgi is the problem …
uwsgi: enable the stats server, then use uwsgitop to get a quick overview of the stats
If uwsgi is fine, look into what Python/Django is doing …
uwsgi+python: enable pytracebacker-sockets to view what the workers are doing. If you see workers getting stuck, enable (if that is reasonable in your scenario) harakiri-mode, so uwsgi can recycle stuck workers. When using harakiri do not forget to enable the pytracebacker, as that will give you Python stacktraces when a worker is killed.
Django: Enable the debug-toolbar to see where and how much the application is spending its time.
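A sketch of a log_format capturing both timings ($request_time and $upstream_response_time are standard nginx variables; the format name and log path are arbitrary):

# goes in the http context of nginx.conf
log_format timing '$remote_addr "$request" $status '
                  'request_time=$request_time upstream_time=$upstream_response_time';
access_log /var/log/nginx/timing.log timing;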
When you've identified the component, you're already much closer to a solution, and can ask much more specific questions.
(If you are doing big requests, then compression settings and max-payload-related settings of uwsgi/nginx may be good candidates to look into. They caused us some headaches.)
Do you really need 10 processes? Why don't you try fewer? uWSGI + nginx can handle a lot of concurrent requests with just 2-4 processes; perhaps the bottleneck is there.
You can:
monitor CPU/memory for a detailed comparison
install uwsgitop (via pip install uwsgitop) to monitor your uWSGI processes
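uwsgitop reads from the uWSGI stats server, so that has to be enabled first; a sketch of the two pieces (the address is an arbitrary example):

# in the [uwsgi] section of your ini
stats = 127.0.0.1:9191

# then, on the same host
uwsgitop 127.0.0.1:9191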

Task queue in Django

I've only heard about tools like Celery, but I don't know whether it fits my needs or is the best solution available.
Imagine a game like Travian. We initiate building and we have to wait N seconds until the construction is finished. When and how should we complete the construction?
Solution 1: Check whether there are active constructions every time a page loads. If such queries take some time, we can make them asynchronous. If any constructions are finished, complete them.
However, this way we are constantly waiting for the user to reload the page. Sure, we can use a cronjob to check for constructions to complete from time to time, but cronjobs execute at most once a minute. Constructions, attacks, etc. must be executed as precisely as possible.
The solution above works but has some cons. What are better and RELIABLE ways to perform actions like those I mentioned?
Moreover, let's assume that resources need to be regenerated at a rate of X per hour, and we need to regenerate them very precisely and quite often. How can I achieve this without waiting for the page to be refreshed?
Finally, the solution must work on WebFaction or any other shared hosting. I've heard that Celery doesn't work on WebFaction, or am I mistaken?
Yes, Celery has periodic tasks with second-level granularity:
http://celery.readthedocs.org/en/latest/userguide/periodic-tasks.html
You can also schedule tasks at specific times with Celery's crontab:
http://celery.readthedocs.org/en/latest/userguide/periodic-tasks.html#crontab-schedules
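A minimal beat schedule that fires a task every few seconds might look like this (assuming a recent Celery; the app and task names are made up for illustration):

from datetime import timedelta

from celery import Celery

app = Celery("game")
app.conf.beat_schedule = {
    "finish-constructions": {
        # Hypothetical task that completes any constructions whose timers expired.
        "task": "tasks.finish_constructions",
        "schedule": timedelta(seconds=5),
    },
}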
Also, if you need to check the resource count: I think that is a common part of every request, so your response should look like
{
    "header": {"resources": {"wood": 1, "stone": 500}},
    "data": {"...your real data goes here..."}
}
You need to add a header to the response that contains common information like resource counts, unread messages, etc., and handle it properly on the client.
To improve it further you can use nginx with SSL and a memcached cache backend.

Django/Postgres performance worsening after repeatedly processing the same query

I am running Django on Apache. I have several client computers which call urllib2.urlopen() and send over some data; my server processes it and immediately sends back a reply. While testing this I found a very tricky issue. I had one client repeatedly send the same data to be processed. The first time, it takes around ~20 seconds; the second time, about 40 seconds; the third time I get a 504 (gateway timeout) error. If I try to send data some more, 504 errors randomly pop up. I am pretty sure this is an issue with Postgres, as the function that processes the information makes many database calls; however, I do not know why Postgres's performance would decline so much. I have tried several database optimization tricks, including this one: (http://stackoverflow.com/questions/1125504/django-persistent-database-connection), to no avail.
Thanks in advance.
Edit: The requests are not coming in concurrently; they come in back to back. Each query involves a lot of SELECTs and JOINs, and there are a few INSERTs and UPDATEs as well. The Apache error logs show that it is just a simple timeout, where the function that processes the client-posted data takes over 90 seconds.
If it's really Postgres, then you should turn on the logging of slow statements in the Postgres configuration to find out which statement exactly is taking so much time.
This can be done by setting the configuration property log_min_duration_statement.
Details are in the manual:
http://www.postgresql.org/docs/current/static/runtime-config-logging.html#GUC-LOG-MIN-DURATION-STATEMENT
You say the function makes "many database calls", so I'd start with a very low threshold, or even 0 to log the duration of all statements; then you should be able to identify the slow ones.
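In postgresql.conf that would be, for example:

# log every statement that takes longer than 0 ms, i.e. all of them
log_min_duration_statement = 0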
It could also be a locking issue. Maybe the first call does not end its transaction properly, and subsequent calls run into a timeout while waiting for a resource.
You can verify this by checking the system view pg_locks after the first call.
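For instance, a plain SQL query against that view shows any ungranted locks:

SELECT locktype, relation::regclass, pid, mode, granted
FROM pg_locks
WHERE NOT granted;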
Have you checked the Apache error logs? Have you set Django's DEBUG = True or ADMINS = ('email@addr.com',) so you can get a detailed error report about the actual cause of the issue? If so, how about pasting some information here?
Why are you certain that it's postgres? Have you done diagnostics to come to that conclusion? If so, please let us know.
Are you running apache with mod_wsgi? How many processes and threads have you allocated to your django application?
Also, 20 seconds to process the first transaction is a huge amount of time. Perhaps you could show us the view code that is causing the time out. We may be able to help there.
I sincerely doubt that it's going to be postgres alone that is causing the issue. It probably has something to do with application code, or server configuration.