Django 1.11 PostgreSQL - "SET TIME ZONE" command on every session

We are working out a couple of performance issues on one of our web sites, and we have noticed that the command "SET TIME ZONE 'America/Chicago'" is being executed so often that, over a 24-hour period, just under 1 hour (or around 4% of total DB CPU resources) is spent running that command.
Note that the "USE_TZ" setting is False, so based on my understanding, everything should be stored as UTC in the database, and only converted in the UI to the local time zone when necessary.
Do you have any ideas on how we can remove this strain on the database server?

For PostgreSQL, Django always sets the connection's time zone: either the project's TIME_ZONE (when USE_TZ = False) or UTC (when USE_TZ = True). That way Django supports "live switching" of settings.USE_TZ for the PostgreSQL DB backend.
How have you actually determined that this is a bottle-neck?
Usually SET TIME ZONE is only issued when a connection to the DB is created. Maybe you should use persistent connections by setting settings.DATABASES[...]['CONN_MAX_AGE'] = GREATER_THAN_ZERO (docs). That way connections will be reused and you'll have far fewer calls to SET TIME ZONE (a settings sketch follows the list below). But if you use that approach you should also take a closer look at your PostgreSQL configuration:
max_connections should be greater than 1 + the maximum concurrency of your WSGI server + the maximum number of simultaneous cron jobs that use Django (if you have them) + the maximum concurrency of your Celery workers (if you have them) + any other potential sources of connections to Postgres
if you are running a cron job that calls pg_terminate_backend, then make sure that CONN_MAX_AGE is greater than the "idle timeout"
if you are running Postgres on a VPS, then in some cases there might be limits on the number of open sockets
if you are using something like PgBouncer then it may already be reusing connections
if you are killing the server that serves your Django project with SIGKILL (kill -9) then it may leave some unclosed connections to the DB (but I'm not sure)
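If you go the persistent-connections route, a minimal settings sketch (the database name, user and host below are placeholders) might look like this:
# settings.py -- sketch only; NAME/USER/HOST values are placeholders
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql',
        'NAME': 'mydb',
        'USER': 'myuser',
        'HOST': 'db.example.com',
        # Reuse each connection for up to 10 minutes instead of opening a new
        # one (and re-running SET TIME ZONE) on every request.
        'CONN_MAX_AGE': 600,
    }
}
Keep CONN_MAX_AGE below any server- or pooler-side idle timeout so reused connections aren't killed out from under Django.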
I think this may also happen if you use django.utils.timezone.activate, but I'm not sure of it. That could be the case if you call it manually in your code or if you use a middleware to do this (a sketch of such a middleware follows).
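For reference, a per-request time-zone middleware of the kind mentioned above, closely following the example in the Django docs, looks roughly like this (the session key is an assumption):
# middleware.py -- illustrative sketch
from django.utils import timezone

class TimezoneMiddleware:
    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        # 'django_timezone' is an assumed session key set elsewhere in the app.
        tzname = request.session.get('django_timezone')
        if tzname:
            # Sets the time zone Django uses when rendering datetimes
            # for this request.
            timezone.activate(tzname)
        else:
            timezone.deactivate()
        return self.get_response(request)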
Another possible explanation: the way you are "profiling" your requests may actually be showing you the time of the whole transaction.

Related

How can my Heroku Flask web application support N users concurrently downloading an image file?

I am working on a Flask web application using Heroku. As part of the application, users can request to download an image from the server. That calls a function which has to then retrieve multiple images from my cloud storage (about 500 KB in total), apply some formatting, and return a single image (about 60 KB). It looks something like this:
import io

from flask import Flask, Response, request

app = Flask(__name__)

@app.route('/download_image', methods=['POST'])
def download_image():
    # Retrieve about 500 KB of images from cloud storage
    base_images = retrieve_base_images(request.form)
    # Apply image formatting into a single image
    formatted_image = format_images(base_images)
    # Return image of about 60 KB for download
    formatted_image_file = io.BytesIO()
    formatted_image.save(formatted_image_file, format='JPEG')
    formatted_image_data = formatted_image_file.getvalue()
    return Response(formatted_image_data,
                    mimetype='image/jpeg',
                    headers={'Content-Disposition': 'attachment;filename=download.jpg'})
My Procfile is
web: gunicorn my_app:app
How can I design/configure this to support N concurrent users? Let's say, for example, I want to make sure my application can support 100 different users all requesting to download an image at the same time. With several moving parts, I am unsure how to even go about doing this.
Also, if someone requests a download but then loses internet connection before their download is complete, would this cause some sort of lock that could endlessly stall, or would that thread/process automatically timeout after a short period and therefore be handled smoothly?
I currently have 1 dyno (on the Heroku free plan). I am willing to add more dynos if needed.
Run multiple Gunicorn workers:
Gunicorn forks multiple system processes within each dyno to allow a Python app to support multiple concurrent requests without requiring them to be thread-safe. In Gunicorn terminology, these are referred to as worker processes (not to be confused with Heroku worker processes, which run in their own dynos).
…
We recommend setting a configuration variable for this setting. Gunicorn automatically honors the WEB_CONCURRENCY environment variable, if set.
heroku config:set WEB_CONCURRENCY=3
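If you'd rather pin these knobs in code than rely solely on the environment variable, a minimal gunicorn.conf.py sketch (the values are illustrative, not recommendations) could be:
# gunicorn.conf.py -- illustrative values only
import os

# Fall back to 3 workers if Heroku's WEB_CONCURRENCY isn't set.
workers = int(os.environ.get('WEB_CONCURRENCY', 3))

# Seconds before a silent worker is killed and restarted, which bounds
# how long a single stuck request can tie up a worker.
timeout = 30
and point the Procfile at it with web: gunicorn my_app:app -c gunicorn.conf.py. The timeout also speaks to your dropped-connection question: a sync worker that gets stuck is restarted after that many seconds rather than stalling forever.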
Note that Heroku sets a default WEB_CONCURRENCY for you based on your dyno size. You can probably handle a small number of concurrent requests right now.
However, you're not going to get anywhere close to 100 on a free dyno. This section appears between the previous two in the documentation:
Each forked system process consumes additional memory. This limits how many processes you can run in a single dyno. With a typical Django application memory footprint, you can expect to run 2–4 Gunicorn worker processes on a free, hobby or standard-1x dyno. Your application may allow for a variation of this, depending on your application’s specific memory requirements.
Even if your application is very lightweight you probably won't be able to go above 6 workers on a single small dyno. Adding more dynos and/or increasing the size of the dynos you run will be required.
Do you really need to support 100 concurrent requests? If you have four workers going, four users' requests can be served at the same time. If a fifth makes a request, that request just won't get responded to until one of the workers frees up. That's usually reasonable.
If your request takes an unreasonable amount of time to complete you have a few options besides adding more workers:
Can you cache the generated images? (A rough sketch follows this list.)
Can you return a response immediately, create the images in a background job, and then notify the user that the images are ready? With some fancy front-end work this can be fairly transparent to the end user.
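Here's a rough sketch of the caching option, assuming that identical form parameters always produce an identical image. It uses a plain in-process dict, so it isn't shared across Gunicorn workers or dynos and is cleared on every restart; swap in something like Redis if you need more than that.
# Sketch only: in-process cache keyed by a hash of the form parameters.
import hashlib

_image_cache = {}

def _cache_key(form):
    raw = '&'.join('{}={}'.format(k, v) for k, v in sorted(form.items()))
    return hashlib.sha256(raw.encode('utf-8')).hexdigest()

@app.route('/download_image', methods=['POST'])
def download_image():
    key = _cache_key(request.form)
    image_data = _image_cache.get(key)
    if image_data is None:
        # Only do the expensive retrieval + formatting on a cache miss.
        formatted_image = format_images(retrieve_base_images(request.form))
        buf = io.BytesIO()
        formatted_image.save(buf, format='JPEG')
        image_data = buf.getvalue()
        _image_cache[key] = image_data
    return Response(image_data,
                    mimetype='image/jpeg',
                    headers={'Content-Disposition': 'attachment;filename=download.jpg'})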
The right solution will depend on your specific use case. Good luck!

How to improve web-service api throughput?

I'm new to creating web services, so I'd like to know what I'm missing on the performance front (assuming I'm missing something).
I've built a simple Flask app. Nothing fancy, it just reads from the DB and responds with the result.
uWSGI is used for the WSGI layer. I've run multiple tests and set processes = 2 and threads = 5 based on performance monitoring.
processes = 2
threads = 5
enable-threads = True
AWS ALB is used as the load balancer. The uWSGI and Flask app is dockerized and launched in ECS (3 containers [1 vCPU each]).
For each DB hit, the Flask app takes 1-1.5 sec to get the data. There is no other lag on the app side. I know it can be optimised, but assuming that the request processing time takes 1-1.5 sec, can the throughput be increased?
The throughput I'm seeing is ~60 requests per second. I feel it's too low. Is there any way to increase the throughput with the same infra?
Am I missing something here, or is the throughput reasonable given that the DB hit takes 1.5 sec?
Note: it's synchronous.
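Since the setup is fully synchronous, a back-of-the-envelope calculation with these numbers gives a rough ceiling on throughput (this assumes every request holds its worker for the full DB latency):
# Rough capacity estimate for 3 containers x 2 processes x 5 threads.
containers = 3
processes_per_container = 2
threads_per_process = 5
concurrent_slots = containers * processes_per_container * threads_per_process  # 30

for latency_sec in (1.0, 1.5):
    # Each slot can serve roughly 1/latency requests per second.
    print('{:.1f}s per request -> ~{:.0f} req/s ceiling'.format(
        latency_sec, concurrent_slots / latency_sec))
That works out to roughly 20-30 req/s if every request really spends 1-1.5 s in the DB, so seeing ~60 req/s suggests many requests finish faster than that. Either way, with the same infra the main levers are cutting the per-request DB time or raising processes/threads as far as memory and DB connections allow.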

Celery task fails midway while pulling data from a large database

I'm running a periodic task using Celery in a django-rest application that pulls data from a large Postgres database with multiple tables. The task starts well and pulls some data for about 50 minutes, then fails with this error:
client_idle_timeout
server closed the connection unexpectedly, This probably means the server terminated abnormally before or while processing the request.
What could be the issue causing this and how can I go about to fix it?
It most likely means that your PostgreSQL has a limit on how long a transaction can stay open (idle in transaction) or how long a session can last (session timeout).
This is probably happening because of a typical, incorrect way of dealing with databases (I've seen this done even by senior developers): the process creates a database session and then starts doing business logic that may take a long time to finish, while the DB data has been only partially updated or inserted. Code written that way is doomed to fail because of the timeouts enforced by PostgreSQL.
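One common fix in a Celery task like this is to keep every database interaction short by paging through the table with repeated small queries instead of one long-running read. A minimal sketch with the Django ORM, where SourceRecord and process_batch() are placeholder names:
# tasks.py -- sketch; SourceRecord and process_batch() are placeholders
from celery import shared_task

from myapp.models import SourceRecord


@shared_task
def pull_data(batch_size=2000):
    last_pk = 0
    while True:
        # Each loop iteration is its own short query, so no single DB
        # interaction comes anywhere near the server's timeouts.
        batch = list(
            SourceRecord.objects.filter(pk__gt=last_pk)
                                .order_by('pk')[:batch_size]
        )
        if not batch:
            break
        process_batch(batch)
        last_pk = batch[-1].pk
If the rows change while the task runs you may need an extra snapshot strategy, but the keyset-pagination pattern itself keeps each query and transaction short.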

Do I need to use celery.result.forget when using a database backend?

I've come across the following warning:
Backends use resources to store and transmit results. To ensure that resources are released, you must eventually call get() or forget() on EVERY AsyncResult instance returned after calling a task.
I am currently using the django-db backend and I am wondering about the consequences of not heeding this warning. What resources will not be "released" if I don't forget an AsyncResult? I'm not worried about cleaning up task results from my database. My primary concern is with the availability of workers being affected.
I've actually never seen that warning. As long as you're running celery beat, you'll be fine. Celery has a default periodic task that it sets up for you scheduled to run at 4:00 AM. That task deletes any expired results in your database if you are using a db-based backend like postgres or mysql.
Celery seems to have a setting for this which is result_expires. The documentation explains it all:
result_expires
Default: Expire after 1 day.
Time (in seconds, or a timedelta object) for when after stored task tombstones will be deleted.
But as @2ps mentioned, celery beat must be running for database backends, as the documentation notes:
A built-in periodic task will delete the results after this time (celery.backend_cleanup), assuming that celery beat is enabled. The task runs daily at 4am.
For other types of backends, e.g. AMQP, running beat doesn't seem to be necessary, as documented:
Note
For the moment this only works with the AMQP, database, cache, Couchbase, and Redis backends.
When using the database backend, celery beat must be running for the results to be expired.
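Putting it together, a minimal sketch for a Django project using the django-db backend (the one-hour value is only an example) is to set result_expires and make sure a beat process is running so celery.backend_cleanup actually fires:
# celery.py -- sketch; expire stored results after 1 hour instead of 1 day
from celery import Celery

app = Celery('myproject')
app.config_from_object('django.conf:settings', namespace='CELERY')
app.conf.result_expires = 3600  # seconds
with celery beat started alongside the workers, e.g. celery -A myproject beat.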

Memory issues on RDS PostgreSQL instance / Rails 4

We are running into memory issues on our RDS PostgreSQL instance, i.e. memory usage of the PostgreSQL server reaches almost 100%, resulting in stalled queries and subsequent downtime of the production app.
The memory usage of the RDS instance doesn't go up gradually, but suddenly, within a period of 30 minutes to 2 hours.
Most of the time this happens, we see a lot of bot traffic going on, though there is no specific pattern in terms of frequency. This could happen anywhere from 1 week to 1 month after the previous occurrence.
Disconnecting all clients and then restarting the application also doesn't help, as the memory usage goes up very rapidly again.
Running a "Full Vacuum" is the only solution we have found that resolves the issue when it occurs.
What we have tried so far
Periodic vacuuming (not full vacuuming) of some tables that get frequent updates.
Stopped storing web sessions in the DB, as they are highly volatile and result in a lot of dead tuples.
Neither of these has helped.
We have considered using tools like pgcompact / pg_repack as they don't acquire an exclusive lock. However, these can't be used with RDS.
We now see a strong possibility that this has to do with the memory bloat that can happen on PostgreSQL with prepared statements in Rails 4, as discussed on the following pages:
Memory leaks on postgresql server after upgrade to Rails 4
https://github.com/rails/rails/issues/14645
As a quick trial, we have now disabled prepared statements in our rails database configuration, and are observing the system. If the issue re-occurs, this hypothesis would be proven wrong.
Setup details:
We run our production environment inside Amazon Elastic Beanstalk, with following configuration:
App servers
OS : 64bit Amazon Linux 2016.03 v2.1.0 running Ruby 2.1 (Puma)
Instance type: r3.xlarge
Root volume size: 100 GiB
Number of app servers : 2
Rails workers running on each server : 4
Max number of threads in each worker : 8
Database pool size : 50 (applicable for each worker)
Database (RDS) Details:
PostgreSQL Version: PostgreSQL 9.3.10
RDS Instance type: db.m4.2xlarge
Rails Version: 4.2.5
Current size on disk: 2.2GB
Number of tables: 94
The environment is monitored with AWS cloudwatch and NewRelic.
Periodic vacuum should help in containing table bloat but not index bloat.
1) Have you tried more aggressive auto-vacuum parameters? (A per-table sketch follows this list.)
2) Have you tried routine reindexing? If locking is a concern, then consider:
DROP INDEX CONCURRENTLY ...
CREATE INDEX CONCURRENTLY ...
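For (1), RDS doesn't expose postgresql.conf, but global autovacuum settings can be changed in the RDS parameter group, and the thresholds can also be tightened per table. A sketch for a heavily updated table, using psycopg2 (the table name, connection details, and threshold values are all illustrative):
# Sketch only: table name, connection details and thresholds are illustrative.
import psycopg2

conn = psycopg2.connect('dbname=mydb user=myuser host=mydb.example.com')
conn.autocommit = True
with conn.cursor() as cur:
    # Vacuum/analyze this table after ~5% of its rows change, instead of
    # the global defaults (20% for vacuum, 10% for analyze).
    cur.execute("""
        ALTER TABLE web_sessions SET (
            autovacuum_vacuum_scale_factor = 0.05,
            autovacuum_analyze_scale_factor = 0.05
        )
    """)
conn.close()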