Celery eventlet worker threads using too many database connections - django

I have 2 Celery workers pooling via eventlet; the config is below:
celery multi start w1 w2 -A proj -l info --time-limit=600 -P eventlet -c 1000
When running more than 100 tasks at a time, I get hit by the error:
OperationalError: FATAL: remaining connection slots are reserved for
non-replication superuser connections
I'm running PostgreSQL with max_connections set at the default of 100.
From what I read online, I thought worker threads in the pools would share the same DB connection. However, mine seem to try to create one connection per thread, which is why the error occurs.
Any ideas?
Thanks!

Django has (or had?) idle DB connection reuse, to avoid the overhead of creating a new connection for each request. Idle reuse is not relevant in this scenario.
Django has never had a limiting DB connection pool (please correct me if I'm wrong).
Consider overall design:
how many tasks do you need to execute concurrently? (real numbers are often not nice powers of 10)
how many simultaneous connections from this application can your database sustain?
do you need to place artificial bottlenecks (pools) or do you need to increase limits and use available hardware?
Consider using an external PostgreSQL connection pool (such as PgBouncer), or include one somewhere in your application.
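If you go the pooler route, Django can be pointed at PgBouncer instead of at PostgreSQL directly. A minimal sketch of the settings, assuming PgBouncer listens locally on its default port 6432 (database name, credentials, and host are placeholders):

```python
# settings.py (sketch): route Django's DB traffic through a local PgBouncer
# instance instead of connecting to PostgreSQL directly.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "mydb",            # example database name
        "USER": "myuser",          # example credentials
        "PASSWORD": "secret",
        "HOST": "127.0.0.1",       # PgBouncer, not PostgreSQL itself
        "PORT": "6432",            # PgBouncer's default listen port
        # With transaction-level pooling in PgBouncer, long-lived Django
        # connections and server-side cursors cause trouble, so keep
        # CONN_MAX_AGE at 0 and disable server-side cursors.
        "CONN_MAX_AGE": 0,
        "DISABLE_SERVER_SIDE_CURSORS": True,
    }
}
```

PgBouncer then holds the 100-slot budget against PostgreSQL, while your 1000 greenlets share its smaller server-side pool.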

Related

How does database connection pooling work with Celery (& Django) for prefork & gevent connection types?

I have a django server, alongside a celery background worker, both of them interact with a Postgres database.
I have a single Celery worker running gevent with a concurrency flag of 500. This gives 500 threads under a single worker to run and execute tasks. My question is: do all of these threads try to use the same database connection, or will they try to create 500 connections?
In a prefork pool, does it create a connection per process?
I saw in the Django documentation (https://docs.djangoproject.com/en/4.1/ref/databases/#connection-management) that it allows persistent connections, so connections are reused, but I'm not sure how this translates to Celery.

Strategy for Asynchronous database access with Qt5 SQL

I need to create a server in Qt C++ with QTcpServer which can handle many requests at the same time: more than 1000 connections, and all of these connections will constantly need to use the database, which is MariaDB.
Before it can be deployed on the main servers, it needs to be able to handle 1000 connections, with each connection querying data as fast as it can, on a 4-core 1 GHz CPU with 2 GB RAM Ubuntu virtual machine running in the cloud. The MySQL database is hosted on another, more powerful server.
So how can I implement this? After googling around, I've come up with the following options:
1. Create a new QThread for each SQL query
2. Use QThreadPool for each new SQL query
For the first one, it might create a great many threads, which could slow down the system because of so many context switches.
For the second one, after the pool becomes full, other connections have to wait while MariaDB is doing its work. So what is the best strategy?
1) Exclude this option.
2) Exclude this one as well.
3) Instead, let Qt's thread pool do the work for you. Yes, connections (tasks for connections) have to wait for available threads, but you can easily add 10,000 tasks to the Qt thread pool. If you want, you can configure the maximum number of threads in the pool, timeouts for tasks, and more. Of course, you must synchronize data shared between threads with a semaphore/futex/mutex and/or atomics.
MySQL (MariaDB) is a server, and a server can accept many connections at the same time. That behaviour is exactly what you want for your Qt application, and MySQL is just the data backend for it.
So your application is itself a server. In the simplest form, you listen on a socket for new connections, save the client connections to a vector/array, and work with each client connection. Whenever you need to do something (fetch data from the MySQL backend for a client, using a separate, lazily-created connection per client; read/write data from/to the client; close a connection; etc.), you create a new task and add it to the thread pool.
This is a very simple explanation, but I hope it helps.
Also consider adding this to the [mysqld] section of my.cnf:
thread_handling=pool-of-threads
Good luck.

uWSGI equivalent for Django-Channels

I am aware that Django is a request/response cycle and that Django Channels is different; my question is not about this.
We know that uWSGI/gunicorn creates worker processes and can be configured to execute each request in threads, so it can serve 10 requests "concurrently" (not in parallel) in a single uWSGI worker process with 10 threads.
Now let's assume that each web client wants to create a websocket using Django Channels. From my limited understanding (with the vanilla implementation), it will process each message in a single thread, which means that to process x connections concurrently, you need x channel worker processes. I know someone will suggest increasing the number of processes; I am not here to debate that.
My question is simply: are there any existing libraries that do a similar job to uWSGI/gunicorn and execute consumer functions in threads?
I think you are asking for Daphne. It is mentioned in the Channels documentation itself.
Daphne provides an option to scale processes using a shared FD. Unfortunately, it is not working as expected.
Right now, a better alternative is to use Uvicorn. You can run multiple workers with:
$ uvicorn project.asgi:application --workers 4
I have been using this in production and it seems good enough.

Django request threads and persistent database connections

I was reading about the CONN_MAX_AGE setting, and the documentation says:
Since each thread maintains its own connection, your database must support at least as many simultaneous connections as you have worker threads.
So I wonder: on uWSGI, how does a Django process maintain its own threads? Does it spawn a new thread for each request and kill it at the end of the request?
If so, how can a thread that has ceased maintain the connection?
Django is not in control of any threads (well... maybe in the development server, but that one is pretty simple); uWSGI is. uWSGI will spawn some threads, depending on its configuration, and run Django's request handling in each thread.
Thread spawning may be static or dynamic: it can be strictly 4 threads, or anywhere from 2 to 12 depending on load.
And no, there is no new thread for each request, because that would allow someone to kill your server by opening many concurrent connections: it would spawn so many threads that no server could take it.
Requests are handled one by one on each thread; the main uWSGI process will round-robin requests between threads. If there are more requests than threads, some of them will wait until others are finished.
uWSGI also has workers: independent processes that can spawn their own threads, so load can be spread better.
You can also run multiple uWSGI servers and tell your HTTP server (Apache, a proxy) to spread requests between them. That way you can even serve your uWSGI instances on different machines, and from the outside it will all look like one big server.
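The "more requests than threads means some wait" behaviour above can be illustrated with a plain Python thread pool (an analogy only; uWSGI manages its threads in C, not via this API):

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

# Illustration: a fixed pool of 2 "uWSGI threads" receiving 6 "requests".
# Requests beyond the pool size queue up and wait until a thread frees.
active = 0
peak = 0
lock = threading.Lock()

def handle_request(i):
    global active, peak
    with lock:
        active += 1
        peak = max(peak, active)
    time.sleep(0.05)        # simulate request handling work
    with lock:
        active -= 1
    return i

with ThreadPoolExecutor(max_workers=2) as pool:
    results = list(pool.map(handle_request, range(6)))

print(peak)     # at most 2: only 2 requests run at once
print(results)  # all 6 requests are eventually served, in order
```

Each thread holding its own DB connection is exactly why CONN_MAX_AGE requires the database to support at least as many connections as there are worker threads.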

How does Heroku determine the number of web processes to run per dyno?

I'm using Heroku to host a django application, and I'm using Waitress as my web server.
I run 2 (x2) dynos, and I see in the New Relic instances tab that I have 10 instances running.
I was wondering how Heroku determines the number of web server processes to run on one dyno when using Waitress.
I know that when using Gunicorn there is a way to set the number of processes per dyno, but I didn't see any way to define it in Waitress.
Thanks!
In Waitress, there is a master process and (by default) 4 worker threads. You can change this if you wish. Here are the docs for these options for waitress-serve:
http://waitress.readthedocs.org/en/latest/runner.html#runner
--threads=INT
Number of threads used to process application logic, default is 4.
So if you have 2 dynos with 5 (4+1) threads each, the total comes to 10 instances for this app in the RPM dashboard.
You can add more processes to the dynos, as the maximum supported on Heroku 2X dynos is much higher:
2X dynos support no more than 512
https://devcenter.heroku.com/articles/dynos#process-thread-limits
But, you may want to check out some discussion on tuning this vs Gunicorn:
Waitress differs, in that it has an async master process that buffers the
entire client body before passing it onto a sync worker. Thus, the server
is resilient to slow clients, but is also guaranteed to process a maximum
of (default) 4 requests at once. This saves the database from overload, and
makes scaling services much more predictable.
Because waitress has no external dependencies, it also keeps the heroku
slug size smaller.
https://discussion.heroku.com/t/waitress-vs-gunicorn-for-docs/33
So, after talking to New Relic support, they clarified the issue.
Apparently only processes are counted in the Instances tab (threads do not count).
In my Procfile I am also monitoring RabbitMQ workers, which add instances to the Instances tab, hence the mismatch.
To quote their answer :
I clarified with our developers how exactly we measure instances for the Instances tab. The Python agent views each monitored process as one instance. Threads do not count as additional instances. I noticed that you're monitoring not only your django/waitress app, but also some background tasks. It looks like the background tasks plus the django processes are adding up to that total of 10 processes being monitored.