Django: ThreadPoolExecutor causes "FATAL: sorry, too many clients already"

I'm aware this error has already been posted, but the context of my issue is different, and I have tried the proposed solutions without success.
I'm using django-apscheduler to schedule some management commands, one of these uses ThreadPoolExecutor to speed up an otherwise lengthy process.
Unfortunately, it does not close connections, which eventually leads to:
exception:connection to server at "localhost" (::1), port 5432 failed: FATAL: sorry, too many clients already
I'm using the PostgreSQL backend, and when I run the management command directly I can see in pgAdmin that the number of database sessions drops back down after execution.
However, when it's run by django-apscheduler, the sessions accumulate and eventually cause FATAL: sorry, too many clients already.
I've tried calling close_old_connections() (imported via from django.db import close_old_connections), but that doesn't seem to do anything.
Can someone point me in the right direction?
This is what my management command looks like:
import concurrent.futures

from django.core.management.base import BaseCommand
from data_model.models import DataModel


class Command(BaseCommand):
    help = "Do something interesting"

    def handle(self, *args, **kwargs):
        with concurrent.futures.ThreadPoolExecutor(10) as executor:
            list(executor.map(DataModel.objects.do_something_interesting))
I also tried adding a connection expiration time (CONN_MAX_AGE) to the DB config, but that didn't help:
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql_psycopg2",
        "NAME": env.str("POSTGRES_DB"),
        "USER": env.str("POSTGRES_USER"),
        "PASSWORD": env.str("POSTGRES_PASSWORD"),
        "HOST": env.str("POSTGRES_HOST"),
        "PORT": 5432,
        "CONN_MAX_AGE": 60,
    },
}
Management command execution completed at the point marked by the arrow (screenshot omitted).
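For reference, this is the kind of per-thread cleanup I'm considering next: wrapping the work so each thread closes its own database connection when it finishes. It's only a sketch - the work_items iterable is a hypothetical stand-in, and I'm assuming connection.close() is safe to call from each worker thread since Django connections are per-thread:

import concurrent.futures

from django.core.management.base import BaseCommand
from django.db import connection

from data_model.models import DataModel


class Command(BaseCommand):
    help = "Do something interesting"

    def handle(self, *args, **kwargs):
        def do_one(item):
            # Hypothetical per-item wrapper around the manager method.
            try:
                return DataModel.objects.do_something_interesting(item)
            finally:
                # Django opens one connection per thread; close this thread's
                # connection so idle sessions don't pile up in PostgreSQL.
                connection.close()

        work_items = []  # hypothetical: whatever do_something_interesting iterates over
        with concurrent.futures.ThreadPoolExecutor(10) as executor:
            list(executor.map(do_one, work_items))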

Related

Pytest on a Flask-based API - testing by calling the remote API

I'm new to using pytest on APIs. From my understanding, testing creates another instance of Flask. Additionally, the tutorials I have seen suggest creating a separate DB table instance to add, fetch and remove data for test purposes. However, I simply plan to use the remote API URL as the host and make the calls against it.
For now, I set up my conftest like this, where the --testenv flag indicates which host the GET/POST calls should target:
import pytest
import subprocess


def pytest_addoption(parser):
    """Add option to pass --testenv=api_server to the pytest CLI command"""
    parser.addoption(
        "--testenv", action="store", default="exodemo", help="my option: type1 or type2"
    )


@pytest.fixture(scope="module")
def testenv(request):
    return request.config.getoption("--testenv")


@pytest.fixture(scope="module")
def testurl(testenv):
    if testenv == 'api_server':
        return 'http://api_url:5000/'
    else:
        return 'http://localhost:5000'
And my test file is written like this:
import json

import pytest
from app import app
from flask import request


def test_nodes(app):
    t_client = app.test_client()
    truth = [
        {
            # *body*
        }
    ]
    res = t_client.get('/topology/nodes')
    print(res)
    assert res.status_code == 200
    assert truth == json.loads(res.get_data())
I run the code using this:
python3 -m pytest --testenv api_server
What I expect is that the test file would simply make a call to the remote API with the creds, fetch the data regardless of how it gets pulled in the remote code, and bring it back here for assertion. However, I am getting a 400 BAD REQUEST error, which looks like this:
assert 400 == 200
E + where 400 = <WrapperTestResponse streamed [400 BAD REQUEST]>.status_code
single_test.py:97: AssertionError
--------------------- Captured stdout call ----------------------
{"timestamp": "2022-07-28 22:11:14,032", "level": "ERROR", "func": "connect_to_mysql_db", "line": 23, "message": "Error connecting to the mysql database (2003, \"Can't connect to MySQL server on 'mysql' ([Errno -3] Temporary failure in name resolution)\")"}
<WrapperTestResponse streamed [400 BAD REQUEST]>
Does this mean that the test file is still trying to look up the database locally for fetching? I am also unable to figure out which host the test URL is being sent to, so I am kind of stuck here. Looking to get some help.
Thanks.
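For what it's worth, the direction I was imagining looks roughly like this: skipping app.test_client() entirely and calling the remote host from the testurl fixture with the requests library. This is only a sketch - the endpoint path and expected body are placeholders:

import requests


def test_nodes_remote(testurl):
    # Hypothetical remote variant of test_nodes: hit the deployed API over HTTP
    # so the remote service talks to its own MySQL database.
    truth = [
        {
            # *body*
        }
    ]
    res = requests.get(testurl.rstrip('/') + '/topology/nodes', timeout=10)
    assert res.status_code == 200
    assert truth == res.json()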

Google Cloud Scheduler: why does the cloud function run successfully but the logger still show an error?

I set up a Google Cloud Scheduler job that triggers a Cloud Function through HTTP. I can be sure that the Cloud Function is triggered and runs successfully - it has produced the expected outcome.
However, the scheduler job still shows "failed", and the log entry looks like this:
{
  "insertId": "8ca551232347v49",
  "jsonPayload": {
    "jobName": "projects/john/locations/asia-southeast2/jobs/Get_food",
    "status": "UNKNOWN",
    "url": "https://asia-southeast2-john.cloudfunctions.net/Get_food",
    "@type": "type.googleapis.com/google.cloud.scheduler.logging.AttemptFinished",
    "targetType": "HTTP"
  },
  "httpRequest": {},
  "resource": {
    "type": "cloud_scheduler_job",
    "labels": {
      "job_id": "Get_food",
      "location": "asia-southeast2",
      "project_id": "john"
    }
  },
  "timestamp": "2020-10-22T04:08:24.521610728Z",
  "severity": "ERROR",
  "logName": "projects/john/logs/cloudscheduler.googleapis.com%2Fexecutions",
  "receiveTimestamp": "2020-10-22T04:08:24.521610728Z"
}
I have pasted the cloud function code below with edits necessary to remove sensitive information:
import requests
import pymysql
from pymysql.constants import CLIENT
from google.cloud import storage
import os
import time
from DingBot import DING_BOT
from decouple import config
import datetime

BUCKET_NAME = 'john-test-dataset'
FOLDER_IN_BUCKET = 'compressed_data'
LOCAL_PATH = '/tmp/'
TIMEOUT_TIME = 500


def run(request):
    """Responds to any HTTP request.
    Args:
        request (flask.Request): HTTP request object.
    Returns:
        The response text or any set of values that can be turned into a
        Response object using
        `make_response <http://flask.pocoo.org/docs/1.0/api/#flask.Flask.make_response>`.
    """
    while True:
        # some code that will break the loop in about 200 seconds
        ...
    DING_BOT.send_text(msg)
    return 'ok'
What I can be sure of is that the line right before the end of the function, DING_BOT.send_text(msg), executed successfully - I have received the text message.
What could be wrong here?
This is a common problem caused by the partial UI of the Google Cloud Console, so my hypothesis is that you set up your scheduler job with the console only.
You need to create it, or update it, with the command line (gcloud) or the API (but gcloud is easier) in order to add the attempt-deadline parameter.
Cloud Scheduler itself also has a timeout (60s by default), and if the URL doesn't answer within that timeframe, the call is considered failed.
Increase this parameter to 250s and it should be OK.
Note: you can also set retry policies with the CLI; that could be interesting if you need it!
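For example, something along these lines should do it (job name and location taken from the log above; double-check the exact flag names against gcloud scheduler jobs update http --help for your gcloud version):

gcloud scheduler jobs update http Get_food \
    --location=asia-southeast2 \
    --attempt-deadline=250s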

Testing Celery Beat

I'm working on a Celery beat task within a Django project which creates database entries periodically. I know it works because when I set the task up like this:
celery.py:
from __future__ import absolute_import, unicode_literals
import os
from celery import Celery
from celery.schedules import crontab

app = Celery("clock-backend", broker=os.environ.get("RABBITMQ_URL"))
app.config_from_object("django.conf:settings", namespace="CELERY")

app.conf.beat_schedule = {
    'create_reports_monthly': {
        'task': 'project_celery.tasks.create_reports_monthly',
        'schedule': 10.0,
    },
}

app.autodiscover_tasks()
And start my project, it really does create an object every 10 seconds.
But what I really want is for it to run on the first day of every month.
To do so, I would change the schedule to "schedule": crontab(0, 0, day_of_month="1").
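In other words, the beat entry would become roughly this (keyword arguments spelled out for clarity, same task name as above):

from celery.schedules import crontab

app.conf.beat_schedule = {
    'create_reports_monthly': {
        'task': 'project_celery.tasks.create_reports_monthly',
        # midnight on the first day of every month
        'schedule': crontab(minute=0, hour=0, day_of_month='1'),
    },
}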
Here comes my actual problem: how do I test that this really works?
And by testing I mean actual (unit) tests.
What I've tried is to work with a package called freezegun.
A test with it looks like this:
import time

from freezegun import freeze_time


def test_start_of_month_report_creation(self, user_object, contract_object, report_object):
    # set time to the last day of January
    with freeze_time("2019-01-31 23:59:59") as frozen_time:
        # let one second pass
        frozen_time.tick()
        # give the celery task some time
        time.sleep(20)
        # Test logic to check whether the object was created
        # Example: assert MyModel.objects.count() > 0
But this did not work. I suspect that Celery beat does not use the time set via freezegun/Python but the real "hardware" clock.
I've also tried setting the hardware clock as described here, but that did not work in my setup either.
I'm thankful for any comments, remarks or help on this topic, since I'd really like to implement a test for this.
Unit tests cannot test third-party libraries.
You can set up system logging to keep track.
You can also check whether your task is present in the PeriodicTask model. This model defines a single periodic task to be run; it must be associated with a schedule, which defines how often the task should run.
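A minimal sketch of that kind of check, assuming the django-celery-beat database scheduler is in use (if the schedule only lives in app.conf.beat_schedule, there is nothing in the database to query):

from django_celery_beat.models import PeriodicTask


def test_monthly_report_task_is_registered():
    # The beat entry should exist as a PeriodicTask row linked to a
    # crontab schedule for the first day of the month.
    task = PeriodicTask.objects.get(task='project_celery.tasks.create_reports_monthly')
    assert task.crontab is not None
    assert task.crontab.day_of_month == '1'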

Django Channels Worker not responding to websocket.connect

I'm having a problem with Django Channels. Daphne accepts WebSocket CONNECT requests properly, but then the workers don't respond to the request with the method supplied in consumers.py. The thing is, this only happens most of the time: sometimes it responds with the method in consumers.py, but most of the time the worker doesn't respond at all. I have duplicate code working fine in a Vagrant (trusty64) environment, but the code behaves like this on an actual trusty64 machine. It should be noted that the trusty64 machine that hosts the app also has other applications running (about 4 apps at the same time).
I have a routing.py set up like this
from channels import route
from app.consumers import connect_tracking, disconnect_tracking

channel_routing = [
    route("websocket.connect", connect_tracking, path=r'^/websocket/tms/tracking/stream/$'),
    route("websocket.disconnect", disconnect_tracking, path=r'^/websocket/tms/tracking/stream/$'),
]
with the corresponding consumers.py that looks like this
import json
from channels import Group
from channels.sessions import channel_session
from channels.auth import http_session_user, channel_session_user, channel_session_user_from_http
from django.conf import settings


@channel_session_user_from_http
def connect_tracking(message):
    group_name = settings.TRACKING_GROUP_NAME
    print("%s is joining %s" % (message.user, group_name))
    Group(group_name).add(message.reply_channel)


@channel_session_user
def disconnect_tracking(message):
    group_name = settings.TRACKING_GROUP_NAME
    print("%s is leaving %s" % (message.user, group_name))
    Group(group_name).discard(message.reply_channel)
and some channels related lines in settings.py like this
redis_host = os.environ.get('REDIS_HOST', 'localhost')

CHANNEL_LAYERS = {
    "default": {
        # This example app uses the Redis channel layer implementation asgi_redis
        "BACKEND": "asgi_redis.RedisChannelLayer",
        "CONFIG": {
            "hosts": [(redis_host, 6379)],
        },
        "ROUTING": "tms_app.routing.channel_routing",
    },
}
Referencing another question, I've tried running Daphne and the worker like this:
daphne tms_app.asgi:channel_layer --port 9015 --bind 0.0.0.0 -v2
python manage.py runworker -v3
I've captured Daphne's and the worker's logs; they look like this:
Daphne log:
2016-12-30 17:00:18,870 INFO Starting server at 0.0.0.0:9015, channel layer tms_app.asgi:channel_layer
2016-12-30 17:00:26,788 DEBUG WebSocket open for websocket.send!APpWONQKKDXR
192.168.31.197:48933 - - [30/Dec/2016:17:00:26] "WSCONNECT /websocket/tms/tracking/stream/" - -
2016-12-30 17:00:26,790 DEBUG Upgraded connection http.response!sqlMPEEtolDP to WebSocket websocket.send!APpWONQKKDXR
corresponding worker log :
2016-12-30 17:00:22,265 - INFO - runworker - Running worker against channel layer default (asgi_redis.core.RedisChannelLayer)
2016-12-30 17:00:22,265 - INFO - worker - Listening on channels http.request, websocket.connect, websocket.disconnect, websocket.receive
As you can see, when there's a WSCONNECT event, the worker doesn't respond to it.
There's another question close to this issue that was solved by downgrading Twisted to 16.2, but that doesn't work for me.
UPDATE January 3, 2017
I cannot replicate the issue on a local Vagrant machine despite using the same code and the same settings for nginx, supervisor, gunicorn and daphne. I tried changing the channel layer settings so it uses IPC instead of Redis, and it works. Here are the settings:
CHANNEL_LAYERS = {
    "default": {
        "BACKEND": "asgi_ipc.IPCChannelLayer",
        "ROUTING": "tms_app.routing.channel_routing",
        "CONFIG": {
            "prefix": "tms",
        },
    },
}
However, this does not solve the original problem, as I intend to use the Redis channel layer because it's easier to scale compared to IPC. Does this mean there's something wrong with my Redis server?
I think the reason your connection doesn't complete is that you are not sending the accept message, like this:
message.reply_channel.send({'accept': True})
This is what works for my version of Channels, but you should check the docs for your version to make sure of what works for you.
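Applied to the consumer in the question, that would look roughly like this (Channels 1.x style, using the decorators already imported there):

@channel_session_user_from_http
def connect_tracking(message):
    group_name = settings.TRACKING_GROUP_NAME
    print("%s is joining %s" % (message.user, group_name))
    # Explicitly accept the WebSocket handshake before joining the group
    message.reply_channel.send({'accept': True})
    Group(group_name).add(message.reply_channel)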

Running periodic tasks with Django and Celery

I'm trying to create a simple background periodic task using the Django-Celery-RabbitMQ combination. I installed Django 1.3.1, and downloaded and set up djcelery. Here is what my settings.py file looks like:
BROKER_HOST = "127.0.0.1"
BROKER_PORT = 5672
BROKER_VHOST = "/"
BROKER_USER = "guest"
BROKER_PASSWORD = "guest"
....
import djcelery
djcelery.setup_loader()
...
INSTALLED_APPS = (
    'djcelery',
)
And I put a 'tasks.py' file in my application folder with the following contents:
from celery.task import PeriodicTask
from celery.registry import tasks
from datetime import timedelta
from datetime import datetime


class MyTask(PeriodicTask):
    run_every = timedelta(minutes=1)

    def run(self, **kwargs):
        self.get_logger().info("Time now: " + str(datetime.now()))
        print("Time now: " + str(datetime.now()))


tasks.register(MyTask)
And then I start up my django server (local development instance):
python manage.py runserver
Then I start up the celerybeat process:
python manage.py celerybeat --logfile=<path_to_log_file> -l DEBUG
I can see entries like this in the log:
[2012-04-29 07:50:54,671: DEBUG/MainProcess] tasks.MyTask sent. id->72a5963c-6e15-4fc5-a078-dd26da663323
And I can also see the corresponding entries getting created in the database, but I can't find where it is logging the text I specified in the run function of the MyTask class.
I tried fiddling with the logging settings and tried using the Django logger instead of the Celery logger, but to no avail. I'm not even sure my task is getting executed. If I print any debug information in the task, where does it go?
Also, this is the first time I'm working with any kind of message queuing system. It looks like the task will get executed as part of the celerybeat process, outside the Django web framework. Will I still be able to access all the Django models I created?
Thanks,
Venkat.
celerybeat is the component that pushes tasks when they are due, but it does not execute them. Your task instances are stored on the RabbitMQ server. You need to run the celeryd daemon to actually execute your tasks:
python manage.py celeryd --logfile=<path_to_log_file> -l DEBUG
Also, if you are using RabbitMQ, I recommend installing the RabbitMQ management plugin:
rabbitmq-plugins list
rabbitmq-plugins enable rabbitmq_management
service rabbitmq-server restart
It will be available at http://<your-server>:55672/ (login: guest, pass: guest). There you can check how many tasks are in your RabbitMQ instance online.
You should check the RabbitMQ logs, since Celery sends the tasks to RabbitMQ and it should execute them. So all the prints from the tasks should be in the RabbitMQ logs.