I have a suite of tests for Slick in a Play Framework environment, but when they run, the statements are duplicated in the logs, and it seems they are duplicated as many times as there are tests. So if I have only one test I get, for example, one insert and one select statement, but with a few tests the situation looks like this:
[info] c.z.h.HikariDataSource - HikariCP pool db is starting.
[info] c.z.h.HikariDataSource - HikariCP pool db is starting.
[info] a.e.s.Slf4jLogger - Slf4jLogger started
[info] c.z.h.HikariDataSource - HikariCP pool db is starting.
[info] c.z.h.HikariDataSource - HikariCP pool db is starting.
[info] a.e.s.Slf4jLogger - Slf4jLogger started
[info] a.e.s.Slf4jLogger - Slf4jLogger started
[debug] s.j.J.statement - Preparing insert statement (returning: ID): insert into `USER` (`EMAIL`,`PASSWORD`,`ACTIVATION_TOKEN`,`ACTIVATED`,`CREATED`) values (?,?,?,?,?)
[debug] s.j.J.statement - Preparing insert statement (returning: ID): insert into `USER` (`EMAIL`,`PASSWORD`,`ACTIVATION_TOKEN`,`ACTIVATED`,`CREATED`) values (?,?,?,?,?)
[debug] s.j.J.statement - Preparing statement: select `ACTIVATION_TOKEN`, `PASSWORD`, `ACTIVATED`, `CREATED`, `EMAIL`, `ID` from `USER` where `EMAIL` = 'test#example.com' limit 1
[debug] s.j.J.statement - Preparing statement: select `ACTIVATION_TOKEN`, `PASSWORD`, `ACTIVATED`, `CREATED`, `EMAIL`, `ID` from `USER` where `EMAIL` = 'test#example.com' limit 1
[info] c.z.h.p.HikariPool - HikariCP pool db is shutting down.
[info] c.z.h.p.HikariPool - HikariCP pool db is shutting down.
[info] c.z.h.HikariDataSource - HikariCP pool db is starting.
[info] c.z.h.HikariDataSource - HikariCP pool db is starting.
[info] c.z.h.p.HikariPool - HikariCP pool db is shutting down.
[info] c.z.h.p.HikariPool - HikariCP pool db is shutting down.
The database state looks correct - there are no redundant statements - only these logs concern me. The "Slf4jLogger started" message appears 10 times, so it is not caused by parallel execution of the tests. The number of duplicates goes up to the number of tests (currently 4), and they appear sequentially, with no statements from other parallel executions in between.
Test code:
class UserSpec extends PlaySpecification {

  val userRepo = Injector.inject[UserRepo]

  import scala.concurrent.ExecutionContext.Implicits.global

  def fakeApp: FakeApplication = {
    FakeApplication(additionalConfiguration =
      Map(
        "slick.dbs.default.driver" -> "slick.driver.H2Driver$",
        "slick.dbs.default.db.driver" -> "org.h2.Driver",
        "slick.dbs.default.db.url" -> "jdbc:h2:mem:test;MODE=MySQL;DATABASE_TO_UPPER=FALSE"
      ))
  }

  "User" should {

    "be created as not activated" in new WithApplication(fakeApp) {
      val email = "test#example.com"
      val action = userRepo.create(email, "Password")
        .flatMap(_ => userRepo.findByEmail(email))
      val result = Await.result(action, Duration.Inf)
      result must not(beNone)
      result.map {
        case User(id, email2, _, _, activated, _) => {
          activated must beFalse
          email2 must beEqualTo(email)
        }
      }
    }

    "cannot be created if same email" in new WithApplication(fakeApp) {
      val email = "test2#example.com"
      val action = userRepo.create(email, "Password")
        .flatMap(_ => userRepo.create(email, "Password"))
      val result = Await.result(action, Duration.Inf)
      result must beNone
    }

    // rest omitted
  }
}
Related
I have a simple Flask app started with Gunicorn, which has 4 workers.
I want to clear and warm up the cache when the server restarts, but when I do this inside the create_app() method it executes 4 times.
def create_app(test_config=None):
    app = Flask(__name__)
    # ... different configuration here
    t = threading.Thread(target=reset_cache, args=(app,))
    t.start()
    return app
[2022-10-28 09:33:33 +0000] [7] [INFO] Booting worker with pid: 7
[2022-10-28 09:33:33 +0000] [8] [INFO] Booting worker with pid: 8
[2022-10-28 09:33:33 +0000] [9] [INFO] Booting worker with pid: 9
[2022-10-28 09:33:33 +0000] [10] [INFO] Booting worker with pid: 10
2022-10-28 09:33:36,908 INFO webapp reset_cache:38 Clearing cache
2022-10-28 09:33:36,908 INFO webapp reset_cache:38 Clearing cache
2022-10-28 09:33:36,908 INFO webapp reset_cache:38 Clearing cache
2022-10-28 09:33:36,909 INFO webapp reset_cache:38 Clearing cache
How can I make it run only once, without using any queues, RQ workers, or Celery?
Signals, a mutex, some special check of the worker id (but that is always dynamic)?
What I tried: I haven't found any solution so far.
I used a Redis lock for that.
Here is an example using flask-caching, which I already had in the project, but you can obtain the client from wherever you have a Redis client available:
import time

from webapp.models import cache  # cache = flask_caching.Cache()

def reset_cache(app):
    with app.app_context():
        client = app.extensions["cache"][cache]._write_client  # redis client
        lock = client.lock("warmup-cache-key")
        locked = lock.acquire(blocking=False, blocking_timeout=1)
        if locked:
            app.logger.info("Clearing cache")
            cache.clear()
            app.logger.info("Warming up cache")
            # function call here with `cache.set(...)`
            app.logger.info("Completed warmup cache")
            # time.sleep(5)  # add some delay if procedure is really fast
            lock.release()
It can easily be extended with threads, loops, or whatever else you need to set values in the cache.
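If you don't have flask-caching around, the same locking idea works with a bare redis-py client. Below is a minimal sketch under that assumption; the REDIS_URL config key and the warm_up() helper are hypothetical placeholders, not part of the code above:
import redis

def reset_cache(app):
    # connect with whatever Redis client/URL you already have available
    client = redis.Redis.from_url(app.config["REDIS_URL"])  # hypothetical config key
    # timeout makes the lock auto-expire so a crashed worker cannot hold it forever
    lock = client.lock("warmup-cache-key", timeout=300)
    if lock.acquire(blocking=False):
        try:
            app.logger.info("Clearing and warming up cache")
            warm_up(app)  # hypothetical warmup routine, e.g. a series of cache.set(...) calls
        finally:
            lock.release()
    else:
        app.logger.info("Another worker is already warming up the cache")
Only the first worker to acquire the lock performs the warmup; the other workers skip it, which is the behaviour the question asks for.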
My Django project had been working perfectly fine for the last 90 days.
There has been no new code deployment during this time.
I'm running supervisor -> gunicorn to serve the application, with nginx in front.
Unfortunately it just stopped serving the login page (standard framework login).
I wrote a small view that checks if the DB connection is working and it comes up within seconds.
def updown(request):
    from django.shortcuts import HttpResponse
    from django.db import connections
    from django.db.utils import OperationalError

    status = True
    # Check database connection
    if status is True:
        db_conn = connections['default']
        try:
            c = db_conn.cursor()
        except OperationalError:
            status = False
            error = 'No connection to database'
        else:
            status = True
    if status is True:
        message = 'OK'
    elif status is False:
        message = 'NOK' + ' \n' + error
    return HttpResponse(message)
This delivers back an OK.
But the second I try to reach /admin or anything else that requires the login, it times out.
wget http://127.0.0.1:8000
--2022-07-20 22:54:58-- http://127.0.0.1:8000/
Connecting to 127.0.0.1:8000... connected.
HTTP request sent, awaiting response... 302 Found
Location: /business/dashboard/ [following]
--2022-07-20 22:54:58-- http://127.0.0.1:8000/business/dashboard/
Connecting to 127.0.0.1:8000... connected.
HTTP request sent, awaiting response... 302 Found
Location: /account/login/?next=/business/dashboard/ [following]
--2022-07-20 22:54:58-- http://127.0.0.1:8000/account/login/?next=/business/dashboard/
Connecting to 127.0.0.1:8000... connected.
HTTP request sent, awaiting response... No data received.
Retrying.
--2022-07-20 22:55:30-- (try: 2) http://127.0.0.1:8000/account/login/?next=/business/dashboard/
Connecting to 127.0.0.1:8000... connected.
HTTP request sent, awaiting response...
Supervisor / Gunicorn Log is not helpful at all:
[2022-07-20 23:06:34 +0200] [980] [INFO] Starting gunicorn 20.1.0
[2022-07-20 23:06:34 +0200] [980] [INFO] Listening at: http://127.0.0.1:8000 (980)
[2022-07-20 23:06:34 +0200] [980] [INFO] Using worker: sync
[2022-07-20 23:06:34 +0200] [986] [INFO] Booting worker with pid: 986
[2022-07-20 23:08:01 +0200] [980] [CRITICAL] WORKER TIMEOUT (pid:986)
[2022-07-20 23:08:02 +0200] [980] [WARNING] Worker with pid 986 was terminated due to signal 9
[2022-07-20 23:08:02 +0200] [1249] [INFO] Booting worker with pid: 1249
[2022-07-20 23:12:26 +0200] [980] [CRITICAL] WORKER TIMEOUT (pid:1249)
[2022-07-20 23:12:27 +0200] [980] [WARNING] Worker with pid 1249 was terminated due to signal 9
[2022-07-20 23:12:27 +0200] [1515] [INFO] Booting worker with pid: 1515
Nginx is just giving:
502 Bad Gateway
I don't see anything in the logs, I see no error when running the Django dev server, and Sentry is not showing anything either. Totally lost.
I am running Django 4.0.x and all libraries are up to date.
The check-up script for the database only checks the connection. Due to a misconfiguration of the database replication, the DB was accepting connections and reads, but writes hung.
The login page tries to write a session row to the database, which is what failed in this case.
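A health-check view that also attempts a write would have caught this. Here is a minimal sketch (my own addition, not part of the original view) that saves and then deletes a throwaway session row, exercising the same write path that the login page needs; in the failure described above it would hang rather than return, which already distinguishes it from the read-only check:
from django.contrib.sessions.backends.db import SessionStore
from django.http import HttpResponse

def updown_write(request):
    try:
        s = SessionStore()
        s["healthcheck"] = "ok"  # arbitrary payload
        s.save()                 # INSERT into django_session - this is where a broken write path stalls
        s.delete()               # clean up the throwaway row
        return HttpResponse("OK")
    except Exception as exc:
        return HttpResponse("NOK: write failed (%s)" % exc)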
I have a Flask app running on Heroku with a uWSGI server, in which each user connects to their own database. I have implemented the solution reported here for a very similar situation. In particular, I have implemented the connection registry as follows:
class DBSessionRegistry():

    _registry = {}

    def get(self, URI, **kwargs):
        if URI not in self._registry:
            current_app.logger.info(f'INFO - CREATING A NEW CONNECTION')
            try:
                engine = create_engine(URI,
                                       echo=False,
                                       pool_size=5,
                                       max_overflow=5)
                session_factory = sessionmaker(bind=engine)
                Session = scoped_session(session_factory)
                a_session = Session()
                self._registry[URI] = a_session
            except ArgumentError:
                raise Exception('Error')
        current_app.logger.info(f'SESSION ID: {id(self._registry[URI])}')
        current_app.logger.info(f'REGISTRY ID: {id(self._registry)}')
        current_app.logger.info(f'REGISTRY SIZE: {len(self._registry.keys())}')
        current_app.logger.info(f'APP ID: {id(current_app)}')
        return self._registry[URI]
In my create_app() I assign a registry to the app:
app.DBregistry = DBSessionRegistry()
and whenever I need to talk to the DB I call:
current_app.DBregistry.get(URI)
where the URI depends on the user. This works nicely if I use uWSGI with a single process. With more processes,
[uwsgi]
processes = 4
threads = 1
sometimes it gets stuck on some requests, returning a 503 error code. I have found that the problem appears when the requests are handled by different processes in uwsgi. This is an excerpt of the log, which I commented to illustrate the issue:
# ... EVERYTHING OK UP TO HERE.
# ALL PREVIOUS REQUESTS HANDLED BY PROCESS pid = 12
INFO in utils: SESSION ID: 139860361716304
INFO in utils: REGISTRY ID: 139860484608480
INFO in utils: REGISTRY SIZE: 1
INFO in utils: APP ID: 139860526857584
# NOTE THE pid IN THE NEXT LINE...
[pid: 12|app: 0|req: 1/1] POST /manager/_save_task =>
generated 154 bytes in 3457 msecs (HTTP/1.1 200) 4 headers in 601
bytes (1 switches on core 0)
# PREVIOUS REQUEST WAS MANAGED BY PROCESS pid = 12
# THE NEXT REQUEST IS FROM THE SAME USER AND TO THE SAME URL.
# SO THERE IS NO NEED FOR CREATING A NEW CONNECTION, BUT INSTEAD...
INFO - CREATING A NEW CONNECTION
# TO THIS POINT, I DON'T UNDERSTAND WHY IT CREATED A NEW CONNECTION.
# THE SESSION ID CHANGES, AS IT IS A NEW SESSION
INFO in utils: SESSION ID: 139860363793168 # <<--- CHANGED
INFO in utils: REGISTRY ID: 139860484608480
INFO in utils: REGISTRY SIZE: 1
# THE APP AND THE REGISTRY ARE UNIQUE
INFO in utils: APP ID: 139860526857584
# uwsgi GIVES UP...
*** HARAKIRI ON WORKER 4 (pid: 11, try: 1) ***
# THE FAILED REQUEST WAS MANAGED BY PROCESS pid = 11
# I ASSUME THIS IS WHY IT CREATED A NEW CONNECTION
HARAKIRI: -- syscall> 7 0x7fff4290c6d8 0x1 0xffffffff 0x4000 0x0 0x0
0x7fff4290c6b8 0x7f33d6e3cbc4
HARAKIRI: -- wchan> poll_schedule_timeout
HARAKIRI !!! worker 4 status !!!
HARAKIRI [core 0] - POST /manager/_save_task since 1587660997
HARAKIRI !!! end of worker 4 status !!!
heroku[router]: at=error code=H13 desc="Connection closed without
response" method=POST path="/manager/_save_task"
DAMN ! worker 4 (pid: 11) died, killed by signal 9 :( trying respawn ...
Respawned uWSGI worker 4 (new pid: 14)
# FROM HERE ON, NOTHINGS WORKS ANYMORE
This behavior is consistent over several attempts: when the pid changes, the request fails. Even with pool_size = 1 in the create_engine call the issue persists. No issue occurs if uWSGI is used with a single process.
I am pretty sure it is my fault; there is something I don't know or don't understand about how uWSGI and/or SQLAlchemy work. Could you please help me?
Thanks
What is happening is that you are trying to share memory between processes.
There are some explanations in these posts:
"Is it possible to share memory between uwsgi processes running a flask app?"
(https://stackoverflow.com/a/45383617/11542053)
You can use an extra layer to store your sessions outside of the app.
For that, you can use uWSGI's SharedArea (https://uwsgi-docs.readthedocs.io/en/latest/SharedArea.html), which is very low level, or you can use other approaches like uWSGI's caching (https://uwsgi-docs.readthedocs.io/en/latest/Caching.html).
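For the caching route, here is a minimal sketch of sharing a small value between workers through uWSGI's cache API. It assumes a cache configured in the ini (e.g. cache2 = name=mycache,items=100); note that only plain data can be shared this way - live SQLAlchemy sessions or engines cannot be stored in the cache, so each process still has to build its own:
import uwsgi  # only importable when running under uWSGI

def get_shared_value(key):
    # cache values are bytes; cache_get returns None on a miss
    cached = uwsgi.cache_get(key, "mycache")
    if cached is not None:
        return cached.decode()
    value = compute_value(key)  # hypothetical helper
    uwsgi.cache_set(key, value.encode(), 0, "mycache")  # 0 means no expiry
    return value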
Hope it helps.
I am facing a worker timeout issue; when it happens, the API request also times out.
Please look at the log and settings below.
Here is the log:
[2016-11-26 09:45:02 +0000] [19064] [INFO] Autorestarting worker after current request.
[2016-11-26 09:45:02 +0000] [19064] [INFO] Worker exiting (pid: 19064)
[2016-11-26 09:46:02 +0000] [19008] [CRITICAL] WORKER TIMEOUT (pid:19064)
[2016-11-28 04:12:06 +0000] [19008] [INFO] Handling signal: winch
gunicorn config:
workers = threads = numCPUs() * 2 + 1
backlog = 2048
max_requests = 1200
timeout = 60
preload_app = True
worker_class = "gevent"
debug = True
daemon = False
pidfile = "/tmp/gunicorn.pid"
logfile = "/tmp/gunicorn.log"
loglevel = 'info'
accesslog = '/tmp/gunicorn-access.log'
I'm using the ClusterSingletonManager from the Akka 2.2 contrib project to guarantee there is always exactly one actor of a specific type (the master) in a cluster. However, I've observed some odd behaviour (which, incidentally, may be expected, but I can't understand why). Whenever a master drops out of the cluster and joins again later, the following sequence of events occurs:
[INFO] [04/30/2013 17:47:35.805] [ClusterSystem-akka.actor.default-dispatcher-9] [akka://ClusterSystem/system/cluster/core/daemon] Cluster Node [akka.tcp://ClusterSystem#127.0.0.1:2551] - Welcome from [akka.tcp://ClusterSystem#127.0.0.1:2552]
[INFO] [04/30/2013 17:47:48.703] [ClusterSystem-akka.actor.default-dispatcher-8] [akka://ClusterSystem/user/singleton] Member removed [akka.tcp://ClusterSystem#127.0.0.1:52435]
[INFO] [04/30/2013 17:47:48.712] [ClusterSystem-akka.actor.default-dispatcher-2] [akka://ClusterSystem/user/singleton] ClusterSingletonManager state change [Start -> BecomingLeader]
[INFO] [04/30/2013 17:47:49.752] [ClusterSystem-akka.actor.default-dispatcher-9] [akka://ClusterSystem/user/singleton] Retry [1], sending HandOverToMe to [None]
[INFO] [04/30/2013 17:47:50.850] [ClusterSystem-akka.actor.default-dispatcher-21] [akka://ClusterSystem/user/singleton] Retry [2], sending HandOverToMe to [None]
[INFO] [04/30/2013 17:47:51.951] [ClusterSystem-akka.actor.default-dispatcher-20] [akka://ClusterSystem/user/singleton] Retry [3], sending HandOverToMe to [None]
[INFO] [04/30/2013 17:47:53.049] [ClusterSystem-akka.actor.default-dispatcher-3]
...
[INFO] [04/30/2013 17:48:10.650] [ClusterSystem-akka.actor.default-dispatcher-21] [akka://ClusterSystem/user/singleton] Retry [20], sending HandOverToMe to [None]
[INFO] [04/30/2013 17:48:11.751] [ClusterSystem-akka.actor.default-dispatcher-4] [akka://ClusterSystem/user/singleton] Timeout in BecomingLeader. Previous leader unknown, removed and no TakeOver request.
[INFO] [04/30/2013 17:48:11.752] [ClusterSystem-akka.actor.default-dispatcher-4] [akka://ClusterSystem/user/singleton] Singleton manager [akka.tcp://ClusterSystem#127.0.0.1:2551] starting singleton actor
[INFO] [04/30/2013 17:48:11.754] [ClusterSystem-akka.actor.default-dispatcher-4] [akka://ClusterSystem/user/singleton] ClusterSingletonManager state change [BecomingLeader -> Leader]
Why is it attempting to send a HandOverToMe to [None]? It takes about 20 seconds (20 retries) until it becomes the new leader, even though in this particular situation the previous leader was well known...
I'm not sure if this will answer your question, but by looking at the source code for ClusterSingletonManager you can see the chain of events that leads to this scenario. The class uses Akka's Finite State Machine logic, and the behavior you are seeing is kicked off by a state transition from Start -> BecomingLeader. First, look at the Start state:
when(Start) {
  case Event(StartLeaderChangedBuffer, _) ⇒
    leaderChangedBuffer = context.actorOf(Props[LeaderChangedBuffer].withDispatcher(context.props.dispatcher))
    getNextLeaderChanged()
    stay
  case Event(InitialLeaderState(leaderOption, memberCount), _) ⇒
    leaderChangedReceived = true
    if (leaderOption == selfAddressOption && memberCount == 1)
      // alone, leader immediately
      gotoLeader(None)
    else if (leaderOption == selfAddressOption)
      goto(BecomingLeader) using BecomingLeaderData(None)
    else
      goto(NonLeader) using NonLeaderData(leaderOption)
}
The part to look at here is:
else if (leaderOption == selfAddressOption)
  goto(BecomingLeader) using BecomingLeaderData(None)
To me, this piece is saying: "If I'm the leader, transition from Start to BecomingLeader with None as the previous-leader option."
Then, if you look at the BecomingLeader state:
when(BecomingLeader) {
  ...
  case Event(HandOverRetry(count), BecomingLeaderData(previousLeaderOption)) ⇒
    if (count <= maxHandOverRetries) {
      logInfo("Retry [{}], sending HandOverToMe to [{}]", count, previousLeaderOption)
      previousLeaderOption foreach { peer(_) ! HandOverToMe }
      setTimer(HandOverRetryTimer, HandOverRetry(count + 1), retryInterval, repeat = false)
    } else if (previousLeaderOption forall removed.contains) {
      // can't send HandOverToMe, previousLeader unknown for new node (or restart)
      // previous leader might be down or removed, so no TakeOverFromMe message is received
      logInfo("Timeout in BecomingLeader. Previous leader unknown, removed and no TakeOver request.")
      gotoLeader(None)
    } else
      throw new ClusterSingletonManagerIsStuck(
        s"Becoming singleton leader was stuck because previous leader [${previousLeaderOption}] is unresponsive")
}
This is the block that keeps repeating the message you are seeing in the log. It is essentially trying to get the previous leader to hand over responsibility without knowing who that previous leader was, because the state transition passed in None as the previous leader. The million-dollar question is: if it doesn't know who the previous leader is, why keep attempting handoffs that will never succeed?