libpqxx: How to reconnect to a PostgreSQL database after the connection process has died - c++

I instantiate a PostgreSQL connection through libpqxx. I query the database and get a correct response. Then I tried the following error case: after an instance of pqxx::connection has been created, I pause my program, manually kill the PostgreSQL connection process from a Linux command shell, and resume the program. It continues until it tries to create a new transaction (pqxx::work), where it throws pqxx::broken_connection. I handle this exception and try to reconnect with a call to pqxx::connection::activate(), but another pqxx::broken_connection gets thrown. How can I reconnect to the DB without instantiating another pqxx::connection?
P.S. reactivation is not inhibited. I use the standard connection type -
namespace pqxx
{
typedef basic_connection<connect_direct> connection;
}

Nobody has answered, so here is what I found. After I manually kill the process behind the connection, several successive calls to pqxx::connection::activate eventually get it reconnected, so that is my workaround:
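class dbconnection : public pqxx::connection
{
public:
explicit dbconnection(const std::string &options) : pqxx::connection(options) {}
// Tries activate() up to max_attempts times.
// Returns true once the connection is open, false if every attempt failed.
bool reconnect(int max_attempts = 10)
{
for(int attempt = 0; attempt < max_attempts; ++attempt)
{
try
{
if(!this->is_open())
{
this->activate();
}
return true;
}
catch(const pqxx::broken_connection &)
{
// activate() may fail several times after the backend was killed; try again
}
}
return false;
}
};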
I call dbconnection::reconnect each time I catch a pqxx::broken_connection. Let me know if you have a better solution.
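The same bounded-retry idea can be kept separate from the connection class. Below is a minimal sketch, assuming a hypothetical helper named retry (it is not part of libpqxx) that catches std::exception, which pqxx::broken_connection derives from. The pause between attempts is my own addition, on the assumption that the server needs a moment before it accepts a new connection.

```cpp
#include <cassert>
#include <chrono>
#include <exception>
#include <stdexcept>
#include <thread>

// Hypothetical helper: calls attempt() up to max_attempts times and sleeps
// for `delay` between failed attempts.
// Returns true as soon as one call completes without throwing.
template <typename Attempt>
bool retry(Attempt attempt, int max_attempts, std::chrono::milliseconds delay)
{
    assert(max_attempts > 0);
    for (int i = 0; i < max_attempts; ++i) {
        try {
            attempt();
            return true;
        } catch (const std::exception&) {
            // Only sleep if another attempt will follow.
            if (i + 1 < max_attempts) {
                std::this_thread::sleep_for(delay);
            }
        }
    }
    return false;
}
```

With the connection type from the question, the callable would wrap the is_open()/activate() pair, and the caller would decide what to do when retry returns false.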

Related

gRPC: What are the best practices for long-running streaming?

We've implemented a Java gRPC service that runs in the cloud, with a unidirectional (client to server) streaming RPC which looks like:
rpc PushUpdates(stream Update) returns (Ack);
A C++ client (a mobile device) calls this RPC as soon as it boots up, to continuously send an update every 30 or so seconds for as long as the device is up and running.
ChannelArguments chan_args;
// this will be secure channel eventually
auto channel_p = CreateCustomChannel(remote_addr, InsecureChannelCredentials(), chan_args);
auto stub_p = DialTcc::NewStub(channel_p);
// ...
Ack ack;
auto strm_ctxt_p = make_unique<ClientContext>();
auto strm_p = stub_p->PushUpdates(strm_ctxt_p.get(), &ack);
// ...
while (true) {
// wait until we are ready to send a new update
Update updt;
// populate updt;
if(!strm_p->Write(updt)) {
// stream is not kosher, create a new one and restart
break;
}
}
Now different kinds of network interruptions happen while this is happening:
the gRPC service running in the cloud may go down (for maintenance) or may simply become unreachable.
the device's own ip address keeps changing as it is a mobile device.
We've seen that on such events, neither the channel nor the Write() API is able to detect the network disconnection reliably. At times the client keeps calling Write() (which doesn't return false) but the server doesn't receive any data (Wireshark doesn't show any activity at the outgoing port of the client device).
What are the best practices to recover in such cases, so that the server starts receiving the updates within X seconds from the time when such an event occurs? It is understandable that there would be a loss of X seconds' worth of data whenever such an event happens, but we want to recover reliably within X seconds.
gRPC version: 1.30.2, Client: C++14/Linux, Server: Java/Linux
Here's how we've hacked this. I want to check if this can be made any better or anyone from gRPC can guide me about a better solution.
The protobuf for our service looks like this. It has an RPC for pinging the service, which is used frequently to test connectivity.
// Message used in IsAlive RPC
message Empty {}
// Acknowledgement sent by the service for updates received
message UpdateAck {}
// Messages streamed to the service by the client
message Update {
...
...
}
service GrpcService {
// for checking if we're able to connect
rpc Ping(Empty) returns (Empty);
// streaming RPC for pushing updates by client
rpc PushUpdate(stream Update) returns (UpdateAck);
}
Here is how the c++ client looks, which does the following:
Connect():
Create the stub for calling the RPCs, if the stub is nullptr.
Call Ping() in regular intervals until it is successful.
On success call PushUpdate(...) RPC to create a new stream.
On failure reset the stream to nullptr.
Stream(): Do the following in a while(true) loop:
Get the update to be pushed.
Call Write(...) on the stream with the update to be pushed.
If Write(...) fails for any reason break and the control goes back to Connect().
Once every 15 minutes (RESTART_INTERVAL_M in the code below), reset everything (stub, channel, stream) to nullptr to start afresh. This is required because at times Write(...) does not fail even though there is no connection between the client and the service: the Write(...) calls succeed, but Wireshark shows no activity on the client's outgoing port.
Here is the code:
constexpr int GRPC_TIMEOUT_S = 10;
constexpr int RESTART_INTERVAL_M = 15;
constexpr int GRPC_KEEPALIVE_TIME_MS = 10000;
string root_ca, tls_key, tls_cert; // for SSL
string remote_addr = "https://remote.com:5445";
...
...
void ResetStreaming() {
if (stub_p) {
if (strm_p) { // graceful restart/stop, this pair of API are called together, in this order
if (!strm_p->WritesDone()) {
// Log a message
}
strm_p->Finish(); // Log if return value of this is NOT grpc::OK
}
strm_p = nullptr;
strm_ctxt_p = nullptr;
stub_p = nullptr;
channel_p = nullptr;
}
}
void CreateStub() {
if (!stub_p) {
ChannelArguments chan_args;
chan_args.SetInt(GRPC_ARG_KEEPALIVE_TIME_MS, GRPC_KEEPALIVE_TIME_MS);
channel_p = CreateCustomChannel(
remote_addr,
SslCredentials(SslCredentialsOptions{root_ca, tls_key, tls_cert}),
chan_args);
stub_p = GrpcService::NewStub(channel_p);
}
}
void Stream() {
const auto restart_time = steady_clock::now() + minutes(RESTART_INTERVAL_M);
while (!stop) {
// restart every RESTART_INTERVAL_M (15m) even if ALL IS WELL!!
if (steady_clock::now() > restart_time) {
break;
}
Update updt = GetUpdate(); // get the update to be sent
if (!stop) {
if (channel_p->GetState(true) == GRPC_CHANNEL_SHUTDOWN ||
!strm_p->Write(updt)) {
// could not write!!
return; // we will Connect() again
}
}
}
// stopped due to stop = true or interval to create new stream has expired
ResetStreaming(); // channel, stub, stream are recreated once in every 15m
}
bool PingRemote() {
ClientContext ctxt;
ctxt.set_deadline(system_clock::now() + seconds(GRPC_TIMEOUT_S));
Empty req, resp;
CreateStub();
if (stub_p->Ping(&ctxt, req, &resp).ok()) {
static UpdateAck ack;
strm_ctxt_p = make_unique<ClientContext>(); // need new context
strm_p = stub_p->PushUpdate(strm_ctxt_p.get(), &ack);
return true;
}
if (strm_p) {
strm_p = nullptr;
strm_ctxt_p = nullptr;
}
return false;
}
void Connect() {
while (!stop) {
if (PingRemote() || stop) {
break;
}
sleep_for(seconds(5)); // wait before retrying
}
}
// set to true from another thread when we want to stop
atomic<bool> stop{false}; // brace-init: copy-initializing an atomic does not compile before C++17
void StreamUntilStopped() {
if (stop) {
return;
}
strm_thread_p = make_unique<thread>([&] {
while (!stop) {
Connect();
Stream();
}
});
}
// called by the thread that sets stop = true
void Finish() {
strm_thread_p->join();
}
With this we are seeing that the streaming recovers within 15 minutes (or RESTART_INTERVAL_M) whenever there is a disruption for any reason. This code runs in a fast path, so I am curious to know if this can be made any better.
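The exit conditions of Stream() can be tested without gRPC if the calls are injected. Below is a minimal sketch, assuming hypothetical names StreamHooks, StreamExit and stream_until_exit that stand in for the Write() and ResetStreaming() calls above. One deliberate difference from the posted code: the sketch resets on every exit path, including a failed write, so the old stream is always closed before Connect() creates a new one. Whether that is the right behavior for your service is an assumption.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Hypothetical hooks standing in for the gRPC calls in the post.
struct StreamHooks {
    std::function<bool()> write;  // stands in for strm_p->Write(update)
    std::function<void()> reset;  // stands in for ResetStreaming()
};

enum class StreamExit { WriteFailed, RestartDue, Stopped };

// Writes until a write fails, the restart interval elapses, or should_stop()
// returns true. Always calls reset() exactly once before returning.
StreamExit stream_until_exit(const StreamHooks& hooks,
                             std::chrono::steady_clock::duration restart_interval,
                             const std::function<bool()>& should_stop)
{
    assert(hooks.write && hooks.reset && should_stop);
    const auto restart_at = std::chrono::steady_clock::now() + restart_interval;
    StreamExit reason = StreamExit::Stopped;
    while (!should_stop()) {
        if (std::chrono::steady_clock::now() >= restart_at) {
            reason = StreamExit::RestartDue;
            break;
        }
        if (!hooks.write()) {
            reason = StreamExit::WriteFailed;
            break;
        }
    }
    hooks.reset();
    return reason;
}
```

Returning the exit reason lets the outer loop log why a stream ended, which helps when checking how often the periodic restart is really what recovers the connection.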

How to prevent name collisions when creating QSqlDatabase connection in multiple threads

I have a multi-threaded QTcpServer that creates a new thread for each database request to keep the server responsive. So in each thread I have to create a new QSqlDatabase connection, but I keep getting name collisions between connections.
Here is my sample code to recreate the issue:
#include <QSqlDatabase>
class DBTask
{
public:
DBTask(ClientSocket *socket,ConnectionWorker *connectionWorker);
~DBTask();
static void initStatic();
private:
static QThreadPool *pool; // all addConnection() calls are run through QtConcurrent::run with this pool
static QString host, user, type, password, name;
static quint64 dbConnectionNumber;
QSqlDatabase db;
ClientSocket *socket;
ConnectionWorker *connectionWorker;
bool addDatabase() ;
};
quint64 DBTask::dbConnectionNumber=0;
DBTask::DBTask(ClientSocket *socket, ConnectionWorker *connectionWorker):
socket(socket),
connectionWorker(connectionWorker)
{
dbConnectionNumber++;
}
bool DBTask::addDatabase() {
QSqlDatabase db = QSqlDatabase::addDatabase(type,QString::number(dbConnectionNumber));
db.setHostName(host);
db.setDatabaseName(name);
db.setUserName(user);
db.setPassword(password);
if(!db.open()){
qWarning() << "Error while opening database for socket " << socket << '\n' << db.lastError();
return false;
}
else {
return true;
}
}
This works fine when I check my application manually through the GUI at human speed. But when I run C++ test code that simulates thousands of requests, like this:
void connectionTest(){
QThreadPool pool;
pool.setMaxThreadCount(10);
for(int i=0;i<10;i++){
QtConcurrent::run(&pool,[this](){
for(int i=0;i<1000;i++){
login(i%2); // login function sends request to QTcpServer
}
});
}
}
I get multiple errors like this:-
QSqlDatabasePrivate::removeDatabase: connection '10' is still in use, all queries will cease to work.
QSqlDatabasePrivate::addDatabase: duplicate connection name '10', old connection removed.
QSqlDatabasePrivate::removeDatabase: connection '10' is still in use, all queries will cease to work.
QSqlDatabasePrivate::addDatabase: duplicate connection name '10', old connection removed.
and the server crashes with a segfault.
Even if you make the counter atomic, a thread can still be interrupted in DBTask::addDatabase (before it creates the connection); another thread can then increment the counter, and both continue and create two connections with the same id. You need to perform both operations (incrementing the counter and creating the connection) as one atomic step inside DBTask::addDatabase, guarded by a mutex.
After adding a QMutex to addDatabase, it works:
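bool DBTask::addDatabase() {
{
QMutexLocker locker(&mutex); // mutex is assumed to be a static QMutex shared by all DBTask instances
dbConnectionNumber++;
db = QSqlDatabase::addDatabase(type,QString::number(dbConnectionNumber));
} // the locker releases the mutex here, on every exit path
...
}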
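For the naming part alone, an alternative to the mutex is to take the name from the return value of an atomic fetch_add, so that incrementing and reading happen in one step and no two callers see the same number. Below is a minimal sketch using only the standard library. The function names and the "conn-" prefix are hypothetical, and the sketch says nothing about whether the rest of addDatabase needs the lock.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <thread>
#include <vector>

// Returns a connection name that no other caller receives.
// fetch_add increments the counter and returns the previous value atomically.
std::string next_connection_name()
{
    static std::atomic<unsigned long long> counter{0};
    return "conn-" + std::to_string(counter.fetch_add(1));
}

// Demonstration: several threads request names concurrently; returns true
// if every name handed out was distinct.
bool names_are_unique(int thread_count, int names_per_thread)
{
    assert(thread_count > 0 && names_per_thread > 0);
    std::vector<std::vector<std::string>> collected(thread_count);
    std::vector<std::thread> workers;
    for (int t = 0; t < thread_count; ++t) {
        workers.emplace_back([&collected, t, names_per_thread] {
            for (int i = 0; i < names_per_thread; ++i) {
                collected[t].push_back(next_connection_name());
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    std::set<std::string> unique;
    for (const auto& names : collected) {
        unique.insert(names.begin(), names.end());
    }
    return unique.size() == static_cast<std::size_t>(thread_count) * names_per_thread;
}
```

The original collision came from incrementing in the constructor and reading the shared counter later in addDatabase. Here each caller keeps the value it was given instead of re-reading the shared counter.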

How to check quickly whether a database is reachable? (Qt, QML, C++) - Linux

I use Qt with QML and C++. My application uses a database.
It all works if the database is reachable.
My problem is that I would like to check whether the database is reachable (like a ping).
I tried:
db.setDatabaseName(dsn);
if(db.isValid())
{
if(db.open())
{
//std::cout << "Database open";
connected=true;
}
else
{
connected=false;
}
}
else
{
connected=false;
}
and return the connected value as the result. But that takes very long (maybe 30 seconds) if there is no connection. How can I check quickly whether I have a database connection?
Is there maybe a way to abort the .open call if it has not connected after 5 seconds?
I think one easy solution is to just ping the database server. You can use platform-specific ways of pinging.
This would work on Linux:
int exitCode = QProcess::execute("ping", QStringList() << "-c" << "2" << serverIp); // each argument is passed separately
if (exitCode==0)
{
// is reachable
} else
{
// is not reachable
}
I have studied this question a bit. Here is what I found out.
The problem is the default database connection timeout, which is too long. Each database lets you change it to an acceptable value through its own API. Qt has one common database interface, QSqlDatabase, and it has no such method. You can pass connection settings through QSqlDatabase::setConnectOptions, but it accepts only a predefined list of options per driver (which you can read in Qt's help).
For PostgreSQL there is an option connect_timeout, so you can write:
db.setConnectOptions("connect_timeout=5"); // Set to 5 seconds
Not every database driver has such a parameter (but see the updates below for ODBC). The connection options of each database are parsed in its 'driver' class, which derives from QSqlDriver and is stored in a 'driver' library.
So, what you can do:
You can rewrite the database's driver so that it accepts a timeout option.
You can write separate code for each database, using its native API.
UPDATE
It turns out that ODBC has an SQL_ATTR_CONNECTION_TIMEOUT option.
UPDATE 2
qsql_odbc.cpp:713
} else if (opt.toUpper() == QLatin1String("SQL_ATTR_CONNECTION_TIMEOUT")) {
v = val.toUInt();
r = SQLSetConnectAttr(hDbc, SQL_ATTR_CONNECTION_TIMEOUT, (SQLPOINTER) v, 0);
https://msdn.microsoft.com/en-us/library/ms713605(v=vs.85).aspx
SQL_ATTR_CONNECTION_TIMEOUT (ODBC 3.0)
An SQLUINTEGER value
corresponding to the number of seconds to wait for any request on the
connection to complete before returning to the application. The driver
should return SQLSTATE HYT00 (Timeout expired) anytime that it is
possible to time out in a situation not associated with query
execution or login.
If ValuePtr is equal to 0 (the default), there is no timeout.
Should work fine...
I suggest having a separate thread/class where you check the connection and emit a signal after some timeout if nothing happens (with a flag, knowConnection, that records whether we already know the connection state).
This code is untested and written off the top of my head, so it may contain some errors.
/// db connection validator in separate thread
void validator::doValidate() {
this->knowConnection = false;
db.setDatabaseName(dsn);
if(db.isValid())
{
QTimer::singleShot(1000, [this]() {
if (!this->knowConnection) {
emit connected(false);
dm->connected=false;
}
});
if(db.open())
{
//std::cout << "Database open";
this->knowConnection = true;
dm->connected=true;
emit connected(true);
}
else
{
dm->connected=false;
this->knowConnection = true;
emit connected(false);
}
}
else
{
dm->connected=false;
this->knowConnection = true;
emit connected(false);
}
}
/// db manager in different thread
void dm::someDbFunction() {
if (connected) {
/// db logic
}
}
/// in gui or whatever
MainWindow::MainWindow() : whatever, val(new validator(..)), .. {
connect(val, SIGNAL(connected(bool)), this, SLOT(statusSlot(bool)));
....
}
void MainWindow::statusSlot(bool connected) {
ui->statusBar->showMessage(connected ? "Connected" : "Disconnected"); // assuming statusBar is a QStatusBar
}
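The "check in a separate thread and give up after a timeout" idea from the last answer can be sketched without Qt. Below is a minimal sketch, assuming a hypothetical helper check_with_timeout that runs any blocking check (for example a lambda that opens its own connection) on a detached worker and waits for the result for a limited time. Note that the worker is not cancelled on timeout. It keeps running until the blocking call returns, and its result is then discarded.

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <memory>
#include <thread>

// Hypothetical helper: runs blocking_check on a detached worker thread and
// waits at most `timeout` for its result.
// Returns false on timeout, or if the check fails or throws.
template <typename Check>
bool check_with_timeout(Check blocking_check, std::chrono::milliseconds timeout)
{
    assert(timeout.count() >= 0);
    // The promise is shared so it outlives this call if the worker is late.
    auto result = std::make_shared<std::promise<bool>>();
    std::future<bool> done = result->get_future();
    std::thread([result, blocking_check]() mutable {
        try {
            result->set_value(blocking_check());
        } catch (...) {
            result->set_value(false);
        }
    }).detach();
    if (done.wait_for(timeout) != std::future_status::ready) {
        return false;  // still blocked: report "not reachable" for now
    }
    return done.get();
}
```

With Qt, keep in mind that a QSqlDatabase connection may only be used from the thread that created it, so the check would need to create and open its own named connection inside the lambda rather than share the application's db object.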

How to trace resource deadlocks?

I've written a timer using std::thread. Here is what it looks like:
TestbedTimer::TestbedTimer(char type, void* contextObject) :
Timer(type, contextObject) {
this->active = false;
}
TestbedTimer::~TestbedTimer(){
if (this->active) {
this->active = false;
if(this->timer->joinable()){
try {
this->timer->join();
} catch (const std::system_error& e) {
std::cout << "Caught system_error with code " << e.code() <<
" meaning " << e.what() << '\n';
}
}
if(timer != nullptr) {
delete timer;
}
}
}
void TestbedTimer::run(unsigned long timeoutInMicroSeconds){
this->active = true;
timer = new std::thread(&TestbedTimer::sleep, this, timeoutInMicroSeconds);
}
void TestbedTimer::sleep(unsigned long timeoutInMicroSeconds){
unsigned long interval = 500000;
if(timeoutInMicroSeconds < interval){
interval = timeoutInMicroSeconds;
}
while((timeoutInMicroSeconds > 0) && (active == true)){
if (active) {
timeoutInMicroSeconds -= interval;
/// set the sleep time
std::chrono::microseconds duration(interval);
/// set thread to sleep
std::this_thread::sleep_for(duration);
}
}
if (active) {
this->notifyAllListeners();
}
}
void TestbedTimer::interrupt(){
this->active = false;
}
I'm not really happy with this implementation, since I let the timer sleep for a short interval and then check whether the active flag has changed (but I don't know a better solution, since you can't interrupt a sleep_for call). However, my program core dumps with the following message:
thread is joinable
Caught system_error with code generic:35 meaning Resource deadlock avoided
thread has rejoined main scope
terminate called without an active exception
Aborted (core dumped)
I've looked up this error, and it seems that I have a thread which waits for another thread (the reason for the resource deadlock). However, I want to find out where exactly this happens. My C++ code uses a C library (which uses pthreads) that provides, among other features, an option to run as a daemon, and I'm afraid that this interferes with my std::thread code. What's the best way to debug this?
I've tried helgrind, but it hasn't helped much (it doesn't find any error).
TIA
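On the remark that a sleep_for call cannot be interrupted: a condition variable with a timed wait is the usual way to get a sleep that another thread can cut short. Below is a minimal sketch, assuming a hypothetical helper class InterruptibleWait (it is not part of the code in the question) that could replace the polling loop in TestbedTimer::sleep.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical helper: a wait that another thread can end early.
class InterruptibleWait {
public:
    // Blocks for up to `timeout`.
    // Returns true if the full timeout elapsed, false if interrupt() was called.
    bool wait_for(std::chrono::microseconds timeout)
    {
        assert(timeout.count() >= 0);
        std::unique_lock<std::mutex> lock(mutex_);
        // The predicate form handles spurious wakeups; it returns the
        // predicate's value, which is false only when the timeout expired.
        return !cv_.wait_for(lock, timeout, [this] { return interrupted_; });
    }

    // Wakes the waiting thread; later waits return immediately.
    void interrupt()
    {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            interrupted_ = true;
        }
        cv_.notify_all();
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    bool interrupted_ = false;
};
```

The timer thread would call wait_for once with the whole timeout and notify its listeners only when it returns true, so no 500 ms polling interval is needed.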
** EDIT: The code above is not example code but code I've written for a routing daemon. The routing algorithm is reactive, meaning it starts a route discovery only if it has no route to a desired destination and does not try to build up a routing table for every host in its network. Every time a route discovery is triggered, a timer is started. If the timer expires, the daemon is notified and the packet is dropped. Basically, it looks like this:
void Client::startNewRouteDiscovery(Packet* packet) {
AddressPtr destination = packet->getDestination();
...
startRouteDiscoveryTimer(packet);
...
}
void Client::startRouteDiscoveryTimer(const Packet* packet) {
RouteDiscoveryInfo* discoveryInfo = new RouteDiscoveryInfo(packet);
/// create a new timer of a certain type
Timer* timer = getNewTimer(TimerType::ROUTE_DISCOVERY_TIMER, discoveryInfo);
/// pass this class as the callback object that is notified when the timer expires (the class implements an interface for that)
timer->addTimeoutListener(this);
/// start the timer
timer->run(routeDiscoveryTimeoutInMilliSeconds * 1000);
AddressPtr destination = packet->getDestination();
runningRouteDiscoveries[destination] = timer;
}
If the timer has expired the following method is called.
void Client::timerHasExpired(Timer* responsibleTimer) {
char timerType = responsibleTimer->getType();
switch (timerType) {
...
case TimerType::ROUTE_DISCOVERY_TIMER:
handleExpiredRouteDiscoveryTimer(responsibleTimer);
return;
....
default:
// if this happens it's a bug in our code
logError("Could not identify expired timer");
delete responsibleTimer;
}
}
I hope that helps to give a better understanding of what I'm doing. However, I did not intend to bloat the question with that additional code.
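One common source of "Resource deadlock avoided" from std::thread::join is a thread trying to join itself, which can happen when a timer object is destroyed from inside its own expiry callback (the delete responsibleTimer path runs on the timer thread). Whether that is what happens here is an assumption, but it is cheap to check by comparing thread ids before joining. Below is a minimal sketch with hypothetical helper names:

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Joins t unless the calling thread *is* t.
// Returns false in that case, because join() would then fail with
// std::errc::resource_deadlock_would_occur.
bool join_unless_self(std::thread& t)
{
    if (!t.joinable()) {
        return true;  // nothing to join
    }
    if (t.get_id() == std::this_thread::get_id()) {
        return false;  // called from the thread's own body
    }
    t.join();
    return true;
}

// Demonstration: a worker that tries to join its own std::thread object.
// Returns true if the self-join was refused and the join from outside worked.
bool self_join_is_rejected()
{
    std::thread worker;
    std::atomic<bool> handle_ready{false};
    std::atomic<bool> joined_self{true};
    worker = std::thread([&] {
        // Wait until `worker` has been assigned before touching it.
        while (!handle_ready.load()) {
            std::this_thread::yield();
        }
        joined_self.store(join_unless_self(worker));
    });
    assert(worker.joinable());
    handle_ready.store(true);
    const bool joined_from_outside = join_unless_self(worker);
    return joined_from_outside && !joined_self.load();
}
```

If the check returns false in the destructor, the object is being destroyed on its own thread. Deleting a still-joinable std::thread at that point calls std::terminate, which would match the "terminate called without an active exception" line in the output. Logging std::this_thread::get_id() in the destructor and in sleep() is a simple way to confirm or rule this out.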

What is a good way to handle multithreading with Poco SocketReactor?

So I'm starting to do some research on alternatives for implementing a high volume client/server system, and I'm currently looking at Poco's Reactor framework since I'm using Poco for so much of my application frameworks now.
The incoming packet sizes are going to be pretty small, so I think it will work fine from the perspective of reading the data from the clients. But the operations that will be performed based on the client input will be relatively expensive and may need to be offloaded to another process or even another server. And the responses sent back to the client will sometimes be fairly large. So obviously I can't block the reactor thread while that is taking place.
So I'm thinking if I just read the data in the reactor event handler and then pass it to another thread(pool) that processes the data, it would work out better.
What I'm not too sure about is the process for sending the responses back to the client when the operations are complete.
I can't find too much information about the best ways to use the framework. But I've done some testing and it looks like the reactor will fire the WritableNotification event repeatedly while the socket is writable. So would the optimal process be to queue up the data that needs to be sent in the object that receives the WritableNotification events and send small chunks each time the event is received?
Update: So when I started testing this I was horrified to discover that server CPU usage went up to 100% on the CPU the server app was running on with a single connection. But after some digging I found what I was doing wrong. I discovered that I don't need to register for WritableNotification events when the service handler is created, I only need to register when I have data to send. Then once all of the data is sent, I should unregister the event handler. This way the reactor doesn't have to keep calling the event handlers over and over when there is nothing to send. Now my CPU usage stays close to 0 even with 100 connections. Whew!
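The register-only-while-there-is-data rule from the update can be isolated in a small queue class. Below is a minimal sketch without Poco, assuming a hypothetical OutboundQueue whose on_interest callback stands in for the reactor's addEventHandler/removeEventHandler calls for WritableNotification, and whose send callback stands in for sendBytes. It is single-threaded as written. If worker threads enqueue responses, access would need a lock.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <string>
#include <utility>

// Hypothetical helper: tracks pending output and asks for writable events
// only while there is something to send.
class OutboundQueue {
public:
    // on_interest(true) stands in for registering the writable handler,
    // on_interest(false) for unregistering it.
    explicit OutboundQueue(std::function<void(bool)> on_interest)
        : on_interest_(std::move(on_interest))
    {
        assert(on_interest_);
    }

    void enqueue(std::string data)
    {
        if (data.empty()) {
            return;
        }
        const bool was_idle = pending_.empty();
        pending_.push_back(std::move(data));
        if (was_idle) {
            on_interest_(true);  // first pending chunk: start listening
        }
    }

    // Call from the writable handler. send(ptr, len) returns the number of
    // bytes the socket accepted, which may be fewer than len.
    void on_writable(const std::function<std::size_t(const char*, std::size_t)>& send)
    {
        if (pending_.empty()) {
            return;
        }
        std::string& front = pending_.front();
        const std::size_t sent = send(front.data(), front.size());
        assert(sent <= front.size());
        front.erase(0, sent);
        if (front.empty()) {
            pending_.pop_front();
        }
        if (pending_.empty()) {
            on_interest_(false);  // nothing left: stop listening
        }
    }

    bool idle() const { return pending_.empty(); }

private:
    std::function<void(bool)> on_interest_;
    std::deque<std::string> pending_;
};
```

Sending one chunk per writable event keeps the reactor thread from blocking on large responses, which is the concern raised in the question.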
I wrote a class ServerConnector, copied from SocketConnector, that does not call connect on the socket because the socket is already connected. If a reactor is started with a ServiceHandler for notifications in the run() function of a TCPServerConnection, TCPServer starts a new thread for it. That gives me a multithreaded reactor pattern, but I do not know whether it is the best way.
The ServerConnector class:
template <class ServiceHandler>
class ServerConnector
{
public:
explicit ServerConnector(StreamSocket& ss):
_pReactor(0),
_socket(ss)
/// Creates a ServerConnector, using the given Socket.
{
}
ServerConnector(StreamSocket& ss, SocketReactor& reactor):
_pReactor(0),
_socket(ss)
/// Creates a ServerConnector, using the given StreamSocket.
/// The ServerConnector registers itself with the given SocketReactor.
{
registerConnector(reactor);
onConnect();
}
virtual ~ServerConnector()
/// Destroys the ServerConnector.
{
unregisterConnector();
}
//
// this part is same with SocketConnector
//
private:
ServerConnector();
ServerConnector(const ServerConnector&);
ServerConnector& operator = (const ServerConnector&);
StreamSocket& _socket;
SocketReactor* _pReactor;
};
The echo service is an ordinary ServiceHandler:
class EchoServiceHandler
{
public:
EchoServiceHandler(StreamSocket& socket, SocketReactor& reactor):
_socket(socket),
_reactor(reactor)
{
_reactor.addEventHandler(_socket, Observer<EchoServiceHandler, ReadableNotification>(*this, &EchoServiceHandler::onReadable));
_reactor.addEventHandler(_socket, Observer<EchoServiceHandler, ErrorNotification>(*this, &EchoServiceHandler::onError));
}
~EchoServiceHandler()
{
_reactor.removeEventHandler(_socket, Observer<EchoServiceHandler, ErrorNotification>(*this, &EchoServiceHandler::onError));
_reactor.removeEventHandler(_socket, Observer<EchoServiceHandler, ReadableNotification>(*this, &EchoServiceHandler::onReadable));
}
void onReadable(ReadableNotification* pNf)
{
pNf->release();
char buffer[4096];
try {
int n = _socket.receiveBytes(buffer, sizeof(buffer));
if (n > 0)
{
_socket.sendBytes(buffer, n);
} else
onError();
} catch( ... ) {
onError();
}
}
void onError(ErrorNotification* pNf)
{
pNf->release();
onError();
}
void onError()
{
_socket.shutdown();
_socket.close();
_reactor.stop();
delete this;
}
private:
StreamSocket _socket;
SocketReactor& _reactor;
};
EchoReactorConnection works with the TCPServer class to run each reactor in its own thread:
class EchoReactorConnection: public TCPServerConnection
{
public:
EchoReactorConnection(const StreamSocket& s): TCPServerConnection(s)
{
}
void run()
{
StreamSocket& ss = socket();
SocketReactor reactor;
ServerConnector<EchoServiceHandler> sc(ss, reactor);
reactor.run();
std::cout << "exit EchoReactorConnection thread" << std::endl;
}
};
The CppUnit test case is the same as TCPServerTest::testMultiConnections, but uses EchoReactorConnection for multithreading.
void TCPServerTest::testMultithreadReactor()
{
ServerSocket svs(0);
TCPServerParams* pParams = new TCPServerParams;
pParams->setMaxThreads(4);
pParams->setMaxQueued(4);
pParams->setThreadIdleTime(100);
TCPServer srv(new TCPServerConnectionFactoryImpl<EchoReactorConnection>(), svs, pParams);
srv.start();
assert (srv.currentConnections() == 0);
assert (srv.currentThreads() == 0);
assert (srv.queuedConnections() == 0);
assert (srv.totalConnections() == 0);
//
// same with TCPServerTest::testMultiConnections()
//
// ....
///
}