In a worker thread that runs forever, I hold a connection to a PostgreSQL database (in C++, using libpqxx):
#include <pqxx/connection>
// later in the code, executed only once when the thread starts
connection* conn = new connection(createConnectionString(this->database, this->port, this->username, this->password));
How can I check whether the connection is still active a couple of hours later? For example, if I restart the PostgreSQL server in the meantime (the worker thread keeps running and is not restarted), I should check that the connection is still valid when I try to execute a new query, and reconnect if it is not.
How do I know if it is still alive?
How to check if connection is still active couple hours later
Don't.
Just use it, as if it were alive. If an exception is thrown because there's something wrong, catch the exception and retry the transaction that failed from the beginning.
Attempts to "test" connections or "validate" them are doomed. There is an inherent race condition where the connection could go away between validation and actually being used. So you have to handle exceptions correctly anyway - at which point there's no point doing that connection validation in the first place.
Queries can fail and transactions can be aborted for many reasons. So your app must always execute transactions in a retry loop that detects possibly transient failure conditions. This is just one possibility - you could also have a query cancelled by the admin, a transaction aborted by the deadlock detector, a transaction cancelled by a serialization failure, etc.
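A minimal sketch of such a retry loop, assuming nothing about your actual code: Connection and ConnectionLost are hypothetical stand-ins so that the example compiles without libpqxx (in libpqxx the corresponding types would be pqxx::connection and pqxx::broken_connection), and run_with_retry is a name made up for illustration.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

// Hypothetical stand-ins so the sketch compiles without libpqxx.
struct Connection {};                          // would be pqxx::connection
struct ConnectionLost : std::runtime_error {   // would be pqxx::broken_connection
    using std::runtime_error::runtime_error;
};

// Runs `work` on the connection. If the connection turns out to be dead,
// drops it, reconnects via `connect` and reruns `work` from the beginning.
// Rethrows the last error once `max_attempts` tries have failed.
template <typename Connect, typename Work>
auto run_with_retry(std::unique_ptr<Connection>& conn, Connect connect,
                    Work work, int max_attempts = 3) -> decltype(work(*conn)) {
    assert(max_attempts > 0);
    for (int attempt = 1;; ++attempt) {
        try {
            if (!conn) conn = connect();   // (re)connect lazily
            return work(*conn);            // just use it, no "is it alive?" probe
        } catch (const ConnectionLost&) {
            conn.reset();                  // throw away the dead connection
            if (attempt == max_attempts) throw;
        }
    }
}
```

The point of the design is that there is no separate liveness check: the only test of the connection is the real transaction, and `work` has to be written so that it is safe to run again from the start.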
To avoid unwanted low level TCP timeouts you can set a TCP keepalive on the connection, server-side or client-side.
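On the client side, libpq accepts the connection parameters keepalives, keepalives_idle, keepalives_interval and keepalives_count; on the server side the matching settings are tcp_keepalives_idle, tcp_keepalives_interval and tcp_keepalives_count. As a rough sketch, a helper like the one below could append the client-side parameters to an existing connection string. The name with_keepalive and the values used are made up for illustration, not recommendations.

```cpp
#include <cassert>
#include <string>

// Appends libpq TCP keepalive parameters to a keyword/value connection string.
// idle_s:     seconds of inactivity before the first keepalive probe
// interval_s: seconds between unanswered probes
// count:      unanswered probes before the connection is considered dead
std::string with_keepalive(const std::string& base, int idle_s, int interval_s, int count) {
    assert(idle_s > 0 && interval_s > 0 && count > 0);
    return base + " keepalives=1"
                + " keepalives_idle=" + std::to_string(idle_s)
                + " keepalives_interval=" + std::to_string(interval_s)
                + " keepalives_count=" + std::to_string(count);
}
```

The result would be passed to the connection constructor in place of the plain connection string.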
If you really insist on doing this, knowing that it's wrong, just issue an empty query, i.e. "".
Related
I am currently working on a server application in C++. My main inspirations are these examples:
Windows SDK IOCP Example
The I/O Completion Port IPv4/IPv6 Server Program Example
My app is strongly similar to these (socketobj, packageobj, ...).
In general, my app is running without issues. The only thing that still causes me trouble is half-open connections.
My strategy for this is: I check every connected client in a time period and count an "idle counter" up. If one completion occurs, I reset this timer. If the Idle counter goes too high, I set a boolean to prevent other threads from posting operations, and then call closesocket().
My assumption was that now the socket is closed, the pending operations will complete (maybe not instantly but after a time). This is also the behavior the MSDN documentation is describing (hints, second paragraph). I need this because only after all operations are completed can I free the resources.
Long story short: this is not the case for me. I did some tests with my testclient app and some cout and breakpoint debugging, and discovered that pending operations for closed sockets are not completing (even after waiting 10 min). I also already tried with a shutdown() call before the closesocket(), and both returned no error.
What am I doing wrong? Does this happen to anyone else? Is the MSDN documentation wrong? What are the alternatives?
I am currently thinking of the "linger" functionality, or to cancel every operation explicitly with the CancelIoEx() function
Edit: (thank you for your responses)
Yesterday evening I added a linked list to every socketobj to hold the per-I/O objects of the pending operations. With this I tried the CancelIoEx() function. The function returned 0 and GetLastError() returned ERROR_NOT_FOUND for most of the operations.
Is it then safe to just free the per Io Obj in this case?
I also discovered that this happens more often when I run my server app and the client app on the same machine. From time to time the server is then unable to complete write operations. I thought this was happening because the client-side receive buffer gets too full. (The client side does not stop receiving data!)
A code snippet will follow as soon as possible.
The 'linger' setting can be used to reset the connection, but that way you will (a) lose data and (b) deliver a reset to the peer, which may terrify it.
If you're thinking of a positive linger timeout, it doesn't really help.
Shutdown for read should terminate read operations, but shutdown for write only gets queued after pending writes so it doesn't help at all.
If pending writes are the problem, and not completing, they will have to be cancelled.
I'm using the CRecordSet class to execute a select query. I want to handle the situation where I lose the connection to the database. I simulate this by turning off the database. In most cases I receive "Connection failure" in the catch block, which is correct. However, sometimes I get "Query timeout expired - State:S1T00,Native:0", and this is the only exception. Any idea why? How can I detect that the connection is lost if I get "Query timeout"? I use MS SQL Server 2014 and MFC. I will be grateful for any help.
Probably it will depend on how much time you wait to make the test.
Try set a known timeout with:
CDatabase::SetQueryTimeout()
...and test the connection before and after to see if the exceptions are consistent when the timeout expires and when it doesn't.
I'm creating a few simple helper classes and methods for working with libpq, and am wondering if I receive an error from the database - (e.g. SQL error), how should I handle it?
At the moment, each method returns a bool depending on whether the operation was a success, and so is up to the user to check before continuing with new operations.
However, after reading the libpq docs, if an error occurs the best I can come up with is that I should log the error message / status and otherwise ignore. For example, if the application is in the middle of a transaction, then I believe it can still continue (Postgresql won't cancel the transaction as far as I know).
Is there something I can do with PostgreSQL / libpq to make the consequences of such errors safe regarding the database server, or is ignorance the better policy?
You should examine the SQLSTATE in the error and make handling decisions based on that and that alone. Never try to make decisions in code based on the error message text.
An application should simply retry transactions for certain kinds of errors:
Serialization failures
Deadlock detection transaction aborts
For connection errors, you should reconnect then re-try the transaction.
Of course you want to set a limit on the number of retries, so you don't loop forever if the issue doesn't clear up.
Other kinds of errors aren't going to be resolved by trying again, so the app should report an error to the client. Syntax error? Unique violation? Check constraint violation? Running the statement again won't help.
There is a list of error codes in the documentation. It doesn't explain much about each individual error, but the preamble is quite informative.
On a side note: One trap to avoid falling into is "testing" connections with a trivial query before using them, and assuming that means the real query can't fail. That's a race condition. Don't bother testing connections; simply run the real query and handle any error.
The details of what exactly to do depend on the error and on the application. If there was a single always-right answer, libpq would already do it for you.
My suggestions:
Always keep a record of the transaction until you've got a confirmed commit from the DB, in case you have to re-run. Don't just fire-and-forget SQL statements.
Retry the transaction without a disconnect and reconnect for SQLSTATEs 40001 (serialization_failure) and 40P01 (deadlock_detected), as these are transient conditions generally resolved by re-trying. You should log them, as they're opportunities to improve how the app interacts with the DB and if they happen a lot they're a performance problem.
Disconnect, reconnect, and retry the transaction at least once for error class 08 (connection exceptions).
Handle 53300 (too_many_connections) and 53400 (configuration_limit_exceeded) with specific and informative errors to the user. Same with the other 53 class entries.
Handle class 57's entries with specific and informative errors to the user. Do not retry if you get a query_canceled (57014), it'll make sysadmins very angry.
Handle 25006 (read_only_sql_transaction) by reporting a different error, telling the user you tried to write to a read-only database or using a read-only transaction.
Report a different error for 23505 (UNIQUE violation), indicating that there's a conflict in a unique constraint or primary key constraint. There's no point retrying.
Error class 01 should never produce an exception.
Treat other cases as errors and report them to the caller, with details from the problem - most importantly SQLSTATE. Log all the details if you return a simplified error.
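The retry-related part of the list above can be sketched as a small classifier. This is only an illustration: the Action enum and the name classify_sqlstate are made up, and the code assumes you already have the five-character SQLSTATE as a string (with libpq it can be read with PQresultErrorField(res, PG_DIAG_SQLSTATE)).

```cpp
#include <cassert>
#include <string>

// What the caller should do after a failed transaction.
enum class Action {
    RetrySameConnection,   // transient: rerun the transaction as is
    ReconnectAndRetry,     // connection is gone: reconnect, then rerun
    ReportToCaller         // retrying will not help
};

// Maps a five-character SQLSTATE to a handling decision.
Action classify_sqlstate(const std::string& sqlstate) {
    assert(sqlstate.size() == 5);
    // 40001 serialization_failure, 40P01 deadlock_detected
    if (sqlstate == "40001" || sqlstate == "40P01") return Action::RetrySameConnection;
    // Class 08: connection exceptions
    if (sqlstate.compare(0, 2, "08") == 0) return Action::ReconnectAndRetry;
    // Everything else (23505, 25006, class 53, class 57, ...) is reported,
    // ideally with a specific message per code as described above.
    return Action::ReportToCaller;
}
```

Both retry outcomes still need a cap on the number of attempts, and the decision is based on the SQLSTATE only, never on the message text.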
Hope that's useful.
For THIS reason, I want to try something new - close the socket using some system call.
The situation in two words - can't set query timeout of the mysql library (the C API, refer to the link for more info), so I want to try closing the socket to see how the library will react. Probably this is not a good idea, but still wanna try it.
Here's what I've done - there's another started thread - a timer. So, after a specific timeout (let's say 10 seconds), if there's no response, I want to close the socket. The MYSQL struct has a member net, which is also a struct and holds the fd. But when I try to do this:
shutdown( m_pOwner->m_ptrDBConnection->m_mysql.net.fd, SHUT_RDWR );
close( m_pOwner->m_ptrDBConnection->m_mysql.net.fd );
nothing happens. The values returned from shutdown and close are 0, but the socket is still open (after 60 sec of waiting a result is returned from the DB, which means that the mysql client is still waiting for a response from the DB).
Any ideas?
Thanks
EDIT - Yes, there's a running transaction while I'm trying to close the socket. But this is the actual problem - I cannot terminate the query or close the connection, nothing, and I don't wanna wait the whole timeout, which is 20 min and 30 sec, or something like this. That's why I'm looking for a brute-force approach.. :/
Just a shot in the dark, but make sure you cancel/terminate any running transactions. I'm not familiar with the MySQL C API, but I would imagine there is a way to check if there are any active connections/queries. You may not be able to close the socket simply because there are still things running, and they need to be brought to some "resolved" state, either committed or rolled back. I would begin there and see what happens. You really don't want to shut down the socket "brute force" style if you have anything pending anyway, because your data would not be in a reliable "state" afterwards - you would not know which transactions succeeded and which ones did not, although I would imagine that MySQL would roll back any pending transactions if the connection failed abruptly.
EDIT:
From what I have found via Googling "MySQL stopping runaway query", the consensus seems to be to ask MySQL to terminate the thread of the runaway/long-running query using
KILL thread-id
I would imagine that the thread ID is available to you in the MySQL data structure that contains the socket. You may want to try this, although IIRC doing so requires superuser privileges.
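As a rough sketch of how that could look from C++: the C API function mysql_thread_id() returns the ID of a connection, and the statement has to be sent over a second connection because the first one is busy with the stuck query. The helper below only builds the statement text. Its name make_kill_query is made up, and it uses the KILL QUERY variant, which terminates the running statement but leaves the connection open, instead of the plain KILL mentioned above.

```cpp
#include <cassert>
#include <string>

// Builds the statement that asks the server to abort whatever the
// connection with the given thread ID is currently executing.
// thread_id is expected to come from mysql_thread_id() on the stuck
// connection, captured before the long-running query was started.
std::string make_kill_query(unsigned long thread_id) {
    assert(thread_id != 0);
    return "KILL QUERY " + std::to_string(thread_id);
}
```

The timer thread would then run this string with mysql_query() on its own, separate connection.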
EDIT #2:
Apparently MySQL provides a fail-safe mechanism that will restart a closed connection, so forcefully shutting down the socket will not actually terminate the query. Once you close it, MySQL will open another and attempt to complete the query. Turning this off will allow you to close the socket and cause the query to terminate.
The comments below show how the answer was found, and the thought process involved therein.
It looks like you are running into an issue with the TCP wait timer, meaning it will close eventually. [Long story short] it is sort of unavoidable. There was another discussion on this.
close vs shutdown socket?
As far as I know, if shutdown() and close() both return 0, there's no doubt you have successfully closed a socket. The fact is that you could have closed the wrong fd. Or the server might not react properly to a correct shutdown (if so, this could be considered a bug in the server: there is no reason to keep waiting for incoming data). I'd keep looking for a supported way to do this.
I am using mysql++ in order to connect to a MySQL database to perform a bunch of data queries. Due to the fact that the tables I am reading from are constantly being written to, and that I need a consistent view of the data, I lock the tables first. However, MySQL has no concept of 'NOWAIT' in its lock query, thus if the tables are locked by something else that keeps them locked for a long time, my application sits there waiting. What I want it to do is to be able to return and say something like 'Lock could not be obtained' and try again in a few seconds. My general attempt at this timeout is below.
If I run this after locking the table on the database, I get the message that the timeout is hit, but I don't know how to then get the mysql_query line to terminate. I'd appreciate any help/ideas!
volatile sig_atomic_t success = 1;

void catch_alarm(int sig) {
    cout << "Timeout reached" << endl;
    success = 0;
    signal(sig, catch_alarm);
}

// connect to db etc.
// *SNIP
signal(SIGALRM, catch_alarm);
alarm(2);
mysql_query(p_connection, "LOCK TABLES XYZ WRITE");
You can implement a "cancel-like" behavior this way:
You execute the query on a separate thread, that keeps running whether or not the timeout occurs. The timeout occurs on the main thread, and sets a variable to "1" marking that it occurred. Then you do whatever you want to do on your main thread.
The query thread, once the query completes, checks if the timeout has occurred. If it hasn't, it does the rest of the work it needs to do. If it HAS, it just unlocks the tables it just locked.
I know it sounds a bit wasteful, but the lock-unlock period should be basically instantaneous, and you get as close to the result you want as possible.
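A minimal sketch of that idea, using only the standard library. The names lock_with_timeout, lock_tables and unlock_tables are made up for illustration: lock_tables stands for the blocking LOCK TABLES call and unlock_tables for the matching UNLOCK TABLES. An atomic compare-and-swap decides who "wins" when the lock arrives at almost the same moment as the timeout.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <functional>
#include <future>
#include <memory>
#include <thread>

enum Phase { kPending, kAcquired, kTimedOut };

struct LockAttempt {
    std::atomic<int> phase{kPending};   // who decided first: worker or caller
    std::promise<void> wake;            // wakes the caller when the lock arrives
};

// Returns true if lock_tables() completed within `timeout`.
// On timeout returns false; the worker keeps waiting in the background and
// calls unlock_tables() as soon as the lock is finally granted.
bool lock_with_timeout(std::function<void()> lock_tables,
                       std::function<void()> unlock_tables,
                       std::chrono::milliseconds timeout) {
    assert(lock_tables && unlock_tables);
    auto state = std::make_shared<LockAttempt>();
    std::future<void> done = state->wake.get_future();

    std::thread([state, lock_tables, unlock_tables] {
        lock_tables();                              // blocks, cannot be interrupted
        int expected = kPending;
        if (state->phase.compare_exchange_strong(expected, kAcquired)) {
            state->wake.set_value();                // caller is still waiting
        } else {
            unlock_tables();                        // caller gave up: release at once
        }
    }).detach();

    done.wait_for(timeout);
    int expected = kPending;
    if (state->phase.compare_exchange_strong(expected, kTimedOut)) return false;
    return true;                                    // worker got the lock first
}
```

One caveat with a real MySQL connection: table locks belong to the session, so while the worker is still blocked the caller must not use that same connection for anything else.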
You could execute the blocking query in a different thread and not bother with the timeout at all. When the data arrives, you notify the thread that needs to know the status of the transaction.
If I was writing from scratch I would do that, but this is a server application that we are just doing an upgrade to rather than a large rework.
Instead of trying to fake transactions with table locks, why not switch to InnoDB tables, where you get actual transactions? Just make sure to set the default transaction isolation level to REPEATABLE READ.
As I said, it is not so easy to 'switch' or re-architect when this is a live, in production system. I'm slightly frustrated that MySQL provides no methods to check for locks or choose not to hang waiting on a lock.
I don't know if this is a good idea in terms of resource usage and "best practices" and "cleanliness" and all the rest... but you have now repeatedly described the handcuffs that bind you in terms of re-architecting a "clean" system... so here goes.....
Could you open a new, separate connection just for sending the LOCK statement? Then close that connection when you catch the timeout alarm? By closing/destroying the connection that was dedicated to the LOCK statement, would that not essentially "cancel" the LOCK statement? I am not certain if such events would occur as I have described/guessed, but maybe it is something to test out.
My experience described so far indicates to me that closing a connection in which a query is running causes a seg fault. Therefore dispatching that query into a different connection wouldn't really help, as that would also seg fault.