How to detect Oracle broken/stalled connection? - c++

In our server/client-setup we're experiencing some weird behaviour. The client is a C/C++-application which uses OCI to connect to an Oracle server (using the OTL library).
Every now and then the DB server dies in such a way (yes, this is the core issue, but we cannot solve it from the application side and have to deal with it anyway) that the machine no longer responds to new requests/connections, while the existing ones, such as the Oracle connections, neither drop nor time out. Queries sent to the DB simply never return.
What possibilities (if any) does Oracle provide to detect these stalled connections from the client-application side and to recover in a more or less safe way?

This is a bug in Oracle (or call it a feature) up to 11.1.0.6; according to Oracle, the 11g Release 1 patch set 11.1.0.7 contains the fix. That remains to be verified.
If it happens, you will have to cancel (kill) the thread performing the action.
Not a good approach, though.

In all my DB schemas I have a table with one constant record. Just poll that table periodically with a simple SQL request. All other methods are unreliable.
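For illustration, a minimal OTL-style sketch of such a poll (the table name heartbeat and the connect object db are assumptions; note that the read itself can still block on a stalled server, so drive it from a watchdog or timeout as described in the other answers):

try
{
    // one constant row; any cheap single-row select on your own table will do
    otl_stream probe(1, "select 1 from heartbeat", db);
    int one = 0;
    probe >> one;        // blocks until the server answers
    // connection looks healthy
}
catch (otl_exception&)
{
    // the probe failed: treat the connection as broken and reconnect
}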

There's a set_timeout API in OTL that might be useful for this.
Edit: Actually, ignore that. set_timeout doesn't work with OCI. Have a look at the set_timeout description here, which describes a technique that can be used with OCI.

Sounds like you need to fire off a query to the database (e.g. SELECT * FROM dual;), and if the database hasn't responded within a specified amount of time, assume the server has died and react accordingly. I'm afraid I don't know C/C++, but could you use multi-threading to fire off the statement and then wait for the response without hanging the application?
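A rough standard-C++ sketch of that idea, assuming a run_probe_query() helper that issues the SELECT over the existing connection and reports success (the helper is hypothetical):

#include <chrono>
#include <future>
#include <thread>

bool run_probe_query();   // assumed helper: issues "SELECT * FROM dual" and reports success

bool db_seems_alive()
{
    std::promise<bool> done;
    std::future<bool> result = done.get_future();

    // fire the probe on a detached worker so the caller itself cannot hang on it
    std::thread([p = std::move(done)]() mutable {
        try { p.set_value(run_probe_query()); }
        catch (...) { p.set_value(false); }   // any error counts as "not alive"
    }).detach();

    if (result.wait_for(std::chrono::seconds(15)) != std::future_status::ready)
        return false;                         // no answer in time: treat the server as stalled
    return result.get();
}

Note that the detached worker may stay blocked on the dead connection; that connection still has to be discarded and rebuilt once the server is reachable again.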

This works - I have done exactly what you are looking for.
Have a parent process (A) create a child process (B). The child process (B) connects to the database and
performs a query (something like "select 1 from a_table" - you will get better performance if you avoid using "dual" for this and create your own table). If (B) is successful, it writes out that it was successful and exits. Meanwhile (A) waits for a specified amount of time - I used 15 seconds. If (A) detects that (B) is still running, it can assume that the database is hung, kill (B), and take the necessary actions (like calling me on the phone with an SMS).
If you configure SQL*Net to use a timeout, you will probably notice that large queries fail because of it. The OCI set_timeout configuration will also cause this.
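A hedged POSIX-style sketch of that parent/child watchdog; check_db_via_child_query() stands in for the child's connect-and-query step and is made up:

#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

bool check_db_via_child_query();   // assumed: connect, run "select 1 from a_table", report success

bool db_alive_within(int seconds)
{
    pid_t child = fork();
    if (child < 0)
        return false;                       // fork failed; nothing we can check
    if (child == 0) {
        // child (B): do the probe and exit with a status the parent can read
        _exit(check_db_via_child_query() ? 0 : 1);
    }

    // parent (A): wait up to `seconds` for the child to finish
    for (int waited = 0; waited < seconds; ++waited) {
        int status = 0;
        if (waitpid(child, &status, WNOHANG) == child)
            return WIFEXITED(status) && WEXITSTATUS(status) == 0;
        sleep(1);
    }

    // child is still running: assume the database is hung, kill and reap it
    kill(child, SIGKILL);
    waitpid(child, nullptr, 0);
    return false;
}

Using a separate process rather than a thread means the hung probe can be killed cleanly without leaving the parent's own Oracle session in an undefined state.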

There is a manual way to work around this: ping the database after every specified duration of time, so the connection does not get dropped (for example by a firewall). In this way the database connection will not get lost.
The idea:
if (current_time - lastPingTime > configuredPingTime)
{
    // Dummy query
    select 1 from dual;
    lastPingTime = current_time;
}


How to cancel long running QSqlQuery?
The database returns 3M+ rows and they are shown in a QTableView control. I'd like to be able to force-stop both long operations:
when the database is running a long operation
if the database is fast, but there is a huge number of rows to be returned and processing/copying/showing them takes a lot of time
The 2nd bullet can be solved by not using QSqlQueryModel. In that case, parsing query results manually can be done in stages and this will be implemented, but I'd also like to know whether the process of moving data DB->QTableView can be interrupted and cancelled.
I've tried the following without success:
QSqlQuery::finish()
QFuture::cancel()
QSqlDatabase::close() -- this one crashes the application
If full context is needed, it's here. Method in question is
on_button_stopQuery_released
Aborting queries during execution (not fetching, which is what QSqlQuery::finish does) is hit-and-miss in all databases. Qt itself doesn't support this; workarounds will be backend-specific.
For example, with PostgreSQL you can do the following:
In your original connection, retrieve the connection ID (SELECT pg_backend_pid();) and save it
When you want to abort your query, open a second connection and kill the query by issuing SELECT pg_cancel_backend(saved_id);
SQLite has sqlite3_interrupt(sqlite3*). This interrupts queries and does not close the connection.
MySQL is similar to PostgreSQL:
First retrieve the connection ID (SELECT CONNECTION_ID();)
Then kill it through another connection (KILL [CONNECTION|QUERY] $connection_id).
As you can see, even the capabilities provided are backend-specific: MySQL can kill either the whole connection or just the query, PostgreSQL's pg_cancel_backend cancels the backend's current query, and SQLite only interrupts running queries. The easiest way to implement this would thus be to discard the connection if the query was aborted and the connection is still valid. Then you can have a simple two-method interface for cancellation management (pseudo code, e.g. Python):
class IConnectionCancellation:
    def register(connection):
        # save/retrieve connection ID
    def cancel():
        # open second connection, send backend-specific query
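As a concrete illustration of the PostgreSQL variant above, a minimal Qt/C++ sketch (the connection names "main" and "canceller" and the database parameters are assumptions):

#include <QSqlDatabase>
#include <QSqlQuery>
#include <QString>
#include <QVariant>

static int savedBackendPid = 0;

// Step 1: on the original connection, remember the server-side backend PID.
void rememberBackendPid()
{
    QSqlQuery q(QSqlDatabase::database("main"));
    if (q.exec("SELECT pg_backend_pid();") && q.next())
        savedBackendPid = q.value(0).toInt();
}

// Step 2: to abort, open a second connection and cancel that backend's query.
void cancelRunningQuery()
{
    {
        QSqlDatabase db = QSqlDatabase::addDatabase("QPSQL", "canceller");
        db.setDatabaseName("mydb");               // assumed connection parameters
        if (db.open()) {
            QSqlQuery q(db);
            q.exec(QString("SELECT pg_cancel_backend(%1);").arg(savedBackendPid));
            db.close();
        }
    }
    QSqlDatabase::removeDatabase("canceller");
}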
For large result sets, consider using canFetchMore and fetchMore in your model. That way you don't have to process the entire result set before showing some results to the user, which might feel smoother to use. It doesn't help with inherent query execution latency due to e.g. ORDER BY or GROUP BY clauses, of course.

Will performance be impacted when a database procedure is called from the application many times?

In my project, we call an Oracle procedure from our C++ application with
the help of the Pro*C/C++ library provided by Oracle.
We have one big procedure, and my idea is to split it into two for modularity. But the advice I got is to call the procedure only once and perform all jobs in one shot.
The reason I was given is that it will cause a performance impact, since the application program interacts with the database multiple times.
I agree that the above scenario would apply if the application connected to the database, called the procedure, and finally disconnected from the database for each procedure call. But what we really do is create a pool of connections at startup and reuse those pre-established database connections for interacting with the database.
Information about my application:
It is a multi-threaded application which handles about 1000 requests per second with a thread pool size of 20. Currently, for each request we communicate with the database 4 times.
EDIT:
"the switch between PLSQL and SQL is much faster than the other way around".
Q1. How this is is related to my actual question? My question is about spiliting a procedure in to two equal parts. Say suppose i have 4 queries executed in procedure, i just split it in to two as procedure a and procedure b, and each procedure will have two queries.
"pro*c calls which call PLSQL is an performance hit".
Q2. Do you mean an communication between application(pro *C/C++) and database(oracle) here? If so, is communication was a big performance hit?
In the ask tom link you have attached, "But don't be afraid at all to invoke SQL from PLSQL - that is what PLSQL does best"
Q4. Wheather context switch will happen while we call SQL from PLSQL? Because, as per above statement, it seems to be no performanace impact.
The advice you were given is correct; it would be better to perform all database tasks at once. There are 2 major performance impacts in your scenario:
The context switching of Pro*C between the SQL engine and the PL/SQL engine when running your calls multiple times. This is usually the biggest issue with many PL/SQL calls from a client application.
The network stack overhead (TNS) in the communications between your Pro*C app and the database engine - particularly if your app is on a different physical host.
Having said that, as you are creating a connection pool at the application end, the TNS listener should also have a pool of bequeathed server shadow processes waiting for each network connection (this is set up in listener.ora).
The OCI login/logoff when the shadow process is already waiting for the connection is very quick and not a huge factor in latency - I don't worry about this unless a new shadow process has to start up on the server - then it can be a very expensive call. As you are using connection pooling on the client side, this is usually not an issue, but it is something to consider because of the threading in your calls. Once you exhaust the pool of server shadow processes, you will notice a huge degradation if the TNS listener has to start up more shadow processes.
Edit in answer to the new questions:
It is very related. As pointed out previously, you should minimise the number of PL/SQL and SQL calls within your C++ app. Each PL/SQL call from your C++ app invokes the SQL engine, which then invokes the PL/SQL engine for the procedure call. So if you split your procedure in two, you double the SQL to PL/SQL context switches, which is the more expensive switch, as outlined in the Tom Kyte article and by my own personal experience.
This is answered in 1. But as I said previously, the communications overhead comes second unless your hosts are on different physical networks, and it depends on the types of data you are transferring. For instance, large C++ object parameters and large Oracle result sets with many calls will obviously affect communications latency with round trips. Remember that with more PL/SQL calls you are also adding more SQL*Net traffic for the setup of each connection and result set.
There is no question 3.
PL/SQL to SQL within the PL/SQL engine is negligible, so don't get hung up on it. Put all your SQL calls within one PL/SQL call for maximum throughput. Don't split the calls just to be more elegant at the expense of performance.
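For illustration, a hedged Pro*C-style sketch of the single-call idea: wrap both pieces of work in one anonymous PL/SQL block so the client makes one round trip and one SQL-to-PL/SQL switch (the procedure and variable names are made up):

EXEC SQL BEGIN DECLARE SECTION;
    int in_a = 1;
    int in_b = 2;
EXEC SQL END DECLARE SECTION;

/* one round trip instead of two separate procedure calls from the client */
EXEC SQL EXECUTE
    BEGIN
        procedure_a(:in_a);
        procedure_b(:in_b);
    END;
END-EXEC;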

sqlite db remains locked/inaccessible

I have a problem with an sqlite3 db which remains locked/inaccessible after a certain access.
The behaviour occurs so far on Ubuntu 10.4 and on a custom (OpenEmbedded) Linux.
The sqlite version is 3.7.7.1. The db is a local file.
One C++ application accesses the db periodically (every 5 s). Each time, several insert statements are executed, wrapped in a deferred transaction. This happens in one thread only. The connection to the db is held over the whole lifetime of the application. The statements used are also persistent and reused via sqlite3_reset. SQLITE_THREADSAFE is set to 1 (serialized), journaling is set to WAL.
Then I open the sqlite db in parallel with the sqlite command line tool. I enter BEGIN IMMEDIATE;, wait >5 s, and commit with END;.
After this, the db access of the application fails: the BEGIN TRANSACTION returns error code 1 ("SQL error or missing database"). If I execute a ROLLBACK TRANSACTION right before the BEGIN, just to be sure there is not already an active transaction, it fails with error code 5 ("The database file is locked").
Has anyone an idea how to approach this problem or has an idea what may cause it?
EDIT: There is a workaround: if the described error occurs, I close and reopen the db connection. This fixes the problem, but I'm currently at a loss as to why.
SQLite is a serverless database. As far as I know it does not support concurrent access from multiple sources by design. You are trying to access the same backing file from both your application and the command line tool - so you are attempting concurrent access. This is why it is failing.
SQLite connections should only be used from a single thread, as among other things they contain mutexes that are used to ensure correct concurrent access. (Be aware that SQLite also only ever supports a single updating thread at once anyway, and with no concurrent reads at the time; that's a limitation of being a server-less DB.)
Luckily, SQLite connections are relatively cheap when they're not doing anything and the cost of things like cached prepared statements is actually fairly small; open up as many as you need.
[EDIT]:
Moreover, this would explain why closing and reopening the connection works: it builds the connection in the new thread (and finalizes all the locks etc. in the old one).
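A minimal sketch of the one-connection-per-thread pattern using the SQLite C API (the path handling and the 2-second busy timeout are arbitrary choices):

#include <sqlite3.h>

thread_local sqlite3* tls_db = nullptr;

sqlite3* connection_for_this_thread(const char* path)
{
    if (tls_db == nullptr) {
        // every thread opens (and is responsible for closing) its own handle
        if (sqlite3_open_v2(path, &tls_db,
                            SQLITE_OPEN_READWRITE | SQLITE_OPEN_CREATE,
                            nullptr) != SQLITE_OK) {
            sqlite3_close(tls_db);
            tls_db = nullptr;
            return nullptr;
        }
        sqlite3_busy_timeout(tls_db, 2000);   // wait up to 2 s instead of failing immediately
    }
    return tls_db;
}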

How do I detect an aborted connection in Django?

I have a Django view that does some pretty heavy processing and takes around 20-30 seconds to return a result.
Sometimes the user will end up closing the browser window (terminating the connection) before the request completes -- in that case, I'd like to be able to detect this and stop working. The work I do is read-only on the database so there isn't any issue with transactions.
In PHP the connection_aborted function does exactly this. Is this functionality available in Django?
Here's example code I'd like to write:
def myview(request):
    while not connection_aborted():
        # do another bit of work...
        if work_complete:
            return HttpResponse('results go here')
Thanks.
I don't think Django provides it, because it basically can't. More than on Django itself, this depends on the way Django interfaces with your web server, and thus on your software stack (which you have not specified). I don't think it's even part of the FastCGI and WSGI protocols!
Edit: I'm also pretty sure that Django does not start sending any data to the client until your view finishes execution, so it can't possibly know if the connection is dead. The underlying socket won't trigger an error unless the server tries to send some data back to the user.
That connection_aborted method in PHP doesn't do what you think it does. It will tell you if the client disconnected, but only if the buffer has been flushed, i.e. some sort of response has been sent from the server back to the client. The PHP version wouldn't even work as you've written it above. You'd have to add a call to something like flush within your loop to make the server attempt to send data.
HTTP is a stateless protocol. It's designed so that neither the client nor the server depends on the other. As a result, the state of either is only known when a connection is created, and that only occurs when there's some data to send one way or another.
Your best bet is to do as #MattH suggested and do this through a bit of AJAX, and if you'd like you can integrate something like Node.js to make client "check-ins" during processing. How to set that up properly is beyond my area of expertise, though.
So you have an AJAX view that runs a query that takes 20-30 seconds to process, requested in the background of a rendered page, and you're concerned about wasted resources when someone cancels the page load.
I see that you've got options in three broad categories:
Live with it. Improve the situation by caching the results in case the user comes back.
Make it faster. Throw more space at a time/space trade-off. Maintain intermediate tables. Precalculate the entire thing, etc.
Do something clever with the browser fast-polling a "is it ready yet?" query and the server cancelling the query if it doesn't receive a nag within interval * 2 or similar. If you're really clever, you could return progress / ETA to the nags. However, this might not have particularly useful behaviour when the system is under load or your site is being accessed over limited bandwidth.
I don't think you should go for option 3 because it's increasing complexity and resource usage for not much gain.

Avoiding sqlite3 database locked

I have a multithreaded application that uses sqlite (3.7.3).
I'm hitting the "database is locked" error that seems to be quite prevalent.
I'm wondering how to avoid it in my case.
Let me describe what I'm building. Sorry, no code it's too large and complex.
I have around 8 threads that access the database simultaneously. Any one of those threads may be reading or writing at any given time.
Each row in a table in the database has a file path that points to a resource + other attributes related to that resource.
3 fields of note are readers, status and del.
Readers is incremented each time a thread reads from the resource, but only if status > 0 and del = 0.
So I have some SQL that does
UPDATE resource set readers=readers+1 where id=? AND del=0 AND status>0
After that, I check the number of rows updated; it should only be 1.
After that I try to read the row back with a select. I do that even if the update failed,
because I need to know the reason it failed.
I tried wrapping both the update and the select in a transaction but that didn't help.
I've checked that I'm calling finalize on my statements too.
Now, I thought that sqlite serializes by default. I've tried a couple of open modes but I still get the same error.
And before you ask, no I'm not intending to go to mysql. I absolutely need zero config.
Can someone provide some pointers on how to avoid this type of problem? Should I move the readers lock out of the DB? If I do that, what mechanism should I replace it with? I'm using C++ on Linux with the boost library available.
EDIT:
Interestingly, adding a COMMIT after my update call improved things dramatically.
When you open the db, you should configure the 'busy timeout'
int sqlite3_busy_timeout(sqlite3*, int ms);
http://www.sqlite.org/c3ref/busy_timeout.html
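For example, right after opening the connection (db being the handle from sqlite3_open; 5 seconds is an arbitrary choice):

// retry for up to 5 seconds when another connection holds a conflicting lock,
// instead of returning SQLITE_BUSY immediately
sqlite3_busy_timeout(db, 5000);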
First question: are you trying to use one connection with all eight threads? If so, make sure each thread has its own connection. I don't know of any database that likes that.
Also check out the FAQ: http://www.sqlite.org/faq.html
Apparently SQLite has to be compiled with the SQLITE_THREADSAFE preprocessor option set to 1. They do have a method to determine if that's your problem.
Another issue is that writes can only happen from one process safely.