How to cancel long-running QSqlQuery? - c++

The database returns 3M+ rows, which are shown in a QTableView control. I'd like to be able to force-stop both long operations:
when the database is running a long operation
when the database is fast, but there is a huge number of rows to be returned, and processing/copying/showing them takes a lot of time
The 2nd bullet can be solved by not using QSqlQueryModel. In that case, parsing the query results manually can be done in stages, and this will be implemented, but I'd also like to know whether the process of moving data from the DB to the QTableView can be interrupted and cancelled.
I've tried the following, without success:
QSqlQuery::finish()
QFuture::cancel()
QSqlDatabase::close() -- this one crashes the application
If full context is needed, it's here. The method in question is on_button_stopQuery_released.

Aborting queries during execution (as opposed to fetching, which is what QSqlQuery::finish stops) is hit-and-miss in all databases. Qt itself doesn't support it; any workaround will be backend-specific.
For example, with PostgreSQL you can do the following (a sketch follows the steps):
In your original connection, retrieve the backend ID (SELECT pg_backend_pid();) and save it
When you want to abort your query, open a second connection and cancel the query by issuing SELECT pg_cancel_backend(saved_id);
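A minimal Qt sketch of those two steps, assuming the QPSQL driver; the connection name "cancel" and the omitted credentials are placeholders:

#include <QSqlDatabase>
#include <QSqlQuery>
#include <QVariant>

static int savedBackendPid = 0;

// Step 1: on the main connection, remember the backend PID.
void saveBackendPid(QSqlDatabase &mainDb)
{
    QSqlQuery q(mainDb);
    if (q.exec("SELECT pg_backend_pid();") && q.next())
        savedBackendPid = q.value(0).toInt();
}

// Step 2: from a second connection, cancel whatever that backend is running.
void cancelQuery()
{
    {
        QSqlDatabase db2 = QSqlDatabase::addDatabase("QPSQL", "cancel");
        // ... set host/database/user/password as for the main connection ...
        if (db2.open()) {
            QSqlQuery q(db2);
            q.exec(QString("SELECT pg_cancel_backend(%1);").arg(savedBackendPid));
            db2.close();
        }
    }
    QSqlDatabase::removeDatabase("cancel");
}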
SQLite has sqlite3_interrupt(sqlite3*). This interrupts queries and does not close the connection.
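With Qt you can reach the native handle via QSqlDriver::handle(); the following sketch is modelled on the handle() documentation, and must be called from a different thread than the one running the query:

#include <QSqlDatabase>
#include <QSqlDriver>
#include <QVariant>
#include <cstring>
#include <sqlite3.h>

void interruptQuery(QSqlDatabase &db)
{
    QVariant v = db.driver()->handle();
    if (v.isValid() && std::strcmp(v.typeName(), "sqlite3*") == 0) {
        sqlite3 *handle = *static_cast<sqlite3 **>(v.data());
        if (handle)
            sqlite3_interrupt(handle); // aborts the query, keeps the connection
    }
}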
MySQL is similar to PostgreSQL (again, a sketch follows the steps):
First retrieve the connection ID (SELECT CONNECTION_ID();)
Then kill it through another connection (KILL [CONNECTION|QUERY] $connection_id).
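The same sketch adapted for MySQL; KILL QUERY aborts only the running statement, while KILL CONNECTION drops the whole session (the second connection's name and credentials are again placeholders):

#include <QSqlDatabase>
#include <QSqlQuery>
#include <QVariant>

static int savedConnectionId = 0;

void saveConnectionId(QSqlDatabase &mainDb)
{
    QSqlQuery q(mainDb);
    if (q.exec("SELECT CONNECTION_ID();") && q.next())
        savedConnectionId = q.value(0).toInt();
}

void killQuery()
{
    {
        QSqlDatabase db2 = QSqlDatabase::addDatabase("QMYSQL", "kill");
        // ... set host/database/user/password as for the main connection ...
        if (db2.open()) {
            QSqlQuery q(db2);
            q.exec(QString("KILL QUERY %1").arg(savedConnectionId));
            db2.close();
        }
    }
    QSqlDatabase::removeDatabase("kill");
}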
As you can see, even the capabilities provided are backend-specific: MySQL can kill either the query or the whole connection, while SQLite's sqlite3_interrupt only aborts the running query, and PostgreSQL needs pg_terminate_backend instead of pg_cancel_backend if you want to drop the whole connection. The easiest way to implement this is thus to discard the connection if the query was aborted and the connection is still valid. You can then have a simple two-call interface for cancellation management (pseudo code, i.e. Python):
class IConnectionCancellation:
    def register(self, connection):
        # save/retrieve the backend-specific connection ID
        pass

    def cancel(self):
        # open a second connection, send the backend-specific cancel query
        pass
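Since the question is about C++, the same two-call interface might look like this in Qt terms (a sketch; the names are illustrative):

#include <QSqlDatabase>

class IConnectionCancellation
{
public:
    virtual ~IConnectionCancellation() = default;
    // Save whatever ID the backend needs (pg_backend_pid(), CONNECTION_ID(), ...).
    virtual void registerConnection(QSqlDatabase &db) = 0;
    // Open a second connection and send the backend-specific cancel query.
    virtual void cancel() = 0;
};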
For large result sets, consider using canFetchMore and fetchMore in your model. That way you don't have to process the entire result set before showing some results to the user, which might feel smoother to use. It doesn't help with the inherent query execution latency due to e.g. ORDER BY or GROUP BY clauses, of course.
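Note that QSqlQueryModel already fetches incrementally on backends that don't report the result size up front; if you parse results manually instead, a custom model can do the same. A rough sketch, assuming the query has already been executed (the batch size of 256 and all names are illustrative); checking a flag between batches also gives you a natural cancellation point for the second bullet of the question:

#include <QAbstractTableModel>
#include <QSqlQuery>
#include <QSqlRecord>
#include <QVector>

class IncrementalModel : public QAbstractTableModel
{
public:
    explicit IncrementalModel(QSqlQuery query, QObject *parent = nullptr)
        : QAbstractTableModel(parent), m_query(std::move(query)) {}

    int rowCount(const QModelIndex &parent = QModelIndex()) const override
    { return parent.isValid() ? 0 : m_rows.size(); }

    int columnCount(const QModelIndex &parent = QModelIndex()) const override
    { return parent.isValid() ? 0 : m_query.record().count(); }

    QVariant data(const QModelIndex &index, int role = Qt::DisplayRole) const override
    {
        if (!index.isValid() || role != Qt::DisplayRole)
            return {};
        return m_rows.at(index.row()).value(index.column());
    }

    bool canFetchMore(const QModelIndex &parent) const override
    { return !parent.isValid() && !m_atEnd; }

    void fetchMore(const QModelIndex &parent) override
    {
        if (parent.isValid() || m_atEnd)
            return;
        QVector<QSqlRecord> batch;
        while (batch.size() < 256 && m_query.next())  // pull one small batch
            batch.append(m_query.record());
        m_atEnd = batch.size() < 256;                 // short batch: no more rows
        if (batch.isEmpty())
            return;
        beginInsertRows(QModelIndex(), m_rows.size(),
                        m_rows.size() + batch.size() - 1);
        m_rows += batch;
        endInsertRows();
    }

private:
    QSqlQuery m_query;
    QVector<QSqlRecord> m_rows;
    bool m_atEnd = false;
};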

Related

Django: for loop through parallel process and store values and return after it finishes

I have a for loop in Django. It loops through a list, gets the corresponding data from the database for each entry, performs a calculation based on the database value, and then appends the result to another list:
def getArrayList(request):
    list_loop = [...]    # set of values to loop through
    store_array = []     # results from the for loop are stored here
    for a in list_loop:
        val_db = SomeModel.objects.filter(somefield=a).first()
        result = perform_calculation(val_db)  # placeholder for the real calculation
        store_array.append(result)
The list has 10,000 entries. If the user wants this request, they are prepared to wait and will be informed that it will take time.
I have tried joblib with backend=threading; it does not save much time compared to a normal loop.
But when I try backend=multiprocessing, it says "Apps aren't loaded yet".
I read that multiprocessing is not possible in module-based files.
So I am looking at Celery now, but I am not sure how this can be done in Celery.
Can anyone advise how to speed up the for-loop calculation using the available multiprocessing techniques?
You're very likely looking for the wrong solution. But then again, this is pseudo code, so we can't be sure.
In either case, your pseudo code is a self-fulfilling prophecy, since you run queries in a for loop. That means network latency, result-set fetching, tying up database resources, etc. This is never a good pattern; at best it's a last resort.
The simple solution is to get all values in one query:
list_values = [ ... ]
results = []
db_values = SomeModel.objects.filter(field__in=list_values)
for value in db_values:
    results.append(calc(value))
If for some reason you need to loop, then to do this in Celery you would mark the function as a task (there are plenty of examples to find). But it won't speed anything up; it will just run in the background, so you render a "please wait" message and somehow need to notify the user again when the job is done.
I'm saying somehow because there isn't a really good integration package that I'm aware of that ties all the components together. There's django-notifications-hq, but if this is your only background task, it's a lot of extra baggage just for that, so you may want to change the notification part to "we will send you an email when the job is done", because that's easy to achieve inside your function.
And thirdly, if this is simply creating a report that doesn't need things like automatic retries on failure, you can opt to use Django Channels and a browser-native WebSocket to start and report on the job (which also allows you to send an email).
You could try concurrent.futures.ProcessPoolExecutor, which is a high-level API for processing CPU-bound tasks:
import concurrent.futures

def perform_calculation(item):
    pass  # CPU-bound work goes here

# specify the number of workers (default: number of processors on your machine)
with concurrent.futures.ProcessPoolExecutor(max_workers=6) as executor:
    res = executor.map(perform_calculation, tasks)  # tasks: your list of inputs
EDIT
In the case of IO-bound operations, you could make use of ThreadPoolExecutor to open a few connections in parallel; you can wrap the pool in a context manager, which handles the cleanup work for you (closing idle connections). Here is one example, but it handles the connection closing manually.

Database access with threading

I'm developing a program (using C++ running on a Linux machine) that uses SQLite as a back-end.
It has 2 threads which carry out the following tasks:
Thread 1
Waits for a piece of data to arrive (in this case, via a radio module)
Immediately inserts it into the database
Returns to waiting for new data
It is important this thread is "listening" for as much of the time as possible and isn't blocked waiting to insert into the database
Thread 2
Every 2 minutes, runs a SELECT on the database to find un-processed data
Processes the data
UPDATEs the rows fetched with a flag to show they have been processed
The key thing is to make sure that Thread 1 can always INSERT into the database, even if this means that Thread 2 is unable to SELECT or UPDATE (as this can just take place at a future point, the timing isn't critical).
I was hoping to find a way to prioritise INSERTs somehow using SQLite, but have failed to find one so far. Another thought was for Thread 1 to push its data into a basic in-memory queue and then bulk-INSERT it every so often (this wouldn't block the receiving of data, and it could do a simple check to see whether the database was locked and, if so, wait a few milliseconds and try again).
However, what is the "proper" way to do this with SQLite and C++ threads?
An SQLite database can be opened with or without multi-threading support. Both threads should open the database separately, i.e. each should hold its own connection.
If you want to do it the hard way, you can use a priority queue and process the queries yourself; a sketch of the simpler per-thread-connection setup follows.
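A minimal sketch of that setup, assuming SQLite was built with thread safety enabled; the timeout value and pragma are illustrative:

#include <sqlite3.h>

// Each thread calls this once and keeps its own handle; handles are never shared.
sqlite3 *openOwnConnection(const char *path)
{
    sqlite3 *db = nullptr;
    if (sqlite3_open(path, &db) != SQLITE_OK)
        return nullptr;
    // Retry for up to 100 ms instead of failing immediately with SQLITE_BUSY,
    // so Thread 1's INSERTs ride out Thread 2's brief SELECT/UPDATE locks.
    sqlite3_busy_timeout(db, 100);
    // WAL journaling lets readers run concurrently with the single writer.
    sqlite3_exec(db, "PRAGMA journal_mode=WAL;", nullptr, nullptr, nullptr);
    return db;
}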

sqlite db remains locked/inaccessible

I have a problem with an sqlite3 db which remains locked/inaccessible after a certain access.
The behaviour occurs so far on Ubuntu 10.4 and on a custom (OpenEmbedded) Linux.
The sqlite version is 3.7.7.1. The db is a local file.
One C++ application accesses the db periodically (every 5 s). Each time, several INSERT statements are executed, wrapped in a deferred transaction. This happens in one thread only. The connection to the db is held over the whole lifetime of the application. The statements used are also persistent and reused via sqlite3_reset. sqlite_threadsafe is set to 1 (serialized), journaling is set to WAL. (A sketch of this pattern follows.)
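For illustration, the described insert pattern might look roughly like this (a sketch; the statement, its single bound column, and the names are placeholders):

#include <sqlite3.h>

// One persistent prepared statement, reused via sqlite3_reset, wrapped in a
// deferred transaction on each 5-second cycle.
void insertBatch(sqlite3 *db, sqlite3_stmt *insertStmt, int value)
{
    sqlite3_exec(db, "BEGIN DEFERRED;", nullptr, nullptr, nullptr);
    sqlite3_bind_int(insertStmt, 1, value);
    sqlite3_step(insertStmt);
    sqlite3_reset(insertStmt);  // keep the statement alive for the next cycle
    sqlite3_exec(db, "COMMIT;", nullptr, nullptr, nullptr);
}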
Then I open the sqlite db in parallel with the sqlite command line tool. I enter BEGIN IMMEDIATE;, wait >5 s, and commit with END;.
After this, the db access of the application fails: the BEGIN TRANSACTION returns error code 1 ("SQL error or missing database"). If I execute a ROLLBACK TRANSACTION right before the BEGIN, just to be sure there is not already an active transaction, it fails with error code 5 ("The database file is locked").
Does anyone have an idea how to approach this problem, or what may cause it?
EDIT: There is a workaround: if the described error occurs, I close and reopen the db connection. This fixes the problem, but I'm currently at a loss as to why.
SQLite is a serverless database. As far as I know, it does not support concurrent access from multiple sources by design. You are trying to access the same backing file from both your application and the command-line tool, so you are attempting concurrent access. This is why it is failing.
SQLite connections should only be used from a single thread, as among other things they contain mutexes that are used to ensure correct concurrent access. (Be aware that SQLite also only ever supports a single updating thread at once anyway, with no concurrent reads at that time; that's a limitation of being a serverless DB.)
Luckily, SQLite connections are relatively cheap when they're not doing anything, and the cost of things like cached prepared statements is actually fairly small; open up as many as you need.
[EDIT]:
Moreover, this would explain why closing and reopening the connection works: it builds the connection in the new thread (and releases all the locks etc. held by the old one).

How to detect Oracle broken/stalled connection?

In our server/client setup we're experiencing some weird behaviour. The client is a C/C++ application which uses OCI to connect to an Oracle server (via the OTL library).
Every now and then the DB server dies in such a way (yes, this is the core issue, but from the application side we're unable to solve it and have to deal with it anyway) that the machine no longer responds to new requests/connections, while the existing ones, like the Oracle connections, do not drop or time out. Queries sent to the DB just never return successfully anymore.
What possibilities (if any) does Oracle provide to detect these stalled connections from the client-application side and recover in a more or less safe way?
This is a bug in Oracle (or call it a feature) up to 11.1.0.6, and they said the patch for Oracle 11g Release 1 (patch 11.1.0.7) has the fix; need to verify that.
If it happens, you will have to cancel (kill) the thread performing the action.
Not a good approach, though.
In all my DB schemas I have a table with one constant record. Just poll such a table periodically with a simple SQL request. All other methods are unreliable.
There's a set_timeout API in OTL that might be useful for this.
Edit: actually, ignore that. set_timeout doesn't work with OCI. Have a look at the set_timeout description here, where a technique that can be used with OCI is described.
Sounds like you need to fire off a query to the database (e.g. SELECT * FROM dual;), and if the database hasn't responded within a specified amount of time, assume the server has died and react accordingly. I'm afraid I don't know C/C++, but can you use multi-threading to fire off the statement and then wait for the response, without hanging the application?
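One possible C++ shape for that idea, as a sketch only; runProbeQuery() is a hypothetical stand-in for however your DB layer executes the dummy query:

#include <chrono>
#include <future>
#include <memory>
#include <thread>

bool runProbeQuery();  // hypothetical: returns once SELECT * FROM dual answers

bool databaseIsResponsive()
{
    auto promise = std::make_shared<std::promise<bool>>();
    std::future<bool> result = promise->get_future();
    // Detached, so a probe stuck on a dead connection cannot block us; the
    // thread leaks until the process exits (see the killable child-process
    // variant in the next answer).
    std::thread([promise] { promise->set_value(runProbeQuery()); }).detach();
    if (result.wait_for(std::chrono::seconds(15)) != std::future_status::ready)
        return false;  // no answer in time: assume the server has died
    return result.get();
}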
This works - I have done exactly what you are looking for.
Have a parent process (A) create a child process (B). The child process (B) connects to the database and
performs a query (something like "select 1 from a_table"; you will get better performance if you avoid using "dual" for this and create your own table). If (B) is successful, it writes out that it was successful and exits. (A) waits for a specified amount of time; I used 15 seconds. If (A) detects that (B) is still running, it can assume that the database is hung; it kills (B) and takes the necessary actions (like sending me an SMS).
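A hedged POSIX sketch of that parent/child watchdog; checkDatabase() is a hypothetical probe that connects and runs the query:

#include <csignal>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

bool checkDatabase();  // hypothetical: connect, run "select 1 from a_table"

bool databaseAlive(unsigned timeoutSeconds = 15)
{
    pid_t child = fork();
    if (child < 0)
        return false;                    // fork failed; treat as unknown/dead
    if (child == 0)
        _exit(checkDatabase() ? 0 : 1);  // child (B): probe the DB, then exit
    for (unsigned i = 0; i < timeoutSeconds; ++i) {
        int status = 0;
        if (waitpid(child, &status, WNOHANG) == child)
            return WIFEXITED(status) && WEXITSTATUS(status) == 0;
        sleep(1);                        // child still probing; keep waiting
    }
    kill(child, SIGKILL);                // probe hung: assume the DB is dead
    waitpid(child, nullptr, 0);          // reap the killed child
    return false;
}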
If you configure SQL*Net to use a timeout, you will probably notice that large queries fail because of it. The OCI set_timeout configuration will also cause this.
There is a manual way to avoid this: keep the connection alive yourself by pinging the database after every specified interval of time (e.g. so a firewall doesn't drop the idle connection). This way the database connection will not get lost.
The idea, in pseudocode:
if (currentTime - lastPingTime > configuredPingTime)
{
    // dummy keep-alive query
    execute("SELECT 1 FROM dual;");
}

How to timeout a mysql++ query in c++

I am using mysql++ to connect to a MySQL database and perform a bunch of data queries. Because the tables I am reading from are constantly being written to, and I need a consistent view of the data, I lock the tables first. However, MySQL has no concept of NOWAIT in its lock query, so if the tables are locked by something else that keeps them locked for a long time, my application sits there waiting. What I want is for it to be able to return and say something like "Lock could not be obtained" and try again in a few seconds. My general attempt at this timeout is below.
If I run this after locking the table on the database, I get the message that the timeout was hit, but I don't know how to then make the mysql_query line terminate. I'd appreciate any help/ideas!
#include <csignal>
#include <iostream>
#include <unistd.h>
#include <mysql/mysql.h>
using namespace std;

volatile sig_atomic_t success = 1;

void catch_alarm(int sig)
{
    cout << "Timeout reached" << endl;
    success = 0;
    signal(sig, catch_alarm);
}

// connect to db etc. (p_connection comes from the snipped setup)
// *SNIP
signal(SIGALRM, catch_alarm);
alarm(2);
mysql_query(p_connection, "LOCK TABLES XYZ WRITE");
You can implement cancel-like behaviour this way:
You execute the query on a separate thread that keeps running whether or not the timeout occurs. The timeout occurs on the main thread and sets a variable marking that it occurred. Then you do whatever you want to do on your main thread.
The query thread, once the query completes, checks whether the timeout has occurred. If it hasn't, it does the rest of the work it needs to do. If it has, it just unlocks the tables it locked.
I know it sounds a bit wasteful, but the lock-unlock period should be basically instantaneous, and you get as close to the result you want as possible. A sketch follows.
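A rough sketch of that pattern with an atomic flag; the table name and the division of work are illustrative:

#include <atomic>
#include <mysql/mysql.h>

std::atomic<bool> timedOut{false};

// Runs on its own thread; conn must not be touched by other threads meanwhile.
void lockAndWork(MYSQL *conn)
{
    mysql_query(conn, "LOCK TABLES XYZ WRITE");  // may block for a long time
    if (timedOut.load()) {
        // Too late: the main thread already gave up. Release the lock and stop.
        mysql_query(conn, "UNLOCK TABLES");
        return;
    }
    // ... do the real work here, then UNLOCK TABLES ...
}

// Main thread (sketch):
//   std::thread worker(lockAndWork, p_connection);
//   std::this_thread::sleep_for(std::chrono::seconds(2));
//   timedOut = true;   // if the lock hasn't arrived yet, the worker cleans up
//   worker.detach();   // or join later, depending on your design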
You could execute the blocking query on a different thread and never be bothered with the timeout. When some data arrives, you notify the thread that needs to know about the status of the transaction.
If I were writing from scratch I would do that, but this is a server application that we are just upgrading, rather than doing a large rework.
Instead of trying to fake transactions with table locks, why not switch to InnoDB tables, where you get actual transactions? Just make sure to set the default transaction isolation level to REPEATABLE READ.
As I said, it is not so easy to 'switch' or re-architect when this is a live, in-production system. I'm slightly frustrated that MySQL provides no way to check for locks or to choose not to hang waiting on a lock.
I don't know if this is a good idea in terms of resource usage, "best practices", and "cleanliness", but you have now repeatedly described the handcuffs that bind you in terms of re-architecting a "clean" system, so here goes:
Could you open a new, separate connection just for sending the LOCK statement, then close that connection when you catch the timeout alarm? By closing/destroying the connection that was dedicated to the LOCK statement, wouldn't that essentially "cancel" the LOCK statement? I am not certain events would unfold as I have described/guessed, but maybe it is something to test.
My experience so far indicates that closing a connection in which a query is running causes a segfault. Therefore dispatching that query onto a different connection wouldn't really help, as that would also segfault.