Accessing SQLite database from multiple processes and SQLITE_BUSY - c++

I have multiple processes (C++, Windows 8) that use the same SQLite database. I configured the connections with SQLITE_CONFIG_SERIALIZED and PRAGMA busy_timeout = 60000;. The journaling mode is DELETE.
Test scenario:
process #1 opens connection, makes reads/writes, sleeps for 5 sec
process #2 opens connection, makes reads/writes
After that, process #1 fails to write to the database: it receives SQLITE_BUSY immediately after a call to the SQLite API (sqlite3_step, sqlite3_finalize). Process #2 still uses its connection without any problems.
I do not have any unclosed transactions, and I do not run any long operations on the database. What else can lead to this?
I use the same SQLite connection from multiple threads inside the process. The SQLite docs say that this is OK with the config option SQLITE_CONFIG_SERIALIZED. Are there any exceptions to this rule?

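SQLite acquires a lock on the database file when a statement starts executing (the first sqlite3_step) and releases it once the statement has run to completion or has been reset or finalized (sqlite3_reset, sqlite3_finalize) and any enclosing transaction has ended. The type of lock depends on your SQL statement.
Once you start stepping a statement, you need to finish it and reset or finalize it as soon as possible. Otherwise you block other connections.
My application created a list of prepared statements and kept them, still unfinished, until the end. Generally this is a misuse of SQLite.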
Links:
SQLite locking

What is the difference between SQLITE_THREADSAFE = 1 vs = 2 and why don't they allow sharing a same connection in multiple threads?

From SQLite compile time options:
... SQLITE_THREADSAFE=1 sets the default threading mode to Serialized. SQLITE_THREADSAFE=2 sets the default threading mode to Multi-threaded ...
It further states:
Multi-thread. In this mode, SQLite can be safely used by multiple
threads provided that no single database connection is used
simultaneously in two or more threads.
Serialized. In serialized mode, SQLite can be safely used by multiple
threads with no restriction.
It's not clear what the use of "Multi-thread" (=2) is, if "Serialized" (=1) can do the same without restrictions. The literal meanings of these 2 quoted terms are also not very clear.
Is using a single DB connection from multiple threads disallowed only for the =2 option, or for =1 as well? Is it undefined behaviour if done anyway?
The reason for the second question is that I have a requirement where several DB files are opened at the same time. They are read in worker threads and written in a single DB thread. If I create 2 connections for each DB file, the OS file descriptor limit can soon be exhausted.
Though we haven't faced any major problem, we recently came across a situation where SQLite was accessed simultaneously from both the worker and DB threads. A long delay of 20 sec blocked the worker thread. This issue reproduces consistently.
This led me to believe that threading could be an issue. In my setup, the default =1 (Serialized) option is set at compile time.
Clarifications:
Environment: Using Qt/C++. For threading we use QThreads. IMO, this may not impact this behaviour
Threading: There is a main thread, "database" thread and 4 worker threads. Every user sits on a particular worker thread for its socket connection. However their DBs are always on the common "database" thread
DB connections: There can be hundreds of different DBs opened at a time depending on number of users connected to server. Since every OS has a limit of how many files can be opened at a time, I use 1 connection per DB file.
Connection sharing: Every user's DB connection is shared between its worker thread for reading (SELECT) and the common DB thread for writing (INSERT/DELETE/UPDATE). I assumed that for =1, the connection can probably be shared.
Suspicion: There is 1 table which has 10k+ rows and also contains huge data in its columns. Total DB size goes up to 300-400 MB, mainly due to this. When a SELECT is invoked on a particular row of this table based on its "id" field (a 30 character string), it takes up to 20 sec the first time. The next time, it takes a few milliseconds.
Don't remove the C++ tag.
Well I'm no SQLite expert, but the following sounds quite clear to me:
Multi-thread. In this mode, SQLite can be safely used by multiple threads provided that no single database connection is used simultaneously in two or more threads.
Serialized. In serialized mode, SQLite can be safely used by multiple threads with no restriction.
From my understanding this means:
If you use SQLITE_THREADSAFE=2 (multi-threaded), you have to make sure that each thread uses its own database connection. Sharing a single database connection amongst multiple threads isn't safe.
If you use SQLITE_THREADSAFE=1 (serialized), you can safely reuse even a single database connection amongst multiple threads.
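As a rough illustration of sharing one connection between threads, here is a sketch using Python's sqlite3 module rather than Qt/C++. The table, the lock and the worker function are hypothetical names. It guards the shared connection with its own lock, so it does not depend on how the underlying library was compiled; in serialized mode SQLite does a similar thing internally with a per-connection mutex.

```python
import sqlite3
import threading

# One connection shared by all threads. check_same_thread=False turns off
# the Python module's own "same thread only" check.
conn = sqlite3.connect(":memory:", check_same_thread=False)
conn.execute("CREATE TABLE log (worker INTEGER, n INTEGER)")
conn_lock = threading.Lock()  # our own serialization of access to conn


def work(worker_id, rows):
    for n in range(rows):
        # Only one thread at a time talks to the connection.
        with conn_lock:
            conn.execute("INSERT INTO log VALUES (?, ?)", (worker_id, n))
            conn.commit()


threads = [threading.Thread(target=work, args=(i, 50)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

total = conn.execute("SELECT COUNT(*) FROM log").fetchone()[0]
```

Note the consequence of any such serialization: while one thread runs a slow statement on the shared connection, every other thread using that connection has to wait for it.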

Can we connect multiple applications to the same SQLite database

I have a SQLite database that is used by two processes. I am wondering, with the most recent version of SQLite, while one process (connection) starts a transaction to write to the database will the other process be able to read from the database simultaneously?
I collected information from various sources, mostly from sqlite.org, and put them together:
First, by default, multiple processes can have the same SQLite database open at the same time, and several read accesses can be satisfied in parallel.
In the case of writing, a single write locks the database for a short time, during which nothing, not even a read, can access the database file.
Beginning with version 3.7.0, a new “Write Ahead Logging” (WAL) option is available, in which reading and writing can proceed concurrently.
By default, WAL is not enabled. To turn WAL on, refer to the SQLite documentation.
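A small sketch of turning WAL on and reading while a write transaction is open, using Python's sqlite3 module. The file and table names are assumptions made for the example. The PRAGMA returns the resulting journal mode, and the setting is stored in the database file, so it only has to be issued once.

```python
import os
import sqlite3
import tempfile

# Hypothetical scratch database file.
path = os.path.join(tempfile.mkdtemp(), "wal_demo.db")

writer = sqlite3.connect(path)
# The PRAGMA reports the journal mode that is now in effect.
mode = writer.execute("PRAGMA journal_mode=WAL").fetchone()[0]

writer.execute("CREATE TABLE t (v INTEGER)")
writer.execute("INSERT INTO t VALUES (1)")
writer.commit()

# Open a write transaction and leave it uncommitted for now.
writer.execute("INSERT INTO t VALUES (2)")

# A second connection can still read; it sees the last committed state.
reader = sqlite3.connect(path, timeout=0)
seen_during = reader.execute("SELECT COUNT(*) FROM t").fetchone()[0]

writer.commit()
seen_after = reader.execute("SELECT COUNT(*) FROM t").fetchone()[0]

reader.close()
writer.close()
```

Even in WAL mode there is still only one writer at a time; a second writer gets SQLITE_BUSY or waits for its busy timeout.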
SQLite3 explicitly allows multiple connections:
(5) Can multiple applications or multiple instances of the same
application access a single database file at the same time?
Multiple processes can have the same database open at the same time.
Multiple processes can be doing a SELECT at the same time. But only
one process can be making changes to the database at any moment in
time, however.
For sharing connections, use SQLite3 shared cache:
Starting with version 3.3.0, SQLite includes a special "shared-cache"
mode (disabled by default)
In version 3.5.0, shared-cache mode was modified so that the same
cache can be shared across an entire process rather than just within a
single thread.
5.0 Enabling Shared-Cache Mode
Shared-cache mode is enabled on a per-process basis. Using the C
interface, the following API can be used to globally enable or disable
shared-cache mode:
int sqlite3_enable_shared_cache(int);
Each call sqlite3_enable_shared_cache() effects subsequent database
connections created using sqlite3_open(), sqlite3_open16(), or
sqlite3_open_v2(). Database connections that already exist are
unaffected. Each call to sqlite3_enable_shared_cache() overrides all
previous calls within the same process.
I had a similar code architecture as you. I used a single SQLite database which process A read from, while process B wrote to it concurrently based on events. (In python 3.10.2 using the most up to date sqlite3 version). Process B was continually updating the database, while process A was reading from it to check data. My issue was that it was working in debug mode, but not in "release" mode.
In order to solve my particular problem I used Write Ahead Logging, which is referenced in previous answers. After creating my database in Process B (write mode) I added the line:
cur.execute('PRAGMA journal_mode=wal') where cur is the cursor object created from establishing connection.
This set the journal to wal mode which allows for concurrent access for multiple reads (but only one write). In Process A, where I was reading the data, before connecting to the same database I included:
time.sleep(0.5)
Setting a sleep timer before a connection was made to the same database fixed my issue with it not working in "release" mode.
In my case: I did not have to manually set any checkpoints, locks, or transactions. Your use case might be different than mine however, so research is most likely required. Nevertheless, I hope this post helps and saves everyone some time!

Database access with threading

I'm developing a program (using C++ running on a Linux machine) that uses SQLite as a back-end.
It has 2 threads which carry out the following tasks:
Thread 1
Waits for a piece of data to arrive (in this case, via a radio module)
Immediately inserts it into the database
Returns to waiting for new data
It is important this thread is "listening" for as much of the time as possible and isn't blocked waiting to insert into the database
Thread 2
Every 2 minutes, runs a SELECT on the database to find un-processed data
Processes the data
UPDATEs the rows fetched with a flag to show they have been processed
The key thing is to make sure that Thread 1 can always INSERT into the database, even if this means that Thread 2 is unable to SELECT or UPDATE (as this can just take place at a future point, the timing isn't critical).
I was hoping to find a way to prioritise INSERTs somehow using SQLite, but have failed to find one so far. Another thought was for Thread 1 to push its data into a basic queue (held in memory) and then bulk INSERT it every so often (this wouldn't block the receiving of data, and it could do a simple check to see if the database is locked and, if so, wait a few milliseconds and try again).
However, what is the "proper" way to do this with SQLite and C++ threads?
An SQLite database can be opened with or without multi-threading support. Both threads should open the database separately.
If you want to do it the hard way, you can use a priority queue and process the queries.
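The in-memory queue idea from the question could look roughly like this. It is a sketch in Python rather than C++, with a plain FIFO queue instead of a priority queue, and the table name, the STOP sentinel and the writer function are all made up for illustration. The receiving side only ever does a cheap put(), and a separate thread with its own connection drains the queue and inserts in batches.

```python
import os
import queue
import sqlite3
import tempfile
import threading

# Hypothetical scratch database and table.
path = os.path.join(tempfile.mkdtemp(), "radio.db")
init = sqlite3.connect(path)
init.execute("CREATE TABLE readings (id INTEGER PRIMARY KEY, payload TEXT)")
init.close()

inbox = queue.Queue()
STOP = object()  # sentinel that tells the writer thread to finish


def writer():
    # The writer owns its connection; timeout lets it wait out a busy database.
    conn = sqlite3.connect(path, timeout=5.0)
    stop = False
    while not stop:
        item = inbox.get()  # block until something arrives
        batch = []
        while True:
            if item is STOP:
                stop = True
                break
            batch.append((item,))
            try:
                item = inbox.get_nowait()  # grab whatever else is queued
            except queue.Empty:
                break
        if batch:
            conn.executemany("INSERT INTO readings (payload) VALUES (?)", batch)
            conn.commit()
    conn.close()


t = threading.Thread(target=writer)
t.start()

# Stand-in for the radio listener: it never touches the database itself.
for i in range(100):
    inbox.put(f"packet {i}")
inbox.put(STOP)
t.join()

check = sqlite3.connect(path)
count = check.execute("SELECT COUNT(*) FROM readings").fetchone()[0]
check.close()
```

The processing thread from the question would open its own third connection for its SELECT and UPDATE; if it holds the database for a while, the queue simply grows until the writer gets its turn.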

sqlite3 - keep open handler VS, open database when need

How much overhead is there in keeping an sqlite3 database open vs. opening the database only when needed?
The application is under high load.
1) It's hard to write a version that uses one handle per thread, but I can write something like a driver that keeps e.g. 3-5 handles open and ready for reading and 1 for writing, hands them out to threads on request, keeps the mutexes, etc. (not an easy solution to implement)
VS.
2) Open the sqlite database only when some thread needs it and let sqlite do all the work, but here there is the additional overhead of opening the database each time. (easy to implement)
UPDATE:
3) There is another option: I can keep one handle open per database and use a simple mutex to lock access to the database. The disadvantage of this is that I lose concurrent reads: only one thread is able to read or write at a time, whereas with option 1 there is concurrent reading (more than 1 reader can read at a time).
You should keep it open.
Opening and closing the file is more expensive than keeping one file handle open.
You can measure the cost by running the same query 1000 times in a loop, first with the open and close inside the loop and then with them moved outside.
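The measurement suggested above could be sketched like this in Python. The database, query and function names are placeholders, and the numbers depend entirely on the machine and file system, so treat it as a way to measure rather than as a result.

```python
import os
import sqlite3
import tempfile
import time

# Hypothetical scratch database with a single row to query.
path = os.path.join(tempfile.mkdtemp(), "bench.db")
conn = sqlite3.connect(path)
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, v TEXT)")
conn.execute("INSERT INTO t (v) VALUES ('x')")
conn.commit()
conn.close()

QUERY = "SELECT v FROM t WHERE id = 1"
N = 1000


def reopen_each_time():
    # Open and close inside the loop.
    for _ in range(N):
        c = sqlite3.connect(path)
        c.execute(QUERY).fetchone()
        c.close()


def keep_open():
    # Open once, run all queries, close once.
    c = sqlite3.connect(path)
    for _ in range(N):
        c.execute(QUERY).fetchone()
    c.close()


def timed(fn):
    start = time.perf_counter()
    fn()
    return time.perf_counter() - start


t_reopen = timed(reopen_each_time)
t_keep = timed(keep_open)
print(f"open/close inside loop: {t_reopen:.4f}s")
print(f"single open handle:     {t_keep:.4f}s")
```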
Usually a multi-threaded application should use a connection pool. The size of the pool should be calculated.
EDIT: Writes to the DB can be synchronized with a TRANSACTION. In sqlite you use the BEGIN TRANSACTION and END TRANSACTION statements (or just BEGIN and END). BEGIN can be used as a mutex lock in a loop, and END as the unlock. Note that a plain BEGIN is deferred and takes no lock until the first statement runs, so use BEGIN IMMEDIATE if the lock should be taken right away. This also protects you against another process altering the DB.
EDIT2: Another solution is one connection per thread.
EDIT3: You can also implement or use a message queue to write to the DB.
EDIT4:
I think separating reads and writes is not such a good idea, because writes should have higher priority than reads. The problem is that in sqlite you can't lock a single table; you lock the entire DB.
When I used sqlite, I used a wrapper class with a single handle to the DB, and all reads and writes went through high-level functions. I had a write queue and also kept track, for each table, of whether it had an unwritten change pending, so every read function could test whether it had up-to-date data or should wait.
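The idea from the first EDIT, using BEGIN in a loop as a lock, can be sketched like this in Python. The helper begin_immediate and the table are hypothetical names, and the sketch uses BEGIN IMMEDIATE because a plain (deferred) BEGIN does not take the write lock until the first statement runs.

```python
import os
import sqlite3
import tempfile
import time


def begin_immediate(conn, attempts=50, delay=0.01):
    """Try to take the write lock; return False if the database stays busy."""
    for _ in range(attempts):
        try:
            conn.execute("BEGIN IMMEDIATE")
            return True
        except sqlite3.OperationalError as exc:
            if "locked" not in str(exc):
                raise  # some other error, not a busy database
            time.sleep(delay)
    return False


# Hypothetical scratch database.
path = os.path.join(tempfile.mkdtemp(), "lock_demo.db")
# isolation_level=None: the module issues no implicit BEGIN of its own.
a = sqlite3.connect(path, timeout=0, isolation_level=None)
b = sqlite3.connect(path, timeout=0, isolation_level=None)
a.execute("CREATE TABLE t (v INTEGER)")

a.execute("BEGIN IMMEDIATE")                # "lock"
first_try = begin_immediate(b, attempts=3)  # fails while a holds the lock
a.execute("INSERT INTO t VALUES (1)")
a.execute("COMMIT")                         # "unlock" (END is a synonym)

second_try = begin_immediate(b)             # succeeds now
rows = b.execute("SELECT COUNT(*) FROM t").fetchone()[0]
b.execute("COMMIT")

a.close()
b.close()
```

Setting a busy timeout on the connection gives a similar effect without a hand-written loop, since SQLite then retries internally for up to that long.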

sqlite db remains locked/unaccessible

I have a problem with an sqlite3 db which remains locked/inaccessible after a certain access.
The behaviour occurs so far on Ubuntu 10.4 and on a custom (OpenEmbedded) Linux.
The sqlite version is 3.7.7.1. The db is a local file.
One C++ application accesses the db periodically (every 5s). Each time, several insert statements are executed, wrapped in a deferred transaction. This happens in one thread only. The connection to the db is held over the whole lifetime of the application. The statements used are also persistent and reused via sqlite3_reset. sqlite_threadsafe is set to 1 (serialized), journaling is set to WAL.
Then I open the sqlite db in parallel with the sqlite command line tool. I enter BEGIN IMMEDIATE;, wait >5s, and commit with END;.
After this the db access of the application fails: the BEGIN TRANSACTION returns return code 1 ("SQL error or missing database"). If I execute a ROLLBACK TRANSACTION right before the BEGIN, just to be sure there is not already an active transaction, it fails with return code 5 ("The database file is locked").
Does anyone have an idea how to approach this problem, or what may cause it?
EDIT: There is a workaround: if the described error occurs, I close and reopen the db connection. This fixes the problem, but I'm currently at a loss as to why.
SQLite is a serverless database. As far as I know, it does not support concurrent access from multiple sources by design. You are trying to access the same backing file from both your application and the command line tool, so you are attempting concurrent access. This is why it is failing.
SQLite connections should only be used from a single thread, as among other things they contain mutexes that are used to ensure correct concurrent access. (Be aware that SQLite also only ever supports a single updating thread at once anyway, and with no concurrent reads at the time; that's a limitation of being a server-less DB.)
Luckily, SQLite connections are relatively cheap when they're not doing anything and the cost of things like cached prepared statements is actually fairly small; open up as many as you need.
[EDIT]:
Moreover, this would explain why closing and reopening the connection works: it builds the connection in the new thread (and finalizes all the locks etc. in the old one).