I have a problem with an sqlite3 db which remains locked/inaccessible after a certain access.
The behaviour occurs so far on Ubuntu 10.4 and on a custom (OpenEmbedded) Linux.
The sqlite version is 3.7.7.1. The db is a local file.
One C++ application accesses the db periodically (every 5 s). Each time, several insert statements are executed, wrapped in a deferred transaction. This happens in one thread only. The connection to the db is held open over the whole lifetime of the application. The statements used are also persistent and are reused via sqlite3_reset. sqlite_threadsafe is set to 1 (serialized), and journaling is set to WAL.
Then I open the sqlite db in parallel with the sqlite command line tool. I enter BEGIN IMMEDIATE;, wait >5 s, and commit with END;.
After this, the db access of the application fails: the BEGIN TRANSACTION returns return code 1 ("SQL error or missing database"). If I execute a ROLLBACK TRANSACTION right before the BEGIN, just to be sure there is not already an active transaction, it fails with return code 5 ("The database file is locked").
Does anyone have an idea how to approach this problem, or what may cause it?
EDIT: There is a workaround: if the described error occurs, I close and reopen the db connection. This fixes the problem, but I'm currently at a loss as to why this is so.
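The close-and-reopen workaround can be sketched roughly as follows. This uses Python's sqlite3 module rather than the C++ code from the question, and the class name `ReconnectingWriter`, the `samples` table and the single retry are all assumptions made for illustration, not the original application's design.

```python
import sqlite3


class ReconnectingWriter:
    """Hypothetical helper: one long-lived connection, reopened once if a write fails."""

    def __init__(self, path):
        self.path = path
        self.conn = self._open()

    def _open(self):
        # isolation_level=None puts the module in autocommit mode,
        # so BEGIN/COMMIT are issued explicitly below.
        conn = sqlite3.connect(self.path, isolation_level=None)
        conn.execute("PRAGMA journal_mode=WAL")
        # Assumed schema, only here to keep the sketch self-contained.
        conn.execute("CREATE TABLE IF NOT EXISTS samples(value REAL)")
        return conn

    def insert_batch(self, values):
        for attempt in (1, 2):
            try:
                self.conn.execute("BEGIN DEFERRED")
                self.conn.executemany(
                    "INSERT INTO samples(value) VALUES (?)",
                    [(v,) for v in values],
                )
                self.conn.execute("COMMIT")
                return
            except sqlite3.OperationalError:
                # Closing discards any half-open transaction; then start fresh.
                self.conn.close()
                self.conn = self._open()
                if attempt == 2:
                    raise
```

This only papers over the symptom in the same way the workaround does; it does not explain why the original connection stays locked.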
SQLite is a serverless database. As far as I know, it does not support concurrent access from multiple sources by design. You are trying to access the same backing file from both your application and the command line tool, so you are attempting concurrent access. This is why it is failing.
SQLite connections should only be used from a single thread, as among other things they contain mutexes that are used to ensure correct concurrent access. (Be aware that SQLite also only ever supports a single updating thread at once anyway, and with no concurrent reads at the time; that's a limitation of being a server-less DB.)
Luckily, SQLite connections are relatively cheap when they're not doing anything and the cost of things like cached prepared statements is actually fairly small; open up as many as you need.
[EDIT]:
Moreover, this would explain why closing and reopening the connection works: it builds the connection in the new thread (and finalizes all the locks etc. in the old one).
I have a SQLite database that is used by two processes. I am wondering, with the most recent version of SQLite: when one process (connection) starts a transaction to write to the database, will the other process be able to read from the database simultaneously?
I collected information from various sources, mostly from sqlite.org, and put them together:
First, by default, multiple processes can have the same SQLite database open at the same time, and several read accesses can be satisfied in parallel.
When writing, a single write to the database locks the database for a short time; during that time nothing, not even a read, can access the database file.
Beginning with version 3.7.0, a new “Write Ahead Logging” (WAL) option is available, in which reading and writing can proceed concurrently.
By default, WAL is not enabled. To turn WAL on, refer to the SQLite documentation.
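As a minimal sketch of what this looks like in practice, the following Python snippet turns WAL on with `PRAGMA journal_mode=WAL` and then reads from a second connection while the first one still has a write transaction open. The function name `read_during_write` and the `log` table are made up for the example.

```python
import sqlite3


def read_during_write(path):
    """Return (journal mode, rows visible to a reader while a write is pending)."""
    writer = sqlite3.connect(path, isolation_level=None)  # explicit BEGIN/COMMIT
    # The pragma returns the journal mode that is now in effect.
    mode = writer.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    writer.execute("CREATE TABLE IF NOT EXISTS log(msg TEXT)")

    writer.execute("BEGIN IMMEDIATE")  # take the write lock
    writer.execute("INSERT INTO log(msg) VALUES ('pending')")

    # A second connection can still read; it sees the last committed state.
    reader = sqlite3.connect(path, timeout=0)
    visible = reader.execute("SELECT COUNT(*) FROM log").fetchone()[0]
    reader.close()

    writer.execute("COMMIT")
    writer.close()
    return mode, visible
```

On a fresh database file the reader is not blocked and does not see the uncommitted row. WAL mode is stored in the database file, so it only has to be set once per database.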
SQLite3 explicitly allows multiple connections:
(5) Can multiple applications or multiple instances of the same
application access a single database file at the same time?
Multiple processes can have the same database open at the same time.
Multiple processes can be doing a SELECT at the same time. But only
one process can be making changes to the database at any moment in
time, however.
For sharing connections, use SQLite3 shared cache:
Starting with version 3.3.0, SQLite includes a special "shared-cache"
mode (disabled by default)
In version 3.5.0, shared-cache mode was modified so that the same
cache can be shared across an entire process rather than just within a
single thread.
5.0 Enabling Shared-Cache Mode
Shared-cache mode is enabled on a per-process basis. Using the C
interface, the following API can be used to globally enable or disable
shared-cache mode:
int sqlite3_enable_shared_cache(int);
Each call to sqlite3_enable_shared_cache() affects subsequent database
connections created using sqlite3_open(), sqlite3_open16(), or
sqlite3_open_v2(). Database connections that already exist are
unaffected. Each call to sqlite3_enable_shared_cache() overrides all
previous calls within the same process.
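For a quick experiment without the C interface, shared cache can also be requested per connection through SQLite's URI filename parameter `cache=shared`. This is a different switch from the process-wide `sqlite3_enable_shared_cache()` quoted above. The sketch below assumes a SQLite build with shared-cache support compiled in; the database name `shared_demo` and the table are invented for the example.

```python
import sqlite3


def shared_cache_demo():
    """Two connections in one process sharing one in-memory database."""
    uri = "file:shared_demo?mode=memory&cache=shared"
    a = sqlite3.connect(uri, uri=True)
    b = sqlite3.connect(uri, uri=True)
    try:
        a.execute("CREATE TABLE t(x INTEGER)")
        a.execute("INSERT INTO t VALUES (1)")
        a.commit()
        # b sees the table because both connections use the same cache.
        return b.execute("SELECT x FROM t").fetchall()
    finally:
        b.close()
        a.close()
```

Note that current SQLite documentation discourages shared-cache mode, so treat this as an illustration of the quoted text rather than a recommendation.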
I had a code architecture similar to yours. I used a single SQLite database which process A read from while process B wrote to it concurrently, based on events (in Python 3.10.2, using the most up-to-date sqlite3 version). Process B was continually updating the database, while process A was reading from it to check data. My issue was that it worked in debug mode, but not in "release" mode.
In order to solve my particular problem I used Write Ahead Logging, which is referenced in previous answers. After creating my database in Process B (write mode) I added the line:
cur.execute('PRAGMA journal_mode=wal'), where cur is the cursor object created after establishing the connection.
This sets the journal to WAL mode, which allows concurrent access for multiple readers (but only one writer). In Process A, where I was reading the data, before connecting to the same database I included:
time.sleep(0.5)
Setting a sleep timer before a connection was made to the same database fixed my issue with it not working in "release" mode.
In my case: I did not have to manually set any checkpoints, locks, or transactions. Your use case might be different than mine however, so research is most likely required. Nevertheless, I hope this post helps and saves everyone some time!
I'm using Berkeley DB with a relatively large database file (2.1 GiB, using the btree format in case it matters). During application shutdown, DbEnv::lsn_reset is called in order to "flush" everything before exiting the application. For the large database, this routine takes a very long time for me -- 10 minutes or so at least, during which heavy disk access happens.
Is this normal or the result of using Berkeley DB in some wrong way? Is there anything that can be done to make things process faster? In particular, which parameters could be tweaked to improve performance here?
DbEnv::lsn_reset() is probably not what you want. That function rewrites every single page in the database, so that you can close the databases out and open them in a different environment. It's going to write out at least 2.1 GiB, and pretty slowly.
If you're just shutting the application down to be started back up sometime later, you may simply want to call DbEnv::txn_checkpoint() to flush the database log and insert a checkpoint record. This isn't required either, though: as long as you have the logs committed to stable storage, you can simply exit your application.
http://docs.oracle.com/cd/E17276_01/html/api_reference/CXX/txncheckpoint.html
I'm opening the sqlite database file with sqlite3_open and inserting data with sqlite3_exec.
The file is a global log file and many users are writing to it.
Now I wonder what happens if two different users with two different program instances try to insert data at the same time. Does the open fail for the second user? Or the insert?
What will happen in this case?
Is there a way to handle the problem if this scenario does not work, without a server-side database?
In most cases yes. It uses file locking, but it is broken on some systems, see http://www.sqlite.org/faq.html#q5
In short, the lock is taken when you start a transaction and released as soon as it ends. While the db is locked, other instances can neither read nor write to it (in a "big" db, they can still read). However, you can also open SQLite in exclusive mode.
When you want to write to a db that is locked by another process, execution halts for a specified timeout. Some wrappers default to 5 seconds; with the plain C API there is no waiting unless you set a timeout with sqlite3_busy_timeout(). If the lock is released in time, the write proceeds; if not, an error is raised.
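To make the failure mode concrete, here is a small sketch in Python (not the C API from the question) where one connection holds the write lock and a second one tries to insert. The helper name `try_concurrent_write` and the `log` table are assumptions for the example; `timeout` is the sqlite3 module's busy timeout in seconds.

```python
import sqlite3


def try_concurrent_write(path, timeout_s):
    """Attempt an INSERT while another connection holds the write lock."""
    holder = sqlite3.connect(path, isolation_level=None)
    holder.execute("CREATE TABLE IF NOT EXISTS log(msg TEXT)")
    holder.execute("BEGIN IMMEDIATE")  # take the write lock and keep it

    # Opening the second connection succeeds; only the write is refused.
    other = sqlite3.connect(path, timeout=timeout_s, isolation_level=None)
    try:
        other.execute("INSERT INTO log(msg) VALUES ('second writer')")
        outcome = "written"
    except sqlite3.OperationalError as exc:
        outcome = str(exc)  # the busy timeout expired without getting the lock
    finally:
        other.close()
        holder.execute("ROLLBACK")
        holder.close()
    return outcome
```

So in this setup it is the insert, not the open, that fails, and only after the busy timeout has elapsed. A caller would normally retry or report the error at that point.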
I have a multithreaded application that uses sqlite (3.7.3)
I'm hitting the database locked error that seems to be quite prevalent.
I'm wondering how to avoid it in my case.
Let me describe what I'm building. Sorry, no code; it's too large and complex.
I have around 8 threads that simultaneously access the database. Any one of those threads can either read or write at the same time.
Each row in a table in the database has a file path that points to a resource + other attributes related to that resource.
3 fields of note are readers, status and del.
Readers is incremented each time a thread reads from the resource, but only if status > 0 and del = 0.
So I have some SQL that does
UPDATE resource set readers=readers+1 where id=? AND del=0 AND status>0
After that, I check the number of rows updated. It should only be 1.
After that I try to read the row back with a select. I do that even if it failed
to update because I need to know the reason that it failed.
I tried wrapping both the update and the select in a transaction but that didn't help.
I've checked that I'm calling finalize on my statements too.
Now, I thought that sqlite serializes by default. I've tried a couple of open modes but I still get the same error.
And before you ask, no I'm not intending to go to mysql. I absolutely need zero config.
Can someone provide some pointers on how to avoid this type of problem? Should I move the readers lock out of the DB? If I do that, what mechanism should I replace it with? I'm using C++ on Linux, with the boost library available.
EDIT:
Interestingly, adding COMMIT after my update call improved things dramatically.
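The update-then-check pattern described above, including the commit, might look roughly like this in Python. The column list in the SELECT and the function name `acquire_reader` are assumptions; only the UPDATE statement is taken from the question.

```python
import sqlite3


def acquire_reader(conn, resource_id):
    """Try to register a reader; return (ok, row) so the caller can see why it failed."""
    # "with conn" commits on success and rolls back on an exception,
    # so the write lock is released as soon as the block ends.
    with conn:
        cur = conn.execute(
            "UPDATE resource SET readers = readers + 1 "
            "WHERE id = ? AND del = 0 AND status > 0",
            (resource_id,),
        )
        ok = cur.rowcount == 1
        row = conn.execute(
            "SELECT readers, status, del FROM resource WHERE id = ?",
            (resource_id,),
        ).fetchone()
    return ok, row
```

Committing promptly matters because an uncommitted UPDATE keeps the write lock, which would fit the improvement seen after adding COMMIT.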
When you open the db, you should configure the 'busy timeout'
int sqlite3_busy_timeout(sqlite3*, int ms);
http://www.sqlite.org/c3ref/busy_timeout.html
First question: are you trying to use one connection with all eight threads? If so, give each thread its own connection instead. I don't know of any database that likes having one connection shared across threads.
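A minimal sketch of the one-connection-per-thread layout, combined with a busy timeout, is shown below in Python. The `counter` table, the function name `run_workers` and the 30 second timeout are illustrative assumptions, not values from the question.

```python
import sqlite3
import threading


def run_workers(path, n_threads=8, increments=10):
    """Each thread opens its own connection and increments a shared counter."""
    setup = sqlite3.connect(path)
    setup.execute("CREATE TABLE IF NOT EXISTS counter(id INTEGER PRIMARY KEY, n INTEGER)")
    setup.execute("INSERT OR REPLACE INTO counter(id, n) VALUES (1, 0)")
    setup.commit()
    setup.close()

    def worker():
        # Connection created and used in this thread only; wait up to 30 s on locks.
        conn = sqlite3.connect(path, timeout=30, isolation_level=None)
        try:
            for _ in range(increments):
                conn.execute("BEGIN IMMEDIATE")  # queue for the write lock up front
                conn.execute("UPDATE counter SET n = n + 1 WHERE id = 1")
                conn.execute("COMMIT")
        finally:
            conn.close()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    check = sqlite3.connect(path)
    total = check.execute("SELECT n FROM counter WHERE id = 1").fetchone()[0]
    check.close()
    return total
```

Writes are still serialized by SQLite; the timeout just makes a thread wait for its turn instead of failing immediately with "database is locked".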
Also check out the FAQ: http://www.sqlite.org/faq.html
Apparently SQLite has to be compiled with a SQLITE_THREADSAFE preprocessor option set to 1. They do have a method to determine if that's your problem.
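One way to check this from Python is to read the compile-time options via `PRAGMA compile_options`; in C the corresponding call is `sqlite3_threadsafe()`. The helper name `threadsafe_setting` is made up for this sketch.

```python
import sqlite3


def threadsafe_setting():
    """Return the THREADSAFE compile option of the linked SQLite, or None if not listed."""
    conn = sqlite3.connect(":memory:")
    try:
        options = [row[0] for row in conn.execute("PRAGMA compile_options")]
    finally:
        conn.close()
    for opt in options:
        if opt.startswith("THREADSAFE="):
            return int(opt.split("=", 1)[1])
    return None
```

A value of 0 means the library was built without mutexes and must not be used from more than one thread at a time.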
Another issue is that writes can only safely happen from one process at a time.
We are planning to use SQLite in our project and are a bit confused by its concurrency model (because I have read many different views on this in the community). So I am putting down my questions, hoping to clear up my doubts.
We are planning to prepare all our statements during application startup, with multiple connections for reads and one connection for writes. So basically we create the connections and prepare the statements in one thread, and use another thread for binding and executing the statements.
I am using the C APIs on Windows and Linux.
1) Does creating a connection on one thread and using it in another pose any problem?
2) Should I use "shared cache mode"?
3) I am thinking of using one lock to synchronize between reads and writes, with no synchronization between reads. Should I synchronize between reads as well?
4) Do multiple concurrent reads work on the same connection?
5) Do multiple concurrent reads work on different connections?
EDIT: One more question: how do we validate the connection? We open the connection at application startup and use the same connection until the application exits, so how do we validate the connection before using it?
Thanks in Advance
Regards
DEE
1) I do not think SQLite uses any thread-specific data, so creating a connection on one thread and using it on another should be fine (they say it's so for version 3.5 onwards).
2) I don't think using shared cache mode will bring any significant performance benefit. Experiment and see; it takes only a single statement to enable it per thread.
3) You need to use a single-writer-multiple-reader kind of lock. Using simple locks will serialize all reads and writes and nullify any performance benefit of using multiple threads.
4 & 5) Any read operation should work concurrently without any problem.
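The single-writer-multiple-reader lock from point 3 can be sketched as below. This is a generic illustration in Python built on `threading.Condition`, not something SQLite provides, and the class name `ReadWriteLock` is hypothetical. It is a simple variant in which a steady stream of readers can starve a writer.

```python
import threading
from contextlib import contextmanager


class ReadWriteLock:
    """Many concurrent readers, or exactly one writer."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0      # number of readers currently inside
        self._writer = False   # True while a writer is inside

    @contextmanager
    def read(self):
        with self._cond:
            while self._writer:          # readers wait only for a writer
                self._cond.wait()
            self._readers += 1
        try:
            yield
        finally:
            with self._cond:
                self._readers -= 1
                if self._readers == 0:   # last reader out wakes waiting writers
                    self._cond.notify_all()

    @contextmanager
    def write(self):
        with self._cond:
            while self._writer or self._readers:  # writers wait for everyone
                self._cond.wait()
            self._writer = True
        try:
            yield
        finally:
            with self._cond:
                self._writer = False
                self._cond.notify_all()
```

Read paths would wrap their statement execution in `with lock.read():` and the write path in `with lock.write():`.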
The SQLite FAQ covers threading in detail; see the specific FAQ entry on threads. As of 3.3.1 it is safe to do what you describe, under certain conditions (see the FAQ).