How much overhead is there in keeping an sqlite3 database open versus opening it only when needed?
The application is under high load.
1) It is hard to write a version that uses one handle per thread, but I could write something like a driver that keeps, say, 3-5 handles open and ready for reading and 1 for writing, hands them out to threads on request, manages mutexes, etc. (not an easy solution to implement)
VS.
2) Open the sqlite database only when a thread needs it and let sqlite do all the work, but then there is the additional overhead of opening the database each time. (easy to implement)
UPDATE:
3) There is another option: I can keep one handle open per database and use a simple mutex to lock access to it. The disadvantage is that I lose concurrent reads, so only one thread can read or write at a time, whereas with option 1 reads stay concurrent (more than one reader can read at the same time).
You should keep it open.
Opening and closing the file is more expensive than keeping one file handle open.
You can measure the cost by running the same query 1000 times in a loop, first with the open and close inside the loop and then with them moved outside it.
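A rough sketch of such a measurement, assuming the SQLite C API; the file name and the query are placeholders:

#include <sqlite3.h>
#include <chrono>
#include <cstdio>

int main() {
    const char *path = "test.db";
    const char *sql  = "SELECT count(*) FROM sqlite_master;";
    using clock = std::chrono::steady_clock;

    // Variant 1: open and close inside the loop.
    auto t0 = clock::now();
    for (int i = 0; i < 1000; ++i) {
        sqlite3 *db = nullptr;
        sqlite3_open(path, &db);
        sqlite3_exec(db, sql, nullptr, nullptr, nullptr);
        sqlite3_close(db);
    }
    auto t1 = clock::now();

    // Variant 2: open once, run the same 1000 queries, close once.
    sqlite3 *db = nullptr;
    sqlite3_open(path, &db);
    for (int i = 0; i < 1000; ++i)
        sqlite3_exec(db, sql, nullptr, nullptr, nullptr);
    sqlite3_close(db);
    auto t2 = clock::now();

    std::printf("open/close per query: %lld ms, single connection: %lld ms\n",
        (long long)std::chrono::duration_cast<std::chrono::milliseconds>(t1 - t0).count(),
        (long long)std::chrono::duration_cast<std::chrono::milliseconds>(t2 - t1).count());
}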
Usually a multi-threaded application should use a connection pool; the size of the pool should be chosen based on the expected load.
EDIT: Writes to the DB can be synchronized with transactions. In SQLite you use the BEGIN TRANSACTION and END TRANSACTION statements (or just BEGIN & END). BEGIN can be used like a mutex lock in a loop and END like an unlock; this protects the DB from being altered by another process at the same time.
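A minimal sketch of that idea, again with the C API: BEGIN IMMEDIATE takes the write lock up front and is retried in a loop until it succeeds, END releases it. The function name and the 10 ms retry delay are just illustrative.

#include <sqlite3.h>
#include <chrono>
#include <thread>

bool write_with_transaction(sqlite3 *db, const char *insert_sql) {
    // Loop on BEGIN like a lock: keep retrying while another writer holds the DB.
    while (sqlite3_exec(db, "BEGIN IMMEDIATE;", nullptr, nullptr, nullptr) == SQLITE_BUSY)
        std::this_thread::sleep_for(std::chrono::milliseconds(10));

    int rc = sqlite3_exec(db, insert_sql, nullptr, nullptr, nullptr);

    if (rc == SQLITE_OK)
        sqlite3_exec(db, "END;", nullptr, nullptr, nullptr);       // commit = "unlock"
    else
        sqlite3_exec(db, "ROLLBACK;", nullptr, nullptr, nullptr);  // undo on failure
    return rc == SQLITE_OK;
}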
EDIT2: Another option is one connection per thread.
EDIT3: You can also implement or use a message queue to write to the DB.
EDIT4:
I think separating read & write connections is not such a good idea, because writes should have higher priority than reads. The problem is that in sqlite you can't lock a single table; you lock the entire DB.
When I used sqlite I wrapped it in a class with a single handle to the DB; all reads and writes went through high-level functions. I had a write queue, and I also tracked, for each table, whether it had unwritten changes pending, so every read function could check whether it had up-to-date data or should wait.
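Something along these lines (a sketch from memory; the names and structure here are mine, not the original code):

#include <sqlite3.h>
#include <map>
#include <mutex>
#include <queue>
#include <string>

class DbWrapper {
public:
    explicit DbWrapper(const std::string &path) { sqlite3_open(path.c_str(), &db_); }
    ~DbWrapper() { sqlite3_close(db_); }

    // Queue a write and mark its table as having unwritten changes pending.
    void queueWrite(const std::string &table, const std::string &sql) {
        std::lock_guard<std::mutex> lock(m_);
        writes_.push(sql);
        dirty_[table] = true;
    }

    // Flush the whole queue in one transaction, then clear the dirty flags.
    void flush() {
        std::lock_guard<std::mutex> lock(m_);
        sqlite3_exec(db_, "BEGIN;", nullptr, nullptr, nullptr);
        while (!writes_.empty()) {
            sqlite3_exec(db_, writes_.front().c_str(), nullptr, nullptr, nullptr);
            writes_.pop();
        }
        sqlite3_exec(db_, "END;", nullptr, nullptr, nullptr);
        for (auto &entry : dirty_) entry.second = false;
    }

    // Read functions call this first to see whether the table is up to date.
    bool hasPendingChanges(const std::string &table) {
        std::lock_guard<std::mutex> lock(m_);
        auto it = dirty_.find(table);
        return it != dirty_.end() && it->second;
    }

private:
    sqlite3 *db_ = nullptr;
    std::mutex m_;
    std::queue<std::string> writes_;
    std::map<std::string, bool> dirty_;
};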
I'm developing a program (using C++ running on a Linux machine) that uses SQLite as a back-end.
It has 2 threads which carry out the following tasks:
Thread 1
Waits for a piece of data to arrive (in this case, via a radio module)
Immediately inserts it into the database
Returns to waiting for new data
It is important this thread is "listening" for as much of the time as possible and isn't blocked waiting to insert into the database
Thread 2
Every 2 minutes, runs a SELECT on the database to find un-processed data
Processes the data
UPDATEs the rows fetched with a flag to show they have been processed
The key thing is to make sure that Thread 1 can always INSERT into the database, even if this means that Thread 2 is unable to SELECT or UPDATE (as this can just take place at a future point, the timing isn't critical).
I was hoping to find a way to prioritise INSERTs somehow using SQLite, but have failed to find a way so far. Another thought was for Thread 1 to push its data into a basic queue (held in memory) and then bulk INSERT it every so often (as this wouldn't block the receiving of data, and it could do a simple check to see if the database was locked; if so, wait a few milliseconds and try again).
However, what is the "proper" way to do this with SQLite and C++ threads?
An SQLite database can be opened with or without multi-threading support. Both threads should open the database separately.
If you want to do it the hard way, you can use a priority queue and process the queries from it.
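A simpler sketch of that idea, using a plain (non-priority) queue; the table name, column, and function names are invented for illustration. The radio thread only pushes into an in-memory queue and never touches SQLite; a separate writer thread drains the queue and does the INSERTs in bulk, so Thread 1 is never blocked by the database.

#include <sqlite3.h>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <string>

std::queue<std::string> pending;      // payloads waiting to be inserted
std::mutex m;
std::condition_variable cv;
bool done = false;

// Thread 1: called whenever the radio delivers a payload; only queues, never blocks on the DB.
void onRadioData(std::string payload) {
    { std::lock_guard<std::mutex> lock(m); pending.push(std::move(payload)); }
    cv.notify_one();
}

// Writer thread: drains the queue and inserts everything in one transaction per batch.
void writerThread(sqlite3 *db) {
    for (;;) {
        std::queue<std::string> batch;
        {
            std::unique_lock<std::mutex> lock(m);
            cv.wait(lock, [] { return !pending.empty() || done; });
            if (done && pending.empty()) return;
            std::swap(batch, pending);            // grab the batch and release the lock quickly
        }
        sqlite3_exec(db, "BEGIN;", nullptr, nullptr, nullptr);
        while (!batch.empty()) {
            sqlite3_stmt *stmt = nullptr;
            sqlite3_prepare_v2(db, "INSERT INTO readings(payload) VALUES(?);", -1, &stmt, nullptr);
            sqlite3_bind_text(stmt, 1, batch.front().c_str(), -1, SQLITE_TRANSIENT);
            sqlite3_step(stmt);
            sqlite3_finalize(stmt);
            batch.pop();
        }
        sqlite3_exec(db, "END;", nullptr, nullptr, nullptr);
    }
}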
I'm opening the sqlite database file with sqlite3_open and inserting data with a sqlite3_exec.
The file is a global log file and many users are writing to it.
Now I wonder what happens if two different users, with two different program instances, try to insert data at the same time... Does the open fail for the second user? Or the insert?
What will happen in this case?
Is there a way to handle this if that scenario does not work, without a server-side database?
In most cases yes. It uses file locking, but it is broken on some systems, see http://www.sqlite.org/faq.html#q5
In short, the lock is taken when you start a transaction and released immediately after it ends. While the database is locked, other instances can neither read nor write to it (in "big" database systems, they could at least still read). You can also connect to SQLite in exclusive locking mode.
When you want to write to a db that is locked by another process, execution waits for a configurable timeout (commonly 5 seconds by default). If the lock is released in time, the write proceeds; if not, an error is raised.
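In the C API the timeout is set per connection with sqlite3_busy_timeout; a small sketch (the file name and table are placeholders):

#include <sqlite3.h>
#include <cstdio>

int main() {
    sqlite3 *db = nullptr;
    sqlite3_open("shared_log.db", &db);
    sqlite3_busy_timeout(db, 5000);   // wait and retry for up to 5 s if another process holds the lock

    int rc = sqlite3_exec(db, "INSERT INTO log(msg) VALUES('hello');",
                          nullptr, nullptr, nullptr);
    if (rc == SQLITE_BUSY)
        std::printf("still locked after the timeout\n");

    sqlite3_close(db);
}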
I have an SQLite database that I am keeping open and writing to in process A. I would like to be able to use it from process B on a read-only basis.
According to the documentation,
if the database is UNLOCKED the database may not be read (or written) - unsuitable
if the database is SHARED then two processes can read it but the first can't write - unsuitable
if a process wants to write it needs an EXCLUSIVE lock which means no other processes can write - unsuitable
The process A will be making lots of little writes so I don't think making a copy on each transaction commit will be efficient.
The only way I can see this working is for the reader to wait until the database enters the UNLOCKED state, take a SHARED lock for the duration of the read, and then release it. Meanwhile process A will want to write and will be blocked until the lock becomes available - if it ever does (what if process B crashes?). This means that process A and process B will be in contention for locks - B wants SHARED and A wants EXCLUSIVE - and this will slow things down or even lead to concurrency problems.
Is there any way to achieve my aim of concurrent writing and reading?
Use WAL mode. It supports concurrent readers and one writer.
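A minimal sketch of enabling it with the C API (the file name is a placeholder); WAL is a property of the database file, so once set it persists for later connections:

#include <sqlite3.h>

int main() {
    sqlite3 *db = nullptr;
    sqlite3_open("data.db", &db);
    // Switch the database to write-ahead logging; readers no longer block the writer.
    sqlite3_exec(db, "PRAGMA journal_mode=WAL;", nullptr, nullptr, nullptr);
    // ... process A keeps writing here while process B opens the same file read-only ...
    sqlite3_close(db);
}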
On Android you can also use WAL mode. It is (poorly) supported from API 11; better support starts with API 16. Use code like this to switch your database connection to WAL mode:
int flags = SQLiteDatabase.CREATE_IF_NECESSARY;
if (walModeEnabled) {
    if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.JELLY_BEAN) {
        flags = flags | SQLiteDatabase.ENABLE_WRITE_AHEAD_LOGGING;
    }
}
SQLiteDatabase db = SQLiteDatabase.openDatabase(databasePath.getPath(), null, flags);
// backward compatibility hack to support WAL on pre-Jelly-Bean devices
if (walModeEnabled) {
    if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.HONEYCOMB &&
            Build.VERSION.SDK_INT < Build.VERSION_CODES.JELLY_BEAN) {
        db.enableWriteAheadLogging();
    } else if (Build.VERSION.SDK_INT < Build.VERSION_CODES.HONEYCOMB) {
        Log.w(TAG, "WAL is not supported on API levels below 11.");
    }
}
For SQLiteOpenHelper usage and a deeper explanation of how WAL mode works under the hood, please refer to my article:
https://www.skoumal.net/en/parallel-read-and-write-in-sqlite/
The simple answer to your question is: it is impossible.
Even if you appear to succeed, it means you are getting wrong results.
I think you should know the basics of databases -
why a database is better than file handling and other data storage methods.
Put simply:
you can't perform W-W, W-R, or R-W operations on the same data simultaneously
(W - write, R - read).
However, you can execute any number of R-R operations at the same time.
Just think about an online banking system or a railway reservation system,
in which a key database feature is used: the transaction.
Transactions follow ACID:
Atomicity, Consistency, Isolation, Durability.
Atomicity - a transaction either completes entirely or not at all.
Consistency - after each transaction the system goes from one consistent state to another consistent state.
Isolation - every transaction executes in isolation from the others
(meaning that if a write query comes first it is executed first). There is simply no way to run a write and a read on the same data at the same moment; even if they are only nanoseconds apart, the system will detect it, and if you somehow manage to submit both at once, the db simply rejects one or executes the operation with the higher priority first.
Durability - the results of a committed transaction must persist over time.
-- Maybe this is broader than a simple database question, but it may help you understand. --
An SQLite database file is organized as pages. The size of each page is a power of 2 between 512 and SQLITE_MAX_PAGE_SIZE. The default value for SQLITE_MAX_PAGE_SIZE is 32768.
The SQLITE_MAX_PAGE_COUNT parameter, which is normally set to 1073741823, is the maximum number of pages allowed in a single database file. An attempt to insert new data that would cause the database file to grow larger than this will return SQLITE_FULL.
So we have 32768 * 1073741823, which is 35,184,372,056,064 (35 trillion bytes)!
You can modify SQLITE_MAX_PAGE_COUNT or SQLITE_MAX_PAGE_SIZE in the source, but this of course will require a custom build of SQLite for your application. As far as I'm aware, there's no way to set a limit programmatically other than at compile time (but I'd be happy to be proven wrong).
I have a problem with an sqlite3 db which remains locked/inaccessible after a certain access.
The behaviour occurs so far on Ubuntu 10.4 and on a custom (OpenEmbedded) Linux.
The SQLite version is 3.7.7.1. The db is a local file.
One C++ application accesses the db periodically (every 5 s). Each time, several INSERT statements are executed, wrapped in a deferred transaction. This happens in one thread only. The connection to the db is held for the whole lifetime of the application. The prepared statements are also persistent and reused via sqlite3_reset. SQLITE_THREADSAFE is set to 1 (serialized) and journaling is set to WAL.
Then I open the db in parallel with the sqlite command-line tool. I enter BEGIN IMMEDIATE;, wait more than 5 s, and commit with END;.
After this, the application's db access fails: BEGIN TRANSACTION returns code 1 ("SQL error or missing database"). If I execute a ROLLBACK TRANSACTION right before the BEGIN, just to be sure there is not already an active transaction, it fails with return code 5 ("The database file is locked").
Does anyone have an idea how to approach this problem, or what might be causing it?
EDIT: There is a workaround: if the described error occurs, I close and reopen the db connection. This fixes the problem, but I'm currently at a loss as to why that is.
SQLite is a serverless database. As far as I know it does not support concurrent access from multiple sources by design. You are trying to access the same backing file from both your application and the command-line tool, so you are attempting concurrent access. This is why it is failing.
SQLite connections should only be used from a single thread, as among other things they contain mutexes that are used to ensure correct concurrent access. (Be aware that SQLite also only ever supports a single updating thread at once anyway, with no concurrent reads at that time; that's a limitation of being a serverless DB.)
Luckily, SQLite connections are relatively cheap when they're not doing anything and the cost of things like cached prepared statements is actually fairly small; open up as many as you need.
[EDIT]:
Moreover, this would explain why closing and reopening the connection works: it rebuilds the connection in the new thread (and finalizes all the locks etc. in the old one).
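One cheap way to get "as many connections as you need", one per thread (a sketch, not the poster's code), is a thread_local handle that each thread opens lazily on first use:

#include <sqlite3.h>

sqlite3 *threadConnection(const char *path) {
    thread_local sqlite3 *db = nullptr;   // each thread gets its own private handle
    if (!db)
        sqlite3_open(path, &db);          // opened lazily, the first time this thread asks
    return db;
}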
I have this tool in which a single log-like file is written to by several processes.
What I want to achieve is to have the file truncated when it is first opened, and then have all writes done at the end by the several processes that have it open.
All writes are systematically flushed and mutex-protected so that I don't get jumbled output.
First, a process creates the file, then starts a sequence of other processes, one at a time, which then open the file and write to it (the master sometimes chimes in with additional content; a slave process may or may not have the file open and be writing something at that point).
I'd like, as much as possible, not to use more IPC than what already exists (all I'm doing now is writing to a popen-created pipe). I have no access to external libraries other than the CRT and Win32 API, and I would like not to start writing serialization code.
Here is some code that shows where I've gone:
// open the file. Truncate it if we're the 'master', append to it if we're a 'slave'
std::ofstream blah(filename, ios::out | (isClient ? ios::app : ios::openmode(0)));
// do stuff...
// write stuff
myMutex.acquire();
blah << "stuff to write" << std::flush;
myMutex.release();
Well, this does not work: although the output of the slave process is ordered as expected, what the master writes is either bunched together or at the wrong place, when it exists at all.
I have two questions: is the flag combination given to the ofstream's constructor the right one? And am I going about this the right way?
If you'll be writing a lot of data to the log from multiple threads, you'll need to rethink the design, since all threads will block trying to acquire the mutex, and in general you don't want your threads blocked from doing work just so they can log. In that case, you'd want your worker threads to log entries to a queue (which just requires moving stuff around in memory) and have a dedicated thread that pulls entries off the queue and writes them to the output. That way your worker threads are blocked for as short a time as possible.
You can do even better than this by using async I/O, but that gets a bit more tricky.
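A rough sketch of that queued-logger design (all the names here are mine): workers only touch the in-memory queue; a single dedicated thread drains it and does the file I/O.

#include <condition_variable>
#include <fstream>
#include <mutex>
#include <queue>
#include <string>
#include <thread>

class AsyncLog {
public:
    explicit AsyncLog(const char *path)
        : out_(path, std::ios::out), writer_(&AsyncLog::run, this) {}
    ~AsyncLog() {
        { std::lock_guard<std::mutex> l(m_); stop_ = true; }
        cv_.notify_one();
        writer_.join();
    }
    // Cheap for callers: just moves the string into the queue and signals the writer.
    void log(std::string line) {
        { std::lock_guard<std::mutex> l(m_); q_.push(std::move(line)); }
        cv_.notify_one();
    }
private:
    // The only thread that ever blocks on file I/O.
    void run() {
        std::unique_lock<std::mutex> l(m_);
        while (!stop_ || !q_.empty()) {
            cv_.wait(l, [this] { return stop_ || !q_.empty(); });
            while (!q_.empty()) {
                std::string line = std::move(q_.front());
                q_.pop();
                l.unlock();                       // release while writing so loggers stay fast
                out_ << line << '\n' << std::flush;
                l.lock();
            }
        }
    }
    std::ofstream out_;
    std::mutex m_;
    std::condition_variable cv_;
    std::queue<std::string> q_;
    bool stop_ = false;
    std::thread writer_;
};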
As suggested by reinier, the problem was not in the way I use the files but in the way the programs behave.
The fstreams do just fine.
What I missed out is the synchronization between the master and the slave (the former was assuming a particular operation was synchronous where it was not).
edit: Oh well, there was still a problem with the open flags. The process that opened the file with ios::out did not move the file pointer as needed (it was erasing text other processes were writing), and using seekp() completely screwed up the output written to cout because another part of the code uses cerr.
My final solution is to keep the mutex and the flush and, for the master process, open the file in ios::out mode (to create or truncate it), close it, and reopen it using ios::app.
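In code that final arrangement looks roughly like this (just a sketch of the idea, with an invented helper name):

#include <fstream>

void openMasterLog(const char *filename, std::ofstream &out) {
    { std::ofstream creator(filename, std::ios::out); }  // create or truncate, then close
    out.open(filename, std::ios::app);                   // reopen in append mode, like the slaves
}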
I made a little log system that has its own process to handle the writing; the idea is quite simple. The processes that use the logs just send entries to a pending queue, which the log process then writes to a file. It's like batch processing in any realtime rendering app. This way you get rid of too many open/close file operations. If I can, I'll add some sample code.
How do you create that mutex?
For this to work, it needs to be a named mutex so that both processes actually lock on the same thing.
You can check that your mutex is working correctly with a small piece of code that locks it in one process while another process tries to acquire it.
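With only the Win32 API available, a named mutex looks roughly like this (the name "Global\\MyLogMutex" is just an example); because it is named, every process that opens it locks on the same kernel object:

#include <windows.h>

HANDLE hLogMutex = CreateMutexW(nullptr, FALSE, L"Global\\MyLogMutex");

void writeLogLocked(/* the text to append */) {
    WaitForSingleObject(hLogMutex, INFINITE);   // acquire across processes
    // ... write and flush the shared file here ...
    ReleaseMutex(hLogMutex);                    // release so other processes can write
}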
I suggest blocking so that the text is completely written to the file before releasing the mutex. I've had instances where the text from one task was interrupted by text from a higher-priority thread; it doesn't look very pretty.
Also, put the output into comma-separated format, or some format that can be easily loaded into a spreadsheet. Include the thread ID and a timestamp. The interleaving of the text lines shows how the threads are interacting, the ID column lets you sort by thread, and timestamps can be used to show sequential access as well as duration. Writing in a spreadsheet-friendly format lets you analyze the log file with an external tool without writing any conversion utilities. This has helped me greatly.
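One way to produce such a line (a sketch; the field order of timestamp, thread id, message is just a suggestion):

#include <chrono>
#include <sstream>
#include <string>
#include <thread>

std::string csvLogLine(const std::string &message) {
    // milliseconds since the epoch as the timestamp column
    auto ms = std::chrono::duration_cast<std::chrono::milliseconds>(
        std::chrono::system_clock::now().time_since_epoch()).count();
    std::ostringstream line;
    line << ms << ',' << std::this_thread::get_id() << ',' << message;
    return line.str();
}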
One option is to use ACE's logging facility. It has an efficient implementation of concurrent logging.