I have used a version of double checked locking in my CF app (before I knew what double checked locking was).
Essentially, I check for the existence of an object. If it is not present, I lock (usually using a named lock) and, before I try to create the object, I check for existence again. I thought this was a neat way to stop multiple objects being created and to avoid excessive locking in the system.
This seems to work, in that there is no excessive locking and duplicate objects don't get created. However, I have recently learned that double-checked locking doesn't work in Java. What I don't know is whether this holds true in CF, seeing as CF threads and locks are not quite the same as native Java threads and locks.
To add on to what Ben Doom said about Java, this is fairly standard practice in ColdFusion, specifically with an application initialization routine where you set up your application variables.
Without having at least one lock, you are letting the initial hits to your web application all initialize the application variables at the same time. This assumes that your application is busy enough to warrant this. The danger is only there if your application is busy at the time your application is first starting up.
The first lock makes sure only one request at a time initializes your variables.
The second lock, embedded within the first, checks whether a variable defined at the end of your initialization code, such as application.started, exists. If it exists, the request is kicked out of the initialization block.
The double-locking pattern has saved my skin on busy sites. However, with VERY busy sites, the queue of requests waiting for the application's initial hit to complete can climb too high, too quickly, and cause the server to crash. The idea is that the requests wait for the first hit, which is slow; then the second one gets into the first cflock and is quickly rejected. With hundreds or thousands of requests in the queue, growing every millisecond, they are all funneling down to the first cflock block. The solution is to set a very low timeout on the first cflock and not throw (or catch and duck) the lock timeout error.
As a final note, this behavior that I described has been deprecated with ColdFusion 7's onApplicationStart() method of your Application.cfc. If you are using onApplicationStart(), then you shouldn't be locking at all for your application init routine. Application.cfc is well locked already.
To conclude, yes, double-checked locking works in ColdFusion. It's helpful in a few specific circumstances, but do it right. I don't know the specifics of why it works as opposed to Java's threading model; chances are it's manually checking some sort of lookup table in the background of your ColdFusion server.
Java is threadsafe, so it isn't so much that your locks won't work as that they aren't necessary. Basically, in CF 6+, locks are needed for preventing race conditions or creating/altering objects that exist outside Java's control (files, for example).
To open a whole other can of worms...
Why don't you use a Dependency Injection library, such as ColdSpring, to keep track of your objects and prevent circular dependencies?
The application can be run by different users (Admin, Non-Admin) on Windows at once. It has a critical section which cannot be executed at the same time by different users (or by the same user twice).
To prevent that, I'm using a global mutex from WinApi by calling:
CreateMutex(NULL, true, "Global\\MyMutex"); and then, after the job is done, the mutex is released. The "Global" prefix is added to make the mutex visible across all Windows sessions. (It's optional, but I want to show that this mutex is not a local one.)
And now the problem:
Assume there is an attacker who wants to prevent the critical section from being executed by anyone, so she creates a program which creates a mutex with the name "Global\\MyMutex" and never releases it... In that way she is able to perform a DoS attack on my application for other users, as they are not able to reach the critical section.
The question is - How can I prevent such attack scenario with Global Mutex?
First, doing it with a global mutex requires admin. Second, you can do it with a plain named mutex, in which yes, a malicious app can prevent your app from running. But a malicious app running in your system may do far more problematic things than just blocking your application.
Solutions:
Create a named mutex based on some weird unique string (say a CLSID) so a malicious app won't detect it easily.
Do not lock per-application. You may issue a warning on a mutex found locked for a long time, but still let the user proceed. It depends on what your application is designed to do. In my opinion this is the better approach.
Do not bother. If a malicious-app creator does bother to attack your application, it must already have been successful for a long time anyway.
You can:
Not use a global mutex.
That's basically it.
But you say, "How can I stop my application doing something twice at the same time?"
Well, what would happen if the attacker did that thing over and over? It would screw up your application because nobody else would be able to do it at the same time, right? Mutex or no mutex.
So you need to figure out how to make it so it can be done at the same time.
Your other option is to declare that you don't really care about security. Sometimes that's an option, like if you trust everyone who logs into the computer. It doesn't look good if someone does decide to attack your application.
I use sqlite3 in a multithreaded application (it is compiled with SQLITE_THREADSAFE=2). In the watch window I see that sqlite->busyTimeout == 600000, i.e. it is supposed to have a 10 minute timeout. However, sqlite3_step returns SQLITE_BUSY obviously faster than after 10 minutes (it actually returns instantly, as if I had never called sqlite3_busy_timeout).
What is the reason that sqlite3 ignores the timeout and returns the error instantly?
One possibility: SQLite ignores the timeout when it detects a deadlock.
The scenario is as follows. Transaction A starts as a reader, and later attempts to perform a write. Transaction B is a writer (either started that way, or started as a reader and promoted to a writer first). B holds a RESERVED lock, waiting for readers to clear so it can start writing. A holds a SHARED lock (it's a reader) and tries to acquire RESERVED lock (so it can start writing). For description of various lock types, see http://sqlite.org/lockingv3.html
The only way to make progress in this situation is for one of the transactions to roll back. No amount of waiting will help, so when SQLite detects this situation, it doesn't honor the busy timeout.
There are two ways to avoid the possibility of a deadlock:
Switch to WAL mode - it allows one writer to co-exist with multiple readers.
Use BEGIN IMMEDIATE to start a transaction that may eventually need to write - this way, it starts as a writer right away. This of course reduces the potential concurrency in the system, as the price of avoiding deadlocks.
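As a rough illustration of the second suggestion, the sketch below uses Python's standard `sqlite3` module (not the C API from the question) to show two connections both starting with BEGIN IMMEDIATE. The file name, table `t`, and helper names are made up for the example. Because each transaction claims the write lock up front, the second one waits out its busy timeout and fails cleanly at BEGIN, instead of hitting SQLITE_BUSY halfway through its work.

```python
import os
import sqlite3
import tempfile


def open_conn(path, timeout_s):
    # isolation_level=None leaves transaction control to explicit BEGIN/COMMIT;
    # timeout sets the busy timeout for this connection.
    return sqlite3.connect(path, timeout=timeout_s, isolation_level=None)


def demo_begin_immediate():
    path = os.path.join(tempfile.mkdtemp(), "demo.db")   # hypothetical scratch file
    a = open_conn(path, 5.0)
    b = open_conn(path, 0.1)
    a.execute("CREATE TABLE t (x INTEGER)")

    a.execute("BEGIN IMMEDIATE")         # A becomes the writer up front
    a.execute("INSERT INTO t VALUES (1)")
    try:
        b.execute("BEGIN IMMEDIATE")     # B waits for its timeout, then fails
        b_blocked = False
    except sqlite3.OperationalError:     # "database is locked" (SQLITE_BUSY)
        b_blocked = True
    a.execute("COMMIT")

    b.execute("BEGIN IMMEDIATE")         # succeeds once A has committed
    b.execute("INSERT INTO t VALUES (2)")
    b.execute("COMMIT")
    rows = [r[0] for r in b.execute("SELECT x FROM t ORDER BY x").fetchall()]
    a.close()
    b.close()
    return b_blocked, rows
```

The price is the one named above: while one connection holds its immediate transaction, no other connection can start one.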
I ran a lot of tests and am sharing the results here for other people who use SQLite in a multithreaded environment. SQLite threading support is not well documented, and there is no good tutorial that describes all the threading issues in one place. I wrote a test program that creates 100 threads and sends random queries (INSERT, SELECT, UPDATE, DELETE) concurrently to a single database. My answer is based on observing this program.
The only really thread-safe journal mode is WAL. It allows multiple connections to do anything they need with the same database within one process, in the same manner as a single-threaded application does. The other modes are not thread safe, regardless of timeouts, busy handlers and the SQLITE_THREADSAFE preprocessor definition. They generate SQLITE_BUSY periodically, and always expecting and handling that error looks like a complex programming task. If you need thread-safe SQLite that never returns SQLITE_BUSY, just as a single thread does, you have to set the WAL journal mode.
Additionally, you have to set SQLITE_THREADSAFE=2 or SQLITE_THREADSAFE=1 preprocessor definition.
When done, you have to choose from 2 options:
You can call sqlite3_busy_timeout. That is enough; you are not required to call sqlite3_busy_handler, even though this is not obvious from the documentation. It gives you "default", "built-in" timeout functionality.
You can call sqlite3_busy_handler and implement the timeout yourself. I don't see why you would, but maybe it is required under some nonstandard OS. When you call sqlite3_busy_handler, it resets the timeout to 0 (i.e. disabled). For desktop Linux and Windows you don't need it unless you like writing more complex code.
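The setup described in this answer can be sketched with Python's standard `sqlite3` module, where the `timeout` argument of `connect` plays the role of sqlite3_busy_timeout. This is a scaled-down, assumed version of such a test (4 threads rather than 100, inserts only); the file name, the `log` table and `run_writers` are invented for the example.

```python
import os
import sqlite3
import tempfile
import threading


def run_writers(n_threads=4, rows_per_thread=25):
    path = os.path.join(tempfile.mkdtemp(), "wal_demo.db")  # hypothetical scratch file

    setup = sqlite3.connect(path, isolation_level=None)
    setup.execute("PRAGMA journal_mode=WAL")   # persistent: stored in the database file
    setup.execute("CREATE TABLE log (worker INTEGER, n INTEGER)")
    setup.close()

    errors = []

    def worker(worker_id):
        # One connection per thread, each with its own busy timeout.
        conn = sqlite3.connect(path, timeout=30.0, isolation_level=None)
        try:
            for n in range(rows_per_thread):
                conn.execute("INSERT INTO log VALUES (?, ?)", (worker_id, n))
        except sqlite3.OperationalError as exc:   # e.g. "database is locked"
            errors.append(exc)
        finally:
            conn.close()

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    check = sqlite3.connect(path)
    total = check.execute("SELECT count(*) FROM log").fetchall()[0][0]
    check.close()
    return total, errors
```

Writers still take turns in WAL mode; the busy timeout is what lets each one wait for its turn instead of failing.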
When developing applications I usually print to the console to get useful debugging/tracing information. The application I am working on now is multi-threaded, and sometimes I see my printf output overlapping.
I tried to synchronize the screen output using a mutex, but I ended up slowing down and blocking the app. How can I solve this issue?
I am aware of MT logging libraries, but since I log too much, using them slows my app down (a bit).
I was thinking of the following idea: instead of logging within my application, why not log outside it? I would like to send logging information via a socket to a second application process that actually prints it on the screen.
Are you aware of any library already doing this?
I use Linux/gcc.
thanks
afg
You have 3 options. In increasing order of complexity:
Just use a simple mutex within each thread. The mutex is shared by all threads.
Send all the output to a single thread that does nothing but the logging.
Send all the output to a separate logging application.
Under most circumstances, I would go with #2. #1 is fine as a starting point, but in all but the most trivial applications you can run into problems serializing the application. #2 is still very simple, and simple is a good thing, but it is also quite scalable. You still end up doing the processing in the main application, but for the vast majority of applications you gain nothing by spinning this off to its own, dedicated application.
Number 3 is what you're going to do in performance-critical server-type applications, but the minimal performance gain you get with this approach is 1: very difficult to achieve, 2: very easy to screw up, and 3: not the only or even most compelling reason people generally take this approach. Rather, people typically take this approach when they need the logging service to be separated from the applications using it.
Which OS are you using?
Not sure about specific libraries, but one of the classical approaches to this sort of problem is to use a logging queue, which is worked by a writer thread whose job is purely to write the log file.
You need to be aware, either with a threaded approach, or a multi-process approach that the write queue may back up, meaning it needs to be managed, either by discarding entries or by slowing down your application (which is obviously easier if it's the threaded approach).
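A minimal sketch of the queue-plus-writer-thread idea, in Python for brevity (the question is about C on Linux, so take it as a description of the structure only). `QueueLogger`, `max_pending` and the drop-when-full policy are choices made for this example; blocking the caller instead of dropping is the other option mentioned above.

```python
import io
import queue
import threading


class QueueLogger:
    """Hands log lines to one writer thread through a bounded queue."""

    def __init__(self, sink, max_pending=1000):
        self._sink = sink
        self._queue = queue.Queue(maxsize=max_pending)
        self.dropped = 0
        self._writer = threading.Thread(target=self._drain, daemon=True)
        self._writer.start()

    def log(self, message):
        try:
            self._queue.put_nowait(message)   # never blocks the caller
        except queue.Full:
            self.dropped += 1                 # approximate count; not synchronized

    def close(self):
        self._queue.put(None)                 # sentinel; waits for room if full
        self._writer.join()

    def _drain(self):
        # Only this thread touches the sink, so lines cannot interleave.
        while True:
            message = self._queue.get()
            if message is None:
                break
            self._sink.write(message + "\n")
```

Typical use is `logger = QueueLogger(sys.stdout)`, then `logger.log(...)` from any thread and `logger.close()` at shutdown.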
It's also common to have some way of categorising your logging output, so that you can have one section of your code logging at a high level, whilst another section of your code logs at a much lower level. This makes it much easier to manage the amount of output that's being written to files and offers you the option of releasing the code with the logging in it, but turned off so that it can be used for fault diagnosis when installed.
As far as I know, a critical section has less weight.
Critical section
Using critical section
If you use gcc, you could use atomic accesses. Link.
Frankly, a mutex is the only way you really want to do that, so it's always going to be slow in your case because you're using so many print statements. So, to solve your question: don't use so many printf statements; that's your problem to begin with.
Okay, is your solution using a mutex to print? Perhaps you should have a mutex to a message queue which another thread is processing to print; that has a potential hang up, but I think will be faster. So, use an active logging thread that spins waiting for incoming messages to print. The networking solution could work too, but that requires more work; try this first.
What you can do is to have one queue per thread, and have the logging thread routinely go through each of these and post the message somewhere.
This is fairly easy to set up and the amount of contention can be very low (just a pointer swap or two, which can be done w/o locking anything).
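One way to picture the queue-per-thread variant is the Python sketch below. `PerThreadLog` is a hypothetical name, and the sketch leans on the fact that `collections.deque` appends and pops are thread-safe in Python, which stands in for the lock-free pointer swap mentioned above; a C version would need its own atomic operations.

```python
import collections
import io
import threading


class PerThreadLog:
    """One buffer per producer thread, drained by a single logging thread."""

    def __init__(self):
        self._local = threading.local()
        self._buffers = []
        self._register_lock = threading.Lock()

    def log(self, message):
        buf = getattr(self._local, "buf", None)
        if buf is None:                    # first call from this thread
            buf = collections.deque()
            self._local.buf = buf
            with self._register_lock:      # taken once per thread, not per message
                self._buffers.append(buf)
        buf.append(message)                # no lock shared with other producers

    def drain(self, sink):
        """Called by the logging thread; returns the number of lines written."""
        with self._register_lock:
            buffers = list(self._buffers)
        written = 0
        for buf in buffers:
            while True:
                try:
                    message = buf.popleft()
                except IndexError:         # this thread's buffer is empty
                    break
                sink.write(message + "\n")
                written += 1
        return written
```

Order is preserved within each thread but not across threads; add a timestamp to each message if a global order matters.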
I'm using SQLite3 in a Windows application. I have the source code (so-called SQLite amalgamation).
Sometimes I have to execute heavy queries. That is, I call sqlite3_step on a prepared statement, and it takes a lot of time to complete (due to the heavy I/O load).
I wonder if there's a possibility to abort such a call. I would also be glad if there was an ability to do some background processing in the middle of the call within the same thread (since most of the time is spent in waiting for the I/O to complete).
I thought about modifying the SQLite code myself. In the simplest scenario I could check some condition (like an abort event handle for instance) before every invocation of either ReadFile/WriteFile, and return an error code appropriately. And in order to allow the background processing the file should be opened in the overlapped mode (this enables asynchronous ReadFile/WriteFile).
Is there a chance that interruption of WriteFile may in some circumstances leave the database in the inconsistent state, even with the journal enabled? I guess not, since the whole idea of the journal file is to be prepared for any error of any kind. But I'd like to hear more opinions about this.
Also, has anyone tried something similar?
Thanks in advance.
EDIT:
Thanks to ereOn. I wasn't aware of the existence of sqlite3_interrupt. This probably answers my question.
Now, for all of you who wonder how (and why) one expects to do some background processing during the I/O within the same thread.
Unfortunately not many people are familiar with so-called "Overlapped I/O".
http://en.wikipedia.org/wiki/Overlapped_I/O
Using it one issues an I/O operation asynchronously, and the calling thread is not blocked. Then one receives the I/O completion status using one of the completion mechanisms: waitable event, new routine queued into the APC, or the completion port.
Using this technique one doesn't have to create extra threads. Actually, the only real legitimate reason for creating threads is when your bottleneck is the computation time (i.e. CPU load), and the machine has several CPUs (or cores).
And creating a thread just to let it be blocked by the OS most of the time doesn't make sense. It wastes OS resources without justification and complicates the program (need for synchronization, etc.).
Unfortunately not all libraries/APIs allow an asynchronous mode of operation, thus making extra threads a necessary evil.
EDIT2:
I've already found the solution, thanks to ereOn.
For all those who nevertheless insist that it's not worth doing things "in background" while "waiting" for the I/O to complete using overlapped I/O: I disagree, and I think there's no point in arguing about this. At least this is not related to the subject.
I'm a Windows programmer (as you may have noticed), and I have very extensive experience in all kinds of multitasking. Plus I'm also a driver writer, so I also know how things work "behind the scenes".
I know that it's a "common practice" to create several threads to do several things "in parallel". But this doesn't mean that this is a good practice. Please allow me not to follow the "common practice".
I don't understand why you want the interruption to come from the same thread, and I don't even understand how that would be possible: if the current thread is blocked, waiting for some IO, you can't execute any other code. (Yeah, that's what "blocked" means.)
Perhaps if you give us more hints about why you want this, we might help further.
Usually, I use sqlite3_interrupt() to cancel calls. But this, obviously, requires that the call is made from another thread.
By default, SQLite is threadsafe. It sounds to me like the easiest thing to do would be to start the SQLite command on a background thread, and let SQLite do the necessary locking to have that work.
From your perspective then, the sqlite call looks like an asynchronous bit of I/O, and you can continue normal processing on this thread, such as e.g. using a loop including interruptible sleep and a bit of occasional background processing (e.g. to update a liveness indicator). When the SQLite statement completes, the background thread should set a state variable to indicate this, wake the main thread (if necessary), and terminate.
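The two answers above combine naturally: run the statement on a background thread and interrupt it from the calling thread. Below is a sketch using Python's standard `sqlite3` module, whose `Connection.interrupt()` wraps sqlite3_interrupt(). `run_interruptible`, `should_abort` and `poll_s` are names made up for the example, and the connection is assumed to have been opened with `check_same_thread=False` so that a worker thread may use it.

```python
import sqlite3
import threading


def run_interruptible(conn, sql, should_abort, poll_s=0.05):
    """Run sql on a worker thread; interrupt it once should_abort() is true."""
    result = {}

    def worker():
        try:
            result["rows"] = conn.execute(sql).fetchall()
        except sqlite3.OperationalError as exc:   # raised when interrupted
            result["error"] = exc

    thread = threading.Thread(target=worker)
    thread.start()
    while thread.is_alive():
        if should_abort():
            # An interrupt with no statement running is a no-op, so repeating
            # the call covers the case where the query has not started yet.
            conn.interrupt()
        # Occasional background work for the calling thread would go here.
        thread.join(poll_s)
    return result
```

The calling thread stays responsive: it wakes every `poll_s` seconds, can do its own work, and decides whether to keep waiting.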
I'm developing an application with SQLite as the database, and am having a little trouble understanding how to go about using it in multiple threads (none of the other Stack Overflow questions really helped me, unfortunately).
My use case: The database has one table, let's call it "A", which has different groups of rows (based on one of their columns). I have the "main thread" of the application which reads the contents from table A. In addition, I decide, once in a while, to update a certain group of rows. To do this, I want to spawn a new thread, delete all the rows of the group, and re-insert them (that's the only way to do it in the context of my app). This might happen to different groups at the same time, so I might have 2+ threads trying to update the database.
I'm using different transactions from each thread, I.E. at the start of every thread's update cycle, I have a begin. In fact, what each thread actually does is call "BEGIN", delete from the database all the rows it needs to "update", and inserts them again with the new values (this is the way it must be done in the context of my application).
Now, I'm trying to understand how I go about implementing this. I've tried reading around (other answers on Stack Overflow, the SQLite site) but I haven't found all the answers. Here are some things I'm wondering about:
Do I need to call "open" and create a new sqlite structure from each thread?
Do I need to add any special code for all of this, or is it enough to spawn different threads, update the rows, and that's fine (since I'm using different transactions)?
I saw something talking about the different lock types there are, and the fact that I might receive "SQLite busy" from calling certain APIs, but honestly I didn't see any reference that completely explained when I need to take all this into account. Do I need to?
If anyone can answer the questions/point me in the direction of a good resource, I'd be very grateful.
UPDATE 1: From all that I've read so far, it seems like you can't have two threads who are going to write to a database file anyway.
See: http://www.sqlite.org/lockingv3.html. In section 3.0: A RESERVED lock means that the process is planning on writing to the database file at some point in the future but that it is currently just reading from the file. Only a single RESERVED lock may be active at one time, though multiple SHARED locks can coexist with a single RESERVED lock.
Does this mean that I may as well only spawn off a single thread to update a group of rows each time? I.e., have some kind of poller thread which decides that I need to update some of the rows, and then creates a new thread to do it, but never more than one at a time? Since it looks like any other thread I create will just get SQLITE_BUSY until the first thread finishes, anyway.
Have I understood things correctly?
BTW, thanks for the answers so far, they've helped a lot.
Some steps when starting out with SQLite for multithreaded use:
Make sure sqlite is compiled with the multi threaded flag.
You must call open on your sqlite file to create a connection on each thread; don't share connections between threads.
SQLite has a very conservative threading model, when you do a write operation, which includes opening transactions that are about to do an INSERT/UPDATE/DELETE, other threads will be blocked until this operation completes.
If you don't use a transaction, then transactions are implicit, so if you start a INSERT/DELETE/UPDATE, sqlite will try to acquire an exclusive lock, and complete the operation before releasing it.
If you do a BEGIN EXCLUSIVE statement, it will acquire an exclusive lock before doing operations in that transaction. A COMMIT or ROLLBACK will release the lock.
Your sqlite3_step, sqlite3_prepare and some other calls may return SQLITE_BUSY or SQLITE_LOCKED. SQLITE_BUSY usually means that sqlite needs to acquire the lock. The biggest difference between the two return values:
SQLITE_LOCKED: if you get this from a sqlite3_step statement, you MUST call sqlite3_reset on the statement handle. You should only get this on the first call to sqlite3_step, so once reset is called you can actually "retry" your sqlite3_step call. On other operations, it's the same as SQLITE_BUSY
SQLITE_BUSY : There is no need to call sqlite3_reset, just retry your operation after waiting a bit for the lock to be released.
Check out this link. The easiest way is to do the locking yourself, and to avoid sharing the connection between threads. Another good resource can be found here, and it concludes with:
Make sure you're compiling SQLite with -DTHREADSAFE=1.
Make sure that each thread opens the database file and keeps its own sqlite structure.
Make sure you handle the likely possibility that one or more threads collide when they access the db file at the same time: handle SQLITE_BUSY appropriately.
Make sure you enclose within transactions the commands that modify the database file, like INSERT, UPDATE, DELETE, and others.
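Applied to the use case in the question (delete one group's rows and re-insert them from its own thread), the checklist could look roughly like this in Python's standard `sqlite3` module. The table `a` with columns `grp` and `val`, the function name and the retry settings are all assumptions for the sketch.

```python
import os
import sqlite3
import tempfile
import threading
import time


def replace_group(path, group, values, retries=100, wait_s=0.05):
    """Delete and re-insert one group's rows in a single transaction."""
    # Each thread opens its own connection; nothing is shared.
    conn = sqlite3.connect(path, timeout=0.1, isolation_level=None)
    try:
        for _ in range(retries):
            try:
                conn.execute("BEGIN IMMEDIATE")   # take the write lock or fail
            except sqlite3.OperationalError:      # SQLITE_BUSY: another writer
                time.sleep(wait_s)
                continue
            try:
                conn.execute("DELETE FROM a WHERE grp = ?", (group,))
                conn.executemany("INSERT INTO a (grp, val) VALUES (?, ?)",
                                 [(group, v) for v in values])
                conn.execute("COMMIT")
                return True
            except sqlite3.OperationalError:
                conn.execute("ROLLBACK")          # undo the partial update
                time.sleep(wait_s)
        return False
    finally:
        conn.close()
```

Because the delete and the inserts share one transaction, a reader on another connection sees either the old rows or the new rows for a group, never an empty group in between.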
I realize this is an old thread and the responses are good but I've been looking into this recently and came across an interesting analysis of some different implementations. Mainly it goes over the strengths and weaknesses of connection sharing, message passing, thread-local connections and connection pooling. Take a look at it here: http://dev.yorhel.nl/doc/sqlaccess
Modern versions of SQLite have thread safety enabled by default. The SQLITE_THREADSAFE compilation flag controls whether or not code is included in SQLite to enable it to operate safely in a multithreaded environment. The default value is SQLITE_THREADSAFE=1, which means Serialized mode. The documentation describes it as follows:
In this mode (which is the default when SQLite is compiled with SQLITE_THREADSAFE=1) the SQLite library will itself serialize access to database connections and prepared statements so that the application is free to use the same database connection or the same prepared statement in different threads at the same time.
Use sqlite3_threadsafe() function to check Sqlite library SQLITE_THREADSAFE compilation flag.
Default library thread safety behavior can be changed via sqlite3_config(). Use SQLITE_OPEN_NOMUTEX and SQLITE_OPEN_FULLMUTEX flags at sqlite3_open_v2() to adjust the threading mode of individual database connections.
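From a binding that does not expose sqlite3_threadsafe() directly, one way to see how the linked library was built is to read `PRAGMA compile_options`. The Python sketch below assumes the option is reported in the form `THREADSAFE=n` and returns None if no such entry is listed; `threadsafe_setting` is a made-up helper name.

```python
import sqlite3


def threadsafe_setting():
    """Return the library's THREADSAFE compile option as an int, or None."""
    conn = sqlite3.connect(":memory:")
    try:
        for (option,) in conn.execute("PRAGMA compile_options"):
            if option.startswith("THREADSAFE="):
                return int(option.split("=", 1)[1])
        return None                     # option not listed by this build
    finally:
        conn.close()
```

A value of 1 corresponds to the Serialized mode quoted above, 2 to multi-thread mode, and 0 to a build without mutexes.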
Check this code from the SQLite wiki.
I have done something similar with C and I uploaded the code here.
I hope it's useful.
Summary
Transactions in SQLite are SERIALIZABLE.
Changes made in one database connection are invisible to all other database connections prior to commit.
A query sees all changes that are completed on the same database connection prior to the start of the query, regardless of whether or not those changes have been committed.
If changes occur on the same database connection after a query starts running but before the query completes, then it is undefined whether or not the query will see those changes.
If changes occur on the same database connection after a query starts running but before the query completes, then the query might return a changed row more than once, or it might return a row that was previously deleted.
For the purposes of the previous four items, two database connections that use the same shared cache and which enable PRAGMA read_uncommitted are considered to be the same database connection, not separate database connections.
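The second and third items of this summary are easy to check with two connections to the same file. The sketch below uses Python's standard `sqlite3` module; the file name, table `t` and `demo_isolation` are invented for the example, and the default rollback-journal mode is assumed.

```python
import os
import sqlite3
import tempfile


def demo_isolation():
    path = os.path.join(tempfile.mkdtemp(), "iso.db")   # hypothetical scratch file
    writer = sqlite3.connect(path, timeout=1.0, isolation_level=None)
    reader = sqlite3.connect(path, timeout=1.0, isolation_level=None)
    writer.execute("CREATE TABLE t (x INTEGER)")

    def count(conn):
        # fetchall() runs the statement to completion, releasing its read lock.
        return conn.execute("SELECT count(*) FROM t").fetchall()[0][0]

    writer.execute("BEGIN")
    writer.execute("INSERT INTO t VALUES (1)")
    seen = {
        "writer_before_commit": count(writer),   # same connection: sees the insert
        "reader_before_commit": count(reader),   # other connection: does not
    }
    writer.execute("COMMIT")
    seen["reader_after_commit"] = count(reader)  # visible once committed
    writer.close()
    reader.close()
    return seen
```

Calling `demo_isolation()` returns the three row counts observed at each point.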
In addition to the above information on multi-threaded access, it might be worth taking a look at this page on isolation, as many things have changed since this original question and the introduction of the write-ahead log (WAL).
It seems a hybrid approach of having several connections open to the database provides adequate concurrency guarantees, trading off the expense of opening a new connection with the benefit of allowing multi-threaded write transactions.
If you use connection pooling, as in a Java EE web application, set the connection pool's maximum size to 1. Access will then be serialized.