I have a requirement to synchronize concurrent access to a shared resource that is modified by different processes running on different hosts. I am thinking of synchronizing this by creating a lock table in a SQL database, accessed through a service that can reach the database. All processes will first request the lock from the service, and only the one granted the lock will go ahead and change the shared resource. Each process will then release the lock after its computation. The lock table will hold information such as the host, pid, and lock creation time of the process currently holding the lock, so that the lock can be cleared if the current holder has died unexpectedly and some other process has requested the lock.
I am not inclined toward a ZooKeeper-based solution, as the traffic in my case is minimal (2-5 processes may run in a single day, so the probability of concurrent access is already low). For that reason I am not planning to maintain a separate service for locking, but to extend one of the existing services by adding an additional table to its database.
I would like suggestions on this approach, or on a simpler solution to this problem.
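For concreteness, here is a rough sketch of the acquire/release steps I have in mind, written against a hypothetical JDBC connection. The table and column names (resource_lock, acquired_at, and so on) are placeholders, and a real version should distinguish a duplicate-key failure from other errors:

```java
import java.sql.*;
import java.time.Instant;

// Hypothetical table:
//   CREATE TABLE resource_lock (
//     resource_name VARCHAR(64) PRIMARY KEY,
//     host VARCHAR(128),
//     pid BIGINT,
//     acquired_at TIMESTAMP
//   );
public final class DbLock {

    // Returns true if we now hold the lock. A row's presence means "locked";
    // a row older than staleSeconds is treated as left behind by a dead
    // process and is cleared before we try to take the lock ourselves.
    public static boolean tryAcquire(Connection conn, String resource,
                                     String host, long pid, int staleSeconds)
            throws SQLException {
        conn.setAutoCommit(false);
        try {
            // Clear a stale lock; the WHERE clause makes the takeover atomic.
            try (PreparedStatement del = conn.prepareStatement(
                    "DELETE FROM resource_lock WHERE resource_name = ? AND acquired_at < ?")) {
                del.setString(1, resource);
                del.setTimestamp(2, Timestamp.from(Instant.now().minusSeconds(staleSeconds)));
                del.executeUpdate();
            }
            // Insert our own row; the primary key rejects a second holder.
            try (PreparedStatement ins = conn.prepareStatement(
                    "INSERT INTO resource_lock (resource_name, host, pid, acquired_at) "
                  + "VALUES (?, ?, ?, CURRENT_TIMESTAMP)")) {
                ins.setString(1, resource);
                ins.setString(2, host);
                ins.setLong(3, pid);
                ins.executeUpdate();
            }
            conn.commit();
            return true;
        } catch (SQLException e) {
            // A duplicate-key error means someone else holds the lock; a real
            // version should inspect the SQLState rather than assuming that.
            conn.rollback();
            return false;
        }
    }

    // Release only our own row (the lock may have been taken over if we were
    // considered stale). Assumes the same connection, still in manual-commit
    // mode from tryAcquire.
    public static void release(Connection conn, String resource,
                               String host, long pid) throws SQLException {
        try (PreparedStatement del = conn.prepareStatement(
                "DELETE FROM resource_lock WHERE resource_name = ? AND host = ? AND pid = ?")) {
            del.setString(1, resource);
            del.setString(2, host);
            del.setLong(3, pid);
            del.executeUpdate();
        }
        conn.commit();
    }
}
```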
In our application, a user logged in as admin can perform any operation. Suppose one admin is modifying a route while, at the same time, a second admin looks up the same route and creates an airwaybill for it. This would be a problem. I could not find out how my application handles these concurrent requests.
(We are simply using JDBC transactions.)
I am getting different answers from my team:
1. The web/application server handles these transactions and will handle concurrent requests without any issues.
2. Locks will be taken on rows in the database, and there won't be any problem with concurrent requests.
Bottom line: should concurrent requests be handled in code? Should we apply some setting in the web/application server for concurrent requests when deploying? Or will the database handle concurrent requests by default through its row-locking mechanism?
If anyone knows where to find the solution, please let me know.
As far as I'm aware, most database engines use some kind of locking during queries, but it differs depending on the engine. I know that InnoDB enforces transaction atomicity (see this Stack Exchange thread), so anything that is wrapped in a transaction won't be interfered with mid-execution. However, there is no guarantee as to which request will reach the database first.
As for the web server/application server, assuming you're using a threaded web server (Apache Tomcat, Jetty, etc.), each request is handled by a separate thread, so I would assume there is no inherent thread safety. In the majority of cases the database will handle your concurrency without complaining; however, I would recommend including some kind of serialisation on the application end in case you decide to change the DB implementation somewhere down the road. You will also have more control over how requests are handled.
In short: do both.
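For the application-end serialisation, one possible sketch is a per-key lock, so that two requests touching the same route are processed one at a time within a single JVM. The class and method names here are invented, and note that this does not help if the application runs on several servers; for that you still need the database's locking:

```java
import java.util.concurrent.ConcurrentHashMap;

public class RouteSerializer {
    // One lock object per route id; computeIfAbsent guarantees that all
    // threads asking about the same route get the same lock object.
    private final ConcurrentHashMap<String, Object> locks = new ConcurrentHashMap<>();

    public void withRouteLock(String routeId, Runnable dbWork) {
        Object lock = locks.computeIfAbsent(routeId, k -> new Object());
        synchronized (lock) {
            dbWork.run(); // the JDBC transaction runs here, serialised per route
        }
    }
}
```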
As far as I know, most databases have some kind of locking during transactions and queries, but you should check your database's references to confirm the type of locking it uses.
As for your web server: I know that Tomcat handles requests concurrently and offers some thread safety for its own resources, but it offers no thread safety for your application, so you have to provide that yourself.
For the problem you mentioned above, I think that when you are accessing your route you should query the database to check whether it still exists. Also, when one admin is modifying the route, you can take some sort of lock around that block, so that when the other admin wants to access the route being modified at the same time, he waits for the transaction to complete. If you are using Java on the server side, I recommend looking at Java's synchronization methods; for another language, check that language's locking and thread-safety mechanisms.
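To make the database side concrete, here is a hedged sketch of pessimistic row locking with plain JDBC. SELECT ... FOR UPDATE (supported by InnoDB and most other engines, though the exact syntax varies by database) locks the route row inside the transaction, so a concurrent admin's transaction blocks until this one commits. The table and column names are invented for the example:

```java
import java.sql.*;

public class AirwaybillDao {

    public void createAirwaybill(Connection conn, long routeId) throws SQLException {
        conn.setAutoCommit(false);
        try {
            // Lock the route row and verify it still exists and is active.
            try (PreparedStatement ps = conn.prepareStatement(
                    "SELECT active FROM route WHERE route_id = ? FOR UPDATE")) {
                ps.setLong(1, routeId);
                try (ResultSet rs = ps.executeQuery()) {
                    if (!rs.next() || !rs.getBoolean("active")) {
                        throw new IllegalStateException("Route no longer available");
                    }
                }
            }
            // The route row stays locked until commit, so a concurrent
            // modification of the same route waits for us.
            try (PreparedStatement ins = conn.prepareStatement(
                    "INSERT INTO airwaybill (route_id) VALUES (?)")) {
                ins.setLong(1, routeId);
                ins.executeUpdate();
            }
            conn.commit();
        } catch (SQLException | RuntimeException e) {
            conn.rollback();
            throw e;
        }
    }
}
```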
I am developing a multi-threaded application and using Cassandra for the back-end.
Earlier, I created a separate session for each child thread and closed the session before killing the thread after its execution. But then I thought that might be expensive, so I have now redesigned it: I open a single session at server start-up, and any number of clients can use that session for querying purposes.
Question: I just want to know if this is correct, or is there a better way to do this? I know connection pooling is an option, but is that really needed in this scenario?
It's certainly thread safe in the Java driver, so I assume the C++ driver is the same.
You are encouraged to only create one session and have all your threads use it so that the driver can efficiently maintain a connection pool to the cluster and process commands from your client threads asynchronously.
If you create multiple sessions on one client machine or keep opening and closing sessions, you would be forcing the driver to keep making and dropping connections to the cluster, which is wasteful of resources.
Quoting this Datastax blog post about 4 simple rules when using the DataStax drivers for Cassandra:
1. Use one Cluster instance per (physical) cluster (per application lifetime).
2. Use at most one Session per keyspace, or use a single Session and explicitly specify the keyspace in your queries.
3. If you execute a statement more than once, consider using a PreparedStatement.
4. You can reduce the number of network roundtrips and also have atomic operations by using Batches.
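As a rough illustration of those rules using the classic (3.x) com.datastax.driver.core API (the contact point, keyspace, and table names are placeholders; the newer 4.x driver merges Cluster and Session into a single CqlSession):

```java
import com.datastax.driver.core.*;

public class CassandraClient {
    // Rule 1: one Cluster instance for the whole application lifetime.
    private final Cluster cluster = Cluster.builder()
            .addContactPoint("127.0.0.1")
            .build();

    // Rule 2: one Session, shared by all threads; the keyspace is specified
    // explicitly in the queries instead.
    private final Session session = cluster.connect();

    // Rule 3: prepare once, reuse for every execution.
    private final PreparedStatement byId = session.prepare(
            "SELECT * FROM mykeyspace.users WHERE id = ?");

    // Safe to call from many threads; the driver pools connections and
    // multiplexes requests across them.
    public Row findUser(java.util.UUID id) {
        return session.execute(byId.bind(id)).one();
    }

    public void close() {
        cluster.close(); // also closes the session and its pooled connections
    }
}
```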
The C/C++ driver is definitely thread safe at the session and future levels.
The CassSession object is used for query execution. Internally, a session object also manages a pool of client connections to Cassandra and uses a load balancing policy to distribute requests across those connections. An application should create a single session object per keyspace as a session object is designed to be created once, reused, and shared by multiple threads within the application.
They actually have a section called Thread Safety:
A CassSession is designed to be used concurrently from multiple threads. CassFuture is also thread safe. Other than these exclusions, in general, functions that might modify an object’s state are NOT thread safe. Objects that are immutable (marked ‘const’) can be read safely by multiple threads.
They also have a note about freeing objects: that is not thread safe, so you have to make sure all your threads are done before you free objects:
NOTE: The object/resource free-ing functions (e.g. cass_cluster_free, cass_session_free, … cass_*_free) cannot be called concurrently on the same instance of an object.
Source:
http://datastax.github.io/cpp-driver/topics/
There are many examples on the net of creating a simple thread pool, such as Sample1 and Sample2.
What I want to implement, though, is a separate thread pool for different kinds of tasks. For example, the app may have one pool of threads for processing incoming TCP connections (let's call this the network pool) and another pool for talking to a database (the database pool).
These incoming TCP requests might need information from the database. In that case, they would need to ask the threads from the database pool to perform the query and return the result asynchronously.
Is there a recommended way to do this using boost::asio? Would it mean having one instance of io_service for each pool? And how should those threads communicate with each other (using Boost)?
I understand that explaining all this means the code won't be short and trivial, but if possible some sort of pseudocode would be much appreciated.
Thanks!
Communication between threads/thread pools should go through thread-safe queues.
In your example, you would have a networking thread pool for handling network connections, a process pool for executing the network requests, and a database connection/thread pool (one pool per database, one thread per database connection, though you could have multiple connections to the same database).
You would also need thread-safe queues: one for the network pool, one for the process pool, and one for each of the database pools.
Say you have a network request that needs to get information from the database. You would receive the request while executing on a network thread, and append the handler for the request onto the process queue.
The process handler (in a process thread) would see that the request needs something from the database, and so it would append a database request as well as a callback handler onto the appropriate database queue.
The appropriate database thread would pick up the request from the database queue, execute the query, get the results back, and add the results to the callback handler. The callback handler object with the database results would then be pushed onto the process queue.
The callback handler (in a process thread) would then continue executing the request, and possibly package a response message, which is then pushed onto the network queue.
The network handler (in a network thread) would then pick up the response message and deliver it (encoding as necessary).
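The handoff itself is language-agnostic; compressed into Java's BlockingQueue for brevity (in C++ you would substitute a thread-safe queue like the one linked below, and the "query" here is just a stand-in string), the whole round trip looks something like this:

```java
import java.util.concurrent.*;

public class PipelineDemo {
    static final BlockingQueue<Runnable> networkQueue  = new LinkedBlockingQueue<>();
    static final BlockingQueue<Runnable> processQueue  = new LinkedBlockingQueue<>();
    static final BlockingQueue<Runnable> databaseQueue = new LinkedBlockingQueue<>();

    // Each pool is just a set of threads draining its own queue.
    static void startPool(BlockingQueue<Runnable> queue, int threads) {
        for (int i = 0; i < threads; i++) {
            Thread t = new Thread(() -> {
                try {
                    while (true) queue.take().run();
                } catch (InterruptedException ignored) { }
            });
            t.setDaemon(true);
            t.start();
        }
    }

    // Called from a network thread when a request arrives.
    static void onNetworkRequest(String request) {
        processQueue.add(() ->                         // hop to a process thread
            databaseQueue.add(() -> {                  // hop to a database thread
                String result = "rows for " + request; // stand-in for the real query
                processQueue.add(() ->                 // callback on a process thread
                    networkQueue.add(() ->             // response on a network thread
                        System.out.println("send: " + result)));
            }));
    }

    public static void main(String[] args) throws InterruptedException {
        startPool(networkQueue, 2);
        startPool(processQueue, 4);
        startPool(databaseQueue, 1);   // one thread per database connection
        onNetworkRequest("GET /users");
        Thread.sleep(200);             // let the pipeline drain before exiting
    }
}
```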
An example of a thread-safe queue can be found here.
Albeit a little complicated, you can see an implementation of an application server that can handle what you're talking about here, although it may be overkill for what you're trying to do. The source code is fairly well documented, so you should be able to follow it and see what it's doing.
My example uses Boost.Asio (see the TCP Connection implementation within that same system), but it does not use a Boost io_service for handlers.
I have an API which opens an Access database for reading and writing. The API opens the connection when it is constructed and closes the connection when it is destructed. When the database is opened, an .ldb file is created, and when it is closed the file is removed (or disappears).
There are multiple applications using the API to read from and write to the Access database. I want to know:
1. Is the .ldb file used to track multiple connections?
2. Does calling db.close() close all connections or just one instance?
3. Will there be any sync issues with the above approach?
db.Close() closes one connection. The .ldb file is automatically removed when all connections are closed.
Keep in mind that while Jet databases (i.e. Access) do support multiple simultaneous users, they're not extremely well suited for a very large concurrent user base; for one thing, they are easily corrupted when there are network issues. I'm actually dealing with that right now. If it comes to that, you will want to use a database server.
That said, I've used Jet databases in that way many times.
Not sure what you mean when you say "sync issues".
Yes, it is required for opening the database in shared mode with multiple users. It seems to stand for "Lock Database". See more info in MSDN: Introduction to .ldb files in Access 2000.
Close() closes only one connection; others are unaffected.
Yes, it is possible if you try to write records that another user has locked. However, the data will remain consistent; you will just receive an error about a write conflict.
Actually, MS Access is not the best solution for a multi-connection usage scenario.
You may take a look at SQL Server Compact, which is a light version of MS SQL Server. It runs in-process and supports multiple connections, multithreading, most of the robust T-SQL features (excluding stored procedures), etc.
As an additional note to otherwise good answers, I would strongly recommend keeping a connection to a dummy table open for the lifetime of the client application.
Closing connections too often and allowing the lock file to be created/deleted every time is a huge performance bottleneck and, in some cases of rapid access to the database, can actually cause queries and inserts to fail.
You can read a bit more in this answer I gave a while ago.
When it comes to performance and reliability, you can get quite a lot out of Access databases providing that you keep some things in mind:
Keep a connection to a dummy table open for the duration of the life of the client (or at least use some timeout that closes the connection after, say, 20 seconds of inactivity if you don't want to keep it open all the time); see the sketch after this list.
Engineer your client apps to properly close all connections (including the dummy one when it's time to do so), whatever happens (e.g. crash, user shutdown, etc.).
Leaving locks in place is not good, as it could mean that the client has left the database in an unknown state, and could increase the likelihood of corruption if other clients keep leaving stale locks.
Compact and repair the database regularly. Make it a nightly task.
This will ensure that the database is optimised, and that any stale data is removed and open locks properly closed.
Good, stable network connectivity is paramount to data integrity for a file-based database: avoid WiFi like the plague.
Have a way to kick out all clients from the database server itself.
For instance, have a table with a MaintenanceLock field that clients poll regularly (also shown in the sketch after this list). If the field is set, the client should disconnect, after giving the user an opportunity to save their work.
Similarly, when a client app starts, check this field in the database to allow or disallow the client to connect to it.
Now you can kick out clients at any time without having to go to each user and ask them to close the app. It's also very useful for ensuring that no clients left open overnight are still connected to the database when you run Compact & Repair maintenance on it.
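As a hedged sketch of the dummy-connection and MaintenanceLock ideas above, here is what a client might do from Java using the UCanAccess JDBC driver (one way to reach an .mdb/.accdb file; the DummyTable and Maintenance table names are placeholders you would create in the database yourself):

```java
import java.sql.*;

public class AccessClient {
    private Connection dummy; // held open for the lifetime of the client app

    public void start(String dbPath) throws SQLException {
        // Keeping this connection open keeps the .ldb lock file alive and
        // avoids the create/delete churn described above.
        dummy = DriverManager.getConnection("jdbc:ucanaccess://" + dbPath);
        try (Statement st = dummy.createStatement();
             ResultSet rs = st.executeQuery("SELECT 1 FROM DummyTable")) {
            // Touch the dummy table once so the connection is really live.
        }
    }

    // Poll this regularly (e.g. every 30 seconds from a timer); if it
    // returns true, prompt the user to save, then call shutdown().
    public boolean maintenanceRequested() throws SQLException {
        try (Statement st = dummy.createStatement();
             ResultSet rs = st.executeQuery("SELECT MaintenanceLock FROM Maintenance")) {
            return rs.next() && rs.getBoolean("MaintenanceLock");
        }
    }

    public void shutdown() throws SQLException {
        if (dummy != null) dummy.close(); // always release the lock file
    }
}
```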
Google's Chubby distributed lock manager has a feature called "sequencers" that I would like to emulate using ZooKeeper. Is there a known good way to do so?
A sequencer works as follows:
1. The client acquires a lock on a resource.
2. The client requests a sequencer for its lock, which is a string with some metadata.
3. The client makes a call to a service and passes the sequencer as a parameter.
4. The service uses the sequencer to verify that the client still holds the lock before processing the request.
The goal is to prevent a situation where a client dies after making a call to a remote service which must be protected by a lock.
The main paper on Chubby is available at http://research.google.com/archive/chubby.html. Sequencers are discussed in section 2.4.
Thanks!
The ZooKeeper lock recipes all involve the locking process creating a sequential ephemeral znode. The name of the sequential ephemeral znode will be unique, and the znode will cease to exist if the locker's session expires because the locker did not send a valid heartbeat within the timeout.
So the locking process just needs to pass the name of the sequential ephemeral znode it created while locking to the remote service, and the remote service can check the existence of the znode before processing.
You can even have the remote service add a watch on the znode and be notified when the znode is removed.
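A rough sketch of the server-side check using the plain ZooKeeper Java client (connection setup omitted; znodePath is whatever the client passed along with its request as its "sequencer"):

```java
import org.apache.zookeeper.*;
import org.apache.zookeeper.data.Stat;

public class SequencerCheck {
    private final ZooKeeper zk;

    public SequencerCheck(ZooKeeper zk) {
        this.zk = zk;
    }

    public void handle(String znodePath, Runnable work) throws Exception {
        // The ephemeral znode exists only while the lock holder's session is
        // alive, so a non-null Stat means the client still holds the lock.
        Stat stat = zk.exists(znodePath, event -> {
            if (event.getType() == Watcher.Event.EventType.NodeDeleted) {
                // The lock was lost mid-request; abort or compensate here.
                System.err.println("lock released: " + event.getPath());
            }
        });
        if (stat == null) {
            throw new IllegalStateException("caller no longer holds the lock");
        }
        work.run(); // process while the watch guards against losing the lock
    }
}
```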