How to implement Chubby-style lock sequencers with ZooKeeper? - concurrency

Google's Chubby distributed lock manager has a feature called "sequencers" that I would like to emulate using ZooKeeper. Is there a known good way to do so?
A sequencer works as follows:
Client acquires a lock on a resource
Client requests a sequencer for its lock, which is a string with some metadata
Client makes a call to a service and passes the sequencer as a parameter
The service uses the sequencer to verify that the client still holds the lock before processing the request
The goal is to prevent a situation where a client dies after making a call to a remote service which must be protected by a lock.
The main paper on Chubby is available at http://research.google.com/archive/chubby.html. Sequencers are discussed in section 2.4.
Thanks!

The ZooKeeper lock recipes all involve the locking process creating a sequential ephemeral znode. The name of the sequential ephemeral znode is unique, and the znode ceases to exist if the locker's session expires because the locker did not send a valid heartbeat within the timeout.
So the locking process just needs to pass the name of the sequential ephemeral znode it created while locking to the remote service, and the remote service can check that the znode still exists before processing.
You can even have the remote service set a watch on the znode, so that it is notified when the znode is removed.
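A minimal sketch of that idea using the ZooKeeper C client (usable from C++); connection setup via zookeeper_init, lock-ordering checks, and the znode paths are assumptions for illustration, not part of any standard recipe:

```cpp
#include <zookeeper/zookeeper.h>
#include <string>

// Client side: acquire the lock by creating an ephemeral sequential znode and
// use its unique name as the "sequencer" passed to the remote service.
// Assumes /locks/my-resource already exists; ordering checks are omitted.
std::string acquire_sequencer(zhandle_t* zh) {
    char created_path[512];
    int rc = zoo_create(zh, "/locks/my-resource/lock-", nullptr, -1,
                        &ZOO_OPEN_ACL_UNSAFE,
                        ZOO_EPHEMERAL | ZOO_SEQUENCE,
                        created_path, sizeof(created_path));
    if (rc != ZOK) return "";
    return created_path;  // e.g. /locks/my-resource/lock-0000000042
}

// Service side: before doing the protected work, verify that the znode named
// by the sequencer still exists (i.e. the caller's session is still alive).
// Passing 1 as the watch flag also registers the default watcher, so the
// service is notified if the lock is lost while the request is in flight.
bool sequencer_still_valid(zhandle_t* zh, const std::string& sequencer) {
    struct Stat stat;
    return zoo_exists(zh, sequencer.c_str(), 1 /* watch */, &stat) == ZOK;
}
```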

Related

Lock service using SQL DataBase

I have a requirement to synchronize concurrent access to a shared resource that is modified by different processes running on different hosts. I am thinking of synchronizing this by creating a lock table in a SQL database that is accessible from a service. All the processes will first request the lock from the service, and only the one granted the lock will go forward and change the shared resource; it will then release the lock after its computation. The lock table will hold information such as the host, PID, and lock creation time of the process currently holding the lock, so that the lock can be cleared if the current holder has died unexpectedly and some other process has requested the lock.
I am not inclined toward a ZooKeeper-based solution, as the traffic in my case is minimal (2-5 processes may run in a single day, so the probability of concurrent access is already low), and so I am not planning to maintain a separate service for locking but rather to extend one of the existing services by adding an additional table to its database.
I would like suggestions on this approach, or on a simpler solution to this problem.
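For concreteness, the core of such a lock table is an atomic conditional UPDATE plus a staleness check. The sketch below uses SQLite purely to stay self-contained (the same statements apply to whatever server database the shared service uses); the table name, columns, and 10-minute stale timeout are made-up illustrations:

```cpp
#include <sqlite3.h>
#include <string>

// Try to take the lock atomically: the UPDATE only succeeds if the row is
// currently free or the previous holder's lock is older than the timeout.
// Returns true if this process now holds the lock.
bool try_acquire_lock(sqlite3* db, const std::string& host, int pid) {
    const char* sql =
        "UPDATE locks SET holder_host = ?1, holder_pid = ?2, "
        "       acquired_at = strftime('%s','now') "
        "WHERE resource = 'shared-resource' "
        "  AND (holder_host IS NULL "
        "       OR strftime('%s','now') - acquired_at > 600);";  // 10 min stale timeout
    sqlite3_stmt* stmt = nullptr;
    if (sqlite3_prepare_v2(db, sql, -1, &stmt, nullptr) != SQLITE_OK) return false;
    sqlite3_bind_text(stmt, 1, host.c_str(), -1, SQLITE_TRANSIENT);
    sqlite3_bind_int(stmt, 2, pid);
    bool ok = sqlite3_step(stmt) == SQLITE_DONE && sqlite3_changes(db) == 1;
    sqlite3_finalize(stmt);
    return ok;  // exactly one row updated => we own the lock
}

// Release by clearing the holder columns, but only if we still hold the lock.
void release_lock(sqlite3* db, const std::string& host, int pid) {
    const char* sql =
        "UPDATE locks SET holder_host = NULL, holder_pid = NULL, acquired_at = NULL "
        "WHERE resource = 'shared-resource' AND holder_host = ?1 AND holder_pid = ?2;";
    sqlite3_stmt* stmt = nullptr;
    if (sqlite3_prepare_v2(db, sql, -1, &stmt, nullptr) == SQLITE_OK) {
        sqlite3_bind_text(stmt, 1, host.c_str(), -1, SQLITE_TRANSIENT);
        sqlite3_bind_int(stmt, 2, pid);
        sqlite3_step(stmt);
        sqlite3_finalize(stmt);
    }
}
```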

C++ gRPC thread number configuration

Does the gRPC server/client have any concept of a thread pool for connections? So that it is possible to reuse threads, pre-allocate threads, queue requests when a thread limit is reached, etc.
If not, how does it work? Does it just allocate/destroy a thread whenever it needs one, without any limit and/or reuse? If so, is it possible to configure it?
It depends on whether you are using the sync or async API.
For a sync client, your RPC call blocks the calling thread, so it is not really relevant. For a sync server, there is an internal threadpool handling all the incoming requests; you can use a grpc::ResourceQuota on a ServerBuilder to limit the max number of threads used by the threadpool.
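For the sync server case, that might look roughly like the following sketch; the port, quota name, thread cap, and the registered service are placeholders, not a complete server:

```cpp
#include <grpcpp/grpcpp.h>
#include <grpcpp/resource_quota.h>
#include <memory>

// Start a synchronous server whose internal threadpool is capped via a
// ResourceQuota. `service` is whatever generated service implementation
// you register; the address and the limit of 16 are illustrative.
void run_sync_server(grpc::Service* service) {
    grpc::ServerBuilder builder;
    builder.AddListeningPort("0.0.0.0:50051", grpc::InsecureServerCredentials());
    builder.RegisterService(service);

    grpc::ResourceQuota quota("sync_server_quota");
    quota.SetMaxThreads(16);            // cap on threads used by the sync threadpool
    builder.SetResourceQuota(quota);

    std::unique_ptr<grpc::Server> server = builder.BuildAndStart();
    server->Wait();
}
```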
For an async client and server, gRPC uses CompletionQueue as a way for users to define their own threading model. A common way of building clients and servers is to use a user-provided threadpool and run CompletionQueue::Next in each thread. Then, once a tag comes back from the Next call, you can cast it to a user-defined type and run methods on it to advance the state. In this case, the user has full control of the threads being used.
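The user-owned polling threads might look like this skeleton; the CallState type and its Proceed method are application-defined assumptions of the sketch (gRPC only hands back the opaque tag), and only the CompletionQueue::Next loop itself comes from the API:

```cpp
#include <grpcpp/grpcpp.h>
#include <thread>
#include <vector>

// Application-defined base for per-RPC state machines; each tag on the
// completion queue is assumed to point at one of these.
struct CallState {
    virtual ~CallState() = default;
    virtual void Proceed(bool ok) = 0;  // advance the RPC state machine
};

// Drain one completion queue from a fixed pool of user-owned threads,
// so the application fully controls how many threads do RPC work.
void run_polling_threads(grpc::CompletionQueue& cq, unsigned num_threads) {
    std::vector<std::thread> pool;
    for (unsigned i = 0; i < num_threads; ++i) {
        pool.emplace_back([&cq] {
            void* tag = nullptr;
            bool ok = false;
            while (cq.Next(&tag, &ok)) {            // blocks until an event is ready
                static_cast<CallState*>(tag)->Proceed(ok);
            }
        });
    }
    for (auto& t : pool) t.join();  // Next() returns false after cq.Shutdown()
}
```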
Note that gRPC does create some internal threads, but they should not be used for the majority of the RPC work.

Is a cassandra session thread safe? (using cpp driver)

I am developing a multi-threaded application and using Cassandra for the back-end.
Earlier, I created a separate session for each child thread and closed the session before killing the thread after its execution. But then I thought this might be expensive, so I redesigned it: I now open a single session at server start-up, and any number of clients can use that session for querying purposes.
Question: I just want to know if this is correct, or whether there is a better way to do it. I know connection pooling is an option, but is that really needed in this scenario?
It's certainly thread safe in the Java driver, so I assume the C++ driver is the same.
You are encouraged to only create one session and have all your threads use it so that the driver can efficiently maintain a connection pool to the cluster and process commands from your client threads asynchronously.
If you create multiple sessions on one client machine or keep opening and closing sessions, you would be forcing the driver to keep making and dropping connections to the cluster, which is wasteful of resources.
Quoting the DataStax blog post about 4 simple rules when using the DataStax drivers for Cassandra:
Use one Cluster instance per (physical) cluster (per application lifetime)
Use at most one Session per keyspace, or use a single Session and explicitly specify the keyspace in your queries
If you execute a statement more than once, consider using a PreparedStatement
You can reduce the number of network roundtrips and also have atomic operations by using Batches
The C/C++ driver is definitely thread safe at the session and future levels.
The CassSession object is used for query execution. Internally, a session object also manages a pool of client connections to Cassandra and uses a load balancing policy to distribute requests across those connections. An application should create a single session object per keyspace as a session object is designed to be created once, reused, and shared by multiple threads within the application.
They actually have a section called Thread Safety:
A CassSession is designed to be used concurrently from multiple threads. CassFuture is also thread safe. Other than these exclusions, in general, functions that might modify an object’s state are NOT thread safe. Objects that are immutable (marked ‘const’) can be read safely by multiple threads.
They also have a note about freeing objects. That is not thread safe. So you have to make sure all your threads are done before you free objects:
NOTE: The object/resource free-ing functions (e.g. cass_cluster_free, cass_session_free, … cass_*_free) cannot be called concurrently on the same instance of an object.
Source:
http://datastax.github.io/cpp-driver/topics/
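To make the single-shared-session pattern concrete, here is a rough sketch with the C/C++ driver; the contact point, query, thread count, and omitted error handling are placeholders:

```cpp
#include <cassandra.h>
#include <thread>
#include <vector>

// Run one query on the shared session; CassSession is safe to use from
// many threads concurrently, so no locking is needed around execute.
void run_query(CassSession* session) {
    CassStatement* statement =
        cass_statement_new("SELECT release_version FROM system.local", 0);
    CassFuture* result_future = cass_session_execute(session, statement);
    cass_future_wait(result_future);                 // CassFuture is thread safe too
    // ... inspect cass_future_error_code / cass_future_get_result here ...
    cass_future_free(result_future);
    cass_statement_free(statement);
}

int main() {
    CassCluster* cluster = cass_cluster_new();
    cass_cluster_set_contact_points(cluster, "127.0.0.1");  // placeholder contact point

    CassSession* session = cass_session_new();
    CassFuture* connect_future = cass_session_connect(session, cluster);
    cass_future_wait(connect_future);
    cass_future_free(connect_future);

    // One session, shared by all worker threads for the lifetime of the process.
    std::vector<std::thread> workers;
    for (int i = 0; i < 4; ++i) workers.emplace_back(run_query, session);
    for (auto& t : workers) t.join();

    // Free objects only after every thread is done with them (see the note above).
    cass_session_free(session);
    cass_cluster_free(cluster);
    return 0;
}
```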

boost asio: different thread pool for different tasks

There are many examples on the net about creating a simple thread pool such as Sample1 and Sample2
What I wanted to implement though is to have a separate thread pool for different tasks. For example, the app may have a pool of threads for processing incoming tcp connections (let's call this the network pool), while another pool for talking to a database (database pool).
These incoming TCP requests might want information from the database. In this case, the request will need to ask the threads in the database pool to perform the query and return the result asynchronously.
Is there a recommended way to do so using boost::asio? Would it be having one instance of io_service for each pool? And how should those threads communicate with each other (using boost)?
I understand that to explain all of this, the code won't be short or trivial, but some sort of pseudo code, if possible, would be much appreciated.
Thanks!
Communication between threads / thread pools should be through thread-safe queues.
In your example, you should have a networking thread pool for handling network connections, a process pool for executing the network requests, and a database connection / thread pool (one pool per database; one thread per database connection, but possibly you could have multiple connections to the same database).
You would also need thread-safe queues: one for the network pool, one for the process pool, and one for each of the database pools.
Say you have a network request that needs to get information from the database. You would receive the request while executing on a network thread, and append the handler for the request onto the process queue.
The process handler (in a process thread) would see that the request needs something from the database, and so it would append a database request as well as a callback handler onto the appropriate database queue.
The appropriate database thread would pick up the request from the database queue, execute the query, get the results back, and add the results to the callback handler. The callback handler object with the database results would then be pushed onto the process queue.
The callback handler (in a process thread) would then continue executing the request, and possibly package a response message, which is then pushed onto the network queue.
The network handler (in a network thread) would then pick up the response message and deliver it (encoding as necessary).
An example of a thread safe queue can be found here.
Albeit a little complicated, you can see an implementation of an application server that can handle what you're talking about here, although it may be overkill for what you're trying to do. The source code is fairly well documented so you should be able to follow it and see what it's doing.
My example uses boost for asio (see the TCP Connection implementation within that same system), but it does not use boost io_service for handlers.
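For the io_service-per-pool variant the question asks about, a rough sketch follows (using the newer io_context name, with a work guard per pool so run() doesn't return early); the pool sizes, the handler bodies, and the crude end-of-main sleep are assumptions of the sketch, not a production pattern:

```cpp
#include <boost/asio.hpp>
#include <chrono>
#include <thread>
#include <vector>

// One io_context per pool; each pool runs its context on several threads.
struct Pool {
    boost::asio::io_context io;
    boost::asio::executor_work_guard<boost::asio::io_context::executor_type> guard{
        boost::asio::make_work_guard(io)};
    std::vector<std::thread> threads;

    explicit Pool(unsigned n) {
        for (unsigned i = 0; i < n; ++i)
            threads.emplace_back([this] { io.run(); });
    }
    ~Pool() {
        guard.reset();                  // let run() return once queued work drains
        for (auto& t : threads) t.join();
    }
};

int main() {
    Pool network_pool(2);   // handles TCP connections
    Pool database_pool(4);  // handles database queries

    // A network handler that needs the database posts work to the database
    // pool and gets the result back by posting the continuation to itself.
    boost::asio::post(network_pool.io, [&] {
        // ... parse the incoming request on a network thread ...
        boost::asio::post(database_pool.io, [&] {
            int result = 42;  // placeholder for the real query
            boost::asio::post(network_pool.io, [result] {
                // ... write the response back on a network thread ...
                (void)result;
            });
        });
    });

    // Crude wait so the demo chain finishes before the pools are torn down;
    // a real program would track completion properly.
    std::this_thread::sleep_for(std::chrono::seconds(1));
}
```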

How to design a client server architecture

I would like to know the server (TCP based) architecture to support a large number of clients (at least 10K) to implement a FIX server. My points are:
How do we design it?
How to listen on the open port? Use select or poll or some other function?
How to process the responses from the clients? At that scale we cannot create one thread per client.
Should the processing of responses be done in a different executable, sharing the requests and responses with the server executable through IPC?
There is much more to it. I would appreciate it if anyone could explain it or provide a link.
Thanks
An excellent resource for information on this topic is The C10K problem. Although the numbers there seem a little dated, the techniques are still applicable today.
The architecture depends on what you want to do with the clients' incoming data. My guess is that for every incoming message you would perform some computations and probably also return a response.
In that case I would create 1 main listener thread that receives all the incoming messages (Actually, if your hardware has more than 1 physical network device, I would use a listener thread per device and make sure each one is listening to a specific device).
Get the number of CPUs on your machine, create a worker thread for each CPU, and bind each thread to one CPU (maybe the number of worker threads should be num_of_cpu-1, to leave an available CPU for the listener and dispatcher).
Each worker thread has a queue and a semaphore; the main listener thread just pushes the incoming data onto those queues. There are many ways to perform load balancing (more on that later).
Each worker thread just works on the requests given to it and puts the response on another queue that is read by the dispatcher.
The dispatcher - there are 2 options here: use a dedicated thread for the dispatcher (or a thread per network device, as for the listeners), or have the dispatcher actually be the same thread as the listener.
There is some advantage to putting both on the same thread, since it makes it easier to detect lost socket connections and to use the same fds for both reading and writing without thread synchronization. However, it could be that using 2 different threads gives better performance; it needs to be tested.
Note about load balancing:
This is a topic of its own.
The simplest thing is to use one queue for all worker threads, but the problem is that they have to lock in order to pop items, and the locking can hurt performance (though you get the most balanced load).
Another quite simple approach would be to have a private queue for every worker and perform round-robin when inserting. After every X cycles, check the size of all the queues. If some queues are much larger than others, then leave them out for the next X cycles and recheck them again. This is not the best approach, but it is simple to implement and gives some load balancing while no locking is needed.
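A condensed sketch of that listener/worker split with per-worker queues and round-robin insertion (the socket layer and request handling are stubbed out; the queue type, worker count, and string "requests" are arbitrary choices for illustration):

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <vector>

// Minimal blocking queue; each worker owns one, so workers never contend
// with each other, only with the listener pushing into their queue.
struct RequestQueue {
    std::queue<std::string> q;
    std::mutex m;
    std::condition_variable cv;

    void push(std::string req) {
        { std::lock_guard<std::mutex> lk(m); q.push(std::move(req)); }
        cv.notify_one();
    }
    std::string pop() {
        std::unique_lock<std::mutex> lk(m);
        cv.wait(lk, [this] { return !q.empty(); });
        std::string req = std::move(q.front());
        q.pop();
        return req;
    }
};

int main() {
    const unsigned hc = std::thread::hardware_concurrency();
    const unsigned num_workers = hc > 1 ? hc - 1 : 1;  // leave a CPU for the listener
    std::vector<RequestQueue> queues(num_workers);

    // Worker threads: each drains only its own queue and "processes" requests.
    std::vector<std::thread> workers;
    for (unsigned i = 0; i < num_workers; ++i)
        workers.emplace_back([&queues, i] {
            for (;;) {
                std::string req = queues[i].pop();
                // ... handle the request, push the response to a dispatcher queue ...
            }
        });

    // Listener stand-in: round-robin incoming "requests" across the workers.
    unsigned next = 0;
    for (int n = 0; n < 100; ++n) {                 // placeholder for a socket accept/read loop
        queues[next].push("request " + std::to_string(n));
        next = (next + 1) % num_workers;
    }

    for (auto& t : workers) t.join();               // never returns in this sketch
}
```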
By the way, there is a way to implement a queue between 2 threads without blocking, but that is also another topic.
I hope it helps,
Guy
If the client and server are on a secure network, then the security aspect can be minimal, to the extent that the transfers are encrypted. If the clients and the server are not on a secure network, you first want the server and client to authenticate each other and then initiate encrypted data transfer; for the data transfer itself, server-side authentication should suffice. At the end of this authentication, use the session key to generate an encrypted (symmetric) data stream. Consider using TFTP; it is simple to implement and scales reasonably well.