Concurrent priority queue in Redis?

I would like to implement a concurrent priority queue in Redis, with multiple processes on different machines adding items (with scores) and multiple other processes popping these items, lowest score first.
A simple queue can be implemented with LPUSH and RPOP.
Using a ZSET, I can add the items using ZADD and pop them with ZRANGE and ZREM, as long as there is only one reader.
For multiple readers I think I need something like ZPOP which combines ZRANGE and ZREM in a single atomic operation. Otherwise two readers may get the same item from ZRANGE before either can ZREM it. Retrying if ZREM returns 0 would work but is not desirable.
Is there some way I can do this using the current Redis commands? Is there any reason this hasn't been added to Redis already? It seems like it would be a pretty simple command to implement.

You can guarantee atomicity if you use a Lua script that does the ZRANGE and ZREM, or a MULTI/EXEC block. Either approach will prevent multiple workers from interfering with each other.
I assume ZPOP wasn't included in the first place because it isn't a common use case and, when needed, it can easily be scripted.
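For illustration, here is a minimal sketch of such a script, invoked through the Rust redis crate; the zpop function and the key argument are made up for the example. The script body executes atomically on the server, so no other client can observe the member between the ZRANGE and the ZREM.

fn zpop(con: &mut redis::Connection, key: &str) -> redis::RedisResult<Option<String>> {
    // EVAL runs the whole script as one atomic step on the server.
    let script = redis::Script::new(
        r#"
        local item = redis.call('ZRANGE', KEYS[1], 0, 0)
        if item[1] then
            redis.call('ZREM', KEYS[1], item[1])
        end
        return item[1]
        "#,
    );
    // Returns None when the sorted set is empty.
    script.key(key).invoke(con)
}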

You can use the Redis command WATCH:
WATCH zset
element = ZRANGE zset 0 0
MULTI
ZREM zset element
EXEC
If EXEC fails (returns a null reply), just repeat those commands.
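As a concrete sketch of that retry loop, using the Rust redis crate (the key name, types, and function name are illustrative assumptions):

use redis::Commands;

fn pop_lowest(con: &mut redis::Connection, key: &str) -> redis::RedisResult<Option<String>> {
    loop {
        // WATCH the sorted set, then read its head.
        redis::cmd("WATCH").arg(key).query::<()>(con)?;
        let head: Vec<String> = con.zrange(key, 0, 0)?;
        let item = match head.into_iter().next() {
            Some(item) => item,
            None => {
                // Nothing to pop: release the WATCH and report an empty queue.
                redis::cmd("UNWATCH").query::<()>(con)?;
                return Ok(None);
            }
        };
        // MULTI/EXEC; EXEC yields a nil reply (None here) if the WATCHed
        // key was modified by another client in the meantime.
        let exec: Option<()> = redis::pipe().atomic().zrem(key, &item).ignore().query(con)?;
        if exec.is_some() {
            return Ok(Some(item));
        }
        // Lost the race -- repeat those commands.
    }
}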

From Redis 5.0.0 you can use ZPOPMIN and ZPOPMAX (and their blocking counterparts, BZPOPMIN and BZPOPMAX).
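A one-command sketch, again with the Rust redis crate (the key name is illustrative):

fn pop_min(con: &mut redis::Connection) -> redis::RedisResult<Vec<(String, f64)>> {
    // Atomically pops the lowest-scored member, returned with its score;
    // BZPOPMIN would block waiting for an item instead of returning empty.
    redis::cmd("ZPOPMIN").arg("jobs").arg(1).query(con)
}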

Related

Notifying a task from multiple other tasks without extra work

My application is futures-based with async/await, and has the following structure within one of its components:
a "manager", which is responsible for starting/stopping/restarting "workers", based both on external input and on the current state of "workers";
a dynamic set of "workers", which perform some continuous work, but may fail or be stopped externally.
A worker is just a spawned task which does some I/O work. Internally it is a loop which is intended to be infinite, but it may exit early due to errors or other reasons, and in this case the worker must be restarted from scratch by the manager.
The manager is implemented as a loop which awaits on several channels, including one returned by async_std::stream::interval, which essentially makes the manager into a poller - and indeed, I need this because I do need to poll some Mutex-protected external state. Based on this state, the manager, among everything else, creates or destroys its workers.
Additionally, the manager stores a set of async_std::task::JoinHandles representing live workers, and it uses these handles to check whether any worker has exited, restarting it if so. (BTW, I currently do this using select(handle, future::ready()), which is totally suboptimal because it relies on an implementation detail of select, specifically that it polls the left future first. I couldn't find a better way of doing it; something like race() would make more sense, but race() consumes both futures, which won't work for me because I don't want to lose the JoinHandle if it is not ready. This is a matter for another question, though.)
You can see that in this design workers can only be restarted when the next poll "tick" occurs in the manager. However, I don't want to use too small an interval for polling, because in most cases polling just wastes CPU cycles. Large intervals, however, can delay restarting a failed/canceled worker by too much, leading to undesired latencies. Therefore, I thought I'd set up another channel of ()s back from each worker to the manager and add it to the main manager loop: when a worker stops due to an error or otherwise, it first sends a message to its channel, so the manager is woken up earlier than the next poll and can restart the worker right away.
Unfortunately, with any kind of channel this might result in more polls than needed when two or more workers stop at approximately the same time (which, due to the nature of my application, is somewhat likely to happen). In that case it would make sense to run the manager loop only once, handling all of the stopped workers, but with channels it will necessarily result in a number of polls equal to the number of stopped workers, even if the additional polls don't do anything.
Therefore, my question is: how do I notify the manager from its workers that they are finished, without resulting in extra polls in the manager? I've tried the following things:
As explained above, regular unbounded channels just won't work.
I thought that maybe bounded channels could work - if I used a channel with capacity 0, and there was a way to try to send a message but just drop it if the channel is full (like the offer() method on Java's BlockingQueue), this would seemingly solve the problem. Unfortunately, while the channels API provides such a method (try_send() seems to be it), the channels also have the property that their capacity is at least the number of senders, which means they can't really be used for such notifications.
Some kind of atomic or mutex-protected boolean flag also looks as if it could work, but there is no atomic or mutex API that provides a future to wait on, so this would also require polling.
Restructure the manager implementation to include the JoinHandles in the main select somehow. It might do the trick, but it would result in a large refactoring which I'm unwilling to undertake at this point. If there is a way to do what I want without this refactoring, I'd like to try that first.
I guess some combination of atomics and channels might work: setting an atomic flag and sending a message, then skipping any extra notifications in the manager based on the flag (which is flipped back off after processing one notification). But this seems like a complex approach, and I wonder if anything simpler is possible.
I recommend using the FuturesUnordered type from the futures crate. This collection allows you to push many futures of the same type into it and wait for whichever completes next.
It implements Stream, so if you import StreamExt, you can use unordered.next() to obtain a future that completes once any future in the collection completes.
If you also need to wait for a timeout or mutex etc., you can use select to create a future that completes once either the timeout or one of the join handles completes. The future returned by next() implements Unpin, so it is usable with select without problems.
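A rough sketch of that shape, assuming async-std plus the futures crate; the worker bodies, the 5-second tick, and the restart policy below are placeholders rather than the asker's actual code:

use std::time::Duration;
use async_std::{stream, task};
use futures::{select, stream::FuturesUnordered, StreamExt};

async fn manager() {
    // JoinHandles of live workers; select! skips this branch while it is empty.
    let mut workers: FuturesUnordered<task::JoinHandle<()>> = FuturesUnordered::new();
    workers.push(task::spawn(async { /* worker loop */ }));

    // The existing periodic poll; fuse() makes the stream usable in select!.
    let mut ticks = stream::interval(Duration::from_secs(5)).fuse();

    loop {
        select! {
            // Fires once per exited worker, with no wasted wakeups in between.
            _ = workers.select_next_some() => {
                workers.push(task::spawn(async { /* restarted worker */ }));
            }
            // The regular poll of the Mutex-protected state.
            _ = ticks.next() => { /* inspect shared state */ }
        }
    }
}

Because the JoinHandles live in the collection itself, no side channel is needed, and each exiting worker wakes the loop exactly once, with no idle polls in between.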

CQRS, multiple write nodes for a single aggregate entry, while maintaining concurrency

Let's say I have a command to edit a single entry of an article, called ArticleEditCommand.
User 1 issues an ArticleEditCommand based on V1 of the article.
User 2 issues an ArticleEditCommand based on V1 of the same article.
If I can ensure that my nodes process the older ArticleEditCommand commands first, I can be sure that the command from User 2 will fail because User 1's command will have changed the version of the article to V2.
However, if I have two nodes processing ArticleEditCommand messages concurrently, even though the commands are taken off the queue in the correct order, I cannot guarantee that the nodes will actually process the first command before the second, due to a spike in CPU or something similar. I could use a SQL transaction to update an article where version = expectedVersion and note the number of records changed, but my rules are more complex and can't live solely in SQL. I would like the entire command-processing logic to be concurrency-safe across ArticleEditCommand messages that alter the same article.
I don't want to lock the queue while I process the command, because the point of having multiple command handlers is to handle commands concurrently for scalability. With that said, I don't mind these commands being processed consecutively, but only for a single instance/id of an article. I don't expect a high volume of ArticleEditCommand messages to be sent for a single article.
With that said, here is the question.
Is there a way to handle commands consecutively across multiple nodes for a single unique object (database record), but handle all other commands (distinct database records) concurrently?
Or, is this a problem I created myself because of a lack of understanding of CQRS and concurrency?
Is this a problem that message brokers typically have solved? Such as Windows Service Bus, MSMQ/NServiceBus, etc?
EDIT: I think I know how to handle this now. When User 2 issues the ArticleEditCommand, an exception should be thrown to let them know that there is a pending operation on that article that must be completed before they can queue their ArticleEditCommand. That way, there are never two ArticleEditCommand messages in the queue that affect the same article.
First let me say, if you don't expect a high volume of ArticleEditCommand messages being sent, this sounds like premature optimization.
In other solutions, this problem is usually solved not by message brokers but by optimistic locking enforced by the persistence implementation. I don't understand why a simple version field for optimistic locking, which can be trivially handled in SQL, contradicts complicated business logic/updates; maybe you could elaborate?
It's actually quite simple, and I did just that. Basically, it looks like this (pseudocode):

//message handler
ModelTools.TryUpdateEntity(() =>
{
    var entity = _repo.Get(myId);
    entity.Do(whateverCommand);
    _repo.Save(entity);
}, 10); // retry up to 10 times before giving up

//repository
long? _version;

public MyObject Get(Guid id)
{
    //query data and version
    _version = data.version;
    return data.ToMyObject();
}

public void Save(MyObject data)
{
    //update the row in the db where version = _version.Value
    if (rowsUpdated == 0)
    {
        //things have changed since we retrieved the object
        throw new NewerVersionExistsException();
    }
}
ModelTools.TryUpdateEntity and NewerVersionExistsException are part of my CavemanTools general-purpose library (available on NuGet).
The idea is to try doing things normally, and if the object's version (rowversion/timestamp in SQL) has changed, to retry the whole operation after waiting a couple of milliseconds. That's exactly what the TryUpdateEntity() method does, and you can tweak how long to wait between tries and how many times to retry the operation.
If you need to notify the user, then forget about retrying: just catch the exception directly and tell the user to refresh or something.
Partition-based solution
Achieve node stickiness by routing each incoming command based on the object's ID (e.g. articleId modulo your number of nodes) to make sure the commands of User 1 and User 2 end up on the same node, then process the commands consecutively. You can choose to process all commands one by one, or, if you want to parallelize the execution, partition the commands on something like the ID, odd/even, by country, or similar.
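Purely as an illustration, a hypothetical routing function (all names invented):

/// Commands for the same article always map to the same node, so per-article
/// ordering is preserved while distinct articles still run in parallel.
fn node_for(article_id: u64, node_count: u64) -> u64 {
    article_id % node_count
}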
Grid-based solution
Use an in-memory grid (e.g. Hazelcast or Coherence) and use a distributed executor service (http://docs.hazelcast.org/docs/2.0/manual/html/ch09.html#DistributedExecution) or similar to coordinate the command processing across the cluster.
Regardless - before adding this kind of complexity, you should of course ask yourself whether it's really a problem if User 2's command were accepted and User 1 got a concurrency error back. As long as User 1's changes are not lost and can be re-applied after a refresh of the article, it might be perfectly fine.

Two processes may change the same Redis resource, using WATCH. Should I be worried about livelock?

Processes A and B both operate on a Redis resource R.
These processes may be executed in parallel, and I need both processes to be certain of the value of R at the moment they change it.
I'm therefore using Redis transactions with the WATCH command. From the docs: "we are asking Redis to perform the transaction only if no other client modified any of the WATCHed keys. Otherwise the transaction is not entered at all."
To retry in case of failure, the suggested way is looping over the WATCH/MULTI/EXEC sequence until it succeeds. However, I'm worried that both A and B might start looping indefinitely (i.e.: livelock).
Is this something to be worried about? Better yet, what can be done about it? Would setting a random timeout on the retry solve the issue?
No need to worry, because only one of A and B will succeed with its EXEC and change R (Redis is [mostly] single-threaded). The one that fails will need to retry the transaction with the new value of R.

Redis INCR concurrency

I am using Redis' INCR to generate an ID for objects, and then using ZADD to add the object using the ID as the key.
Do I need to worry if there are multiple connections executing this same block of code? Say, after id:12, two connections arrive at the same time and both add an object using id:13; then one of them would be lost.
Since Redis is single-threaded, this can never happen - only one client can make a change to the database at a time.
As Jonatan Hedborg stated, Redis is single-threaded, so you never need to worry about two clients doing something at the same time. If, on the other hand, your worry is that you want to run the INCR and ZADD commands sequentially and make sure no other commands run in between them, you can use transactions and be guaranteed your commands run as a single unit with nothing in between.
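A minimal sketch with the Rust redis crate (key names and the function are illustrative). One caveat: a MULTI/EXEC block cannot feed INCR's reply into the ZADD, since replies only arrive at EXEC time, so if the two commands truly must run as one unit a server-side Lua script is the usual tool; in most cases INCR's own atomicity is all you need:

use redis::Commands;

fn add_object(con: &mut redis::Connection, member: &str) -> redis::RedisResult<i64> {
    // INCR is atomic, so concurrent callers always receive distinct ids.
    let id: i64 = con.incr("object:next-id", 1)?;
    // Each caller then adds its object under its own private id.
    let _: () = con.zadd("objects", member, id)?;
    Ok(id)
}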

How to implement a master machine controlling several slave machines via Linux C++

Could anyone give some advice on how to implement a master machine controlling some slave machines via C++?
I am trying to implement a simple program that can distribute tasks from a master to slaves. It is easy to implement with one master + one slave machine. However, when there is more than one slave machine, I don't know how to design it.
If the solution can be used for both Linux and Windows, it would be much better.
You should use a framework rather than making your own. What you need to search for is cluster computing; one option that might work easily is Boost.MPI.
With n machines, you need to keep track of which ones are free and, if none are, of the load across your slaves (i.e. how many tasks are queued at each), and then queue on the least-loaded machine (or whichever your algorithm deems best) - say, better hardware means that some slaves perform better than others, etc. I'd start with a simple distribution algorithm, shown in a sketch below, and then tweak once it's working...
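For instance, a trivial least-loaded pick (the types are illustrative, not part of any framework):

/// Returns the index of the slave with the fewest queued tasks, if any.
fn least_loaded(queued_per_slave: &[usize]) -> Option<usize> {
    queued_per_slave
        .iter()
        .enumerate()
        .min_by_key(|&(_, load)| load)
        .map(|(index, _)| index)
}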
More interesting problems will arise in exceptional circumstances (i.e. slaves dying, and various such issues.)
I would use an existing messaging bus to make your life easier (rather than re-inventing one); the real intelligence is in the distribution algorithm and the management of failed nodes.
We need to know more, but basically you just need to make sure the slaves don't block each other. The details of doing that in C++ will get involved, but the first thing to do is ask yourself what the algorithm is. The simplest case is going to be if you don't care about waiting for the slaves, in which case you have:

while still tasks to do
    launch a task on a slave
If you have to have just one job running on each slave, then you'll need something like an array of flags, one per slave:
slaves : array 0 to (number of slaves - 1)
initialize slaves to all FALSE

while not done
    find the first FALSE slave -- it's not in use
    set that slave to TRUE
    launch a job on that slave
    check for slaves that are done
        set that slave to FALSE
Now, if you have multiple threads, you can split that into two loops, one per thread:
while not done
    find the first FALSE slave -- it's not in use
    set that slave to TRUE
    launch a job on that slave

while not done
    check for slaves that are done
    set that slave to FALSE