How to add an event to grpc completion queue in C++?

I am trying to implement an async GRPC server, where whenever the client makes a call it gets an indefinite stream of messages. I read through the official documentation. It doesn't cover the scenario where I want to keep the stream open for each RPC. This article - https://www.gresearch.co.uk/article/lessons-learnt-from-writing-asynchronous-streaming-grpc-services-in-c/
addresses the issue of keeping the stream open by basically putting the callback handler object in the completion queue again.
The article suggests:
if (HasData())
{
    responder_.Write(reply_, this);
}
else
{
    grpc::Alarm alarm;
    alarm.Set(completion_queue_, gpr_now(gpr_clock_type::GPR_CLOCK_REALTIME), this);
}
I tried using the Alarm object approach as suggested in the article, but for some reason the next Next call on the completion queue returns the ok argument as false - GPR_ASSERT(cq_->Next(&tag, &ok));. As a result I have to close the server and cannot wait on a stream until further data is available.
I receive data fine as long as the else branch is never hit.
Could someone please help me identify what I might be doing wrong? I haven't been able to find many C++ resources on gRPC. Thanks!

When the Alarm goes out of scope, it generates a Cancel(), causing you to get !ok in Next().
If you want to use this approach, you need to move the Alarm into your class scope and trigger it there:
std::unique_ptr<Alarm> alarm_;
alarm_.reset(new Alarm);
alarm_->Set(cq_, grpc_timeout_seconds_to_deadline(10), this);
From the documentation of Alarm::Set:
Trigger an alarm instance on completion queue cq at the specified time.
Once the alarm expires (at deadline) or it's cancelled (see Cancel), an event with tag tag will be added to cq. If the alarm expired, the event's success bit will be true, false otherwise (i.e., upon cancellation).

Related

How to implement long running gRPC async streaming data updates in C++ server

I'm creating an async gRPC server in C++. One of the methods streams data from the server to clients - it's used to send data updates to clients. The frequency of the data updates isn't predictable. They could be nearly continuous or as infrequent as once per hour. The model used in the gRPC example with the "CallData" class and the CREATE/PROCESS/FINISH states doesn't seem like it would work very well for that. I've seen an example that shows how to create a 'polling' loop that sleeps for some time and then wakes up to check for new data, but that doesn't seem very efficient.
Is there another way to do this? If I use the "CallData" method can it block in the 'PROCESS' state until there's data (which probably wouldn't be my first choice)? Or better, can I structure my code so I can notify a gRPC handler when data is available?
Any ideas or examples would be appreciated.
In a server-side streaming example, you probably need more states, because you need to track whether there is currently a write already in progress. I would add two states, one called WRITE_PENDING that is used when a write is in progress, and another called WRITABLE that is used when a new message can be sent immediately. When a new message is produced, if you are in state WRITABLE, you can send immediately and go into state WRITE_PENDING, but if you are in state WRITE_PENDING, then the newly produced message needs to go into a queue to be sent after the current write finishes. When a write finishes, if the queue is non-empty, you can grab the next message from the queue and immediately start a write for it; otherwise, you can just go into state WRITABLE and wait for another message to be produced.
There should be no need to block here, and you probably don't want to do that anyway, because it would tie up a thread that should otherwise be polling the completion queue. If all of your threads wind up blocked that way, you will be blind to new events (such as new calls coming in).
An alternative here would be to use the C++ sync API, which is much easier to use. In that case, you can simply write straight-line blocking code. But the cost is that it creates one thread on the server for each in-progress call, so it may not be feasible, depending on the amount of traffic you're handling.
I hope this information is helpful!

Should a call to WSAResetEvent after WSAEnumNetworkEvents cause event to never be set again?

We have a thread which reads from a socket. We ran into an issue on a network with a little more latency than we are used to, where our read loop would seemingly stop getting notified of read events on the socket. Original code (some error checking removed):
HANDLE hEventSocket = WSACreateEvent();
WSAEventSelect(pIOParams->sock, hEventSocket, FD_READ | FD_CLOSE);
WSANETWORKEVENTS NetworkEvents = {};
DWORD dwWaitResult;
std::array<HANDLE, 2> ahEvents;
// This is an event handle that can be signaled from another thread to
// get this read thread to exit
ahEvents[0] = pIOParams->hEventStop;
ahEvents[1] = hEventSocket;
while(pIOParams->bIsReading)
{
    // wait for stop or I/O events
    DWORD dwTimeout = 30000; // in ms
    dwWaitResult = WSAWaitForMultipleEvents(ahEvents.size(), ahEvents.data(), FALSE, dwTimeout, FALSE);
    if(dwWaitResult == WSA_WAIT_TIMEOUT)
    {
        CLogger::LogPrintf(LogLevel::LOG_DEBUG, "CSessionClient", "WSAWaitForMultipleEvents time out");
        continue;
    }
    if(dwWaitResult == WSA_WAIT_EVENT_0) // check to see if we were signaled to stop from another thread
    {
        break;
    }
    if(dwWaitResult == WSA_WAIT_EVENT_0 + 1)
    {
        // determine which I/O operation triggered the event
        if (WSAEnumNetworkEvents(pIOParams->sock, hEventSocket, &NetworkEvents) != 0)
        {
            int err = WSAGetLastError();
            CLogger::LogPrintf(LogLevel::LOG_WARN, "CSessionClient", "WSAEnumNetworkEvents failed (%d)", err);
            break;
        }
        // HERE IS THE LINE WE REMOVED THAT SEEMED TO FIX THE PROBLEM
        WSAResetEvent(hEventSocket);
        // Handle events on socket
        if (NetworkEvents.lNetworkEvents & FD_READ)
        {
            // Do stuff to read from socket
        }
        if (NetworkEvents.lNetworkEvents & FD_CLOSE)
        {
            // Handle that the socket was closed
            break;
        }
    }
}
Here is the issue: With WSAResetEvent(hEventSocket); in the code, sometimes the program works and reads all of the data from the server, but sometimes, it seems to get stuck in a loop receiving WSA_WAIT_TIMEOUT, even though the server appears to have data queued up for it.
While the program is looping receiving WSA_WAIT_TIMEOUT, Process Hacker shows the socket connected in a normal state.
Now we know that WSAEnumNetworkEvents will reset hEventSocket, but it doesn't seem like the additional call to WSAResetEvent should hurt. It also doesn't make sense that it permanently messes up the signaling. I would expect that perhaps we wouldn't get notified of the last chunk of data to be read, as data could have been read in between the call to WSAEnumNetworkEvents and WSAResetEvent, but I would assume that once additional data came in on the socket, the hEventSocket would get raised.
The stranger part of this is that we have been running this code for years, and we're only now seeing this issue.
Any ideas why this would cause an issue?
Calling WSAResetEvent() manually introduces a race condition that can put your socket into a bad state.
After WSAEnumNetworkEvents() is called, the event is signaled again when new data arrives afterwards, or when there is unread data left over from an earlier read, but ONLY if the socket is in the proper state to signal that event.
If the event does get signaled before you call WSAResetEvent(), you lose that signal.
Per the WSAEventSelect() documentation:
Having successfully recorded the occurrence of the network event (by setting the corresponding bit in the internal network event record) and signaled the associated event object, no further actions are taken for that network event until the application makes the function call that implicitly reenables the setting of that network event and signaling of the associated event object.
FD_READ
The recv, recvfrom, WSARecv, WSARecvEx, or WSARecvFrom function.
...
Any call to the reenabling routine, even one that fails, results in reenabling of recording and signaling for the relevant network event and event object.
...
For FD_READ, FD_OOB, and FD_ACCEPT network events, network event recording and event object signaling are level-triggered. This means that if the reenabling routine is called and the relevant network condition is still valid after the call, the network event is recorded and the associated event object is set. This allows an application to be event-driven and not be concerned with the amount of data that arrives at any one time. 
What that means is that if you manually reset the event after calling WSAEnumNetworkEvents(), the event will NOT be signaled again until AFTER you perform a read on the socket (which re-enables the signaling of the event for read operations) AND new data arrives afterwards, or you didn't read all of the data that was available.
By resetting the event manually, you lose the signal that allows WSAWaitForMultipleEvents() to tell you to call
WSAEnumNetworkEvents() so it can then tell you to read from the socket. Without that read, the event will never be signaled again when data is waiting to be read. The only other condition you registered that can signal the event is a socket closure.
Since WSAEnumNetworkEvents() already resets the event for you, DON'T reset the event manually!
You already pass the event handle to WSAEnumNetworkEvents, which resets the handle atomically: the handle is only reset once the pending event data has been copied.
With a direct call to WSAResetEvent it is possible for a data notification to be lost: you call WSAEnumNetworkEvents to get the current status and reset the event; more data then arrives, setting the event again; you then call WSAResetEvent before the next loop iteration, and unless more data comes in you won't be told about the data that already arrived.
Far better to just let WSAEnumNetworkEvents deal with the event state.

Spring Integration Multiple consumers not processing concurrently

I am using Spring Integration with ActiveMQ. I defined a DefaultMessageListenerContainer with maxConcurrentConsumers = 5. It is referenced in a . After an int-xml:validating-filter and an int-xml:unmarshalling-transformer, I defined a queue channel actionInstructionTransformed, and I have a poller for this queue channel. When I start my application, the ActiveMQ console shows that a connection is created with five sessions inside it.
Now, I have a @MessageEndpoint with a method annotated
@ServiceActivator(inputChannel = "actionInstructionTransformed", poller = @Poller(value = "customPoller")).
I have a log statement at the method entrance. Processing of each message is long (several minutes). In my logs, I can see that thread-1 starts the processing and then I only see thread-1 outputs. Only when thread-1 has finished processing one message does thread-2 start processing the next message, and so on. I do NOT have any synchronized block inside my class annotated @MessageEndpoint. I have not managed to get thread-1, thread-2, etc. to process messages concurrently.
Has anybody experienced something similar?
Look, you say:
After an int-xml:validating-filter and an int-xml:unmarshalling-transformer, I defined a queue channel actionInstructionTransformed.
Now let's go to the QueueChannel and PollingConsumer definitions!
On the other hand, a channel adapter connected to a channel that implements the org.springframework.messaging.PollableChannel interface (e.g. a QueueChannel) will produce an instance of PollingConsumer.
And pay attention that @Poller (PollerMetadata) has a taskExecutor option.
By default the TaskScheduler asks the QueueChannel for data periodically according to the trigger configuration. If that is a PeriodicTrigger with default options like fixedRate = false, the next poll only happens after the previous one completes. That's why you see only one thread.
So, configure a taskExecutor and your messages from that queue will be processed in parallel.
The concurrency on the DefaultMessageListenerContainer has no effect here, because in the end you place all those messages into the QueueChannel, and a new threading model starts there, based on the @Poller configuration.
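As a sketch, the poller could be pointed at a task executor like this (the bean ids, pool size, and timing values are assumptions, not taken from the question):

```xml
<!-- Hypothetical ids; the point is the task-executor attribute on the poller. -->
<task:executor id="pollerExecutor" pool-size="5"/>

<int:poller id="customPoller" fixed-delay="100" max-messages-per-poll="1"
            task-executor="pollerExecutor"/>
```

With this in place, each poll hands the message off to a pollerExecutor thread, so several messages from the queue channel can be processed at once.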

How to prevent other workers from accessing a message which is being currently processed?

I am working on a project that will require multiple workers to access the same queue to get information about a file which they will manipulate. Files range in size from mere megabytes to hundreds of gigabytes. For this reason, a visibility timeout doesn't seem to make sense because I cannot be certain how long processing will take. I have thought of a couple of ways, but if there is a better way, please let me know.
The message is deleted from the original queue and put into a 'waiting' queue. When the program finishes processing the file, it deletes the message from the waiting queue; otherwise the message is taken out of the waiting queue and put back into the original queue.
The message id is checked against a database. If the message id is found, it is ignored. Otherwise the program starts processing the message and inserts the message id into the database.
Thanks in advance!
Use the default-provided SQS timeout but take advantage of ChangeMessageVisibility.
You can specify the timeout in several ways:
When the queue is created (default timeout)
When the message is retrieved
By having the worker call back to SQS and extend the timeout
If you are worried that you do not know the appropriate processing time, use a default value that is good for most situations, but don't make it so big that things become unnecessarily delayed.
Then, modify your workers to make a ChangeMessageVisibility call to SQS periodically to extend the timeout. If a worker dies, the message stops being extended and will reappear on the queue to be processed by another worker.
See: MessageVisibility documentation

How to design a state machine in face of non-blocking I/O?

I'm using the Qt framework, which has non-blocking I/O by default, to develop an application navigating through several web pages (online stores) and carrying out different actions on these pages. I'm "mapping" each specific web page to a state machine which I use to navigate through that page.
This state machine has these transitions;
Connect, LogIn, Query, LogOut, Disconnect
and these states;
Start, Connecting, Connected, LoggingIn, LoggedIn, Querying, QueryDone, LoggingOut, LoggedOut, Disconnecting, Disconnected
Transitions from *ing to *ed states (Connecting->Connected) are driven by LoadFinished asynchronous network events received from the network object when the currently requested url has loaded. Transitions from *ed to *ing states (Connected->LoggingIn) are driven by events sent by me.
I want to be able to send several events (commands) to this machine (like Connect, LogIn, Query("productA"), Query("productB"), LogOut, LogIn, Query("productC"), LogOut, Disconnect) at once and have it process them. I don't want to block waiting for the machine to finish processing all events I sent to it. The problem is they have to be interleaved with the above mentioned network events informing machine about the url being downloaded. Without interleaving machine can't advance its state (and process my events) because advancing from *ing to *ed occurs only after receiving network type of event.
How can I achieve my design goal?
EDIT
The state machine I'm using has its own event loop, and events are not queued in it, so they can be missed by the machine if they arrive while it is busy.
Network I/O events are not posted directly to either the state machine or the event queue I'm using. They are posted to my code (a handler), and I have to handle them. I can forward them as I wish, but please keep remark no. 1 in mind.
Take a look at my answer to this question, where I described my current design in detail. The question is whether and how I can improve this design by making it
More robust
Simpler
Sounds like you want the state machine to have an event queue. Queue up the events, start processing the first one, and when that completes pull the next event off the queue and start on that. So instead of the state machine being driven by the client code directly, it's driven by the queue.
This means that any logic which involves using the result of one transition in the next one has to be in the machine. For example, if the "login complete" page tells you where to go next. If that's not possible, then the event could perhaps include a callback which the machine can call, to return whatever it needs to know.
Asking this question I already had a working design which I didn't want to write about not to skew answers in any direction :) I'm going to describe in this pseudo answer what the design I have is.
In addition to the state machine I have a queue of events. Instead of posting events directly to the machine I place them in the queue. There is, however, a problem with network events, which are asynchronous and arrive at any moment. If the queue is not empty and a network event arrives, I can't place it in the queue, because the machine would be stuck waiting for it before processing the events already in the queue; it would wait forever, because the network event would be sitting behind all the events placed in the queue earlier.
To overcome this problem I have two types of messages: normal and priority ones. Normal ones are those sent by me; priority ones are all network ones. When I get a network event I don't place it in the queue but instead send it directly to the machine. This way it can finish its current task and progress to the next state before pulling the next event from the queue of events.
This works only because there is an exact 1:1 interleaving of my events and network events. Because of this, when the machine is waiting for a network event it is not busy doing anything (so it is ready to accept it and does not miss it), and vice versa: when the machine waits for my task it is waiting only for my task and not for another network one.
I asked this question in the hope of finding a simpler design than what I have now.
Strictly speaking, you can't. Because you only have the state "Connecting", you don't know whether you need to log in afterwards. You'd have to introduce a state "ConnectingWithIntentToLogin" to represent the result of a "Connect, then Login" event from the Start state.
Naturally there will be a lot of overlap between the "Connecting" and the "ConnectingWithIntentToLogin" states. This is most easily achieved by a state machine architecture that supports state hierarchies.
--- edit ---
Reading your later reactions, it's now clear what your actual problem is.
You do need extra state, obviously, whether that's ingrained in the FSM or kept outside it in a separate queue. Let's follow the model you prefer, with extra events in a queue. The trick here is that you're wondering how to "interleave" those queued events with the realtime events. You don't: events from the queue are actively extracted when entering specific states. In your case, those would be the *ed states like "Connected". Only when the queue is empty would you stay in the "Connected" state.
If you don't want to block, that means you don't care about the network replies. If on the other hand the replies interest you, you have to block waiting for them. Trying to design your FSM otherwise will quickly lead to your automaton's size reaching infinity.
How about moving the state machine to a different thread, i.e. a QThread? I would implement an input queue in the state machine so I could send queries without blocking, and an output queue to read the results of the queries. You could even call back a slot in your main thread via connect(...) when the result of a query arrives; Qt is thread-safe in this regard.
This way your state machine can block as long as it needs to without blocking your main program.
Sounds like you just want to execute a list of blocking I/O operations in the background.
So have a thread execute:
while( !commands.empty() )
{
    Command command = commands.back();
    commands.pop_back();
    switch( command )
    {
    case Connect:
        DoBlockingConnect();
        break;
    ...
    }
}
NotifySenderDone();