Concurrent send/receive go channel

Concurrent send/receive go channel - concurrency

I have a go channel called queue with let's say 100 as the buffer size. Many go routines can send data to this channel, and another go routine is sitting there to receive data from this channel. This is a long lasting process, meaning the channel is acting like a pipeline absorbing data from many ends and sinking data to one end. I do something like this in the receiving go routine:
for {
for data := range queue {
sink(data)
}
}
Now my question is: what if some new data were sent to the channel buffer before the range loop is finished. Will the new data be available for the next range loop? Or they will be missed if concurrency is not taken into consideration in this case?

You only need one for loop. From the spec on range expressions:
For channels, the iteration values produced are the successive values sent on the channel until the channel is closed. If the channel is nil, the range expression blocks forever.
In this case, the range loop is not acting like a regular range over, for example, a slice. Instead, when an item can be read from the channel, the loop body processes it. Therefore, your nested loops should be replaced with the following:
for data := range queue {
sink(data)
}

As #Tim said, you only need a single for, since range will emit values from the channel, until it is closed.
Overall, the pattern you describe is called fan-in. A example for a basic producer/consumer setup can be found here: http://play.golang.org/p/AhQ012Qpwj. The range loop runs in the consumer:
// consumer acts as fan in, signals when it is done.
func consumer(out chan string, done chan bool) {
for value := range out {
fmt.Println(value)
}
done <- true
}

Related

Explicit throughput limiting on part of an akka stream

I have a flow in our system which reads some elements from SQS (using alpakka) and does some preporcessing (~ 10 stages, normally < 1 minute in total). Then, the prepared element is sent to the main processing (single stage, taking a few minutes). The whole thing runs on AWS/K8S and we’d like to scale out when the SQS queue grows above a certain threshold. The issue is, the SQS queue takes a long time to blow up, since there are a lot of elements “idling” in-process, having done their preprocessing but waiting for the main thing.
We can’t externalize the preprocessing stuff to a separate queue since their outcome can’t survive a de/serialization roundtrip. Also, this service and the “main” processor are deeply coupled (this service runs as main’s sidecar) and can’t be scaled independently.
The preprocessing stages are technically .mapAsyncUnordered, but the whole thing is already very slim (stream stages and SQS batches/buffers).
We tried lowering the interstage buffer (akka.stream.materializer.max-input-buffer-size), but that only gives some indirect benefit, no direct control (and is too internal to be mucking with, for my taste anyway).
I tried implementing a “gate” wrapper which would limit the amount of elements allowed inside some arbitrary Flow, looking something like:
class LimitingGate[T, U](originalFlow: Flow[T, U], maxInFlight: Int) {
private def in: InputGate[T] = ???
private def out: OutputGate[U] = ???
def gatedFlow: Flow[T, U, NotUsed] = Flow[T].via(in).via(originalFlow).via(out)
}
And using callbacks between the in/out gates for throttling.
The implementation partially works (stream termination is giving me a hard time), but it feels like the wrong way to go about achieving the actual goal.
Any ideas / comments / enlightening questions are appreciated
Thanks!

Try something along these lines (I'm only compiling it in my head):
def inflightLimit[A, B, M](n: Int, source: Source[T, M])(businessFlow: Flow[T, B, _])(implicit materializer: Materializer): Source[B, M] = {
require(n > 0) // alternatively, could just result in a Source.empty...
val actorSource = Source.actorRef[Unit](
completionMatcher = PartialFunction.empty,
failureMatcher = PartialFunction.empty,
bufferSize = 2 * n,
overflowStrategy = OverflowStrategy.dropHead // shouldn't matter, but if the buffer fills, the effective limit will be reduced
)
val (flowControl, unitSource) = actorSource.preMaterialize()
source.statefulMapConcat { () =>
var firstElem: Boolean = true
{ a =>
if (firstElem) {
(0 until n).foreach(_ => flowControl.tell(())) // prime the pump on stream materialization
firstElem = false
}
List(a)
}}
.zip(unitSource)
.map(_._1)
.via(businessFlow)
.wireTap { _ => flowControl.tell(()) } // wireTap is Akka Streams 2.6, but can be easily replaced by a map stage which sends () to flowControl and passes through the input
}
Basically:
actorSource will emit a Unit ((), i.e. meaningless) element for every () it receives
statefulMapConcat will cause n messages to be sent to the actorSource only when the stream first starts (thus allowing n elements from the source through)
zip will pass on a pair of the input from source and a () only when actorSource and source both have an element available
for every element which exits businessFlow, a message will be sent to the actorSource, which will allow another element from the source through
Some things to note:
this will not in any way limit buffering within source
businessFlow cannot drop elements: after n elements are dropped the stream will no longer process elements but won't fail; if dropping elements is required, you may be able to inline businessFlow and have the stages which drop elements send a message to flowControl when they drop an element; there are other things to address this which you can do as well

c++ streaming udp data into a queue?

I am streaming data as a string over UDP, into a Socket class inside Unreal engine. This is threaded, and runs in the background.
My read function is:
float translate;
void FdataThread::ReceiveUDP()
{
uint32 Size;
TArray<uint8> ReceivedData;
if (ReceiverSocket->HasPendingData(Size))
{
int32 Read = 0;
ReceivedData.SetNumUninitialized(FMath::Min(Size, 65507u));
ReceiverSocket->RecvFrom(ReceivedData.GetData(), ReceivedData.Num(), Read, *targetAddr);
}
FString str = FString(bytesRead, UTF8_TO_TCHAR((const UTF8CHAR *)ReceivedData));
translate = FCString::Atof(*str);
}
I then call the translate variable from another class, on a Tick, or timer.
My test case sends an incrementing number from another application.
If I print this number from inside the above Read function, it looks as expected, counting up incrementally.
When i print it from the other thread, it is missing some of the numbers.
I believe this is because I call it on the Tick, so it misses out some data due to processing time.
My question is:
Is there a way to queue the incoming data, so that when i pull the value, it is the next incremental value and not the current one? What is the best way to go about this?
Thank you, please let me know if I have not been clear.

Is this the complete code? ReceivedData isn't used after it's filled with data from the socket. Instead, an (in this code) undefined variable 'buffer' is being used.
Also, it seems that the while loop could run multiple times, overwriting old data in the ReceivedData buffer. Add some debugging messages to see whether RecvFrom actually reads all bytes from the socket. I believe it reads only one 'packet'.
Finally, especially when you're using UDP sockets over the network, note that the UDP protocol isn't guaranteed to actually deliver its packets. However, I doubt this is causing your problems if you're using it on a single computer or a local network.

Your read loop doesn't make sense. You are reading and throwing away all datagrams but the last in any given sequence that happen to be in the socket receive buffer at the same time. The translate call should be inside the loop, and the loop should be while(true), or while (running), or similar.

How to limit an Akka Stream to execute and send down one message only once per second?

I have an Akka Stream and I want the stream to send messages down stream approximately every second.
I tried two ways to solve this problem, the first way was to make the producer at the start of the stream only send messages once every second when a Continue messages comes into this actor.
// When receive a Continue message in a ActorPublisher
// do work then...
if (totalDemand > 0) {
import scala.concurrent.duration._
context.system.scheduler.scheduleOnce(1 second, self, Continue)
}
This works for a short while then a flood of Continue messages appear in the ActorPublisher actor, I assume (guess but not sure) from downstream via back-pressure requesting messages as the downstream can consume fast but the upstream is not producing at a fast rate. So this method failed.
The other way I tried was via backpressure control, I used a MaxInFlightRequestStrategy on the ActorSubscriber at the end of the stream to limit the number of messages to 1 per second. This works but messages coming in come in at approximately three or so at a time, not just one at a time. It seems the backpressure control doesn't immediately change the rate of messages coming in OR messages were already queued in the stream and waiting to be processed.
So the problem is, how can I have an Akka Stream which can process one message only per second?
I discovered that MaxInFlightRequestStrategy is a valid way to do it but I should set the batch size to 1, its batch size is default 5, which was causing the problem I found. Also its an over-complicated way to solve the problem now that I am looking at the submitted answer here.

You can either put your elements through the throttling flow, which will back pressure a fast source, or you can use combination of tick and zip.
The first solution would be like this:
val veryFastSource =
Source.fromIterator(() => Iterator.continually(Random.nextLong() % 10000))
val throttlingFlow = Flow[Long].throttle(
// how many elements do you allow
elements = 1,
// in what unit of time
per = 1.second,
maximumBurst = 0,
// you can also set this to Enforcing, but then your
// stream will collapse if exceeding the number of elements / s
mode = ThrottleMode.Shaping
)
veryFastSource.via(throttlingFlow).runWith(Sink.foreach(println))
The second solution would be like this:
val veryFastSource =
Source.fromIterator(() => Iterator.continually(Random.nextLong() % 10000))
val tickingSource = Source.tick(1.second, 1.second, 0)
veryFastSource.zip(tickingSource).map(_._1).runWith(Sink.foreach(println))

avoiding collisions when collapsing infinity lock-free buffer to circular-buffer

I'm solving two feeds arbitrate problem of FAST protocol.
Please don't worry if you not familar with it, my question is pretty general actually. But i'm adding problem description for those who interested (you can skip it).
Data in all UDP Feeds are disseminated in two identical feeds (A and B) on two different multicast IPs. It is strongly recommended that client receive and process both feeds because of possible UDP packet loss. Processing two identical feeds allows one to statistically decrease the probability of packet loss.
It is not specified in what particular feed (A or B) the message appears for the first time. To arbitrate these feeds one should use the message sequence number found in Preamble or in tag 34-MsgSeqNum. Utilization of the Preamble allows one to determine message sequence number without decoding of FAST message.
Processing messages from feeds A and B should be performed using the following algorithm:
Listen feeds A and B
Process messages according to their sequence numbers.
Ignore a message if one with the same sequence number was already processed before.
If the gap in sequence number appears, this indicates packet loss in both feeds (A and B). Client should initiate one of the Recovery process. But first of all client should wait a reasonable time, perhaps the lost packet will come a bit later due to packet reordering. UDP protocol can’t guarantee the delivery of packets in a sequence.
// tcp recover algorithm further
I wrote such very simple class. It preallocates all required classes and then first thread that receive particular seqNum can process it. Another thread will drop it later:
class MsgQueue
{
public:
MsgQueue();
~MsgQueue(void);
bool Lock(uint32_t msgSeqNum);
Msg& Get(uint32_t msgSeqNum);
void Commit(uint32_t msgSeqNum);
private:
void Process();
static const int QUEUE_LENGTH = 1000000;
// 0 - available for use; 1 - processing; 2 - ready
std::atomic<uint16_t> status[QUEUE_LENGTH];
Msg updates[QUEUE_LENGTH];
};
Implementation:
MsgQueue::MsgQueue()
{
memset(status, 0, sizeof(status));
}
MsgQueue::~MsgQueue(void)
{
}
// For the same msgSeqNum should return true to only one thread
bool MsgQueue::Lock(uint32_t msgSeqNum)
{
uint16_t expected = 0;
return status[msgSeqNum].compare_exchange_strong(expected, 1);
}
void MsgQueue::Commit(uint32_t msgSeqNum)
{
status[msgSeqNum] = 2;
Process();
}
// this method probably should be combined with "Lock" but please ignore! :)
Msg& MsgQueue::Get(uint32_t msgSeqNum)
{
return updates[msgSeqNum];
}
void MsgQueue::Process()
{
// ready packets must be processed,
}
Usage:
if (!msgQueue.Lock(seq)) {
return;
}
Msg msg = msgQueue.Get(seq);
msg.Ticker = "HP"
msg.Bid = 100;
msg.Offer = 101;
msgQueue.Commit(seq);
This works fine if we assume that QUEUE_LENGTH is infinity. Because in this case one msgSeqNum = one updates array item.
But I have to make buffer circular because it is not possible to store entire history (many millions of packets) and there are no reason to do so. Actually I need to buffer enough packets to reconstruct the session, and once session is reconstructed i can drop them.
But having circular buffer significantly complicates algorithm. For example assume that we have circular buffer of length 1000. And at the same time we try to process seqNum = 10 000 and seqNum = 11 000 (this is VERY unlikely but still possible). Both these packets will map to the array updates at index 0 and so collision occur. In such case buffer should 'drop' old packets and process new packets.
It's trivial to implement what I want using locks but writing lock-free code on circular-buffer that used from different threads is really complicated. So I welcome any suggestions and advice how to do that. Thanks!

I don't believe you can use a ring buffer. A hashed index can be used in the status[] array. Ie, hash = seq % 1000. The issue is that the sequence number is dictated by the network and you have no control over it's ordering. You wish to lock based on this sequence number. Your array doesn't need to be infinite, just the range of the sequence number; but that is probably larger than practical.
I am not sure what is happening when the sequence number is locked. Does this mean another thread is processing it? If so, you must maintain a sub-list for hash collisions to resolve the particular sequence number.
You may also consider an array size as a power of 2. For example, 1024 will allow hash = seq & 1023; which should be quite efficient.

Reading packets off a fragmented byte stream with ranges?

I believe I sort of know ranges, but I have no real idea for when and where to use them, or how. I fail to "get" ranges. Consider this example:
Let's assume we have a network handler, that we have no control over, that calls our callback function in some thread whenever there's some new data for us:
void receivedData(ubyte[] data)
This stream of data contains packets with the layout
{
ushort size;
ubyte[size] body;
}
However, the network handler doesn't know about this, so a call to dataReceived() may contain one or two partial packets, one or several complete packets, or a combination. For simplicity, let's assume there can be no corrupt packets and that we'll receive a data.length == 0 when the connection is closed.
What we'd like now is some beautiful D code that turns all this chaos into appropriate calls to
void receivedPacket(ubyte[] body)
I can surely think of "brute-force" ways of accomplishing this. But here's perhaps where my confusion comes in: Can ranges play a role in this? Can we wrap up receivedData() into a nice range-thingy? How? Or is this just not the kind of problems where you'd use ranges? Why not?
(If it would make more sense for using ranges, feel free to redefine the example.)

what I'd do is
ubyte[1024] buffer=void;//temp buffer set the size as needed...
ushort filledPart;//length of the part of the buffer containing partial packet
union{ushort nextPacketLength=0;ubyte[2] packetLengtharr;}//length of the next packet
void receivedData(ubyte[] data){
if(!data.length)return;
if(nextPacketLength){
dataPart = min(nextPacketLength-filledPart.length,data.length);
buffer[filledPart..nextPacketLength] = data[0..dataPart];
filledPart += dataPart;
if(filledPart == nextPacketLength){
receivedPacket(buffer[0..nextPacketLength]);//the call
filledPart=nextPacketLength=0;//reset state
receivedData(datadataPart..$]);//recurse
}
} else{
packetLengtharr[]=data[0..2];//read length of next packet
if(nextPacketLength<data.length){//full paket in data -> avoid unnecessary copies
receivedPacket(data[2..2+nextPacketLength]);
receivedData(data[2+nextPacketLength..$]);//recurse
}else
receivedData(data[2..$]);//recurse to use the copy code above
}
}
it's recursive with 3 possible paths:
data is empty -> no action
nextPacketLength is set to a value != 0 -> copy as much data into the buffer as possible and if the packet is complete call the callback and reset filledPart and nextPacketLength and recurse with rest of data
nextPacketLength ==0 read the packet length (here with a union) if full packet available call callback then recurse with rest of data
there's only 1 bug still in there when data only hold the first byte of length but I'll let you figure that out (and I'm too lazy to do it now)

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

Concurrent send/receive go channel - concurrency

Related

Explicit throughput limiting on part of an akka stream

c++ streaming udp data into a queue?

How to limit an Akka Stream to execute and send down one message only once per second?

avoiding collisions when collapsing infinity lock-free buffer to circular-buffer

Reading packets off a fragmented byte stream with ranges?

Categories

Resources