Inconsistent BufferWithTime Behavior

I have a unit test that exercises BufferWithTime, and I seem to be getting inconsistent results when values are emitted at exactly the point the buffer is due to close.
var scheduler = new TestScheduler();

// Cold source: one value every 50 ticks, completing at tick 450.
var source = scheduler.CreateColdObservable(
    new Recorded<Notification<int>>(50, new Notification<int>.OnNext(1)),
    new Recorded<Notification<int>>(100, new Notification<int>.OnNext(2)),
    new Recorded<Notification<int>>(150, new Notification<int>.OnNext(3)),
    new Recorded<Notification<int>>(200, new Notification<int>.OnNext(4)),
    new Recorded<Notification<int>>(250, new Notification<int>.OnNext(5)),
    new Recorded<Notification<int>>(300, new Notification<int>.OnNext(6)),
    new Recorded<Notification<int>>(350, new Notification<int>.OnNext(7)),
    new Recorded<Notification<int>>(400, new Notification<int>.OnNext(8)),
    new Recorded<Notification<int>>(450, new Notification<int>.OnNext(9)),
    new Recorded<Notification<int>>(450, new Notification<int>.OnCompleted()));

var results = scheduler.Run(() => source
    .BufferWithTime(TimeSpan.FromTicks(150), scheduler));
The results I get back from this are essentially:
results[0] = [1,2]
results[1] = [3,4,5,6]
results[2] = [7,8,9]
My question is: why are there only two items in the first buffer and four in the second? I would expect that when the source emits at the same instant a buffer is due to close, the value would either always go into the closing buffer or always be queued for the next one. Have I just stumbled upon a bug?

Based on responses on the MSDN forums, this isn't a bug. You can read their answers here.
Basically, when two things are scheduled to execute at exactly the same time, the order in which they were scheduled takes precedence, i.e. they are queued. Looking at the order of scheduling in the example above, you can see why I get the behaviour I'm getting:
1. BufferWithTime schedules a window to open at 0 and close at 150.
2. The cold source is then subscribed to, which schedules all of its notifications. At this point the value to be emitted at 150 is queued behind the closing of the window.
3. At time 150 the window closes first (emitting the first buffer of two values). The next window is opened and scheduled to close at 300. The value scheduled for time 150 is then added to the second buffer.
4. At time 300 the value 6 is emitted first (as it was scheduled when the source was subscribed to), so it is added to the second buffer. BufferWithTime then closes the window (emitting the buffer) and opens a new one scheduled to close at 450.
The cycle then continues consistently.
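The tie-breaking rule is easy to demonstrate in isolation. A minimal sketch, assuming the Microsoft.Reactive.Testing TestScheduler from more recent Rx releases (the older scheduler.Run-style API queues the same way): two actions due at the same virtual time run in the order they were scheduled.
var scheduler = new TestScheduler();
var order = new List<string>();

// Both actions are due at virtual time 150; they run in scheduling order.
scheduler.Schedule(TimeSpan.FromTicks(150), () => order.Add("close window"));
scheduler.Schedule(TimeSpan.FromTicks(150), () => order.Add("OnNext(3)"));

scheduler.Start();
// order is now ["close window", "OnNext(3)"]: the window close wins the tie,
// so a value emitted at the boundary lands in the next buffer.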

Related

How to use Airflow to process new data in batches?

We want to use Airflow to process new data in batches. First, our DAG runs a command every 15 minutes to check our CRM system for new data, then it pushes the new data to two other systems, so it looks like:
task1 (check if there is new data) > task2 (send new data to system1) > task3 (send new data to system2)
The problems are:
- the amount of new data is dynamic; we don't know how much we might get.
- how do we process the new data one by one?
I am not sure exactly what problem you are facing; please be more specific. That said, your best bet is to create a custom operator (if no built-in one fits):
Task1 (extract the new data and write it to a location, e.g. export it as ndjson or another format) >
Task2 (check whether there is any data; if the location is dynamic, pass it through XCom) >
Task3 (same as Task2; the location may be passed via XCom)
Each run, triggered every 15 minutes, then fetches just the new data and pushes it downstream.
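As a rough sketch of that layout (the DAG id, task ids, export path, and helper bodies are all hypothetical; assumes Airflow 2's PythonOperator and XCom APIs):
from datetime import datetime, timedelta
from airflow import DAG
from airflow.operators.python import PythonOperator

def extract_new_data(ti):
    # Hypothetical: query the CRM and export the new records as ndjson.
    path = "/tmp/crm_export.ndjson"
    # ... write the new records to `path` ...
    ti.xcom_push(key="export_path", value=path)

def send_to_system1(ti):
    path = ti.xcom_pull(task_ids="extract_new_data", key="export_path")
    # ... read `path` and push each record to system1 ...

def send_to_system2(ti):
    path = ti.xcom_pull(task_ids="extract_new_data", key="export_path")
    # ... read `path` and push each record to system2 ...

with DAG(
    dag_id="crm_batch_sync",
    start_date=datetime(2024, 1, 1),
    schedule_interval=timedelta(minutes=15),
    catchup=False,
) as dag:
    t1 = PythonOperator(task_id="extract_new_data", python_callable=extract_new_data)
    t2 = PythonOperator(task_id="send_to_system1", python_callable=send_to_system1)
    t3 = PythonOperator(task_id="send_to_system2", python_callable=send_to_system2)
    t1 >> t2 >> t3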

QueueTrigger: Message with encoding error doesn't push message to poison queue

I have a webjob queue trigger which responds to queue messages, and it works fine. However, sometimes we push messages into the queue manually, and a manual mistake can cause a DecoderFallbackException. The strange behaviour is that the webjob seems to retry an unlimited number of times, and our Application Insights logs are becoming a mess. I tried restarting the webjob to see if it clears any internal cache, but that doesn't help; the only thing that helps is deleting the queue.
Ideally, any exception beyond the dequeue count should move the message to the poison queue.
I've tried to reproduce your issue on my side, but it works as expected. First I created a small backend demo that inserts an invalid byte message into the queue, which causes a DecoderFallbackException:
Encoding ae = Encoding.GetEncoding(
    "us-ascii",
    new EncoderExceptionFallback(),
    new DecoderExceptionFallback());

string inputString = "XYZ";
byte[] encodedBytes = new byte[ae.GetByteCount(inputString)];
ae.GetBytes(inputString, 0, inputString.Length, encodedBytes, 0);

// Make the bytes invalid for us-ascii so decoding the message will throw.
encodedBytes[0] = 0xFF;
encodedBytes[2] = 0xFF;

CloudQueueMessage message = new CloudQueueMessage(encodedBytes);
queue.AddMessage(message);
Web Job code:
public static void ProcessQueueMessage([QueueTrigger("queue")] string message, TextWriter log)
{
    log.WriteLine(message);
}
After the exception occurs 5 times, the message is moved to 'queue-poison'. This is the expected behaviour. Check here for details:
maxDequeueCount (default 5): the number of times to try processing a message before moving it to the poison queue.
You may want to check whether you accidentally set maxDequeueCount to a larger value. If not, please share your webjob code and where you see the DecoderFallbackException so we can investigate.
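For reference, the dequeue limit is configured on the host; a minimal sketch, assuming the classic JobHostConfiguration API from the WebJobs SDK:
static void Main()
{
    var config = new JobHostConfiguration();

    // Messages move to the poison queue after this many failed attempts.
    // 5 is the default; a larger value here would explain endless retries.
    config.Queues.MaxDequeueCount = 5;

    var host = new JobHost(config);
    host.RunAndBlock();
}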

Updating a database record only after waiting 20 seconds so the maximum number of records can be received

When a student record is added from a tool (a tool which adds new student details to the database), the tool sends a RecordChangedEvent, and it can add many records at the same time.
Note: when a RecordChangedEvent is received, I have to call changed.list to get the newly added student records.
Say I add 200 records from the tool: the tool sends 200 RecordChangedEvents, but I don't want to handle 200 events. Instead, I want to delay by 20 seconds on the first event and ignore any events that arrive during that delay. When the delay ends, I call changed.list, which will contain all the records that arrived during those 20 seconds, and add them all in a single go.
My problem is that I am receiving 200 RecordChangedEvents and each one delays for 20 seconds, so the total delay is 200 * 20 seconds (which is bad). I want to ignore all events that arrive during the delay and, when the delay ends, just fetch the updated list from changed.list.
Below is my (inefficient) approach:
void RecordChangedEvent(void)
{
    static bool lock = false;
    bool updateNewRecord = false;

    // Delay for 20 sec when the first event is received so that the
    // maximum number of records can accumulate.
    if (!lock)
    {
        lock = true;
        std::this_thread::sleep_for(std::chrono::seconds(20));
        updateNewRecord = true;
    }

    if (updateNewRecord)
    {
        // After the 20 sec delay, changed.list holds all the records
        // received during the delay.
        AddedRecord(changed.list);
        lock = false;
    }
}
You need multiple threads calling RecordChangedEvent. If you only use one thread, that thread will block for 20 seconds before returning, getting the next event, then calling RecordChangedEvent again (where it will wait for 20 seconds again). So you need at least one more thread to handle events while the first thread is sleeping.
Once you have multiple threads, you can still run into problems. Your lock variable is not thread safe. There is the remote possibility that two threads can enter the if with the lock at the same time. Also, there is no guarantee that a change to lock in one thread will be immediately visible in another. You should use a standard lock object instead.
Another source of problems is that your use of changed.list here is unsynchronized: you're reading from it (either to copy it when passing it to AddedRecord, or inside AddedRecord if you pass it by reference) while another thread elsewhere in your program may be adding new items to the list.
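Putting those fixes together, here is a minimal sketch of the batching idea (the Changed type and AddedRecord stub are hypothetical stand-ins for the question's real code): an atomic flag makes the one-shot check thread safe, the sleep happens on a separate thread so the handler returns immediately, and a mutex guards the shared list.
#include <atomic>
#include <chrono>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-ins for the question's real types.
struct Changed { std::vector<int> list; } changed;
void AddedRecord(const std::vector<int>& records) { /* update the database */ }

std::mutex listMutex;           // must also be taken by whatever appends to changed.list
std::atomic<bool> batchPending{false};

void RecordChangedEvent()
{
    // Only the first event of a burst starts the 20-second batch window;
    // compare_exchange_strong makes the check-and-set a single atomic step.
    bool expected = false;
    if (batchPending.compare_exchange_strong(expected, true))
    {
        // Sleep on a separate thread so this handler returns immediately
        // and the events arriving during the window are simply ignored.
        std::thread([] {
            std::this_thread::sleep_for(std::chrono::seconds(20));
            {
                std::lock_guard<std::mutex> guard(listMutex);
                AddedRecord(changed.list);
            }
            batchPending = false;
        }).detach();
    }
}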

How to limit an Akka Stream to execute and send down one message only once per second?

I have an Akka Stream and I want the stream to send messages downstream approximately once per second.
I tried two ways to solve this problem. The first was to make the producer at the start of the stream only send a message once every second, when a Continue message comes into the actor:
// When a Continue message is received in the ActorPublisher,
// do the work, then...
if (totalDemand > 0) {
  import scala.concurrent.duration._
  context.system.scheduler.scheduleOnce(1.second, self, Continue)
}
This works for a short while, then a flood of Continue messages appears in the ActorPublisher actor. I assume (a guess, not certain) they come from downstream via back-pressure, requesting messages because downstream can consume quickly while upstream is not producing at a fast rate. So this method failed.
The other way I tried was via back-pressure control: I used a MaxInFlightRequestStrategy on the ActorSubscriber at the end of the stream to limit the number of messages to 1 per second. This works, but messages come in roughly three at a time rather than one at a time. It seems the back-pressure control doesn't immediately change the rate of incoming messages, or messages were already queued in the stream waiting to be processed.
So the problem is: how can I have an Akka Stream that processes only one message per second?
I discovered that MaxInFlightRequestStrategy is a valid way to do it, but I should have set the batch size to 1; its batch size defaults to 5, which was causing the problem I saw. It is also an over-complicated way to solve the problem now that I am looking at the submitted answer here.
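For reference, the batch-size fix amounts to overriding batchSize on the strategy inside the ActorSubscriber. A rough sketch, assuming the classic akka.stream.actor API (the inFlightInternally body is hypothetical):
override val requestStrategy = new MaxInFlightRequestStrategy(max = 1) {
  // Hypothetical: however many messages this subscriber is still processing.
  override def inFlightInternally: Int = queue.size
  // batchSize defaults to 5, which explains the bursts of ~3 messages at a time.
  override def batchSize: Int = 1
}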
You can either put your elements through a throttling flow, which will back-pressure a fast source, or you can use a combination of tick and zip.
The first solution would look like this:
val veryFastSource =
  Source.fromIterator(() => Iterator.continually(Random.nextLong() % 10000))

val throttlingFlow = Flow[Long].throttle(
  // how many elements you allow
  elements = 1,
  // per what unit of time
  per = 1.second,
  maximumBurst = 0,
  // you can also set this to Enforcing, but then your stream
  // will fail if it exceeds the allowed elements per second
  mode = ThrottleMode.Shaping
)

veryFastSource.via(throttlingFlow).runWith(Sink.foreach(println))
The second solution would look like this:
val veryFastSource =
  Source.fromIterator(() => Iterator.continually(Random.nextLong() % 10000))

val tickingSource = Source.tick(1.second, 1.second, 0)

// zip pairs each fast element with a tick, so elements pass at the tick rate
veryFastSource.zip(tickingSource).map(_._1).runWith(Sink.foreach(println))
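For anyone who wants to paste and run the throttle approach, here is a self-contained version (assumes classic Akka Streams with an ActorMaterializer; newer versions can drop the materializer):
import akka.actor.ActorSystem
import akka.stream.{ActorMaterializer, ThrottleMode}
import akka.stream.scaladsl.{Sink, Source}
import scala.concurrent.duration._
import scala.util.Random

object ThrottleExample extends App {
  implicit val system = ActorSystem("throttle-example")
  implicit val materializer = ActorMaterializer()

  // The fast source is back-pressured down to one element per second.
  Source.fromIterator(() => Iterator.continually(Random.nextLong() % 10000))
    .throttle(elements = 1, per = 1.second, maximumBurst = 1, mode = ThrottleMode.Shaping)
    .runWith(Sink.foreach(println))
}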

Stopping or cancelling queued keyboard commands in a program

I have a program written in Python 2.7 which takes photos of a sample from 3 different cameras when the result value is typed into the program.
The USB controller's bandwidth can't handle all the cameras firing at the same time, so I have to trigger each one individually. This causes a delay between entering the value and the previews of the pictures showing up.
During this delay the program still accepts keyboard input, which is then processed once the photos have been taken. This causes issues: sometimes a value is entered twice, and the duplicate is then applied to the next sample after the photos for the first one have been taken.
What I'm after is a way to disregard any queued keyboard input while the program is working on the current command:
def selChange(self):
    # Disable the textbox
    self.valInput.configure(state='disabled')
    # Gather pictures from the cameras and store them in a 2D list with the
    # sample result (this takes a second or two to complete)
    self.gatherPictures()
    if not int(self.SampleList.size()) == 0:
        # Clear the textbox
        self.valInput.delete(0, END)
        # Create previews from the 2D list
        self.img1 = ImageTk.PhotoImage(self.dataList[int(self.SampleList.curselection()[0])][2].resize((250, 250), Image.ANTIALIAS))
        self.pic1.configure(image=self.img1)
        self.img2 = ImageTk.PhotoImage(self.dataList[int(self.SampleList.curselection()[0])][3].resize((250, 250), Image.ANTIALIAS))
        self.pic2.configure(image=self.img2)
        self.img3 = ImageTk.PhotoImage(self.dataList[int(self.SampleList.curselection()[0])][4].resize((250, 250), Image.ANTIALIAS))
        self.pic3.configure(image=self.img3)
        self.img4 = ImageTk.PhotoImage(Image.open("Data/" + str(self.dataList[int(self.SampleList.curselection()[0])][1]) + ".jpg").resize((250, 250), Image.ANTIALIAS))
        self.pic4.configure(image=self.img4)
    # Unlock the textbox ready for the next sample
    self.valInput.configure(state='normal')
I was hoping that disabling the textbox and re-enabling it afterwards would work, but it doesn't. I wanted to use buttons instead, but they insisted that values be typed, to keep entry fast.
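One approach worth trying (a sketch, untested against the original program): Tk only delivers queued events when control returns to the event loop, so you can force that delivery while the Entry is still disabled. The queued keystrokes then go to a disabled widget and are discarded, after which it is safe to clear and re-enable it:
def selChange(self):
    # Reject live keystrokes while we work
    self.valInput.configure(state='disabled')

    self.gatherPictures()  # the slow part: cameras fire one at a time
    # ... build the previews as before ...

    # Flush the event queue while the Entry is still disabled: the queued
    # keystrokes are delivered to a disabled widget and dropped
    self.update()

    # Now re-enable and clear any stale text for the next sample
    self.valInput.configure(state='normal')
    self.valInput.delete(0, END)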