Using Siddhi CEP 3.1.2 as a standalone library, I am trying to figure out how to correctly trigger the callback handler under a number of conditions.
Two events joined by logical AND
a AND b
I find with the above that if I provide both events, the callback handler is triggered. However, I've also seen that if I subsequently provide either a or b on its own, the handler is triggered again. I did not expect the latter to happen, as I assumed there would be no match after the previous execution of the handler... Is there a way to clear the streams following a successful match?
Two events joined by logical AND, including a within clause.
a AND b within 5 sec
I've also found, in the second case, that the "within" option is seemingly ignored: the callback is triggered regardless of the time gap between events a and b.
Have I misunderstood the Siddhi documentation? I'd appreciate any guidance on these. Thanks
Based on our testing, these are two bugs, and they have been fixed with PR #436.
The following test cases are added to ensure the correct behavior:
Test A and B
Test every (A and B)
Test A -> B and C within 1 sec
These fixes will be available from the next release onwards, or you can build Siddhi from source and test them right now. Here is a blog post on how to build and use Siddhi from source: Siddhi 4.0.0 Early Access.
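If you want to reproduce the first two cases yourself against a source build, a minimal standalone sketch along the following lines should do it. This uses the Siddhi 3.x standalone API from the question; the stream definitions, attribute names and sample events are invented for illustration, and the exact placement of the within clause should be checked against the pattern documentation.

import org.wso2.siddhi.core.ExecutionPlanRuntime;
import org.wso2.siddhi.core.SiddhiManager;
import org.wso2.siddhi.core.event.Event;
import org.wso2.siddhi.core.query.output.callback.QueryCallback;
import org.wso2.siddhi.core.stream.input.InputHandler;

public class AndPatternTest {
    public static void main(String[] args) throws InterruptedException {
        SiddhiManager siddhiManager = new SiddhiManager();
        // 'every' re-arms the pattern after each match; without it the pattern is
        // expected to match only once. 'within 5 sec' constrains the time gap
        // between the two events (the behaviour reported as broken above).
        String executionPlan =
                "define stream StreamA (symbol string); " +
                "define stream StreamB (symbol string); " +
                "@info(name = 'query1') " +
                "from every (e1=StreamA and e2=StreamB) within 5 sec " +
                "select e1.symbol as a, e2.symbol as b " +
                "insert into MatchStream;";

        ExecutionPlanRuntime runtime = siddhiManager.createExecutionPlanRuntime(executionPlan);
        runtime.addCallback("query1", new QueryCallback() {
            @Override
            public void receive(long timeStamp, Event[] inEvents, Event[] removeEvents) {
                System.out.println("Pattern matched, events: " + inEvents.length);
            }
        });
        runtime.start();

        InputHandler streamA = runtime.getInputHandler("StreamA");
        InputHandler streamB = runtime.getInputHandler("StreamB");
        streamA.send(new Object[]{"a"});
        streamB.send(new Object[]{"b"});   // expected to fire the callback once

        Thread.sleep(100);
        runtime.shutdown();
        siddhiManager.shutdown();
    }
}

Sending further lone a or b events after the match, or sending a and b more than 5 seconds apart, is where the fixed behaviour should differ from 3.1.2.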
I've recently been developing a Dataflow consumer which reads from a PubSub subscription and outputs to Parquet files the combination of all the objects grouped within the same window.
While I was testing this without a heavy load, everything seemed to work fine.
However, after performing some heavy testing I can see that of 1,000,000 events sent to that PubSub queue, only 1,000 make it to Parquet!
According to the wall times across the different stages, the one which parses the events before applying the window takes 58 minutes, and the last stage, which writes the Parquet files, takes 1 hour and 32 minutes.
I will now show the most relevant parts of the code; hopefully you can shed some light on whether this is due to the logic that comes before the Window definition or the Window definition itself.
pipeline
    .apply("Reading PubSub Events",
        PubsubIO.readMessagesWithAttributes()
            .fromSubscription(options.getSubscription()))
    .apply("Map to AvroSchemaRecord (GenericRecord)",
        ParDo.of(new PubsubMessageToGenericRecord()))
    .setCoder(AvroCoder.of(AVRO_SCHEMA))
    .apply("15m window",
        Window.<GenericRecord>into(FixedWindows.of(Duration.standardMinutes(15)))
            .triggering(AfterProcessingTime
                .pastFirstElementInPane()
                .plusDelayOf(Duration.standardSeconds(1)))
            .withAllowedLateness(Duration.ZERO)
            .accumulatingFiredPanes()
    )
Also note that I'm running Beam 2.9.0.
Could the logic inside the second stage be too heavy, so that messages arrive too late and get discarded by the Window? The logic basically consists of reading the payload and parsing it into a POJO (reading inner Map attributes, filtering and such).
However, when I send a million events to PubSub, all of those events make it to the Parquet write stage, yet the resulting Parquet files only contain part of them. Does that make sense?
I would need the trigger to consume all of those events regardless of the delay.
Citing from an answer on the Apache Beam mailing list:
This is an unfortunate usability problem with triggers where you can accidentally close the window and drop all data. I think instead, you probably want this trigger:
Repeatedly.forever(
AfterProcessingTime
.pastFirstElementInPane()
.plusDelayOf(Duration.standardSeconds(1)))
The way I recommend to express this trigger is:
AfterWatermark.pastEndOfWindow().withEarlyFirings(
AfterProcessingTime
.pastFirstElementInPane()
.plusDelayOf(Duration.standardSeconds(1)))
In the second case it is impossible to accidentally "close" the window and drop all data.
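Plugged back into the pipeline from the question, the windowing step might then look roughly like the sketch below (written against the Beam 2.9.0 Java SDK; only the trigger changes, and the wrapper class and method name are invented for illustration).

import org.apache.avro.generic.GenericRecord;
import org.apache.beam.sdk.transforms.windowing.AfterProcessingTime;
import org.apache.beam.sdk.transforms.windowing.AfterWatermark;
import org.apache.beam.sdk.transforms.windowing.FixedWindows;
import org.apache.beam.sdk.transforms.windowing.Window;
import org.joda.time.Duration;

public class SafeWindowing {
    // A 15-minute fixed window whose primary trigger is the end-of-window
    // watermark, so the window cannot accidentally "close" early and drop data;
    // the early-firing clause keeps the speculative output one second after the
    // first element that the original trigger was aiming for.
    static Window<GenericRecord> fifteenMinuteWindow() {
        return Window.<GenericRecord>into(FixedWindows.of(Duration.standardMinutes(15)))
                .triggering(AfterWatermark.pastEndOfWindow()
                        .withEarlyFirings(AfterProcessingTime
                                .pastFirstElementInPane()
                                .plusDelayOf(Duration.standardSeconds(1))))
                .withAllowedLateness(Duration.ZERO)
                // kept from the question; with several firings per window,
                // discardingFiredPanes() may be preferable so records already
                // written are not re-emitted into the Parquet files
                .accumulatingFiredPanes();
    }
}

With the watermark-based form, elements that arrive after the first early firing are still emitted in a later pane instead of being dropped with a closed window.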
I'm using Stream Processor to receive events, and I need to know whether there is any way to check that some event arrived within a specified time window. Let's say we want to check that an event arrives every 5 minutes; if it doesn't, we need to publish an alert. Does Siddhi 4.0 have any function for this purpose? My idea was to count matching events in a time window and then compare the count, but I don't know if that is the best way to deal with this problem.
You can do this using logical patterns.
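As a rough illustration of the kind of pattern meant here, the sketch below uses the standalone Siddhi 4.x library; the stream definition and message text are invented, and it assumes your Siddhi 4.x version supports the absence (not ... for) form of logical patterns, so check the exact syntax against the documentation of your Stream Processor release. The same Siddhi app body can be deployed to Stream Processor directly instead of being run from Java.

import org.wso2.siddhi.core.SiddhiAppRuntime;
import org.wso2.siddhi.core.SiddhiManager;
import org.wso2.siddhi.core.event.Event;
import org.wso2.siddhi.core.stream.output.StreamCallback;

public class MissingEventAlert {
    public static void main(String[] args) {
        // Raises an alert when no heartbeat event arrives for 5 minutes.
        String siddhiApp =
                "define stream HeartbeatStream (deviceId string); " +
                "from not HeartbeatStream for 5 min " +
                "select 'No heartbeat received for 5 minutes' as message " +
                "insert into AlertStream;";

        SiddhiManager siddhiManager = new SiddhiManager();
        SiddhiAppRuntime runtime = siddhiManager.createSiddhiAppRuntime(siddhiApp);
        runtime.addCallback("AlertStream", new StreamCallback() {
            @Override
            public void receive(Event[] events) {
                for (Event event : events) {
                    System.out.println("ALERT: " + event.getData(0));
                }
            }
        });
        runtime.start();
    }
}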
I have created a framework in which I use Set Browser Implicit Wait 30.
I have 50 suites that contain a total of 700 test cases. Some of the test cases (around 200) have steps that check whether an element is present or not present. My objective is to avoid waiting the full 30 seconds for these checks. I tried using Wait Until Element Is Visible ${locator} timeout=10, expecting it to wait only 10 seconds for the element, but it still waits for 30 seconds.
Question: Can somebody suggest the right approach to deal with such scenarios in my framework? If I accept the 30-second wait, the time taken to complete these test cases increases considerably; I am currently trying to save about 20 * 200 seconds. Please advise.
The simplest solution is to change the implicit wait right before checking that an element does not exist, and then change it back afterwards. You can do this with the keyword Set Selenium Implicit Wait.
For example, your keyword might look something like this:
*** Keywords ***
Verify Element Is Not On Page
    [Arguments]    ${locator}
    ${old_wait}=    Set Selenium Implicit Wait    10
    Run Keyword And Continue On Failure
    ...    Page Should Not Contain Element    ${locator}
    Set Selenium Implicit Wait    ${old_wait}
You can simply add timeout=${Time} next to the keyword you want to execute (e.g., Wait Until Page Contains Element    ${locator}    timeout=50).
The problem you're running into is the classic issue of implicit waits vs. explicit waits. Searching the internet will provide you with a lot of good explanations of why mixing them is not recommended, but I think Jim Evans (creator of the IE WebDriver) explained it nicely in this Stack Overflow answer.
Improving the performance of your test run is typically done by utilizing one or both of these:
Shorten the duration of each individual test
Run tests in parallel.
Shortening the duration of a test typically means being in complete control of the application under test, so the script knows the moment the application has successfully loaded. This means having a low implicit wait, or none at all, and working exclusively with fluent waits (waiting for a condition to occur). This will result in your tests running at the speed your application allows.
This may mean investing time understanding the application you test on a technical level. By using a custom locator you can still use all the regular SeleniumLibrary keywords and have a centralized waiting function.
Running tests in parallel starts with having tests that run standalone and have no dependencies on other tests. In Robot Framework this means having Test Suite Files that can run independently of each other. Most of us use Pabot to run our suites in parallel and merge the log file afterwards.
Running several browser application tests in parallel means running more than one browser at the same time. If you test in Chrome, this can be done on a single host, though it's not always recommended. When you run IE, you need multiple boxes/sessions, and then you start to need a Selenium Grid type of solution to distribute the execution load across multiple machines.
I use the Event Hub support of the Azure WebJobs SDK to process events. Because of the throughput I decided to go for batch processing of those events, e.g. my method looks like this:
public static void HandleBatchRaw([EventHubTrigger("std")] EventData[] events) {...}
Now one of those events within a batch might cause an exception - what's the right way to handle that? If I leave the exception uncaught, processing stops and the remainder of the events in the EventData[] parameter are lost.
Options:
1. Catch the exception manually, forward the event to some place else, and continue.
2. Let the SDK do the magic, e.g. it should just 'ACK' the events processed until then (I would probably have to do that), mark this event as 'poisoned', exit the method, and continue on the next call of the function.
3. Move to single-event handling - but for performance goals I don't feel that's right.
4. I missed the point and should think of another strategy.
How should I approach this?
There are only four choices in any messaging solution:
1. Stop
2. Drop
3. Retry
4. Dead letter
You have to do that yourself; I don't believe the SDK will retry anything. Recall that there is no ACK for an Event Hubs read - you just read.
How are you checkpointing?
Your best bet is probably your option #1. WebJobs EventHub binding doesn't give you many options here. Feel free to file an issue at https://github.com/Azure/azure-webjobs-sdk/issues to request better error handling support here.
If you want to see exactly what it's doing under the hood, here's the spot in the WebJobs SDK EventHub binding that receives events via EventProcessorHost:
https://github.com/Azure/azure-webjobs-sdk/blob/dev/src/Microsoft.Azure.WebJobs.ServiceBus/EventHubs/EventHubListener.cs#L86
Hard to write a good title for this question. I am developing a performance test in Gatling for a SOAP Webservice. I'm not very experienced with Gatling so I'm learning things as I go, but this conundrum has me entirely stumped.
One of the scenarios I am implementing a test for is an order-process consisting of several unique consecutive calls to the webservice, one of which is a polling call that returns the current status of the ordering process. Simplified, this call gets a SOAP Response with a status that can be of three types:
PROCESSING - Signifying the order is still processing.
ORDER_OK - Order completed without errors.
EVERYTHING_ELSE - A group of varying error-statuses and other results.
What I want to do, is have Gatling continuously poll the webservice until the processing-status changes - and then check that the status says it completed successfully. Polling continuously is easily implemented, but performing the check after it completes is turning out to be a far greater challenge than it has any business being.
So far, this is what I've done to solve the polling:
exec { session => session.set("status", "PROCESSING") }
.asLongAs(session => session("status").as[String].equals("PROCESSING")) {
exec(http("Poll order")
.post("/MyWebService")
.body(ELFileBody("bodies/ws/pollOrder.xml"))
.check(
status.is(200),
regex("soapFault").notExists,
regex("pollResponse").exists,
xpath("//*[local-name(.)='result']").exists.saveAs("status")
)
).exitHereIfFailed.pause(5 seconds)
}
This snippet appears to perform the polling correctly: it continues to poll until the order status changes from PROCESSING to something else. However, I still need to check what the status changed to, because I don't know what it is, and only one of the many possible results should cause the scenario to continue for that user.
A potential fix would be to add more checks in that call that go something like this:
.check(regex("EVERYTHING_ELSE_XYZ").notExists)
The service can return a LOT of different "not a happy day" messages however and I'm only really interested in the two other ones, so it would be preferable for me to be able to do a check only for the two valid happy-day responses. Checking if one exact thing exists seems far more sensible than checking that dozens of things don't.
What I thought I would be able to do was perform a check on the status variable in the user's session when the step exits the asLongAs loop, and continue or fail the scenario for that user. As it's a session variable I could probably do this in the next step of the overall scenario and break the run for that user there, but that would also mean the error is reported in the wrong place, and the next call's failure percentage would be polluted by errors from the previous call.
Using pseudocode, being able to do something like this immediately after it exits the asLongAs loop would have been perfect:
if (session("status").as[String].equals("ORDER_OK")) ? continueTheScenario : failTheScenario
but I've not been able to do anything similar to that inside a gatling-chain. It's almost starting to appear impossible to do something like that, but can anyone see a solution to this that I'm not seeing?
Instead of "exists", use "in" to check that the result is one of the 2 valid values.
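For the snippet in the question, that would mean replacing the last check with something along the lines of xpath("//*[local-name(.)='result']").in("PROCESSING", "ORDER_OK").saveAs("status") (the names are taken from the question; the exact validator syntax may vary slightly between Gatling versions). Any other status then fails the check, and the existing exitHereIfFailed stops that user at the polling step instead of polluting the statistics of the following requests.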