Using Siddhi patterns for events that haven't happened


In the CEP engine, can I look for patterns involving events that haven't occurred?
Editing the fraud pattern detection query: can I fire the event if two purchases on the same card are made within one day, the first purchase is less than $10, and the second one isn't greater than $10,000?
from every (a1 = purchase[price > 10] ) NOT -> a2 = purchase [price >10000 and a1.cardNo==a2.cardNo] within 1 day
insert into potentialFraud a1.cardNo as cardNo, a2.price as price, a2.place as place;
In other words, can I fire when event1 hasn't been followed by event2 within the last hour, rather than when event1 has been followed by event2 within the last hour?

Non-occurrences are not supported as of CEP 3.1.0 (but they will be available in the next version, 4.0.0).
However, your use case can be implemented in an alternative way. Since you want to find the occurrence of at least 1 event > 10 and no events > 10000 (per card no.) in the last hour, you can do something like the following:
Add a filter that passes only events with price > 10.
Send them to a time window (of 1 hour).
In the time window, use functions to calculate the max() value (with a group by) and emit the max value with the output events.
In a second filter, check for max < 10000.
This will find cards with one or more events with price > 10, but none reaching 10000, in the last hour.
You'll find the documentation below useful for implementing this:
https://docs.wso2.com/display/CEP310/Windows
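The filter, time window, max() and second filter steps in this answer can be sketched outside Siddhi to check the logic. This is only a rough Python model under stated assumptions: events are (timestamp in seconds, card number, price) tuples arriving in time order, and the names detect, low and high are made up for illustration. It is not Siddhi code and does not reproduce the engine's exact window expiry behaviour.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 3600  # 1 hour sliding window, as in the answer


def detect(events, low=10, high=10000, window=WINDOW_SECONDS):
    """Yield (timestamp, card_no, max_price) for each arriving purchase with
    price > low whose card has no purchase >= high inside the window."""
    windows = defaultdict(deque)  # card_no -> deque of (timestamp, price)
    for ts, card_no, price in events:  # events must be ordered by timestamp
        if price <= low:  # step 1: keep only price > low
            continue
        win = windows[card_no]
        win.append((ts, price))  # step 2: add to the time window
        while win[0][0] <= ts - window:
            win.popleft()  # expire events older than the window
        max_price = max(p for _, p in win)  # step 3: max() grouped by card
        if max_price < high:  # step 4: filter on max < high
            yield ts, card_no, max_price


# Made-up sample data: (timestamp, card_no, price)
events = [
    (0, "A", 50),       # emitted: max for card A is 50
    (100, "A", 20000),  # suppressed: max for card A is now 20000
    (200, "A", 30),     # suppressed: the 20000 purchase is still in the window
    (300, "B", 5),      # dropped by the price > 10 filter
    (4000, "A", 40),    # emitted: the 20000 purchase has expired
]
results = list(detect(events))
```

Note that this fires on every qualifying purchase while the card stays below the upper limit, so downstream logic may still need to de-duplicate alerts per card.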

Related

CloudWatch log filter count metric values are < 1

I followed the instructions here:
https://docs.amazonaws.cn/en_us/AmazonCloudWatch/latest/logs/CountOccurrencesExample.html
and created a log filter metric to count occurrences of a particular logged term
But when I graph the metric, the plotted values are below 1.
I don't see how a value of < 1 is possible for a count metric.
It seems like it is calculating something else, perhaps the ratio of hits for the log filter query vs. the total number of log entries. But that's a meaningless stat because these are application logs, so it's not even the ratio of hits vs. number of requests.
The shape of the graph looks right, but the units don't make sense.
How do I get a meaningful count from a log filter metric?

After thinking about this further I realised what maybe should have been obvious already... I was graphing the average rate of a count.
This can very easily be < 1
One option would be to graph the sum (per time bucket) of the count instead, which is an easy way to get "occurrences per minute", per second, or whatever.
I eventually realised that what I really wanted was the percentage of a specific subset of log lines (potential matches) where the log term matched.
I achieved this by creating another metric which counted both the matching and non-matching instances of this log term, for a specific log path (e.g. requests to a particular endpoint, or calls to a specific function).
I can then hide both of those metric lines from the graph and instead add a 'math expression' like m1 / m2 * 100, and show that to graph the percentage of requests which feature the log term of interest.
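To make the difference between the statistics concrete, here is a small arithmetic sketch in Python. The numbers are made up for illustration and are not real CloudWatch output; m1 and m2 simply mirror the names used in the math expression above.

```python
from statistics import mean

# Hypothetical per-minute values of the count metric inside one 5-minute
# graph period (illustrative numbers only).
per_minute_counts = [0, 1, 0, 0, 2]

average = mean(per_minute_counts)  # what an "Average" statistic would plot
total = sum(per_minute_counts)     # what a "Sum" statistic would plot

# Hypothetical second metric counting every candidate log line on the same
# log path, whether or not it contains the term.
m1 = total  # lines that contain the term
m2 = 12     # all candidate lines (assumed value)
percentage = m1 / m2 * 100
```

With these numbers the average is 0.6 even though 3 occurrences were counted, which is how a "count" graph can show values below 1.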

Compare batches of average values with each other in WSO2 Stream Processor

I've written some code in Siddhi that logs/prints the average of each batch of 100 events, so the average for events 0-100, 101-200, etc. I now want to compare these averages with each other to find some kind of trend. To begin with, I just want to see whether there is a simple downward or upward trend over a certain number of averages. For example, I want to compare each average value with the next 1-10 average values.
I've looked into the Siddhi documentation but did not find the answer I wanted. I tried some solutions with partitioning, but they did not work. The code below is what I have right now.
define stream HBStream(ID int, DateTime String, Result double);
@info(name = 'Average100Query')
from HBStream#window.lengthBatch(100)
select ID, DateTime, Result, avg(Result)
insert into OutputStream;

Siddhi sequences can be used to match the averages against each other and identify a trend; see https://siddhi.io/en/v5.1/docs/query-guide/#sequence. The sequence has to read from the stream that carries avgResult (OutputStream in your app), since HBStream has no such attribute:
from every e1=OutputStream, e2=OutputStream[e2.avgResult > e1.avgResult], e3=OutputStream[e3.avgResult > e2.avgResult]
select e1.ID, e3.avgResult - e1.avgResult as tempDiff
insert into TempDiffStream;
Please note that you have to use a partition to evaluate this pattern per ID if you need the averages to be calculated per sensor. In your app, also use group by if you need the average per sensor:
@info(name = 'Average100Query')
from HBStream#window.lengthBatch(100)
select ID, DateTime, Result, avg(Result) as avgResult
group by ID
insert into OutputStream;
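For reasoning about what the sequence is meant to detect, the same idea can be modelled in plain Python. This is a sketch under assumptions: batch_averages and rising_runs are made-up helper names, the readings are invented, and the loop checks every starting position, which may not match exactly how Siddhi's `every` restarts matching.

```python
def batch_averages(values, batch_size):
    """Average of each full, non-overlapping batch (similar to lengthBatch)."""
    return [
        sum(values[i:i + batch_size]) / batch_size
        for i in range(0, len(values) - batch_size + 1, batch_size)
    ]


def rising_runs(averages, length=3):
    """Return (start_index, difference) for every run of `length`
    consecutive, strictly increasing averages."""
    runs = []
    for i in range(len(averages) - length + 1):
        window = averages[i:i + length]
        if all(a < b for a, b in zip(window, window[1:])):
            runs.append((i, window[-1] - window[0]))
    return runs


readings = [1, 3, 2, 4, 5, 7, 6, 6, 2, 2]  # made-up sensor results
averages = batch_averages(readings, batch_size=2)  # batch of 2 for brevity
trend = rising_runs(averages)
```

A downward trend can be checked the same way by flipping the comparison to a > b.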

AWS IoT Analytics Delta Window

I am having real problems getting the AWS IoT Analytics Delta Window to work.
I am trying to set it up so that a scheduled query returns only the last 1 hour of data. According to the docs, the schedule feature can run the query using a cron expression (in my case every hour), and the delta window should restrict the query to records that fall within the specified time window (in my case the last hour).
The SQL query I am running is simply SELECT * FROM dev_iot_analytics_datastore and if I don't include any delta window I get the records as expected. Unfortunately when I include a delta expression I get nothing (ever). I left the data accumulating for about 10 days now so there are a couple of million records in the database. Given that I was unsure what the optimal format would be I have included the following temporal fields in the entries:
datetime : 2019-05-15T01:29:26.509
(A string formatted using ISO Local Date Time)
timestamp_sec : 1557883766
(A unix epoch expressed in seconds)
timestamp_milli : 1557883766509
(A unix epoch expressed in milliseconds)
There is also a value automatically added by AWS called __dt, which uses the same format as my datetime except that it seems to be accurate only to within 1 day, i.e. all values entered within a given day have the same value (e.g. 2019-05-15 00:00:00.00).
I have tried a range of expressions (including the suggested AWS expression) from both standard SQL and Presto as I'm not sure which one is being used for this query. I know they use a subset of Presto for the analytics so it makes sense that they would use it for the delta but the docs simply say '... any valid SQL expression'.
Expressions I have tried so far with no luck:
from_unixtime(timestamp_sec)
from_unixtime(timestamp_milli)
cast(from_unixtime(timestamp_sec) as date)
cast(from_unixtime(timestamp_milli) as date)
date_format(from_unixtime(timestamp_sec), '%Y-%m-%dT%h:%i:%s')
date_format(from_unixtime(timestamp_milli), '%Y-%m-%dT%h:%i:%s')
from_iso8601_timestamp(datetime)

What are the offset and time expression parameters that you are using?
Since delta windows are effectively filters inserted into your SQL, you can troubleshoot them by manually inserting the filter expression into your data set's query.
Namely, applying a delta window filter with -3 minute (negative) offset and 'from_unixtime(my_timestamp)' time expression to a 'SELECT my_field FROM my_datastore' query translates to an equivalent query:
SELECT my_field FROM
(SELECT * FROM "my_datastore" WHERE
(__dt between date_trunc('day', iota_latest_succeeded_schedule_time() - interval '1' day)
and date_trunc('day', iota_current_schedule_time() + interval '1' day)) AND
iota_latest_succeeded_schedule_time() - interval '3' minute < from_unixtime(my_timestamp) AND
from_unixtime(my_timestamp) <= iota_current_schedule_time() - interval '3' minute)
Try running a similar query (with no delta time filter) with the correct values for offset and time expression and see what you get. The (__dt between ...) clause is just an optimization that limits the scanned partitions; you can remove it for the purposes of troubleshooting.
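To see which rows the translated filter would keep, the two timestamp comparisons can be reproduced in Python. This is only a sketch: the schedule times are hypothetical, in_delta_window is a made-up name, and from_unixtime here is a local stand-in for the SQL function, not an AWS API.

```python
from datetime import datetime, timedelta, timezone


def from_unixtime(seconds):
    """Stand-in for the SQL function: epoch seconds -> UTC datetime."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


def in_delta_window(event_time, latest_succeeded, current, offset):
    """Mirror the translated filter:
    latest_succeeded + offset < event_time <= current + offset."""
    return latest_succeeded + offset < event_time <= current + offset


# Hypothetical schedule times for two consecutive hourly runs
latest = datetime(2019, 5, 15, 1, 0, tzinfo=timezone.utc)
current = datetime(2019, 5, 15, 2, 0, tzinfo=timezone.utc)
offset = timedelta(minutes=-3)  # the "-3 minute" offset from the answer

rows = [1557883766, 1557879000, 1557885600]  # epoch seconds
selected = [
    ts for ts in rows
    if in_delta_window(from_unixtime(ts), latest, current, offset)
]

# Epoch milliseconds need dividing by 1000 before the conversion,
# otherwise the resulting date lies far in the future.
from_millis = from_unixtime(1557883766509 / 1000)
```

Only the first row (01:29:26 UTC) falls inside the 00:57 to 01:57 window; the other two are before and after it.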

Please try the following:
Set query to SELECT * FROM dev_iot_analytics_datastore
Data selection filter:
Data selection window: Delta time
Offset: -1 Hours
Timestamp expression: from_unixtime(timestamp_sec)
Wait for dataset content to run for a bit, say 15 minutes or more.
Check contents

After several weeks of testing and trying all the suggestions in this post, along with many more, it appears that the extremely technical answer was to 'switch it off and back on'. I deleted the whole analytics stack and rebuilt everything with different names, and it now seems to be working!
It's important to note that, although I have flagged this as the correct answer because it was the actual resolution, both of the answers provided by @Populus and @Roger would have been correct had my deployment been functioning as expected.

I found by chance that changing SELECT * FROM datastore to SELECT id1, id2, ... FROM datastore solved the problem.

Checking the time in ORACLE APEX 5.1

I'm new to APEX and I'm working on a food ordering application where customers are permitted to change their order details only up to 15 minutes after the order has been placed. How can I implement that?

Create a validation on the date item. Calculate the difference between SYSDATE (i.e. "now") and the order date. Subtracting two DATE datatype values results in a number of days, so multiply it by 24 (to get hours) and by 60 (to get minutes). If that result is more than 15, raise an error.
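The arithmetic in this answer can be checked with a quick Python sketch. It is not APEX or PL/SQL; the names minutes_since and can_edit and the sample times are made up, and the fractional-days value only imitates what subtracting two DATE values gives.

```python
from datetime import datetime

EDIT_LIMIT_MINUTES = 15


def minutes_since(order_date, now):
    """Elapsed minutes, computed the way the answer describes:
    a difference in days, multiplied by 24 and then by 60."""
    days = (now - order_date).total_seconds() / 86400  # fractional days
    return days * 24 * 60


def can_edit(order_date, now):
    """True while the order is still inside the edit window."""
    return minutes_since(order_date, now) <= EDIT_LIMIT_MINUTES


placed = datetime(2024, 1, 1, 12, 0, 0)  # made-up order time
early = can_edit(placed, datetime(2024, 1, 1, 12, 10, 0))  # 10 minutes later
late = can_edit(placed, datetime(2024, 1, 1, 12, 20, 0))   # 20 minutes later
```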

To provide an alternative to Littlefoot's answer: timestamp arithmetic returns intervals, so if you use SYSTIMESTAMP instead, your condition could be:
systimestamp - order_date < interval '15' minute
or, using SYSDATE, something like:
order_date > sysdate - interval '15' minute
One note: the 15 minutes seems somewhat arbitrary (a magic number), and it relies on the order not starting to be processed within that time limit. It feels more natural to say something like "you can change your order until the kitchen has started cooking it". There's then no need for any magic numbers, and there is considerably less wastage (either of the customer's time, always waiting 15 minutes, or of the kitchen's resources, cooking something they may then have to discard).

CEP query based on date / time of the day

In WSO2 CEP, I made an execution plan that includes the following query:
(it will be fired if the temperature exceeds 20 degrees 3 times in a row within 10 seconds)
from MQTTstream[meta_temperature > 20]#window.time(10 sec)
select count(meta_temperature) as meta_temperature
having meta_temperature > 3
insert into out_temperatureAlarm;
How can I make the query apply only during a specific time of day, e.g. from 08:00 until 10:00?
Is there something that I could put into the query like:
having meta_temperature > 3 and HOUR_OF_THE_DAY BETWEEN 8 and 10

You can use a cron window (#window.cron) instead of a time window (#window.time). You can specify a cron expression string for the desired time periods in Siddhi [1]. Please refer to the Quartz Scheduler documentation for more information on cron expression strings [2].
[1] https://docs.wso2.com/display/CEP400/Inbuilt+Windows#InbuiltWindows-croncron
[2] http://www.quartz-scheduler.org/documentation/quartz-1.x/tutorials/crontrigger
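To make the desired behaviour concrete, here is a plain Python model of "count readings above 20 in a 10-second sliding window, but only between 08:00 and 10:00". Note that this is not the cron window suggested above; it swaps in a simple hour-of-day check, and the function name alarms, its parameters and the sample readings are all assumptions for illustration.

```python
from collections import deque
from datetime import datetime, timedelta


def alarms(readings, threshold=20, window=timedelta(seconds=10),
           min_count=3, start_hour=8, end_hour=10):
    """Yield the timestamp whenever more than `min_count` readings above
    `threshold` fall inside the sliding window, but only for readings
    taken from start_hour up to (not including) end_hour."""
    recent = deque()
    for ts, temperature in readings:  # readings must be ordered by time
        if not (start_hour <= ts.hour < end_hour):
            recent.clear()  # outside the allowed hours: ignore and reset
            continue
        if temperature <= threshold:
            continue
        recent.append(ts)
        while recent[0] <= ts - window:
            recent.popleft()  # drop readings older than the window
        if len(recent) > min_count:
            yield ts


morning = datetime(2024, 1, 1, 8, 30, 0)  # made-up times
late = datetime(2024, 1, 1, 11, 0, 0)
readings = [(morning + timedelta(seconds=s), 25) for s in (0, 2, 4, 6)]
readings += [(late + timedelta(seconds=s), 25) for s in (0, 2, 4, 6)]
fired = list(alarms(readings))
```

The same four hot readings raise an alarm at 08:30 but not at 11:00, which is the gating the question asks for.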
