Support for creating dynamic query/rule in WSO2 CEP

How can I write queries dynamically in WSO2 CEP? As in PSQL, where the user has the option to build queries dynamically, is there an alternative in the CEP tool to achieve this?
Use case: if the stream carries both the functions and the data, there should be some option (or an adapter the user can build) to create rules dynamically.
For example, stream:
1. function1: sum, function2: avg, function3: count, value1: 1, value2: 2, value3: 3
2. function1: sum, function2: min, function3: max, value1: 1, value2: 2, value3: 3
The rules should be created dynamically, as follows:
select sum(value1) as value1, avg(value2) as value2, count(value3) as value3 from ....
select sum(value1) as value1, min(value2) as value2, max(value3) as value3 from ....
Thanks
Gagan

In CEP we use ExecutionPlans to deploy queries and configurations, so if you want to change a query you need to edit the execution plan. In a manual use case, an ExecutionPlan can be changed from the management console. When you change an ExecutionPlan, the changes come into effect immediately; however, you will lose current state such as windows.
Do you want these rules to change depending on the incoming events? If so, you can add another layer of Siddhi queries that analyzes incoming events and routes them to the desired query.
Tishan

Algolia Grouping (attributeForDistinct)

I'm grouping data by the attributeForDistinct option from the Algolia dashboard. Is there any way to use this option from react-instantsearch-dom? There is a distinct option in the dashboard that can be toggled true/false from code. I want to use the attributeForDistinct option from code to group data dynamically.
You would turn on distinct in your index configuration, or you can configure it client side by adding it to a Configure widget.
Here's a simple example:
https://www.algolia.com/doc/api-reference/widgets/configure/react/#examples
Set it to an integer value between 0 and 4 to control the number of records with the same value for attributeForDistinct returned in the result set.
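If you go with the first route (turning it on in the index configuration), here is a minimal sketch using Algolia's Python API client; the choice of client is an assumption for illustration (any API client or the dashboard works), and the index name "products" and attribute "series" are placeholders for your own values:

# Hedged sketch: configure attributeForDistinct/distinct on the index itself,
# as an alternative to passing distinct through the Configure widget.
from algoliasearch.search_client import SearchClient

client = SearchClient.create("YOUR_APP_ID", "YOUR_ADMIN_API_KEY")
index = client.init_index("products")  # placeholder index name

# attributeForDistinct must be set at the index level; distinct itself can be
# a boolean or an integer and can also be overridden per query / via Configure.
index.set_settings({
    "attributeForDistinct": "series",  # placeholder attribute
    "distinct": True,
})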

Kibana: can I store "Time" as a variable and run a consecutive search?

I want to automate a few searches into one; here are the steps:
Search in Kibana for this ID: "b2c729b5-6440-4829-8562-abd81991e2a0", which will return a bunch of logs. Of these logs I need to take the first and the last timestamp.
I would now like to store these two values, FROM: September 3rd 2019, 21:28:22.155 and TO: September 3rd 2019, 21:28:23.524, in two variables.
Run a second search in Kibana for the word "fail" between these two time variables.
How can I automate the whole process without needing to copy/paste and run a second query?
EDIT:
SHORT STORY LONG: I work at a company that produces software for autonomous vehicles.
SCENARIO: A booking is rejected and we need to understand why.
WHERE IS THE PROBLEM: I need to monitor just a few seconds of logs on 3 different machines. Each log is completely separate; there is no relation between the logs, so I cannot write a single query in Discover. I need to run 3 separate queries.
EXAMPLE:
A booking was rejected, so I open Chrome and search on "elk-prod.myhost.com" for the BookingID "b2c729b5-6440-4829-8562-abd81991e2a0", and I get a dozen logs returned over a range of 2 seconds (FROM: September 3rd 2019, 21:28:22.155, TO: September 3rd 2019, 21:28:23.524).
Now I need to know what was happening on the car, so I open a new Chrome tab and search on "elk-prod.myhost.com" for the CarID "Tesla-45-OU" in the time range FROM: September 3rd 2019, 21:28:22.155, TO: September 3rd 2019, 21:28:23.524.
Now I need to know why the server that calculates the matching rejected the booking, so I open a new Chrome tab and search for the word CalculationMatrix, again in the time range FROM: September 3rd 2019, 21:28:22.155, TO: September 3rd 2019, 21:28:23.524.
CONCLUSION: I want to stop opening Chrome tabs by hand and automate the whole thing. I have no idea around what time the booking was made, so I first need to search for the BookingID "b2c729b5-6440-4829-8562-abd81991e2a0", then store the timestamps of the first and last log and run a second and third query based on those timestamps.
There is no relation between the 3 logs I search, so there is no way to filter from Discover; I need to automate 3 different queries.
Here is how I would do it. First of all, from what I understand, you have three different indexes:
one for "bookings"
one for "cars"
one for "matchings"
First, in Discover, I would create three Saved Searches, one per index pattern. Then in Visualize, I would create a Vertical bar chart on the bookings saved search (Bucket X-Axis by date_histogram on the timestamp field, leave the rest as is). You'll get a nice histogram of all your booking events bucketed by time.
Finally, I would create a dashboard and add the vertical bar chart + those three saved searches inside it.
When done, the way I would search according to the process you've described above is as follows:
Search for the booking ID b2c729b5-6440-4829-8562-abd81991e2a0 in the top filter bar. In the bar chart histogram (bookings), you will see all documents related to the selected booking. On that chart, you can select the exact period from when the very first booking document happened to the very last. This will adapt the main time picker at the top, and the start/end time will be "remembered" by Kibana.
Remove the booking ID from the top filter (since we now know the time range and Kibana stores it). Search for Tesla-45-OU in the top filter bar. The bar histogram + the booking saved search + the matchings saved search will be empty, but you'll have data inside the second list, the one for cars. Find whatever you need to find in there and go to the next step.
Remove the car ID from the top filter and search for ComputationMatrix. Now the third saved search is going to show you whatever documents you need to see within that time range.
I'm lacking realistic data to try this out, but I definitely think this is possible as I've laid out above, probably with some adaptations.
Kibana does work like this (in any order):
Select a time filter: https://www.elastic.co/guide/en/kibana/current/set-time-filter.html
Add additional search criteria, for example field s is b2c729b5-6440-4829-8562-abd81991e2a0.
Add additional search criteria, for example field x is Fail.
Additionally, you can view the surrounding documents: https://www.elastic.co/guide/en/kibana/current/document-context.html#document-context
This is how Kibana works.
You can prepare some filters beforehand, save them, and then use them if you want to automate the discovery process somehow.
You can do that in the Discover tab in Kibana using the New/Save/Open options.
Edit:
I do not think you can achieve what you need in Kibana. As I mentioned earlier, one option is to change the data that is coming into Elasticsearch so you can search for it via Discover in Kibana. Another option could be building, for example, a Java application that uses Elasticsearch; then you can write an algorithm that returns the data you want. But I think that is a big overhead and I recommend checking the data first.
Edit: To clarify, you can create an external Java application, say a Spring Boot application, that uses Elasticsearch; all the data that you need is accessible to it.
But with this option you will not use Kibana at all.
You can export the result to CSV or whatever you want in the code.
A Spring Boot application can ask Elasticsearch for whatever it needs, and then it is easy to store these time variables inside the Java code.
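To make that concrete, here is a minimal sketch of the external-application idea, shown with the official Python Elasticsearch client instead of Spring Boot purely for brevity (an assumption; the Java client follows the same shape). The index pattern "logs-*", the field names BookingID and @timestamp, and elasticsearch-py 7.x are all assumptions to adjust to your own cluster:

# Sketch: first query finds the booking's min/max timestamps, which are then
# stored in ordinary variables and reused as a range filter for the follow-ups.
from elasticsearch import Elasticsearch

es = Elasticsearch("https://elk-prod.myhost.com:9200")
BOOKING_ID = "b2c729b5-6440-4829-8562-abd81991e2a0"

# 1) Find the first and last timestamp of the booking's logs.
booking = es.search(
    index="logs-*",  # placeholder index pattern
    body={
        "size": 0,
        "query": {"match": {"BookingID": BOOKING_ID}},
        "aggs": {
            "first": {"min": {"field": "@timestamp"}},
            "last": {"max": {"field": "@timestamp"}},
        },
    },
)
start = booking["aggregations"]["first"]["value_as_string"]
end = booking["aggregations"]["last"]["value_as_string"]

# 2) Reuse the stored time range for the follow-up searches.
def search_in_range(query_string):
    return es.search(
        index="logs-*",
        body={
            "query": {
                "bool": {
                    "must": [{"query_string": {"query": query_string}}],
                    "filter": [{"range": {"@timestamp": {"gte": start, "lte": end}}}],
                }
            }
        },
    )

cars = search_in_range('CarID:"Tesla-45-OU"')
matrix = search_in_range("CalculationMatrix")
failures = search_in_range("fail")

The key point is the one the question asks about: the first search's min/max timestamps are held in plain variables and applied as a range filter to the second and third queries, so no copy/paste between Chrome tabs is needed.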
EDIT: after the OP edited the question to change it dramatically:
@FrancescoMantovani Well, the edited version is very different from what you first posted here, "How can I automate the whole process without needing to copy/paste and run a second query?", i.e. searching for the word fail in a single shot. In the accepted answer you are still using three filters, one at a time, so it is not one search, but three.
What's more, if you used one index and sent data from multiple hosts via Filebeat, you wouldn't even have to create this dashboard. You could select the exact period from when the very first document happened to the very last for a given filter, then remove it and add the next filter you need; it's as simple as that. Before, you were writing about one query,
How can I automate the whole process without needing to copy/paste and run a second query?
not three. And you don't need to open a new Chrome tab each time you want to change a filter; just organize the data, for example by using Filebeat as mentioned before.
There is no relation between the 3 logs
From what you wrote, the relation does exist, and it is time.
If the data is in, for example, three different indices (because the documents don't have much data in common), you can do it like this:
You can switch between indices easily in Discover:
Go to Discover, select index 1, search, and select the time range you need. When you change the index, the time range is still the one you selected; you only need to change the filter, and you will get what you need.

PDI - Update field value in Logging tables

I'm trying to create a transformation that can change a field value in the DB (PostgreSQL is what I use).
Case:
In the PostgreSQL DB I have a table called Monitoring with several fields like id, date, starttime, endtime, duration, transformation name, status, desc. All those values come from Transformation Logging.
So, when I run the transformation it inserts a row into the Monitoring table and sets the status field to Running, and when it is done it updates the status to Finish. What I'm trying to do is define the value of the table field myself rather than take it from Transformation Logging, so I can customize the value as I want.
The goal is to update the transformation status value from 'running' to 'finish/error/abort etc.' in my DB using Pentaho and display that status in a web app.
I have been thinking of using the Modified Java Script step to do it, but is there maybe another, better way? (Just need an opinion on this.)
Apart from my remark, did you try the Value Mapper?
Modified Java Script is not a good idea to use; ideally, it should be avoided due to performance issues. You can use the "Add constants" step or a "User Defined Java Class" as an alternative.
You cannot change the values of the built-in logging tables, for the simple reason that they are reserved for PDI usage. This causes a known issue in case of a hard error: for example, the status is not set to Finish when the database server crashes, or when a NullException is not caught by the PDI code.
You have some workarounds.
The simplest, the one used in the ETL-Pilot, is to test (Status=Finish OR LogDate < 15 minutes ago) in the web app.
You can update the table when the transformation is not running. For example, set up an hourly (or more frequent) crontab that changes the status to Finish for any transformation whose LogDate is older than 15 minutes. This crontab may run a simple SQL statement or a transformation that also checks the table sizes and/or sends an email in case of a potential error (a sketch of this follows below).
You can copy the table (if that is a non-locking operation in your DB system), modify the Status column, and use that copy for your web app.
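For the second workaround (the scheduled status fix-up), here is a minimal sketch in Python meant to be run from cron; it assumes psycopg2 as the driver and a monitoring table with status and logdate columns as described in the question, so adjust the names to your actual logging schema:

# Hedged sketch of the scheduled "status fix-up" workaround described above.
import psycopg2

conn = psycopg2.connect(
    host="localhost", dbname="etl", user="etl_user", password="secret"  # placeholders
)
try:
    # "with conn" wraps the statement in a transaction and commits on success.
    with conn, conn.cursor() as cur:
        # Any transformation still marked Running after 15 minutes is assumed to
        # have died (server crash, uncaught error) and is flipped to Finish, as in
        # the workaround above; use 'Error' instead if that suits the web app better.
        cur.execute(
            """
            UPDATE monitoring
               SET status = 'Finish'
             WHERE status = 'Running'
               AND logdate < now() - interval '15 minutes'
            """
        )
finally:
    conn.close()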

loading alternate tables in alternate workflow executions in Informatica

I want to load Target1 in the first run of the workflow, Target2 in the second run of the same workflow, Target1 again in the third run, and so on... Please let me know how I can achieve this.
Create a workflow with a persistent variable (e.g. $$runCnt) with a default value of 1. Use an Assignment task to flip the variable value: IIF($$runCnt=1, 2, 1). Link the Assignment task to two sessions, e.g. s_Target1 and s_Target2. Use the following conditions on the links:
Assignment task to s_Target1 link condition: $$runCnt=1
Assignment task to s_Target2 link condition: $$runCnt=2
The two sessions should reuse the same mapping; just override the 'Target Table Name' property to use the appropriate table in each of them.

trigger Informatica workflow based on the status column in oracle table

I want to implement the scenario below without using a PL/SQL procedure or trigger.
I have a table called emp_details with columns (empno, ename, salary, emp_status, flag, date1).
If someone updates the columns to emp_status='abc' and flag='y', Informatica WF 1, which is continuously running, checks for the emp_status value "ABC".
If it finds a record or records, it queries all of them and invokes WF 2.
WF 1 will pass the values ename, salary, date1 to WF 2 (WF 2 will insert the records into the table emp_details2).
How can I do this using the Informatica approach instead of PL/SQL or a trigger?
If you want to achieve this in real time, write the output of WF1 to a message queue, and in the second workflow, WF2, subscribe to the message queue produced by WF1.
If you have a batch process in place, produce an output file from WF1 and use this output file in WF2. You can easily set up this dependency using job schedulers.
I don't understand why you need two workflows in the first place. Why not accomplish the emp_details2 table updates with the very same workflow that is looking for differences?
Anyway, this can be done using an indicator file:
WF1, running continuously, should create a file if any changes have been found.
WF2 should be running continuously with an EventWait set to wait for the indicator file specified above. Once the file is found, it should use an Assignment task to rename/delete the file, then fetch the desired data from the source and populate the emp_details2 table.
If you need it that way, you can pass the data through the indicator file.
You can do this in a single workflow. Create a dummy session which checks for the flag in the table, then divide the flow into two based on the link conditions below:
Flow one: link condition Session.Status=SUCCEEDED and SOURCE_SUCCESS_ROWS (count) >= 1; then run your actual session, which will load the data.
Flow two: link condition Session.Status=SUCCEEDED and SOURCE_SUCCESS_ROWS = 0; connect this to a Control task and mark the workflow as complete.
Make sure you schedule the workflow at the Informatica level to run continuously.
Cheers