I'm trying to write an Execution Plan in WSO2's CEP, but I'm getting an error in the select clause when I use an aggregate function, in my case "sum". Please read below for further details:
#Export('stream.sla.consolidated.breach:1.0.0')
define stream ConsolidatedBreachSLA (breach_date string, breach_count_per_day int, request_id int);
#Export('stream.sla.breach.details:1.0.0')
define stream BreachSLA (request_id int, breach_date string, breach_flag int);
from BreachSLA#window.length(50)
select breach_date as breach_date, sum(breach_flag) as breach_count_per_day, request_id
group by breach_date
having breach_count_per_day > 2
insert into ConsolidatedBreachSLA;
(Screenshot: CEP validation error)
But as soon as I remove the "sum" from the select clause, everything validates correctly.
from BreachSLA#window.length(50)
select breach_date as breach_date, breach_flag as breach_count_per_day, request_id
group by breach_date
having sum(breach_count_per_day) > 2
insert into ConsolidatedBreachSLA;
The intent is to compute the sum of the records in the select clause so that the value can be exported to a publisher.
The sum function returns a long value (for int or long inputs), whereas your attribute breach_flag is defined as an int. Since the output stream is already defined with an int attribute for this value, the conflict occurs; you'll need to change the attribute type of breach_count_per_day to long in the defined stream (ConsolidatedBreachSLA, and any subsequent streams) to receive the sum.
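For example, the exported stream definition would then become (a sketch based on the definitions above; any downstream stream needs the same change):
#Export('stream.sla.consolidated.breach:1.0.0')
define stream ConsolidatedBreachSLA (breach_date string, breach_count_per_day long, request_id int);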
Rajeev was right about the conflict.
I was trying different things with this problem, and the "convert" function helped.
So I changed my Siddhi query to:
from BreachSLA#window.length(50)
select breach_date, convert(sum(breach_flag), 'int') as breach_count_per_day
group by breach_date
having breach_count_per_day > 2
insert into BreachCountDay;
Related
I have an Athena table that has a column in it that I would like to query. The type of the column is double, but it contains data of mixed types. The data is either:
A double (0-1 inclusive)
An array with 0 or 1 elements (again, a double 0-1 inclusive).
I have no idea how the column got into this state. I'm just trying to fix it.
If I do a naive query:
SELECT col FROM tbl.db;
I get the error: "HIVE_BAD_DATA: Error parsing field value '[]' for field 0: org.openx.data.jsonserde.json.JSONArray cannot be cast to java.lang.Double"
Some things that I've tried, but don't work:
Use try_cast
The docs on try_cast make it sound like the perfect solution; reality is not so kind.
When I tried to run
SELECT COALESCE(
    try_cast(col AS double),
    try_cast(col AS array<double>)
) FROM tbl.db;
I get the error: "SYNTAX_ERROR: line 3:5: Cannot cast double to array(double)". Indeed, when I try more simple examples, I continue to get an error: both
SELECT try_cast(3.4 AS array<double>);
SELECT try_cast(ARRAY [3.4] AS double);
trigger errors. Although the docs claim that a failed cast causes the function to return null, it appears that this only works when casting between primitive data types.
Cast to JSON
While casting both doubles and arrays to JSON works fine as in these examples:
SELECT try_cast(3.4 AS JSON);
SELECT try_cast(ARRAY [3.4] AS JSON);
when I perform the cast on the actual column like so:
SELECT try_cast(col AS JSON) FROM tbl.db;
I get the error: "HIVE_BAD_DATA: Error parsing field value '["0.01"]' for field 0: org.openx.data.jsonserde.json.JSONArray cannot be cast to java.lang.Double"
I'd really like to be able to query this data. Alternatively, if it's possible to migrate it into a state where it's all one type, that would be an acceptable solution as well.
I have a custom function created in the Power Query Editor that accepts two parameters. One parameter should be dynamic, based on the value in a column. When tested with static parameters, the function works correctly. However, if the table column is supplied (to provide the dynamic value needed), it produces a cyclic-reference error. The function is invoked using "Invoke Custom Function": the first parameter selects a field, while the second parameter is set statically.
The Invoke Custom Function that produces a correct result:
= Table.AddColumn(#"Rounded Up", "qryProjectedDate", each qryProjectedDate(5, -1))
The Invoke Custom Function that produces an error:
= Table.AddColumn(#"Rounded Up", "qryProjectedDate", each qryProjectedDate([RemainingBudgetDays_RndUp], -1))
The qryProjectedDate function:
(TotalLoops as number, Loop as number) =>
let
    CurrentLoop = Loop + 1,
    output =
        if Calendar[IsWorkDay]{CurrentLoop} = 1 then
            (if CurrentLoop = TotalLoops - 1
                then Calendar[Date]{CurrentLoop}
                else #qryProjectedDate(TotalLoops, CurrentLoop))
        else
            #qryProjectedDate(TotalLoops + 1, CurrentLoop)
in
    output as date
This function produces a stack overflow error and cyclic-reference error messages.
Further testing of just passing the column value in as a parameter and returning that same value produces the value I would expect. Therefore, I believe my issue is in how I have written the Power Query M function.
If function(5, -1) works, then function([column1], -1) will also work, provided column1 contains a numeric five.
Are you sure the column is named RemainingBudgetDays_RndUp? Spelling and capitalization of the column name matter. Try temporarily renaming that column and look at the output to see what Power Query thinks the column is named.
Once that is settled, is the column formatted as numeric or alpha? If not numeric, change the column type or use Number.From() in your function.
Try calling your function with a table containing a single row whose RemainingBudgetDays_RndUp column equals a numeric five. Does it work? Then the data needs cleaning on some of the rows.
Are there any nulls, alpha characters, out-of-bound values, or other data in the RemainingBudgetDays_RndUp column that would break your function? To guard against that, wrap strategic points in the function with try ... otherwise null, as sketched below.
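A minimal sketch of that guard, assuming the same column, step, and function names as above:
= Table.AddColumn(#"Rounded Up", "qryProjectedDate",
    each try qryProjectedDate(Number.From([RemainingBudgetDays_RndUp]), -1) otherwise null)
Rows where the column is null or non-numeric then produce null instead of breaking the whole added column.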
I have an AWS Step Function, and I need to insert items into DynamoDB. I'm passing the following input to the Step Function execution:
{
  "uuid": "dd10a857-3711-451e-91ee-d0b3ab621b2e",
  "item_id": "0D98C2F77",
  "item_count": 3,
  "order_id": "IO-98255AX"
}
I have a DynamoDB PutItem Step, set up like so:
Since item_count is a numeric value, I specified "N.$": "$.item_count" - I specified N at the beginning because that maps to the number type in DynamoDB. Since all of the other fields are strings, I started their keys with S.
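Reconstructed from the error message below, the Parameters block looked roughly like this (a sketch; the exact layout is an assumption):
{
  "TableName": "test_item_table",
  "Item": {
    "uuid": {"S.$": "$.uuid"},
    "item_id": {"S.$": "$.item_id"},
    "item_count": {"N.$": "$.item_count"},
    "order_id": {"S.$": "$.order_id"}
  }
}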
I then tried to test the PutItem step with the above payload, and I got the following error:
{
  "error": "States.Runtime",
  "cause": "An error occurred while executing the state 'DynamoDB PutItem' (entered at the event id #2). The Parameters '{\"TableName\":\"test_item_table\",\"Item\":{\"uuid\":{\"S\":\"dd10a857-3711-451e-91ee-d0b3ab621b2e\"},\"item_id\":{\"S\":\"0D98C2F77\"},\"item_count\":{\"N\":3},\"order_id\":{\"S\":\"IO-98255AX\"}}}' could not be used to start the Task: [The value for the field 'N' must be a STRING]"
}
I looked up the "The value for the field 'N' must be a STRING" error, and I found two relevant results:
A post on AWS where the OP decided to just change the format of the data that gets passed to the Dynamo step
A post on GitHub, where the OP was using CDK and ended up using a numberFromString() function that's available in CDK
In my case, I have an integer value, and I'd prefer to pass it into DynamoDB as an integer - but based on the first link, it seems that Step Functions can only pass string values to DynamoDB. This means that my only option is to convert the integer value to a string, but I'm not sure how to do this. I know that Step Functions have intrinsic functions, but I don't think they're applicable to JSON paths.
What's the best way to handle storing this numeric data to DynamoDB?
TL;DR "item_count": {"N.$": "States.JsonToString($.item_count)"}
it seems that Step Functions can only pass string values to DynamoDB
Yes, although technically it's a constraint of the DynamoDB API. DynamoDB accepts numbers as strings to maximize compatibility, but the underlying data type remains numeric.
This means that my only option is to convert the integer value to a string, but I'm not sure how to do this.
The JsonToString intrinsic function can stringify a number value from the State Machine execution input.
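Applied to the Parameters above, only the item_count line changes (a sketch; the surrounding fields match the reconstruction in the question):
{
  "TableName": "test_item_table",
  "Item": {
    "uuid": {"S.$": "$.uuid"},
    "item_id": {"S.$": "$.item_id"},
    "item_count": {"N.$": "States.JsonToString($.item_count)"},
    "order_id": {"S.$": "$.order_id"}
  }
}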
I have CrateDB version 3.2.7 running under Windows Server 2012. I create a table like this:
create table test3 (
    firstcolumn bigint primary key,
    secondcolumn int,
    thirdcolumn timestamp,
    fourthcolumn double,
    fifthcolumn double,
    sixtcolumn smallint,
    seventhcolumn double,
    heightcolumn int,
    ninthcolumn smallint,
    tenthcolumn smallint
) clustered into 12 shards with (number_of_replicas = 0, refresh_interval = 0);
So I'm expecting firstcolumn to come first, and so on. But after the creation, when I do a SELECT * FROM test3, I get the following result:
It seems that the first column returned is the "fifth". It looks like columns are returned in alphabetical order.
Does it mean that CrateDB created the columns in that order? Does it keep the order somewhere? If columns are in alphabetical order, does that mean that if I want to COPY data from another DBMS to CrateDB, I have to export the data in alphabetical order?
For INSERT, not necessarily: only if the column list is omitted do the values have to be in alphabetical order (see here). The original order doesn't seem to be kept anywhere per se.
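For example, with the test3 table above (a sketch), an explicit column list lets you pick any order, while omitting it ties you to the alphabetical order:
-- explicit column list: values follow the listed order
INSERT INTO test3 (firstcolumn, secondcolumn, thirdcolumn) VALUES (1, 42, 1546300800000);
-- without a column list, values would have to follow the alphabetical order:
-- fifthcolumn, firstcolumn, fourthcolumn, heightcolumn, ninthcolumn,
-- secondcolumn, seventhcolumn, sixtcolumn, tenthcolumn, thirdcolumn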
COPY FROM is a different kind of import tactic and not quite what the good old INSERT would do. I would suggest writing a command-line app to import data into CrateDB. COPY FROM doesn't do any type checking, nor does it cast types; it will always import the data as it was in the source file (see here). From your other question I see you may have GPS-related data; you will need to manually map it to a GEO_POINT type, just as one example.
CrateDB offers good performance (whatever that means to you or me) with its bulk endpoint; see the sketch below.
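A minimal sketch of a bulk insert over the HTTP endpoint, assuming CrateDB listens on localhost:4200 and the test3 table from above:
curl -sS -X POST 'http://localhost:4200/_sql' \
  -H 'Content-Type: application/json' \
  -d '{
    "stmt": "INSERT INTO test3 (firstcolumn, secondcolumn) VALUES (?, ?)",
    "bulk_args": [[1, 10], [2, 20], [3, 30]]
  }'
Each entry in bulk_args is bound to the placeholders in stmt, so a whole batch goes over the wire in a single request.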
I'm converting data from one database to another with a slightly different structure.
In my flow, at some point I need to read data from the first database, filtering on the id coming from previous steps.
This is an image of my flow:
The last step is where I need to filter data. The query is:
SELECT e.*,UNIX_TIMESTAMP(v.dataInserimento)*1000 as timestamp
FROM verbale_evento ve JOIN evento e ON ve.eventi_id=e.id
WHERE ve.Verbale_id=? AND e.titolo='Note verbale'
Unfortunately, ve.Verbale_id is a column of the first table (the first step). How can I configure the step to filter by that field?
Right now I get this error:
2017/12/22 15:01:00 - Error setting value #2 [Boolean] on prepared statement
2017/12/22 15:01:00 - Parameter index out of range (2 > number of parameters, which is 1).
I need to do this query at the end of the entire transformation.
You can pass rows of data from previous steps as parameters.
However, the number of parameter placeholders in the Table input query must match the number of fields of the incoming data stream. Also, order matters.
Try trimming the data stream to only the field you want to pass, using a Select values step, and then choose that step in the "get data from" box near the bottom of the Table input. Also, check "execute for each input row".