AWS CloudWatch Logs Insights - amazon-web-services

I am working with data that comes in a format similar to this in Insights. Sometimes the value of J may be missing, and when that happens I want to use the value of B instead. Is there any way to do conditional logic like this on data in CloudWatch Insights? I have explored ispresent() but cannot figure out how to build the conditional logic around it.
Example:
B   J
1   3
2   4
2
For the last row, I would like J to be set to 2 when I run the query.

You may be able to use coalesce(J, B), which won't set J itself to 2 but can be assigned to a new field (e.g. fields coalesce(J, B) as newB) that can be used for display or additional logic. coalesce takes 2+ arguments and returns the first value that isn't blank.
https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/CWL_QuerySyntax.html (search for coalesce)
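For example, a minimal query along these lines (B and J are the field names from the question; the sort and limit lines are just for display) would surface the fallback value:

fields B, J, coalesce(J, B) as newB
| sort @timestamp desc
| limit 20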

Related

R Studio: Mutating a column variable based on two selection conditions

The dataframe above represents a repeated-measures design: each participant took part in both task A and task B. The condition determines the order in which the tasks occurred; in condition 1, task A came first followed by task B, and vice versa for condition 2.
I would like to mutate a new column in my dataframe called 'First Task'. This column must represent the scores from the task that always occurred first. For example, participant 1001 was in condition 1, so their score from task A should go into this first task column. For participant 1002, in condition 2, their score from task B should go into the first task column, and so on.
After scouring possible threads (which have always solved every need I have!), I considered using the mutate function combined with case_when(group == 1), but after that I am not sure how to properly pipe something along the lines of "select score from task A". Alternatively, I considered using if or ifelse, which is probably the more likely piece of code to execute something like this?
It is an elegant piece of code like this that I am after, as opposed to re-creating a new dataframe. I would greatly appreciate any thoughts or ideas on this. Let me know if this is clear (note I have simplified the image to an example to make the question clearer).
Many thanks, community
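Without the image, the exact shape of the dataframe is unclear, but assuming one row per participant with a condition column and one score column per task (all column names invented here for illustration), a dplyr sketch might be:

library(dplyr)

# One row per participant; `condition`, `score_task_a`, `score_task_b` are assumed names.
df <- df %>%
  mutate(first_task = if_else(condition == 1, score_task_a, score_task_b))

# case_when() reads better if more conditions could appear later:
df <- df %>%
  mutate(first_task = case_when(
    condition == 1 ~ score_task_a,
    condition == 2 ~ score_task_b
  ))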

DynamoDB - TransactWrite, how to do If Else condition

An example: a table with a counter column and only 1 row. I want to update this row's counter value based on its current value.
Say if counter is >= 10, set it to 0; otherwise counter++. How do I achieve this if/else clause in TransactWrite?
I can't have two actions in one transaction, because the documentation states that it does not allow more than one action on the same item.
And of course, the reason I use TransactWrite is that there will be multiple Lambdas doing this task in parallel.
You cannot do it in one request, transactional or otherwise.
You can get the item, decide what to do, and then update the item in a second request accordingly. You’ll want to keep a version number or timestamp attribute to make sure the item hasn’t changed between the read and write, and use a condition expression to fail if it has.
That’s a common idiom:
https://dynobase.dev/dynamodb-locking/
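A minimal sketch of that read-then-conditional-write idiom with the AWS SDK for JavaScript v3 — the table name, key shape, and version attribute are my assumptions, not part of the answer:

import { DynamoDBClient } from "@aws-sdk/client-dynamodb";
import { DynamoDBDocumentClient, GetCommand, UpdateCommand } from "@aws-sdk/lib-dynamodb";

const doc = DynamoDBDocumentClient.from(new DynamoDBClient({}));

// Assumes the single row already exists with numeric `counter` and `version` attributes.
async function bumpCounter(): Promise<void> {
  // 1. Read the current state.
  const { Item } = await doc.send(
    new GetCommand({ TableName: "Counters", Key: { id: "singleton" } })
  );
  if (!Item) throw new Error("counter row is missing");

  // 2. Apply the if/else in application code.
  const next = Item.counter >= 10 ? 0 : Item.counter + 1;

  // 3. Write back; fail if a concurrent writer changed the item in between.
  await doc.send(
    new UpdateCommand({
      TableName: "Counters",
      Key: { id: "singleton" },
      UpdateExpression: "SET #c = :next, #v = #v + :one",
      ConditionExpression: "#v = :expected",
      ExpressionAttributeNames: { "#c": "counter", "#v": "version" },
      ExpressionAttributeValues: { ":next": next, ":one": 1, ":expected": Item.version },
    })
  );
  // On ConditionalCheckFailedException, re-read and retry the whole function.
}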
You can't do it the way you'd like, because if/else is not supported in transactions. There is, however, a simpler solution.
Just update the item and increment the counter unconditionally. Whenever you read the value, take the counter modulo 10, and you get the desired behavior, e.g. 123 % 10 = 3 or 10 % 10 = 0.
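In code, the write side then needs no condition at all. A standalone sketch with invented table/attribute names:

import { DynamoDBClient } from "@aws-sdk/client-dynamodb";
import { DynamoDBDocumentClient, GetCommand, UpdateCommand } from "@aws-sdk/lib-dynamodb";

const doc = DynamoDBDocumentClient.from(new DynamoDBClient({}));

// Write side: a plain atomic increment, safe under any amount of concurrency.
async function increment(): Promise<void> {
  await doc.send(new UpdateCommand({
    TableName: "Counters", // assumed table/key names
    Key: { id: "singleton" },
    UpdateExpression: "ADD #c :one",
    ExpressionAttributeNames: { "#c": "counter" },
    ExpressionAttributeValues: { ":one": 1 },
  }));
}

// Read side: apply the modulo when interpreting the stored value.
async function readCounter(): Promise<number> {
  const { Item } = await doc.send(
    new GetCommand({ TableName: "Counters", Key: { id: "singleton" } })
  );
  return (Item?.counter ?? 0) % 10;
}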

Math Expression on AWS Cloudwatch metrics is not giving expected output

I have created two metrics (m1 and m2) on my logs, each giving the sum of some filter pattern. I wanted a math expression on the metric to sum these two, so I added SUM([m1,m2]), but it is not giving me the actual sum. Please refer to the snapshot below.
I tried the expression m1+m2 as well, but still no luck. One thing I noticed: m1 + 2 gives me the exact sum of 5. Not sure what is missing here.
Update (2019-07-18):
Adding a stacked snapshot.
The SUM() function sums up values per datapoint. On your last datapoint you have the value 2 for Completed and no value for Failed, so the sum is 2 + 0 = 2. The Number widget, on the other hand, displays the last value returned, which for the Failed count is 3, but that 3 didn't happen in the last observed time period; it happened before.
You can do a few things here:
Update the metric filter on the logs to emit the value 0 as a default if no Failed events are encountered.
Add a new expression to your graph, FILL(m1, 0), with ID e3 for example, which will give you a continuous line with zeros when there are no failures and the number of failures otherwise. Then you can update your SUM expression to be SUM([m2, e3]), as sketched below.
You can do this for both of your metrics, so you don't have gaps in either of them. This will make the graphing and alarming more consistent and intuitive.
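For reference, here is a hedged sketch of that combination as a single GetMetricData request via the AWS SDK for JavaScript v3; the namespace and metric names are placeholders for the asker's filter metrics:

import { CloudWatchClient, GetMetricDataCommand } from "@aws-sdk/client-cloudwatch";

const cw = new CloudWatchClient({});

async function totalPerDatapoint() {
  return cw.send(new GetMetricDataCommand({
    StartTime: new Date(Date.now() - 3600 * 1000),
    EndTime: new Date(),
    MetricDataQueries: [
      { // m1: Failed count, the series with gaps
        Id: "m1",
        MetricStat: { Metric: { Namespace: "MyApp", MetricName: "Failed" }, Period: 300, Stat: "Sum" },
        ReturnData: false,
      },
      { // m2: Completed count
        Id: "m2",
        MetricStat: { Metric: { Namespace: "MyApp", MetricName: "Completed" }, Period: 300, Stat: "Sum" },
        ReturnData: false,
      },
      // e3: replace m1's missing datapoints with zeros
      { Id: "e3", Expression: "FILL(m1, 0)", ReturnData: false },
      // e4: per-datapoint total of the two series
      { Id: "e4", Expression: "SUM([m2, e3])", ReturnData: true },
    ],
  }));
}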

How to conditionally execute a SET operation in DynamoDB

I have an aggregations table in DynamoDB with the following columns: id, sum, count, max, min, and hash. I will ALWAYS want to update sum and count, but will want to update min and max only when I have values greater than/less than the values already in the database. Also, I only want this operation to succeed when the stored hash is different from what I am sending, to prevent reprocessing the same data.
I currently have these:
UpdateExpression: ADD sum :sum, count :count SET hash = :hash
ConditionExpression: attribute_not_exists(hash) OR hash <> :hash
The thing is that I need something like this for min and max:
SET min = :min IF :min < min, and something similar for max. Of course, this doesn't currently work. I could not find a suitable update function that would perform this comparison in DynamoDB. What is the proper way to achieve this?
PS.: I was already advised to make multiple requests to DynamoDB and put the max/min into condition expressions, but I want to avoid the multiple-requests approach for data-consistency reasons.
PS2.: Another way to express what I want, in a JavaScript-ish way, would be something like SET min = :min < min ? :min : min
I got to a solution for this problem by realizing that what I wanted was just not possible. There must be just one condition for the entire update, and since there is no such thing as SET min = minimum(:min, min), I had to accept my fate and make more than one UpdateItem request to DynamoDB.
The nice thing is that the order of execution of these updates doesn't matter. The hard thing is to make sure that each update is executed exactly once. Because we are firing a lot of requests (and occasionally hitting peaks), there is a real chance of some updates failing due to ProvisionedThroughputExceededException or maybe just some rate limiting from AWS.
So here is my final solution:
Lambda function receives payload with hundreds of data points.
Lambda function aggregates these data points in memory and produces an intermediary aggregation object of the form {id, sum, count, min, max}.
Lambda function generates 3 update objects per aggregation object, of the following forms (these updates all refer to the same record; see the sketch after this answer):
{UpdateExpression: 'ADD #SUM :sum, #COUNT :count'}
{ConditionExpression: '#MAX < :max OR attribute_not_exists(#MAX)', UpdateExpression: 'SET #MAX = :max'}
{ConditionExpression: '#MIN > :min OR attribute_not_exists(#MIN)', UpdateExpression: 'SET #MIN = :min'}
Because we need to be 100% sure that these updates will always be processed successfully, the Lambda function sends them to a FIFO SQS queue (as 3 separate messages). I am using a FIFO queue here not because I want the order to be preserved, but because I want the guarantee of exactly-once delivery.
A consumer keeps polling the queue and, whenever there are messages, just shoots them to DynamoDB as the parameters of .updateItem.
At the end of this process, I was able to do real-time aggregations for thousands of records :)
PS.: Got rid of the hash column
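For concreteness, here is how step 3's three update objects might look as actual calls with the AWS SDK for JavaScript v3 — the table name, key shape, and error handling are my assumptions, not part of the original answer:

import { ConditionalCheckFailedException, DynamoDBClient } from "@aws-sdk/client-dynamodb";
import { DynamoDBDocumentClient, UpdateCommand } from "@aws-sdk/lib-dynamodb";

const doc = DynamoDBDocumentClient.from(new DynamoDBClient({}));

interface Aggregate { id: string; sum: number; count: number; min: number; max: number }

function ignoreConditionFailure(err: unknown): void {
  // A failed condition just means the stored min/max already wins; it is not an error.
  if (!(err instanceof ConditionalCheckFailedException)) throw err;
}

async function applyAggregate(agg: Aggregate): Promise<void> {
  const base = { TableName: "Aggregations", Key: { id: agg.id } }; // assumed names

  // Update 1: unconditional accumulation of sum and count.
  await doc.send(new UpdateCommand({
    ...base,
    UpdateExpression: "ADD #SUM :sum, #COUNT :count",
    ExpressionAttributeNames: { "#SUM": "sum", "#COUNT": "count" },
    ExpressionAttributeValues: { ":sum": agg.sum, ":count": agg.count },
  }));

  // Update 2: only ever raise the stored max.
  await doc.send(new UpdateCommand({
    ...base,
    UpdateExpression: "SET #MAX = :max",
    ConditionExpression: "#MAX < :max OR attribute_not_exists(#MAX)",
    ExpressionAttributeNames: { "#MAX": "max" },
    ExpressionAttributeValues: { ":max": agg.max },
  })).catch(ignoreConditionFailure);

  // Update 3: only ever lower the stored min.
  await doc.send(new UpdateCommand({
    ...base,
    UpdateExpression: "SET #MIN = :min",
    ConditionExpression: "#MIN > :min OR attribute_not_exists(#MIN)",
    ExpressionAttributeNames: { "#MIN": "min" },
    ExpressionAttributeValues: { ":min": agg.min },
  })).catch(ignoreConditionFailure);
}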
It is not possible to do this in a single update, since UpdateExpression doesn't support functions like max() and min(). The documentation for supported operations and functions can be found here.
The best way to achieve the same effect is to add a field called latest or something similar which stores the latest value. You will need to change your update expression to be something like the following.
UpdateExpression: SET hash = :hash, latest = :latest, sum = sum + :latest, count = count + :num
Where :hash is of course your update hash to guard against replays, :latest is the latest value, and :num is 1 or whatever your increment is.
Then you can use DynamoDB Streams with a Lambda that looks at each update and checks whether latest is less than min or greater than max (sketched below). If not, ignore the update; otherwise, perform a second update to set min or max to the latest value accordingly.
The main drawback of this approach is that there will be a small window where latest might be outside the range of min or max; however, this can be normalized easily in your application code when you read the records.
You should also consider the additional cost that will result from the DynamoDB Stream and the Lambda invocations.
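A rough, hypothetical sketch of that stream consumer follows; the table name and record shape are assumptions, and the condition expressions make the second write safe even when records arrive out of order:

import type { DynamoDBStreamEvent } from "aws-lambda";
import { ConditionalCheckFailedException, DynamoDBClient } from "@aws-sdk/client-dynamodb";
import { DynamoDBDocumentClient, UpdateCommand } from "@aws-sdk/lib-dynamodb";
import { unmarshall } from "@aws-sdk/util-dynamodb";

const doc = DynamoDBDocumentClient.from(new DynamoDBClient({}));

export const handler = async (event: DynamoDBStreamEvent): Promise<void> => {
  for (const record of event.Records) {
    const image = record.dynamodb?.NewImage;
    if (!image) continue;
    // The lambda-types and SDK AttributeValue types differ only nominally.
    const item = unmarshall(image as any);
    if (item.latest === undefined) continue;

    // Try to widen each bound; a failed condition means the stored bound already wins.
    const bounds = [
      { attr: "min", cond: "#B > :latest OR attribute_not_exists(#B)" },
      { attr: "max", cond: "#B < :latest OR attribute_not_exists(#B)" },
    ];
    for (const { attr, cond } of bounds) {
      try {
        await doc.send(new UpdateCommand({
          TableName: "Aggregations", // assumed table name
          Key: { id: item.id },
          UpdateExpression: "SET #B = :latest",
          ConditionExpression: cond,
          ExpressionAttributeNames: { "#B": attr },
          ExpressionAttributeValues: { ":latest": item.latest },
        }));
      } catch (err) {
        if (!(err instanceof ConditionalCheckFailedException)) throw err;
      }
    }
  }
};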
I had a similar situation where I needed to atomically update a min value, and ended up doing this:
Let each item have an attribute of type number set (NS) holding the candidate values for the min value; when you want to set a new value that might be the new min, just add it to the set. Then, at read time, find the lowest number in the set on the client side.
This is atomic and requires no condition expression, but it has the downside that the set grows over time, so I added a clean-up request to run as needed, for example when the set has more than N values, or simply on every get. The clean-up might need a condition expression to be safe under concurrency, though, depending on whether you also remove values through other use cases.
This does not solve all scenarios, but it worked for me. In my case the value was the timestamp of an event in the future, and I wanted to store when the next event occurs. I could then also clean up easily by removing all values in the past.
Summary:
Set a new potential minimum value: ADD #values :value.
Read the minimum value: GetItem, then find the lowest value in values client-side. If needed, this can be combined with a clean-up that finds all obsolete values and then calls UpdateItem with DELETE #values [x, y, z...].
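Putting that summary together, a hedged SDK v3 sketch might look like this (table and attribute names are mine; a JavaScript Set of numbers is marshalled as a DynamoDB number set):

import { DynamoDBClient } from "@aws-sdk/client-dynamodb";
import { DynamoDBDocumentClient, GetCommand, UpdateCommand } from "@aws-sdk/lib-dynamodb";

const doc = DynamoDBDocumentClient.from(new DynamoDBClient({}));

// Add a candidate minimum: atomic, no condition expression needed.
async function offerMin(id: string, candidate: number): Promise<void> {
  await doc.send(new UpdateCommand({
    TableName: "Events", // assumed table name
    Key: { id },
    UpdateExpression: "ADD #values :v",
    ExpressionAttributeNames: { "#values": "values" },
    ExpressionAttributeValues: { ":v": new Set([candidate]) }, // marshalled as NS
  }));
}

// Read the effective minimum by scanning the (small) set client-side.
async function readMin(id: string): Promise<number | undefined> {
  const { Item } = await doc.send(new GetCommand({ TableName: "Events", Key: { id } }));
  const values = Item?.values as Set<number> | undefined;
  return values ? Math.min(...values) : undefined;
}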

Creating an ID based on factor and filling down with Stata

Consider this fictional data, which illustrates my problem; in reality the data contains thousands of rows.
Figure 1
Each individual is characterized by values attached to A, B, C, D, E. In figure 1, I show 3 individuals for which some characteristics are missing. Do you have any idea how I can get the completed table shown in figure 2?
Figure 2
With the ID in figure 1 I could have used the carryforward command to fill in the values. But since each individual has a different number of rows, I don't know how to create the ID.
Edit: All individuals share the characteristic "A".
Edit: the existing order of observations is informative.
To detect the change of individual, the idea is to check, row by row, whether the preceding value of char is >= the current one.
This works only if your data are ordered, but given your edit that the existing order of observations is informative, that seems guaranteed here.
* Flag the first row of each individual: a new individual starts whenever
* the characteristic does not increase relative to the previous row.
gen id = 1 if (char[_n-1] >= char[_n]) | _n == 1
* Turn the flags into a running individual counter.
replace id = sum(id) if id == 1
* Carry the counter down to the remaining rows of each individual.
replace id = id[_n-1] if missing(id)
* Add the missing (id, char) combinations, then drop the marker variable.
fillin id char
drop _fillin
If one individual has only the characteristics A and C and another has only the characteristics D and E, this won't work, but that case seems impossible to detect with your data.