AWS X-Ray trace segments are missing or not connected

AWS X-Ray trace segments are missing or not connected - amazon-web-services

I have code that creates a segment when a queue is being read. In the first function (within the same lambda) I have this:
import * as AWSXRay from 'aws-xray-sdk'; // (using TypeScrpt)
AWSXRay.enableManualMode();
var segment1 = new AWSXRay.Segment("A");
In the second function (within the same lambda), called from the first, I have something like this:
var segment2 = new AWSXRay.Segment("B", segment1.trace_id, segment1.id);
Instead of seeing
*->A->B
On the AWS graph (on the website), I see:
*->A
*->B
...where they are not even associated, even though they have the same tracing ID, and the parent IDs are properly set. I seem to be missing something but not sure what...?
I even tried to pull X-Amzn-Trace-Id from the API request to use that as the root tracking ID for everything but that didn't work either.
This is the JSON for the first segment (A):
{
"Duration": 0.808,
"Id": "1-5d781a08-d41b49e35c3c0f38cdbd4912",
"Segments": [
{
"Document": {
"id": "74c99567f73185ce",
"name": "router",
"start_time": 1568152071.979,
"end_time": 1568152072.787,
"parent_id": "ef34fc0bcf23bbbe",
"aws": {
"xray": {
"sdk": "X-Ray for Node.js",
"sdk_version": "2.3.6",
"package": "aws-xray-sdk"
}
},
"service": {
"version": "unknown",
"runtime": "node",
"runtime_version": "v10.16.3",
"name": "unknown"
},
"trace_id": "1-5d781a08-d41b49e35c3c0f38cdbd4912"
},
"Id": "74c99567f73185ce"
}
]
}
This is the JSON for the second segment (B):
{
"Duration": 0.801,
"Id": "1-5d781a08-d9626abbab1cfbbfe4ff0dff",
"Segments": [
{
"Document": {
"id": "e2b4faaa6538bbb2",
"name": "handleCreateLoad",
"start_time": 1568152071.98,
"end_time": 1568152072.781,
"parent_id": "74c99567f73185ce",
"aws": {
"xray": {
"sdk": "X-Ray for Node.js",
"sdk_version": "2.3.6",
"package": "aws-xray-sdk"
}
},
"service": {
"version": "unknown",
"runtime": "node",
"runtime_version": "v10.16.3",
"name": "unknown"
},
"trace_id": "1-5d781a08-d9626abbab1cfbbfe4ff0dff",
"subsegments": [
{
"id": "08ccf2f374364066",
"name": "...-CreateLoad",
"start_time": 1568152071.981,
"end_time": 1568152072.781
}
]
},
"Id": "e2b4faaa6538bbb2"
}
]
}
It's quite clear the the parent ID for 'B' (74c99567f73185ce) points to "A"'s ID, but the graph does not connect them.
Also, I think _x_amzn_trace_id should be set when the lambda executes, but it is not. That may be root of my issues.

Turns out process.env._x_amzn_trace_id, required by the AWS XRay SDK, does NOT exist until the handler is called. It may help others to know what I went through:
At first I tried to get the trace details for the current lambda on start up (before the handler is called) to connect my new segments, but it didn't work. I have many handlers in the same project, so getting the lambda segment on startup is what I was hoping to do.
I then proceeded to create a main lambda segment (thinking I had to create the first segment myself) but all it did was create an orphaned segment. To make matters worse, each segment creates a new trace ID if one is not provided, and since I could not get the trace ID from the global start-up scope, nothing was connecting. The proper trace ID is important to pass along from start to finish for each request to make sure the calls down-stream are tracked properly.
Dumping of the environment variables before the handler is called and after clearly showed the trace ID is not provided until just before the handler gets called. It's sad that most of the online examples don't even bother to warn about this. I then moved the called to AWSXRay.getSegment() at the start of the lambda handler, then passed the details onto the child segments.
DO NOT set context.callbackWaitsForEmptyEventLoop = false while also calling the callback(error, response) callback passed to the lambda handler. Doing so will terminate the lambda without waiting for segment update events to flush to the daemon, resulting in orphaned segments. :(
Note: This documentation is lacking: https://docs.aws.amazon.com/xray-sdk-for-nodejs/latest/reference/
It states "You can retrieve the current segment or subsegment at any time" when in fact there are some times when you cannot. It's too bad there are no proper examples using actual working NodeJS Lambda code, instead of isolated lines of code thrown everywhere.

Related

Write a scalable fail check for parallel flow using States.ArrayContains

Trying to update logic from checking fails within parallel flows in a AWS Step Function. The parallel flows have following logic to handle error:
"Subflow failed": {
"Type": "Pass",
"End": true,
"Result": true,
"ResultPath": "$.fail"
}
Once the parallel flows have finished the step function proceeds to check if any of the flows failed using the following logic:
"Any parallel flow fail?": {
"Type": "Choice",
"Choices": [
{
"Or": [
{
"And": [
{
"Variable": "$[0].fail",
"IsPresent": true
},
{
"Variable": "$[0].fail",
"BooleanEquals": true
}
]
},
This approach works but leads to an issue when developing new sub flows without adding a new block to check the fails in the array for new flow. I would like to find a more scalable method for checking the array if it contains the fail variable.
I have tried to use the States.ArrayContains but the usage and syntax is unfamiliar to me so I wasn't able to get valid syntax when trying to plug in into the check. Couldn't find any concrete examples of using the method online so any help is much appreciated.

AWS Step function parameters are not moving to the next step

I am running a step function with many different step yet I am still stuck on the 2nd step.
The first step is a Java Lambda that gets all the input parameters and does what it needs to do.
The lambda returns null as it doesn't need to return anything.
The next step is a call for API gateway which needs to use one of the parameters in the URL.
However, I see that neither the URL has the needed parameter nor do I actually get the parameters into the step. ("input": null under TaskStateEntered)
The API gateway step looks as follows: (I also tried "Payload.$": "$" instead of the "Input.$": "$")
"API Gateway start": {
"Type": "Task",
"Resource": "arn:aws:states:::apigateway:invoke",
"Parameters": {
"Input.$": "$",
"ApiEndpoint": "aaaaaa.execute-api.aa-aaaa-1.amazonaws.com",
"Method": "GET",
"Headers": {
"Header1": [
"HeaderValue1"
]
},
"Stage": "start",
"Path": "/aaa/aaaa/aaaaa/aaaa/$.scenario",
"QueryParameters": {
"QueryParameter1": [
"QueryParameterValue1"
]
},
"AuthType": "IAM_ROLE"
},
"Next": "aaaaaa"
},
But when my step function gets to this stage it fails and I see the following in the logs:
{
"name": "API Gateway start",
"input": null,
"inputDetails": {
"truncated": false
}
}
And eventually:
{
"error": "States.Runtime",
"cause": "An error occurred while executing the state 'API Gateway start' (entered at the event id #9). Unable to apply Path transformation to null or empty input."
}
What am I missing here? Note that part of the path is a value that I enter at the step function execution. ("Path": "/aaa/aaaa/aaaaa/aaaa/$.scenario")
EDIT:
As requested by #lynkfox, I am adding the lambda definition that comes before the API gateway step:
And to answer the question, yes its standard and I see no input.
"Run tasks": {
"Type": "Task",
"Resource": "arn:aws:states:::lambda:invoke",
"OutputPath": "$.Payload",
"Parameters": {
"Payload.$": "$",
"FunctionName": "arn:aws:lambda:aaaaaa-1:12345678910:function:aaaaaaa-aaa:$LATEST"
},
"Retry": [
{
"ErrorEquals": [
"Lambda.ServiceException",
"Lambda.AWSLambdaException",
"Lambda.SdkClientException"
],
"IntervalSeconds": 2,
"MaxAttempts": 6,
"BackoffRate": 2
}
],
"Next": "API Gateway start"
},

So yes, as I commented, I believe the problem is the OutputPath of your lambda task definition. What this is saying is Take whatever comes out of this lambda (which is nothing!) and cut off everything other than the key Payload.
Well you are returning nothing, so this causes nothing to be sent to the next task.
I am assuming your incoming vent already has a key in the Json that is named Payload, so what you want to do is remove the OutputPath from your lambda. It doesn't need to return anything so it doesn't need an Output or Result path.
Next, on your API task, assuming again that your initializing event has a key of Payload, you would have "InputPath": "$.Payload" - if you have your headers or parameters in the initializing json Event then, you can reference those keys in the Parameters section of the definition.
Every AWS Service begins with an Event and ends with an Event. Each Event is a JSON object. (Which I'm sure you know). With State Machines, this continues - the State Machine/Step Function is just the controller for passing Events from one Task to the next.
So any given task can have an InputPath, OutputPath, or Result Path - These three definition parameters can decide what values go into the Task and what are sent onto the Next Task. State machines are, by definition, for maintaining State between Tasks, and these help control that 'State' (and there is pretty much only one 'state' at any given time, the event heading to the next Task(s)
The ResultPath is where, in that overall Event, the task puts the data. If you put ResultPath: "$.MyResult" by itself it appends this key to the incoming event
If you add OutputPath, it ONLY passes that key from the output event of the Task onto the next step in the Step Functions.
These three give you a lot of control.
Want to Take an Event into a Lambda and respond with something completely different - you don't need the incoming data - you combine OutputPath and ResultPath with the same value (and your Lambda needs to respond with a Json Object) then you can replace the event wholesale.
If you have ResultPath of some value and OutputPath: "$." you create a new json object with a single Key that contains the result of your task (the key being the definition set in ResultPath
InputPath allows you to set what goes into the Task. I am not 100% certain but I'm pretty sure it does not remove anything from the next Task in the chain.
More information can be found here but it can get pretty confusing.
My quick guide:
ResultPath by itself if you want to append the data to the event
ResultPath + OutputPath of the same value if you want to cut off the Input and only have the output of the task continue (and it returns a JSON style object)

AWS Step Functions - How to ignore "ResultSelector" when error is catched?

I have a AWS Step Functions state machine that as first state starts a Lambda function. This function do something then returns a JSON like { temp_a: "temporary a" }.
This output should be sent to the second state of this state machine but, I don't want to send temp_a as key, rather I'd like to rename it a, so the result of the first state should be { a: "temporary a" }.
This is trivial and can be done using ResultSelector. For this, the Step Functions will look like this:
{
"StartAt": "State1",
"States": {
"State1": {
"Next": "State2",
"Resource": "arn:aws:lambda:eu-west-1:XXX:function:sfexample-LambdaFunction",
"ResultSelector": {
"a.$": "$.temp_a"
},
"Type": "Task"
},
"State2": {
"End": true,
"Type": "Pass"
}
}
}
and Lambda will be something as easiest as possible since it contains just a single instruction return { temp_a: "temporary a" };.
Once the state machine has been started, everything works like a charm since the temp_a is successfully renamed into a (thanks to the ResultSelector) and then it is sent to the State2. Great!
Occasionally, that Lambda can throw a CustomError exception that I would catch in the state machine. When the error is caught the flow have to be diverted into the state CustomErrorState.
To make things possible I've added a Catch statement into the State1, and added another state called CustomErrorState of type Fail.
{
"StartAt": "State1",
"States": {
"CustomErrorState": {
"Cause": "Error happens",
"Type": "Fail"
},
"State1": {
"Catch": [
{
"ErrorEquals": [
"CustomError"
],
"Next": "CustomErrorState"
}
],
"Next": "State2",
"Resource": "arn:aws:lambda:eu-west-1:XXX:function:sfexample-LambdaFunction",
"ResultSelector": {
"a.$": "$.temp_a"
},
"Type": "Task"
},
"State2": {
"End": true,
"Type": "Pass"
}
}
}
This seems reasonable but when Lambda throw the CustomError I get a runtime error because the State1 cannot perform what I've specified in the ResultSelector property.
What's the meaning? If an error is caught how could I handle the result? It is possible to ignore ResultSelector's instructions when I catch en error in a state?
Further detail:
Here you can grab all necessary file to test it into your account.

I am an engineer from Step Functions and would like to update this post to let you know that the issue has been fixed.
ResultSelector field will not be applied to caught error output, and the previous behaviour was a bug.
Thanks for bringing this up.

The issue has been fixed. See post below.
I have posted the same question on AWS discussion forum, signalling that the behaviour looks like a bug. They confirm the problem, in fact:
The "ResultSelector" field should only be applied to the successful
result of a Task, Map or Parallel State. In the case of a caught
error, "ResultSelector" should NOT be applied.

Fortunately today I encountered the same problem. I believe what you have to do is handle the exception manually in your code there is no way to ignore ResultSelector. You have to return same dict/json from your code if it fails then only the SFN will be executed completely. Here is what I tried:
Intentionally I am raising an Exception to check the behaviour of State Machine.
Below is the State Machine's execution when I ran that code:

Amazon.StopIntent is behaving weirdly in Alexa Skill

My STOPINTENT is behaving weirdly, it is giving back null in the JSON output, I don't know what I'm doing wrong. None of my previous skills ran into this issue.
'AMAZON.StopIntent': function () {
this.response.speak('Goodbye!');
this.emit(':responseReady');
I am still getting There was a problem with the requested skill's response
JSON Input
"request": {
"type": "IntentRequest",
"requestId": "amzn1.echo-api.request.8e9eadd5-7018-40b0-a749-ba84ee2d44f7",
"timestamp": "2018-01-09T01:36:44Z",
"locale": "en-US",
"intent": {
"name": "AMAZON.StopIntent",
"confirmationStatus": "NONE"
}
}

The issue was with some unhandled states, for AMAZON.HELP and AMAZON.STOP Intent.
I got them to work by adding those HELP, STOP, CANCEL in all my handlers with different states.
Whenever using different state handlers, ensure that all handlers contain their separate AMAZON.HELP and AMAZON.STOP Intent to make them work properly.

Utterances to test lambda function not working (but lambda function itself executes)

I have a lambda function that executes successfully with an intent called GetEvent that returns a specific string. I've created one utterance for this intent for testing purposes (one that is simple and doesn't require any of the optional slots for invoking the skill), but when using the service simulator to test the lambda function with this utterance for GetEvent I'm met with a lambda response that says "The response is invalid". Here is what the interaction model looks like:
#Intent Schema
{
"intents": [
{
"intent": "GetVessel",
"slots": [
{
"name": "boat",
"type": "LIST_OF_VESSELS"
},
{
"name": "location",
"type": "LIST_OF_LOCATIONS"
},
{
"name": "date",
"type": "AMAZON.DATE"
},
{
"name": "event",
"type": "LIST_OF_EVENTS"
}
]
},
{
"intent": "GetLocation",
"slots": [
{
"name": "event",
"type": "LIST_OF_EVENTS"
},
{
"name": "date",
"type": "AMAZON.DATE"
},
{
"name": "boat",
"type": "LIST_OF_VESSELS"
},
{
"name": "location",
"type": "LIST_OF_LOCATIONS"
}
]
},
{
"intent": "GetEvent",
"slots": [
{
"name": "event",
"type": "LIST_OF_EVENTS"
},
{
"name": "location",
"type": "LIST_OF_LOCATIONS"
}
]
}
]
}
With the appropriate custom skill type syntax and,
#First test Utterances
GetVessel what are the properties of {boat}
GetLocation where did {event} occur
GetEvent get me my query
When giving Alexa the utterance get me my query the lambda response should output the string as it did in the execution. I'm not sure why this isn't the case; this is my first project with the Alexa Skills Kit, so I am pretty new. Is there something I'm not understanding with how the lambda function, the intent schema and the utterances are all pieced together?
UPDATE: Thanks to some help from AWSSupport, I've narrowed the issue down to the area in the json request where new session is flagged as true. For the utterance to work this must be set to false (this works when inputting the json request manually, and this is also the case during the lambda execution). Why is this the case? Does Alexa really care about whether or not it is a new session during invocation? I've cross-posted this to the Amazon Developer Forums as well a couple of days ago, but have yet to get a response from someone.

This may or may not have changed -- the last time I used the service simulator (about two weeks ago at the time of writing) it had a pretty severe bug which would lead to requests being mapped to your first / wrong intent, regardless of actual simulated speech input.
So even if you typed in something random like wafaaefgae it simply tries to map that to the first intent you have defined, providing no slots to said intent which may lead to unexpected results.
Your issue could very well be related to this, triggering the same unexpected / buggy behavior because you aren't using any slots in your sample utterance
Before spending more time debugging this, I'd recommend trying the Intent using an actual echo or alternatively https://echosim.io/ -- interaction via actual speech works as expected, unlike the 'simulator'

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js