Starting a StepFunction and exiting doesn't trigger execution - amazon-web-services

I have a Lambda function, transportKickoff, which receives an input and then proxies that input forward into a Step Function. The code below runs without errors, but the Step Function is NOT executing.
Also critical to the design, I do not want the transportKickoff function to wait around for the step function to complete as it can be quite long running. I was, however, expecting that any errors in the calling of the Step Function would be reported back synchronously. Maybe this thought is at fault and I'm somehow missing out on an error that is thrown somewhere. If that's the case, however, I'd like to find a way which is able to achieve the goal of having the kickoff lambda function exit as soon as the Step Function has started execution.
note: I can execute the step function independently and I know that it works correctly
const stepFn = new StepFunctions({ apiVersion: "2016-11-23" });
const stage = process.env.AWS_STAGE;
const name = `transport-steps ${message.command} for "${stage}" environment at ${Date.now()}`;
const params: StepFunctions.StartExecutionInput = {
stateMachineArn: `arn:aws:states:us-east-1:999999999:stateMachine:transportion-${stage}-steps`,
input: JSON.stringify(message),
name
};
const request = stepFn.startExecution(params);
request.send();
console.info(
`startExecution request for step function was sent, context sent was:\n`,
JSON.stringify(params, null, 2)
);
callback(null, {
statusCode: 200
});
I have also checked from the console that I have what I believe to be the right permissions to start the execution of a step function:
I've now added more permissions (see below) but still experiencing the same problem:
'states:ListStateMachines'
'states:CreateActivity'
'states:StartExecution'
'states:ListExecutions'
'states:DescribeExecution'
'states:DescribeStateMachineForExecution'
'states:GetExecutionHistory'

Ok I have figured this one out myself, hopefully this answer will be helpful for others:
First of all, the send() method is not a synchronous call, but it does not return a promise either. Instead, you must set up listeners on the Request object before sending so that you can respond appropriately to success/failure states.
I've done this with the following code:
const stepFn = new StepFunctions({ apiVersion: "2016-11-23" });
const stage = process.env.AWS_STAGE;
const name = `${message.command}-${message.upc}-${message.accountName}-${stage}-${Date.now()}`;
const params: StepFunctions.StartExecutionInput = {
stateMachineArn: `arn:aws:states:us-east-1:837955377040:stateMachine:transportation-${stage}-steps`,
input: JSON.stringify(message),
name
};
const request = stepFn.startExecution(params);
// listen for success
request.on("extractData", req => {
console.info(
`startExecution request for step function was sent and validated, context sent was:\n`,
JSON.stringify(params, null, 2)
);
callback(null, {
statusCode: 200
});
});
// listen for error
request.on("error", (err, response) => {
console.warn(
`There was an error -- ${err.message} [${err.code}, ${
err.statusCode
}] -- that blocked the kickoff of the ${message.command} ITMS command for ${
message.upc
} UPC, ${message.accountName} account.`
);
callback(err.statusCode, {
message: err.message,
errors: [err]
});
});
// send request
request.send();
Now please bear in mind there is a "success" event but I used "extractData" to capture success as I wanted to get a response as quickly as possible. It's possible that success would have worked equally as well but looking at the language in the Typescript typings it wasn't entirely clear and in my testing I'm certain that the "extractData" method does work as expected.
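If you prefer to await the outcome rather than wire up callbacks, the same listener pattern can be wrapped in a Promise. This is only a minimal sketch: RequestLike and sendAsPromise are hypothetical names, the interface models just the two Request members used here, and it listens on the "success" event rather than "extractData".

```typescript
// Hypothetical, minimal model of the parts of an AWS SDK v2 Request used below.
// The real Request class has many more members.
type RequestListener = (payload: any) => void;

interface RequestLike {
  on(eventName: string, listener: RequestListener): unknown;
  send(): void;
}

// Registers the listeners BEFORE sending, then settles the promise
// from whichever event fires.
function sendAsPromise<T>(request: RequestLike): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    // "success" listeners receive the response object; its data field holds the result
    request.on("success", (response: any) => resolve(response.data as T));
    // "error" listeners receive the error as their first argument
    request.on("error", (err: any) => reject(err));
    request.send();
  });
}
```

With a wrapper like this, the kickoff handler could await sendAsPromise(stepFn.startExecution(params)) inside a try/catch and return as soon as the execution has been accepted, without waiting for the state machine to finish.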
As for why I was not getting any execution on my step functions: it had to do with the way I was naming the execution (the name parameter). You're limited to a subset of characters in the name and I'd stepped over that restriction, but I didn't realize it until I was able to capture the error with the code above.
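To avoid tripping over the naming restriction again, one option is to build the execution name through a small sanitizer. The sketch below is an assumption-laden illustration, not an official rule set: toExecutionName is a hypothetical helper, it uses a deliberately conservative allow-list (letters, digits, hyphen, underscore), and it truncates to 80 characters, which is my understanding of the execution name length limit.

```typescript
// Assumed maximum length of a Step Functions execution name.
const MAX_EXECUTION_NAME_LENGTH = 80;

// Joins the parts with hyphens, replaces every run of characters outside a
// conservative allow-list with a single hyphen, trims stray hyphens at the
// ends, and truncates to the assumed maximum length.
function toExecutionName(parts: Array<string | number>): string {
  const joined = parts.map(String).join("-");
  const cleaned = joined
    .replace(/[^A-Za-z0-9_-]+/g, "-")
    .replace(/^-+|-+$/g, "");
  return cleaned.slice(0, MAX_EXECUTION_NAME_LENGTH);
}
```

Note that truncation can cut off a trailing timestamp, so if uniqueness matters it is safer to put the unique part first or to keep the other parts short.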

For anyone encountering issues executing state machines from Lambdas, make sure the permission 'states:StartExecution' is added to the Lambda's permissions and that the regions match up.
Promise based version:
import { StepFunctions } from 'aws-sdk';
const clients = {
stepFunctions: new StepFunctions()
};
const createExecutor = ({ clients }) => async (event) => {
console.log('Executing media pipeline job');
const params = {
stateMachineArn: '<state-machine-arn>',
input: JSON.stringify({}),
name: 'new-job',
};
// Use the injected client; a bare stepFunctions identifier is not in scope here
const result = await clients.stepFunctions.startExecution(params).promise();
// { executionArn: "string", startDate: number }
return result;
};
const startExecution = createExecutor({ clients });
// Pass in the event from the Lambda e.g S3 Put, SQS Message
await startExecution(event);
The result should contain the execution ARN and start date.

Related

AWS CloudWatch Synthetics Canary doesn't load the entire page

I basically want to load the following page every minute and create a screenshot of it: https://www.amazon.com/b?ie=UTF8&node=24088939011
It takes about one minute to create an AWS CloudWatch canary: I open AWS CloudWatch, click on "Synthetics Canaries" on the left hand side, click on "Create canary", enter the URL to a webpage, and just use the default settings except that I change it from running every 5 minutes to running every minute.
Availability tab of the canary I created:
Configuration tab of the canary I created:
The canary runs and says it's 100% successful but when I look at the screenshots I see that the page never loads fully:
The screenshots should show what you would see when opening the page in your browser: https://www.amazon.com/b?ie=UTF8&node=24088939011
This is the default script that is used by the canary:
const { URL } = require('url');
const synthetics = require('Synthetics');
const log = require('SyntheticsLogger');
const syntheticsConfiguration = synthetics.getConfiguration();
const syntheticsLogHelper = require('SyntheticsLogHelper');
const loadBlueprint = async function () {
const urls = ['https://www.amazon.com/b?ie=UTF8&node=24088939011'];
// Set screenshot option
const takeScreenshot = true;
/* Disabling default step screen shots taken during Synthetics.executeStep() calls
* Step will be used to publish metrics on time taken to load dom content but
* Screenshots will be taken outside the executeStep to allow for page to completely load with domcontentloaded
* You can change it to load, networkidle0, networkidle2 depending on what works best for you.
*/
syntheticsConfiguration.disableStepScreenshots();
syntheticsConfiguration.setConfig({
continueOnStepFailure: true,
includeRequestHeaders: true, // Enable if headers should be displayed in HAR
includeResponseHeaders: true, // Enable if headers should be displayed in HAR
restrictedHeaders: [], // Value of these headers will be redacted from logs and reports
restrictedUrlParameters: [] // Values of these url parameters will be redacted from logs and reports
});
let page = await synthetics.getPage();
for (const url of urls) {
await loadUrl(page, url, takeScreenshot);
}
};
// Reset the page in-between
const resetPage = async function(page) {
try {
await page.goto('about:blank',{waitUntil: ['load', 'networkidle0'], timeout: 30000} );
} catch(ex) {
synthetics.addExecutionError('Unable to open a blank page ', ex);
}
}
const loadUrl = async function (page, url, takeScreenshot) {
let stepName = null;
let domcontentloaded = false;
try {
stepName = new URL(url).hostname;
} catch (error) {
const errorString = `Error parsing url: ${url}. ${error}`;
log.error(errorString);
/* If we fail to parse the URL, don't emit a metric with a stepName based on it.
It may not be a legal CloudWatch metric dimension name and we may not have an alarms
setup on the malformed URL stepName. Instead, fail this step which will
show up in the logs and will fail the overall canary and alarm on the overall canary
success rate.
*/
throw error;
}
await synthetics.executeStep(stepName, async function () {
const sanitizedUrl = syntheticsLogHelper.getSanitizedUrl(url);
/* You can customize the wait condition here. For instance, using 'networkidle2' or 'networkidle0' to load page completely.
networkidle0: Navigation is successful when the page has had no network requests for half a second. This might never happen if page is constantly loading multiple resources.
networkidle2: Navigation is successful when the page has no more then 2 network requests for half a second.
domcontentloaded: It's fired as soon as the page DOM has been loaded, without waiting for resources to finish loading. Can be used and then add explicit await page.waitFor(timeInMs)
*/
const response = await page.goto(url, { waitUntil: ['networkidle0'], timeout: 30000});
if (response) {
domcontentloaded = true;
const status = response.status();
const statusText = response.statusText();
logResponseString = `Response from url: ${sanitizedUrl} Status: ${status} Status Text: ${statusText}`;
//If the response status code is not a 2xx success code
if (response.status() < 200 || response.status() > 299) {
throw `Failed to load url: ${sanitizedUrl} ${response.status()} ${response.statusText()}`;
}
} else {
const logNoResponseString = `No response returned for url: ${sanitizedUrl}`;
log.error(logNoResponseString);
throw new Error(logNoResponseString);
}
});
// Wait for 15 seconds to let page load fully before taking screenshot.
if (domcontentloaded && takeScreenshot) {
await page.waitFor(15000);
await synthetics.takeScreenshot(stepName, 'loaded');
await resetPage(page);
}
};
const urls = [];
exports.handler = async () => {
return await loadBlueprint();
};
I tried creating a canary in exactly the same way but for another similar page (https://www.amazon.ca/b?ie=UTF8&node=6548466011) and it just works:
What am I doing wrong? Why aren't the screenshots that the canary takes showing the fully loaded page?
Why are parts of the page missing in the screenshot but they show up correctly when I open the page (https://www.amazon.com/b?ie=UTF8&node=24088939011) in my browser?

Authenticate AWS lambda against Google Sheets API

I am trying to create an aws lambda function that will read rows from multiple Google Sheets documents using the Google Sheet API and will merge them afterwards and write in another spreadsheet. To do so I did all the necessary steps according to several tutorials:
Create credentials for the AWS user to have the key pair.
Create a Google Service Account, download the credentials.json file.
Share each necessary spreadsheet with the Google Service Account client_email.
When executing the program locally it works perfectly: it successfully logs in using the credentials.json file and reads and writes all necessary documents.
However, when uploading it to AWS Lambda using the serverless framework and google-spreadsheet, the program fails silently in the authentication step. I've tried changing the permissions as recommended in this question but it still fails. The file is read properly and I can print it to the console.
This is the simplified code:
async function getData(spreadsheet, psychologistName) {
await spreadsheet.useServiceAccountAuth(clientSecret);
// It never gets to this point, it fails silently
await spreadsheet.loadInfo();
... etc ...
}
async function main() {
const promises = Object.entries(psychologistSheetIDs).map(async (psychologistSheetIdPair) => {
const [psychologistName, googleSheetId] = psychologistSheetIdPair;
const sheet = new GoogleSpreadsheet(googleSheetId);
psychologistScheduleData = await getData(sheet, psychologistName);
return psychologistScheduleData;
});
//When all sheets are available, merge their data and write back in joint view.
Promise.all(promises).then(async (psychologistSchedules) => {
... merge the data ...
});
}
module.exports.main = async (event, context, callback) => {
const result = await main();
return {
statusCode: 200,
body: JSON.stringify(
result,
null,
2
),
};
};
I solved it.
Locally, Promise.all(promises).then(result => ...) eventually resolved and ran whatever was inside the then(), but on AWS Lambda the handler returned before the promises were resolved.
This solved it:
const res = await Promise.all(promises);
mergeData(res);
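The difference can be reproduced without Lambda or Google Sheets. The sketch below uses hypothetical names (resolveLater, handlerWithThen, handlerWithAwait) and timers standing in for the sheet requests; it records the order of events in a trace array so you can see that the un-awaited version finishes before the merge step has run.

```typescript
// Stand-in for a slow request: resolves with the given value after ms milliseconds.
function resolveLater<T>(value: T, ms: number): Promise<T> {
  return new Promise<T>((resolve) => setTimeout(() => resolve(value), ms));
}

// Mirrors the original code: the Promise.all chain is not awaited, so the
// promise returned by this function resolves before the merge callback runs.
async function handlerWithThen(trace: string[]): Promise<void> {
  const pending = [resolveLater("sheet-1", 10), resolveLater("sheet-2", 20)];
  Promise.all(pending).then((rows) => {
    trace.push("merged " + rows.join("+"));
  });
  trace.push("handler returned");
}

// Mirrors the fix: awaiting Promise.all keeps the handler alive until the
// merge step has completed.
async function handlerWithAwait(trace: string[]): Promise<void> {
  const pending = [resolveLater("sheet-1", 10), resolveLater("sheet-2", 20)];
  const rows = await Promise.all(pending);
  trace.push("merged " + rows.join("+"));
  trace.push("handler returned");
}
```

Locally the Node process stays alive until the timers fire, which is why the then() version appears to work there. In Lambda, an async handler's invocation ends once its returned promise resolves, so the pending work may never get to run.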

AWS lambda send partial response

I have a lambda function which does a series of actions. I have a react application which triggers the lambda function.
Is there a way I can send a partial response from the lambda function after each action is complete.
const testFunction = async (event, context, callback) => {
let partialResponse1 = await action1(event);
// send partial response to client
let partialResponse2 = await action2(partialResponse1);
// send partial response to client
let partialResponse3 = await action3(partialResponse2);
// send partial response to client
let response = await action4(partialResponse3);
// send final response
}
Is this possible in Lambda functions? If so, how can we do this? Any reference docs or sample code would be a great help.
Thanks.
Note: This is a fairly simple case of showing a loader with a percentage on the client side. I don't want to overcomplicate things with SQS or Step Functions.
I am still looking for an answer for this.
From what I understand you're using API Gateway + Lambda and are looking to show the progress of the Lambda via UI.
Since each step must finish before the next step begins, I see no reason not to call the Lambda 4 times, or to split it into 4 separate Lambdas.
E.g.:
// Not real syntax!
try {
res1 = await ajax.post('/process', {stage: 1, data: ... });
out('stage 1 complete');
res2 = await ajax.post('/process', {stage: 2, data: res1});
out('stage 2 complete');
res3 = await ajax.post('/process', {stage: 3, data: res2});
out('stage 3 complete');
res4 = await ajax.post('/process', {stage: 4, data: res3});
out('stage 4 complete');
out('process finished');
} catch(err) {
out('stage {$err.stage-number} failed to complete');
}
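The same idea can be written as a small reusable loop. This is just a sketch under assumptions: runStages and StageCaller are hypothetical names, and the actual HTTP call to the Lambda is injected as a function so that the example stays independent of any particular AJAX library or endpoint.

```typescript
// Calls one stage and resolves with that stage's output. In a real client this
// would wrap the HTTP request to the Lambda (an assumption, not shown here).
type StageCaller = (stage: number, data: unknown) => Promise<unknown>;

// Runs the stages one after another, feeding each result into the next stage
// and reporting a percentage after every completed stage.
async function runStages(
  callStage: StageCaller,
  stageCount: number,
  initialData: unknown,
  onProgress: (percent: number) => void
): Promise<unknown> {
  let data = initialData;
  for (let stage = 1; stage <= stageCount; stage++) {
    data = await callStage(stage, data);
    onProgress(Math.round((stage / stageCount) * 100));
  }
  return data;
}
```

If a stage rejects, the loop stops and the rejection propagates to the caller, which can then show which stage failed.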
If you still want all 4 actions to run during the same Lambda execution, you may do the following. This is especially relevant if the process is expected to be very long, because it's usually not good practice to keep a long-hanging HTTP transaction open.
You may implement it by saving the "progress" in a database, and when the process is complete save the results to the database as well.
All you need to do is query the status every X seconds.
// Not real syntax
Gateway-API --> lambda1 - startProcess(): returns ID {
uuid = randomUUID();
write to dynamoDB { status: starting }.
send sqs-message-to-start-process(data, uuid);
return response { uuid: uuid };
}
SQS --> lambda2 - execute(): returns void {
try {
let partialResponse1 = await action1(event);
write to dynamoDB { status: action 1 complete }.
// send partial response to client
let partialResponse2 = await action2(partialResponse1);
write to dynamoDB { status: action 2 complete }.
// send partial response to client
let partialResponse3 = await action3(partialResponse2);
write to dynamoDB { status: action 3 complete }.
// send partial response to client
let response = await action4(partialResponse3);
write to dynamoDB { status: action 4 complete, response: response }.
} catch(err) {
write to dynamoDB { status: failed, error: err }.
}
}
Gateway-API --> lambda3 -> getStatus(uuid): returns status {
return status from dynamoDB (uuid);
}
Your UI Code:
res = ajax.get(/startProcess);
uuid = res.uuid;
in interval every X (e.g. 3) seconds {
status = ajax.get(/getStatus?uuid=uuid);
show(status);
if (status.error) {
handle(status.error) and break;
}
if (status.response) {
handle(status.response) and break;
}
}
Just remember that a Lambda cannot run for longer than 15 minutes. You therefore need to be certain that, whatever the process does, it never exceeds this hard limit.
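On the UI side, the polling loop might look roughly like this. It is a sketch, not a drop-in client: pollUntilDone and JobStatus are hypothetical names, the status shape follows the pseudocode above (a status string plus an optional response or error), and the call to the getStatus endpoint is injected as a function.

```typescript
// Shape of a status record as written by the worker in the pseudocode above (assumed).
interface JobStatus {
  status: string;
  response?: unknown;
  error?: string;
}

// Polls getStatus every intervalMs milliseconds until the record carries a
// response or an error, reporting every intermediate status to onUpdate.
// Gives up after maxAttempts polls so the UI never loops forever.
async function pollUntilDone(
  getStatus: () => Promise<JobStatus>,
  intervalMs: number,
  maxAttempts: number,
  onUpdate: (current: JobStatus) => void
): Promise<JobStatus> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const current = await getStatus();
    onUpdate(current);
    if (current.error !== undefined || current.response !== undefined) {
      return current;
    }
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error("Job did not finish after " + maxAttempts + " polls");
}
```

Choosing maxAttempts so that intervalMs times maxAttempts is a little over the Lambda timeout gives the client a natural point at which to report the job as lost.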
What you are looking for is to have the response exposed as a stream, so that you can write to the stream and flush it.
Unfortunately, that's not available in Node.js:
How to stream AWS Lambda response in node?
https://docs.aws.amazon.com/lambda/latest/dg/programming-model.html
But you can still do the streaming if you use Java
https://docs.aws.amazon.com/lambda/latest/dg/java-handler-io-type-stream.html
package example;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import com.amazonaws.services.lambda.runtime.RequestStreamHandler;
import com.amazonaws.services.lambda.runtime.Context;
public class Hello implements RequestStreamHandler{
// RequestStreamHandler requires this method to be named handleRequest
public void handleRequest(InputStream inputStream, OutputStream outputStream, Context context) throws IOException {
int letter;
// Copy the input to the output one byte at a time, upper-casing as we go
while((letter = inputStream.read()) != -1)
{
outputStream.write(Character.toUpperCase(letter));
}
}
}
Aman,
You can push the partial outputs into SQS and read the SQS messages to process them. This is a simple and scalable architecture. AWS provides SQS SDKs in different languages, for example, JavaScript, Java, Python, etc.
Reading from and writing to SQS is very easy using the SDK, and it can be implemented on the server side or in your UI layer (with proper IAM permissions).
I found AWS step function may be what you need:
AWS Step Functions lets you coordinate multiple AWS services into serverless workflows so you can build and update apps quickly.
Check this link for more detail:
In our example, you are a developer who has been asked to create a serverless application to automate handling of support tickets in a call center. While you could have one Lambda function call the other, you worry that managing all of those connections will become challenging as the call center application becomes more sophisticated. Plus, any change in the flow of the application will require changes in multiple places, and you could end up writing the same code over and over again.

Google IOT per device heartbeat alert using Stackdriver

I'd like to alert on the lack of a heartbeat (or 0 bytes received) from any one of a large number of Google IOT core devices. I can't seem to do this in Stackdriver. It instead appears to let me alert on the entire device registry, which does not give me what I'm looking for (how would I know that a particular device is disconnected?).
So how does one go about doing this?
I have no idea why this question was downvoted as 'too broad'.
The truth is Google IOT doesn't have per device alerting, but instead offers only alerting on an entire device registry. If this is not true, please reply to this post. The page that clearly states this is here:
Cloud IoT Core exports usage metrics that can be monitored
programmatically or accessed via Stackdriver Monitoring. These metrics
are aggregated at the device registry level. You can use Stackdriver
to create dashboards or set up alerts.
The importance of having per device alerting is built into the promise assumed in this statement:
Operational information about the health and functioning of devices is
important to ensure that your data-gathering fabric is healthy and
performing well. Devices might be located in harsh environments or in
hard-to-access locations. Monitoring operational intelligence for your
IoT devices is key to preserving the business-relevant data stream.
So it's not easy today to get an alert if one among many globally dispersed devices loses connectivity. One needs to build that, and depending on what one is trying to do, it would entail different solutions.
In my case I wanted to alert if the last heartbeat time or last event state publish was older than 5 minutes. For this I need to run a looping function that scans the device registry and performs this operation regularly. The usage of this API is outlined in this other SO post: Google iot core connection status
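The check itself is simple once the device records have been fetched. Here is a minimal sketch of just that part, with the registry scan left out: findStaleDevices and DeviceSnapshot are hypothetical names, and I'm assuming each device record exposes lastHeartbeatTime and lastEventTime as ISO 8601 strings.

```typescript
// Subset of a device record needed for the staleness check (assumed shape).
interface DeviceSnapshot {
  id: string;
  lastHeartbeatTime?: string;
  lastEventTime?: string;
}

// Returns the ids of devices whose most recent heartbeat or event is older
// than maxAgeMs, or that have never reported a parseable timestamp at all.
function findStaleDevices(
  devices: DeviceSnapshot[],
  nowMs: number,
  maxAgeMs: number
): string[] {
  return devices
    .filter((device) => {
      const timestamps = [device.lastHeartbeatTime, device.lastEventTime]
        .map((value) => (value ? Date.parse(value) : NaN))
        .filter((ms) => !isNaN(ms));
      if (timestamps.length === 0) {
        return true; // never seen: treat as stale
      }
      const lastSeenMs = Math.max(...timestamps);
      return nowMs - lastSeenMs > maxAgeMs;
    })
    .map((device) => device.id);
}
```

A scheduled function could call this with Date.now() and a threshold of 5 * 60 * 1000, then raise an alert for each returned id.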
For reference, here's a Firebase function I just wrote to check a device's online status, probably needs some tweaks and further testing, but to help anybody else with something to start with:
// Example code to call this function
// const checkDeviceOnline = functions.httpsCallable('checkDeviceOnline');
// Include 'current' key for 'current' online status to force update on db with delta
// const isOnline = await checkDeviceOnline({ deviceID: 'XXXX', current: true })
export const checkDeviceOnline = functions.https.onCall(async (data, context) => {
if (!context.auth) {
throw new functions.https.HttpsError('failed-precondition', 'You must be logged in to call this function!');
}
// deviceID is passed in deviceID object key
const deviceID = data.deviceID
const dbUpdate = (isOnline) => {
if (('wasOnline' in data) && data.wasOnline !== isOnline) {
db.collection("devices").doc(deviceID).update({ online: isOnline })
}
return isOnline
}
const deviceLastSeen = () => {
// We only want to use these to determine "latest seen timestamp"
const stamps = ["lastHeartbeatTime", "lastEventTime", "lastStateTime", "lastConfigAckTime", "deviceAckTime"]
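// Read the timestamps from the fetched device (not from the call payload) and sort numerically
return stamps.map(key => moment(iotDevice[key], "YYYY-MM-DDTHH:mm:ssZ").unix()).filter(epoch => !isNaN(epoch) && epoch > 0).sort((a, b) => a - b).reverse().shift()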
}
await dm.setAuth()
const iotDevice: any = await dm.getDevice(deviceID)
if (!iotDevice) {
throw new functions.https.HttpsError('failed-get-device', 'Failed to get device!');
}
console.log('iotDevice', iotDevice)
// If there is no error status and there is last heartbeat time, assume device is online
if (!iotDevice.lastErrorStatus && iotDevice.lastHeartbeatTime) {
return dbUpdate(true)
}
// Add iotDevice.config.deviceAckTime to root of object
// For some reason in all my tests, I NEVER receive anything on lastConfigAckTime, so this is my workaround
if (iotDevice.config && iotDevice.config.deviceAckTime) iotDevice.deviceAckTime = iotDevice.config.deviceAckTime
// If there is a last error status, let's make sure it's not a stale (old) one
const lastSeenEpoch = deviceLastSeen()
const errorEpoch = iotDevice.lastErrorTime ? moment(iotDevice.lastErrorTime, "YYYY-MM-DDTHH:mm:ssZ").unix() : false
console.log('lastSeen:', lastSeenEpoch, 'errorEpoch:', errorEpoch)
// Device should be online, the error timestamp is older than latest timestamp for heartbeat, state, etc
if (lastSeenEpoch && errorEpoch && (lastSeenEpoch > errorEpoch)) {
return dbUpdate(true)
}
// error status code 4 matches
// lastErrorStatus.code = 4
// lastErrorStatus.message = mqtt: SERVER: The connection was closed because MQTT keep-alive check failed.
// will also be 4 for other mqtt errors like command not sent (qos 1 not acknowledged, etc)
if (iotDevice.lastErrorStatus && iotDevice.lastErrorStatus.code && iotDevice.lastErrorStatus.code === 4) {
return dbUpdate(false)
}
return dbUpdate(false)
})
I also created a function to use with commands, to send a command to the device to check if it's online:
export const isDeviceOnline = functions.https.onCall(async (data, context) => {
if (!context.auth) {
throw new functions.https.HttpsError('failed-precondition', 'You must be logged in to call this function!');
}
// deviceID is passed in deviceID object key
const deviceID = data.deviceID
await dm.setAuth()
const dbUpdate = (isOnline) => {
if (('wasOnline' in data) && data.wasOnline !== isOnline) {
console.log( 'updating db', deviceID, isOnline )
db.collection("devices").doc(deviceID).update({ online: isOnline })
} else {
console.log('NOT updating db', deviceID, isOnline)
}
return isOnline
}
try {
await dm.sendCommand(deviceID, 'alive?', 'alive')
console.log('Assuming device is online after successful alive? command')
return dbUpdate(true)
} catch (error) {
console.log("Unable to send alive? command", error)
return dbUpdate(false)
}
})
This also uses my version of a modified DeviceManager, you can find all the example code on this gist (to make sure using latest update, and keep post on here small):
https://gist.github.com/tripflex/3eff9c425f8b0c037c40f5744e46c319
All of this code, just to check if a device is online or not ... which could be easily handled by Google emitting some kind of event or adding an easy way to handle this. COME ON GOOGLE GET IT TOGETHER!

AWS API Gateway error response generates 502 "Bad Gateway"

I have an API Gateway with a LAMBDA_PROXY Integration Request Type. Upon calling context.succeed in the Lambda, the response header is sent back with code 302 as expected (shown below). However, I want to handle 500 and 404 errors, and the only thing I am sure about so far, is that I am returning the error incorrectly as I am getting 502 Bad Gateway. What is wrong with my context.fail?
Here is my handler.js
const handler = (event, context) => {
//event consists of hard coded values right now
getUrl(event.queryStringParameters)
.then((result) => {
const parsed = JSON.parse(result);
let url;
//handle error message returned in response
if (parsed.error) {
let error = {
statusCode: 404,
body: new Error(parsed.error)
}
return context.fail(error);
} else {
url = parsed.source || parsed.picture;
return context.succeed({
statusCode: 302,
headers: {
Location : url
}
});
}
});
};
If you throw an exception within the Lambda function (or context.fail), API Gateway reads it as if something had gone wrong with your backend and returns 502. If this is a runtime exception you expect and want to return a 500/404, use the context.succeed method with the status code you want and message:
if (parsed.error) {
let error = {
statusCode: 404,
headers: { "Content-Type": "text/plain" }, // not sure here
// the body must be a string, so serialize the error instead of passing an Error object
body: JSON.stringify(parsed.error)
}
return context.succeed(error);
}
I had the same problem, in my case the issue was that my function was not returning anything in context.done(). So instead of context.done(null), I did context.done(null, {});
I've gotten 502's from multiple things. Here are the ones I have figured out so far.
Answer 1:
claudia generate-serverless-express-proxy --express-module {src/server?}
If you are not using claudia and express, this answer won't help you.
Answer 2:
Lambda function->Basic Settings->Timeout. Increase it to something reasonable. It defaults to 3 seconds. But the first time building it typically takes longer.
I had a problem like this, I was returning JSON as a JavaScript Object in the body, but you are supposed to return it as a string. All I had to do was do a JSON.stringify(dataobject) to convert the JSON into a string before returning it.
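A small helper makes it hard to forget the stringify step. This is a sketch with hypothetical names (ProxyResponse, proxyResponse, errorResponse); it assumes a Lambda proxy integration, where the returned object needs a numeric statusCode and a string body.

```typescript
// Response shape expected by a Lambda proxy integration (only the fields used here).
interface ProxyResponse {
  statusCode: number;
  headers: { [name: string]: string };
  body: string;
}

// Builds a proxy response whose body is always a JSON string, never a raw object.
function proxyResponse(statusCode: number, payload: unknown): ProxyResponse {
  return {
    statusCode,
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(payload),
  };
}

// Convenience wrapper for expected failures such as 404 or 500. These are
// returned as normal responses, not thrown, so API Gateway passes the status
// code through instead of answering 502.
function errorResponse(statusCode: number, message: string): ProxyResponse {
  return proxyResponse(statusCode, { error: message });
}
```

In the handler from the question, this would mean returning context.succeed(errorResponse(404, parsed.error)) rather than calling context.fail.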
https://aws.amazon.com/premiumsupport/knowledge-center/malformed-502-api-gateway/