AWS S3 Readstream returns no data - amazon-web-services

I have an application on which files are uploaded to S3, and an event is triggered to process them by a Lambda function.
When a file is uploaded I can see the function execution in the CloudWatch logs; however, no data is returned, no error is thrown, and the on('end') handler is never called.
The files being processed are .csv, and I'm able to open them and check the contents manually.
Any ideas on what may be happening?
This is my code:
let es = require('event-stream');
let readStream = s3.getObject({
    Bucket: event.Records[0].s3.bucket.name,
    Key: event.Records[0].s3.object.key
}).createReadStream();

readStream
    .pipe(es.split())
    .pipe(es.mapSync(function (line) {
        console.log(line);
        processLine(line);
    }))
    .on('end', async () => {
        console.log('ON END');
        callback(null, 'OK');
    })
    .on('error', (err) => {
        console.error(JSON.stringify(err));
        callback(JSON.stringify(err));
    });

When a Node.js Lambda handler reaches the end of its main code path, the runtime does not wait for outstanding asynchronous work such as your stream.
Make your handler async and promisify the streaming step so the handler waits for it:
await new Promise((resolve) => {
    readStream
        .pipe(es.split())
        .pipe(es.mapSync(function (line) {
            console.log(line);
            processLine(line);
        }))
        .on('end', () => {
            console.log('ON END');
            callback(null, 'OK');
            resolve();
        })
        .on('error', (err) => {
            console.error(JSON.stringify(err));
            callback(JSON.stringify(err));
            resolve();
        });
});
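For completeness, a minimal sketch of what the whole async handler might look like, assuming the aws-sdk v2 S3 client and the processLine helper from the question:

const AWS = require('aws-sdk');
const es = require('event-stream');

const s3 = new AWS.S3();

exports.handler = async (event) => {
    const readStream = s3.getObject({
        Bucket: event.Records[0].s3.bucket.name,
        Key: event.Records[0].s3.object.key
    }).createReadStream();

    // Awaiting the promise keeps the handler alive until the stream finishes.
    await new Promise((resolve, reject) => {
        readStream
            .pipe(es.split())
            .pipe(es.mapSync(function (line) {
                processLine(line); // processLine as defined in the question
                return line;
            }))
            .on('end', resolve)
            .on('error', reject);
    });

    return 'OK';
};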

Related

AWS s3 NoSuchKey error after retrieving file which was just uploaded

I uploaded a file to my S3 bucket and tried to read it immediately after the upload. Most of the time I get "err NoSuchKey: The specified key does not exist". I checked the bucket using the console and the file actually exists.
After refreshing the page, the file can be read.
Aws region is US East (N Virginia).
File is uploaded with a private read.
export function uploadFile(absolutePath: string, fileBuffer: Buffer, callback: (err, result) => void) {
    try {
        let uploadParams: awsSdk.S3.PutObjectRequest = {
            Bucket: cfg.aws[process.env.NODE_ENV].bucket,
            Key: absolutePath,
            Body: fileBuffer,
            ACL: 'private',
            CacheControl: 'public, max-age=2628000'
        }
        s3.upload(uploadParams, function (err, result) {
            if (err) {
                Util.logError('Aws Upload File', err)
            }
            return callback(err, result)
        })
    } catch (err) {
        Util.logError('Aws Upload File', err)
        return callback(err, null)
    }
}

export function obtainObjectOutput(absolutePath: string, callback: (err, result: awsSdk.S3.GetObjectOutput) => void) {
    let getParams: awsSdk.S3.GetObjectRequest = {
        Bucket: cfg.aws[process.env.NODE_ENV].bucket,
        Key: absolutePath
    }
    s3.getObject(getParams, (error, result) => {
        (error) ? callback(error, null) : callback(null, result)
    })
}
The number one reason that S3 GetObject fails after an upload is that the GetObject request was actually issued before the upload completed. This is an easy mistake to make in async JavaScript.
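A minimal sketch of sequencing the read strictly after the upload, using the uploadFile and obtainObjectOutput helpers above (absolutePath and fileBuffer are placeholder values):

// Only request the object once the upload callback has fired.
uploadFile(absolutePath, fileBuffer, (uploadErr, uploadResult) => {
    if (uploadErr) {
        return console.error('upload failed', uploadErr);
    }
    // s3.upload() has completed, so the key exists now.
    obtainObjectOutput(absolutePath, (getErr, output) => {
        if (getErr) console.error('read failed', getErr);
        else console.log('read back', output.ContentLength, 'bytes');
    });
});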

AWS - Sending 1000's of emails from Lambda / Node.js

I have a "main" Lambda function that gets triggered by SNS. It pulls a list of recipients from the database and it needs to send each of them a message based on a template, replacing things like first name and such.
The way I have it set up is that I created another Lambda function called "email-send", subscribed to an "email-send" topic. The "main" Lambda then loops through the recipients list and publishes messages to "email-send" with a proper payload (from, to, subject, message). This might eventually need to process 1000's of emails in a single batch.
Is this a good approach to my requirements? Perhaps Lambda/SNS is not the way to go? If so, what would you recommend?
With this setup I am running into issues: when my "main" function finishes running, "sns.publish" somehow does not get triggered in my loop. I assume it's because I am not letting it finish, but I am not sure how to fix that, since it's a loop.
Here is the snippet from my Lambda function:
exports.handler = (event, context, callback) => {
    // code is here to pull data into "data" array
    // process records
    for (var i = 0; i < data.length; i++) {
        var sns = new aws.SNS();
        sns.publish({
            Message: JSON.stringify({ from: data[i].from, to: data[i].to, subject: subject, body: body }),
            TopicArn: 'arn:aws:sns:us-west-2:XXXXXXXX:email-send'
        }, function(err, data) {
            if (err) {
                console.log(err.stack);
            } else {
                console.log('SNS pushed!');
            }
        });
    }
    context.succeed("success");
};
Thanks for any assistance.
Your code is doing this...
Begin calling sns.publish() 1000 times
Return (through context.succeed())
You didn't wait for those 1000 calls to finish!
What your code should do is...
Begin calling sns.publish() 1000 times
When all calls to sns.publish() have returned, then return (context.succeed() is legacy; use callback() instead).
Something like this...
// Instantiate the client only once instead of data.length times
const sns = new aws.SNS();

exports.handler = (event, context, callback) => {
    // "data", "subject" and "body" come from the record-pulling code in the question
    const snsCalls = [];
    for (var i = 0; i < data.length; i++) {
        snsCalls.push(sns.publish({
            Message: JSON.stringify({
                from: data[i].from,
                to: data[i].to,
                subject: subject,
                body: body
            }),
            TopicArn: 'arn:aws:sns:us-west-2:XXXXXXXX:email-send'
        }).promise());
    }
    return Promise.all(snsCalls)
        .then(() => callback(null, 'Success'))
        .catch(err => callback(err));
};
I think a better approach is to invoke the worker function directly through the AWS Lambda API.
That way, you don't need SNS.
For example:
var lambda = new AWS.Lambda({region: AWS_REGION});

function invokeWorkerLambda(task, callback) {
    var params = {
        FunctionName: WORKER_LAMBDA_NAME,
        InvocationType: 'Event',
        Payload: JSON.stringify({.....})
    };
    lambda.invoke(params, function(err, data) {
        if (err) {
            console.error(err, err.stack);
            callback(err);
        } else {
            callback(null, data);
        }
    });
}
As you can see, you don't need SNS to invoke the Lambda function.
Important: another suggestion is to create an array of invocation functions and then execute them with the async library, as follows:
async.parallel(invocations, function(err) {
    if (err) {
        console.error(err, err.stack);
        callback(err);
    }
});
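A minimal sketch of how that invocations array might be built, assuming the async library, the invokeWorkerLambda function above, and a data array like the one in the question:

const async = require('async');

// Each element is a function taking a callback, which is what async.parallel expects.
const invocations = data.map((record) => {
    return (done) => invokeWorkerLambda({ from: record.from, to: record.to }, done);
});

async.parallel(invocations, function(err) {
    if (err) {
        console.error(err, err.stack);
        return callback(err); // the Lambda handler's callback
    }
    callback(null, 'Success');
});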
Take a look at this link, where I learned a lot about Lambda invocation: https://cloudonaut.io/integrate-sqs-and-lambda-serverless-architecture-for-asynchronous-workloads/

Testing catch block via jest mock

I'm trying to test the 'catch' block of an async redux action via jest, but throwing from the mock causes the test as a whole to fail.
My action is as follows:
export function loginUser(username, password) {
    return async dispatch => {
        dispatch({type: UPDATE_IN_PROGRESS});
        try {
            let response = await MyRequest.postAsync(
                '/login', {username: username, password: password}
            );
            dispatch({
                type: USER_AUTHENTICATED,
                username: response.username,
                token: response.token,
                role: response.role,
                id: response.id
            });
        } catch (error) {
            dispatch({type: USER_SIGNED_OUT});
            throw error;
        } finally {
            dispatch({type: UPDATE_COMPLETE});
        }
    };
}
The test mocks 'MyRequest.postAsync' to throw an error and thus trigger the catch block, but the test just bails with a 'Failed' message:
it('calls expected actions when failed log in', async() => {
    MyRequest.postAsync = jest.fn(() => {
        throw 'error';
    });
    let expectedActions = [
        {type: UPDATE_IN_PROGRESS},
        {type: USER_SIGNED_OUT},
        {type: UPDATE_COMPLETE}
    ];
    await store.dispatch(userActions.loginUser('foo', 'bar'));
    expect(store.getActions()).toEqual(expectedActions);
});
Is there a way to trigger the catch block to execute in my test via a jest mock function (or any other way for that matter)? Would be annoying to not be able to test a large chunk of code (as all my requests work in the same way).
Thanks in advance for help with this.
I don't know if it's still relevant, but you can do it in this way:
it('tests error with async/await', async () => {
    expect.assertions(1);
    try {
        await store.dispatch(userActions.loginUser('foo', 'bar'));
    } catch (e) {
        // The mock throws the string 'error', which loginUser rethrows.
        expect(e).toEqual('error');
    }
});
See the Jest documentation on error handling for more detail.
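If you prefer to avoid the explicit try/catch, a similar check can be written with Jest's rejects matcher; a sketch assuming the store, action, and mock from the question (which throws the string 'error'):

it('dispatches the failure actions and rejects', async () => {
    MyRequest.postAsync = jest.fn(() => {
        throw 'error';
    });
    // rejects unwraps the rejected value so it can be asserted directly.
    await expect(store.dispatch(userActions.loginUser('foo', 'bar'))).rejects.toEqual('error');
    expect(store.getActions()).toEqual([
        {type: UPDATE_IN_PROGRESS},
        {type: USER_SIGNED_OUT},
        {type: UPDATE_COMPLETE}
    ]);
});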
I had the same issue. The following works for me: wrapping the await in a try/catch.
it('calls expected actions when failed log in', async() => {
    expect.assertions(1); // ensures the assertion inside the catch actually ran
    MyRequest.postAsync = jest.fn(() => {
        throw 'error';
    });
    let expectedActions = [
        {type: UPDATE_IN_PROGRESS},
        {type: USER_SIGNED_OUT},
        {type: UPDATE_COMPLETE}
    ];
    try {
        await store.dispatch(userActions.loginUser('foo', 'bar'));
    } catch(e) {
        expect(store.getActions()).toEqual(expectedActions);
    }
});
I set the instance variable that the function under test accesses to undefined, so that execution goes to the catch block.
PS: this might not always be possible, as there won't always be such a variable to unset.
class APIStore {
    async fetchProductsAPI() {
        try {
            const products = await networkManager.fetch('products')
            this.productsStore.setProducts(products)
        }
        catch(e) {
            this.apiStatus = API_FAILED
            this.apiError = e
        }
    }
}
Test case
it('Check API Error ', async () => {
    const toCheckErrorStore = new APIStore()
    // Setting products store to undefined so that execution goes to catch block
    toCheckErrorStore.productsStore = undefined
    await toCheckErrorStore.fetchProductsAPI()
    expect(toCheckErrorStore.apiStatus).toBe(API_FAILED)
    expect(toCheckErrorStore.apiError).toEqual(errorObjectIWantToCompareWith)
})

AWS Lambda function timing out

In my local mocha tests the following handler function works just fine. However, when I upload it to AWS (using the Serverless framework) it times out, unless no uid parameter is provided, in which case it correctly responds immediately.
What's particularly odd is that in less than 3 seconds (the timeout is set at 5 seconds) the job completes and even the "post-facto" log message is output, yet calling the callback somehow does not complete the Lambda function.
Here's the CloudWatch log:
(screenshot omitted)
And here's the handler function:
export const handler = (event: IRequestInput, context: IContext, cb: IGatewayCallback) => {
    console.log('EVENT:\n', JSON.stringify(event, null, 2));
    const uid = _.get(event, 'queryStringParameters.uid', undefined);
    if (!uid) {
        cb(null, {
            statusCode: 412,
            body: 'no User ID was provided by frontend'
        });
        return;
    }
    oauth.getRequestToken()
        .then(token => {
            console.log('Token is:\n', JSON.stringify(token, null, 2));
            console.log('User ID: ', uid);
            token.uid = uid;
            return Promise.resolve(token);
        })
        .then((token) => {
            console.log('URL: ', token.url);
            cb(null, {
                statusCode: 200,
                body: token.url
            });
            console.log('post-facto');
        })
        .catch((err: PromiseError) => {
            console.log('Problem in getting promise token: ', err);
            cb(err.message);
        });
};
Add the following as the first line of your handler function:
context.callbackWaitsForEmptyEventLoop = false
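By default callbackWaitsForEmptyEventLoop is true, so even after the callback is invoked the function keeps running until the event loop is empty; an open keep-alive socket (for example from the oauth client) is enough to hold it past the 5-second timeout. A minimal sketch of where the line goes (plain JavaScript, type annotations dropped):

exports.handler = (event, context, cb) => {
    // Return as soon as cb is invoked, instead of waiting for the event loop to drain.
    context.callbackWaitsForEmptyEventLoop = false;
    // ... the rest of the handler exactly as in the question ...
};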
I guess that you're using Lambda with the Node.js 0.10 runtime, so you should add
context.done(null, 'Terminate Lambda');
to terminate the execution.
As the AWS Lambda documentation mentions:
The callback is supported only in the Node.js runtime v4.3. If you are using the earlier runtime v0.10.42, you need to use the context methods (done, succeed, and fail) to properly terminate the Lambda function.
Please refer to this link for the above information.

Pushing AWS Lambda data to Kinesis Stream

Is there a way to push data from a Lambda function to a Kinesis stream? I have searched the internet but have not found any examples of it.
Thanks.
Yes, you can send information from Lambda to a Kinesis stream, and it is very simple to do. Make sure you are running the Lambda with the right permissions.
Create a file called kinesis.js. This file will provide a 'save' function that receives a payload and sends it to the Kinesis stream, so it can be included anywhere we want to send data to the stream. Code:
const AWS = require('aws-sdk');
const kinesisConstant = require('./kinesisConstants'); //Keep it consistent
const kinesis = new AWS.Kinesis({
    apiVersion: kinesisConstant.API_VERSION, //optional
    //accessKeyId: '<you-can-use-this-to-run-it-locally>', //optional
    //secretAccessKey: '<you-can-use-this-to-run-it-locally>', //optional
    region: kinesisConstant.REGION
});

const savePayload = (payload) => {
    //We can only save strings into the streams
    if (typeof payload !== kinesisConstant.PAYLOAD_TYPE) {
        try {
            payload = JSON.stringify(payload);
        } catch (e) {
            console.log(e);
        }
    }
    let params = {
        Data: payload,
        PartitionKey: kinesisConstant.PARTITION_KEY,
        StreamName: kinesisConstant.STREAM_NAME
    };
    kinesis.putRecord(params, function(err, data) {
        if (err) console.log(err, err.stack);
        else console.log('Record added:', data);
    });
};

exports.save = (payload) => {
    const params = {
        StreamName: kinesisConstant.STREAM_NAME,
    };
    kinesis.describeStream(params, function(err, data) {
        if (err) console.log(err, err.stack);
        else {
            //Make sure stream is able to take new writes (ACTIVE or UPDATING are good)
            if (data.StreamDescription.StreamStatus === kinesisConstant.STATE.ACTIVE
                || data.StreamDescription.StreamStatus === kinesisConstant.STATE.UPDATING) {
                savePayload(payload);
            } else {
                console.log(`Kinesis stream ${kinesisConstant.STREAM_NAME} is ${data.StreamDescription.StreamStatus}.`);
                console.log(`Record Lost`, JSON.parse(payload));
            }
        }
    });
};
Create a kinesisConstants.js file (matching the require above) to keep it consistent :)
module.exports = {
    STATE: {
        ACTIVE: 'ACTIVE',
        UPDATING: 'UPDATING',
        CREATING: 'CREATING',
        DELETING: 'DELETING'
    },
    STREAM_NAME: '<your-stream-name>',
    PARTITION_KEY: '<string-value-if-one-shard-anything-will-do>',
    PAYLOAD_TYPE: 'string', // typeof returns the lowercase 'string'
    REGION: '<the-region-where-you-have-lambda-and-kinesis>',
    API_VERSION: '2013-12-02'
}
Your handler file: the 'done' function sends a response back to whoever invoked the function, but 'kinesis.save(event)' does all the work.
const kinesis = require('./kinesis');

exports.handler = (event, context, callback) => {
    console.log('LOADING handler');
    const done = (err, res) => callback(null, {
        statusCode: err ? '400' : '200',
        body: err || res,
        headers: {
            'Content-Type': 'application/json',
        },
    });
    kinesis.save(event); // here we send it to the stream
    done(null, event);
};
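One caveat: kinesis.save(event) is fire-and-forget, so the handler responds before the write happens, and any putRecord error is only logged, never reported to the caller. A minimal sketch of a promise-based variant that waits, assuming the same kinesisConstants module and skipping the describeStream check for brevity:

// kinesis.js, promise-based
const AWS = require('aws-sdk');
const kinesisConstant = require('./kinesisConstants');
const kinesis = new AWS.Kinesis({ region: kinesisConstant.REGION });

exports.save = (payload) => {
    const data = typeof payload === 'string' ? payload : JSON.stringify(payload);
    return kinesis.putRecord({
        Data: data,
        PartitionKey: kinesisConstant.PARTITION_KEY,
        StreamName: kinesisConstant.STREAM_NAME
    }).promise();
};

// handler.js
const kinesis = require('./kinesis');

exports.handler = async (event) => {
    await kinesis.save(event); // the handler now waits for the write to finish
    return {
        statusCode: '200',
        body: JSON.stringify(event),
        headers: { 'Content-Type': 'application/json' }
    };
};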
This is done exactly the same way as you would do it on your own computer.
Here's an example in nodejs:
let aws = require('aws-sdk');
let kinesis = new aws.Kinesis();

// data that you'd like to send
let data_object = { "some": "properties" };
let data = JSON.stringify(data_object);

// push data to kinesis
const params = {
    Data: data,
    PartitionKey: "1",
    StreamName: "stream name"
};

kinesis.putRecord(params, (err, data) => {
    if (err) console.error(err);
    else console.log("data sent");
});
Please note, this piece of code will not work as-is, because the Lambda has no permissions on your stream.
When accessing AWS resources through Lambda, it is better to use IAM roles.
When configuring a new Lambda, you can choose an existing role or create a new one.
Go to IAM, then Roles, and pick the role name you assigned to the Lambda function.
Add the relevant permissions (putRecord, putRecords).
Then, test the Lambda.
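A minimal sketch of such a policy, granting only the two put actions on a single stream (the ARN is a placeholder):

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "kinesis:PutRecord",
                "kinesis:PutRecords"
            ],
            "Resource": "<YOUR_STREAM_ARN>"
        }
    ]
}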
Yes, this can be done. I was trying to accomplish the same thing and was able to do it in Lambda using the Node.js 4.3 runtime; it also works in version 6.10.
Here is the code:
Declare the following at the top of your Lambda function:
var AWS = require("aws-sdk");
var kinesis = new AWS.Kinesis();
function writeKinesis(rawdata) {
    const data = JSON.stringify(rawdata);
    const params = { Data: data, PartitionKey: "<PARTITION_KEY>", StreamName: "<STREAM_NAME>" };
    kinesis.putRecord(params, (err, data) => {
        if (err) console.error(err);
        else console.log("data sent");
    });
}
Now, in the exports.handler, call the function:
writeKinesis(<YOUR_DATA>);
A few things to note... for Kinesis to ingest data, it must be encoded. In the example below, I have a function that takes logs from CloudWatch and sends them over to a Kinesis stream.
Note that I'm inserting the contents of buffer.toString('utf8') into the writeKinesis function:
exports.handler = function(input, context) {
    ...
    var zippedInput = new Buffer(input.awslogs.data, 'base64');
    zlib.gunzip(zippedInput, function(error, buffer) {
        ...
        writeKinesis(buffer.toString('utf8'));
        ...
    });
    ...
}
Finally, in IAM, configure the appropriate permissions. Your Lambda function has to run within the context of an IAM role that includes the permissions below. In my case, I just modified the default lambda_elasticsearch_execution role to include a policy called "lambda_kinesis_execution" with the following statement:
"Effect": "Allow",
"Action": [
"kinesis:*"
],
"Resource": [
"<YOUR_STREAM_ARN>"
]