Why can't my Lambda function find my DynamoDB table? - amazon-web-services

I'm new to AWS and I am struggling with it a little.
I have been following an Amazon tutorial on how to:
- create a permission and role
- create a Lambda function
- create an API to talk to the Lambda function
- create a DynamoDB table that the Lambda function will change
This is the tutorial: https://docs.aws.amazon.com/lambda/latest/dg/services-apigateway-tutorial.html
My problem is that I follow the instructions, but when I test the API's POST method in the API Gateway console I get this confusing response:
Request: /dynamodbmanager
Status: 200
Latency: 1719 ms
Response Body
{
  "errorType": "ResourceNotFoundException",
  "errorMessage": "Requested resource not found",
  "trace": [
    "ResourceNotFoundException: Requested resource not found",
    "    at deserializeAws_json1_0ResourceNotFoundExceptionResponse (/var/runtime/node_modules/@aws-sdk/client-dynamodb/dist-cjs/protocols/Aws_json1_0.js:3090:23)",
    "    at deserializeAws_json1_0PutItemCommandError (/var/runtime/node_modules/@aws-sdk/client-dynamodb/dist-cjs/protocols/Aws_json1_0.js:2088:25)",
    "    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)",
    "    at async /var/runtime/node_modules/@aws-sdk/middleware-serde/dist-cjs/deserializerMiddleware.js:7:24",
    "    at async /var/runtime/node_modules/@aws-sdk/lib-dynamodb/dist-cjs/baseCommand/DynamoDBDocumentClientCommand.js:18:34",
    "    at async /var/runtime/node_modules/@aws-sdk/middleware-signing/dist-cjs/middleware.js:13:20",
    "    at async StandardRetryStrategy.retry (/var/runtime/node_modules/@aws-sdk/middleware-retry/dist-cjs/StandardRetryStrategy.js:51:46)",
    "    at async /var/runtime/node_modules/@aws-sdk/middleware-logger/dist-cjs/loggerMiddleware.js:6:22",
    "    at async Runtime.handler (file:///var/task/index.mjs:33:16)"
  ]
}
I can't see what I'm doing wrong. It seems as if the table doesn't exist (even though I have created it).
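For reference, here is a quick check I could run (a boto3 sketch; the table name is a placeholder for whatever I created in the tutorial). DynamoDB's ResourceNotFoundException on PutItem means no table with that name exists in the Region the SDK client is talking to, so listing tables per Region should reveal a name or Region mismatch:

import boto3

TABLE_NAME = "lambda-apigateway"  # replace with the table name you created

for region in ("us-east-1", "us-east-2", "us-west-2", "eu-west-1"):  # Regions you may have used
    tables = boto3.client("dynamodb", region_name=region).list_tables()["TableNames"]
    marker = "<-- table found here" if TABLE_NAME in tables else ""
    print(region, tables, marker)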

There are many ways to create an AWS Lambda function; the approaches described in that tutorial (Node.js or Python) are just two of the supported options.
If you want to learn how to build an AWS Lambda function that successfully interacts with an Amazon DynamoDB table, take a look at this step-by-step document. It builds a Lambda function using the Lambda Java runtime API.
The use case is an AWS Lambda function that detects personal protective equipment (PPE) in images located in an Amazon Simple Storage Service (Amazon S3) bucket. After you execute the Lambda function, it detects PPE information in each image using the Amazon Rekognition service and creates a record in an Amazon DynamoDB table.
If you follow this step by step dev doc, you will not get that ResourceNotFoundException.
See: Creating an AWS Lambda function that detects images with Personal Protective Equipment
How do I know this works? It has been tested many times, and I just re-tested it: the Lambda function runs successfully and data is written to the table.
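For reference, the core of that use case, sketched in Python with boto3 rather than the Java used in the doc (the bucket, image, table, and key names here are placeholders):

import boto3

rekognition = boto3.client("rekognition")
dynamodb = boto3.resource("dynamodb")

BUCKET = "my-ppe-images"      # placeholder bucket
PHOTO = "worker-site-1.jpg"   # placeholder image key
TABLE = "PPEResults"          # placeholder table; assumes "photo" is its partition key

def lambda_handler(event, context):
    # Ask Rekognition to detect PPE in the image stored in S3.
    result = rekognition.detect_protective_equipment(
        Image={"S3Object": {"Bucket": BUCKET, "Name": PHOTO}},
        SummarizationAttributes={
            "MinConfidence": 80,
            "RequiredEquipmentTypes": ["FACE_COVER", "HEAD_COVER", "HAND_COVER"],
        },
    )
    summary = result["Summary"]
    # Record the summary counts in DynamoDB.
    dynamodb.Table(TABLE).put_item(Item={
        "photo": PHOTO,
        "personsWithRequiredEquipment": len(summary["PersonsWithRequiredEquipment"]),
        "personsWithoutRequiredEquipment": len(summary["PersonsWithoutRequiredEquipment"]),
    })
    return {"statusCode": 200}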

Related

Is it possible to use BigQuery in AWS Lambda?

I want to write a Python function that sends data to BigQuery every time a put event occurs in my S3 bucket, but I'm new to AWS. Is it possible to integrate BigQuery with a Lambda function? Or can someone give me another way to stream my DynamoDB data to BigQuery? Thank you; my language is Python.
N.B.: I used DynamoDB Streams with Firehose to send my data to S3; now I want to retrieve my data from S3 every time a put event occurs and send it to BigQuery.
There are already plenty of resources online about how to trigger a Lambda after a PutObject on an S3 bucket.
But here are a few links to get you set up:
You will need to set up an EventBridge rule (CloudWatch Events is the legacy name) to trigger your Lambda when some action happens on your S3 bucket:
https://aws.amazon.com/fr/blogs/compute/using-dynamic-amazon-s3-event-handling-with-amazon-eventbridge/
You can use the boto3 Python SDK to write AWS Lambdas:
https://boto3.amazonaws.com/v1/documentation/api/latest/index.html
You can check the BigQuery Python SDK by GCP to communicate with your BQ database: https://googleapis.dev/python/bigquery/latest/index.html
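To tie those pieces together, here is a minimal sketch (not production code) of a Lambda handler that reads a newly uploaded JSON-lines file from S3 and streams its rows into BigQuery. It assumes a direct S3 event notification (an EventBridge rule delivers a slightly different event shape, with the bucket and key under "detail"), that the google-cloud-bigquery package is bundled with the function, and that GCP credentials are available, e.g. via GOOGLE_APPLICATION_CREDENTIALS; the table id is a placeholder:

import json

import boto3
from google.cloud import bigquery  # must be packaged with the Lambda (zip or layer)

s3 = boto3.client("s3")
bq = bigquery.Client()  # picks up GCP credentials from the environment

TABLE_ID = "my-gcp-project.my_dataset.my_table"  # placeholder

def lambda_handler(event, context):
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        body = s3.get_object(Bucket=bucket, Key=key)["Body"].read().decode("utf-8")
        # One JSON object per line -> one BigQuery row per line.
        rows = [json.loads(line) for line in body.splitlines() if line.strip()]
        errors = bq.insert_rows_json(TABLE_ID, rows)  # streaming insert
        if errors:
            raise RuntimeError(f"BigQuery insert failed: {errors}")
    return {"statusCode": 200}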

Querying and updating Redshift through AWS Lambda

I am using a Step Function and it gives a JSON to the Lambda as the event (object data from an S3 upload). I have to check the JSON and compare two values in it (file name and eTag) to the data in my Redshift DB. If the entry does not exist, I have to classify the file to a different bucket and add an entry to the Redshift DB (versioning). Trouble is, I do not have a good idea of how I can query and update Redshift through Lambda. Can someone please give suggestions on what methods I should adopt? Thanks!
Edit: Should've mentioned the Lambda is in Python.
One way to achieve this use case is to write the Lambda function using the Java runtime API and then, within the Lambda function, use a RedshiftDataClient object. Using this API, you can perform CRUD operations on a Redshift cluster.
To see examples:
https://github.com/awsdocs/aws-doc-sdk-examples/tree/master/javav2/example_code/redshift/src/main/java/com/example/redshiftdata
If you are unsure how to build a Lambda function by using the Lambda Java runtime API that can invoke AWS services, please refer to:
Creating an AWS Lambda function that detects images with Personal Protective Equipment
This example shows you how to develop a Lambda function using the Java runtime API that invokes AWS services. So instead of invoking Amazon S3 or Rekognition, use the RedshiftDataClient within the Lambda function to perform Redshift CRUD operations. (If you prefer to stay in Python, the same Redshift Data API is available through boto3; see the sketch below.)
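Since the edit mentions Python, here is a rough sketch of the same idea using the Redshift Data API through boto3 (the Python counterpart of the Java RedshiftDataClient). The cluster, database, user, and table names are placeholders, and in a real Lambda you might prefer the Data API's event integration over polling in a loop:

import time

import boto3

rsd = boto3.client("redshift-data")

def run_sql(sql):
    # Submit the statement; the Data API is asynchronous.
    resp = rsd.execute_statement(
        ClusterIdentifier="my-cluster",   # placeholder
        Database="dev",                   # placeholder
        DbUser="awsuser",                 # or pass SecretArn instead of DbUser
        Sql=sql,
    )
    stmt_id = resp["Id"]
    # Poll until the statement finishes.
    while True:
        desc = rsd.describe_statement(Id=stmt_id)
        if desc["Status"] in ("FINISHED", "FAILED", "ABORTED"):
            break
        time.sleep(0.5)
    if desc["Status"] != "FINISHED":
        raise RuntimeError(desc.get("Error", "statement failed"))
    return rsd.get_statement_result(Id=stmt_id) if desc.get("HasResultSet") else None

# Example: check whether the (file name, eTag) pair already exists, insert it if not.
result = run_sql("SELECT 1 FROM file_versions WHERE file_name = 'report.csv' AND etag = 'abc123'")
if result is None or not result["Records"]:
    run_sql("INSERT INTO file_versions (file_name, etag) VALUES ('report.csv', 'abc123')")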

Serverless-ly Query External REST API from AWS and Store Results in S3?

Given a REST API, outside of my AWS environment, which can be queried for json data:
https://someExternalApi.com/?date=20190814
How can I set up a serverless job in AWS to hit the external endpoint on a periodic basis and store the results in S3?
I know that I can instantiate an EC2 instance and just set up a cron. But I am looking for a serverless solution, which seems more idiomatic.
Thank you in advance for your consideration and response.
Yes, you absolutely can do this, and probably in several different ways!
The pieces I would use would be:
A CloudWatch Events (EventBridge) rule using a cron-like schedule, which then triggers...
A Lambda function (with the right IAM permissions) that calls the API using, e.g., Python's requests or an equivalent HTTP library, and then uses the AWS SDK to write the results to an S3 bucket of your choice
An S3 bucket ready to receive!
This should be all you need to achieve what you want.
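A minimal sketch of the Lambda piece under those assumptions (the endpoint is the one from the question; the bucket name is a placeholder); schedule it with a CloudWatch Events / EventBridge cron or rate rule:

import json
import urllib.request
from datetime import datetime, timezone

import boto3

s3 = boto3.client("s3")
BUCKET = "my-results-bucket"  # placeholder

def lambda_handler(event, context):
    date_str = datetime.now(timezone.utc).strftime("%Y%m%d")
    url = f"https://someExternalApi.com/?date={date_str}"
    # urllib keeps the deployment dependency-free; 'requests' works just as well if packaged.
    with urllib.request.urlopen(url, timeout=10) as resp:
        payload = resp.read()
    key = f"api-results/{date_str}.json"
    s3.put_object(Bucket=BUCKET, Key=key, Body=payload, ContentType="application/json")
    return {"statusCode": 200, "body": json.dumps({"saved": key})}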
I'm going to skip the implementation details, as they are largely outside the scope of your question. As such, I'm going to assume your function is already written and targets Node.js.
AWS can do this on its own, but to make it simpler, I'd recommend using the Serverless Framework. We're going to assume you're using it.
Assuming you're entirely new to serverless, the first thing you'll need to do is to create a handler:
serverless create --template "aws-nodejs" --path my-service
This creates a service based on the aws-nodejs template on the provided path. In there, you will find serverless.yml (the configuration for your function) and handler.js (the code itself).
Assuming your function is exported as crawlSomeExternalApi on the handler export (module.exports.crawlSomeExternalApi = () => {...}), the functions entry in your serverless file would look like this if you wanted to invoke it every 3 hours:
functions:
  crawl:
    handler: handler.crawlSomeExternalApi
    events:
      - schedule: rate(3 hours)
That's it! All you need now is to deploy it with serverless deploy -v
Under the hood, this creates a CloudWatch schedule entry for your function. An example can be found in the documentation.
The first thing you need is a Lambda function. Implement your logic of hitting the API and writing data to S3 inside the Lambda function. Next, you need a schedule to periodically trigger it. A schedule expression can be used to trigger an event periodically, using either a cron expression or a rate expression, and the Lambda function you created earlier should be configured as the target for this CloudWatch rule.
The resulting flow will be: CloudWatch invokes the Lambda function whenever the rule fires, and the Lambda then performs your logic.
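For completeness, here is a sketch of wiring that schedule up with boto3 instead of the console (the rule name, rate, and function ARN are placeholders):

import boto3

events = boto3.client("events")
lambda_client = boto3.client("lambda")

FUNCTION_ARN = "arn:aws:lambda:eu-west-1:123456789012:function:crawlSomeExternalApi"  # placeholder

# 1. Create (or update) a scheduled rule.
rule_arn = events.put_rule(
    Name="crawl-external-api-every-3-hours",
    ScheduleExpression="rate(3 hours)",
    State="ENABLED",
)["RuleArn"]

# 2. Allow the events service to invoke the function.
lambda_client.add_permission(
    FunctionName=FUNCTION_ARN,
    StatementId="allow-eventbridge-schedule",
    Action="lambda:InvokeFunction",
    Principal="events.amazonaws.com",
    SourceArn=rule_arn,
)

# 3. Point the rule at the function.
events.put_targets(
    Rule="crawl-external-api-every-3-hours",
    Targets=[{"Id": "crawl-lambda", "Arn": FUNCTION_ARN}],
)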

Lambda read DynamoDB and send to ML Endpoint

Background:
I have a DynamoDB table with columns "TimeStamp | Data1 | Data2". I also have an ML endpoint in SageMaker which needs Data1 and Data2 to generate one output value (score).
Question:
My ambition is to script a Lambda function (Java or Python) to read the latest row in the DynamoDB table, send it through the endpoint, and receive the score.
What I have tried:
I have only found guides where you do this by exporting the whole DynamoDB table to S3 and sending it to the endpoint with Data Pipeline. This is not how I want it to work!
I am an engineer on the SageMaker team.
As I understand it, you would like to use a Lambda function to:
(1) Listen to DynamoDB table updates
(2) Invoke a SageMaker endpoint for real-time predictions
For (1), DynamoDB Streams could be a wonderful place to start; AWS provides a tutorial on processing DynamoDB Streams with a Lambda function.
For (2), there is a step-by-step tutorial on invoking a SageMaker endpoint for predictions inside a Lambda function. A sketch of the combined flow is below.
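A minimal sketch of the combined flow (assumptions: the endpoint name is a placeholder, the model accepts CSV input, the stream is configured with the NEW_IMAGE view type, and Data1/Data2 are numeric attributes):

import boto3

runtime = boto3.client("sagemaker-runtime")
ENDPOINT_NAME = "my-scoring-endpoint"  # placeholder

def lambda_handler(event, context):
    scores = []
    for record in event["Records"]:           # DynamoDB Streams batch
        if record["eventName"] not in ("INSERT", "MODIFY"):
            continue
        new_image = record["dynamodb"]["NewImage"]
        data1 = new_image["Data1"]["N"]        # stream images use DynamoDB's typed format
        data2 = new_image["Data2"]["N"]
        resp = runtime.invoke_endpoint(
            EndpointName=ENDPOINT_NAME,
            ContentType="text/csv",
            Body=f"{data1},{data2}",
        )
        scores.append(resp["Body"].read().decode("utf-8"))
    return {"scores": scores}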
Hope this helps.

Is it possible to automatically delete objects older than 10 minutes in AWS S3?

We want to delete objects from S3, 10 minutes after they are created. Is it possible currently?
I have a working solution that was built serverless with the help of AWS's Simple Queue Service and AWS Lambda. It works for all objects created in an S3 bucket.
Overview
When any object is created in your S3 bucket, the bucket sends an event with the object details to an SQS queue configured with a 10-minute delivery delay. The SQS queue is also configured to trigger a Lambda function. The Lambda function reads the object details from the event and deletes the object from the S3 bucket. All three components involved (S3, SQS and Lambda) are low cost, loosely coupled, serverless, and scale automatically to very large workloads.
Steps Involved
Set up your Lambda function first. In my solution, I used Python 3.7. The code for the function is:
import json
import boto3

def lambda_handler(event, context):
    # Each SQS record's body contains the original S3 event notification.
    for record in event['Records']:
        v = json.loads(record['body'])
        for rec in v["Records"]:
            bucketName = rec["s3"]["bucket"]["name"]
            objectKey = rec["s3"]["object"]["key"]
            # print("bucket is " + bucketName + " and object is " + objectKey)
            # Note: keys with spaces or special characters arrive URL-encoded;
            # urllib.parse.unquote_plus(objectKey) handles those cases.
            sss = boto3.resource("s3")
            obj = sss.Object(bucketName, objectKey)
            obj.delete()
    return {
        'statusCode': 200,
        'body': json.dumps('Delete Completed.')
    }
This code and a sample message file were uploaded to a GitHub repo.
Create a vanilla SQS queue. Then configure the queue to have a 10-minute delivery delay. This setting can be found under Queue Actions -> Configure Queue (it is the fourth setting down).
Configure the SQS queue to trigger the Lambda function you created in step 1. To do this, use Queue Actions -> Configure Trigger for Lambda Function. The setup screen is self-explanatory. If you don't see your Lambda function from step 1, redo it and make sure you are using the same Region.
Set up your S3 bucket so that it fires an event to the SQS queue you created in step 2. This is found on the main bucket screen: click the Properties tab and select Events. Click the plus sign to add an event and fill out the form.
The important points are to select All object create events and to select the queue you created in step 2 in the last pull-down on this screen.
Last step: add an execution policy to your Lambda function that allows it to delete only from the specific S3 bucket. You can do this via the Lambda function console: scroll down the Lambda function screen and configure it under Execution Role (a boto3 sketch of such a policy follows at the end of this answer).
This works for files I've copied into a single S3 bucket. The solution could support many S3 buckets feeding one queue and one Lambda.
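For the last step, here is a sketch of attaching that execution policy with boto3 instead of the console (the role and bucket names are placeholders):

import json

import boto3

iam = boto3.client("iam")

policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": "s3:DeleteObject",
        "Resource": "arn:aws:s3:::my-expiring-bucket/*",   # placeholder bucket
    }],
}

iam.put_role_policy(
    RoleName="delete-after-10-min-lambda-role",            # placeholder role name
    PolicyName="allow-delete-from-bucket",
    PolicyDocument=json.dumps(policy),
)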
In addition to the detailed solution proposed by @taterhead involving an SQS queue, one might also consider the following serverless solution using AWS Step Functions:
Create a State Machine in AWS Step Functions with a Wait state of 10 minutes followed by a Task state executing a Lambda function that will delete the object.
Configure CloudTrail and CloudWatch Events to start an execution of your state machine when an object is uploaded to S3.
It has the advantage of (1) not being constrained by SQS's 15-minute maximum delivery delay and (2) avoiding the continuous queue-polling cost generated by the Lambda function.
Inspiration: Schedule emails without polling a database using Step Functions
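A sketch of that state machine created with boto3 (the Lambda ARN and role ARN are placeholders; the Wait state pauses 600 seconds before the delete Task runs):

import json

import boto3

sfn = boto3.client("stepfunctions")

definition = {
    "StartAt": "WaitTenMinutes",
    "States": {
        "WaitTenMinutes": {"Type": "Wait", "Seconds": 600, "Next": "DeleteObject"},
        "DeleteObject": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:eu-west-1:123456789012:function:delete-s3-object",  # placeholder
            "End": True,
        },
    },
}

sfn.create_state_machine(
    name="delete-object-after-10-minutes",
    definition=json.dumps(definition),
    roleArn="arn:aws:iam::123456789012:role/step-functions-delete-role",  # placeholder
)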
If anyone is still interested in this, S3 now offers lifecycle rules, which I've just been looking into, and they seem simple enough to configure in the AWS S3 console.
The Management tab of an S3 bucket has a button labeled "Add lifecycle rule", where users can select specific prefixes for objects and also set expiration times for the objects in the bucket being modified.
For a more detailed explanation, AWS has published an article on the matter, which explains this in more detail here.
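For reference, a sketch of configuring such a rule with boto3 rather than the console; note that lifecycle Expiration is specified in whole days (one day minimum), so it suits general cleanup rather than the 10-minute requirement above. Bucket and prefix are placeholders:

import boto3

s3 = boto3.client("s3")

s3.put_bucket_lifecycle_configuration(
    Bucket="my-expiring-bucket",                     # placeholder
    LifecycleConfiguration={
        "Rules": [{
            "ID": "expire-temp-objects",
            "Filter": {"Prefix": "temp/"},
            "Status": "Enabled",
            "Expiration": {"Days": 1},               # lifecycle granularity is days, not minutes
        }],
    },
)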