I am totally new to REST APIs, so please pardon the newbie-ish mistakes.
My requirement is:
The source database team wants to send JSON data on an hourly basis to an API endpoint that I publish. I am not sure what all I need to build to make this happen seamlessly. My goal is to receive the data, create CSV files, and save them in AWS S3 for further downstream processing.
My plan is to create an AWS API Gateway endpoint that accepts POST requests; whenever anyone sends data through POST, API Gateway will trigger an AWS Lambda function that runs Python to parse the JSON data into CSV and store it in AWS S3. Is this approach valid? What am I missing? Are there best practices that need to be implemented?
This architecture seems to be what you want to do.
You want to make sure that your API is secured with an API key or via Cognito (more complex), and that your Lambda function has the IAM permissions it needs to access your bucket.
This post will help you understand the Lambda blueprint that is triggered when an object is uploaded to S3. Just change the Lambda trigger, tweak the Python code a little, and you're done.
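To make that concrete, here is a minimal sketch of what the Python Lambda behind the API Gateway POST could look like, assuming a Lambda proxy integration and a placeholder bucket name; the payload shape and CSV layout are assumptions you would adapt to your actual data:

```python
import base64
import csv
import io
import json
from datetime import datetime, timezone

import boto3

s3 = boto3.client("s3")
BUCKET = "my-ingest-bucket"  # placeholder bucket name

def handler(event, context):
    # With a Lambda proxy integration, the POST payload arrives in event["body"]
    body = event.get("body") or "{}"
    if event.get("isBase64Encoded"):
        body = base64.b64decode(body).decode("utf-8")

    records = json.loads(body)
    if isinstance(records, dict):
        records = [records]  # accept a single object or a list of objects
    if not records:
        return {"statusCode": 400, "body": "empty payload"}

    # Convert to CSV, using the keys of the first record as the header row
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(records[0].keys()))
    writer.writeheader()
    writer.writerows(records)

    # Store the CSV under a timestamped key for downstream processing
    key = f"incoming/{datetime.now(timezone.utc):%Y/%m/%d/%H%M%S}.csv"
    s3.put_object(Bucket=BUCKET, Key=key, Body=buffer.getvalue().encode("utf-8"))

    return {"statusCode": 200, "body": json.dumps({"stored": key})}
```

The function's execution role only needs s3:PutObject on that bucket, and the endpoint itself can be locked down with an API key or Cognito as mentioned above.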
Yes, this is a simple, typical serverless stack and it works perfectly fine.
Additionally, you may also want to add authentication on the API Gateway endpoint to make it secure.
Related
I'm approaching AWS for the first time, and I'm trying to build an app using Amplify, API Gateway, Cognito, Lambda and DynamoDB (I'm building some sort of a ToDo app).
I've learnt how to use Lambda and DynamoDB without any authentication system, and now I want to implement one.
I've already set up Cognito, API Gateway and Lambda so that the API can be accessed with an IdToken, but what I'm not able to do is save into DynamoDB the data of the user that called the API: when I log the event inside CloudWatch, I see that all the fields that refer to Cognito are null (e.g. "context.identity" is null).
I need this so that when a user wants to see their data, I can call the GET method, which filters and retrieves only that user's data.
Could anyone please explain whether this is the correct way to do it and what I'm missing, or whether there's an easier way to build this?
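For reference: with a Cognito user pool authorizer in front of a Lambda proxy integration, the caller's identity typically shows up under requestContext.authorizer.claims in the event rather than under context.identity (which is only populated for IAM-based auth through an identity pool). A rough sketch, with the table and attribute names as placeholders:

```python
import json

import boto3

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("TodoItems")  # placeholder table name

def handler(event, context):
    # With a Cognito user pool authorizer and a proxy integration, the caller's
    # verified claims are passed in the request context, not in context.identity.
    claims = event["requestContext"]["authorizer"]["claims"]
    user_id = claims["sub"]  # stable Cognito user identifier

    item = json.loads(event["body"])
    item["userId"] = user_id  # partition the items by the calling user

    table.put_item(Item=item)
    return {"statusCode": 200, "body": json.dumps({"ok": True})}
```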
I have a system which uses AWS IoT to send data to a DynamoDB table. It is simple weather data, storing time, temperature, humidity and pressure. So far that side is working well.
I want to display data from this table on a webpage hosted on S3. In its most simple form this would just display the latest row of data. I had thought that this would be a case of simple client-side JavaScript querying the database, but looking on Amazon it gets quite complicated, with Lambda functions called through API Gateway and IAM used to authenticate.
Is there a simpler way to go about this? The data should be publicly readable and non-writable, so I thought it should be easier than what I have read so far.
Please have a look at the Simple Web Service CDK Pattern. It helps you create a simple end-to-end service using API Gateway, a Lambda function, and access to a DynamoDB table with just a few lines of code. It is available in multiple programming languages.
As a general note: whenever you want to provide dynamic content, you need some kind of application that takes care of it. An API Gateway backed by an AWS Lambda function is no more complicated than running a web server, and it spares you the undifferentiated heavy lifting like network configuration, firewall setup, OS patching, and maintenance. And proper handling of identity and access control needs to be done in either case.
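As a rough illustration of how little code that pattern involves, here is a sketch of the same wiring in CDK v2 for Python; the construct names, table schema, and asset path are placeholders:

```python
from aws_cdk import (
    Stack,
    aws_apigateway as apigw,
    aws_dynamodb as dynamodb,
    aws_lambda as _lambda,
)
from constructs import Construct

class SimpleWebServiceStack(Stack):
    """Roughly what the Simple Web Service pattern wires up for you."""

    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        # Table holding the weather readings (schema is a placeholder)
        table = dynamodb.Table(
            self, "Readings",
            partition_key=dynamodb.Attribute(name="sensorId", type=dynamodb.AttributeType.STRING),
            sort_key=dynamodb.Attribute(name="timestamp", type=dynamodb.AttributeType.STRING),
        )

        # Function that reads the latest row (handler code lives in ./lambda/index.py)
        fn = _lambda.Function(
            self, "ReadLatest",
            runtime=_lambda.Runtime.PYTHON_3_11,
            handler="index.handler",
            code=_lambda.Code.from_asset("lambda"),
            environment={"TABLE_NAME": table.table_name},
        )
        table.grant_read_data(fn)

        # Public REST endpoint in front of the function
        apigw.LambdaRestApi(self, "WeatherApi", handler=fn)
```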
If you really want to just display the latest row, and you prefer to keep your webpage as static as possible, I would consider just writing the latest row of DynamoDB out to a simple JSON file using whatever backend process you want. That file can then be consumed by your front-end application without having to worry about IAM credentials or even the AWS JS SDK; keep it as simple and lightweight as possible.
There is no need to repeatedly hit your DynamoDB table to pull back the same data on each page load either, which should also save you some money in the long run.
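A minimal sketch of that kind of backend process, assuming a table keyed by a sensor id with a timestamp sort key and a placeholder bucket; it could run from a small scheduled Lambda or any cron job:

```python
import json

import boto3
from boto3.dynamodb.conditions import Key

dynamodb = boto3.resource("dynamodb")
s3 = boto3.client("s3")

TABLE = "WeatherReadings"   # placeholder table name
BUCKET = "my-weather-site"  # placeholder bucket hosting the static page

def export_latest(sensor_id="station-1"):
    table = dynamodb.Table(TABLE)

    # Fetch only the newest item for this sensor (timestamp is the sort key)
    resp = table.query(
        KeyConditionExpression=Key("sensorId").eq(sensor_id),
        ScanIndexForward=False,  # newest first
        Limit=1,
    )
    latest = resp["Items"][0] if resp["Items"] else {}

    # Overwrite a small JSON file that the static page fetches with plain JS
    s3.put_object(
        Bucket=BUCKET,
        Key="data/latest.json",
        Body=json.dumps(latest, default=str).encode("utf-8"),
        ContentType="application/json",
    )
```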
There is an in-browser JavaScript SDK. It allows the JavaScript on your web page to make calls directly to DynamoDB, without having to make API calls through API Gateway that trigger a Lambda function that itself makes calls to DynamoDB on your behalf.
The main consideration here is how to authenticate your web browser client. It cannot make DynamoDB API calls without AWS credentials. I'm assuming that you have no server-side component that can return AWS credentials to the client.
You could embed read-only, minimally-permissioned AWS credentials in the page itself. This is definitely not a best practice, but it's simple and in some cases might be acceptable. Other options include requiring the web site user to authenticate via Web Identity Federation, which would dynamically yield a set of usable, time-limited AWS credentials. Another, more sophisticated option would be AWS Amplify, which supports both authenticated and unauthenticated clients. This is generally preferable to hard-coding (read-only) AWS credentials in a web page, but is a little more complex to set up.
There is also a blog post: Dynamic Websites Using the AWS SDK for JavaScript in the Browser.
In all these scenarios, the page itself would make API calls directly to DynamoDB, and you should ensure that the IAM role associated with the credentials is heavily restricted (just the relevant DynamoDB table(s) and just the necessary query/get permissions).
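To make that last point concrete, the policy attached to whatever credentials or role the page ends up using could be scoped down to something like the following (region, account id, and table name are placeholders), shown here as a Python dict you could feed to the IAM APIs:

```python
import json

# A minimally-scoped, read-only policy for whatever role or credentials the
# browser client ends up using; region, account id and table name are placeholders.
READ_ONLY_WEATHER_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["dynamodb:GetItem", "dynamodb:Query"],
            "Resource": "arn:aws:dynamodb:us-east-1:123456789012:table/WeatherReadings",
        }
    ],
}

print(json.dumps(READ_ONLY_WEATHER_POLICY, indent=2))
```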
I am trying to figure out the best way to read a JSON file from a specific endpoint and then save/post that object to AWS S3. I have created a mocked endpoint with a mocked response via https://www.mockable.io/ and I would like to know the best way to 'POST' it to an S3 bucket. New JSON files will be available weekly, and I was thinking that one way to go about it would be to use AWS Lambda and API Gateway. Is this a viable way? I would also like to explore the possibility of an event-triggered way to pull the data, or a scheduler. What would you recommend? I know that AWS SQS is an option, but how do you send the fetched JSON files to the queue?
Thank you, any resource or suggestion is more than welcome. I am looking for potential approaches.
There are quite a lot of different ways you could approach this, but if I understand correctly you want to retrieve a JSON response once a week from a fixed endpoint (which you have set up?), and then write that JSON to a file, or a sequence of files, that you store on S3.
If that is correct, then all you really need is CloudWatch Events (to set up a weekly recurring scheduled event in cron format) to trigger a Lambda function that makes the request and then writes the response to S3. You can also use the same Lambda function (or write another, triggered by the same CloudWatch event) to post a message to SQS with the JSON.
Depending on what language you are most comfortable writing it in, you can use the SDK to do all the things you want to do. Personally I like the Python library boto3; combined with a little file IO to get the JSON into a text file of some kind, and the requests library to make the actual HTTP request to your endpoint, you should be able to do everything you need. The helpful functions in boto3 will be sending an SQS message and writing to S3.
I'm not sure why you'd necessarily need API Gateway to do anything here, unless instead of triggering the Lambda via a scheduled event you wanted to do it by making a separate HTTP request, but then you may as well just make the request to your original API!
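Putting those pieces together, a sketch of such a handler might look like this; the endpoint URL, bucket, and queue URL are placeholders, the function would be triggered by the weekly CloudWatch Events rule described above, and the requests library has to be bundled with the deployment package or shipped in a layer:

```python
import json
from datetime import datetime, timezone

import boto3
import requests  # bundle with the deployment package or ship it in a layer

ENDPOINT = "https://example.mockable.io/weekly.json"  # placeholder endpoint URL
BUCKET = "my-weekly-dumps"                            # placeholder bucket
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/weekly-json"  # placeholder

s3 = boto3.client("s3")
sqs = boto3.client("sqs")

def handler(event, context):
    # Fetch the weekly JSON from the fixed endpoint
    resp = requests.get(ENDPOINT, timeout=30)
    resp.raise_for_status()
    payload = resp.json()

    # Write it to S3 under a dated key
    key = f"dumps/{datetime.now(timezone.utc):%Y-%m-%d}.json"
    s3.put_object(Bucket=BUCKET, Key=key, Body=json.dumps(payload).encode("utf-8"))

    # Optionally hand the same payload to downstream consumers via SQS
    sqs.send_message(QueueUrl=QUEUE_URL, MessageBody=json.dumps(payload))

    return {"stored": key}
```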
Please consider using a Lambda function with Node.js code to do the GET from the endpoint; to invoke the Lambda function on a schedule, use a CloudWatch event:
https://docs.aws.amazon.com/AmazonCloudWatch/latest/events/RunLambdaSchedule.html
Well, I have a web page (PHP) that runs on-premises and is accessed from different countries. I would like to capture some data and store it somewhere. I can handle the data and the format of the capture file internally with the team, but we would like to leverage AWS and store it in S3. So we realized that we need an intermediate layer to avoid using the AWS credentials required for S3.
As this page is on the internet and is consumed by users through the web, we definitely don't want to embed any credentials in the site. So it seems that Kinesis Data Firehose, in a consumer role, could just catch the data sent by our page and then internally store it in S3.
Question
I see that an SDK exists for Kinesis, but it requires AWS credentials. We really need some kind of link to which we send the data we produce and AWS handles the rest, but I don't understand why I need to set up AWS credentials to use the SDK. Does it mean that our website will load and live with our credentials? That approach doesn't feel secure to me. I appreciate any comments.
You can use an API Gateway Kinesis proxy to avoid using credentials, or even the aws-sdk, in your web pages.
https://docs.aws.amazon.com/apigateway/latest/developerguide/integrating-api-with-aws-services-kinesis.html
This way you don't need to expose any credentials, and you control permissions with a role.
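For example, once the proxy is set up per that tutorial, the web page only has to make a plain HTTPS request to the API Gateway endpoint. The sketch below uses Python and a placeholder URL, and the exact body shape and HTTP method depend on the resources and mapping templates you configure:

```python
import json

import requests

# Placeholder URL for an API Gateway method that proxies to Kinesis PutRecord
API_URL = "https://abc123.execute-api.us-east-1.amazonaws.com/prod/streams/site-events/record"

def send_event(event: dict) -> None:
    # No AWS credentials or SDK on the client; the API's execution role is what
    # actually holds kinesis:PutRecord on the stream.
    body = {"Data": json.dumps(event), "PartitionKey": "web-page"}
    # Use PUT or POST to match however the method was configured in the tutorial.
    resp = requests.post(API_URL, json=body, timeout=10)
    resp.raise_for_status()

send_event({"page": "/pricing", "country": "MX"})
```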
If you are worried about a security issue and your users are authenticated, you can use custom authorizers to authorize the URL.
https://docs.aws.amazon.com/apigateway/latest/developerguide/use-custom-authorizer.html
If it is public facing, then just the integration above should work.
Hope it helps.
I am trying to upload a file to S3 and then have a Lambda function generate an id and a date.
I then want to return this data back to the client.
I want to avoid generating the id and date on the client for security reasons.
Currently, I am trying to use API Gateway, which invokes a Lambda function to upload into S3. However, I am having problems setting this up. I know that this is not a preferred method.
Is there another way to do this without writing my own web server? (I would like to use Lambda.)
If not, how can I configure my API Gateway method to support file uploads to Lambda?
You have a couple of options here:
Use API Gateway as an AWS Service Proxy to S3
Use API Gateway to invoke a Lambda function, which uses the AWS SDK to upload to S3
In either case, you will need to base64 encode the file content before calling API Gateway, and POST it in the request body.
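For the second option, here is a minimal sketch of the Lambda side, assuming a proxy integration where the client POSTs the base64-encoded file content as the request body and a placeholder bucket name; it also generates the id and date server-side and returns them to the client:

```python
import base64
import json
import uuid
from datetime import datetime, timezone

import boto3

s3 = boto3.client("s3")
BUCKET = "my-upload-bucket"  # placeholder bucket name

def handler(event, context):
    # The client POSTs the base64-encoded file content as the request body
    file_bytes = base64.b64decode(event["body"])

    # Generate the id and date on the server rather than trusting the client
    file_id = str(uuid.uuid4())
    created_at = datetime.now(timezone.utc).isoformat()

    s3.put_object(Bucket=BUCKET, Key=f"uploads/{file_id}", Body=file_bytes)

    # Return the generated metadata to the caller
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"id": file_id, "date": created_at}),
    }
```

Keep in mind the payload size limits (around 10 MB for API Gateway and 6 MB for a synchronous Lambda invocation), so for larger files a pre-signed S3 URL is usually the better route.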
We don't currently have any documentation on this exact use case but I would refer you to the S3 API and AWS SDK docs for more information. If you have any specific questions we'd be glad to help.
Thanks,
Ryan