How to facilitate downloading both CSV and PDF from API Gateway connected to S3 - amazon-web-services

In the app I'm working on, we have a process where a user can download a CSV or PDF version of their data. The generation works great, but I'm running into all sorts of problems getting the file to download. We're using API Gateway for all requests, and the generation happens inside a Lambda on a POST request. The GET endpoint takes a file_name parameter, constructs the S3 path, and makes the request directly to S3. The problem I'm having is with transforming the response. I get a 500 error, and when I look at the logs, it says Execution failed due to configuration error: Unable to transform response. So, clearly, that's where I've spent most of my time. I've tried at least 50 different iterations of templates and combinations with little success. The closest I've gotten is the following, where the CSV downloads fine but the PDF is no longer a valid PDF:
CSV:
#set($contentDisposition = "attachment;filename=${method.request.querystring.file_name}")
$input.body
#set($context.responseOverride.header.Content-Disposition = $contentDisposition)
PDF:
#set($contentDisposition = "attachment;filename=${method.request.querystring.file_name}")
$util.base64Encode($input.body)
#set($context.responseOverride.header.Content-Disposition = $contentDisposition)
Both integration responses use contentHandling = CONVERT_TO_TEXT. My binaryMediaTypes list has just application/pdf and that's it. My goal is to get this working without having to offload the problem onto a Lambda, so we don't have that overhead at the download step. Any ideas how to do this right?
Just as another comment: I've tried CONVERT_TO_BINARY and just leaving it as Passthrough. I've tried it with text/csv as another binary media type, and I've tried different combinations of base64 encoding and decoding. I know the data is coming back from S3 correctly; the transformation is where it's breaking. I'm happy to post more logs if need be. Also, I'm pretty sure this makes sense on StackOverflow, but if it would fit better on another StackExchange site, please let me know.
Resources I've looked at:
https://docs.aws.amazon.com/apigateway/latest/developerguide/request-response-data-mappings.html
https://docs.aws.amazon.com/apigateway/latest/developerguide/api-gateway-mapping-template-reference.html#util-template-reference
https://docs.aws.amazon.com/apigateway/latest/developerguide/api-gateway-payload-encodings-workflow.html
https://docs.amazonaws.cn/en_us/apigateway/latest/developerguide/api-gateway-payload-encodings-configure-with-control-service-api.html
(But they're all so confusing...)
EDIT: One idea I've had is to use CONVERT_TO_BINARY and somehow base64-encode the CSVs in the mapping template, but I can't figure out how to do it right. I keep feeling like I'm misunderstanding the order of operations, specifically when the "CONVERT" step happens, if that makes any sense.
EDIT 2: So, I got rid of the $util.base64Encode in the PDF template, and now I get a PDF that's empty. The actual file in S3 definitely has content in it, but for some reason CONVERT_TO_TEXT is not handling it right, or I'm still not understanding how this all works.
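For reference, this is roughly how the application/pdf binary media type was registered on the API (the REST API ID is a placeholder):
aws apigateway update-rest-api \
    --rest-api-id abc123 \
    --patch-operations op=add,path=/binaryMediaTypes/application~1pdf
(The ~1 is JSON Patch escaping for the / in application/pdf.)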

Had similar issues. One major thing is the Accept header. I was testing in Chrome, which sends the Accept header as text/html,application/xhtml.... API Gateway ignores everything except the first one (text/html). It will then convert any response from S3 to base64 to try to conform to text/html.
At last, after trying everything else, I tested via Postman, which defaults the Accept header to */*. Also set the content handling on the integration response to Passthrough. And everything was working!
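An easy way to check the Accept behavior outside a browser is with curl (the URL and file name here are made up):
curl -H "Accept: */*" "https://abc123.execute-api.us-east-1.amazonaws.com/prod/download?file_name=report.pdf" --output report.pdf
With Accept: */* and passthrough, the PDF bytes come back untouched; with Chrome's text/html Accept header, API Gateway converts them on the way out.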
One other thing is to pass the Content-Type and Content-Length headers through (add them in the method response first, then map them in the integration response):
Content-Length: integration.response.header.Content-Length
Content-Type: integration.response.header.Content-Type
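If you're configuring this from the CLI rather than the console, the two steps look something like the following sketch (all IDs are placeholders):
aws apigateway put-method-response \
    --rest-api-id abc123 --resource-id def456 --http-method GET --status-code 200 \
    --response-parameters method.response.header.Content-Type=false,method.response.header.Content-Length=false
aws apigateway put-integration-response \
    --rest-api-id abc123 --resource-id def456 --http-method GET --status-code 200 \
    --response-parameters method.response.header.Content-Type=integration.response.header.Content-Type,method.response.header.Content-Length=integration.response.header.Content-Length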

Related

Unable to PUT big file (2gb) to aws s3 bucket (nodejs) | RangeError: data is too long

I've scoured the internet, and everybody gives different advice, but none of it has helped me.
I'm currently trying to simply send the file.buffer that arrives at my endpoint directly to an AWS bucket.
I'm using PutObjectCommand and have entered all the details correctly, but there's apparently a problem with using a simple await s3.send(command), because my 2.2 GB video is way too big.
I get this error when attempting to upload the file to the cloud:
RangeError: data is too long
    at Hash.update (node:internal/crypto/hash:113:22)
    at Hash.update (C:\Users\misop\Desktop\sebi\sebi-auth\node_modules\@aws-sdk\hash-node\dist-cjs\index.js:12:19)
    at getPayloadHash (C:\Users\misop\Desktop\sebi\sebi-auth\node_modules\@aws-sdk\signature-v4\dist-cjs\getPayloadHash.js:18:18)
    at SignatureV4.signRequest (C:\Users\misop\Desktop\sebi\sebi-auth\node_modules\@aws-sdk\signature-v4\dist-cjs\SignatureV4.js:96:71)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5) {
  code: 'ERR_OUT_OF_RANGE',
  '$metadata': { attempts: 1, totalRetryDelay: 0 }
}
I browsed quite a lot, and there are lots of people saying I should be using a presigned URL. I did try; however, if I do await getSignedUrl(s3, putCommand, {expires: 3600}); I do get a generated URL, but no PUT is sent to the cloud. When I read a little more into it, getSignedUrl is just for generating the signed URL, so there's no way for me to use the Put command there, and I'm not sure how to approach this situation.
I'm currently working with:
"@aws-sdk/client-s3": "^3.238.0",
"@aws-sdk/s3-request-presigner": "^3.238.0",
Honestly, I've been testing lots of different approaches I saw online, but I wasn't successful even when following Amazon's official documentation, where they mention these things, and I truly don't want to implement multipart upload for videos smaller than 4-5 GB.
I'd be honored to hear any advice on this topic. Thank you.
In short: I'm looking for advice on how to implement a simple video upload to AWS S3, after many failed attempts at doing so; there's lots of information out there and the vast majority of it doesn't work.
The solution to my problem was essentially using multer's S3 "addon", which has an s3 property and a ready-made solution.
The "multer-s3": "^3.0.1" version worked even with files of 5 GB and such. Solutions such as using the PutObject command with a presigned URL, or presigned-post methods, were unable to work with the file.buffer the Node server receives after the form is submitted.
If you experienced the same problem and want a quick and easy solution, use the multer-s3 npm package.
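A minimal sketch of that setup, assuming an Express server (the bucket name, region, and route are placeholders):
const express = require('express');
const multer = require('multer');
const multerS3 = require('multer-s3');
const { S3Client } = require('@aws-sdk/client-s3');

const app = express();
const s3 = new S3Client({ region: 'us-east-1' });

// multer-s3 streams the upload to S3 instead of buffering the whole
// file in memory, which is why large videos stop blowing up.
const upload = multer({
  storage: multerS3({
    s3: s3,
    bucket: 'my-video-bucket', // placeholder
    key: (req, file, cb) => cb(null, Date.now() + '-' + file.originalname),
  }),
});

app.post('/upload', upload.single('video'), (req, res) => {
  res.json({ location: req.file.location }); // S3 URL of the stored object
});

app.listen(3000);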

Remove the .html extension by using Lambda@Edge - aws S3

Is it possible to remove the .html extension using Lambda@Edge? It would be a lot easier to write:
const redirects = {
  '/about.html': '/about',
  '/contact.html': '/contact',
  '/start.html': '/start',
};
I've been racking my brain about this for so long. None of this works. My brain is dead now, so I'm asking for help.
The site is stored on S3 and served through CloudFront.
https://github.com/aws-samples/aws-lambda-edge-workshops/tree/master/Workshop1/Lab4_PrettyUrls
There isn't any built-in feature that allows this, but there's a method you can try to accomplish your goal.
Don't name the file index.html; just name it index. Web browsers don't care about the file extension as long as the content of the file is correct. As long as the content type is set to text/html when the object is uploaded to S3, this works perfectly. And if you are uploading through the console, you will have to set that manually, because it isn't assumed automatically.
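For example, the upload might look like this from the CLI (the bucket name is a placeholder):
aws s3 cp ./about s3://my-site-bucket/about --content-type text/html
CloudFront then serves /about as an HTML page, with no Lambda@Edge function involved.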

CSV Export using API Gateway and Lambda

What I would like to do:
I would like to have a URL that returns a CSV file, essentially an export of data, to the caller. I would also like this to remain a serverless solution.
What I have done:
I have created an AWS API Gateway with the URL I want. I have created a Lambda that queries the database and builds a CSV string of that data. That data is placed in a JSON object and returned. API Gateway then pulls the CSV data out of the JSON object and returns CSV to the caller, with the appropriate headers to indicate that it is a CSV attachment. Testing from the browser, I get the download automatically, just as intended.
The problem I see:
This works well until there is a sizable amount of data, at which point I start getting "body size is too long".
My attempts to resolve:
I did some googling around, and I see others have had similar issues. In one solution, I saw that they return a link to the file they created. That solution seems viable for them because they had a server. For my serverless architecture, it seems a little trickier. I could store the file in S3, but then I would have to return a link to S3. That seems like it could work, but it doesn't feel right, as if I'm missing a configuration option. It also feels like I'm exposing the implementation by returning the S3 URLs.
I have looked around for tutorials and examples of people doing similar things, and I haven't found any.
My Questions:
Is there a way to do this?
Is there another solution that I don't know of?
How do I return a larger file, in this case a CSV, from API Gateway?
There is a limit of 6 MB for AWS Lambda response payloads. If the files you need to serve are larger than that, you won't be able to serve them directly from Lambda.
Using S3 to store and serve the files is the standard way of doing something like this. I would leave the S3 bucket private and generate S3 pre-signed URLs in the Lambda function. That limits the time the CSV file is available for download, and it prevents people from being able to guess the URLs of other files you are serving. You would use an S3 Lifecycle Policy to archive or delete the files after a period of time.
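A rough sketch of what that could look like in the Lambda, using the v3 JavaScript SDK (the bucket name and key prefix are placeholders, and buildCsv is a stand-in for however you already build the CSV string):
const { S3Client, PutObjectCommand, GetObjectCommand } = require('@aws-sdk/client-s3');
const { getSignedUrl } = require('@aws-sdk/s3-request-presigner');

const s3 = new S3Client({});

exports.handler = async () => {
  const csv = buildCsv(); // hypothetical: your existing CSV-building logic
  const key = 'exports/' + Date.now() + '.csv';

  // Store the export in a private bucket.
  await s3.send(new PutObjectCommand({
    Bucket: 'my-export-bucket',
    Key: key,
    Body: csv,
    ContentType: 'text/csv',
  }));

  // Hand back a short-lived pre-signed GET URL instead of the file itself.
  const url = await getSignedUrl(
    s3,
    new GetObjectCommand({ Bucket: 'my-export-bucket', Key: key }),
    { expiresIn: 300 } // the link stays valid for five minutes
  );

  return { statusCode: 302, headers: { Location: url } };
};
Returning a 302 redirect keeps the caller's experience close to a direct download, and a lifecycle rule on the exports/ prefix cleans the files up afterward.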

WireMock returns image that's corrupt

I've recorded a mock through WireMock that contains an image in the body. When I try to get the stub using Postman, the response is an image that won't load, and the size of the content is roughly 20-50% larger than when I get the same image from the production server. In Google Chrome it says Resource interpreted as Document but transferred with MIME type image/jpeg.
I can't tell if this is an underlying issue with Jetty or with WireMock. I read some related chatter in the user group about images being returned incorrectly, but I've tried the suggestion of removing the mapping stub and just keeping the __file, with no luck. This seems like an encoding issue, but I don't know how to debug it further.
If you can hang in there until next week, we're putting the finishing touches on a brand-new recorder, and I've been specifically working through the encoding issues the current recorder suffers from.
Meanwhile, you might want to try turning off gzip in your client code.
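A quick way to rule gzip out from the command line (the URL and file name are placeholders):
curl -H "Accept-Encoding: identity" http://localhost:8080/images/photo.jpg --output photo.jpg
If the image comes back valid this way, the corruption is happening in the content-encoding negotiation rather than in the stored stub file.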

JitterBit HTTP Endpoint

I am working on setting up an HTTP Endpoint in JitterBit. For this endpoint, we have a system that will call it and pass parameters to it through the URL.
example...
http://[server]:[server port]/EndPoint?Id={SalesForceID}&Status={updated status in SF}
Would I need to use the Text File, JSON, or XML method for this? As a follow-up question: if it is JSON or XML, what would the file that is uploaded when creating the endpoint look like? I have tried the text file version with no success.
Any help would be great.
I'm just seeing your question now. You may have already found a solution, but this took me a while to figure out, so I'll respond anyway.
To get the passed values, go ahead and create your HTTP Endpoint and add a new operation triggered by it. Then, in your new operation create a script with something like the following:
$SalesForceID = $jitterbit.networking.http.query.Id
$UpdatedStatus = $jitterbit.networking.http.query.Status
You can then use these variables elsewhere in your operation chain.
If you want to use these values to feed another RESTful web service (i.e., an HTTP Source), you'll have to create a separate transformation operation with the HTTP Source. You'd set that source URL to: http://mysfapp.com/call?Id=[SalesForceID]&Status=[UpdatedStatus]. I'm not sure why, but you can't have the script that extracts the parameters from the endpoint and the HTTP Source that uses those parameters in the same operation.
Cheers