Aws::Transfer::TransferManager::UploadDirectory and content-type - c++

I'm attempting to use Aws::Transfer::TransferManager::UploadDirectory to upload a directory of files to S3. These files will later be served to web clients via CloudFront, so I need to set several headers such as Content-Type and Content-Encoding, and they will differ from file to file.
At first glance, there does not appear to be a way to specify this information as part of the UploadDirectory call. There is an Aws::Map<Aws::String, Aws::String> metadata parameter that feels like it should be what I want, but it's missing documentation and I'm not sure how a string -> string mapping could do what I want.
Is UploadDirectory the wrong approach here? Would I be better off re-implementing my own version so that I can do more per-file operations?
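In case I do go the re-implementation route, this is the rough shape I have in mind: walk the directory myself and call TransferManager::UploadFile per file, since that overload accepts a content type. The extension-to-content-type map, bucket name and key scheme below are just placeholders, not anything prescribed by the SDK:

#include <aws/core/Aws.h>
#include <aws/core/utils/threading/Executor.h>
#include <aws/s3/S3Client.h>
#include <aws/transfer/TransferManager.h>
#include <filesystem>

// Placeholder mapping; a real build would cover every extension being served.
static Aws::String GuessContentType(const std::filesystem::path& p)
{
    const auto ext = p.extension().string();
    if (ext == ".html") return "text/html";
    if (ext == ".js")   return "application/javascript";
    if (ext == ".css")  return "text/css";
    return "application/octet-stream";
}

int main()
{
    Aws::SDKOptions options;
    Aws::InitAPI(options);
    {
        auto executor = Aws::MakeShared<Aws::Utils::Threading::PooledThreadExecutor>("uploader", 4);
        Aws::Transfer::TransferManagerConfiguration config(executor.get());
        config.s3Client = Aws::MakeShared<Aws::S3::S3Client>("uploader");
        auto manager = Aws::Transfer::TransferManager::Create(config);

        for (const auto& entry : std::filesystem::recursive_directory_iterator("./site"))
        {
            if (!entry.is_regular_file()) continue;
            // Key scheme is a placeholder; derive whatever layout CloudFront should see.
            Aws::String key = entry.path().generic_string().c_str();
            auto handle = manager->UploadFile(entry.path().string().c_str(),
                                              "my-bucket",                      // placeholder bucket
                                              key,
                                              GuessContentType(entry.path()),   // per-file Content-Type
                                              Aws::Map<Aws::String, Aws::String>());
            handle->WaitUntilFinished();                                        // serialize for simplicity
        }
    }
    Aws::ShutdownAPI(options);
    return 0;
}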

Related

How to specify custom metadata in resumable upload (via XML API)?

I am following the steps for resumable upload outlined here.
According to the documentation, custom metadata has to be specified in the first POST and is passed via x-goog-meta-* headers, i.e.:
x-goog-meta-header1: value1
x-goog-meta-header2: value2
... etc
But in my testing all these values disappear. After the final PUT, the object shows up in the bucket with the proper content-type but without a single piece of custom metadata.
What am I doing wrong?
P.S. It is rather suspicious that the JSON API takes the metadata as the payload of the first POST of a resumable upload...
P.P.S. I am performing the resumable upload via the XML API described here (only using C++ code instead of the curl utility). Adding an x-goog-meta-mykey: myvalue header has no effect on the object's custom metadata.
If you replace AWS4-HMAC-SHA256 in the Authorization header with GOOG4-HMAC-SHA256, it works. GCS uses this bit as a "should I expect x-amz- or x-goog- headers?" switch. The problem is that with a resumable upload you have to specify x-goog-resumable, and adding x-amz-meta-* headers makes the request fail with a message about mixing x-goog- and x-amz- headers.
I also went ahead and changed a few other aspects of the signature, namely:
request type: aws4_request -> goog4_request
signing key: AWS4 -> GOOG4 (GOOG1 works too)
service name: s3 -> storage (even though in some errors GCS asks for either s3 or storage to be specified here, it also takes gcs and maybe other values)
... this isn't necessary, it seems. I've done it just for consistency.
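To make the substitutions concrete, this is roughly the shape the Authorization header ends up with; the key id, date, region and signature below are placeholders, and only the algorithm swap is strictly required as noted above:

#include <iostream>
#include <string>

int main()
{
    const std::string keyId     = "GOOGMYACCESSKEY";    // placeholder HMAC key id
    const std::string dateStamp = "20240101";           // placeholder request date
    // Credential scope: date/region/service/request-type with the GOOG4 variants.
    const std::string scope     = dateStamp + "/us-east-1/storage/goog4_request";

    const std::string authorization =
        "GOOG4-HMAC-SHA256 "                             // was AWS4-HMAC-SHA256
        "Credential=" + keyId + "/" + scope + ", "
        "SignedHeaders=host;x-goog-date;x-goog-meta-mykey;x-goog-resumable, "
        "Signature=<hex-of-hmac>";                       // computed exactly as in SigV4

    std::cout << "Authorization: " << authorization << std::endl;
    return 0;
}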

How to facilitate downloading both CSV and PDF from API Gateway connected to S3

In the app I'm working on, we have a process whereby a user can download a CSV or PDF version of their data. The generation works great, but I'm trying to get it to download the file and am running into all sorts of problems. We're using API Gateway for all the requests, and the generation happens inside a Lambda on a POST request. The GET endpoint takes in a file_name parameter and then constructs the path in S3 and then makes the request directly there. The problem I'm having is when I'm trying to transform the response. I get a 500 error and when I look at the logs, it says Execution failed due to configuration error: Unable to transform response. So, clearly that's where I've spent most of my time. I've tried at least 50 different iterations of templates and combinations with little success. The closest I've gotten is the following code, where the CSV downloads fine, but the PDF is not a valid PDF anymore:
CSV:
#set($contentDisposition = "attachment;filename=${method.request.querystring.file_name}")
$input.body
#set($context.responseOverride.header.Content-Disposition = $contentDisposition)
PDF:
#set($contentDisposition = "attachment;filename=${method.request.querystring.file_name}")
$util.base64Encode($input.body)
#set($context.responseOverride.header.Content-Disposition = $contentDisposition)
where contentHandling = CONVERT_TO_TEXT. My binaryMediaTypes just has application/pdf and that's it. My goal is to get this working without having to offload the problem into a Lambda so we don't have that overhead at the download step. Any ideas how to do this right?
Just as another comment, I've tried CONVERT_TO_BINARY and just leaving it as Passthrough. I've tried it with text/csv as another binary media type and I've tried different combinations of encoding and decoding base64 and stuff. I know the data is coming back right from S3, but the transformation is where it's breaking. I am happy to post more logs if need be. Also, I'm pretty sure this makes sense on StackOverflow, but if it would fit in another StackExchange site better, please let me know.
Resources I've looked at:
https://docs.aws.amazon.com/apigateway/latest/developerguide/request-response-data-mappings.html
https://docs.aws.amazon.com/apigateway/latest/developerguide/api-gateway-mapping-template-reference.html#util-template-reference
https://docs.aws.amazon.com/apigateway/latest/developerguide/api-gateway-payload-encodings-workflow.html
https://docs.amazonaws.cn/en_us/apigateway/latest/developerguide/api-gateway-payload-encodings-configure-with-control-service-api.html
(But they're all so confusing...)
EDIT: One idea I've had is to do CONVERT_TO_BINARY and somehow base64-encode the CSVs in the transformation, but I can't figure out how to do it right. I keep feeling like I'm misunderstanding the order of things, specifically when the "CONVERT" part happens, if that makes any sense.
EDIT 2: So, I got rid of the $util.base64Encode in the PDF one and now I have a PDF that's empty. The actual file in S3 definitely has things in it, but for some reason CONVERT_TO_TEXT is not handling it right, or I'm still not understanding how this all works.
I had similar issues. One major thing is the Accept header. I was testing in Chrome, which sends an Accept header like text/html,application/xhtml+xml,... API Gateway ignores everything except the first entry (text/html). It will then convert any response from S3 to base64 to try to conform to text/html.
At last, after trying everything else, I tried via Postman, which defaults the Accept header to */*. Also set the content handling on the integration response to Passthrough. And everything was working!
One other thing is to pass the Content-Type and Content-Length headers through (add them in the method response first and then in the integration response):
Content-Length -> integration.response.header.Content-Length
Content-Type -> integration.response.header.Content-Type
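If it helps to reproduce the Postman behaviour from code, here is a rough libcurl sketch that requests the file with Accept: */* set explicitly; the endpoint URL and file name are placeholders:

#include <curl/curl.h>
#include <cstdio>

int main()
{
    curl_global_init(CURL_GLOBAL_DEFAULT);
    CURL* curl = curl_easy_init();
    if (!curl) return 1;

    FILE* out = std::fopen("report.pdf", "wb");
    struct curl_slist* headers = nullptr;
    headers = curl_slist_append(headers, "Accept: */*");    // Postman's default; avoids the text/html conversion

    curl_easy_setopt(curl, CURLOPT_URL,
                     "https://example.execute-api.us-east-1.amazonaws.com/prod/download?file_name=report.pdf");
    curl_easy_setopt(curl, CURLOPT_HTTPHEADER, headers);
    curl_easy_setopt(curl, CURLOPT_WRITEDATA, out);          // default callback writes the body to the FILE*

    const CURLcode rc = curl_easy_perform(curl);
    std::printf("curl result: %d\n", static_cast<int>(rc));

    std::fclose(out);
    curl_slist_free_all(headers);
    curl_easy_cleanup(curl);
    curl_global_cleanup();
    return rc == CURLE_OK ? 0 : 1;
}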

gzip with AWS CloudFront and S3

CloudFront offers compression (gzip) for certain file types from the origin. My architecture is CloudFront serving the files from an S3 origin.
So, the requirements for the files to get compressed by CloudFront are:
1. Enable the Compress Objects Automatically option in CloudFront's cache behaviour settings.
2. Content-Type and Content-Length have to be returned by S3. S3 sends these headers by default; I have cross-checked this.
3. The served file type must be one of the file types listed by CloudFront. In my case, I want to compress app.bundle.js, which comes under application/javascript (Content-Type) and is also present in CloudFront's supported file types.
I believe those are the only requirements for getting a gzipped version of the files to the browser. Even with all of the above in place, gzip does not work for me. Any ideas what I am missing?
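One thing worth checking is what the response looks like when the request explicitly advertises gzip support, since CloudFront only compresses when the viewer sends Accept-Encoding: gzip. A rough libcurl sketch of that check (the distribution URL is a placeholder):

#include <curl/curl.h>
#include <cstdio>

// Echo every response header so Content-Encoding (or its absence) is visible.
static size_t PrintHeader(char* buffer, size_t size, size_t nitems, void*)
{
    std::fwrite(buffer, size, nitems, stdout);
    return size * nitems;
}

// The body itself is not interesting for this check.
static size_t DiscardBody(char*, size_t size, size_t nmemb, void*)
{
    return size * nmemb;
}

int main()
{
    curl_global_init(CURL_GLOBAL_DEFAULT);
    CURL* curl = curl_easy_init();
    if (!curl) return 1;

    curl_easy_setopt(curl, CURLOPT_URL, "https://dxxxxxxxxxxxx.cloudfront.net/app.bundle.js");
    curl_easy_setopt(curl, CURLOPT_ACCEPT_ENCODING, "gzip");     // viewer must ask for gzip
    curl_easy_setopt(curl, CURLOPT_HEADERFUNCTION, PrintHeader);
    curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, DiscardBody);

    const CURLcode rc = curl_easy_perform(curl);
    curl_easy_cleanup(curl);
    curl_global_cleanup();
    return rc == CURLE_OK ? 0 : 1;
}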

AWS API Gateway with dynamic URL path parameters

I've got an API with an integration to S3 to serve static files. My resource is quite simple in that I only require the filename to serve the file, like so:
/api/v1/{file}
However, this requires the consumer to know the exact filename, e.g.
/api/v1/purple.json
I want to make this a little more dynamic. Since my files are all JSON, I want the consumer to not have to provide the .json suffix. Is this currently possible with the URL path parameters? I know I can use method.request.path.file to access the purple value, but can I append .json to it myself?
API Gateway does not currently allow for concatenation of values in parameter mapping. This is a feature other customers have requested and is on our backlog.

AWS CloudFront Behavior

I've been setting up AWS Lambda functions for S3 events. I want to set up a new structure within my existing bucket, but that's not possible, so I set up a new bucket the way I want it and will migrate the old things over and send new things there. I wanted to keep some of the structure the same under a given base folder name: old-bucket/images and new-bucket/images. I set up CloudFront to serve from old-bucket/images now, but I wanted to add new-bucket/images as well. I thought the Behaviors tab would let me set it up so that CloudFront checks new-bucket/images first and then old-bucket/images. Alas, that didn't work: if the object wasn't found in the first location, that was the end of the line.
Am I misunderstanding how behaviors work? Has anyone attempted anything like this?
That is expected behavior. An origin tells Amazon CloudFront where to obtain the data to serve to users, based upon a prefix, suffix, etc.
For example, you could serve old-bucket/* from one Amazon S3 bucket, while serving new-bucket/* from a different bucket.
However, there is no capability to fall back to a different origin if a file is not found.
You could check for the existence of files before serving the link, and then provide a different link depending upon where the files are stored. Otherwise, you'll need to put all of your files in the location that matches the link you are serving.
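If you go the existence-check route, a minimal sketch with the AWS SDK for C++ could look like the following; the bucket names, object key and distribution paths are placeholders for whatever the real layout is:

#include <aws/core/Aws.h>
#include <aws/s3/S3Client.h>
#include <aws/s3/model/HeadObjectRequest.h>
#include <iostream>

// HEAD the object; a failed outcome (404/403) means it is not available there.
static bool ObjectExists(const Aws::S3::S3Client& s3, const Aws::String& bucket, const Aws::String& key)
{
    Aws::S3::Model::HeadObjectRequest req;
    req.SetBucket(bucket);
    req.SetKey(key);
    return s3.HeadObject(req).IsSuccess();
}

int main()
{
    Aws::SDKOptions options;
    Aws::InitAPI(options);
    {
        Aws::S3::S3Client s3;
        const Aws::String key = "images/logo.png";               // placeholder object key

        // Prefer the new bucket's path pattern; otherwise link to the old one.
        const Aws::String base = ObjectExists(s3, "my-new-bucket", key)
                                     ? "https://dxxxxxxxxxxxx.cloudfront.net/new-bucket/"
                                     : "https://dxxxxxxxxxxxx.cloudfront.net/old-bucket/";
        std::cout << base << key << std::endl;
    }
    Aws::ShutdownAPI(options);
    return 0;
}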