API Gateway caching not calling Lambda function - amazon-web-services

I'm using Amazon API Gateway to execute a Lambda function when the API endpoint is called. In my Lambda function I'm updating a DynamoDB table.
Whenever I call the API with caching disabled using Chrome Developer Tools, the DynamoDB table is updated.
When I have caching enabled, the first request to my API updates the table; every subsequent request is much faster but doesn't update the table.
I'm assuming that CloudFront is caching the responses so as to not have to call the Lambda function each time.
Is there any way to force the Lambda function to be executed with each request?

A few possible solutions:
Use CloudFront only when you want caching. You don't need it here, so call the API endpoint directly from the browser instead of the CloudFront endpoint. This also saves the CloudFront cost.
Add a timestamp to each request so that every request URL is unique.
If you have to use CloudFront, you can configure which requests should always go to the API endpoint (the ones serving dynamic content) and which ones should be cached.
You are probably calling CloudFront with a GET request. Make it a POST, which CloudFront never caches. Since the request updates a table, it should ideally be a POST anyway. This is the simplest fix and needs the fewest changes.
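A minimal sketch of the timestamp suggestion, in case it helps. The parameter name `_ts` and the example URL are made up for illustration. This only defeats a cache that includes the query string in its cache key, which depends on how CloudFront or the API Gateway stage is configured.

```javascript
'use strict';

// Append a timestamp query parameter so each request URL is unique.
// The parameter name "_ts" is an arbitrary choice for this sketch.
function withCacheBuster(url, now = Date.now()) {
  const u = new URL(url);
  u.searchParams.set('_ts', String(now));
  return u.toString();
}

// Hypothetical endpoint, with a fixed timestamp so the output is predictable.
console.log(withCacheBuster('https://example.com/api/update?id=42', 1700000000000));
// prints https://example.com/api/update?id=42&_ts=1700000000000
```

In a browser you would pass the result to `fetch()` or `XMLHttpRequest`, leaving `now` at its default so every call gets a fresh value.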

Related

Path based AWS API Caching Keys Issue

I have several API paths set up in a test API Gateway setup with a simple 'api' stage. I am using AWS Lambda and wish to cache the results of the Lambda call.
There are three test paths (no authentication)
/a/{thing} (GET Caching turned on in stage)
/b/{thing} (GET Caching turned off in stage)
/c/{thing} (GET Caching turned off in stage)
They all map to the same Lambda function. The Lambda function returns the current time and the value of {thing}.
If I request /a/0000 through /a/1000 I get back the same result for a function that ran for thing=0000.
If I request /b/0000 through /b/1000 (or /c/) I get back uncached results.
thing is selected as 'cache' in resources /a/{thing}. Nothing else is set 'cache'.
It is my understanding that selecting 'cache' next to a path element, query element, or header would construct a cache key - possibly a multi-key cache key hash. That would be ideal!
Ideally /a/0000 and /a/1234 would return a cached version keyed to the {thing} value.
What did I do wrong or misread or step over? Am I hitting a bug when it comes to AWS Lambda? Is caching keyed to authorization - these URLs are public and unauthenticated. I'm just using curl to request these and nothing is being cached on the client side of course.
Honestly, I've also tried using a query argument as the only cache key, flushed the cache, and waited 30 minutes before trying again. It still doesn't give the results I would expect.
Pro Tip:
You still have to deploy from resources to stage when you set up cache keys. This makes sense of course but it would be good if the management console showed more about the method parameters than it does.
I am using Chalice, which is why I wasn't deploying in the normal fashion.
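To illustrate the symptom, here is a toy in-memory model of a cache whose key is built from selected request parameters. It is not how API Gateway is implemented; `makeCachedEndpoint` and the backend are invented for this sketch. With no cache key parameters in effect (for example because the change was never deployed to the stage), every /a/{thing} request shares one entry.

```javascript
'use strict';

// Toy in-memory cache; NOT API Gateway's actual implementation.
function makeCachedEndpoint(backend, cacheKeyParams) {
  const cache = new Map();
  return (resourcePath, params) => {
    // Only the parameters marked as cache keys contribute to the key.
    const keyParts = cacheKeyParams.map((name) => `${name}=${params[name]}`);
    const key = [resourcePath, ...keyParts].join('|');
    if (!cache.has(key)) {
      cache.set(key, backend(params));
    }
    return cache.get(key);
  };
}

// Stand-in for the Lambda function: echoes {thing} back.
const backend = (params) => ({ thing: params.thing });

// No cache key parameters: every /a/{thing} request shares one entry.
const unkeyed = makeCachedEndpoint(backend, []);
// {thing} is part of the key: one entry per distinct value.
const keyed = makeCachedEndpoint(backend, ['thing']);

console.log(unkeyed('/a/{thing}', { thing: '0000' }).thing); // 0000
console.log(unkeyed('/a/{thing}', { thing: '1234' }).thing); // 0000 (first entry reused)
console.log(keyed('/a/{thing}', { thing: '0000' }).thing);   // 0000
console.log(keyed('/a/{thing}', { thing: '1234' }).thing);   // 1234
```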

Lambda@Edge: count S3 asset views

I am trying to create a Lambda function that counts how many times an asset (in this case an mp4 file) is accessed from S3. This will be used to update a db that stores the views for each file. Using Lambda@Edge viewer request triggers has caused the Lambda to be triggered a different number of times depending on whether you're on mobile or desktop, or on different browsers. This is because the request from each browser looks different. I figured I could have clauses for each type of browser request, but I feel like that isn't a very good approach. Any tips for ensuring each view triggers the Lambda only once?

AWS Lambda@Edge. How to read HTML file from S3 and put content in response body

Specifically, in an origin response triggered function (e.g. with a 404 status), how can I read an HTML file stored in S3 and use its content for the response body?
(I would like to manually return a custom error page just as CloudFront does, but choosing it based on cookies).
NOTE: The HTML file in S3 is stored in the same bucket as my website. OAI is enabled.
Thank you very much!
Lambda@Edge functions don't currently¹ have direct access to any body content from the origin.
You will need to grant your Lambda Execution Role the necessary privileges to read from the bucket, and then use s3.getObject() from the JavaScript SDK to fetch the object from the bucket, then use its body.
The SDK is already in the environment,² so you don't need to bundle it with your code. You can just require it, and create the S3 client globally, outside the handler, which saves time on subsequent invocations.
'use strict';
const AWS = require('aws-sdk');
const s3 = new AWS.S3({ region: 'us-east-2' }); // use the correct region for your bucket
exports.handler ...
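For what it's worth, here is one way the elided handler could look. This is only a sketch under assumptions: the `lang` cookie, the `errors/404-<lang>.html` key layout, the bucket name and the helper names `pickErrorKey` and `makeHandler` are all hypothetical, and it assumes the cookie is forwarded to the origin so that it shows up in the origin response event. The S3 client is passed in so the logic can be exercised without AWS.

```javascript
'use strict';

// Choose which error page to serve from the viewer's cookies.
// The cookie name "lang" and the key layout "errors/404-<lang>.html"
// are hypothetical; adapt them to your site.
function pickErrorKey(headers) {
  const cookieHeader = (headers.cookie || []).map((h) => h.value).join('; ');
  const match = /(?:^|;\s*)lang=([a-z]{2})(?:;|$)/.exec(cookieHeader);
  return `errors/404-${match ? match[1] : 'en'}.html`;
}

// Build an origin-response handler around an S3 client (AWS SDK v2 style).
function makeHandler(s3, bucket) {
  return async (event) => {
    const { request, response } = event.Records[0].cf;
    if (response.status !== '404') {
      return response; // pass every other response through untouched
    }
    const object = await s3
      .getObject({ Bucket: bucket, Key: pickErrorKey(request.headers) })
      .promise();
    // Replace the origin's body with the HTML fetched from S3.
    response.headers['content-type'] = [
      { key: 'Content-Type', value: 'text/html; charset=utf-8' },
    ];
    response.body = object.Body.toString('utf-8');
    return response;
  };
}

// Wiring inside the Lambda function (bucket name is a placeholder):
// const AWS = require('aws-sdk');
// exports.handler = makeHandler(new AWS.S3({ region: 'us-east-2' }), 'my-website-bucket');
```

Creating the client once and handing it to `makeHandler` keeps the "client outside the handler" advice intact, since the returned function reuses it on every invocation.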
Note that one of the perceived hassles of updating a Lambda@Edge function is that the Lambda console gives the impression that redeploying it is annoyingly complicated... but you don't have to use the Lambda console to do this. The wording of the "enable trigger and replicate" checkbox gives you the impression that it's doing something important, but it turns out... it isn't. Changing the version number in the CloudFront configuration and saving changes accomplishes the same purpose.
After you create a new version of the function, you can simply go to the Cache Behavior in the CloudFront console and edit the trigger ARN to use the new version number, then save changes.
¹ Currently, though I have submitted this as a feature request. It could potentially allow a response trigger to receive a copy of the response body and rewrite it. It would necessarily be limited to the maximum size of the Lambda API (or smaller, as generated responses are currently limited), and might not be applicable in this case, since I assume you may be fetching a language-specific response.
² Already in the environment: if I remember right, long ago, Lambda@Edge didn't include the SDK, but it is always there now.

How to invalidate AWS APIGateway cache

We have a service which inserts certain values into DynamoDB. For the sake of this question, let's say it's a key:value pair, i.e. customer_id:customer_email. The inserts don't happen that frequently, and once an insert is done, that specific key doesn't get updated.
What we have done is create a client library which, given a customer_id, fetches the customer_email from DynamoDB.
Given that the customer_id data is static, we were thinking of adding a cache, but we are not sure what will happen in the following use case:
client_1 uses our library to fetch customer_email for customer_id = 2.
The customer doesn't exist, so API Gateway returns not found.
API Gateway will cache this response.
For any subsequent calls, this cached response will be sent.
Now another system inserts customer_id = 2 with its email id. This system doesn't know whether this response has been cached previously or not. It doesn't even know that any other system has fetched this specific data. How can we invalidate the cache for this specific customer_id when it gets inserted into DynamoDB?
You can send a request to the API endpoint with a Cache-Control: max-age=0 header which will cause it to refresh.
Allowing anyone to do this could open your application up to attack, as a bad actor can simply flood an expensive endpoint with lots of traffic and buckle your servers/database. To safeguard against that, it's best to require a signed request.
In case it's useful to people, here's .NET code to create the signed request:
https://gist.github.com/secretorange/905b4811300d7c96c71fa9c6d115ee24
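A rough JavaScript sketch of the same idea, showing only the header part. The helper name `buildInvalidationRequest` and the example URL are made up. If the stage requires authorization for cache invalidation, the caller also needs the `execute-api:InvalidateCache` permission and a SigV4-signed request; signing is not shown here and `signedHeaders` stands in for whatever your signing code produces.

```javascript
'use strict';

// Build fetch() options for a request that asks API Gateway to refresh
// the cached entry for the requested URL instead of serving it.
function buildInvalidationRequest(extraHeaders = {}) {
  return {
    method: 'GET',
    headers: { ...extraHeaders, 'Cache-Control': 'max-age=0' },
  };
}

// Hypothetical usage, e.g. right after inserting customer_id = 2:
// await fetch('https://example.execute-api.us-east-1.amazonaws.com/prod/customers/2',
//             buildInvalidationRequest(signedHeaders));
console.log(buildInvalidationRequest());
```

The system that performs the insert would issue this request for the affected customer_id, so the cached "not found" entry is replaced by the fresh response.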
We've built a Lambda which takes care of re-filling the cache with updated results. It's quite a manual process, with very little re-usable code, but it works.
The Lambda is triggered by the application itself, according to the application's needs. For example, in CRUD operations the Lambda is triggered upon successful execution of POST, PATCH and DELETE on a specific resource, in order to clear the general GET request (i.e. clear GET /books whenever POST /book succeeded).
Unfortunately, if you have a view with a server-side paginated table you are going to face all sorts of issues, because invalidating /books is not enough: you may also have /books?page=2, /books?page=3 and so on... a nightmare!
I believe APIG should allow for more granular control of cache entries, otherwise many use cases aren't covered. It would be enough if it let us choose a root cache group for each request, so that we could manage cache entries by group rather than by single request (which, imho, is also less common).
Did you look at this https://docs.aws.amazon.com/apigateway/latest/developerguide/api-gateway-caching.html ?
There is a way to invalidate the entire cache or a particular cache entry.

How do I set the cache key for the AWS API Gateway?

I have a Lambda function that is mapped to an HTTP endpoint using the AWS API Gateway. This works fine, I have mapped query string params to the Lambda event, everything works:
https://api.buzzcloud.xyz/?count=999
Which I can call from http://buzzcloud.xyz
I would like to enable caching, but it seems that by default the API Gateway uses the URL for caching, and so changes in my query string parameters are not triggering a different cache result.
The result is that with caching on, my page returns whatever data was first requested and put in the cache.
How do I set a custom cache key or ensure querystring is part of the cache identifier?
Turns out there is a not-so-secret setting that I totally missed, which lets you choose exactly which query string params are used for the cache key.
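If the API is defined through CloudFormation rather than the console, the equivalent would presumably look something like the fragment below. This is a sketch that only builds the relevant parts of an AWS::ApiGateway::Method definition; `cacheKeyFragment` is an invented helper, the rest of the resource is omitted, and stage caching still has to be enabled and the API redeployed.

```javascript
'use strict';

// Build the parts of an AWS::ApiGateway::Method definition that make the
// given query string parameters part of the cache key.
function cacheKeyFragment(queryParams) {
  const names = queryParams.map((p) => `method.request.querystring.${p}`);
  return {
    // Declare the parameters on the method request (false = not required).
    RequestParameters: Object.fromEntries(names.map((n) => [n, false])),
    Integration: {
      // Parameters listed here are included in the cache key.
      CacheKeyParameters: names,
    },
  };
}

// "count" is the query string parameter from the question.
console.log(JSON.stringify(cacheKeyFragment(['count']), null, 2));
```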