What is Cloudfront Minimum TTL for? - amazon-web-services

I am trying to understand Minimum TTL, Maximum TTL and Default TTL with this document.
As my understanding, Maximum TTL is used when HTTP cache header appears in respons to limit maximum cache time and Default TTL is used when there is no HTTP cache header to use as default cache time.
However, for Maximum TTL, there is no specific mention.
In addition, It mentions the relation with forwarding head. Does it mean that if I set any HTTP header to forward to an origin and Minimum TTL is not 0, it doesn't cache anything?
Minimum TTL
Specify the minimum amount of time, in seconds, that you want objects to stay in CloudFront caches before CloudFront forwards another request to your origin to determine whether the object has been updated. The default value for Minimum TTL is 0 seconds.
Important
. If you configure CloudFront to forward all headers to your origin for a cache behavior, CloudFront never caches the associated objects. Instead, CloudFront forwards all requests for those objects to the origin. In that configuration, the value of Minimum TTL must be 0.

When deciding whether and for how long to cache an object, CloudFront uses the following logic:
Check for any Cache-Control response header with these values:
no-cache
no-store
private
If any of these is encountered, stop, and set the object's TTL¹ to the configured value of Minimum TTL. A non-zero value means CloudFront will cache objects that it would not otherwise cache.
Otherwise, find the origin's directive for how long the object may be cached. In order, find one of these response headers:
Cache-Control: s-maxage=x
Cache-Control: max-age=x
Expires
Stop on the first value encountered using this ordering, then continue to the next step.
If no value was found, use Default TTL. Stop.
Otherwise, using the value discovered in the previous step:
If smaller than Minimum TTL, then set the object's TTL to Minimum TTL; otherwise,
If larger than Maximum TTL, then set the object's TTL to Maximum TTL; otherwise,
Use the value found in the previous step as the object's TTL.
See https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/Expiration.html.
It's important to note that the TTL determines how long CloudFront is allowed to cache the response. It does not dictate how long CloudFront is required to cache the response. CloudFront can evict objects from cache before TTL expires, if the object is rarely accessed.
Whitelisting some (but not all) headers for forwarding to the origin does not change any of the above logic.
What it does change is how objects are evaluated to determine whether a cached response is available.
For example, if you forward the Origin header to the origin, then each unique value for an Origin header creates a different cache entry. Two requests that are identical, except for their Origin header, are then considered different objects... so a cached response for Origin: https://one.example.com would not be used if a later request for the same resource included Origin: https://two.example.com. Both would be sent to the origin, and both would be cached independently, for use in serving future requests with same the matching request header.
CloudFront does this because if you need to forward headers to the origin, then this implies that the origin will potentially react differently to different values for the whitelisted headers... so they are cached separately.
Forwarding headers unnecessarily will thus reduce your cache hit rate unnecessarily.
There is no documented limit to the number of different copies of the same resource that CloudFront can cache, based on varying headers.
But forwarding all headers to the origin reduces to almost zero the chance of any future request being truly identical. This would potentially consume a lot of cache storage, storing objects that would never again be reused, so CloudFront treats this as a special case, and does not allow any caching under this condition. As a result, you are required to set Minimum TTL to 0 for consistency.
¹the object's TTL as used here refers to CloudFront's internal timer for each cached object that tracks how long it is allowed to continue to serve the cached object without checking back with the origin. The object's TTL inside CloudFront is known only to CloudFront, so this value does not impact browser caching.

Related

AWS Cloudfront 0 Minimum TTL meaning & cost

I read that Cloudfront now supports setting 0 as minimum TTL, I am wondering how it works though and if it affects the cost of Cloudfront/ AWS S3?
How does Cloudfront know when a js or css file has changed? We upload new file during deployments to S3. I have no idea if the modification time is correct or not but I think it is.
Main question is if it's cheaper to set it to 0 or keep 5 minutes as we have now and do an invalidation of js/css in Cloudfront after deployments. Sometimes when we golive with only backend changes we won't need to invalidate. Most of the time we have css/js changes also though.
Minimum TTL rarely needs to be customized.
Minimum TTL does not define the minimum amount of time that CloudFront will cache an object. That's what it sounds like, but that's a common misconception.
To help clarify that point, let me first clarify TTL.
The TTL (time-to-live) is a value CloudFront calculates, internally, for each individual object, for the purpose of determining whether an object should be cached at all, and then whether an object found in the cache is still considered fresh. If an object has been in the cache for less time than its TTL, it is said to be fresh, otherwise it is said to be stale.
A fresh object can be served to a viewer without checking with the origin.
A stale object should not be served to a viewer without verifying with the origin, and should eventually be purged.
CloudFront may retain a stale object in its cache for some period of time. (It can verify with the origin whether the stale object is still good, with a conditional request. It can also continue to use the stale object in limited conditions, including an outage on your origin).
CloudFront may also purge a fresh object from its cache at any time. Why would it do that? Cache space is free, so it wouldn't make sense for everything that CloudFront caches to continue to be stored in the cache if it isn't getting requested. CloudFront can purge an "unpopular" object from cache at any time.
From this, it is hopefully clear that TTL is not "how long will CloudFront cache the object?" but rather "how long will CloudFront consider a cached object fresh?"
If an object has a calculated TTL of 0, CloudFront won't cache the object.
So... how would an object get a calculated 0 TTL? At least one of these headers in the response from the origin:
Cache-Control: private
Cache-Control: no-cache
Cache-Control: no-store
Cache-Control: s-maxage=0
Cache-Control: max-age=0 // ignored when s-maxage is present
These are the only cases where Minimum TTL typically has a role to play. (Also, the Expires HTTP header can trigger it, but don't use Expires -- use Cache-Control).
Cache-Control -- as returned by the origin server -- is primarily intended for the browser, but CloudFront observes it as well, when trying to calculate an appropriate TTL for each object.
Minimum TTL establishes a lower boundary on the value CloudFront will extrapolate from the origin's Cache-Control header, from the conditions above, and smaller values are rounded up. If your origin returns (e.g.) Cache-Control: private, no-cache, no-store, that's a TTL that should be 0... but CloudFront sets its internal TTL for that object to the greater of 0 or Minimum TTL.
If the response had (e.g.) Cache-Control: max-age=15 then both browers and CloudFront would calculate a TTL for that object of 15 seconds. If (e.g.) Minimum TTL is set to 300 then CloudFront ignores the 15 seconds specified by the origin and sets its internal TTL for the object to 300 seconds. The browser will still use 15 seconds, because CloudFront does not modify the Cache-Control response header.
tl;dr: Minimum TTL is the minimum internal TTL value that CloudFront will assign to an object, regardless of values in the origin response that should cause it to calculate a lower value. Changing this setting from the default value 0 is an advanced option. It does not set a minimum amount of time that CloudFront is required to cache objects.
https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/Expiration.html#ExpirationDownloadDist

Does Cloudwatch not log Lambda functions from Cloudfront?

Does Cloudfront require special settings to trigger a log?
I have the following flow:
Devices -> Cloudfront -> API Gateway -> Lambda Function
which works, but Cloudwatch doesn't seem to create logs for the lambda function (or API Gateway).
However, the following flow creates logs:
Web/Curl -> API Gateway -> Lambda Function
In comments, above, we seem to have arrived at a conclusion that unanticipated client-side caching (or caching somewhere between the client and the AWS infrastructure) may be a more appropriate explanation for the observed behavior, since there is no known mechanism by which an independent CloudFront distribution could access a Lambda function via API Gateway and cause those requests not to be logged by Lambda.
So, I'll answer this with a way to confirm or reject this hypothesis.
CloudFront injects a header into both requests and responses, X-Amz-Cf-Id, containing opaque tokens that uniquely identify the request and the response. Documentation refers to these as "encrypted," but for our purposes, they're simply opaque values with a very high probability of uniqueness.
In spite of having the same name, the request header and the response header are actually two uncorrelated values (they don't match each other on the same request/response).
The origin-side X-Amz-Cf-Id is sent to the origin server in the request is only really useful to AWS engineers, for troubleshooting.
But the viewer-side X-Amz-Cf-Id returned by CloudFront in the response is useful to us, because not only is it unique to each response (even responses from the CloudFront cache have different values each time you fetch the same object) but it also appears in the CloudFront access logs as x-edge-request-id (although the documentation does not appear to unambiguously state this).
Thus, if the client side sees duplicate X-Amz-Cf-Id values across multiple responses, there is something either internal to the client or between the client and CloudFront (in the client's network or ISP) that is causing cached responses to be seen by the client.
Correlating the X-Amz-Cf-Id from the client across multiple responses may be useful (since they should never be the same) and with the CloudFront logs may also be useful, since this confirms the timestamp of the request where CloudFront actually generated this particular response.
tl;dr: observing the same X-Amz-Cf-Id in more than one response means caching is occurring outside the boundaries of AWS.
Note that even though CloudFront allows min/max/default TTLs to impact how long CloudFront will cache the object, these settings don't impact any downstream or client caching behavior. The origin should return correct Cache-Control response headers (e.g. private, no-cache, no-store) to ensure correct caching behavior throughout the chain. If the origin behavior can't be changed, then Lambda#Edge origin response or viewer response triggers can be used to inject appropriate response headers -- see this example on Server Fault.
Note also that CloudFront caches 4xx/5xx error responses for 5 minutes by default. See Amazon CloudFront Latency for an explanation and steps to disable this behavior, if desired. This feature is designed to give the origin server a break, and not bombard it with requests that are presumed to continue to fail, anyway. This behavior may cause various problems in testing as well as production, so there are cases where it should be disabled.

How do I add simple licensing to api when using AWS Cloudfront to cache queries

I have an application deployed on AWS Elastic Beanstalk, I added some simple licensing to stop abuse of the api, the user has to pass a licensekey as a field
i.e
search.myapi.com/?license=4ca53b04&query=fred
If this is not valid then the request is rejected.
However until the monthly updates the above query will always return the same data, therefore I now point search.myapi.com to an AWS CloudFront distribution, then only if query is not cached does it go to actual server as
direct.myapi.com/?license=4ca53b04&query=fred
However the problem is that if two users make the same query they wont be deemed the same by Cloudfront because the license parameter is different. So the Cloudfront caching is only working at a per user level which is of no use.
What I want to do is have CloudFront ignore the license parameters for caching but not the other parameters. I dont mind too much if that means user could access CloudFront with invalid license as long as they cant make successful query to server (since CloudFront calls are cheap but server calls are expensive, both in terms of cpu and monetary cost)
Perhaps what I need is something in front of CloudFront that does the license check and then strips out the license parameter but I don't know what that would be ?
Two possible come to mind.
The first solution feels like a hack, but would prevent unlicensed users from successfully fetching uncached query responses. If the response is cached, it would leak out, but at no cost in terms of origin server resources.
If the content is not sensitive, and you're only trying to avoid petty theft/annoyance, this might be viable.
For query parameters, CloudFront allows you to forward all, cache on whitelist.
So, whitelist query (and any other necessary fields) but not license.
Results for a given query:
valid license, cache miss: request goes to origin, origin returns response, response stored in cache
valid license, cache hit: response served from cache
invalid license, cache hit: response served from cache
invalid license, cache miss: response goes to origin, origin returns error, error stored in cache.
Oops. The last condition is problematic, because authorized users will receive the cached error if the make the same query.
But we can fix this, as long as the origin returns an HTTP error for an invalid request, such as 403 Forbidden.
As I explained in Amazon CloudFront Latency, CloudFront caches responses with HTTP errors using different timers (not min/default/max-ttl), with a default of t minutes. This value can be set to 0 (or other values) for each of several individual HTTP status codes, like 403. So, for the error code your origin returns, set the Error Caching Minimum TTL to 0 seconds.
At this point, the problematic condition of caching error responses and playing them back to authorized clients has been solved.
The second option seems like a better idea, overall, but would require more sophistication and probably cost slightly more.
CloudFront has a feature that connects it with AWS Lambda, called Lambda#Edge. This allows you to analyze and manipulate requests and responses using simple Javascript scripts that are run at specific trigger points in the CloudFront signal flow.
Viewer Request runs for each request, before the cache is checked. It can allow the request to continue into CloudFront, or it can stop processing and generate a reaponse directly back to the viewer. Generated responses here are not stored in the cache.
Origin Request runs after the cache is checked, only for cache misses, before the request goes to the origin. If this trigger generates a response, the response is stored in the cache and the origin is not contacted.
Origin Response runs after the origin response arrives, only for cache misses, and before the response goes onto the cache. If this trigger modifies the response, the modified response stored in the cache.
Viewer Response runs immediately before the response is returned to the viewer, for both cache misses and cache hits. If this trigger modifies the response, the modified response is not cached.
From this, you can see how this might be useful.
A Viewer Request trigger could check each request for a valid license key, and reject those without. For this, it would need access to a way to validate the license keys.
If your client base is very small or rarely changes, the list of keys could be embedded in the trigger code itself.
Otherwise, it needs to validate the key, which could be done by sending a request to the origin server from within the trigger code (the runtime environment allows your code to make outbound requests and receive responses via the Internet) or by doing a lookup in a hosted database such as DynamoDB.
Lambda#Edge triggers run in Lambda containers, and depending on traffic load, observations suggest that it is very likely that subsequent requests reaching the same edge location will be handled by the same container. Each container only handles one request at a time, but the container becomes available for the next request as soon as control is returned to CloudFront. As a consequence of this, you can cache the results in memory in a global data structure inside each container, significantly reducing the number of times you need to ascertain whether a license key is valid. The function either allows CloudFront to continue processing as normal, or actively rejects the invalid key by generating its own response. A single trigger will cost you a little under $1 per million requests that it handles.
This solution prevents missing or unauthorized license keys from actually checking the cache or making query requests to the origin. As before, you would want to customize the query string whitelist in the CloudFront cache behavior settings to eliminate license from the whitelist, and change the error caching minimum TTL to ensure that errors are not cached, even though these errors should never occur.

Objects are getting expired from CloudFront [duplicate]

Cloudfront is configured to cache the images from our app. I found that the images were evicted from the cache really quickly. Since the images are generated dynamically on the fly, this is pretty intense for our server. In order to solve the issue I set up a testcase.
Origin headers
The image is served from our origin server with correct Last-Modified and Expires headers.
Cloudfront cache behaviour
Since the site is HTTPS only I set the Viewer Protocol Policy to HTTPS. Forward Headers is set to None and Object Caching to Use Origin Cache Headers.
The initial image request
I requested an image at 11:25:11. This returned the following status and headers:
Code: 200 (OK)
Cached: No
Expires: Thu, 29 Sep 2016 09:24:31 GMT
Last-Modified: Wed, 30 Sep 2015 09:24:31 GMT
X-Cache: Miss from cloudfront
A subsequent request
A reload a little while later (11:25:43) returned the image with:
Code: 304 (Not Modified)
Cached: Yes
Expires: Thu, 29 Sep 2016 09:24:31 GMT
X-Cache: Hit from cloudfront
A request a few hours later
Nearly three hours later (at 14:16:11) I went to the same page and the image loaded with:
Code: 200 (OK)
Cached: Yes
Expires: Thu, 29 Sep 2016 09:24:31 GMT
Last-Modified: Wed, 30 Sep 2015 09:24:31 GMT
X-Cache: Miss from cloud front
Since the image was still cached by the browser it loaded quickly. But I cannot understand how the Cloudfront could not return the cached image. Therefor the app had to generate the image again.
I read that Cloudfront evicts files from its cache after a few days of being inactive. This is not the case as demonstrated above. How could this be?
I read that Cloudfront evicts files from its cache after a few days of being inactive.
Do you have an official source for that?
Here's the official answer:
If an object in an edge location isn't frequently requested, CloudFront might evict the object—remove the object before its expiration date—to make room for objects that have been requested more recently.
http://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/Expiration.html
There is no guaranteed retention time for cached objects, and objects with low demand are more likely to be evicted... but that isn't the only factor you may not have considered. Eviction may not be the issue, or the only issue.
Objects cached by CloudFront are like Schrödinger's cat. It's a loose analogy, but I'm running with it: whether an object is "in the cloudfront cache" at any given instant is not a yes-or-no question.
CloudFront has somewhere around 53 edge locations (where your browser connects and the content is physically stored) in 37 cities. Some major cities have 2 or 3. Each request that hits cloudfront is routed (via DNS) to the most theoretically optimal location -- for simplicity, we'll call it the "closest" edge to where you are.
The internal workings of Cloudfront are not public information, but the general consensus based on observations and presumably authoritative sources is that these edge locations are all independent. They don't share caches.
If, for example, your are in Texas (US) and your request routed through and was cached in Dallas/Fort Worth, TX, and if the odds are equal that you any request from you could hit either of the Dallas edge locations, then until you get two misses of the same object, the odds are about 50/50 that your next request will be a miss. If I request that same object from my location, which I know from experience tends to route through South Bend, IN, then the odds of my first request being a miss are 100%, even though it's cached in Dallas.
So an object is not either in, or not in, the cache because there is no "the" (single, global) cache.
It is also possible that CloudFront's determination of the "closest" edge to your browser will change over time.
CloudFront's mechanism for determining the closest edge appears to be dynamic and adaptive. Changes in the topology of the Internet at large can change shift which edge location will tend to receive requests sent from a given IP address, so it is possible that over the course of a few hours, that the edge you are connecting to will change. Maintenance or outages or other issues impacting a particular edge could also cause requests from a given source IP address to be sent to a different edge than the typical one, and this could also give you the impression of objects being evicted, since the new edge's cache would be different from the old.
Looking at the response headers, it isn't possible to determine which edge location handled each request. However, this information is provided in the CloudFront access logs.
I have a fetch-and-resize image service that handles around 750,000 images per day. It's behind CloudFront, and my hit/miss ratio is about 50/50. That is certainly not all CloudFront's fault, since my pool of images exceeds 8 million, the viewers are all over the world, and my max-age directive is shorter than yours. It has been quite some time since I last analyzed the logs to determine which and how "misses" seemed unexpected (though when I did, there definitely were some, but their number was not unreasonable), but that is done easily enough, since the logs tell you whether each response was a hit or a miss, as well as identifying the edge location... so you could analyze that to see if there's really a pattern here.
My service stores all of its output content in S3, and when a new request comes in, it first sends a quick request to the S3 bucket to see if there is work that can be avoided. If a result is returned by S3, then that result is returned to CloudFront instead of doing all the fetching and resizing work, again. Mind you, I did not implement that capability because of the number of CloudFront misses... I designed that in from the beginning, before I ever even tested it behind CloudFront, because -- after all -- CloudFront is a cache, and the contents of a cache are pretty much volatile and ephemeral, by definition.
Update: I stated above that it does not appear possible to identify the edge location forwarding a particular request by examining the request headers from CloudFront... however, it appears that it is possible with some degree of accuracy by examining the source IP address of the incoming request.
For example, a test request sent to one of my origin servers through CloudFront arrives from 54.240.144.13 if I hit my site from home, or 205.251.252.153 when I hit the site from my office -- the locations are only a few miles apart, but on opposite sides of a state boundary and using two different ISPs. A reverse DNS lookup of these addresses shows these hostnames:
server-54-240-144-13.iad12.r.cloudfront.net.
server-205-251-252-153.ind6.r.cloudfront.net.
CloudFront edge locations are named after the nearest major airport, plus an arbitrarily chosen number. For iad12 ... "IAD" is the International Air Transport Association (IATA) code for Washington, DC Dulles airport, so this is likely to be one of the edge locations in Ashburn, VA (which has three, presumably with different numerical codes at the end, but I can't confirm that from just this data). For ind6, "IND" matches the airport at Indianapolis, Indiana, so this strongly suggests that this request comes through the South Bend, IN, edge location. The reliability of this test would depend on the consistency with which CloudFront maintains its reverse DNS entries. It is not documented how many independent caches might be at any given edge location; the assumption is that there's only one, but there might be more than one, having the effect of increasing the miss ratio for very small numbers of requests, but disappearing into the mix for large numbers of requests.

Cloudfront TTL not working

I'm having a problem and tried to follow answers here in forum, but with no success whatsoever.
In order to generate thumbnails, I have set up the following schema:
S3 Account for original images
Ubuntu Server using NGINX and Thumbor
Cloudfront
The user uploads original images to S3, which will be pulled through Ubuntu Server with Cloudfront in front of the request:
http://cloudfront.account/thumbor-server/http://s3.aws...
The big deal is, that we often loose objects in Cloudfront, I want them to stay 360 days in cache.
I get following response through Cloudfront URL:
Cache-Control:max-age=31536000
Connection:keep-alive
Content-Length:4362
Content-Type:image/jpeg
Date:Sun, 26 Oct 2014 09:18:31 GMT
ETag:"cc095261a9340535996fad26a9a882e9fdfc6b47"
Expires:Mon, 26 Oct 2015 09:18:31 GMT
Server:nginx/1.4.6 (Ubuntu)
Via:1.1 5e0a3a528dab62c5edfcdd8b8e4af060.cloudfront.net (CloudFront)
X-Amz-Cf-Id:B43x2w80SzQqvH-pDmLAmCZl2CY1AjBtHLjN4kG0_XmEIPk4AdiIOw==
X-Cache:Miss from cloudfront
After a new refresh, I get:
Age:50
Cache-Control:max-age=31536000
Connection:keep-alive
Date:Sun, 26 Oct 2014 09:19:21 GMT
ETag:"cc095261a9340535996fad26a9a882e9fdfc6b47"
Expires:Mon, 26 Oct 2015 09:18:31 GMT
Server:nginx/1.4.6 (Ubuntu)
Via:1.1 5e0a3a528dab62c5edfcdd8b8e4af060.cloudfront.net (CloudFront)
X-Amz-Cf-Id:slWyJ95Cw2F5LQr7hQFhgonG6oEsu4jdIo1KBkTjM5fitj-4kCtL3w==
X-Cache:Hit from cloudfront
My Nginx responses as following:
Cache-Control:max-age=31536000
Content-Length:4362
Content-Type:image/jpeg
Date:Sun, 26 Oct 2014 09:18:11 GMT
Etag:"cc095261a9340535996fad26a9a882e9fdfc6b47"
Expires:Mon, 26 Oct 2015 09:18:11 GMT
Server:nginx/1.4.6 (Ubuntu)
Why does Cloudfront not store my objects as indicated? Max-Age is set?
Many thanks in advance.
Your second request shows that the object was indeed cached. I assume you see that, but the question doesn't make it clear.
The Cache-Control: max-age only specifies the maximum age of your objects in the Cloudfront Cache at any particular edge location. There is no minimum time interval for which your objects are guaranteed to persist... after all, Cloudfront is a cache, which is volatile by definition.
If an object in an edge location isn't frequently requested, CloudFront might evict the object—remove the object before its expiration date—to make room for objects that are more popular.
— http://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/Expiration.html
Additionally, there is no concept of Cloudfront as a whole having a copy of your object. Each edge location's cache appears to operate independently of the others, so it's not uncommon to see multiple requests for relatively popular objects coming from different Cloudfront edge locations.
If you are trying to mediate the load on your back-end server, it might make sense to place some kind of cache that you control, in front of it, like varnish, squid, another nginx or a custom solution, which is how I'm accomplishing this in my systems.
Alternately, you could store every result in S3 after processing, and then configure your existing server to check S3, first, before attempting the work of resizing the object again.
Then why is there a documented "minimum" TTL?
On the same page quoted above, you'll also find this:
For web distributions, if you add Cache-Control or Expires headers to your objects, you can also specify the minimum amount of time that CloudFront keeps an object in the cache before forwarding another request to the origin.
I can see why this, and the tip phrase cited on the comment, below...
The minimum amount of time (in seconds) that an object is in a CloudFront cache before CloudFront forwards another request to your origin to determine whether an updated version is available. 
...would seem to contradict my answer. There is no contradiction, however.
The minimum ttl, in simple terms, establishes a lower boundary for the internal interpretation of Cache-Control: max-age, overriding -- within Cloudfront -- any smaller value sent by the origin server. Server says cache it for 1 day, max, but configured minimum ttl is 2 days? Cloudfront forgets about what it saw in the max-age header and may not check the origin again on subsequent requests for the next 2 days, rather than checking again after 1 day.
The nature of a cache dictates the correct interpretation of all of the apparent ambiguity:
Your configuration limits how long Cloudfront MAY serve up cached copies of an object, and the point after which it SHOULD NOT continue to return the object from its cache. They do not mandate how long Cloudfront MUST maintain the cached copy, because Cloudfront MAY evict an object at any time.
If you set the Cache-Control: header correctly, Cloudfront will consider the larger of max-age or your Minimum TTL as the longest amount of time you want them to serve up the cached copy without consulting the origin server again.
As your site traffic increases, this should become less of an issue, since your objects will be more "popular," but fundamentally there is no way to mandate that Cloudfront maintain a copy of an object.