Does Akamai cache everything on a host, or must I use headers to tell it what to cache? - akamai

I am new to Akamai, so my question is very basic.
How does Akamai decide what to cache? Does it cache everything on a host it is enabled on, or do I have to state explicitly, using Pragma headers, what to cache?
Another question, if you don't mind! :) If I don't want a response to be cached, is it necessary to set Pragma: no-store, or can I simply not set any Pragma header at all?

You can configure it.
It's possible to:
define custom rules in the CDN based on criteria such as the matching URL, cookies, ...
tell the CDN to honor your Cache-Control header
or to add a proprietary header in your response
In each case you can choose whether or not the response is cached, and for how long.
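As a rough illustration of the "honor your Cache-Control header" option, here is a minimal Python sketch of an origin choosing its response headers. It assumes the CDN property has been configured to respect the origin's Cache-Control; the function name and the 300-second default are hypothetical, not anything Akamai prescribes.

```python
from typing import Dict


def cache_headers(cacheable: bool, max_age_seconds: int = 300) -> Dict[str, str]:
    """Return the Cache-Control header an origin could send.

    Assumption: the CDN is configured to honor the origin's Cache-Control.
    """
    if cacheable:
        # Shared caches (such as a CDN) may store this for max_age_seconds.
        return {"Cache-Control": "public, max-age=%d" % max_age_seconds}
    # Explicitly forbid storing the response anywhere.
    return {"Cache-Control": "no-store"}


if __name__ == "__main__":
    print(cache_headers(True))
    print(cache_headers(False))
```

Sending an explicit no-store for responses that must not be cached avoids relying on whatever default the CDN applies when no header is present.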

Related

Google Cloud CDN vary:cookie response never gets cache hit

I'm using Google Cloud CDN to cache an HTML page.
I've configured all the correct headers as per the docs, and the page is caching fine. Now, I want to change it so that it only caches when the request has no cookies, i.e. no cookie header set.
My understanding was that this was simply a case of changing my origin server to add a vary: cookie header to all responses for the page, then only adding the caching headers Cache-Control: public and Cache-Control: max-age=300 when no cookie header is set on the request.
However, this doesn't work. Using curl I can see that all caching headers, including the vary: cookie header, are set as expected when I send requests with and without cookies, but I never get cache hits on the requests without cookies.
Digging into the Cloud CDN logs, I see that every request with no cookie header has cacheFillBytes populated with the same number as the response size - whereas it's not for the requests with a cookie header set with a value (as expected).
So it appears like Cloud CDN is attempting to populate the cache as expected for requests with no cookies, it's just that I never get a cache hit - i.e. it's just cacheFillBytes every time, cacheHit: true never appears in the logs.
Has anyone come across anything similar? I've triple-checked all my headers for typos, and indeed just removing the vary: cookie header makes caching work as expected, so I'm almost certain my configuration is right in terms of headers and what Cloud CDN considers cacheable.
Should Cloud CDN handle vary: cookie like I'm expecting it to? The docs suggest it handles arbitrary vary headers. And if so, why would I see cacheFillBytes on every request, with Cache-Control: public and Cache-Control: max-age=300 set on the response, but then never see a cacheHit: true on any subsequent request (I've tried firing hundreds with curl in a loop, it really never hits, it's not just that I'm populating a few different edge caches)?
I filed a bug with Google and it turns out that, indeed, the documentation was wrong.
vary: cookie is not supported by Cloud CDN
The docs have been updated - the only headers that can be used with vary are Accept, Accept-Encoding and Origin.
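A small Python sketch of how an origin could check its own Vary header against that list before relying on the CDN. The allowed set is copied from the answer above, not from a current reading of the Cloud CDN documentation, and the function name is hypothetical.

```python
from typing import List

# Field names taken from the answer above; treat this set as an assumption
# and check the current Cloud CDN documentation before relying on it.
SUPPORTED_VARY_FIELDS = frozenset({"accept", "accept-encoding", "origin"})


def unsupported_vary_fields(vary_value: str) -> List[str]:
    """Return the Vary field names that are not in the supported set."""
    fields = [f.strip().lower() for f in vary_value.split(",") if f.strip()]
    return [f for f in fields if f not in SUPPORTED_VARY_FIELDS]


if __name__ == "__main__":
    print(unsupported_vary_fields("Accept-Encoding, Cookie"))
```

A non-empty result would suggest the response is unlikely to be served from cache, which matches the behaviour described in the question.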
According to the GCP documentation [1], Cloud CDN respects any Vary headers that origin servers include in responses. On that basis it looks like vary: cookie is supported by Cloud CDN, since any Vary header that the origin serves will be respected. Keep in mind, though, that this will negatively impact caching, because the Vary header indicates that the response varies depending on the client's request headers. Therefore, if a request for an object has the request header Cookie: abc, then a subsequent request for the same object with the request header Cookie: xyz would not be served from the cache. So, yes, it is supported and respected, but it will impact caching (https://cloud.google.com/cdn/docs/troubleshooting-steps?hl=en#low-hit-rate).
[1]https://cloud.google.com/cdn/docs/caching#vary_headers
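The Cookie: abc / Cookie: xyz example can be sketched with a toy cache in Python. This is only an illustration of how a Vary-aware cache keys its entries in general; the class and method names are hypothetical and it does not model Cloud CDN's actual implementation.

```python
from typing import Dict, Optional, Tuple


class VaryAwareCache:
    """Toy cache keyed on the URL plus the request headers named in Vary."""

    def __init__(self) -> None:
        self._vary_fields: Dict[str, Tuple[str, ...]] = {}
        self._store: Dict[Tuple[str, Tuple[str, ...]], str] = {}

    @staticmethod
    def _key(url: str, request_headers: Dict[str, str],
             fields: Tuple[str, ...]) -> Tuple[str, Tuple[str, ...]]:
        # Header names are case-insensitive, so compare them lowercased.
        lowered = {k.lower(): v for k, v in request_headers.items()}
        return (url, tuple(lowered.get(f, "") for f in fields))

    def put(self, url: str, request_headers: Dict[str, str],
            vary: str, body: str) -> None:
        fields = tuple(f.strip().lower() for f in vary.split(",") if f.strip())
        self._vary_fields[url] = fields
        self._store[self._key(url, request_headers, fields)] = body

    def get(self, url: str, request_headers: Dict[str, str]) -> Optional[str]:
        fields = self._vary_fields.get(url)
        if fields is None:
            return None  # nothing stored for this URL yet
        return self._store.get(self._key(url, request_headers, fields))


if __name__ == "__main__":
    cache = VaryAwareCache()
    cache.put("/page", {"Cookie": "abc"}, "Cookie", "<html>...</html>")
    print(cache.get("/page", {"Cookie": "abc"}))  # stored entry
    print(cache.get("/page", {"Cookie": "xyz"}))  # None: different cookie
```

Because every distinct cookie value produces a distinct key, varying on Cookie tends to fragment the cache heavily.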

How to allow Azure CDN (Standard Microsoft) to allow "set-cookie" for "ASP.NET_SessionId"?

My site uses ASP.NET_SessionId cookies. This is the standard ASP.NET cookie used for session management.
The CDN itself removes some headers from the response: in this case, the browser does not receive the "set-cookie" header for "ASP.NET_SessionId", even though it was sent by the web site (see screenshots below).
The home page is dynamic and is not intended to be cached. Also, page sets "no-cache" header.
This happens only with Azure CDN with Standard Microsoft profile.
Could you please provide any ideas on how to allow set-cookie to pass through the CDN?
Original response headers:
Original Headers (two)
As you can see there are two "Set-Cookie" headers.
CDN-ified response headers:
Headers with CDN (one)
As you can see, only one "Set-Cookie" header is left; "ASP.NET_SessionId" is removed by the CDN (some security rule?).
I cannot find any documentation on how to allow all headers to pass-through.
Thank you!
It seems that the ASP.NET session ID cannot be cached, because the CDN cannot cache such resources:
Dynamic resources that change frequently or are unique to an
individual user cannot be cached.
You can get more details about how CDN caching works.

How to pass UserAgent header to origin but not cache on it

We use CloudFront for caching user requests on CDN level.
The problem is that we want to pass the User-Agent header to the origin so that we can track it, but we do not want to use it as a cache key (there are just too many user agents, and the value of caching would be zero).
With the Whitelist strategy for headers, AWS CloudFront strips all headers that are neither whitelisted nor default and doesn't send them to the origin.
As we use the Whitelist forward strategy now, one solution would be to switch to the All strategy, but we want to keep the other headers that we are using as cache keys.
Any ideas?
I checked similar questions, but have found no answer to this one so far:
Does Amazon pass custom headers to origin?
Pass cookie to CloudFront origin but prevent from caching
(similar but for cookies)

Django's default behaviour for web/http caching

My question is referring to this section of the django docs.
In there, there is a paragraph that reads:
Note that the caching middleware already sets the cache header's max-age with the value of the CACHE_MIDDLEWARE_SECONDS setting. If you use a custom max_age in a cache_control decorator, the decorator will take precedence, and the header values will be merged correctly.
My interpretation is that, by default, responses from the Django server would have "Cache-Control: max-age=600" in their HTTP headers, unless some HTTP-cache-related decorator is used to modify the "Cache-Control" header.
I performed a quick experiment to verify my interpretation above. Surprisingly, when no HTTP-cache-related decorator is used on the view, the generated response has NO "Cache-Control" header at all.
Why am I seeing a different result from what the official docs describes? Have I misunderstood the outlined paragraph?
Also, when there is no "Cache-Control" header in a response, can I safely assume there's no http caching involved (ie. no cached response would be used)?
This does not happen "by default". Two conditions must be met for Django to attach the Cache-Control header:
You must have a caching backend set up, with either CACHES (Django 1.3+) or CACHE_BACKEND (Django <1.3).
You must add the cache middleware to MIDDLEWARE_CLASSES.
See the docs for details.
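The two conditions above might look like this in a settings module. This is a sketch only: the local-memory backend is just one illustrative choice, the 600-second value is taken from the question, and the setting name MIDDLEWARE_CLASSES follows the answer (recent Django versions call it MIDDLEWARE).

```python
# settings.py (fragment) - a minimal sketch, not a complete configuration.

# Condition 1: a cache backend. LocMemCache is an illustrative choice only.
CACHES = {
    "default": {
        "BACKEND": "django.core.cache.backends.locmem.LocMemCache",
    }
}

# Condition 2: the cache middleware. UpdateCacheMiddleware goes first and
# FetchFromCacheMiddleware goes last, as the Django docs require.
MIDDLEWARE_CLASSES = (
    "django.middleware.cache.UpdateCacheMiddleware",
    "django.middleware.common.CommonMiddleware",
    "django.middleware.cache.FetchFromCacheMiddleware",
)

# With the middleware active, this value is used for the max-age directive.
CACHE_MIDDLEWARE_SECONDS = 600
```

Without the two middleware entries the CACHE_MIDDLEWARE_SECONDS setting has no effect on responses, which would explain the missing header in the experiment.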
As for caching in the absence of a Cache-Control header, that becomes the decision of the web browser, or of the client in general. Cache-Control gives directives that must be followed, but browsers already generally cache on their own, so its real purpose is usually to prevent caching in certain scenarios rather than to enable it.

Overriding ETag lookups?

I would like to override Etag lookups, because apparently they are slowing down a page as the latency for each request is quite BIG!
Expires headers don't seem to do the trick...
any ideas?
In the response object from the view you can set the ETag to whatever you like. This requires that you are using CommonMiddleware and USE_ETAGS is set to True.
However, if what you really want to do is to not call the view at all, why don't you just use the cache decorators and cache the result?
The code for etag handling.
I wish people would ask questions stating the version of apache they are running. This can get confusing quite quickly. Look at the FileETag directive (at least for apache 2.0 and 2.2).
# Do not generate an ETag for files
FileETag None
See FileETag
each request is quite BIG!
Unless you've got some really funky custom patches on your Apache installation or a very weird filesystem, the effort of generating an ETag is not dependent on the size of the file - by default Apache uses the inode number, mod time and size.
Usually conditional requests can actually slow down a site a lot and should be avoided (preferably by stripping the If-None-Match / If-Modified-Since request headers). The one time this is not the case is where you publish very large files (e.g. video, PDFs). If you can't modify the request headers (e.g. using Apache<2.0), then you'd need to strip both the ETag and the Last-Modified from the response - or refresh the timestamps on your files regularly.
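To make the conditional-request round trip concrete, here is a simplified Python sketch. The ETag layout is only loosely modelled on the inode/size/mtime idea mentioned above and is not Apache's exact format; the function names are hypothetical, and If-None-Match lists and the "*" value are ignored.

```python
from typing import Dict, Tuple

VALIDATOR_HEADERS = ("if-none-match", "if-modified-since")


def make_etag(inode: int, size: int, mtime: int) -> str:
    """Build a quoted ETag from file metadata (illustrative format only)."""
    return '"%x-%x-%x"' % (inode, size, mtime)


def conditional_response(request_headers: Dict[str, str], etag: str,
                         body: bytes) -> Tuple[int, Dict[str, str], bytes]:
    """Answer 304 with no body when the client's validator matches."""
    lowered = {k.lower(): v for k, v in request_headers.items()}
    if lowered.get("if-none-match") == etag:
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag}, body


def strip_validators(request_headers: Dict[str, str]) -> Dict[str, str]:
    """Drop conditional headers so the request is always answered in full."""
    return {k: v for k, v in request_headers.items()
            if k.lower() not in VALIDATOR_HEADERS}


if __name__ == "__main__":
    etag = make_etag(inode=1, size=3, mtime=2)
    print(conditional_response({"If-None-Match": etag}, etag, b"abc"))
    print(strip_validators({"If-None-Match": etag, "Accept": "*/*"}))
```

Even a 304 costs a full round trip, which is why the latency does not go away until the client is allowed to reuse its copy without asking.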
The 'Expires' header is an HTTP/1.0 directive - there's very little HTTP/1.0 traffic going on out there (the little there is usually comes from badly configured MSIE6 browsers working through a proxy). You should be sending out Cache-Control headers.
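For comparison, a short Python sketch that produces both headers for the same lifetime. The function name and the 300-second default are assumptions for illustration; only the standard library is used.

```python
import time
from email.utils import formatdate
from typing import Dict, Optional


def freshness_headers(max_age_seconds: int = 300,
                      now: Optional[float] = None) -> Dict[str, str]:
    """Return Cache-Control plus an equivalent Expires for older clients."""
    if now is None:
        now = time.time()
    return {
        # The directive modern clients and caches act on.
        "Cache-Control": "max-age=%d" % max_age_seconds,
        # Absolute HTTP-date, kept only as a fallback.
        "Expires": formatdate(now + max_age_seconds, usegmt=True),
    }


if __name__ == "__main__":
    print(freshness_headers())
```

When both are present, caches that understand Cache-Control use max-age and ignore Expires, so the two values do not need to be kept perfectly in sync.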