Force CloudFront distribution/file update - amazon-web-services

I'm using Amazon's CloudFront to serve static files of my web apps.
Is there no way to tell a CloudFront distribution that it needs to refresh its files, or to point out a single file that should be refreshed?
Amazon recommends that you version your files (logo_1.gif, logo_2.gif, and so on) as a workaround for this problem, but that seems like a pretty poor solution. Is there absolutely no other way?

Good news. Amazon finally added an Invalidation Feature. See the API Reference.
This is a sample request from the API Reference:
POST /2010-08-01/distribution/[distribution ID]/invalidation HTTP/1.0
Host: cloudfront.amazonaws.com
Authorization: [AWS authentication string]
Content-Type: text/xml
<InvalidationBatch>
   <Path>/image1.jpg</Path>
   <Path>/image2.jpg</Path>
   <Path>/videos/movie.flv</Path>
   <CallerReference>my-batch</CallerReference>
</InvalidationBatch>

As of March 19, Amazon now allows CloudFront's cache TTL to be 0 seconds, so you (theoretically) should never see stale objects. If you have your assets in S3, you can simply go to the AWS Web Panel => S3 => Edit Properties => Metadata and set your "Cache-Control" value to "max-age=0".
This is straight from the API documentation:
To control whether CloudFront caches an object and for how long, we recommend that you use the Cache-Control header with the max-age= directive. CloudFront caches the object for the specified number of seconds. (The minimum value is 0 seconds.)
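If you would rather script this than click through the console, a minimal boto3 sketch (the bucket and key names below are placeholders) that rewrites an object's Cache-Control header in place looks roughly like this:

import boto3

s3 = boto3.client('s3')
bucket = 'my-bucket'   # placeholder bucket name
key = 'logo.gif'       # placeholder object key

# Copy the object onto itself with MetadataDirective='REPLACE' so the new
# Cache-Control value is stored with the object. REPLACE discards the old
# metadata, so pass ContentType explicitly if it matters for this object.
s3.copy_object(
    Bucket=bucket,
    Key=key,
    CopySource={'Bucket': bucket, 'Key': key},
    CacheControl='max-age=0',
    MetadataDirective='REPLACE'
)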

With the Invalidation API, files do get updated within a few minutes.
Check out PHP Invalidator.

Bucket Explorer has a UI that makes this pretty easy now. Here's how:
Right-click your bucket. Select "Manage Distributions."
Right-click your distribution. Select "Get CloudFront invalidation list."
Then select "Create" to create a new invalidation list.
Select the files to invalidate and click "Invalidate." Wait 5-15 minutes.

Automated update setup in 5 mins
Currently, the best way to perform automatic CloudFront updates (invalidations) is to create a Lambda function that is triggered every time a file is uploaded to the S3 bucket (whether new or overwritten).
Even if you have never used Lambda functions before, it is really easy; just follow these step-by-step instructions and it will take about 5 minutes:
Step 1
Go to https://console.aws.amazon.com/lambda/home and click Create a lambda function
Step 2
Click on Blank Function (custom)
Step 3
Click the empty (dashed) box and select S3 from the dropdown
Step 4
Select your Bucket (same as for CloudFront distribution)
Step 5
Set an Event Type to "Object Created (All)"
Step 6
Set a Prefix and Suffix, or leave them empty if you are not sure what they are for.
Step 7
Check Enable trigger checkbox and click Next
Step 8
Name your function (something like: YourBucketNameS3ToCloudFrontOnCreateAll)
Step 9
Select Python 2.7 (or later) as Runtime
Step 10
Paste the following code in place of the default Python code (a sample test event for it is shown after these steps):
from __future__ import print_function

import boto3
import time


def lambda_handler(event, context):
    # Each S3 event can contain multiple records; invalidate each uploaded key.
    for items in event["Records"]:
        path = "/" + items["s3"]["object"]["key"]
        print(path)
        client = boto3.client('cloudfront')
        invalidation = client.create_invalidation(
            DistributionId='_YOUR_DISTRIBUTION_ID_',
            InvalidationBatch={
                'Paths': {
                    'Quantity': 1,
                    'Items': [path]
                },
                # CallerReference must be unique per invalidation request.
                'CallerReference': str(time.time())
            })
Step 11
Open https://console.aws.amazon.com/cloudfront/home in a new browser tab and copy your CloudFront distribution ID for use in next step.
Step 12
Return to lambda tab and paste your distribution id instead of _YOUR_DISTRIBUTION_ID_ in the Python code. Keep surrounding quotes.
Step 13
Set handler: lambda_function.lambda_handler
Step 14
Click the Role combobox and select Create a custom role. A new browser tab will open.
Step 15
Click View policy document, click Edit, click OK, and replace the role definition with the following (as is):
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents"
      ],
      "Resource": "arn:aws:logs:*:*:*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "cloudfront:CreateInvalidation"
      ],
      "Resource": [
        "*"
      ]
    }
  ]
}
Step 16
Click Allow. This will return you to the Lambda console. Double-check that the role you just created is selected in the Existing role combobox.
Step 17
Set Memory (MB) to 128 and Timeout to 5 sec.
Step 18
Click Next, then click Create function
Step 19
You are good to go! From now on, each time you upload or re-upload any file to S3, it will be invalidated in all CloudFront edge locations.
PS - When you are testing, make sure that your browser is loading images from CloudFront, not from local cache.
PPS - Please note that only the first 1,000 invalidation paths per month are free; each path over the limit costs $0.005 USD. Additional charges for the Lambda function may also apply, but they are extremely small.
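To test the function from Step 10 in the Lambda console before relying on real uploads, you can use a minimal S3 put test event like the one below; only the fields the handler actually reads are included, and the key is a placeholder. Keep in mind that real S3 events URL-encode the object key, so keys containing spaces or special characters may need to be unquoted before being used as an invalidation path.

{
  "Records": [
    {
      "s3": {
        "object": {
          "key": "images/logo.png"
        }
      }
    }
  ]
}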

If you have boto installed (which is not just for Python, it also installs a bunch of useful command-line utilities), it offers a command-line utility called cfadmin, or 'cloud front admin', which offers the following functionality:
Usage: cfadmin [command]
cmd - Print help message, optionally about a specific function
help - Print help message, optionally about a specific function
invalidate - Create a cloudfront invalidation request
ls - List all distributions and streaming distributions
You invalidate things by running:
$ cfadmin invalidate <distribution> <path>

One very easy way to do it is FOLDER versioning.
If you have hundreds of static files, for example, simply put all of them into a folder named by year + version.
For example, I use a folder called 2014_v1 where I keep all my static files.
Inside my HTML I always reference that folder. (Of course, I have a PHP include where the folder name is set, so changing that one file changes it in all my PHP files.)
If I want a complete refresh, I simply rename the folder to 2014_v2 in my source and change the PHP include to 2014_v2.
All the HTML automatically changes and requests the new path, CloudFront misses the cache and requests it from the source.
Example:
SOURCE.mydomain.com is my source,
cloudfront.mydomain.com is a CNAME to the CloudFront distribution.
So the PHP references this file:
cloudfront.mydomain.com/2014_v1/javascript.js
When I want a full refresh, I simply rename the folder in the source to "2014_v2" and change the PHP include to point to "2014_v2".
This way there is no invalidation delay and NO COST!
This is my first post on Stack Overflow, I hope I did it well!

In Ruby, using the fog gem:
require 'fog'  # the fog gem provides Fog::CDN

AWS_ACCESS_KEY = ENV['AWS_ACCESS_KEY_ID']
AWS_SECRET_KEY = ENV['AWS_SECRET_ACCESS_KEY']
AWS_DISTRIBUTION_ID = ENV['AWS_DISTRIBUTION_ID']

conn = Fog::CDN.new(
  :provider => 'AWS',
  :aws_access_key_id => AWS_ACCESS_KEY,
  :aws_secret_access_key => AWS_SECRET_KEY
)

images = ['/path/to/image1.jpg', '/path/to/another/image2.jpg']
conn.post_invalidation AWS_DISTRIBUTION_ID, images
Even with invalidation, it still takes 5-10 minutes for the invalidation to process and refresh on all Amazon edge servers.

The current AWS CLI supports invalidation in preview mode. Run the following in your console once:
aws configure set preview.cloudfront true
I deploy my web project using npm. I have the following scripts in my package.json:
{
  "build.prod": "ng build --prod --aot",
  "aws.deploy": "aws s3 sync dist/ s3://www.mywebsite.com --delete --region us-east-1",
  "aws.invalidate": "aws cloudfront create-invalidation --distribution-id [MY_DISTRIBUTION_ID] --paths /*",
  "deploy": "npm run build.prod && npm run aws.deploy && npm run aws.invalidate"
}
With the scripts above in place, you can deploy your site with:
npm run deploy

Set TTL=1 hour and replace the file:
http://developer.amazonwebservices.com/connect/ann.jspa?annID=655

Just posting to inform anyone visiting this page (first result for 'CloudFront file refresh')
that there is an easy-to-use, easily accessible online invalidator available at swook.net.
This new invalidator is:
Fully online (no installation)
Available 24x7 (hosted by Google) and does not require any memberships.
There is history support, and path checking to let you invalidate your files with ease. (Often with just a few clicks after invalidating for the first time!)
It's also very secure, as you'll find out when reading its release post.
Full disclosure: I made this. Have fun!

Go to CloudFront.
Click on your ID/Distributions.
Click on Invalidations.
Click create Invalidation.
In the large example box, type * and click Invalidate
Done

If you are using AWS, you probably also use its official CLI tool (sooner or later). AWS CLI version 1.9.12 or above supports invalidating a list of file names.
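For example (the distribution ID and paths below are placeholders), a one-off invalidation from the CLI looks something like this:

aws cloudfront create-invalidation --distribution-id EDFDVBD6EXAMPLE --paths /index.html "/images/*"

Quote wildcard paths so your shell does not try to expand them.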

Related

Cache-Control policy for external Next.js Image coming from AWS CloudFront is not efficient for Google Lighthouse

I'm trying to optimize my Next.js app by fixing all the issues reported by Google Lighthouse.
One of the most significant issues it reports right now concerns images:
As per the documentation, Next.js automatically adds caching headers for media in the static folder:
Next.js automatically adds caching headers to immutable assets served from /_next/static including JavaScript, CSS, static images, and other media.
As all of those problematic images are coming from an API, which serves them from AWS CloudFront, I can't find a way to fix the problem.
I guess adding the suggested Cache-Control policy in CloudFront may help, but I don't know:
1.) if that's the right solution
2.) how to do so in the AWS Console
Next.js automatically adds caching headers to immutable assets served from /_next/static including JavaScript, CSS, static images, and other media.
This is not the case when you are using next export and deploying a static site to S3 & CloudFront, see unsupported features (although they do not state that explicitly). What you can do is set these Cache-Control headers yourself manually with S3 object metadata. This is a preferred way for a next.js app because you can specify the headers for each object separately.
Generally (see Caching best practices & max-age gotchas) you should add max-age=31536000,public,immutable directives to the whole _next/static folder since these will have the hash appended to the file name, thus the cache gets invalidated on every new change.
Other than that, it's up to you to manage and depends on what type of app you're building, but it's common practice to set the HTML documents to public,max-age=0,must-revalidate (or even no-cache,no-store). Since you are using CloudFront, it's fine to keep them in the edge cache as long as you have proper invalidation set up.
You also might have non-statically imported images with src="<path string>", these also won't be exported to the _next/static folder so if you want to add long max-age & immutable content to these you have to manage the versioning/hashing yourself to invalidate properly when the images change.
With AWS Console
Check out editing object metadata in the Amazon S3 console to add Cache-Control headers to the _next/static objects' metadata:
Open the Amazon S3 console and your bucket.
Select the check box to the left of the _next/ directory.
On the Actions menu, choose Edit actions, and choose Edit metadata.
Choose Add metadata.
For metadata Type, select System-defined.
Select Cache-Control for the key and add max-age=31536000,public,immutable as the value.
When you are done, hit Save Changes and Amazon S3 will edit all the _next/static files' metadata recursively. You can verify it by opening a specific file and scrolling down to the metadata section.
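If you prefer scripting over clicking through the console, a rough AWS CLI equivalent (the bucket name is a placeholder) is to copy the prefix onto itself while replacing the metadata:

aws s3 cp s3://my-bucket/_next/static/ s3://my-bucket/_next/static/ --recursive --metadata-directive REPLACE --cache-control "max-age=31536000,public,immutable"

Since the copy re-derives object metadata, it is worth spot-checking the resulting Content-Type headers afterwards.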
With CDK
If you are using AWS CDK you can use multiple BucketDeployment constructs to specify different Cache-Control headers for different out directories (see examples):
// _next/static - long max-age & immutable content
new s3deploy.BucketDeployment(this, 'BucketDeployment', {
  ...
  sources: [s3deploy.Source.asset('./out', { exclude: ['/**/*', '!/_next/static/**/*'] })],
  cacheControl: [s3deploy.CacheControl.fromString('max-age=31536000,public,immutable')],
  ...
});

// revalidate everything else
new s3deploy.BucketDeployment(this, 'BucketDeployment', {
  ...
  sources: [s3deploy.Source.asset('./out', { exclude: ['/_next/static/**/*'] })],
  cacheControl: [s3deploy.CacheControl.fromString('max-age=0,no-cache,no-store,must-revalidate')],
  ...
});
Improving without Vercel
I'd also suggest (if using some additional AWS resources and Terraform is not an issue) taking a look at this Terraform module, or at the very least their image optimizer, which can be dropped in as a standalone image optimization loader for the Next.js image component so you get all the next/image component benefits (there's also this issue for a bunch of other workarounds).

Changing Storage class from Multi-Regional to Coldline in Google Cloud Platform

I just finished my 1 year free trial with Google Cloud Platform and I am now being billed.
When I set my first project up, it looks like I set it up as Multi-Regional. I would only use Google Cloud Storage in the event of a catastrophic failure in my home where I lose data on both internal and external hard drives (i.e. fire, etc.). I believe for this type of backup I only need Coldline storage. I did change my project over to Coldline, but it looks like it only affects new data, not the originally stored data, because I am still being charged for Multi-Regional storage.
From what I understand, I have to change the object storage class either by overwriting the data using "gsutil rewrite -s [STORAGE_CLASS] gs://[PATH_TO_OBJECT]" or by using Object Lifecycle Management. I could not figure out how to do either, so I need help (I am not even sure where to type these commands or which approach to use; I am not a programmer!).
I also saw in another post that my gsutil version needs to be 4.22 or higher. How do I check this? I also saw in that post that the [PATH_TO_OBJECT] is my bucket. I see a Project Name, Project ID, and Project Number. Which of these (if any) is used in that field for my bucket?
Thank you for any help
I also saw in another post that my gsutil version needs to be 4.22 or higher. How do I check this?
Get the gsutil version:
gsutil version
Update the Cloud SDK which includes gsutil:
Windows:
Open a command prompt with Administrator rights
gcloud components update
Linux:
gcloud components update
I see a Project Name, Project ID, and Project number. Which of these
(if any) are used in that field for My Bucket.
Use the PROJECT_ID. To get a list of the projects that you have access to, run the following command, which lists each project:
gcloud projects list
To see which is your default project:
gcloud config list project
If the default project is blank or the wrong one, use the following command.
To set the default project:
gcloud config set project [PROJECT_ID]
From what I understand, I have to change the Object Storage Class
either by overwriting the data
Assuming your bucket name is mybucket.
STEP 1: Change the default storage class for the bucket:
gsutil defstorageclass set coldline gs://mybucket
STEP 2: Change the storage class for each object manually. This is an option if you want to just select a few files.
gsutil rewrite -s coldline gs://mybucket/objectname
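If you want to rewrite every object in the bucket rather than a few selected ones, a bulk variant (assuming gsutil's ** wildcard and the top-level -m flag for parallel execution) would be:

gsutil -m rewrite -s coldline gs://mybucket/**

Note that rewriting objects to a new storage class rewrites the data, so this can take a while on large buckets.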
STEP 3: Verify the existing lifecycle policy. Change step 4 accordingly if an existing policy exists.
gsutil lifecycle get gs://mybucket
STEP 4: Change the lifecycle of the bucket. This policy will move all files older than 7 days to coldline storage.
POLICY (write to lifecycle.json):
{
  "lifecycle": {
    "rule": [
      {
        "action": {
          "type": "SetStorageClass",
          "storageClass": "COLDLINE"
        },
        "condition": {
          "age": 7,
          "matchesStorageClass": [
            "MULTI_REGIONAL",
            "STANDARD",
            "DURABLE_REDUCED_AVAILABILITY"
          ]
        }
      }
    ]
  }
}
Command:
gsutil lifecycle set lifecycle.json gs://mybucket
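To confirm that an object has actually moved to Coldline after the rewrite or after the lifecycle rule has run, you can inspect its metadata (bucket and object names are placeholders):

gsutil ls -L gs://mybucket/objectname

The output includes a "Storage class" line for the object.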

Remove Incomplete Multipart Upload files from AWS S3

We have a file-storage application (like Dropbox) that uses an AWS S3 bucket.
We have different plans for end users, such as Free and Silver/paid, depending on file size.
Sometimes a user uploads a file and the upload process is interrupted for some reason, such as:
1 - the user cancels the upload process midway
2 - a network glitch between the user's internet connection and AWS S3
In the above cases, if for example a user tries to upload a 1 GB file and cancels in the middle of the upload, 50% (0.5 GB) of the file has already been uploaded to S3.
That uploaded part remains in the S3 bucket, it occupies space on S3, and we also have to pay for that 0.5 GB.
I want the uploaded parts of a file to be deleted from S3 (after some time, or immediately) if the upload is killed by the end user or interrupted due to a network issue and never completed.
How can I define a lifecycle rule for the S3 bucket to accomplish this requirement?
You can create a new rule for incomplete multipart uploads using the Console:
1) Start by opening the console and navigating to the desired bucket
2) Then click on Properties, open up the Lifecycle section, and click on Add rule:
3) Decide on the target (the whole bucket or the prefixed subset of your choice) and then click on Configure Rule:
4) Then enable the new rule and select the desired expiration period:
5) As a best practice, we recommend that you enable this setting even if you are not sure that you are actually making use of multipart uploads. Some applications will default to the use of multipart uploads when uploading files above a particular, application-dependent, size.
Here’s how you set up a rule to remove delete markers for expired objects that have no previous versions:
You can refer to this AWS Blog Post.
Note: If you are on the new console: Select Bucket --> Click Management (4th tab) --> Select the Lifecycle tab (1st) --> Click the Add Lifecycle Rule button.
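If you prefer the command line to the console, a rough sketch with the AWS CLI (the bucket name, rule ID, and 7-day window are placeholders to adjust) looks like this:

aws s3api put-bucket-lifecycle-configuration --bucket my-bucket --lifecycle-configuration '{
  "Rules": [
    {
      "ID": "abort-incomplete-multipart-uploads",
      "Status": "Enabled",
      "Filter": {"Prefix": ""},
      "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": 7}
    }
  ]
}'

To see which multipart uploads are currently pending in the bucket, you can run aws s3api list-multipart-uploads --bucket my-bucket.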

Amazon S3 static site serves old contents

My S3 bucket hosts a static website. I do not have cloudfront set up.
I recently updated the files in my S3 bucket. The files did get updated (I confirmed this manually in the bucket), yet the site still serves an older version of the files. Is there some sort of caching or versioning that happens on static websites hosted on S3?
I haven't been able to find any solution on SO so far. Note: CloudFront is NOT enabled.
Is there some sort of caching or versioning that happens on Static websites hosted on S3?
Amazon S3 buckets provide read-after-write consistency for PUTS of new objects and eventual consistency for overwrite PUTS and DELETES
What does this mean?
If you create a new object in S3, you will be able to access it immediately. However, if you update an existing object, you will only 'eventually' get the newest version from S3, so S3 might still deliver the previous version of the object.
I believe that starting some time ago, read-after-write consistency is also available for updates in the US Standard region.
How long do you need to wait? Well, it depends; Amazon does not provide much information about this.
What can you do? Not much. If you want to make sure you do not have any issue with your S3 bucket delivering the files, upload a new file (with a new key) to your bucket; you will be able to access it immediately.
The solution is here, but you need to use CloudFront. As @Frederic Henri said, you cannot do much in the S3 bucket itself, but with CloudFront you can invalidate files.
CloudFront will have cached that file in an edge location for 24 hours, which is the default TTL (time to live), and will continue to return that file for 24 hours. Then, after the 24 hours are over and a request is made for that file, CloudFront will check the origin and see if the file has been updated in the S3 bucket. If it has been updated, CloudFront will then serve the new, updated version of the object. If it has not been updated, then CloudFront will continue to serve the original version of the object.
However, when you update the file in the origin and wish for it to be served immediately via your website, what needs to be done is a CloudFront invalidation. An invalidation wipes the file(s) from the CloudFront cache, so when a request is made to CloudFront, it will see that there are no files in the cache, check the origin, and serve the new updated file from the origin. Running an invalidation is recommended each time files are updated in the origin.
To run an invalidation:
click on the following link for CloudFront console
-- https://console.aws.amazon.com/cloudfront/home?region=eu-west-1#
open the distribution in question
click on the 'Invalidations' tab to the right of all the tabs
click on 'Create Invalidation'
on the popup, it will ask for the path. You can enter /* to invalidate every object in the cache, or enter the exact path to the file, such as /images/picture.jpg
finally click on 'Invalidate'
this will typically complete within 2-3 minutes
then, once the invalidation is complete, when you request the object again through CloudFront, CloudFront will check the origin and return the updated file.
It sounds like Akshay tried uploading with a new filename and it worked.
I just tried the same (I was having the same problem), and it resolved the file not being available for me.
Do a push of index.html
index.html not updated
mv index.html index-new.html
Do a push of index-new.html
After this, index-new.html was immediately available.
That's kind of shite - I can't share one link to my website if I want to be sure that the recipient will see the latest version? I need to keep changing the filename and re-sharing the new link.

AWS cloudfront not updating on update of files in S3

I created a distribution in cloudfront using my files on S3.
It worked fine and all my files were available. But today I updated my files on S3 and tried to access them via Cloudfront, but it still gave old files.
What am I missing ?
Just ran into the same issue. At first I tried updating the cache control to 0 and max-age=0 for the files I updated in my S3 bucket, but that didn't work.
What did work was following the steps from @jpaljasma. Here are the steps:
First go to your AWS CloudFront service.
Then click on the CloudFront distribution you want to invalidate.
Click on the Invalidations tab, then click on "Create Invalidation".
In the "object path" text field, you can list specific files, e.g. /index.html, or just use the wildcard /* to invalidate everything. This forces CloudFront to get the latest version of everything in your S3 bucket.
Once you have filled in the text field, click on "Invalidate". After CloudFront finishes invalidating, you'll see your changes the next time you go to the web page.
Note: if you want to do it via aws command line interface you can do the following command
aws cloudfront create-invalidation --distribution-id <your distribution id> --paths "/*"
The /* will invalidate everything, replace that with specific files if you only updated a few.
To find the list of CloudFront distribution IDs, you can run: aws cloudfront list-distributions
Look at these two links for more info on those 2 commands:
https://docs.aws.amazon.com/cli/latest/reference/cloudfront/create-invalidation.html
https://docs.aws.amazon.com/cli/latest/reference/cloudfront/list-distributions.html
You should invalidate your objects in the CloudFront distribution's cache.
Back in the old days you'd have to do it one file at a time; now you can use wildcards, e.g. /images/*
https://aws.amazon.com/about-aws/whats-new/2015/05/amazon-cloudfront-makes-it-easier-to-invalidate-multiple-objects/
How to change the Cache-Control max-age via the AWS S3 Console:
Navigate to the file whose Cache-Control you would like to change.
Check the box next to the file name (it will turn blue)
On the top right click Properties
Click Metadata
If you do not see a Key named Cache-Control, then click Add more metadata.
Set the Key to Cache-Control and set the Value to max-age=0 (where 0 is the number of seconds you would like the file to remain in the cache). You can replace 0 with whatever you want.
The main advantage of using CloudFront is to get your files from a source (S3 in your case) and store them on edge servers to respond to GET requests faster. CloudFront will not go back to the S3 origin for each HTTP request.
To have CloudFront serve the latest files/objects, you have multiple options:
Use CloudFront to Invalidate modified Objects
You can use CloudFront to invalidate one or more files or directories manually or using a trigger. This option has been described in other responses here. More information at Invalidate Multiple Objects in CloudFront. This approach comes in handy if you are updating your files infrequently and do not want to impact the performance benefits of cached objects.
Setting object expiration dates on S3 objects
This is now the recommended solution. It is straightforward:
Log in to AWS Management Console
Go into S3 bucket
Select all files
Choose "Actions" drop down from the menu
Select "Change metadata"
In the "Key" field, select "Cache-Control" from the drop down menu.
In the "Value" field, enter "max-age=300" (number of seconds)
Press "Save" button
The default cache value for CloudFront objects is 24 hours. By changing it to a lower value, CloudFront checks with the S3 source to see if a newer version of the object is available in S3.
I use a combination of these two methods to make sure updates are propagated to edge locations quickly and to avoid serving outdated files managed by CloudFront.
AWS, however, recommends changing object names by including a version identifier in each file name. If you are using a build command and compiling your files, that option is usually available (as with the React npm build command).
For immediate reflection of your changes, you have to invalidate objects in CloudFront: Distribution list -> Settings -> Invalidations -> Create Invalidation.
This will clear the cache objects and load the latest ones from S3.
If you are updating only one file, you can also invalidate exactly one file.
It takes just a few seconds to invalidate objects.
I also faced similar issues and found out it's really easy to fix in your CloudFront distribution.
Step 1.
Log in to your AWS account and select your target distribution.
Step 2.
Select Distribution settings and select the Behaviors tab.
Step 3.
Select Edit and choose the option All.
Step 4.
Save your settings and that's it.
I also had this issue and solved it by using versioning (not the same as S3 versioning). Here is a comprehensive link to using versioning with CloudFront:
Invalidating Files
In summary:
When you upload a new file or files to your S3 bucket, change the version and update your links as appropriate. From the documentation, the benefit of using versioning vs. invalidating (the other way to do this) is that there is no additional charge for making CloudFront refresh via version changes, whereas there is with invalidation. If you have hundreds of files this may be problematic, but it's possible that by adding a version to your root directory, or default root object (if applicable), it wouldn't be a problem. In my case, I have an SPA; all I have to do is change the version of my default root object (index.html to index2.html) and it instantly updates on CloudFront.
Thanks tedder42 and Chris Heald.
I was able to reduce the cache duration on my origin (i.e. the S3 objects) and deliver the files more quickly than with the default 24 hours.
For some of my other distributions I also set "forward all headers to origin", in which case CloudFront doesn't cache anything and sends every request to the origin.
Thanks.
Please refer to this answer; it may help you.
What's the difference between Cache-Control: max-age=0 and no-cache?
Add a Cache-Control header set to 0 to the selected file in S3:
How to change the Cache-Control max-age via the AWS S3 Console:
Go to your bucket
Select all files you would like to change (you can select folders as well; this will include all files inside them)
Click on the Actions dropdown, then click on Edit Metadata
On the page that will open, click on Add metadata
Set Type to System defined
Set Key to Cache-Control
Set value to 0 (or whatever you would like to set it to)
Click on Save Changes
Invalidate all distribution files:
aws cloudfront create-invalidation --distribution-id <dist-id> --paths "/*"
If you need to remove a file from CloudFront edge caches before it expires, see the docs.
The best practice for solving this issue is probably using the Object Version approach.
The invalidation method can also solve this problem, but it has some side effects, such as increased cost if you exceed 1,000 invalidation paths per month, and some objects cannot be removed via this method.
Hopefully the official doc on "Why CloudFront is serving outdated content from Amazon" can help anyone stuck with this.