I want to have multiple lifecycle rules, one for each of several folders in my bucket.
This seems easy if I use the web interface, but this needs to be an automated process, so, at least in my case, it must use s3cmd.
It works fine when I use:
s3cmd expire ...
But, somehow, every time I run this my last lifecycle rule gets overwritten.
There's an issue on github:
https://github.com/s3tools/s3cmd/issues/863
My question is: is there another way?
You made me notice I had exactly the same problem. Another way to access the expire rules with s3cmd is to show the lifecycle configuration of the bucket:
s3cmd getlifecycle s3://bucketname
This way you get some XML-formatted text:
<?xml version="1.0" ?>
<LifecycleConfiguration xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
    <Rule>
        <ID>RULEIDENTIFIER</ID>
        <Prefix>PREFIX</Prefix>
        <Status>Enabled</Status>
        <Expiration>
            <Days>NUMBEROFDAYS</Days>
        </Expiration>
    </Rule>
    <Rule>
        <ID>RULEIDENTIFIER2</ID>
        <Prefix>PREFIX2</Prefix>
        <Status>Enabled</Status>
        <Expiration>
            <Days>NUMBEROFDAYS2</Days>
        </Expiration>
    </Rule>
</LifecycleConfiguration>
If you put that text in a file and change the appropriate fields (put in identifiers of your choice, set the prefixes you want and the number of days until expiration), you can then use the following command (replacing FILE with the path to the file containing the rules):
s3cmd setlifecycle FILE s3://bucketname
That should work: in my case, I now see several rules when I execute the getlifecycle command, although I do not know yet whether the objects actually expire.
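For reference, a minimal sketch of that round trip, assuming a working file named lifecycle.xml (the file name is arbitrary):

s3cmd getlifecycle s3://bucketname > lifecycle.xml   # dump the current rules
# ...edit lifecycle.xml, adding one <Rule> block per prefix...
s3cmd setlifecycle lifecycle.xml s3://bucketname     # upload the complete rule set

Because setlifecycle replaces the whole configuration, keeping all the rules in one file avoids the overwrite problem from the question.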
Related
Can you help me with this script? I would like to transfer files to my bucket on AWS S3.
My code:
$cmd = "s3cmd -v -c /root/.s3cfg put /var/project_db_" . $date . ".sql.xz s3://bucket600";
My second code, which is what I'm currently using, doesn't work:
$cmd = "s3cmd expire -v -c /root/.s3cfg put /var/project_db_" . $date . ".sql.xz --expiry-day=90 s3://bucket600";
Thank you for your help
I searched Google for s3cmd expire and the first result took me to this page, which says:
Advanced features
Object Expiration with s3cmd
You can set an object expiration policy on a bucket, so that objects older than a particular age will be deleted automatically. The expiration policy can have a prefix, an effective date, and a number of days to expire after.
s3cmd v2.0.0 can be used to set or review the policy:
[lxplus-cloud]$ s3cmd expire s3://dvanders-test --expiry-days 2
Bucket 's3://dvanders-test/': expiration configuration is set.
[lxplus-cloud]$ s3cmd getlifecycle s3://dvanders-test
<?xml version="1.0" ?>
<LifecycleConfiguration xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
    <Rule>
        <ID>ir0smpb610i0lthrl31jpxzegwximbcz3rrgb1he2yfxgudm</ID>
        <Prefix/>
        <Status>Enabled</Status>
        <Expiration>
            <Days>2</Days>
        </Expiration>
    </Rule>
</LifecycleConfiguration>
Additional s3cmd expire options include:
--expiry-date=EXPIRY_DATE
Indicates when the expiration rule takes effect. (only
for [expire] command)
--expiry-days=EXPIRY_DAYS
Indicates the number of days after object creation the
expiration rule takes effect. (only for [expire] command)
--expiry-prefix=EXPIRY_PREFIX
Identifying one or more objects with the prefix to
which the expiration rule applies. (only for [expire]
command)
So you use the --expiry-date or --expiry-days command line option to do what you want.
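Applied to the backup script above, a hedged sketch (paths and bucket name come from the question; the prefix is an assumption, and $DATE stands in for the PHP $date variable): run put and expire as two separate commands, since expire is its own command rather than a put option:

# Upload the dump first
s3cmd -v -c /root/.s3cfg put /var/project_db_$DATE.sql.xz s3://bucket600
# Then expire objects under the project_db_ prefix after 90 days
s3cmd -c /root/.s3cfg expire s3://bucket600 --expiry-days=90 --expiry-prefix=project_db_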
(This question has nothing at all to do with Perl.)
When you place a file into an S3 bucket, you need to provide the location inside S3 where you want the file to be stored. If the specified (abstract) folder does not exist, it will be created.
In case you want to put the file at the root (/) of the bucket, the destination would be "s3://bucket600/" (notice the slash at the end).
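For example, a quick sketch with the bucket from the question (the file name is a placeholder):

s3cmd put backup.sql.xz s3://bucket600/            # stored at the root of the bucket
s3cmd put backup.sql.xz s3://bucket600/backups/    # stored under "backups", created if absent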
While reading over this S3 Lifecycle Policy document, I see that it's possible to delete an S3 object carrying a particular key=value tag pair, e.g.:
<LifecycleConfiguration>
    <Rule>
        <Filter>
            <Tag>
                <Key>key</Key>
                <Value>value</Value>
            </Tag>
        </Filter>
        transition/expiration actions.
        ...
    </Rule>
</LifecycleConfiguration>
But is it possible to create a similar rule that deletes any object NOT carrying the key=value pair? For example, any time my object is accessed I could update its tag with the current date, e.g. object-last-accessed=07-26-2019. Then I could create a Lambda function that deletes the current S3 lifecycle policy each day and creates a new one containing a tag for each of the last 30 days. That lifecycle policy would then automatically delete any object that has not been accessed in the last 30 days: anything accessed longer than 30 days ago would have a date value older than every value inside the lifecycle policy and hence would get deleted.
Here's an example of what I want (note that I added the made-up <exclude> field):
<LifecycleConfiguration>
    <Rule>
        <Filter>
            <exclude>
                <Tag>
                    <Key>last-accessed</Key>
                    <Value>07-30-2019</Value>
                </Tag>
                ...
                <Tag>
                    <Key>last-accessed</Key>
                    <Value>07-01-2019</Value>
                </Tag>
            </exclude>
        </Filter>
        transition/expiration actions.
        ...
    </Rule>
</LifecycleConfiguration>
Is something like my made-up <exclude> field possible? I want to delete any S3 object that has not been accessed in the last 30 days (which is different from an object that is older than 30 days).
From what I understand, this is possible, but via a different mechanism.
My solution is to take a slightly different approach: set a tag on every object and then alter that tag as you need.
So, in your case, when the object is created, set object-last-accessed to "default"; do that through an S3 trigger to a piece of Lambda, or at the moment the object is written to S3.
When the object is accessed, update the tag value to the current date.
If you already have a bucket full of objects, you can use S3 Batch Operations to set the tag to the current date and use that as a reference point from which to assume the files were last accessed:
https://docs.aws.amazon.com/AmazonS3/latest/API/API_PutObjectTagging.html
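For example, a minimal sketch of the tag update with the AWS CLI (bucket and key are placeholders; a Lambda would do the equivalent through an SDK):

# Overwrite the object's tag set with today's date
aws s3api put-object-tagging \
    --bucket my-bucket \
    --key path/to/object \
    --tagging "TagSet=[{Key=object-last-accessed,Value=$(date +%F)}]"

Note that put-object-tagging replaces the whole tag set, so any other tags on the object would need to be included in the call.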
Now set a lifecycle rule to remove objects with a tag of "default" after 10 days (or whatever you want).
Add additional rules to remove files tagged with a date, 10 days after that date. You will need to update the lifecycle rules periodically, but you can create 1,000 at a time.
This doc gives details of the format for a rule:
https://docs.aws.amazon.com/AmazonS3/latest/API/API_LifecycleRule.html
I'd suggest something like this
<LifecycleConfiguration>
    <Rule>
        <ID>LastAccessed Default Rule</ID>
        <Filter>
            <Tag>
                <Key>object-last-accessed</Key>
                <Value>default</Value>
            </Tag>
        </Filter>
        <Status>Enabled</Status>
        <Expiration>
            <Days>10</Days>
        </Expiration>
    </Rule>
    <Rule>
        <ID>Last Accessed 2020-05-19 Rule</ID>
        <Filter>
            <Tag>
                <Key>object-last-accessed</Key>
                <Value>2020-05-19</Value>
            </Tag>
        </Filter>
        <Status>Enabled</Status>
        <Expiration>
            <Date>2020-05-29</Date>
        </Expiration>
    </Rule>
</LifecycleConfiguration>
Reading further on this, as I'm faced with this problem myself: an alternative is to use Object Lock retention mode, which allows you to set a default retention period on a bucket and then extend that retention period as the file is accessed. This works at a version level, i.e. each version is retained for a period, not the whole file, so it may not be suitable for everyone. More details in https://docs.aws.amazon.com/AmazonS3/latest/dev/object-lock-overview.html#object-lock-retention-modes
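A hedged sketch of that retention-bump approach with the AWS CLI (bucket, key, and date are placeholders, and the bucket must have Object Lock enabled):

# Extend the retention window for a version each time the file is accessed
aws s3api put-object-retention \
    --bucket my-bucket \
    --key path/to/object \
    --retention '{"Mode":"GOVERNANCE","RetainUntilDate":"2020-06-18T00:00:00Z"}'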
Is it possible to have a file listing files and folders to be ignored when uploading items through the AWS CLI?
The CLI has an --exclude flag, as mentioned here. However, the concept I'm looking for is something like a .gitignore or .dockerignore file rather than listing patterns with a flag.
No, there is no built-in capability in the AWS Command-Line Interface (CLI) to support an .ignore file.
I know it's not exactly what you are looking for, but you could set an alias in your ~/.bash_profile, something like:
alias s3_cp='aws s3 cp --exclude "yadda, yadda, yadda"'
This would at least reduce the need to type the patterns every time, even though they don't live in a separate file.
Edit: here is a link that shows the base config file doesn't support what you are looking for: https://docs.aws.amazon.com/cli/latest/topic/s3-config.html
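A workaround sketch, assuming a hypothetical .s3ignore file with one glob pattern per line (the file name and format are made up, not an AWS feature):

# Turn each pattern in .s3ignore into an --exclude flag
EXCLUDES=()
while IFS= read -r pattern; do
    [ -n "$pattern" ] && EXCLUDES+=(--exclude "$pattern")
done < .s3ignore
aws s3 cp . s3://my-bucket/ --recursive "${EXCLUDES[@]}"

Repeating --exclude is allowed, so every pattern in the file is honored.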
While transferring my files using "aws s3 sync", the transferred files do not have the right Content-Type and Content-Encoding. I was able to fix the types by tweaking /etc/mime.types, but I have no idea how to set the right encoding for the ".gz" extension so that zipped files are served as text, apart from:
changing the types on S3 afterwards (seems like double work to me)
using aws-cli with exclude/include and the correct types (this results in multiple commands)
Any idea how to solve this? Thanks...
Here is how I solved it:
aws s3 sync /tmp/foo/ s3://bucket/ --exclude "*" --include "*.gz" --content-type "text/plain; charset=UTF-8"
By default, the aws s3 sync command guesses the best-matching content type (and sync is already recursive). If you want to change the default behavior, you need to handle those files separately.
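If you also need to fix the Content-Encoding from the question, sync accepts a --content-encoding option as well; a hedged variant of the command above:

aws s3 sync /tmp/foo/ s3://bucket/ \
    --exclude "*" --include "*.gz" \
    --content-type "text/plain; charset=UTF-8" \
    --content-encoding "gzip"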
Reference:
https://docs.aws.amazon.com/cli/latest/reference/s3/sync.html
Hope it helps.
I've been searching for an answer to this question for quite some time, but apparently I'm missing something.
I use s3cmd heavily to automate document uploads to AWS S3 via script. One of the parameters that can be used in s3cmd is --add-header, which I assumed would allow lifecycle rules to be added.
My objective is to add this parameter and specify a +X (where X is a number of days) on upload. In the event of ... --add-header=...1 ..., the lifecycle rule would delete this file after 24 hours.
I know this can easily be done via the console, but I would like to have more detailed control over individual files/scripts.
I've read the parameters that can be passed to S3 via s3cmd, but I somehow can't understand how to put them all together to get the intended result.
Thank you very much for any help or assistance!
The S3 API itself does not implement support for any request header that triggers lifecycle management at the object level.
The --add-header option for s3cmd can add headers that S3 understands, such as Content-Type, but there is no lifecycle header you can send using any tool.
You might be thinking of this:
If you make a GET or a HEAD request on an object that has been scheduled for expiration, the response will include an x-amz-expiration header that includes this expiration date and the corresponding rule Id
https://aws.amazon.com/blogs/aws/amazon-s3-object-expiration/
This is a response header, and it is read-only.
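If the goal is per-file control at upload time, a hedged workaround is to upload short-lived files under a dedicated prefix and attach an expiration rule to that prefix (bucket and prefix names are placeholders):

# Upload into a prefix reserved for files that should live one day
s3cmd put report.pdf s3://my-bucket/tmp-1day/
# One-time setup: expire everything under that prefix after 1 day
s3cmd expire s3://my-bucket --expiry-days=1 --expiry-prefix=tmp-1day/

For multiple prefixes with different lifetimes, use getlifecycle/setlifecycle as described at the top of this page, since s3cmd expire writes only a single rule.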