Copy data from one folder to another inside an AWS bucket automatically - amazon-web-services

I want to copy files from one folder to another inside the same bucket.
I have two folders, Actual and Backup.
As soon as a new file arrives in the Actual folder, I want it to be copied to the Backup folder immediately.

What you need are S3 Event Notifications. With these you can trigger a Lambda function whenever a new object is put; if the object's key starts with one prefix, the function writes the same object under the other prefix.
It is also worth noting that, although it behaves as if it did, S3 doesn't really have directories, just objects. So you are just creating the same object as /Actual/some-file with the key /Backup/some-file. It only looks like there is a directory because the objects /Actual/some-file and /Actual/other-file share the prefix /Actual/.
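A minimal sketch of such a Lambda function, assuming the bucket's notification is configured for s3:ObjectCreated:* events and that the keys carry no leading slash (Actual/some-file rather than /Actual/some-file). The names SRC_PREFIX, DST_PREFIX and backup_key_for are made up for illustration, and the optional s3 argument exists only so the handler can be exercised with a stand-in client:

```python
from urllib.parse import unquote_plus

SRC_PREFIX = "Actual/"  # assumed source "folder"
DST_PREFIX = "Backup/"  # assumed destination "folder"


def backup_key_for(key, src_prefix=SRC_PREFIX, dst_prefix=DST_PREFIX):
    """Return the Backup/ key for an Actual/ key, or None for any other key."""
    if not key.startswith(src_prefix) or key == src_prefix:
        return None
    return dst_prefix + key[len(src_prefix):]


def handler(event, context, s3=None):
    """Lambda entry point: copy each newly created Actual/ object to Backup/."""
    if s3 is None:
        import boto3  # imported lazily so the helper can be tested without AWS
        s3 = boto3.client("s3")

    copied = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Keys arrive URL-encoded in the event (a space becomes "+").
        key = unquote_plus(record["s3"]["object"]["key"])
        target = backup_key_for(key)
        if target is None:
            continue  # skip keys outside Actual/, including our own Backup/ writes
        s3.copy_object(
            Bucket=bucket,
            Key=target,
            CopySource={"Bucket": bucket, "Key": key},
        )
        copied.append(target)
    return copied
```

The prefix check matters: without it (or without a prefix filter on the notification itself), the copy into Backup/ would fire the same function again.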

Related

AWS S3 how to make folder as today's date

I would like to create a folder structure like the one shown below.
How should I specify the key?
* Top-level folder would be today's date
/yyyymmdd/*.json
e.g.
/2021-05-21/example.json
Folders do not actually exist in Amazon S3. You can create an object with any path and the folder magically 'appears'. Then, if the object is deleted, the folder will disappear. Amazon S3 calls them CommonPrefixes.
The Amazon S3 management console has a "Create Folder" button. This actually creates a zero-length object with the same name as the folder (e.g. 2021-05-21/). This 'forces' the folder to appear because there is an object inside it. However, the zero-length object is hidden in the console.
So, if you wish to 'create a folder', simply create a zero-length object with the name of the folder.
Or, better yet, do not create the folder. Just pretend that it exists and things will work perfectly fine.
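For the date-named "folder" in the question, the key can simply be built from today's date. A small sketch; dated_key is a hypothetical helper, and the default format follows the 2021-05-21 example rather than yyyymmdd:

```python
from datetime import date


def dated_key(filename, day=None, fmt="%Y-%m-%d"):
    """Build a key such as '2021-05-21/example.json' (no leading slash)."""
    day = day or date.today()
    return f"{day.strftime(fmt)}/{filename}"


# Usage with boto3 (bucket name and body are placeholders):
#   s3.put_object(Bucket="my-bucket", Key=dated_key("example.json"), Body=b"{}")
```

Passing fmt="%Y%m%d" gives the compact 20210521/ form instead.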

How to prevent Amazon S3 object expiration from deleting the directory

We want to delete temp files from one of the folders in an S3 bucket on a daily basis. I have tried an S3 lifecycle policy. For example, my folder name is Test, and I have set the expiration prefix to Test/. But the issue is that, along with all the files, the Test folder is also getting deleted. I want to keep the folder as is and delete only the files in it. Is there any way I can do this?
Don't worry about the folder.
As soon as you have at least one file "in" the folder (an object whose key begins with Test/) it will appear again, and when there are no objects beginning with that key prefix it will disappear. All of this is normal, expected behavior: folders in S3 are not containers like they are on a filesystem, and they do not need to exist before you put files "in" them.
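For reference, the lifecycle rule from the question can be written as a plain dictionary and applied with boto3. This is a sketch only: expiration_rule is a hypothetical helper, and the one-day expiration and the bucket name are assumptions, not values from the question.

```python
def expiration_rule(prefix, days):
    """Build a lifecycle configuration that expires objects under `prefix`."""
    return {
        "Rules": [
            {
                "ID": f"expire-{prefix.strip('/')}",
                "Filter": {"Prefix": prefix},  # matches every key starting with prefix
                "Status": "Enabled",
                "Expiration": {"Days": days},
            }
        ]
    }


# Usage (replaces any lifecycle configuration already on the bucket):
#   import boto3
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="my-bucket",
#       LifecycleConfiguration=expiration_rule("Test/", 1),
#   )
```

Note that a zero-length Test/ placeholder object, if one exists, also starts with the prefix Test/, so it falls under the same rule as the files.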

uploading file to specific folder in S3 bucket using boto3

My code is working. The only issue I'm facing is that I cannot specify the folder within the S3 bucket that I would like to place my file in. Here is what I have:
with open("/hadoop/prodtest/tips/ut/s3/test_audit_log.txt", "rb") as f:
    s3.upload_fileobj(f, "us-east-1-tip-s3-stage", "BlueConnect/test_audit_log.txt")
The explanation from @danimal captures pretty much everything. If you just want to create a folder-like object in S3, you can simply specify the folder name and end it with "/", so that it looks like a folder when you view it in the console.
It's rather useless: an empty object without a body (think of it as a key with a null value), there only for eye candy. But if you really want to do it, you can.
1) You can create it interactively in the console, as it gives you that option.
2) You can use the AWS SDK. boto3 has a put_object method on the S3 client, where you specify the key as "your_folder_name/"; see the example below:
import boto3
session = boto3.Session()  # I assume you know how to provide credentials etc.
s3 = session.client('s3', region_name='us-east-1')
bucket = s3.create_bucket(Bucket='my-test-bucket')  # client methods take keyword arguments
response = s3.put_object(Bucket='my-test-bucket', Key='my_pretty_folder/')  # note the ending "/"
And there you have your bucket, with a "folder" in it.
Again, when you upload a file you specify a key such as "my_pretty_folder/my_file"; what you do there is create a "key" named "my_pretty_folder/my_file" and put the content of your file in as its "value".
In this case you have 2 objects in the bucket. The first object has a null body and looks like a folder, while the second one looks like it is inside that folder. But, as @danimal pointed out, in reality you have created 2 keys in the same flat hierarchy; it just looks like what we are used to seeing in a file system.
If you delete the file, you still have the other object, so in the AWS console it looks like the folder is still there but with no files inside.
If you skipped creating the folder and simply uploaded the file like you did, you would still see the folder structure in the AWS console, but you would have a single object at that point.
When you list the objects from the command line, however, you would see a single object, and if you delete it in the console it looks like the folder is gone too.
Files ('objects') in S3 are actually stored by their 'Key' (~folders+filename) in a flat structure in a bucket. If you place slashes (/) in your key then S3 represents this to the user as though it is a marker for a folder structure, but those folders don't actually exist in S3, they are just a convenience for the user and allow for the usual folder navigation familiar from most file systems.
So, as your code stands, although it appears you are putting a file called test_audit_log.txt in a folder called BlueConnect, you are actually just placing an object, representing your file, in the us-east-1-tip-s3-stage bucket with a key of BlueConnect/test_audit_log.txt. In order then to (seem to) put it in a new folder, simply make the key whatever the full path to the file should be, for example:
# upload_fileobj(file, bucket, key)
s3.upload_fileobj(f, "us-east-1-tip-s3-stage", "folder1/folder2/test_audit_log.txt")
In this example, the 'key' of the object is folder1/folder2/test_audit_log.txt, which you can think of as the file test_audit_log.txt inside the folder folder2, which is in turn inside the folder folder1. This is how it will appear on S3, in a folder structure, which will generally be different and separate from your local machine's folder structure.
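To see how a flat set of keys turns into apparent folders, here is a small local simulation of a delimiter-based listing. It does not call S3; list_folder and the readme.txt key are made up for illustration, and the grouping is a simplified version of what a listing with a "/" delimiter reports as CommonPrefixes:

```python
def list_folder(keys, prefix="", delimiter="/"):
    """Group flat keys under `prefix` into apparent 'files' and 'folders'."""
    files, folders = [], set()
    for key in keys:
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter in rest:
            # Everything up to the next delimiter shows up as one "folder".
            folders.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        elif rest:
            files.append(key)
        # A key equal to the prefix itself (a folder placeholder) is omitted here.
    return sorted(files), sorted(folders)


keys = [
    "folder1/folder2/test_audit_log.txt",
    "BlueConnect/test_audit_log.txt",
    "readme.txt",
]
print(list_folder(keys))              # top level
print(list_folder(keys, "folder1/"))  # "inside" folder1
```

Remove folder1/folder2/test_audit_log.txt from the list and both folder1/ and folder1/folder2/ vanish from the results, since nothing was ever stored for them.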

Replicate local directory in S3 bucket

I have to replicate my local folder structure in an S3 bucket. I am able to do so, but it is not creating the folders that are empty. My local folder structure is as follows, and the command used is:
aws-exec s3 sync ./inbound s3://msit.xxwmm.supplychain.relex.eeeeeeeeee/
It only creates inbound/procurement/pending/test.txt; masterdata and transaction are not created, but if I put a file in each directory they are created.
As answered by @SabeenMalik in this StackOverflow thread:
S3 doesn't have the concept of directories, the whole folder/file.jpg
is the file name. If using a GUI tool or something you delete the
file.jpg from inside the folder, you will most probably see that the
folder is gone too. The visual representation in terms of directories
is for user convenience.
You do not need to pre-create the directory structure. Just pretend that the structure is there and everything will be okay.
Amazon S3 will automatically create the structure as objects are written to paths. For example, creating an object called s3://bucketname/inbound/procurement/foo will automatically create the directories.
(This isn't strictly true because Amazon S3 doesn't use directories, but it will appear that the directories are there.)
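If the empty directories really must show up in the bucket, one option is to write a zero-length object whose key ends in "/" for each of them, the same trick the console's "Create Folder" button uses. A sketch under that assumption; empty_dir_marker_keys is a hypothetical helper and the bucket name in the usage comment is a placeholder:

```python
import os


def empty_dir_marker_keys(local_root, key_prefix=""):
    """Return 'dir/' style keys for every empty directory below local_root."""
    keys = []
    for dirpath, dirnames, filenames in os.walk(local_root):
        if dirnames or filenames or dirpath == local_root:
            continue  # only directories with nothing in them need a marker
        rel = os.path.relpath(dirpath, local_root).replace(os.sep, "/")
        keys.append(f"{key_prefix}{rel}/")
    return sorted(keys)


# Usage after the sync has run:
#   import boto3
#   s3 = boto3.client("s3")
#   for key in empty_dir_marker_keys("./inbound"):
#       s3.put_object(Bucket="my-bucket", Key=key, Body=b"")
```

Whether the markers are worth having depends on the consumer; anything that writes objects under those prefixes later works the same with or without them.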

Cost of renaming a folder in AWS S3 bucket

I want to rename a folder in an S3 bucket. I understand that a rename runs a PUT request, which costs 1 cent per 1000 requests.
However, the PUT request is defined as a COPY and also involves a GET.
My question is: when we rename a folder in an S3 bucket, does it involve copying all sub-folders and files to a new folder with the name I want (which costs more than 1 PUT request), or is it simply 1 PUT request that changes the name without touching the items within the folder?
In case you've missed it... there are no folders in S3.
The object /pics/funny/cat.jpg is not a file called cat.jpg inside a folder called funny inside another folder called pics.
In fact, it is a file with an 18-character name: pics/funny/cat.jpg. The hierarchy shown in the console is largely for human convenience, and the ability to create new folders in the console is an illusion, also.
So, yes, renaming a "folder" actually means making a new copy of each object in the "folder," with the object names changed so that they look like they are in the new path.
This can be done with a PUT/COPY request ($0.005 per 1000, depending on the region) followed by a DELETE request for the old object (free). There is no corresponding GET request, because PUT/COPY is an atomic operation inside S3, so actually downloading and re-uploading the data is avoided.
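In boto3 terms, a "folder rename" could look roughly like the sketch below: list the keys under the old prefix, copy each one to the new prefix, then delete the original. rename_prefix and estimated_copy_cost are hypothetical helpers; the cost function only multiplies the object count by the per-1000 price quoted above and ignores the listing requests. As far as I know, a single copy_object call is limited to objects of up to 5 GB, so larger objects would need a multipart copy.

```python
def rename_prefix(s3, bucket, old_prefix, new_prefix):
    """Copy every object under old_prefix to new_prefix, then delete the original."""
    # Collect the keys first so the listing is not affected by our own writes.
    paginator = s3.get_paginator("list_objects_v2")
    keys = [
        obj["Key"]
        for page in paginator.paginate(Bucket=bucket, Prefix=old_prefix)
        for obj in page.get("Contents", [])
    ]

    moved = []
    for old_key in keys:
        new_key = new_prefix + old_key[len(old_prefix):]
        # Server-side copy: the data never leaves S3.
        s3.copy_object(
            Bucket=bucket,
            Key=new_key,
            CopySource={"Bucket": bucket, "Key": old_key},
        )
        s3.delete_object(Bucket=bucket, Key=old_key)
        moved.append((old_key, new_key))
    return moved


def estimated_copy_cost(object_count, price_per_1000=0.005):
    """Rough cost in dollars of the copy requests alone (one per object)."""
    return object_count / 1000 * price_per_1000


# Usage:
#   import boto3
#   rename_prefix(boto3.client("s3"), "my-bucket", "pics/funny/", "pics/silly/")
```

So a "folder" holding 10,000 objects costs about 10,000 copy requests to rename, not one.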