Replicate local directory in S3 bucket - amazon-web-services

I have to replicate my local folder structure in an S3 bucket. I am able to do so, but it is not creating the folders that are empty. My local folder structure is as follows, and the command used is:
aws-exec s3 sync ./inbound s3://msit.xxwmm.supplychain.relex.eeeeeeeeee/
It only creates inbound/procurement/pending/test.txt; masterdata and transaction are not created, but if I put a file in each directory, they are created.

As answered by @SabeenMalik in this StackOverflow thread:
S3 doesn't have the concept of directories, the whole folder/file.jpg
is the file name. If using a GUI tool or something you delete the
file.jpg from inside the folder, you will most probably see that the
folder is gone too. The visual representation in terms of directories
is for user convenience.

You do not need to pre-create the directory structure. Just pretend that the structure is there and everything will be okay.
Amazon S3 will automatically create the structure as objects are written to paths. For example, creating an object called s3://bucketname/inbound/procurement/foo will automatically make the inbound and procurement directories appear.
(This isn't strictly true because Amazon S3 doesn't use directories, but it will appear that the directories are there.)
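If the empty folders really must show up in the bucket, one option is to upload a zero-byte marker object whose key ends in "/" for each empty directory. The following is a minimal sketch of the planning step only, assuming a local tree shaped like the one in the question (masterdata, transaction, procurement/pending); plan_upload and demo are hypothetical helper names, and nothing here talks to AWS.

```python
import os
import tempfile


def plan_upload(local_root):
    """Return (key, local_path) pairs for everything under local_root.

    Files map to keys that mirror their relative path. An empty directory
    has no file to carry its name, so it gets a "dir/" marker key paired
    with None, meaning "upload a zero-byte object under this key".
    """
    plan = []
    for dirpath, dirnames, filenames in os.walk(local_root):
        rel_dir = os.path.relpath(dirpath, local_root)
        prefix = "" if rel_dir == "." else rel_dir.replace(os.sep, "/") + "/"
        for name in filenames:
            plan.append((prefix + name, os.path.join(dirpath, name)))
        # A directory with no files and no subdirectories needs a marker.
        if not dirnames and not filenames and prefix:
            plan.append((prefix, None))
    return sorted(plan, key=lambda item: item[0])


def demo():
    """Build a throwaway tree like the one in the question and plan it."""
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "procurement", "pending"))
        os.makedirs(os.path.join(root, "masterdata"))
        os.makedirs(os.path.join(root, "transaction"))
        path = os.path.join(root, "procurement", "pending", "test.txt")
        with open(path, "w") as handle:
            handle.write("hello")
        return [key for key, _ in plan_upload(root)]


if __name__ == "__main__":
    for key in demo():
        print(key)
```

Each pair with a None path could then be written as an empty object, for example with boto3's put_object(Bucket=..., Key=key, Body=b""), while the other pairs are uploaded as ordinary files.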

Related

Is it feasible to maintain directory structure when backing up to AWS S3 Glacier classes?

I am trying to back up 2TB from a shared drive of a Windows Server to S3 Glacier.
There are maybe 100 folders (some may be nested) and perhaps 5000 files (some small, like spreadsheets and photos, and others larger, like server images). My first question is: what counts as an object here?
Let’s say I have Folder 1, which has 10 folders inside it. Each of the 10 folders has 100 files.
Would number of objects be 1 folder + (10 folders * 100 files) = 1001 objects?
I am trying to understand how folder nesting is treated in S3. Do I have to manually create each folder as a prefix and then upload each file inside that using AWS CLI? I am trying to recreate the shared drive experience on the cloud where I can browse the folders and download the files I need.
Amazon S3 does not actually support folders. It might look like it does, but it actually doesn't.
For example, you could upload an object to invoices/january.txt and the invoices directory will just magically 'appear'. Then, if you deleted that object, the invoices folder would magically 'disappear' (because it never actually existed).
So, feel free to upload objects to any location without creating the directories first.
However, if you click the Create folder button in the Amazon S3 management console, it will create a zero-length object with the name of the directory. This will make the directory 'appear' and it would be counted as an object.
The easiest way to copy the files from your Windows computer to an Amazon S3 bucket would be:
aws s3 sync directoryname s3://bucket-name/ --storage-class DEEP_ARCHIVE
It will upload all files, including files in subdirectories. It will not create the folders, since they aren't necessary. However, the folder will still 'appear' in S3.
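To make the object count concrete, here is a small sketch that models the layout from the question (one top-level folder, 10 subfolders, 100 files each) as a plain list of keys. The key names are made up for illustration, and count_objects and visible_folders are hypothetical helpers, not part of any AWS tool.

```python
def count_objects(keys):
    """Every distinct key is one object, whether file or folder marker."""
    return len(set(keys))


def visible_folders(keys):
    """Folders a browser would display, derived purely from key prefixes."""
    folders = set()
    for key in keys:
        parts = key.split("/")[:-1]  # drop the last segment (file name or "")
        for depth in range(1, len(parts) + 1):
            folders.add("/".join(parts[:depth]) + "/")
    return folders


# Hypothetical layout: Folder1 holds 10 folders of 100 files each.
keys = [
    f"Folder1/sub{d:02d}/file{n:03d}.dat"
    for d in range(10)
    for n in range(100)
]

print(count_objects(keys))         # files only: 1000 objects
print(len(visible_folders(keys)))  # 11 folders appear, none is an object

# Clicking "Create folder" for Folder1 would add one zero-length object.
keys_with_marker = keys + ["Folder1/"]
print(count_objects(keys_with_marker))  # 1001 objects
```

So under these assumptions the upload from aws s3 sync would produce 1000 objects, and the count only becomes 1001 if a folder marker is created explicitly.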

Copy data from one folder to another inside an AWS bucket automatically

I want to copy files from one folder to another inside the same bucket.
I have two folders, Actual and Backup.
As soon as a new file arrives in the Actual folder, I want it to be copied immediately to the Backup folder.
What you need are S3 Event Notifications. With these you can trigger a Lambda function whenever a new object is put; if the object was put under one prefix, the function writes the same object under the other prefix.
It is also worth noting that, although it behaves as if it did, S3 doesn't really have directories, just objects. So you are just creating a copy of the object /Actual/some-file with the key /Backup/some-file. It only looks like there is a directory because the objects /Actual/some-file and /Actual/other-file share the prefix /Actual/.
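A rough sketch of what such a Lambda function might look like follows. It assumes the prefixes are Actual/ and Backup/ without a leading slash, that the event has the usual S3 notification shape (Records, s3.bucket.name, s3.object.key), and that the actual copy is injected as a callable so the example runs offline; plan_copies, handler and the copy parameter are hypothetical names.

```python
from urllib.parse import unquote_plus

SOURCE_PREFIX = "Actual/"  # assumed prefix names, taken from the question
BACKUP_PREFIX = "Backup/"


def plan_copies(event):
    """Return (bucket, source_key, backup_key) for new objects under Actual/."""
    copies = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        # Ignore anything outside Actual/ so that writes to Backup/
        # can never trigger another copy.
        if key.startswith(SOURCE_PREFIX):
            backup_key = BACKUP_PREFIX + key[len(SOURCE_PREFIX):]
            copies.append((bucket, key, backup_key))
    return copies


def handler(event, context, copy=None):
    """Lambda-style entry point; `copy` is injected to keep this offline.

    In a deployed function, `copy` could wrap boto3's
    s3.copy_object(Bucket=..., Key=..., CopySource={"Bucket": ..., "Key": ...}).
    """
    copies = plan_copies(event)
    if copy is not None:
        for bucket, source_key, backup_key in copies:
            copy(bucket, source_key, backup_key)
    return copies


# Hand-written sample event with one matching and one non-matching record.
sample_event = {
    "Records": [
        {"s3": {"bucket": {"name": "my-bucket"},
                "object": {"key": "Actual/report+2024.csv"}}},
        {"s3": {"bucket": {"name": "my-bucket"},
                "object": {"key": "Backup/old.csv"}}},
    ]
}

print(handler(sample_event, None))
```

Restricting the notification itself to the Actual/ prefix, in addition to the check inside the function, would avoid invoking the function for objects written under Backup/.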

AWS S3 - Use powershell to delete all files but keep the folders

I have a PowerShell script that downloads all files from an S3 bucket and then removes the files from the bucket. All the files I'm removing are stored in a subfolder in the S3 bucket, and I just want to delete the files but keep the subfolders.
I'm currently using the following command to delete the files in S3 once the file has been downloaded from S3.
Remove-S3Object -BucketName $S3Bucket -Key $key -Force
My problem is that if it removes all the files in the subfolder, the subfolder is removed as well. Is there a way to remove the files but keep the subfolder present using PowerShell? I believe I can do something like this,
aws s3 rm s3://<key_to_be_removed> --exclude "<subfolder_key>"
but not quite sure if that'll work.
I'm looking for the best way to accomplish this. At the moment, my only option is to recreate the subfolder via the script if the subfolder no longer exists.
The only way to accomplish having an empty folder is to create a zero-length object which has the same name as the folder you want to keep. This is actually how the S3 console enables you to create an empty folder.
You can check this by running $ aws s3 ls s3://your-bucket/folderfoo/ and observing an output object having length of zero bytes.
As already commented, S3 does not really have folders the way file systems do. The folders as presented by most S3 browsers are just generated based on the paths of the files/objects. If you upload an object/file named folder/file, the browsers will present folder as folder with file as a file in the folder. But technically, all that there is is the file/object folder/file. The folder does not exist on its own.
You can explicitly create a folder by creating an empty object with an empty name "in the folder": folder/. If you do that, it will appear that the folder exists even if there are no files in it. But if you do not do that, the virtual folder disappears once you remove all objects in the folder.
Now the question is whether your command removes even the empty named object representing the folder or not. I cannot tell that.
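One way to sidestep that uncertainty is to decide in the script which keys get deleted, and to skip any key that ends in "/". The sketch below models this on a plain list of keys; the key names are invented, and keys_to_delete and folder_markers_needed are hypothetical helpers rather than AWS cmdlets.

```python
def keys_to_delete(keys, prefix):
    """Select file objects under prefix, leaving "folder/" markers alone."""
    return [k for k in keys if k.startswith(prefix) and not k.endswith("/")]


def folder_markers_needed(keys, prefix):
    """List folders under prefix that exist only by implication.

    These have no zero-length marker object yet, so they would vanish
    once their files are deleted unless a marker is created first.
    """
    existing = set(keys)
    needed = set()
    for key in keys:
        if not key.startswith(prefix):
            continue
        parts = key.split("/")[:-1]
        for depth in range(1, len(parts) + 1):
            marker = "/".join(parts[:depth]) + "/"
            if marker.startswith(prefix) and marker not in existing:
                needed.add(marker)
    return sorted(needed)


# Hypothetical bucket listing: weekly/ has a marker, daily/ does not.
listing = [
    "exports/",
    "exports/daily/a.csv",
    "exports/daily/b.csv",
    "exports/weekly/",
    "exports/weekly/c.csv",
]

print(keys_to_delete(listing, "exports/"))
print(folder_markers_needed(listing, "exports/"))
```

In the PowerShell script, the same idea would amount to calling Remove-S3Object only for keys that do not end in "/", and creating a zero-length object for each folder reported as missing a marker.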

AWS s3 bucket shows 1 file less

I have used the AWS CLI tool to move a couple of folders named 2014, 2015, 2016, etc. from the root directory:
/2015/
into:
/images/2015/
When I moved them it seems that there is one file less in each bucket:
Before copying:
After copying:
Could you help me to understand this phenomenon?
The count is probably including/excluding the 'folder object'.
Normally, there is no need to create folders in Amazon S3. Simply putting an object in a particular path (eg /images/2014) will "create" the images and 2014 folders -- they 'appear' to exist, but they actually do not exist. Deleting the objects will make the folders disappear.
However, it is possible to create a folder by clicking Create folder. This will create a zero-length object with the same name as the folder. This will force the folder to appear, even when there are no objects inside the folder.
Therefore, it is likely that the "off by 1" count of objects is related to a folder that was/wasn't created via the Create folder command. I have previously seen exactly this behaviour.

Why AWS S3 uses objects and not file & directories

Why does AWS S3 use objects and not files & directories? Is there any specific reason not to have directories/folders in S3?
You are welcome to use directories/folders in Amazon S3. However, please realise that they do not actually exist.
Amazon S3 is not a filesystem. It is an object storage service that is highly scalable, stores trillions of objects and serves millions of objects per second. To meet the demands of such scale, it has been designed as a Key-Value store. The name of the file is the Key and the contents of the file is the Object.
When a file is uploaded to a directory (eg cat.jpg is stored in the images directory), it is actually stored with a filename of images/cat.jpg. This makes it appear to be in the images directory, but the reality is that the directory does not exist -- rather, the name of the object includes the full path.
This will not impact your normal usage of Amazon S3. However, it is not possible to rename a directory because the directory does not exist. Instead, rename the file to rename the directory. For example:
aws s3 mv s3://my-bucket/images/cat.jpg s3://my-bucket/pictures/cat.jpg
This will cause the pictures directory to magically appear, with cat.jpg inside it. There is no need to create the directory first, because it doesn't actually exist; the user interface merely makes it appear as though there are directories.
Bottom line: Feel free to use directories, but be aware that they do not actually exist and can't be renamed.
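The key-value view also shows what "renaming a directory" amounts to: every object under the old prefix has to be moved to a new key. Here is a minimal sketch that uses a plain dict as a stand-in for a bucket; rename_prefix is a hypothetical helper and the object contents are placeholders.

```python
def rename_prefix(store, old_prefix, new_prefix):
    """Move every object whose key starts with old_prefix to new_prefix.

    `store` is a plain dict standing in for a bucket: key -> object bytes.
    """
    # Collect matching keys first so the dict is not mutated mid-iteration.
    for key in [k for k in store if k.startswith(old_prefix)]:
        store[new_prefix + key[len(old_prefix):]] = store.pop(key)
    return store


# Toy bucket: the "images" directory exists only as a shared key prefix.
bucket = {
    "images/cat.jpg": b"cat bytes",
    "images/dog.jpg": b"dog bytes",
    "notes.txt": b"some notes",
}

rename_prefix(bucket, "images/", "pictures/")
print(sorted(bucket))
```

With the AWS CLI, the equivalent for many objects would be aws s3 mv with the --recursive option applied to the two prefixes, which moves each object under the old prefix one by one.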