AWS S3 how to make folder as today's date - amazon-web-services

I would like to create a folder structure displayed like below.
How should I specify the key?
*Top level folder would be today's date
/yyyymmdd/*.jason
e.g.
/2021-05-21/example.json

Folders do not actually exist in Amazon S3. You can create an object with any path and the folder magically 'appears'. Then, if the object is deleted, the folder will disappear. Amazon S3 calls them CommonPrefixes.
The Amazon S3 management console has a "Create Folder" button. This actually creates a zero-length object with the same name as the folder (e.g. 2021-05-21/). This 'forces' the folder to appear because there is an object inside it. However, the zero-length object is hidden in the console.
So, if you wish to 'create a folder', simply create a zero-length object with the name of the folder.
Or, better yet, do not create the folder. Just pretend that it exists and things will work perfectly fine.

Related

Copy data from one folder to another inside a AWS bucket automatically

I want to copy files from one folder to another inside the same bucket.
I have two folders Actual and Backup
As soon as new files comes to actual folder i want a way so that it immediately gets copied to Backup folder.
What you need are S3 Event Notifications. With these you can trigger a lambda function when a new item is put, then if it is put with one prefix, write the same object to the other prefix.
It is also worth noting that, though it is functionally as it seems, S3 doesn't really have directories; just objects. So you are just creating the same object as /Actual/some-file with key /Backup/some-file . It just looks like there is a directory because files /Actual/some-file and /Actual/other-file share a prefix /Actual/.

AWS s3 bucket shows 1 file less

I have used aws cli tool to move couple of folders named: 2014, 2015, 2016 etc from root directory:
/2015/
into:
/images/2015/
When I moved them it seems that there is one file less in each bucket:
Before copying:
After coping:
Could you help me to understand this phenomena ?
The count is probably including/excluding the 'folder object'.
Normally, there is no need to create folders in Amazon S3. Simply putting an object in a particular path (eg /images/2014 will "create" the images and 2014 folders -- they 'appear' to exist, but they actually do not exist. Deleting the objects will make the folders disappear.
However, it is possible to create a folder by clicking Create folder. This will create a zero-length object with the same name as the folder. This will force the folder to appear, even when there are no objects inside the folder.
Therefore, it is likely that the "off by 1" count of objects is related to a folder that was/wasn't created via the Create folder command. I have previously seen exactly this behaviour.

uploading file to specific folder in S3 bucket using boto3

My code is working. The only issue I'm facing is that I cannot specify the folder within the S3 bucket that I would like to place my file in. Here is what I have:
with open("/hadoop/prodtest/tips/ut/s3/test_audit_log.txt", "rb") as f:
s3.upload_fileobj(f, "us-east-1-tip-s3-stage", "BlueConnect/test_audit_log.txt")
Explanation from #danimal captures pretty much everything. If you wanted to just create a folder-like object in s3, you can simply specify that folder-name and end it with "/", so that when you look at it from the console, it will look like a folder.
It's rather useless, an empty object, without a body (consider it as a key with null value) just for eye-candy but if you really want to do it, you can.
1) You can create it on the console interactively, as it gives you that option
2_ You can use aws sdk. boto3 has put_object method for s3 client, where you specify the key as "your_folder_name/", see example below:
import boto3
session = boto3.Session() # I assume you know how to provide credentials etc.
s3 = session.client('s3', 'us-east-1')
bucket = s3.create_bucket('my-test-bucket')
response = s3.put_object(Bucket='my-test-bucket', Key='my_pretty_folder/' # note the ending "/"
And there you have your bucket.
Again, when you are uploading a file you specify "my-test-bucket/my_file" and what you did there is create a "key" with name "my-test-bucket/my_file" and put the content of your file as its "value".
In this case you have 2 objects in the bucket. First object has null body and looks like a folder, while the second one looks like it is inside that but as #danimal pointed out in reality you created 2 keys in the same flat hierarchy, it just "looks-like" what we are used to seeing in a file system.
If you delete the file, you still have the other objects, so on the aws console, it looks like folder is still there but no files inside.
If you skipped creating the folder and simply uploaded the file like you did, you would still see the folder structure in AWS Console but you have a single object at that point.
When you however list the objects from command line, you would see a single object and if you delete it on the console it looks like folder is gone too.
Files ('objects') in S3 are actually stored by their 'Key' (~folders+filename) in a flat structure in a bucket. If you place slashes (/) in your key then S3 represents this to the user as though it is a marker for a folder structure, but those folders don't actually exist in S3, they are just a convenience for the user and allow for the usual folder navigation familiar from most file systems.
So, as your code stands, although it appears you are putting a file called test_audit_log.txt in a folder called BlueConnect, you are actually just placing an object, representing your file, in the us-east-1-tip-s3-stage bucket with a key of BlueConnect/test_audit_log.txt. In order then to (seem to) put it in a new folder, simply make the key whatever the full path to the file should be, for example:
# upload_fileobj(file, bucket, key)
s3.upload_fileobj(f, "us-east-1-tip-s3-stage", "folder1/folder2/test_audit_log.txt")
In this example, the 'key' of the object is folder1/folder2/test_audit_log.txt which you can think of as the file test_audit_log.txt, inside the folder folder1 which is inside the folder folder2 - this is how it will appear on S3, in a folder structure, which will generally be different and separate from your local machine's folder structure.

AWS S3 Listing API - How to list everything inside S3 Bucket with specific prefix

I am trying to list all items with specific prefix in S3 bucket. Here is directory structure that I have:
Item1/
Item2/
Item3/
Item4/
image_1.jpg
Item5/
image_1.jpg
image_2.jpg
When I set prefex to be Item1/Item2, I get as a result following keys:
Item1/Item2/
Item1/Item2/Item3/Item4/image_1.jpg
Item1/Item2/Item3/Item5/image_1.jpg
Item1/Item2/Item3/Item5/image_2.jpg
What I would like to get is:
Item1/Item2/
Item1/Item2/Item3
Item1/Item2/Item3/Item4
Item1/Item2/Item3/Item5
Item1/Item2/Item3/Item4/image_1.jpg
Item1/Item2/Item3/Item5/image_1.jpg
Item1/Item2/Item3/Item5/image_2.jpg
Is there anyway to achieve this in golang?
Folders do not actually exist in Amazon S3. It is a flat object storage system.
For example, using the AWS Command-Line Interface (CLI) I could copy a command to an Amazon S3 bucket:
aws s3 cp foo.txt s3://my-bucket/folder1/folder2/foo.txt
This work just fine, even though folder1 and folder2 do not exist. This is because objects are stored with a Key (filename) that includes the full path of the object. So, the above object actually has a Key (filename) of:
folder1/folder2/foo.txt
However, to make things easier for humans, the Amazon S3 management console makes it appear as though there are folders. In S3, these are called Common Prefixes rather than folders.
So, when you make an API call to list the contents of the bucket while specifying a Prefix, it simply says "List all objects whose Key starts with this string".
Your listing doesn't show any folders because they don't actually exist.
Now, just to contradict myself, it actually is possible to create a folder (eg by clicking Create folder in the management console). This actually creates a zero-length object with the same name as the folder. The folder will then appear in listings because it is actually listing the zero-length object rather than the folder.
This is probably why Item1/Item2/ appears in your listing, but Item1/Item2/Item3 does not. Somebody, at some stage, must have "created a folder" called Item1/Item2/, which actually created a zero-length object with that Key.

Folders in S3 Bucket not visible in Web Console

After deleting a few folders in our S3 bucket, I am not able to see any of my folders through the web console. We had around 10 folders and ended up deleting 6 of them. The remaining four show up when I do an 'ls' on that S3 bucket through the CLI but the bucket shows up empty on the web console.
When I turn on 'Versions' I see everything (including the 6 folders that were deleted). Am I overlooking something extremely simple?
Folders do not actually exist in Amazon S3.
For example, you could create an object like this:
aws s3 cp foo.txt s3://my-bucket/folder1/folder2/foo.txt
This would instantly 'create' folder1 and folder2. Or, to be more accurate, the folders would 'appear' but they don't actually exist because the full filename (Key) of the object is folder1/folder2/foo.txt.
If you were then to delete that object, the folders would 'disappear' because they never actually existed.
Sometimes, if a system wants to forcefully make a folder 'appear', it can create a zero-length object with the same name as the folder. This makes the folder 'appear', but it is really the empty file that is appearing.
Bottom line: Don't worry about creating and deleting folders. They will appear when necessary and disappear when not being used. Do not try to map normal filesystem behaviour to Amazon S3.