Delete a weird object in s3 bucket - amazon-web-services

I somehow ended up creating a weird object name in an AWS S3 bucket, which is something like:
I tried deleting it from the AWS CLI, aws-sdk-go, and from the AWS console as well. Nothing seems to work. Has anyone faced an issue like this, and how did you counter it?
P.S.: My bucket contains 24 gigabytes of data.

Using aws-cli, I moved the objects I wanted to keep to another folder. After that, I ran:
$ aws s3 rm s3://mybucket/public/0 --recursive
# where 0 is the directory containing the object I wanted to delete
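For the move step mentioned above, a minimal sketch, assuming placeholder bucket/folder names and a hypothetical --exclude pattern that leaves the weird object behind:
$ aws s3 mv s3://mybucket/public/0/ s3://mybucket/public/keep/ --recursive --exclude "weird*"
# adjust the --exclude pattern so only the object you want to delete stays in public/0/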

It is likely that the name of the file includes some unprintable characters, or something that does not display properly in an HTML page. You can use an API call to delete it, but the hard part will be finding the exact filename!
I would use the AWS CLI to get a list of all Keys:
aws s3api list-objects-v2 --bucket my-bucket --query Contents[].Key
Then find the offending object and delete it:
aws s3 rm XXX
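If the key does contain unprintable characters, the low-level delete call with the exact, quoted key can be more reliable than aws s3 rm. A sketch, where the bucket name and key are placeholders pasted from the listing above:
aws s3api delete-object --bucket my-bucket --key 'public/0/exact-key-from-the-listing'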

Related

Delete Folders, Subfolders and All Files from a S3 bucket older than X days

I have an S3 bucket with the below structure -
Bucket
|__2019-08-23/
| |__SubFolder1
| |__Files
|
|__2019-08-22/
|__SubFolder2
I want to delete all folders, subfolders and files within, which are older than X days.
How can that be done? I am not sure if S3 Lifecycle can be used for this?
Also when I do -
aws s3 ls s3://bucket/
I get this -
PRE 2019-08-23/
PRE 2019-08-22/
Why do I see PRE in front of the folder name?
As per the valuable comments, I tried this -
$ Number=1;current_date=$(date +%Y-%m-%d);
past_date=$(date -d "$current_date - $Number days" +%Y-%m-%d);
aws s3api list-objects --bucket bucketname --query 'Contents[?LastModified<=$past_date ].{Key:Key,LastModified: LastModified}' --output text | xargs -I {} aws s3 rm bucketname/{}
I am trying to remove all files which are 1 day old. But I get this error -
Bad jmespath expression: Unknown token $:
Contents[?LastModified<=$past_date ].{Key:Key,LastModified: LastModified}
How can I pass a variable into the LastModified filter?
You can use a lifecycle rule, a Lambda function if you have more complex logic, or the command line.
Here is an example using the command line:
aws s3api list-objects --bucket your-bucket --query 'Contents[?LastModified>=`2019-01-01` ].{Key:Key,LastModified: LastModified}' --prefix "2019-01-01" --output text | xargs -I {} aws s3 rm s3://your-bucket/{}
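As for the error above: the single quotes stop the shell from expanding $past_date. A sketch of one fix, using double quotes around the query and escaping the backticks of the JMESPath literal (bucket name is a placeholder):
past_date=$(date -d "1 day ago" +%Y-%m-%d)
aws s3api list-objects --bucket bucketname --query "Contents[?LastModified<=\`$past_date\`].[Key]" --output text | xargs -I {} aws s3 rm "s3://bucketname/{}"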
@Elzo's answer already covers the lifecycle policy and how to delete the objects, so here is an answer for the second part of your question:
PRE stands for PREFIX as stated in the aws s3 cli's manual.
If you run aws s3 ls help you will come across the following section:
The following ls command lists objects and common prefixes under a specified bucket and prefix. In this example, the user owns the bucket mybucket with the objects test.txt and somePrefix/test.txt. The LastWriteTime and Length are arbitrary. Note that since the ls command has no interaction with the local filesystem, the s3:// URI scheme is not required to resolve ambiguity and may be omitted:
aws s3 ls s3://mybucket
Output:
PRE somePrefix/
2013-07-25 17:06:27 88 test.txt
This is just to differentiate keys that have a prefix (split by forward slashes) from keys that don't.
Therefore, if your key is prefix/key01 you will always see a PRE in front of it. However, if your key is key01, then PRE is not shown.
Keep in mind that S3 does not work with directories even though you can tell otherwise when looking from the UI. S3's file structure is just one flat single-level container of files.
From the docs:
In Amazon S3, buckets and objects are the primary resources, where objects are stored in buckets. Amazon S3 has a flat structure with no hierarchy like you would see in a file system. However, for the sake of organizational simplicity, the Amazon S3 console supports the folder concept as a means of grouping objects. Amazon S3 does this by using a shared name prefix for objects (that is, objects that have names that begin with a common string). Object names are also referred to as key names.
For example, you can create a folder in the console called photos, and store an object named myphoto.jpg in it. The object is then stored with the key name photos/myphoto.jpg, where photos/ is the prefix.
S3 Lifecycle can be used at the bucket level. For folder and subfolder management, you can write a simple AWS Lambda function to delete the folders and subfolders that are xx days old. Leverage the AWS SDK for JavaScript, Java, Python, etc. to develop the Lambda.
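A minimal sketch of such a Lambda in Python with boto3, assuming a placeholder bucket name and age threshold:
import boto3
from datetime import datetime, timedelta, timezone

s3 = boto3.client("s3")
BUCKET = "bucketname"  # placeholder
MAX_AGE_DAYS = 1       # placeholder

def handler(event, context):
    cutoff = datetime.now(timezone.utc) - timedelta(days=MAX_AGE_DAYS)
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=BUCKET):
        # keep only keys older than the cutoff; LastModified is timezone-aware
        old = [{"Key": obj["Key"]} for obj in page.get("Contents", [])
               if obj["LastModified"] < cutoff]
        if old:
            # delete_objects accepts up to 1000 keys, which matches the page size
            s3.delete_objects(Bucket=BUCKET, Delete={"Objects": old})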

Using AWS CLI to query file names inside folders?

Our bucket structure goes from MyBucket -> CustomerGUID(folder) -> [actual files]
I'm having a hell of a time trying to use the AWS CLI (on Windows) --query option to locate a file across all of the customer folders. Can someone look at my --query and see what I'm doing wrong here? Or tell me the proper way to search for a specific file name?
This is an example of how i'm able to list ALL the files in the bucket LastModified by a date.
I need to limit the output based on filename, and that is where I'm getting stuck. When I look at the individual files in S3, I can see other files have a "Key"; is the Key the 'name' of the file?
aws s3 ls s3://mybucket --recursive --output text --query "Contents[?contains(LastModified) > '2018-12-8']"
The aws s3 ls command only returns a text list of objects.
If you wish to use --query, then use: aws s3api list-objects
See: list-objects — AWS CLI Command Reference
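(And yes, the Key is the object's full name, including the folder prefix.) For example, a sketch of searching for a specific file name across all customer folders, with placeholder bucket and file names:
aws s3api list-objects --bucket mybucket --output text --query "Contents[?contains(Key, 'myfile.pdf')].[Key, LastModified]"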

AWS S3 Bucket endpoint

I have created a bucket and am trying to use it from an application, and it is giving the following error:
"error: S3ServiceException:The bucket you are attempting to access must be addressed using the specified endpoint."
I am using this format: s3://bucketname. I know the format is not the issue, because I am able to use this format for another public bucket. I think the permissions on my bucket may be the issue, but I am not sure.
Can someone please help? Thank you in advance.
Maybe this can help you a bit. I used this command to copy my images to a bucket on S3 from the Linux command line. Please note that I also use a trailing /.
# these commands are bidirectional
# . refers to the current directory you are in
# bucket name and region are mandatory
aws s3 cp . s3://bucketname/foldername/ --recursive --include "*" --region ap-southeast-1
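If you still hit the "specified endpoint" error, it usually means the bucket lives in a different region than the one the CLI is targeting; passing the bucket's actual region explicitly is a reasonable first step:
aws s3 cp . s3://bucketname/foldername/ --recursive --region <bucket-region>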

Unable to copy from S3 to Ec2 instance

I am trying to copy a file from S3 to an EC2 instance; here is the strange behavior.
The following command runs perfectly fine and shows me the contents of S3 that I want to access:
$aws s3 ls
2016-05-05 07:40:57 folder1
2016-05-07 15:04:42 my-folder
Then I issue the following command (also successful):
$ aws s3 ls s3://my-folder
2016-05-07 16:44:50 6007 myfile.txt
But when I try to copy this file, I receive the following error:
$aws s3 cp s3://my-folder/myfile.txt ./
A region must be specified --region or specifying the region in a configuration file or as an environment variable. Alternately, an endpoint can be specified with --endpoint-url
I simply want to copy a txt file from S3 to the EC2 instance.
At the very least, how should I modify the above command to copy the contents? I am not sure about the region, since if I visit S3 from the web it says
"S3 does not require region selection"
What on earth is happening?
Most likely something is not working right; you should not be able to list the bucket if your region is not set up as the default in aws configure.
Therefore from my experience with S3 if this works:
aws s3 ls s3://my-folder
then this should work as well:
aws s3 cp s3://my-folder/myfile.txt ./
However if it's asking you for region, then you need to provide it.
Try this to get the bucket region:
aws s3api get-bucket-location --bucket BUCKET
And then this to copy the file:
aws s3 cp --region <your_buckets_region> s3://my-folder/myfile.txt ./
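One caveat worth noting: for buckets in us-east-1, get-bucket-location returns a null LocationConstraint, so a null result means you should pass --region us-east-1. For example:
aws s3api get-bucket-location --bucket my-folder
# {"LocationConstraint": "us-west-2"}   (null here means us-east-1)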
If I visit S3 from web it says
"S3 does not require region selection"
S3 and bucket regions can be very confusing, especially with that message, which IMO is the most misleading information ever when it comes to S3 regions. Every bucket has a specific region (the default is us-east-1) unless you have enabled cross-region replication.
You can choose a region to optimize latency, minimize costs, or address regulatory requirements. Objects stored in a region never leave that region unless you explicitly transfer them to another region. For more information about regions, see Accessing a Bucket in the Amazon Simple Storage Service Developer Guide.
How about
aws s3 cp s3://my-folder/myfile.txt .
# or
aws s3 cp s3://my-folder/myfile.txt myfile.txt
I suspect the problem is something to do with the local path parser.
It is kind of strange, because the AWS CLI reads the credential and region config.
The fix is specifying the region. Below is an example; if you can't get the bucket's region from the CLI, you can find it in the console.
aws s3 cp s3://xxxxyyyyy/2008-Nissan-Sentra.pdf myfile.pdf --region us-west-2

How to create folder on S3 from Ec2 instance

I want to create a folder in an S3 bucket from an EC2 instance. I tried put-object, but it's not working. Is there any way of creating a folder on S3 from an EC2 instance using the CLI?
You don't need to create a folder to put an item in it. For example, just run something like the below command and S3 will create the folders if they don't exist:
aws s3 cp ./testfile s3://yourBucketName/any/path/you/like
If you want to use cp recursively, you can specify the --recursive option, or use aws s3 sync.
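A quick sketch of both, with placeholder paths:
aws s3 cp ./localdir s3://yourBucketName/any/path/ --recursive
# or keep a local tree and an S3 prefix in sync
aws s3 sync ./localdir s3://yourBucketName/any/path/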
If your command does not work, then you may have permission issues. Paste your error so that we can help you.
aws s3api put-object --bucket bucketname --key foldername/
This command works like a charm.
Courtesy AWS Support.
aws s3 sync <folder_name> s3://<you-prefix>/<some_other_folder>/<folder_name>
And bear in mind that S3 is an object store. It doesn't deal with folders.
If you create /xyz/ and upload a file called /xyz/foo.txt, those are actually 2 different objects. If you delete /xyz/, it will not delete /xyz/foo.txt.
The S3 console allows you to "create folders", but after you play with it you will notice that you CANNOT RENAME a folder, or do ANYTHING that you can do with a real folder (like moving a tree structure, or recursively setting access rights).
In S3 there is something called a "PREFIX", where the API allows you to list/filter files with a particular prefix; that lets you deal with the abstraction.
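For example, a sketch of listing by prefix with the low-level API (bucket name and prefix are placeholders):
aws s3api list-objects-v2 --bucket my-bucket --prefix "xyz/" --query Contents[].Key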
As mentioned above, since you CANNOT do anything like you would with a filesystem folder, if you want to perform a task like moving one folder to another, you need to write your own code to "rewrite" the file name (to be specific, the "Key" in S3), i.e. copy it to a new object name and delete the old object.
If you want to build more advanced control on S3, you may choose any of the AWS SDKs to do it.
https://aws.amazon.com/tools/
You can play around with the API call put_object() (the naming varies depending on the SDK language) and prove those facts, most of which are found inside the AWS documentation.
Update: since @Tom raised the issue.
You cannot create a virtual folder using the high-level aws s3 commands (maybe @Tom can show how); one way to do it is with the AWS SDK's put_object():
Let's try this.
First, I create a dummy file in the shell:
echo "dummy" > test.txt
Then try the Python AWS SDK:
import boto3

s3 = boto3.client("s3")
s3.create_bucket(Bucket="dummy")
# now create the so-called "empty virtual folder" xyz/
s3.put_object(Bucket="dummy", Key="xyz/")
# now upload the file above to S3 as xyz/test.txt
# open the file in binary mode, because put_object only takes bytes or a file object
with open("test.txt", "rb") as myfile:
    s3.put_object(Bucket="dummy", Key="xyz/test.txt", Body=myfile)
Now, go to your command shell and fire up the AWS CLI (or continue to play with boto3):
# check everything
aws s3 ls s3://dummy --recursive
# now delete the so-called "folder"
aws s3 rm s3://dummy/xyz/
# and you will see the file "xyz/test.txt" is still there
aws s3 ls s3://dummy --recursive
You can find the commands in the official AWS documentation:
http://docs.aws.amazon.com/cli/latest/userguide/using-s3-commands.html
There are also various other tools available that can be used to create buckets/folders in S3. One well-known tool is S3 Browser, which is available for Windows servers. Install it on your EC2 instance and provide your AWS access key and secret key to access S3. This tool provides a simple UI for doing that.
There is no CLI command that allows you to simply create a folder in an S3 bucket. To create this "folder" I would use the following command, which creates an empty file with nothing inside. If you delete the file, you will delete the folder, as long as you have not added anything else afterwards.
aws s3api put-object --bucket bucket_name --key folder_name/empty.csv