AWS relative path in S3 key name - amazon-web-services

All of the AWS S3 documentation says that although S3 key names look like paths, they are not paths. However, can anyone explain why the following code does not work? It seems to be doing some kind of relative path check.
import com.amazonaws.PredefinedClientConfigurations;
import com.amazonaws.auth.profile.ProfileCredentialsProvider;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import java.io.File;
public class S3Test {
public static void main(String[] args) {
AmazonS3 amazonS3 = AmazonS3ClientBuilder.standard()
.withClientConfiguration(
PredefinedClientConfigurations.defaultConfig().withMaxErrorRetry(10).withRequestTimeout(30_000))
.withCredentials(new ProfileCredentialsProvider("myprofile"))
.withRegion("us-east-1")
.build();
File file = new File("test.json");
System.out.println("Start upload1");
// this works
amazonS3.putObject("mybucket", "a/../test.json", file);
System.out.println("Start upload2");
//this fails with error code 400 Invalid URI
amazonS3.putObject("mybucket", "a/../../test.json", file);
}
}

This key:
a/../test.json
resolves to the same level as the folder a. So after your upload command, your bucket will look like this:
- a/
- test.json
This key:
a/../../test.json
on the other hand resolves to one level above the level that contains the folder a, that is, above the bucket root. That level doesn't exist in the bucket, hence the error.
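To make the path arithmetic above concrete, here is a minimal Python sketch (Python rather than the Java of the question, and not the AWS SDK's actual code). It assumes the key is resolved like a POSIX path relative to the bucket root; the helper names are hypothetical.

```python
import posixpath


def resolve_key_as_path(key):
    """Normalize a key as if it were a POSIX path relative to the bucket root."""
    return posixpath.normpath(key)


def escapes_bucket_root(key):
    """Return True if the normalized key would climb above the bucket root."""
    resolved = resolve_key_as_path(key)
    return resolved == ".." or resolved.startswith("../")


for key in ("a/../test.json", "a/../../test.json"):
    print(key, "->", resolve_key_as_path(key), "| escapes root:", escapes_bucket_root(key))
```

The first key collapses to test.json, while the second still begins with a `..` segment after normalization, so it has nowhere to point inside the bucket.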

Related

How to copy a file automatically between 2 buckets in two different GCP projects?

Currently I use this command, and it works well:
gsutil cp gs://bucket1/file.xml gs://bucket2/destination_folder
(bucket1 is in project1 in GCP and bucket2 is in another GCP project.)
I would like to run that command every day at 9am. How can I do that in my GCP project in an easy way?
Edit: It will copy the file again each day from the source bucket to the destination bucket (the two buckets are in different projects). When the file arrives in the destination bucket, it is consumed and ingested into BigQuery automatically; I just want to trigger my gsutil command and stop doing it manually each morning.
(Excluding the Data Transfer method: I have no rights on the source project, so I cannot activate the service account for Data Transfer. I only have rights on the destination project.)
Best regards,
Currently I can copy a file from one bucket into a specific folder of another bucket (note: the 2 buckets are in the same GCP project).
I can't get the second method, with a gs:// path, to work.
EDIT 2:
import base64
import sys
import urllib.parse
# Imports the Google Cloud client library; don't forget to list it in the requirements, or the function will fail
from google.cloud import storage
def copy_blob(
    bucket_name="prod-data", blob_name="test.csv", destination_bucket_name="prod-data-f", destination_blob_name="channel_p"
):
    """Copies a blob from one bucket to another with a new name."""
    bucket_name = "prod-data"
    blob_name = "test.csv"
    destination_bucket_name = "prod-data-f"
    destination_blob_name = "channel_p/test.csv"
    storage_client = storage.Client()
    source_bucket = storage_client.bucket(bucket_name)
    source_blob = source_bucket.blob("huhu/"+blob_name)
    destination_bucket = storage_client.bucket(destination_bucket_name)
    blob_copy = source_bucket.copy_blob(
        source_blob, destination_bucket, destination_blob_name
    )
    # Second Method (does not work)
    #
    # client = storage.Client()
    # with open('gs://prod-data-f/channelp.xml','wb') as file_obj:
    #     client.download_blob_to_file(
    #         'gs://pathsource/somefolder/channelp.xml', file_obj)
    #
    # End of second Method
    print(
        "Blob {} in bucket {} copied to blob {} in bucket {}.".format(
            source_blob.name,
            source_bucket.name,
            blob_copy.name,
            destination_bucket.name,
        )
    )
Data transfer is obviously the right tool for doing this, but since you cannot use it, there are alternative solutions.
One of them is to copy the files using a Cloud Function (you can use this snippet) and trigger that Cloud Function each day at 9am using Cloud Scheduler. A Cloud Function can also be triggered by a Pub/Sub message.
The solution that I was looking for (it works when I test it):
main.py
import base64
import os
import sys
import json
import uuid
import logging
from time import sleep
from flask import request
from random import uniform
from google.cloud import firestore
from google.cloud.exceptions import Forbidden, NotFound
from google.cloud import storage
# set retry deadline to 60s
DEFAULT_RETRY = storage.retry.DEFAULT_RETRY.with_deadline(60)
def Move2FinalBucket(data, context):
    # if 'data' in event:
    #     name = base64.b64decode(event['data']).decode('utf-8')
    # else:
    #     name = 'NO_DATA'
    # print('Message {}!'.format(name))
    # Get cache source bucket
    cache_bucket = storage.Client().get_bucket('nameofmysourcebucket', timeout=540, retry=DEFAULT_RETRY)
    # Get source file to copy
    blob2transfer = cache_bucket.blob('uu/oo/pp/filename.csv')
    # Get cache destination bucket
    destination_bucket = storage.Client().get_bucket('nameofmydestinationbucket', timeout=540, retry=DEFAULT_RETRY)
    # Get destination file
    new_file = destination_bucket.blob('kk/filename.csv')
    # Rewrite the source blob into new_file
    new_file.rewrite(blob2transfer, timeout=540, retry=DEFAULT_RETRY)
requirements.txt
# Function dependencies, for example:
# package>=version
#google-cloud-storage==1.22.0
google-cloud-storage
google-cloud-firestore
google-api-core
flask==1.1.4
Don't forget to attach a service account with the Storage Admin role to this Cloud Function, and it will work.
Best regards,

Creating a dmarc parser using parsedmarc in python3 for use in AWS s3

I am very new to programming. I am working on a pipeline to analyze DMARC report files that are sent to my email account, that I am manually placing in an s3 bucket. The goal of this task is to download, extract, and analyze files using parsedmarc: https://github.com/domainaware/parsedmarc The part I'm having difficulty with is setting a conditional statement to extract .gz files if the target file is not a .zip file. I'm assuming the gzip library will be sufficient for this purpose. Here is the code I have so far. I'm using python3 and the boto3 library for AWS. Any help is appreciated!
import parsedmarc
import pprint
import json
import boto3
import zipfile
import gzip
pp = pprint.PrettyPrinter(indent=2)
def main():
    #Set default session profile and region for sandbox account. Access keys are pulled from ~/.aws/config and ~/.aws/credentials.
    #The 'profile_name' value comes from the header for the account in question in ~/.aws/config and ~/.aws/credentials
    #Both values are set in one call, because a second setup_default_session call replaces the first session.
    boto3.setup_default_session(profile_name="aws-account-profile-name-goes-here", region_name="aws-region-goes-here")
    #Define the s3 resource, the bucket name, and the file to download. It's hardcoded for now...
    s3_resource = boto3.resource('s3')
    s3_resource.Bucket('dmarc-parsing').download_file('source-dmarc-report-filename.zip', '/home/user/dmarc/parseme.zip')
    #Use the zipfile python library to extract the file into its raw state.
    with zipfile.ZipFile('/home/user/dmarc/parseme.zip', 'r') as zip_ref:
        zip_ref.extractall('/home/user/dmarc')
    #Ingest all locations for xml file source
    dmarc_report_directory = '/home/user/dmarc/'
    dmarc_report_file = 'parseme.xml'
    """I need an if statement here for extracting .gz files if the file type is not .zip. The contents of every archive are .xml files"""
    #Set report output variables using functions in parsedmarc. Variable set to equal the output
    pd_report_output=parsedmarc.parse_aggregate_report_file(_input=f"{dmarc_report_directory}{dmarc_report_file}")
    #Round-trip through json to get plain JSON-compatible types
    pd_report_jsonified = json.loads(json.dumps(pd_report_output))
    dkim_status = pd_report_jsonified['records'][0]['policy_evaluated']['dkim']
    spf_status = pd_report_jsonified['records'][0]['policy_evaluated']['spf']
    if dkim_status == 'fail' or spf_status == 'fail':
        print(f"{dmarc_report_file} reports failure. oh crap. report:")
    else:
        print(f"{dmarc_report_file} passes. great. report:")
    pp.pprint(pd_report_jsonified['records'][0]['auth_results'])
if __name__ == "__main__":
    main()
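One possible shape for the missing conditional, as a hedged sketch using only the standard library: it checks the file's content rather than its extension, using zipfile.is_zipfile and the two gzip magic bytes. The function name extract_report and the way the output file is named are assumptions made for illustration.

```python
import gzip
import shutil
import zipfile
from pathlib import Path


def extract_report(archive_path, out_dir):
    """Extract a .zip or .gz report archive into out_dir and return the extracted paths."""
    archive_path = Path(archive_path)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    if zipfile.is_zipfile(archive_path):
        with zipfile.ZipFile(archive_path) as zf:
            zf.extractall(out_dir)
            return [out_dir / name for name in zf.namelist()]

    # Gzip streams start with the magic bytes 0x1f 0x8b.
    with open(archive_path, "rb") as fh:
        is_gzip = fh.read(2) == b"\x1f\x8b"

    if is_gzip:
        # A .gz file holds one member and no file name we rely on, so the
        # output name is derived from the archive name (an assumption).
        if archive_path.suffix == ".gz":
            target = out_dir / archive_path.stem
        else:
            target = out_dir / (archive_path.name + ".xml")
        with gzip.open(archive_path, "rb") as src, open(target, "wb") as dst:
            shutil.copyfileobj(src, dst)
        return [target]

    raise ValueError(f"{archive_path} is neither a zip nor a gzip archive")
```

The returned paths can then be handed to the parsing step in place of the hardcoded parseme.xml.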
Here is the code using the parsedmarc.parse_aggregate_report_xml method I found. Hope this helps others in parsing these reports:
import parsedmarc
import pprint
import json
import boto3
import zipfile
import gzip
pp = pprint.PrettyPrinter(indent=2)
def main():
    #Set default session profile and region for account. Access keys are pulled from ~/.aws/config and ~/.aws/credentials.
    #The 'profile_name' value comes from the header for the account in question in ~/.aws/config and ~/.aws/credentials
    boto3.setup_default_session(profile_name="aws_profile_name_goes_here", region_name="region_goes_here")
    source_file = 'filename_in_s3_bucket.zip'
    destination_directory = '/tmp/'
    destination_file = 'compressed_report_file'
    #Define the s3 resource, the bucket name, and the file to download. It's hardcoded for now...
    s3_resource = boto3.resource('s3')
    s3_resource.Bucket('bucket-name-for-dmarc-report-files').download_file(source_file, f"{destination_directory}{destination_file}")
    #Extract xml
    outputxml = parsedmarc.extract_xml(f"{destination_directory}{destination_file}")
    #run parse dmarc analysis & convert output to json
    pd_report_output = parsedmarc.parse_aggregate_report_xml(outputxml)
    pd_report_jsonified = json.loads(json.dumps(pd_report_output))
    #loop through results and find relevant status info and pass fail status
    dmarc_report_status = ''
    for record in pd_report_jsonified['records']:
        if False in record['alignment'].values():
            dmarc_report_status = 'Failed'
    #************ add logic for interpreting results
    #if fail, publish to sns
    if dmarc_report_status == 'Failed':
        message = "Your dmarc report failed at least one check. Review the log for details"
        sns_resource = boto3.resource('sns')
        sns_topic = sns_resource.Topic('arn:aws:sns:us-west-2:112896196555:TestDMARC')
        sns_publish_response = sns_topic.publish(Message=message)
if __name__ == "__main__":
    main()

Reading .conf file from AWS s3 through spark and scala

I was able to load a text file from AWS S3, but I am having a problem reading the ".conf" file. I get the error:
"Exception in thread "main" com.typesafe.config.ConfigException$Missing: No configuration setting found for key 'spark'"
Scala code:
val configFile1 = ConfigFactory.load( "s3n://<bucket_name>/aws.conf" )
configFile1.getString("spark.lineage.key")
Here is what I ended up doing. Create a wrapper utility, Config.scala:
import java.io.File
import com.amazonaws.auth.DefaultAWSCredentialsProviderChain
import com.amazonaws.services.s3.{AmazonS3Client, AmazonS3URI}
import com.typesafe.config.{ConfigFactory, Config => TConfig}
import scala.io.Source
object Config {
private def read(location: String): String = {
val awsCredentials = new DefaultAWSCredentialsProviderChain()
val s3Client = new AmazonS3Client(awsCredentials)
val s3Uri = new AmazonS3URI(location)
val fullObject = s3Client.getObject(s3Uri.getBucket, s3Uri.getKey)
Source.fromInputStream(fullObject.getObjectContent).getLines.mkString("\n")
}
def apply(location: String): TConfig = {
if (location.startsWith("s3")) {
val content = read(location)
ConfigFactory.parseString(content)
} else {
ConfigFactory.parseFile(new File(location))
}
}
}
Use the created wrapper
val conf: TConfig = Config("s3://config/path")
You may use provided scope for aws-java-sdk since it will be available in the EMR cluster.
According to my research, we can only read delimited files from AWS S3 through Spark/Scala. As .conf files consist of key = value pairs, it's not possible.
The only way would be to modify the format of the data in the file.
Typesafe Config does not support loading .conf files from S3, but you can read the S3 file as a string yourself and pass it to Typesafe Config like val conf = ConfigFactory.parseString(... .conf files as string ...)

Issue with uploading files from local directory to aws S3 using python 2.7 and boto 2

I'm doing a simple operation: downloading gzip files from an S3 bucket to a local directory. I extract them into another local directory and then upload them back to the S3 bucket under an archive folder path. While doing this, I want to make sure I am processing the same set of files that I initially downloaded from the S3 bucket, which is (f_name) in the code below. The code below does not upload the files back to S3; that's where I'm stuck. It does download from S3 and extract into the local directory. Can you please help me understand what is wrong with the _uploadFile function?
from boto.s3.connection import S3Connection
from boto.s3.key import *
import os
import os.path
aws_bucket = "event-logs-dev"  ## S3 Bucket name
local_download_directory = "/Users/TargetData/Download/test_queue1/"  ## local directory to download the gzip files from S3.
Target_directory_to_extract = "/Users/TargetData/unzip"  ## local directory to gunzip the downloaded files.
Target_s3_path_to_upload = "event-logs-dev/data/clean/xact/logs/archive/"  ## S3 bucket path to upload the files.
def decompressAllFilesFromNetfiler(self,aws_bucket,local_download_directory,Target_directory_to_extract,Target_s3_path_to_upload):
    zipFiles = [f for f in os.listdir(local_download_directory) if re.match(r'.*\.tar\.gz', f)]
    for f_name in zipFiles:
        if os.path.exists(Target_directory_to_extract+"/"+f_name[:-len('.tar.gz')]) and os.access(Target_directory_to_extract+"/"+f_name[:-len('.tar.gz')], os.R_OK):
            print ('File {} already exists!'.format(f_name))
        else:
            f_name_with_path = os.path.join(local_download_directory, f_name)
            os.system('mkdir -p {} && tar vxzf {} -C {}'.format(Target_directory_to_extract, f_name_with_path, Target_directory_to_extract))
            print ('Extracted file {}'.format(f_name))
            self._uploadFile(aws_bucket,f_name,Target_s3_path_to_upload,Target_directory_to_extract)
def _uploadFile(self, aws_bucket, f_name,Target_s3_path_to_upload,Target_directory_to_extract):
    full_key_name = os.path.expanduser(os.path.join(Target_s3_path_to_upload, f_name))
    path = os.path.expanduser(os.path.join(Target_directory_to_extract, f_name))
    try:
        print "Uploaded extracted file to: %s" % (full_key_name)
        key = aws_bucket.new_key(full_key_name)
        key.set_contents_from_filename(path)
    except:
        if full_key_name is None:
            print "Error uploading"
Currently, the output prints that Uploaded extracted file to: event-logs-dev/data/clean/xact/logs/archive/1442235602129200000.tar.gz, but nothing is uploaded to S3 bucket. Your help is greatly appreciated!! Thank you in advance!
It appears that you have cut and pasted parts of your code, and the formatting may have been lost, because the code above will not work as pasted. I've taken the liberty of making it (mostly) PEP8 compliant; however, the code that creates the S3 objects is still missing. Since you import the modules, I presume that you have that section of code and just didn't paste it.
Here is a cleaned-up version of your code, formatted correctly. I also added an Exception handler to your try: block to print out the error you get. You should make the Exception more specific to the exceptions thrown by new_key or set_contents_..., but the general Exception will get you started. If nothing else, this is more readable, but you should include your S3 connection code too, and remove anything that is specific to your domain (e.g. keys, trade secrets, etc.).
#!/usr/bin/env python
"""
do some download
some extract
and some upload
"""
from boto.s3.connection import S3Connection
from boto.s3.key import *
import os
import os.path
import re
aws_bucket = 'event-logs-dev'
local_download_directory = '/Users/TargetData/Download/test_queue1/'
Target_directory_to_extract = '/Users/TargetData/unzip'
Target_s3_path_to_upload = 'event-logs-dev/data/clean/xact/logs/archive/'
'''
MUST BE SOME MAGIC HERE TO GET AN S3 CONNECTION ???
aws_bucket IS NOT A BUCKET OBJECT ...
'''
def decompressAllFilesFromNetfiler(self,
                                   aws_bucket,
                                   local_download_directory,
                                   Target_directory_to_extract,
                                   Target_s3_path_to_upload):
    '''
    decompress stuff
    '''
    zipFiles = [f for f in os.listdir(
        local_download_directory) if re.match(r'.*\.tar\.gz', f)]
    for f_name in zipFiles:
        # path of the extracted file: archive name without the .tar.gz suffix
        extracted_path = "{}/{}".format(Target_directory_to_extract,
                                        f_name[:-len('.tar.gz')])
        if os.path.exists(extracted_path) and os.access(extracted_path,
                                                        os.R_OK):
            print ('File {} already exists!'.format(f_name))
        else:
            f_name_with_path = os.path.join(local_download_directory, f_name)
            os.system('mkdir -p {} && tar vxzf {} -C {}'.format(
                Target_directory_to_extract,
                f_name_with_path,
                Target_directory_to_extract))
            print ('Extracted file {}'.format(f_name))
            self._uploadFile(aws_bucket,
                             f_name,
                             Target_s3_path_to_upload,
                             Target_directory_to_extract)
def _uploadFile(self,
                aws_bucket,
                f_name,
                Target_s3_path_to_upload,
                Target_directory_to_extract):
    full_key_name = os.path.expanduser(os.path.join(Target_s3_path_to_upload,
                                                    f_name))
    path = os.path.expanduser(os.path.join(Target_directory_to_extract, f_name))
    try:
        S3CONN = S3Connection()
        BUCKET = S3CONN.get_bucket(aws_bucket)
        key = BUCKET.new_key(full_key_name)
        key.set_contents_from_filename(path)
        # report success only after the upload call has returned
        print "Uploaded extracted file to: {}".format(full_key_name)
    except Exception as UploadERR:
        if full_key_name is None:
            print 'Error uploading'
        else:
            print "Error : {}".format(UploadERR)
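Why the original printed "Uploaded extracted file to: ..." while nothing reached S3 can be reproduced without any AWS access. The sketch below is Python 3 (the thread's code is Python 2) with hypothetical helper names; a plain string stands in for the bucket argument, as in the question, and messages are collected in a list instead of printed.

```python
def upload_swallowing_errors(bucket, key_name, path):
    """Mimics the question's _uploadFile: success is logged first and errors are swallowed."""
    messages = []
    try:
        messages.append("Uploaded extracted file to: %s" % key_name)  # logged before uploading
        key = bucket.new_key(key_name)  # a str has no new_key, so this raises AttributeError
        key.set_contents_from_filename(path)
    except Exception:
        if key_name is None:  # never true here, so the error goes unreported
            messages.append("Error uploading")
    return messages


def upload_reporting_errors(bucket, key_name, path):
    """Same calls, but success is logged after the upload and the error is surfaced."""
    messages = []
    try:
        key = bucket.new_key(key_name)
        key.set_contents_from_filename(path)
        messages.append("Uploaded extracted file to: %s" % key_name)
    except Exception as err:
        messages.append("Error: %s" % err)
    return messages


print(upload_swallowing_errors("event-logs-dev", "archive/x.tar.gz", "/tmp/x.tar.gz"))
print(upload_reporting_errors("event-logs-dev", "archive/x.tar.gz", "/tmp/x.tar.gz"))
```

The first helper reports success even though the upload never ran; the second surfaces the AttributeError, which points at the bucket name being passed where a bucket object is expected.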

How to rename files and folder in Amazon S3?

Is there any function to rename files and folders in Amazon S3? Any related suggestions are also welcome.
I just tested this and it works:
aws s3 --recursive mv s3://<bucketname>/<folder_name_from> s3://<bucket>/<folder_name_to>
There is no direct method to rename a file in S3. What you have to do is copy the existing file with a new name (just set the target key) and delete the old one.
aws s3 cp s3://source_folder/ s3://destination_folder/ --recursive
aws s3 rm s3://source_folder --recursive
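The same copy-then-delete idea for a single object, as a minimal boto3-style sketch. The function name rename_object is hypothetical, and the client argument is assumed to be something like boto3.client('s3'); only its copy_object and delete_object calls are used.

```python
def rename_object(client, bucket, old_key, new_key):
    """Copy old_key to new_key within one bucket, then delete old_key."""
    if old_key == new_key:
        # Nothing to rename; deleting here would destroy the only copy.
        return
    client.copy_object(
        Bucket=bucket,
        Key=new_key,
        CopySource={"Bucket": bucket, "Key": old_key},
    )
    # Reached only if the copy did not raise, so the source is never
    # deleted before the copy call has completed.
    client.delete_object(Bucket=bucket, Key=old_key)


# Example call (needs AWS credentials, so it is left commented out):
# import boto3
# rename_object(boto3.client("s3"), "my-bucket", "old/name.csv", "new/name.csv")
```

Doing the delete only after a successful copy means a failure partway through leaves the original object in place.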
You can use the AWS CLI commands to mv the files
You can either use AWS CLI or s3cmd command to rename the files and folders in AWS S3 bucket.
Using S3cmd, use the following syntax to rename a folder,
s3cmd --recursive mv s3://<s3_bucketname>/<old_foldername>/ s3://<s3_bucketname>/<new_folder_name>
Using AWS CLI, use the following syntax to rename a folder,
aws s3 --recursive mv s3://<s3_bucketname>/<old_foldername>/ s3://<s3_bucketname>/<new_folder_name>
I've just got this working. You can use the AWS SDK for PHP like this:
use Aws\S3\S3Client;
$sourceBucket = '*** Your Source Bucket Name ***';
$sourceKeyname = '*** Your Source Object Key ***';
$targetBucket = '*** Your Target Bucket Name ***';
$targetKeyname = '*** Your Target Key Name ***';
// Instantiate the client.
$s3 = S3Client::factory();
// Copy an object.
$s3->copyObject(array(
'Bucket' => $targetBucket,
'Key' => $targetKeyname,
'CopySource' => "{$sourceBucket}/{$sourceKeyname}",
));
http://docs.aws.amazon.com/AmazonS3/latest/dev/CopyingObjectUsingPHP.html
This is now possible for Files, select the file then select Actions > Rename in the GUI.
To rename a folder, you instead have to create a new folder, and select the contents of the old one and copy/paste it across (Under "Actions" again)
We have 2 ways by which we can rename a file on AWS S3 storage -
1. Using the CLI tool -
aws s3 --recursive mv s3://bucket-name/dirname/oldfile s3://bucket-name/dirname/newfile
2. Using the SDK
$s3->copyObject(array(
'Bucket' => $targetBucket,
'Key' => $targetKeyname,
'CopySource' => "{$sourceBucket}/{$sourceKeyname}",));
To rename a folder (which is technically a set of objects with a common prefix as key) you can use the aws CLI move command with --recursive option.
aws s3 mv s3://bucket/old_folder s3://bucket/new_folder --recursive
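Since a "folder" is only a shared key prefix, renaming one comes down to rewriting that prefix on every key. A small pure-Python sketch of the mapping follows; the function name plan_prefix_rename is hypothetical and nothing here talks to S3.

```python
def plan_prefix_rename(keys, old_prefix, new_prefix):
    """Map every key under old_prefix to the matching key under new_prefix."""
    # Force a trailing slash so "old_folder" does not also match "old_folder2/...".
    if not old_prefix.endswith("/"):
        old_prefix += "/"
    if not new_prefix.endswith("/"):
        new_prefix += "/"
    return {
        key: new_prefix + key[len(old_prefix):]
        for key in keys
        if key.startswith(old_prefix)
    }


keys = ["old_folder/a.csv", "old_folder/sub/b.csv", "old_folder2/c.csv"]
for source, target in plan_prefix_rename(keys, "old_folder", "new_folder").items():
    print(source, "->", target)
```

Each source and target pair can then be copied and the source deleted, which is what the recursive mv does for you.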
There is no way to rename a folder through the GUI. The fastest (and easiest, if you like the GUI) way to achieve this is to perform a plain old copy: create the new folder on S3 using the GUI, go to your old folder, select all, choose "copy", then navigate to the new folder and choose "paste". When done, remove the old folder.
This simple method is very fast because it copies from S3 to itself (no need to re-upload or anything like that), and it also maintains the permissions and metadata of the copied objects, as you would expect.
Here's how you do it in .NET, using S3 .NET SDK:
var client = new Amazon.S3.AmazonS3Client(_credentials, _config);
client.CopyObject(oldBucketName, oldfilepath, newBucketName, newFilePath);
client.DeleteObject(oldBucketName, oldfilepath);
P.S. Try to use the "Async" versions of the client methods where possible, even though I haven't done so here, for readability.
This works for renaming the file in the same folder
aws s3 mv s3://bucketname/folder_name1/test_original.csv s3://bucketname/folder_name1/test_renamed.csv
Below is a code example that renames a file on S3. My file was named part-000* because it was a Spark output file, so I copy it to another file name in the same location and then delete the part-000* file:
import boto3
client = boto3.client('s3')
response = client.list_objects(
Bucket='lsph',
MaxKeys=10,
Prefix='03curated/DIM_DEMOGRAPHIC/',
Delimiter='/'
)
name = response["Contents"][0]["Key"]
copy_source = {'Bucket': 'lsph', 'Key': name}
client.copy_object(Bucket='lsph', CopySource=copy_source,
Key='03curated/DIM_DEMOGRAPHIC/'+'DIM_DEMOGRAPHIC.json')
client.delete_object(Bucket='lsph', Key=name)
File and folder are in fact objects in S3. You should use PUT OBJECT COPY to rename them. See http://docs.aws.amazon.com/AmazonS3/latest/API/RESTObjectCOPY.html
rename all the *.csv.err files in the <<bucket>>/landing dir into *.csv files with s3cmd
export aws_profile='foo-bar-aws-profile'
while read -r f ; do tgt_fle=$(echo $f|perl -ne 's/^(.*).csv.err/$1.csv/g;print'); \
echo s3cmd -c ~/.aws/s3cmd/$aws_profile.s3cfg mv $f $tgt_fle; \
done < <(s3cmd -r -c ~/.aws/s3cmd/$aws_profile.s3cfg ls --acl-public --guess-mime-type \
s3://$bucket | grep -i landing | grep csv.err | cut -d" " -f5)
As answered by Naaz, directly renaming in S3 is not possible.
I have attached a code snippet that copies all the contents.
The code works; just add your AWS access key and secret key.
Here's what the code does:
-> copies the source folder contents (nested children and folders) into the destination folder
-> when the copying is complete, deletes the source folder
package com.bighalf.doc.amazon;
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.util.List;
import com.amazonaws.auth.AWSCredentials;
import com.amazonaws.auth.BasicAWSCredentials;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3Client;
import com.amazonaws.services.s3.model.CopyObjectRequest;
import com.amazonaws.services.s3.model.ObjectMetadata;
import com.amazonaws.services.s3.model.PutObjectRequest;
import com.amazonaws.services.s3.model.S3ObjectSummary;
public class Test {
public static boolean renameAwsFolder(String bucketName,String keyName,String newName) {
boolean result = false;
try {
AmazonS3 s3client = getAmazonS3ClientObject();
List<S3ObjectSummary> fileList = s3client.listObjects(bucketName, keyName).getObjectSummaries();
//some meta data to create empty folders start
ObjectMetadata metadata = new ObjectMetadata();
metadata.setContentLength(0);
InputStream emptyContent = new ByteArrayInputStream(new byte[0]);
//some meta data to create empty folders end
//final location is the locaiton where the child folder contents of the existing folder should go
String finalLocation = keyName.substring(0,keyName.lastIndexOf('/')+1)+newName;
for (S3ObjectSummary file : fileList) {
String key = file.getKey();
//updating child folder location with the newlocation
String destinationKeyName = key.replace(keyName,finalLocation);
if(key.charAt(key.length()-1)=='/'){
//if name ends with suffix (/) means its a folders
PutObjectRequest putObjectRequest = new PutObjectRequest(bucketName, destinationKeyName, emptyContent, metadata);
s3client.putObject(putObjectRequest);
}else{
//if name doesnot ends with suffix (/) means its a file
CopyObjectRequest copyObjRequest = new CopyObjectRequest(bucketName,
file.getKey(), bucketName, destinationKeyName);
s3client.copyObject(copyObjRequest);
}
}
boolean isFodlerDeleted = deleteFolderFromAws(bucketName, keyName);
return isFodlerDeleted;
} catch (Exception e) {
e.printStackTrace();
}
return result;
}
public static boolean deleteFolderFromAws(String bucketName, String keyName) {
boolean result = false;
try {
AmazonS3 s3client = getAmazonS3ClientObject();
//deleting folder children
List<S3ObjectSummary> fileList = s3client.listObjects(bucketName, keyName).getObjectSummaries();
for (S3ObjectSummary file : fileList) {
s3client.deleteObject(bucketName, file.getKey());
}
//deleting actual passed folder
s3client.deleteObject(bucketName, keyName);
result = true;
} catch (Exception e) {
e.printStackTrace();
}
return result;
}
public static void main(String[] args) {
intializeAmazonObjects();
boolean result = renameAwsFolder(bucketName, keyName, newName);
System.out.println(result);
}
private static AWSCredentials credentials = null;
private static AmazonS3 amazonS3Client = null;
private static final String ACCESS_KEY = "";
private static final String SECRET_ACCESS_KEY = "";
private static final String bucketName = "";
private static final String keyName = "";
//renaming folder c to x from key name
private static final String newName = "";
public static void intializeAmazonObjects() {
credentials = new BasicAWSCredentials(ACCESS_KEY, SECRET_ACCESS_KEY);
amazonS3Client = new AmazonS3Client(credentials);
}
public static AmazonS3 getAmazonS3ClientObject() {
return amazonS3Client;
}
}
In the AWS console, if you navigate to S3, you will see your folders listed. If you navigate to a folder, you will see the object(s) listed. Right-click and you can rename. Or you can check the box in front of your object, then select Rename from the pull-down menu named ACTIONS. Just worked for me, 3-31-2019.
If you want to rename a lot of files from an s3 folder you can run the following script.
FILES=$(aws s3api list-objects --bucket your_bucket --prefix 'your_path' --delimiter '/' | jq -r '.Contents[] | select(.Size > 0) | .Key' | sed '<your_rename_here>')
for i in $FILES
do
aws s3 mv s3://<your_bucket>/${i}.gz s3://<your_bucket>/${i}
done
What I did was create a new folder and move the older file objects to the new folder.
There seem to be a lot of 'issues' with folder structures in S3, as the storage is flat.
I have a Django project where I needed the ability to rename a folder but still keep the directory structure intact, meaning empty folders would need to be copied and stored in the renamed directory as well.
The aws cli is great, but none of cp, sync or mv copied empty folders (i.e. files ending in '/') over to the new folder location, so I used a mixture of boto3 and the aws cli to accomplish the task.
More or less, I find all folders in the directory being renamed and use boto3 to put them in the new location, then I cp the data with the aws cli and finally remove the old data.
import threading
import os
from django.conf import settings
from django.contrib import messages
from django.core.files.storage import default_storage
from django.shortcuts import redirect
from django.urls import reverse
def rename_folder(request, client_url):
    """
    :param request:
    :param client_url:
    :return:
    """
    current_property = request.session.get('property')
    if request.POST:
        # name the change
        new_name = request.POST['name']
        # old full path with www.[].com?
        old_path = request.POST['old_path']
        # remove the query string
        old_path = ''.join(old_path.split('?')[0])
        # remove the .com prefix item so we have the path in the storage
        old_path = ''.join(old_path.split('.com/')[-1])
        # remove empty values, this will happen at end due to these being folders
        old_path_list = [x for x in old_path.split('/') if x != '']
        # remove the last folder element with split()
        base_path = '/'.join(old_path_list[:-1])
        # # now build the new path
        new_path = base_path + f'/{new_name}/'
        # remove empty variables
        # print(old_path_list[:-1], old_path.split('/'), old_path, base_path, new_path)
        endpoint = settings.AWS_S3_ENDPOINT_URL
        # # recursively add the files
        copy_command = f"aws s3 --endpoint={endpoint} cp s3://{old_path} s3://{new_path} --recursive"
        remove_command = f"aws s3 --endpoint={endpoint} rm s3://{old_path} --recursive"
        # get_creds() is nothing special it simply returns the elements needed via boto3
        client, resource, bucket, resource_bucket = get_creds()
        path_viewing = f'{"/".join(old_path.split("/")[1:])}'
        directory_content = default_storage.listdir(path_viewing)
        # loop over folders and add them by default, aws cli does not copy empty ones
        # so this is used to accommodate
        folders, files = directory_content
        for folder in folders:
            new_key = new_path+folder+'/'
            # we must remove bucket name for this to work
            new_key = new_key.split(f"{bucket}/")[-1]
            # push this to new thread
            threading.Thread(target=put_object, args=(client, bucket, new_key,)).start()
            print(f'{new_key} added')
        # # run command, which will copy all data
        os.system(copy_command)
        print('Copy Done...')
        os.system(remove_command)
        print('Remove Done...')
        # print(bucket)
        print(f'Folder renamed.')
        messages.success(request, f'Folder Renamed to: {new_name}')
    return redirect(request.META.get('HTTP_REFERER', f"{reverse('home', args=[client_url])}"))
S3DirectoryInfo has a MoveTo method that will move one directory into another directory, such that the moved directory will become a subdirectory of the other directory with the same name as it originally had.
The extension method below will move one directory to another directory, i.e. the moved directory will become the other directory. What it actually does is create the new directory, move all the contents of the old directory into it, and then delete the old one.
public static class S3DirectoryInfoExtensions
{
public static S3DirectoryInfo Move(this S3DirectoryInfo fromDir, S3DirectoryInfo toDir)
{
if (toDir.Exists)
throw new ArgumentException("Destination for Rename operation already exists", "toDir");
toDir.Create();
foreach (var d in fromDir.EnumerateDirectories())
d.MoveTo(toDir);
foreach (var f in fromDir.EnumerateFiles())
f.MoveTo(toDir);
fromDir.Delete();
return toDir;
}
}
There is a piece of software that lets you perform different kinds of operations on an S3 bucket.
Software Name: S3 Browser
S3 Browser is a freeware Windows client for Amazon S3 and Amazon CloudFront. Amazon S3 provides a simple web services interface that can be used to store and retrieve any amount of data, at any time, from anywhere on the web. Amazon CloudFront is a content delivery network (CDN). It can be used to deliver your files using a global network of edge locations.
If it's only a one-time task, you can use the command line to perform these operations:
(1) Rename the folder in the same bucket:
s3cmd --access_key={access_key} --secret_key={secret_key} mv s3://bucket/folder1/* s3://bucket/folder2/
(2) Rename the bucket (by moving the folder contents to another bucket):
s3cmd --access_key={access_key} --secret_key={secret_key} mv s3://bucket1/folder/* s3://bucket2/folder/
Where,
{access_key} = Your valid access key for s3 client
{secret_key} = Your valid secret key for s3 client
It's working fine without any problem.
Thanks