Copy S3 Object with MultiPartUpload

I need to rename quite a lot of objects in AWS S3. For small objects, the following snippet works flawlessly:
input := &s3.CopyObjectInput{
    Bucket:     aws.String(bucket),
    Key:        aws.String(targetPrefix),
    CopySource: aws.String(source),
}
_, err = svc.CopyObject(input)
if err != nil {
    panic(errors.Wrap(err, "error copying object"))
}
I am running into the S3 size limitation for larger objects. I understand I need to copy the object using a multipart upload. This is what I have tried so far:
multiPartUpload, err := svc.CreateMultipartUpload(
    &s3.CreateMultipartUploadInput{
        Bucket: aws.String(bucket),
        Key:    aws.String(targetPrefix), // targetPrefix is the new name
    },
)
if err != nil {
    panic(errors.Wrap(err, "could not create MultiPartUpload"))
}
resp, err := svc.UploadPartCopy(
    &s3.UploadPartCopyInput{
        UploadId:   multiPartUpload.UploadId,
        Bucket:     aws.String(bucket),
        Key:        aws.String(targetPrefix),
        CopySource: aws.String(source),
        PartNumber: aws.Int64(1),
    },
)
if err != nil {
    panic(errors.Wrap(err, "error copying multipart object"))
}
log.Printf("copied: %v", resp)
The golang SDK bails out on me with:
InvalidRequest: The specified copy source is larger than the maximum allowable size for a copy source: 5368709120
I have also tried the following approach but I do not get any parts listed here:
multiPartUpload, err := svc.CreateMultipartUpload(
    &s3.CreateMultipartUploadInput{
        Bucket: aws.String(bucket),
        Key:    aws.String(targetPrefix), // targetPrefix is the new name
    },
)
if err != nil {
    panic(errors.Wrap(err, "could not create MultiPartUpload"))
}
err = svc.ListPartsPages(
    &s3.ListPartsInput{
        Bucket:   aws.String(bucket),       // Required
        Key:      obj.Key,                  // Required
        UploadId: multiPartUpload.UploadId, // Required
    },
    // Iterate over all parts in the `CopySource` object
    func(parts *s3.ListPartsOutput, lastPage bool) bool {
        log.Printf("parts:\n%v\n%v", parts, parts.Parts)
        // parts.Parts is an empty slice
        for _, part := range parts.Parts {
            log.Printf("copying %v part %v", source, part.PartNumber)
            resp, err := svc.UploadPartCopy(
                &s3.UploadPartCopyInput{
                    UploadId:   multiPartUpload.UploadId,
                    Bucket:     aws.String(bucket),
                    Key:        aws.String(targetPrefix),
                    CopySource: aws.String(source),
                    PartNumber: part.PartNumber,
                },
            )
            if err != nil {
                panic(errors.Wrap(err, "error copying object"))
            }
            log.Printf("copied: %v", resp)
        }
        return true
    },
)
if err != nil {
    panic(errors.Wrap(err, "something went wrong with ListPartsPages!"))
}
What am I doing wrong, or am I misunderstanding something?

I think that ListPartsPages is the wrong direction because it works on "Multipart Uploads", which is a different entity than an S3 "Object". So you're listing the already-uploaded parts of the multipart upload you just created.
Your first example is close to what's needed, but you need to manually split the original file into parts, with the range of each part specified by UploadPartCopyInput's CopySourceRange. At least that's my take from reading the documentation.
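To make that concrete, here is a minimal sketch of the CopySourceRange approach with aws-sdk-go v1. It assumes svc, bucket, source ("sourcebucket/sourcekey") and targetPrefix from the question are in scope, and that sourceSize holds the source object's size in bytes (obtainable, for example, via HeadObject); the fixed part size and panic-style error handling are only illustrative:
// Copy a large object in ranged parts, as described above.
const partSize = int64(64 * 1024 * 1024) // any value between 5 MiB and 5 GiB

mpu, err := svc.CreateMultipartUpload(&s3.CreateMultipartUploadInput{
    Bucket: aws.String(bucket),
    Key:    aws.String(targetPrefix),
})
if err != nil {
    panic(errors.Wrap(err, "could not create MultiPartUpload"))
}

var completed []*s3.CompletedPart
partNumber := int64(1)
for offset := int64(0); offset < sourceSize; offset += partSize {
    last := offset + partSize - 1
    if last > sourceSize-1 {
        last = sourceSize - 1
    }
    resp, err := svc.UploadPartCopy(&s3.UploadPartCopyInput{
        UploadId:        mpu.UploadId,
        Bucket:          aws.String(bucket),
        Key:             aws.String(targetPrefix),
        CopySource:      aws.String(source),
        CopySourceRange: aws.String(fmt.Sprintf("bytes=%d-%d", offset, last)),
        PartNumber:      aws.Int64(partNumber),
    })
    if err != nil {
        panic(errors.Wrap(err, "error copying part"))
    }
    completed = append(completed, &s3.CompletedPart{
        ETag:       resp.CopyPartResult.ETag,
        PartNumber: aws.Int64(partNumber),
    })
    partNumber++
}

_, err = svc.CompleteMultipartUpload(&s3.CompleteMultipartUploadInput{
    Bucket:          aws.String(bucket),
    Key:             aws.String(targetPrefix),
    UploadId:        mpu.UploadId,
    MultipartUpload: &s3.CompletedMultipartUpload{Parts: completed},
})
if err != nil {
    panic(errors.Wrap(err, "could not complete MultiPartUpload"))
}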

Related

Upload a struct or object to S3 bucket using GoLang?

I am working with the AWS S3 SDK in GoLang, playing with uploads and downloads to various buckets. I am wondering if there is a simpler way to upload structs or objects directly to the bucket?
I have a struct representing an event:
type Event struct {
    ID        string
    ProcessID string
    TxnID     string
    Inputs    map[string]interface{}
}
I would like to upload this into the S3 bucket, but the code that I found in the documentation only works for uploading strings.
func Save(client S3Client, T interface{}, key string) bool {
    svc := client.S3clientObject
    input := &s3.PutObjectInput{
        Body:   aws.ReadSeekCloser(strings.NewReader("testing this one")),
        Bucket: aws.String(GetS3Bucket()),
        Key:    aws.String(GetObjectKey(T, key)),
        Metadata: map[string]*string{
            "metadata1": aws.String("value1"),
            "metadata2": aws.String("value2"),
        },
    }
This successfully uploads a basic file to the S3 bucket that, when opened, simply reads "testing this one". Is there a way to upload to the bucket so that it uploads an object rather than just a string value?
Any help is appreciated as I am new to Go and S3.
edit
This is the code I'm using for the Get function:
func GetIt(client S3Client, T interface{}, key string) interface{} {
    svc := client.S3clientObject
    s3Key := GetObjectKey(T, key)
    resp, err := svc.GetObject(&s3.GetObjectInput{
        Bucket: aws.String(GetS3Bucket()),
        Key:    aws.String(s3Key),
    })
    if err != nil {
        fmt.Println(err)
        return err
    }
    result := json.NewDecoder(resp.Body).Decode(&T)
    fmt.Println(result)
    return json.NewDecoder(resp.Body).Decode(&T)
}

func main() {
    client := b.CreateS3Client()
    event := b.CreateEvent()
    GetIt(client, event, key)
}
Encode the value as bytes and upload the bytes. Here's how to encode the value as JSON bytes:
func Save(client S3Client, value interface{}, key string) error {
    p, err := json.Marshal(value)
    if err != nil {
        return err
    }
    input := &s3.PutObjectInput{
        Body: aws.ReadSeekCloser(bytes.NewReader(p)),
        …
    }
    …
}
Call Save with the value you want to upload:
value := &Event{ID: "an id", …}
err := Save(…, value, …)
if err != nil {
    // handle error
}
There are many possible encodings, including gob, XML, JSON, msgpack, etc. The best encoding format will depend on your application requirements.
Reverse the process when getting an object:
func GetIt(client S3Client, T interface{}, key string) error {
    svc := client.S3clientObject
    resp, err := svc.GetObject(&s3.GetObjectInput{
        Bucket: aws.String(GetS3Bucket()),
        Key:    aws.String(key),
    })
    if err != nil {
        return err
    }
    return json.NewDecoder(resp.Body).Decode(T)
}
Call GetIt with a pointer to the destination value:
var value model.Event
err := GetIt(client, &value, key)
if err != nil {
    // handle error
}
fmt.Println(value) // prints the decoded value.
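As a rough illustration of the gob option mentioned above, the same Save/Get pair could be sketched like this, reusing the hypothetical S3Client, S3clientObject and GetS3Bucket() helpers from the question:
// Sketch of the same pattern with encoding/gob instead of JSON. Note that gob
// needs concrete types stored in interface{} fields (such as the values of
// Event.Inputs) to be registered with gob.Register first.
func SaveGob(client S3Client, value interface{}, key string) error {
    var buf bytes.Buffer
    if err := gob.NewEncoder(&buf).Encode(value); err != nil {
        return err
    }
    _, err := client.S3clientObject.PutObject(&s3.PutObjectInput{
        Body:   aws.ReadSeekCloser(bytes.NewReader(buf.Bytes())),
        Bucket: aws.String(GetS3Bucket()),
        Key:    aws.String(key),
    })
    return err
}

func GetGob(client S3Client, dst interface{}, key string) error {
    resp, err := client.S3clientObject.GetObject(&s3.GetObjectInput{
        Bucket: aws.String(GetS3Bucket()),
        Key:    aws.String(key),
    })
    if err != nil {
        return err
    }
    defer resp.Body.Close()
    return gob.NewDecoder(resp.Body).Decode(dst)
}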
The example cited here shows that S3 allows you to upload anything that implements the io.Reader interface. The example uses strings.NewReader to create an io.Reader that knows how to provide the specified string to the caller. Your job (according to AWS here) is to figure out how to adapt whatever you need to store into an io.Reader.
You can store the bytes directly, JSON-encoded, like this:
package main

import (
    "bytes"
    "encoding/json"
)

type Event struct {
    ID        string
    ProcessID string
    TxnID     string
    Inputs    map[string]interface{}
}

func main() {
    event := Event{ID: "example"} // placeholder value for illustration

    // To prepare the object for writing
    b, err := json.Marshal(event)
    if err != nil {
        return
    }

    // pass this reader into aws.ReadSeekCloser(...)
    reader := bytes.NewReader(b)
    _ = reader
}
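From there, the reader plugs into the same PutObjectInput shape shown in the question; a rough sketch, where the session, bucket name and key are placeholders:
// Hypothetical wiring of the JSON-encoded reader into a PutObject call.
svc := s3.New(sess)
if _, err := svc.PutObject(&s3.PutObjectInput{
    Body:   aws.ReadSeekCloser(reader),
    Bucket: aws.String("my-bucket"),      // placeholder bucket
    Key:    aws.String("events/example"), // placeholder key
}); err != nil {
    // handle error
}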

S3 images downloading instead of displaying when uploading with Golang

I'm trying to upload an image to AWS S3. The images are saved in the bucket, but when I click on them (their URL) they download instead of displaying. In the past this has been because the Content-Type wasn't set to image/jpeg, but I verified this time that it is.
Here's my code:
func UploadImageToS3(file os.File) error {
    fi, err := file.Stat() // get FileInfo
    if err != nil {
        return errors.New("Couldn't get FileInfo")
    }
    size := fi.Size()
    buffer := make([]byte, size)
    file.Read(buffer)

    tempFileName := "images/picturename.jpg" // key to save under
    putObject := &s3.PutObjectInput{
        Bucket:        aws.String("mybucket"),
        Key:           aws.String(tempFileName),
        ACL:           aws.String("public-read"),
        Body:          bytes.NewReader(buffer),
        ContentLength: aws.Int64(int64(size)),
        // verified is properly getting image/jpeg
        ContentType: aws.String(http.DetectContentType(buffer)),
    }

    _, err = AwsS3.PutObject(putObject)
    if err != nil {
        log.Fatal(err.Error())
        return err
    }
    return nil
}
I also tried making my s3.PutObjectInput like this:
putObject := &s3.PutObjectInput{
    Bucket:               aws.String("mybucket"),
    Key:                  aws.String(tempFileName),
    ACL:                  aws.String("public-read"),
    Body:                 bytes.NewReader(buffer),
    ContentLength:        aws.Int64(int64(size)),
    ContentType:          aws.String(http.DetectContentType(buffer)),
    ContentDisposition:   aws.String("attachment"),
    ServerSideEncryption: aws.String("AES256"),
    StorageClass:         aws.String("INTELLIGENT_TIERING"),
}
What am I doing wrong here?
Figured it out.
Not totally sure why, but I needed to separate all the values.
var size int64 = fi.Size()
buffer := make([]byte, size)
file.Read(buffer)
fileBytes := bytes.NewReader(buffer)
fileType := http.DetectContentType(buffer)
path := "images/test.jpeg"
params := &s3.PutObjectInput{
    Bucket:        aws.String("mybucket"),
    Key:           aws.String(path),
    Body:          fileBytes,
    ContentLength: aws.Int64(size),
    ContentType:   aws.String(fileType),
}
_, err = AwsS3.PutObject(params)
If anyone knows why this works and the previous code doesn't, please share.
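One difference worth noting, though it only applies to the second attempt in the question: ContentDisposition: aws.String("attachment") explicitly tells browsers to download the object rather than render it, regardless of Content-Type. For inline display, that field can be omitted or set to inline, roughly:
// Keep the image Content-Type and avoid the attachment disposition so
// browsers render the object inline instead of downloading it.
putObject := &s3.PutObjectInput{
    Bucket:             aws.String("mybucket"),
    Key:                aws.String(tempFileName),
    ACL:                aws.String("public-read"),
    Body:               bytes.NewReader(buffer),
    ContentType:        aws.String(http.DetectContentType(buffer)),
    ContentDisposition: aws.String("inline"), // or leave this field out entirely
}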

How to recursively delete objects

I would like to delete all .JPEG files from a specified path in an S3 bucket. For example, let's say that I have a structure on S3 similar to the following:
Obj1/
  Obj2/
    Obj3/
      image_1.jpeg
      ...
      image_N.jpeg
Is it possible to specify Obj1/Obj2/Obj3 as DeleteObjectsInput's prefix and recursively delete all .JPEG files under that prefix?
Here is my code:
func (s3Obj S3) Delete() error {
    sess := session.Must(session.NewSession(&aws.Config{
        Region: aws.String(s3Obj.Region),
    }))
    svc := s3.New(sess)

    input := &s3.DeleteObjectsInput{
        Bucket: aws.String(s3Obj.Bucket),
        Delete: &s3.Delete{
            Objects: []*s3.ObjectIdentifier{
                {
                    Key: aws.String(s3Obj.ItemPath),
                },
            },
            Quiet: aws.Bool(false),
        },
    }

    result, err := svc.DeleteObjects(input)
    if err != nil {
        if aerr, ok := err.(awserr.Error); ok {
            switch aerr.Code() {
            default:
                glog.Errorf("Error occurred while trying to delete object from S3. Error message - %v", aerr.Error())
            }
        } else {
            glog.Errorf("Error occurred while trying to delete object from S3. Error message - %v", err.Error())
        }
        return err
    }
    glog.Info(result)
    return nil
}
s3Obj.ItemPath represents the Obj1/Obj2/Obj3 path from the example above. This function does not return any error. I actually get the following message:
Deleted: [{
Key: "Obj1/Obj2/Obj3"
}]
But when I check my S3 cloud service, nothing is done. What am I doing wrong?
EDIT
I've changed my code so my Delete function accepts a list of objects, from which I build a list of s3.ObjectIdentifier. There are roughly 50 .JPEG files in that list, and for some reason the following code ONLY DELETES THE LAST ONE. I am not sure why.
func (s3Obj S3) Delete(objects []string) error {
    sess := session.Must(session.NewSession(&aws.Config{
        Region: aws.String(s3Obj.Region),
    }))
    svc := s3.New(sess)

    var objKeys = make([]*s3.ObjectIdentifier, len(objects))
    for i, v := range objects {
        glog.Info("About to delete: ", v)
        objKeys[i] = &s3.ObjectIdentifier{
            Key: &v,
        }
    }

    input := &s3.DeleteObjectsInput{
        Bucket: aws.String(s3Obj.Bucket),
        Delete: &s3.Delete{
            Objects: objKeys,
            Quiet:   aws.Bool(false),
        },
    }

    result, err := svc.DeleteObjects(input)
    if err != nil {
        if aerr, ok := err.(awserr.Error); ok {
            switch aerr.Code() {
            default:
                glog.Errorf("Error occurred while trying to delete object from S3. Error message - %v", aerr.Error())
            }
        } else {
            glog.Errorf("Error occurred while trying to delete object from S3. Error message - %v", err.Error())
        }
        return err
    }
    glog.Info(result)
    return nil
}
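Two things are worth noting about the snippets above. First, DeleteObjects only removes the exact keys it is given, so passing the prefix Obj1/Obj2/Obj3 as a Key reports success without deleting anything; the keys under the prefix have to be listed and passed individually. Second, in the edited version, Key: &v stores the address of the loop variable v, which (with the loop semantics prior to Go 1.22) is reused on every iteration, so every ObjectIdentifier ends up pointing at the last key in the slice. A minimal sketch of the loop without that aliasing:
// Give each ObjectIdentifier its own copy of the key instead of a pointer
// to the shared loop variable.
objKeys := make([]*s3.ObjectIdentifier, len(objects))
for i := range objects {
    key := objects[i] // fresh variable per iteration
    glog.Info("About to delete: ", key)
    objKeys[i] = &s3.ObjectIdentifier{
        Key: aws.String(key),
    }
}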

AccessDenied being encountered while using UploadPartCopy to MultiPartUpload in Golang

I am attempting to use S3 MultipartUpload to concat files in an S3 bucket. If you have several files >5MB (the last file can be smaller), you can concatenate them in S3 into a larger file. It's basically the equivalent of using cat to merge files together. When I attempt to do this in Go, I get:
An error occurred (AccessDenied) when calling the UploadPartCopy operation: Access Denied
The code looks kind of like this:
mpuOut, err := s3CreateMultipartUpload(&S3.CreateMultipartUploadInput{
    Bucket: aws.String(bucket),
    Key:    aws.String(concatenatedFile),
})
if err != nil {
    return err
}

var ps []*S3.CompletedPart
for i, part := range parts { // parts is a list of paths to things in s3
    partNumber := int64(i) + 1
    upOut, err := s3UploadPartCopy(&S3.UploadPartCopyInput{
        Bucket:     aws.String(bucket),
        CopySource: aws.String(part),
        Key:        aws.String(concatenatedFile),
        UploadId:   aws.String(*mpuOut.UploadId),
        PartNumber: aws.Int64(partNumber),
    })
    if err != nil {
        return err // <- fails here
    }
    ps = append(ps, &S3.CompletedPart{
        ETag:       upOut.CopyPartResult.ETag,
        PartNumber: aws.Int64(partNumber),
    })
}

_, err = s3CompleteMultipartUpload(&S3.CompleteMultipartUploadInput{
    Bucket:          aws.String(bucket),
    Key:             aws.String(concatenatedFile),
    MultipartUpload: &S3.CompletedMultipartUpload{Parts: ps},
    UploadId:        aws.String(*mpuOut.UploadId),
})
if err != nil {
    return err
}
When it runs, it blows up with the error above. The permissions on the bucket are wide open. Any ideas?
OK, so the problem is that when you are doing an UploadPartCopy, for the CopySource parameter, you don't just use the path in the S3 bucket. You have to put the bucket name at the front of the path, even if it is in the same bucket. Derp.
mpuOut, err := s3CreateMultipartUpload(&S3.CreateMultipartUploadInput{
    Bucket: aws.String(bucket),
    Key:    aws.String(concatenatedFile),
})
if err != nil {
    return err
}

var ps []*S3.CompletedPart
for i, part := range parts { // parts is a list of paths to things in s3
    partNumber := int64(i) + 1
    upOut, err := s3UploadPartCopy(&S3.UploadPartCopyInput{
        Bucket:     aws.String(bucket),
        CopySource: aws.String(fmt.Sprintf("%s/%s", bucket, part)), // <- ugh
        Key:        aws.String(concatenatedFile),
        UploadId:   aws.String(*mpuOut.UploadId),
        PartNumber: aws.Int64(partNumber),
    })
    if err != nil {
        return err
    }
    ps = append(ps, &S3.CompletedPart{
        ETag:       upOut.CopyPartResult.ETag,
        PartNumber: aws.Int64(partNumber),
    })
}

_, err = s3CompleteMultipartUpload(&S3.CompleteMultipartUploadInput{
    Bucket:          aws.String(bucket),
    Key:             aws.String(concatenatedFile),
    MultipartUpload: &S3.CompletedMultipartUpload{Parts: ps},
    UploadId:        aws.String(*mpuOut.UploadId),
})
if err != nil {
    return err
}
This just wasted about an hour of my life, so I figure I would try to save someone else the trouble.
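A related gotcha worth mentioning (not the cause of the AccessDenied here): the CopySource value is "sourcebucket/sourcekey" and, per the S3 API documentation, it should be URL-encoded, which starts to matter once keys contain spaces or other special characters. For example:
// CopySource takes "bucket/key"; URL-encoding it yourself is the safe option
// for keys with special characters, since the SDK does not encode it for you.
copySource := url.PathEscape(fmt.Sprintf("%s/%s", bucket, part))
// then: CopySource: aws.String(copySource)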

Golang Aws S3 NoSuchKey: The specified key does not exist

I'm trying to download objects from S3; the following is my code:
func listFile(bucket, prefix string) error {
    svc := s3.New(sess)
    params := &s3.ListObjectsInput{
        Bucket: aws.String(bucket), // Required
        Prefix: aws.String(prefix),
    }
    return svc.ListObjectsPages(params, func(p *s3.ListObjectsOutput, lastPage bool) bool {
        for _, o := range p.Contents {
            //log.Println(*o.Key)
            log.Println(*o.Key)
            download(bucket, *o.Key)
            return true
        }
        return lastPage
    })
}

func download(bucket, key string) {
    logDir := conf.Cfg.Section("share").Key("LOG_DIR").MustString(".")
    tmpLogPath := filepath.Join(logDir, bucket, key)

    s3Svc := s3.New(sess)
    downloader := s3manager.NewDownloaderWithClient(s3Svc, func(d *s3manager.Downloader) {
        d.PartSize = 2 * 1024 * 1024 // 2MB per part
    })

    f, err := os.OpenFile(tmpLogPath, os.O_CREATE|os.O_WRONLY, 0644)
    if _, err = downloader.Download(f, &s3.GetObjectInput{
        Bucket: aws.String(bucket),
        Key:    aws.String(key),
    }); err != nil {
        log.Fatal(err)
    }
    f.Close()
}

func main() {
    bucket := "mybucket"
    key := "myprefix"
    listFile(bucket, key)
}
I can get the object list in listFile(), but a 404 is returned when I call download(). Why?
I had the same problem with recent versions of the library. Sometimes, the object key will be prefixed with a "./" that the SDK will remove by default making the download fail.
Try adding this to your aws.Config and see if it helps:
config := aws.Config{
    ...
    DisableRestProtocolURICleaning: aws.Bool(true),
}
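For completeness, a sketch of plugging that option into the session and client used above (the region value is a placeholder):
sess := session.Must(session.NewSession(&aws.Config{
    Region:                         aws.String("us-east-1"), // placeholder region
    DisableRestProtocolURICleaning: aws.Bool(true),
}))
svc := s3.New(sess)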
I submitted an issue.