I want to search files in an AWS S3 bucket by creation time (or LastModified time) in Go. I know how to do this in Python, where a boto3 paginator lets you provide a query string, but I want to achieve the same in Go.
Any suggestion or sample in Go would be appreciated.
Sample code I am trying to list all files:
maxFileRead := int64(15)
bucket := "XXX-XXX-test"
for {
    // Use the continuation token only if the previous result was truncated
    if s.IsTruncated {
        fileList, err = s.session.ListObjectsV2(&s3.ListObjectsV2Input{
            Bucket:            aws.String(bucket),
            MaxKeys:           aws.Int64(maxFileRead),
            ContinuationToken: &s.NextContinuationToken,
        })
    } else {
        fileList, err = s.session.ListObjectsV2(&s3.ListObjectsV2Input{
            Bucket:  aws.String(bucket),
            MaxKeys: aws.Int64(maxFileRead),
        })
    }
    if err != nil {
        if aerr, ok := err.(awserr.Error); ok {
            switch aerr.Code() {
            case s3.ErrCodeNoSuchBucket:
                fmt.Println(s3.ErrCodeNoSuchBucket, aerr.Error())
            default:
                fmt.Println(aerr.Error())
            }
        } else {
            // Print the error; cast err to awserr.Error to get the Code and
            // Message from the error.
            fmt.Println(err.Error())
        }
        break // don't dereference fileList after a failed call
    }
    s.IsTruncated = *fileList.IsTruncated
    if !s.IsTruncated {
        s.NextContinuationToken = ""
        break // all objects have been listed
    }
    s.NextContinuationToken = *fileList.NextContinuationToken
}
Now I want to modify the search to only list files created after a particular time.
Call ListObjectsV2 (https://docs.aws.amazon.com/sdk-for-go/api/service/s3/#S3.ListObjectsV2) on each bucket.
The Contents property returned is a list of metadata about each bucket object.
Use the LastModified field.
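For example, here is a minimal sketch (SDK v1, to match the question's code; the bucket name is the question's placeholder and the cutoff is an arbitrary example). Note that S3 has no server-side time filter on ListObjectsV2, so the comparison has to happen client-side:

package main

import (
    "fmt"
    "log"
    "time"

    "github.com/aws/aws-sdk-go/aws"
    "github.com/aws/aws-sdk-go/aws/session"
    "github.com/aws/aws-sdk-go/service/s3"
)

func main() {
    sess := session.Must(session.NewSession())
    svc := s3.New(sess)

    // Example cutoff: keep only objects modified within the last day.
    cutoff := time.Now().AddDate(0, 0, -1)

    err := svc.ListObjectsV2Pages(&s3.ListObjectsV2Input{
        Bucket: aws.String("XXX-XXX-test"),
    }, func(page *s3.ListObjectsV2Output, lastPage bool) bool {
        for _, obj := range page.Contents {
            // Filter client-side on the LastModified field.
            if obj.LastModified != nil && obj.LastModified.After(cutoff) {
                fmt.Println(*obj.Key, obj.LastModified)
            }
        }
        return true // keep paginating
    })
    if err != nil {
        log.Fatal(err)
    }
}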
I'm using the following code to get metadata from an S3 object after listing all the objects in a bucket, but I don't know why it gives the error undefined: s3.HeadObject when running go run listObjects.go -bucket xxxx -prefix xxxx.
I tried two solutions: passing the client created from the config, and creating it from the context as shown in this link [1]. But both gave the same error. Can you give me any clue? Thanks in advance :)
package main

import (
    "context"
    "flag"
    "fmt"
    "log"

    "github.com/aws/aws-sdk-go-v2/config"
    "github.com/aws/aws-sdk-go-v2/service/s3"
)

var (
    bucketName      string
    objectPrefix    string
    objectDelimiter string
    maxKeys         int
)

func init() {
    flag.StringVar(&bucketName, "bucket", "", "The `name` of the S3 bucket to list objects from.")
    flag.StringVar(&objectPrefix, "prefix", "", "The optional `object prefix` of the S3 Object keys to list.")
    flag.StringVar(&objectDelimiter, "delimiter", "",
        "The optional `object key delimiter` used by S3 List objects to group object keys.")
    flag.IntVar(&maxKeys, "max-keys", 0,
        "The maximum number of `keys per page` to retrieve at once.")
}

// Lists all objects in a bucket using pagination
func main() {
    flag.Parse()
    if len(bucketName) == 0 {
        flag.PrintDefaults()
        log.Fatalf("invalid parameters, bucket name required")
    }

    // Load the SDK's configuration from environment and shared config, and
    // create the client with this.
    cfg, err := config.LoadDefaultConfig(context.TODO())
    if err != nil {
        log.Fatalf("failed to load SDK configuration, %v", err)
    }

    client := s3.NewFromConfig(cfg)

    // Set the parameters based on the CLI flag inputs.
    params := &s3.ListObjectsV2Input{
        Bucket: &bucketName,
    }
    if len(objectPrefix) != 0 {
        params.Prefix = &objectPrefix
    }
    if len(objectDelimiter) != 0 {
        params.Delimiter = &objectDelimiter
    }

    // Create the Paginator for the ListObjectsV2 operation.
    p := s3.NewListObjectsV2Paginator(client, params, func(o *s3.ListObjectsV2PaginatorOptions) {
        if v := int32(maxKeys); v != 0 {
            o.Limit = v
        }
    })

    // Iterate through the S3 object pages, printing each object returned.
    var i int
    log.Println("Objects:")
    for p.HasMorePages() {
        i++
        // Next Page takes a new context for each page retrieval. This is where
        // you could add timeouts or deadlines.
        page, err := p.NextPage(context.TODO())
        if err != nil {
            log.Fatalf("failed to get page %v, %v", i, err)
        }
        // Log the objects found
        // Headobject function is called
        for _, obj := range page.Contents {
            input := &s3.HeadObjectInput{
                Bucket: &bucketName,
                Key:    obj.Key,
            }
            result, err := &s3.HeadObject(client, input)
            if err != nil {
                panic(err)
            }
            fmt.Println("Object:", *obj.Key)
        }
    }
}
./listObjects.go:86:20: undefined: s3.HeadObject
Doing the HeadObject call in an auxiliary function works:
package main

import (
    "context"
    "flag"
    "fmt"
    "log"

    "github.com/aws/aws-sdk-go-v2/config"
    "github.com/aws/aws-sdk-go-v2/service/s3"
)

var (
    bucketName      string
    objectPrefix    string
    objectDelimiter string
    maxKeys         int
)

func init() {
    flag.StringVar(&bucketName, "bucket", "", "The `name` of the S3 bucket to list objects from.")
    flag.StringVar(&objectPrefix, "prefix", "", "The optional `object prefix` of the S3 Object keys to list.")
    flag.StringVar(&objectDelimiter, "delimiter", "",
        "The optional `object key delimiter` used by S3 List objects to group object keys.")
    flag.IntVar(&maxKeys, "max-keys", 0,
        "The maximum number of `keys per page` to retrieve at once.")
}

// Lists all objects in a bucket using pagination
func main() {
    flag.Parse()
    if len(bucketName) == 0 {
        flag.PrintDefaults()
        log.Fatalf("invalid parameters, bucket name required")
    }

    // Load the SDK's configuration from environment and shared config, and
    // create the client with this.
    cfg, err := config.LoadDefaultConfig(context.TODO())
    if err != nil {
        log.Fatalf("failed to load SDK configuration, %v", err)
    }

    client := s3.NewFromConfig(cfg)

    // Set the parameters based on the CLI flag inputs.
    params := &s3.ListObjectsV2Input{
        Bucket: &bucketName,
    }
    if len(objectPrefix) != 0 {
        params.Prefix = &objectPrefix
    }
    if len(objectDelimiter) != 0 {
        params.Delimiter = &objectDelimiter
    }

    // Create the Paginator for the ListObjectsV2 operation.
    p := s3.NewListObjectsV2Paginator(client, params, func(o *s3.ListObjectsV2PaginatorOptions) {
        if v := int32(maxKeys); v != 0 {
            o.Limit = v
        }
    })

    // Iterate through the S3 object pages, printing each object returned.
    var i int
    log.Println("Objects:")
    for p.HasMorePages() {
        i++
        // NextPage takes a new context for each page retrieval. This is where
        // you could add timeouts or deadlines.
        page, err := p.NextPage(context.TODO())
        if err != nil {
            log.Fatalf("failed to get page %v, %v", i, err)
        }
        // Log each object found and call HeadObject on it
        for _, obj := range page.Contents {
            fmt.Println("Object:", *obj.Key)
            OpHeadObject(client, bucketName, *obj.Key)
        }
    }
}

func OpHeadObject(client *s3.Client, bucketName, objectName string) {
    input := &s3.HeadObjectInput{
        Bucket: &bucketName,
        Key:    &objectName,
    }
    resp, err := client.HeadObject(context.TODO(), input)
    if err != nil {
        panic(err)
    }
    fmt.Println(resp.StorageClass) // or whichever metadata field you want
}
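Note that the HeadObjectOutput response also carries other metadata, such as LastModified, ContentLength, and Metadata (the user-defined metadata map), if you need more than the storage class.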
I am working with the AWS S3 SDK in Go, playing with uploads and downloads to various buckets. I am wondering if there is a simpler way to upload structs or objects directly to a bucket?
I have a struct representing an event:
type Event struct {
    ID        string
    ProcessID string
    TxnID     string
    Inputs    map[string]interface{}
}
I would like to upload this struct into the S3 bucket, but the code I found in the documentation only works for uploading strings.
func Save(client S3Client, T interface{}, key string) bool {
    svc := client.S3clientObject
    input := &s3.PutObjectInput{
        Body:   aws.ReadSeekCloser(strings.NewReader("testing this one")),
        Bucket: aws.String(GetS3Bucket()),
        Key:    aws.String(GetObjectKey(T, key)),
        Metadata: map[string]*string{
            "metadata1": aws.String("value1"),
            "metadata2": aws.String("value2"),
        },
    }
This successfully uploads a basic file to the S3 bucket that, when opened, simply reads "testing this one". Is there a way to upload an object to the bucket rather than just a string value?
Any help is appreciated, as I am new to Go and S3.
Edit: this is the code I'm using for the Get function:
func GetIt(client S3Client, T interface{}, key string) interface{} {
    svc := client.S3clientObject
    s3Key := GetObjectKey(T, key)
    resp, err := svc.GetObject(&s3.GetObjectInput{
        Bucket: aws.String(GetS3Bucket()),
        Key:    aws.String(s3Key),
    })
    if err != nil {
        fmt.Println(err)
        return err
    }
    result := json.NewDecoder(resp.Body).Decode(&T)
    fmt.Println(result)
    return json.NewDecoder(resp.Body).Decode(&T)
}

func main() {
    client := b.CreateS3Client()
    event := b.CreateEvent()
    GetIt(client, event, key)
}
Encode the value as bytes and upload the bytes. Here's how to encode the value as JSON bytes:
func Save(client S3Client, value interface{}, key string) error {
    p, err := json.Marshal(value)
    if err != nil {
        return err
    }
    input := &s3.PutObjectInput{
        Body: aws.ReadSeekCloser(bytes.NewReader(p)),
        …
    }
    …
}
Call Save with the value you want to upload:
value := &Event{ID: "an id", …}
err := Save(…, value, …)
if err != nil {
    // handle error
}
There are many possible encodings, including gob, XML, JSON, msgpack, and so on. The best encoding format will depend on your application requirements.
Reverse the process when getting an object:
func GetIt(client S3Client, T interface{}, key string) error {
    svc := client.S3clientObject
    resp, err := svc.GetObject(&s3.GetObjectInput{
        Bucket: aws.String(GetS3Bucket()),
        Key:    aws.String(key),
    })
    if err != nil {
        return err
    }
    return json.NewDecoder(resp.Body).Decode(T)
}
Call GetIt with a pointer to the destination value:
var value model.Event
err := GetIt(client, &value, key)
if err != nil {
    // handle error
}
fmt.Println(value) // prints the decoded value.
The example cited here shows that S3 allows you to upload anything that implements the io.Reader interface. The example uses strings.NewReader to create an io.Reader that knows how to provide the specified string to the caller. Your job (according to AWS here) is to figure out how to adapt whatever you need to store into an io.Reader.
You can JSON-encode the value and store the resulting bytes directly, like this:
package main

import (
    "bytes"
    "encoding/json"
)

type Event struct {
    ID        string
    ProcessID string
    TxnID     string
    Inputs    map[string]interface{}
}

func main() {
    // Example payload (the original snippet left event undefined)
    event := Event{ID: "id-1", ProcessID: "proc-1", TxnID: "txn-1"}

    // To prepare the object for writing
    b, err := json.Marshal(event)
    if err != nil {
        return
    }

    // pass this reader into aws.ReadSeekCloser(...)
    reader := bytes.NewReader(b)
    _ = reader
}
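To tie it back to the question's Save function, here is a hedged sketch of the upload itself (SDK v1; SaveEvent, the svc parameter, bucket, and key are hypothetical names for illustration, not from the original):

// SaveEvent JSON-encodes an Event and uploads the bytes (a sketch, SDK v1).
func SaveEvent(svc *s3.S3, bucket, key string, e Event) error {
    b, err := json.Marshal(e)
    if err != nil {
        return err
    }
    _, err = svc.PutObject(&s3.PutObjectInput{
        Body:   aws.ReadSeekCloser(bytes.NewReader(b)),
        Bucket: aws.String(bucket),
        Key:    aws.String(key),
    })
    return err
}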
I'm using AWS SDK v2, and I need to delete a bucket that has objects in it.
What's the best way to do so? Is there a force-delete option, or something that deletes all the objects inside a bucket first?
The AWS documentation Deleting a bucket describes how to delete a bucket that has objects. The documentation also provides an SDK example (written in Java, but it mainly serves as a guideline) that performs the following steps:
Delete all objects
Delete all object versions (for versioned buckets)
Finally, delete the bucket itself
There is no "force delete" option for non-empty buckets. You would need to implement the above steps.
The following sample code shows how to completely delete a non-empty bucket:
func main() {
    cfg, err := config.LoadDefaultConfig(context.TODO(), config.WithRegion("us-east-1"))
    if err != nil {
        log.Fatalf("Failed to load config: %v", err)
    }

    bucket := aws.String("your-bucket-name")
    client := s3.NewFromConfig(cfg)

    // Delete a single object (optionally a specific version).
    deleteObject := func(bucket, key, versionId *string) {
        log.Printf("Object: %s/%s\n", *key, aws.ToString(versionId))
        _, err := client.DeleteObject(context.TODO(), &s3.DeleteObjectInput{
            Bucket:    bucket,
            Key:       key,
            VersionId: versionId,
        })
        if err != nil {
            log.Fatalf("Failed to delete object: %v", err)
        }
    }

    // First pass: delete all current objects.
    in := &s3.ListObjectsV2Input{Bucket: bucket}
    for {
        out, err := client.ListObjectsV2(context.TODO(), in)
        if err != nil {
            log.Fatalf("Failed to list objects: %v", err)
        }
        for _, item := range out.Contents {
            deleteObject(bucket, item.Key, nil)
        }
        if out.IsTruncated {
            in.ContinuationToken = out.NextContinuationToken
        } else {
            break
        }
    }

    // Second pass: delete all versions and delete markers (versioned buckets).
    inVer := &s3.ListObjectVersionsInput{Bucket: bucket}
    for {
        out, err := client.ListObjectVersions(context.TODO(), inVer)
        if err != nil {
            log.Fatalf("Failed to list version objects: %v", err)
        }
        for _, item := range out.DeleteMarkers {
            deleteObject(bucket, item.Key, item.VersionId)
        }
        for _, item := range out.Versions {
            deleteObject(bucket, item.Key, item.VersionId)
        }
        if out.IsTruncated {
            inVer.VersionIdMarker = out.NextVersionIdMarker
            inVer.KeyMarker = out.NextKeyMarker
        } else {
            break
        }
    }

    // Finally, delete the now-empty bucket.
    _, err = client.DeleteBucket(context.TODO(), &s3.DeleteBucketInput{Bucket: bucket})
    if err != nil {
        log.Fatalf("Failed to delete bucket: %v", err)
    }
}
You should probably optimize this further and use DeleteObjects for batch calls in order to reduce request overhead.
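As a sketch of that optimization (SDK v2; the deleteBatch helper is hypothetical, the types import comes from github.com/aws/aws-sdk-go-v2/service/s3/types, and S3 caps each DeleteObjects request at 1000 keys):

package main

import (
    "context"
    "log"

    "github.com/aws/aws-sdk-go-v2/aws"
    "github.com/aws/aws-sdk-go-v2/config"
    "github.com/aws/aws-sdk-go-v2/service/s3"
    "github.com/aws/aws-sdk-go-v2/service/s3/types"
)

// deleteBatch deletes up to 1000 keys in a single request
// instead of one call per object.
func deleteBatch(client *s3.Client, bucket string, ids []types.ObjectIdentifier) {
    if len(ids) == 0 {
        return
    }
    _, err := client.DeleteObjects(context.TODO(), &s3.DeleteObjectsInput{
        Bucket: aws.String(bucket),
        Delete: &types.Delete{Objects: ids},
    })
    if err != nil {
        log.Fatalf("Failed to delete batch: %v", err)
    }
}

func main() {
    cfg, err := config.LoadDefaultConfig(context.TODO())
    if err != nil {
        log.Fatalf("Failed to load config: %v", err)
    }
    client := s3.NewFromConfig(cfg)
    bucket := "your-bucket-name"

    // Collect keys page by page and delete them in batches of up to 1000.
    p := s3.NewListObjectsV2Paginator(client, &s3.ListObjectsV2Input{Bucket: aws.String(bucket)})
    for p.HasMorePages() {
        page, err := p.NextPage(context.TODO())
        if err != nil {
            log.Fatalf("Failed to list objects: %v", err)
        }
        ids := make([]types.ObjectIdentifier, 0, len(page.Contents))
        for _, obj := range page.Contents {
            ids = append(ids, types.ObjectIdentifier{Key: obj.Key})
        }
        deleteBatch(client, bucket, ids)
    }
}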
I would like to delete all .JPEG files under a specified path in an S3 bucket. For example, let's say I have a structure on S3 similar to the following:
Obj1/
Obj2/
Obj3/
image_1.jpeg
...
image_N.jpeg
Is it possible to specify Obj1/Obj2/Obj3 as DeleteObjectsInput's prefix and recursively delete all .JPEG files under that prefix?
Here is my code:
func (s3Obj S3) Delete() error {
    sess := session.Must(session.NewSession(&aws.Config{
        Region: aws.String(s3Obj.Region),
    }))
    svc := s3.New(sess)

    input := &s3.DeleteObjectsInput{
        Bucket: aws.String(s3Obj.Bucket),
        Delete: &s3.Delete{
            Objects: []*s3.ObjectIdentifier{
                {
                    Key: aws.String(s3Obj.ItemPath),
                },
            },
            Quiet: aws.Bool(false),
        },
    }

    result, err := svc.DeleteObjects(input)
    if err != nil {
        if aerr, ok := err.(awserr.Error); ok {
            switch aerr.Code() {
            default:
                glog.Errorf("Error occurred while trying to delete object from S3. Error message - %v", aerr.Error())
            }
        } else {
            glog.Errorf("Error occurred while trying to delete object from S3. Error message - %v", err.Error())
        }
        return err
    }
    glog.Info(result)
    return nil
}
s3Obj.ItemPath represents the Obj1/Obj2/Obj3 path from the example above. This function returns no error; I actually get the following message:
Deleted: [{
Key: "Obj1/Obj2/Obj3"
}]
But when I check my S3 cloud service, nothing is done. What am I doing wrong?
EDIT
I've changed my code so that my Delete function accepts a list of objects, from which I build a list of s3.ObjectIdentifier. There are roughly 50 .JPEG files in that list, and for some reason the following code only deletes the last one. I am not sure why.
func (s3Obj S3) Delete(objects []string) error {
    sess := session.Must(session.NewSession(&aws.Config{
        Region: aws.String(s3Obj.Region),
    }))
    svc := s3.New(sess)

    var objKeys = make([]*s3.ObjectIdentifier, len(objects))
    for i, v := range objects {
        glog.Info("About to delete: ", v)
        objKeys[i] = &s3.ObjectIdentifier{
            Key: &v,
        }
    }

    input := &s3.DeleteObjectsInput{
        Bucket: aws.String(s3Obj.Bucket),
        Delete: &s3.Delete{
            Objects: objKeys,
            Quiet:   aws.Bool(false),
        },
    }

    result, err := svc.DeleteObjects(input)
    if err != nil {
        if aerr, ok := err.(awserr.Error); ok {
            switch aerr.Code() {
            default:
                glog.Errorf("Error occurred while trying to delete object from S3. Error message - %v", aerr.Error())
            }
        } else {
            glog.Errorf("Error occurred while trying to delete object from S3. Error message - %v", err.Error())
        }
        return err
    }
    glog.Info(result)
    return nil
}
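A note not in the original thread, but a likely explanation for both observations: DeleteObjects matches exact keys, not prefixes, and S3 reports a Deleted entry even for a key that never existed, which is why the first attempt looked successful while deleting nothing; to delete by prefix you must list the keys first, as the edit does. The edit's remaining bug is that Key: &v stores the address of the loop variable, which (before Go 1.22) is reused across iterations, so every ObjectIdentifier ends up pointing at the final key. Copying the value fixes it:

for i, v := range objects {
    glog.Info("About to delete: ", v)
    objKeys[i] = &s3.ObjectIdentifier{
        Key: aws.String(v), // aws.String copies v; &v would alias the shared loop variable
    }
}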
Goal: to empty an existing S3 bucket using the AWS SDK for Go.
The AWS SDK now has a BatchDeleteIterator that can do the job. An example is provided in the Amazon docs.
package main

import (
    "fmt"
    "os"

    "github.com/aws/aws-sdk-go/aws"
    "github.com/aws/aws-sdk-go/aws/session"
    "github.com/aws/aws-sdk-go/service/s3"
    "github.com/aws/aws-sdk-go/service/s3/s3manager"
)

// go run s3_delete_objects BUCKET
func main() {
    if len(os.Args) != 2 {
        exitErrorf("Bucket name required\nUsage: %s BUCKET", os.Args[0])
    }
    bucket := os.Args[1]

    // Initialize a session in us-west-2 that the SDK will use to load
    // credentials from the shared credentials file ~/.aws/credentials.
    sess, _ := session.NewSession(&aws.Config{
        Region: aws.String("us-west-2"),
    })

    // Create S3 service client
    svc := s3.New(sess)

    // Set up BatchDeleteIterator to iterate through a list of objects.
    iter := s3manager.NewDeleteListIterator(svc, &s3.ListObjectsInput{
        Bucket: aws.String(bucket),
    })

    // Traverse iterator, deleting each object
    if err := s3manager.NewBatchDeleteWithClient(svc).Delete(aws.BackgroundContext(), iter); err != nil {
        exitErrorf("Unable to delete objects from bucket %q, %v", bucket, err)
    }

    fmt.Printf("Deleted object(s) from bucket: %s", bucket)
}

func exitErrorf(msg string, args ...interface{}) {
    fmt.Fprintf(os.Stderr, msg+"\n", args...)
    os.Exit(1)
}
NOTE: these are code snippets that might require changes on your side to make them run.
You will need to implement a method like the one below:
// EmptyBucket empties the Amazon S3 bucket
func (s awsS3) EmptyBucket(bucket string) error {
    log.Info("removing objects from S3 bucket : ", bucket)
    params := &s3.ListObjectsInput{
        Bucket: aws.String(bucket),
    }
    for {
        // Request a batch of objects from the s3 bucket
        objects, err := s.Client.ListObjects(params)
        if err != nil {
            return err
        }

        // Check if the bucket is already empty
        if len(objects.Contents) == 0 {
            log.Info("Bucket is already empty")
            return nil
        }
        log.Info("First object in batch | ", *objects.Contents[0].Key)

        // Create an array of pointers to ObjectIdentifier
        objectsToDelete := make([]*s3.ObjectIdentifier, 0, 1000)
        for _, object := range objects.Contents {
            obj := s3.ObjectIdentifier{
                Key: object.Key,
            }
            objectsToDelete = append(objectsToDelete, &obj)
        }

        // Create the payload for the bulk delete
        deleteArray := s3.Delete{Objects: objectsToDelete}
        deleteParams := &s3.DeleteObjectsInput{
            Bucket: aws.String(bucket),
            Delete: &deleteArray,
        }

        // Run the bulk delete job (limit 1000 keys per request)
        _, err = s.Client.DeleteObjects(deleteParams)
        if err != nil {
            return err
        }

        if *objects.IsTruncated { // if there are more objects in the bucket, IsTruncated = true
            params.Marker = deleteParams.Delete.Objects[len(deleteParams.Delete.Objects)-1].Key
            log.Info("Requesting next batch | ", *params.Marker)
        } else { // all objects in the bucket have been cleaned up
            break
        }
    }
    log.Info("Emptied S3 bucket : ", bucket)
    return nil
}
UPDATE: The latest version of the AWS SDK for Go has resolved the prior issue I had.
The AWS SDK for Go has an Amazon S3 batching abstraction. Take a look here.
Don't forget that by default ListObjects only returns up to 1000 bucket items. If you might have more than 1000, check the IsTruncated property on the return value. If true, use the NextMarker property from the return value to get the next 1000 items.
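A minimal paging sketch (SDK v1, assuming svc is an *s3.S3 client; one caveat worth knowing is that ListObjects only populates NextMarker when a Delimiter is set, so the sketch falls back to the last returned key otherwise):

params := &s3.ListObjectsInput{Bucket: aws.String("your-bucket-name")}
for {
    out, err := svc.ListObjects(params)
    if err != nil {
        log.Fatal(err)
    }
    for _, obj := range out.Contents {
        fmt.Println(*obj.Key)
    }
    if !aws.BoolValue(out.IsTruncated) {
        break
    }
    // NextMarker is only set when a Delimiter was specified;
    // otherwise use the last key of the current page as the next Marker.
    if out.NextMarker != nil {
        params.Marker = out.NextMarker
    } else {
        params.Marker = out.Contents[len(out.Contents)-1].Key
    }
}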
See my example in the Go dev guide: http://docs.aws.amazon.com/sdk-for-go/v1/developer-guide/s3-example-basic-bucket-operations.html#s3-examples-bucket-ops-delete-all-bucket-items