We currently publish data to an S3 bucket. Multiple clients now need to consume the data stored in that bucket, and each client wants it in their own bucket. The ask is to publish the data to every client bucket.
Option 1: Have our publisher publish to each S3 bucket.
Cons: more logic in our publishing application, which would have to handle failures and retries per client.
Option 2: Use S3's Cross-region replication
Reason against it: even though we can transfer objects to other accounts, only one destination can be specified. Also, if the source bucket has server-side encryption, we cannot replicate.
Option 3: AWS Lambda. Have S3 invoke a Lambda function, and have the Lambda function publish to multiple buckets.
Confused: not sure how different this is from option 1.
Option 4: Restrict access to our S3 bucket to read-only and have clients read from it. But how would clients know whether an object has already been read? I would rather not use time-based folders: we have multiple publishers writing to this S3 bucket, and clients can't know for sure whether a folder is complete.
Is there any good option to solve the above problem?
I would go with option 3, Lambda. Your Lambda function could be triggered by S3 events so you wouldn't have to add any manual steps or change your current publishing process at all.
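A minimal sketch of what such a function might look like, assuming boto3 in the Lambda Python runtime and an S3 "object created" trigger on the source bucket. The destination bucket names and the `fan_out` helper are hypothetical, not from the original post.

```python
import urllib.parse

# Hypothetical destination buckets, one per client (an assumption for this sketch).
DESTINATION_BUCKETS = ("client-a-bucket", "client-b-bucket")


def fan_out(event, s3_client, destinations=DESTINATION_BUCKETS):
    """Copy every object named in an S3 event notification to each destination.

    Returns the list of (destination_bucket, key) pairs that were copied.
    """
    copied = []
    for record in event.get("Records", []):
        source_bucket = record["s3"]["bucket"]["name"]
        # Object keys in S3 event notifications are URL-encoded.
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        for destination in destinations:
            s3_client.copy_object(
                Bucket=destination,
                Key=key,
                CopySource={"Bucket": source_bucket, "Key": key},
            )
            copied.append((destination, key))
    return copied


def lambda_handler(event, context):
    # Imported here so fan_out can be exercised locally with a stub client.
    import boto3

    return fan_out(event, boto3.client("s3"))
```

Compared with option 1, the copy logic lives outside the publisher. If a copy raises, the invocation fails and Lambda's retry behaviour for asynchronous invocations applies, so the publisher itself does not need retry code. Note that a single `copy_object` call is limited to objects of up to 5 GB.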
Use-case
We basically want to collect files from external customers into a file server.
We were thinking of using the S3 bucket as the file server that customers can interact with directly.
Question
Is it possible to create a bucket for each customer and give the customer a link to the S3 bucket that also serves as the UI for dragging and dropping files into it directly?
The customer shouldn't have to log in to AWS or create an AWS account.
The customer should interact directly with only their own S3 bucket (drag and drop, add, delete files); there shouldn't be a way to look into other buckets. We will probably create many S3 buckets for our customers in the same AWS account. The entry point into the S3 bucket UI is a link (the S3 bucket URL, perhaps).
If such a thing is possible - would love some general pointers as to what more I should do (see my approach below)
My work so far
I've been able to create an S3 bucket and grant public access to it.
I've set policies allowing Get, List and PutObject on the S3 bucket.
I've been able to give public access to objects inside the bucket using their links, but never to the bucket itself.
Is there something more I can build on or am I hitting a dead-end and this is not possible to accomplish?
P.S: This may not be a coding question, but maybe your answer could have code to accomplish it if at all possible, general pointers if possible
An S3 presigned URL can help in such cases, but you have to write your own custom frontend application for the drag-and-drop features.
Link: https://docs.aws.amazon.com/AmazonS3/latest/userguide/PresignedUrlUploadObject.html
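As a rough sketch of the backend side, assuming boto3: the server generates a short-lived upload URL per file, and the custom frontend then sends the file to that URL with an HTTP PUT. The bucket name, the `uploads/<customer>/` key layout and both helper names are assumptions for illustration.

```python
def customer_key(customer_id, filename):
    """Build an object key that keeps each customer's files under their own prefix."""
    # Drop directory components so keys stay flat under the customer prefix.
    safe_name = filename.replace("\\", "/").split("/")[-1]
    return f"uploads/{customer_id}/{safe_name}"


def make_upload_url(s3_client, bucket, key, expires_in=3600):
    """Return a URL that permits an HTTP PUT of `key` into `bucket` until it expires."""
    return s3_client.generate_presigned_url(
        "put_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_in,
    )
```

With a real client this would be called as `make_upload_url(boto3.client("s3"), "customer-uploads", customer_key("acme", "report.csv"))`. The customer never needs an AWS account, because the URL is signed with your credentials, and a single bucket with per-customer prefixes may be enough instead of one bucket per customer.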
Recently, one of my AWS accounts got compromised. Fortunately, we were able to change all secure information in time. To avoid a recurrence, the first thing to do would be to put a process in place for managing secrets.
That said, I would also like to trigger a CloudWatch alarm when multiple downloads or deletes are taking place from inside my AWS account.
I have come across solutions like
1. AWS WAF
2. Have a CDN in place
3. Trigger a Lambda function on an event in S3
Solutions #1 and #2 do not meet my requirement because they only throttle requests coming from outside AWS. If the limit were implemented at the S3 level, it would automatically throttle both inside and outside requests.
With solution #3, I could not find a way in my Lambda function to keep track of multiple objects requested by one IP, so that I can tell when a threshold number of files is crossed within a threshold time window.
Is raising an alarm by rate-limiting at S3 level a possibility?
AWS does not provide a rate limit on S3 directly, but you can implement alarms over SNS topics using CloudTrail.
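To illustrate the threshold check from the question, here is a minimal sketch in plain Python. It assumes the input is a list of CloudTrail-style records with `eventName`, `sourceIPAddress` and `eventTime` fields (object-level calls such as GetObject and DeleteObject only appear in CloudTrail when data events are enabled). The function name and default thresholds are hypothetical.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta

WATCHED_EVENTS = {"GetObject", "DeleteObject"}


def parse_event_time(value):
    # CloudTrail timestamps are UTC in ISO 8601 form, e.g. 2024-01-01T12:00:00Z.
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")


def find_noisy_sources(records, max_events=100, window=timedelta(minutes=5)):
    """Return source IPs that made more than `max_events` watched calls within `window`."""
    recent = defaultdict(deque)  # source IP -> timestamps inside the current window
    flagged = set()
    # Fixed-format ISO 8601 strings sort chronologically as plain strings.
    for record in sorted(records, key=lambda r: r["eventTime"]):
        if record.get("eventName") not in WATCHED_EVENTS:
            continue
        when = parse_event_time(record["eventTime"])
        timestamps = recent[record["sourceIPAddress"]]
        timestamps.append(when)
        # Drop timestamps that have fallen out of the sliding window.
        while when - timestamps[0] > window:
            timestamps.popleft()
        if len(timestamps) > max_events:
            flagged.add(record["sourceIPAddress"])
    return flagged
```

Any IPs returned could then be published to an SNS topic to raise the alert. This only detects and reports; it does not throttle anything.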
Unless you explicitly need someone on your team to remove objects from your S3 bucket, you shouldn't give anyone that access. The following are some ideas you can follow:
Implement least privilege access
You can block the permission to remove objects at the IAM user level, so no one will be able to remove any items.
You can modify the bucket policy to grant DeleteObject access only to specific users/roles through conditions.
Enable multi-factor authentication (MFA) Delete
MFA Delete can help prevent accidental bucket deletions. If MFA Delete is not enabled, any user with the password of a sufficiently privileged root or IAM user could permanently delete an Amazon S3 object.
MFA Delete requires additional authentication for either of the following operations:
Changing the versioning state of your bucket
Permanently deleting an object version.
S3 Object Lock
S3 Object Lock enables you to store objects using a "Write Once Read Many" (WORM) model. S3 Object Lock can help prevent accidental or inappropriate deletion of data. For example, you could use S3 Object Lock to help protect your AWS CloudTrail logs.
Amazon Macie with Amazon S3
Macie uses machine learning to automatically discover, classify, and protect sensitive data in AWS. Macie recognizes sensitive data such as personally identifiable information (PII) or intellectual property. It provides you with dashboards and alerts that give visibility into how this data is being accessed or moved.
You can learn more about security best practices for S3 here:
https://aws.amazon.com/premiumsupport/knowledge-center/secure-s3-resources/
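For the bucket-policy idea under least privilege above, one possible shape is an explicit Deny on deletes for every principal except a short allow-list. This is only a sketch: the bucket name, account ID and role name are placeholders, and the listed principals still need their own Allow for `s3:DeleteObject` elsewhere.

```python
import json


def delete_restricted_policy(bucket, allowed_principal_arns):
    """Build a bucket policy that denies object deletion to everyone except the listed ARNs."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyDeleteExceptAllowedPrincipals",
                "Effect": "Deny",
                "Principal": "*",
                "Action": ["s3:DeleteObject", "s3:DeleteObjectVersion"],
                "Resource": f"arn:aws:s3:::{bucket}/*",
                # The Deny applies only when the caller's ARN matches none of these.
                "Condition": {
                    "ArnNotEquals": {"aws:PrincipalArn": list(allowed_principal_arns)}
                },
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder bucket and role ARN, for illustration only.
    policy = delete_restricted_policy(
        "my-bucket", ["arn:aws:iam::111122223333:role/data-admin"]
    )
    print(json.dumps(policy, indent=2))
```

The printed JSON could be pasted into the bucket policy editor or passed to `put_bucket_policy`. Test it on a non-production bucket first, since an overly broad Deny can lock out administrators as well.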
Is it possible to send/sync files from a source AWS S3 bucket to a destination S3 bucket in a different AWS account and a different location?
I found this: https://aws.amazon.com/premiumsupport/knowledge-center/copy-s3-objects-account/
But if I understand it correctly, that describes how to sync the files from the destination account.
Is there a way to do it the other way around, accessing the destination bucket from the source account (using the source IAM user's credentials)?
AWS finally came up with a solution for this: S3 Batch Operations.
S3 Batch Operations is an Amazon S3 data management feature that lets you manage billions of objects at scale with just a few clicks in the Amazon S3 Management Console or a single API request. With this feature, you can make changes to object metadata and properties, or perform other storage management tasks, such as copying objects between buckets, replacing object tag sets, modifying access controls, and restoring archived objects from S3 Glacier, instead of taking months to develop custom applications to perform these tasks.
It allows you to replicate data at bucket, prefix or object level, from any region to any region, between any storage class (e.g. S3 <> Glacier) and across AWS accounts! No matter if it's thousands, millions or billions of objects.
This introduction video has an overview of the options: https://aws.amazon.com/s3/s3batchoperations-videos/ (my apologies if I almost sound like a salesperson, I'm just very excited about it as I have a couple of million objects to copy ;-)
That needs the right IAM and bucket policy settings.
A detailed configuration for cross-account access is discussed here
Once you have it configured, you can perform the sync (sync is recursive by default, so it takes no --recursive flag):
aws s3 sync s3://sourcebucket s3://destinationbucket
Hope it helps.
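If you would rather push from the source account in code, a rough boto3 equivalent could look like the sketch below. It assumes the source credentials have already been granted write access by the destination bucket's policy. `push_objects` is a hypothetical helper, and unlike `aws s3 sync` it copies every object under the prefix rather than only new or changed ones.

```python
def push_objects(s3_client, source_bucket, destination_bucket, prefix=""):
    """Copy every object under `prefix` from the source bucket to the destination bucket.

    Returns the list of keys that were copied.
    """
    copied = []
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=source_bucket, Prefix=prefix):
        # Pages for an empty listing have no "Contents" entry.
        for obj in page.get("Contents", []):
            s3_client.copy_object(
                Bucket=destination_bucket,
                Key=obj["Key"],
                CopySource={"Bucket": source_bucket, "Key": obj["Key"]},
                # Give the destination bucket owner full control of the copy.
                ACL="bucket-owner-full-control",
            )
            copied.append(obj["Key"])
    return copied
```

With real credentials this would be called as `push_objects(boto3.client("s3"), "sourcebucket", "destinationbucket")`.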
I'm looking for a way to log when data is copied from my S3 bucket. Most importantly, which file(s) were copied and when. If I had my way, I'd also like to know by whom and from where, but I don't want to get ahead of myself.
A couple of options:
Server Access Logging provides detailed records for the requests that are made to an S3 bucket
AWS CloudTrail captures a subset of API calls for Amazon S3 as events, including calls from the Amazon S3 console and from code calls to the Amazon S3 APIs
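As an illustration of the first option, server access logging can be switched on with a single API call. This is a sketch assuming boto3; the helper name and the default log prefix are assumptions, and the log bucket must already exist and permit S3 log delivery.

```python
def enable_access_logging(s3_client, bucket, log_bucket, log_prefix="access-logs/"):
    """Turn on server access logging for `bucket`, writing logs to `log_bucket`."""
    s3_client.put_bucket_logging(
        Bucket=bucket,
        BucketLoggingStatus={
            "LoggingEnabled": {
                "TargetBucket": log_bucket,
                "TargetPrefix": log_prefix,
            }
        },
    )
```

With real credentials this would be called as `enable_access_logging(boto3.client("s3"), "my-data-bucket", "my-log-bucket")`. Each log record includes the requester, the operation, the object key and the request time, which covers the "which file and when" part of the question.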
How do I save a voice message from a customer's number and store it in an S3 bucket using Amazon Connect? I made a contact workflow, but I don't understand how to save the voice message to the S3 bucket.
We've tried many ways to build a voicemail solution, including many of the things you might have found on the web. After much iteration we realized that we had a product that would be useful to others.
For voicemail in Amazon Connect, take a look at https://amazonconnectvoicemail.com as a simple, no-code integration that can be customized to meet the needs of your customers and organization!
As soon as you enable voice recording, all recordings are placed automatically in the bucket you defined at the very beginning, when you set up your Amazon Connect instance. Just check your S3 bucket to see if you can spot the recordings.
By default, AWS creates a new Amazon S3 bucket during the configuration process, with built-in encryption. You can also use existing S3 buckets. There are separate buckets for call recordings and exported reports, and they are configured independently.
(https://docs.aws.amazon.com/connect/latest/adminguide/what-is-amazon-connect.html)
Recording to S3 only starts when an agent takes the call. Currently, there is no direct voicemail feature in Amazon Connect. You can forward the call to a service that supports it, such as Twilio.