Goal: Automated full and incremental backups of an AWS EFS filesystem to an S3 bucket.
I have been looking at Duplicity/Duply to accomplish this, and it looks like it could work. I do have one concern, though: you would have to store API keys in the clear on the AMI for this to work. Is there any way to accomplish this using a role?
I do backups exactly the way you want to, and it can be done, since Duplicity has support for instance profiles. Make sure to give appropriate access to your role and attach it to your instance.
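To double-check that the instance role (and not keys baked into the AMI) is supplying the credentials, a quick sanity check from the instance could look like the sketch below - a minimal example that just asks STS who you are, assuming boto3 is installed:
```python
import boto3

# With no keys configured, boto3 falls back to the instance-profile
# credentials delivered via the instance metadata service.
sts = boto3.client("sts")

# Should print an assumed-role ARN rather than an IAM user ARN.
print(sts.get_caller_identity()["Arn"])
```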
Can someone explain to me what's the best way to transfer data from a hard drive on an EC2 instance (running Windows Server 2012) to an S3 bucket in the same AWS account on a daily basis?
Background to this:
I'm generating a .csv file for one of our business partners daily at 11:00 am, and I want to deliver it to S3 (he has access to our S3 bucket).
After that he can pull it out of S3 manually or automatically whenever he wants.
Hope you can help me; I have only found manual solutions using the CLI, but no automated way to do daily transfers.
Best Regards
You can mount S3 buckets directly as drives on your EC2 instances. That way you don't even need triggers or a daily task scheduler along with a third-party service, since objects would be directly available in the S3 bucket.
For Linux you would typically use Filesystem in Userspace (FUSE). Take a look at this repo if you need it for Linux: https://github.com/s3fs-fuse/s3fs-fuse.
Regarding Windows, there is this tool:
https://tntdrive.com/mount-amazon-s3-bucket.aspx
If these tools don't suit you, or if you don't want to mount the S3 bucket directly, here is another option: whatever you can do with the CLI, you should be able to do with the SDK. So if you can code in one of the languages AWS Lambda supports - C#/Java/Go/PowerShell/Python/Node.js/Ruby - you could automate this with a Lambda function triggered by a daily schedule at 11 a.m.
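For illustration, the Lambda route might look roughly like the sketch below. The bucket name, key, and the build_daily_report helper are placeholders of mine, and the function would still need some way to reach the CSV produced on the instance (a file share, SSM, or generating the report inside the function itself):
```python
import boto3

s3 = boto3.client("s3")

def build_daily_report():
    # Placeholder: return the CSV content as bytes, however it is produced.
    return b"col1,col2\n1,2\n"

def lambda_handler(event, context):
    # Invoked daily at 11:00 by a scheduled rule (e.g. an EventBridge cron).
    s3.put_object(
        Bucket="partner-exchange-bucket",   # hypothetical bucket name
        Key="daily/report.csv",
        Body=build_daily_report(),
    )
    return {"status": "uploaded"}
```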
Hope this helps!
Create a small application that uploads your file to an S3 bucket (there are some examples here). Then use Task Scheduler to execute your application on a regular basis.
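A minimal sketch of such an application in Python with Boto3, which Task Scheduler could run shortly after 11:00 am - the local path and bucket name are placeholders:
```python
import boto3

def main():
    s3 = boto3.client("s3")
    s3.upload_file(
        Filename=r"C:\exports\partner-report.csv",  # hypothetical local path on the server
        Bucket="partner-exchange-bucket",           # hypothetical bucket shared with the partner
        Key="daily/partner-report.csv",
    )

if __name__ == "__main__":
    main()
```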
I have to do this for almost 100 accounts, so I am planning to do it with something like infrastructure as code. CloudFormation does not support creating objects... can anyone help?
There are several strategies, depending on the client environment.
The aws-cli may be used for shell scripting, the aws-sdk for JavaScript environments, or Boto3 for Python environments.
If you provide the client environment, creating an S3 object is almost a one-liner, assuming the S3 bucket security and lifecycle concerns are already handled.
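With Boto3, for example, the call itself really is close to a one-liner; the bucket and key below are placeholders, and credentials are assumed to be configured already:
```python
import boto3

# Creates (or overwrites) an object in an existing bucket.
boto3.client("s3").put_object(Bucket="example-bucket", Key="example.txt", Body=b"hello")
```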
As Rich Andrew said, there are several different technologies. If you are trying to do infrastructure as code and attach policies and roles, I would suggest you look into Terraform or Serverless.
I frequently combine two of the techniques already mentioned above.
For infrastructure setup - Terraform. This tool tends to be ahead of the competition (Ansible, etc.) in terms of cloud modules. You can use it to create the bucket, bucket policies, users and their IAM policies for bucket access, upload initial files to the bucket, and much more.
It keeps a state file containing a record of those resources, so you can use the same workflow to destroy everything it created, if necessary, with very few modifications.
It is very easy to get started with, but not as flexible, and you can be caught out if a scope change in the middle of a project suddenly requires a feature that isn't there.
To get started, check out the Terraform module registry - https://registry.terraform.io/.
It has quite a few S3 modules available to get started even quicker.
For interaction with AWS resources - Python Boto3. In your case that would be the subsequent file uploads and deletions in the S3 bucket.
You can use Boto3 to set up infrastructure - just like Terraform, but it will require more work on your side (like handling exceptions and errors).
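A rough sketch of that Boto3 side, assuming the bucket itself was already created by Terraform (all names below are placeholders):
```python
import boto3
from botocore.exceptions import ClientError

s3 = boto3.client("s3")

def upload(path, bucket, key):
    try:
        s3.upload_file(path, bucket, key)
    except ClientError as err:
        # Unlike Terraform, error handling is on you here.
        print(f"upload failed: {err}")

def delete(bucket, key):
    try:
        s3.delete_object(Bucket=bucket, Key=key)
    except ClientError as err:
        print(f"delete failed: {err}")

upload("report.csv", "project-bucket", "reports/report.csv")
delete("project-bucket", "reports/old-report.csv")
```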
Hey there, I am new to AWS and trying to piece together the best way to do this.
I have thousands of photos I'd like to upload and process on AWS. The software is Agisoft Photoscan and it is run in stages, so for the first stage I'd like to use an instance geared towards CPU/memory usage, and for the second stage one geared towards GPU/memory.
What is the best way to do this? Do I create a new volume for each project in EC2 and attach that volume to each instance when I need to? I see people saying to use S3; do I just create a bucket for each project and then attach the bucket to my instances?
Sorry for the basic questions; the more I read, the more questions I seem to have.
I'd recommend starting with S3 and seeing if it works - it will be cheaper and easier to set up. Switch to EBS volumes if you need to, but I doubt you will need to.
You could create a bucket for each project, or you could just create one bucket and segregate the images based on a file-name prefix (i.e. project1-image001.jpg).
You don't 'attach' buckets to EC2, but you should assign an IAM role to the instances as you create them, and then you can grant that IAM role permissions to access the S3 bucket(s) of your choice.
Since you don't have a lot of AWS experience, keep things simple, and using S3 is about as simple as it gets.
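As an illustration of the prefix approach, a sketch like the one below uploads a local folder of images into a single bucket using a per-project file-name prefix. The bucket, folder, and project names are made up, and the credentials come from the IAM role attached to the instance rather than keys in the code:
```python
import os
import boto3

s3 = boto3.client("s3")             # uses the instance's IAM role for credentials
bucket = "photoscan-projects"       # hypothetical single bucket
project = "project1"

for name in os.listdir("photos"):   # hypothetical local folder of images
    s3.upload_file(os.path.join("photos", name), bucket, f"{project}-{name}")
```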
You can go with AWS S3 to upload photos. AWS S3 is similar to Google Drive.
If you want to use AWS EBS volumes instead of S3, there are a couple of problems you may face.
EBS volumes are only accessible within a single availability zone, not across the whole region, which means you have to create snapshots to move them to another availability zone; S3 is not tied to an availability zone.
EBS volumes are also not really designed for storing lots of media files: a volume behaves like a hard drive, and once you launch an EC2 instance you need to attach the volume to it.
As per best practice, use AWS S3.
For your case, you can create a bucket for each project, or you can use a single bucket with a folder per project to identify the projects.
Create an AWS IAM role with S3 access permissions and attach it to the EC2 instance. There is no need to use AWS credentials in the project: the EC2 instance will use the role to access S3, and the role doesn't have permanent credentials - they are rotated automatically.
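If you would rather script that role setup than click through the console, a hedged sketch with Boto3 could look like this. The role/profile names and the choice of the AmazonS3FullAccess managed policy are my assumptions - narrow the policy to whatever access the projects actually need:
```python
import json
import boto3

iam = boto3.client("iam")

# Trust policy letting EC2 assume the role.
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Service": "ec2.amazonaws.com"},
        "Action": "sts:AssumeRole",
    }],
}

iam.create_role(RoleName="photo-s3-access", AssumeRolePolicyDocument=json.dumps(trust_policy))
iam.attach_role_policy(
    RoleName="photo-s3-access",
    PolicyArn="arn:aws:iam::aws:policy/AmazonS3FullAccess",  # assumption: broad S3 access
)

# The instance profile is what actually gets attached to the EC2 instance.
iam.create_instance_profile(InstanceProfileName="photo-s3-access")
iam.add_role_to_instance_profile(
    InstanceProfileName="photo-s3-access",
    RoleName="photo-s3-access",
)
```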
Title says it, though a bit garbled: I am looking for the IAM policies/permissions I will need to grant to a user in order to create a usable profile to run boto.
The root of this is that we use the ec2.py inventory script for Ansible, which needs to list IPs so that Ansible can log in to the hosts.
I currently have a god-level user (all access) that works fine, but I will need to restrict this further so we can create runnable jobs without wide-open permissions. I imagine we will need something with describe-*, but that's about as far as I've been able to figure out.
It all depends on which AWS services you will be using and what operations you will be performing. Do you need read-only access (operations that don't make any changes) or power access?
You mentioned you will need to list IPs. For you to use Ansible's ec2.py script, you need read-only access.
As a starting point, you can use the stock AmazonEC2ReadOnlyAccess managed policy that comes with IAM, which will solve your issue. If you want it more granular, copy that policy, remove the actions that are not needed, and save it as a custom policy.
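If you do end up scripting the granular version, a sketch like the one below creates a describe-only policy with Boto3. The policy name is arbitrary, and ec2:Describe* is an assumption about what ec2.py needs - it covers the basic instance listing, but your inventory configuration may touch other services too:
```python
import json
import boto3

iam = boto3.client("iam")

# Read-only EC2 policy: describe calls only, no mutating actions.
policy_document = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["ec2:Describe*"],
        "Resource": "*",
    }],
}

iam.create_policy(
    PolicyName="ansible-ec2-inventory-readonly",   # arbitrary name
    PolicyDocument=json.dumps(policy_document),
)
```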
I have an AMI that will be used for autoscaling. Every EC2 instance launched from the AMI is supposed to download some files from an S3 bucket (they are all in the same VPC), and the S3 bucket is supposed to be private (not open to the public).
How can this be done?
There are lots of ways. You could use the AWS CLI (the s3 commands) or you could use the SDK for the language of your choice. You will also probably want to use IAM to establish the credentials for accessing the resources. The CLI is probably the quickest way to get up and running.
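As a minimal sketch of the SDK route, assuming an IAM role with read access to the bucket is attached to the instances launched from the AMI (bucket, key, and local path are placeholders), the download each instance runs at boot could be as small as:
```python
import boto3

# Credentials come from the instance profile, so nothing is baked into the AMI.
s3 = boto3.client("s3")
s3.download_file("private-bootstrap-bucket", "files/config.tar.gz", "/tmp/config.tar.gz")
```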