How to restore postgres dump with RDS?

I have a Postgres dump in an AWS S3 bucket. What is the most convenient way to restore it into an AWS RDS instance?

AFAIK, there is no native AWS way to manually push data from S3 to anywhere else. The dump stored on S3 needs to first be downloaded and then restored.
You can use the link posted above (http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/PostgreSQL.Procedural.Importing.html), but that doesn't help you download the data.
The easiest way to get something off of S3 is to go to the S3 console, navigate to the file, right-click it and click Download. If you need to restore FROM an EC2 instance (e.g. because your RDS instance does not have a public IP), then install and configure the AWS CLI (http://docs.aws.amazon.com/cli/latest/userguide/installing.html).
Once you have the CLI configured, download with the following command:
aws s3 cp s3://<<bucket>>/<<folder>>/<<folder>>/<<key>> dump.gz
NOTE: the above command may need some additional tweaking, e.g. if you have multiple AWS profiles configured on the machine (add --profile <name>), or if the dump is split into many files rather than one.
From there, restore to RDS just like you would to a normal Postgres server, following the instructions in the AWS link.
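For example, a minimal sketch of the restore itself, assuming the dump is a gzipped plain-SQL dump (the endpoint, user, and database names are placeholders):
gunzip -c dump.gz | psql -h mydb.xxxxxxxxxx.us-east-1.rds.amazonaws.com -U masteruser -d mydatabase
If it is a custom-format dump (made with pg_dump -Fc), gunzip it first and feed it to pg_restore with --no-owner, which avoids ownership errors on RDS.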
Hope that helps!

Related

On-Premise file backup to aws

Use case:
I have one directory on-premise that I want to back up, say, every midnight, and restore if something goes wrong.
It doesn't seem like a complicated task, but reading through the AWS documentation even this can be cumbersome and costly. Setting up a Storage Gateway locally seems unnecessarily complex for a simple task like this, and running one on EC2 is costly as well.
What I have done:
Reading through this + some other blog posts:
https://docs.aws.amazon.com/AmazonS3/latest/userguide/Welcome.html
https://docs.aws.amazon.com/storagegateway/latest/userguide/WhatIsStorageGateway.html
What I have found:
1. Setting up a file gateway (locally or as an EC2 instance):
It just mounts the files to an S3 bucket, and that's it. My on-premise app would constantly write to this S3 bucket. The documentation doesn't mention anything about scheduled backup and recovery.
2. Setting up a volume gateway:
Here I can make a scheduled synchronization/backup to S3, but dedicating a whole volume to it would be a big overhead.
3. Standalone S3:
Just using a bare S3 bucket and copying my backup there via the AWS API/SDK with a manually created scheduled job.
Solutions:
Using point 1 from above: enable versioning, and the versions of the files will serve as recovery points.
Using point 3
I think I am looking for a mix of the file and volume gateways: working at the file level, but taking asynchronous, scheduled snapshots of the files.
How should this be handled? Isn't there a really easy way to just send a backup of a directory to AWS?
The easiest way to backup a directory to Amazon S3 would be:
Install the AWS Command-Line Interface (CLI)
Provide credentials via the aws configure command
When required run the aws s3 sync command
For example
aws s3 sync folder1 s3://bucketname/folder1/
This will copy any files from the source to the destination. It will only copy files that have been added or changed since a previous sync.
Documentation: sync — AWS CLI Command Reference
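Since the goal here is a nightly backup, a minimal sketch of scheduling the sync with cron (the directory and bucket names are placeholders):
# crontab entry: run every midnight and log the output
0 0 * * * /usr/local/bin/aws s3 sync /data/mydir s3://my-backup-bucket/mydir/ >> /var/log/s3-backup.log 2>&1
To restore, run the sync in the opposite direction:
aws s3 sync s3://my-backup-bucket/mydir/ /data/mydir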
If you want to be fancier and keep multiple backups, you could copy to a different target directory each time, create a zip file first and upload that, or even use a backup program like Cloudberry Backup that knows how to use S3 and can do traditional-style backups.

How to import EC2 snapshot from S3 backup? (AWS CLI import-snapshot)

I want examples of how to back up an EC2 snapshot to an S3 bucket and import it back afterwards.
I found that the AWS CLI can export snapshots to S3, as explained here:
Copying aws snapshot to S3 bucket
I also found the import command in the AWS CLI reference, but I failed to execute it because I don't fully understand the options:
https://docs.aws.amazon.com/cli/latest/reference/ec2/import-snapshot.html
Can someone explain how to use this command, especially how to specify which file in the S3 bucket to import from?
EC2 snapshots are stored on S3 standard storage by default. However, you cannot copy a snapshot into a specific S3 bucket of your own using the AWS CLI.
There may be some third-party tool out there that can do it, but I do not see why you would need to download a snapshot into your own S3 bucket; it's like paying for the snapshot twice.
Could you mention why you have this requirement? An easier alternative to your problem might exist.
Note:
The two links that you shared in your question, do not copy a snapshot to S3.
The first link shows how to copy a snapshot from one region to another, while the second link describes importing a disk image into an EBS snapshot, and only the following disk formats are supported for this import:
Virtual Hard Disk (VHD/VHDX)
ESX Virtual Machine Disk (VMDK)
Raw
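For completeness, the import-snapshot call points at the file in S3 via the --disk-container option; a minimal sketch, where the bucket name, key, and description are placeholders:
aws ec2 import-snapshot --description "my imported disk" --disk-container "Format=VMDK,UserBucket={S3Bucket=my-import-bucket,S3Key=disks/my-disk.vmdk}"
# monitor progress using the ImportTaskId returned by the command above
aws ec2 describe-import-snapshot-tasks --import-task-ids import-snap-xxxxxxxxx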
If I am reading your question correctly, you are having trouble with choosing the bucket from which to restore your backup. You might find this easier using the EC2 console.
In the console - Navigation bar - select Snapshots
Select the snapshot you want to copy from the list
Choose Copy from Action list, complete the dialog box and click Copy
When the confirmation dialog box comes up, click Snapshots to monitor the progress of the copy.
Here's some additional information on AWS backups that might help you.

Enable log file rotation to s3

I have enabled this option.
The problem is:
If I don't press the Snapshot Logs button, the logs do not go to S3.
Is there any way to publish the logs to S3 each day?
Or how does the log file rotation option work?
If you are using default instance profile with Elastic Beanstalk, then AWS automatically creates permission to rotate the logs to S3.
If you are using custom instance profile, you have to grant Elastic Beanstalk permission to rotate logs to Amazon S3.
The logs are rotated every 15 minutes.
AWS Elastic Beanstalk: Working with Logs
For a more robust mechanism to push your logs to S3 from any EC2 instance, you can pair logrotate with S3. I've put all the details in this post as a reference, which should let you achieve exactly what you're describing.
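As a rough illustration of that approach (not the exact setup from the post), a small script like this could be called from logrotate's postrotate hook or from cron; the log directory and bucket name are placeholders:
#!/bin/bash
# upload compressed, rotated logs to S3, prefixed by hostname and date
LOG_DIR=/var/app/current/log
BUCKET=s3://my-log-archive-bucket
aws s3 cp "$LOG_DIR" "$BUCKET/$(hostname)/$(date +%F)/" --recursive --exclude "*" --include "*.gz"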
Hope that helps.
NOTICE: if you want to rotate custom log files then, depending on your container, you need to add links to your custom log files in the proper places. For example, consider a Ruby on Rails deployment: if you want to store custom information, e.g. some monitoring output from the Oink gem in an oink.log file, add the proper link in /var/app/support/logs using .ebextensions:
.ebextensions/XXXlog.config
files:
  "/var/app/support/logs/oink.log":
    mode: "120400"
    content: "/var/app/current/log/oink.log"
After deployment, this will create the symlink:
/var/app/support/logs/oink.log -> /var/app/current/log/oink.log
I'm not sure why permissions 120400 are used; I took them from the example on the Amazon AWS doc page http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/customize-containers-ec2.html (it seems 120xxx denotes symlinks in the Unix filesystem).
This log file rotation is good for archival purposes, but difficult to search and consolidate when you need it most.
Consider using services like Splunk or Loggly.

downloading a file from Internet into S3 bucket

I would like to grab a file straight off the internet and stick it into an S3 bucket, to then copy it over to a PIG cluster. Due to the size of the file and my not-so-good internet connection, downloading the file onto my PC first and then uploading it to Amazon might not be an option.
Is there any way I could go about grabbing a file off the internet and sticking it directly into S3?
Download the data via curl and pipe the contents straight to S3. The data is streamed directly to S3 and not stored locally, avoiding any memory issues.
curl "https://download-link-address/" | aws s3 cp - s3://aws-bucket/data-file
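Note that for very large streams (above roughly 50 GB) the CLI cannot infer the size of the piped input, so it may need the --expected-size option (a value in bytes) to size the multipart upload correctly, e.g.:
curl "https://download-link-address/" | aws s3 cp - s3://aws-bucket/data-file --expected-size 60000000000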
As suggested above, if download speed is too slow on your local computer, launch an EC2 instance, ssh in and execute the above command there.
For anyone (like me) less experienced, here is a more detailed description of the process via EC2:
Launch an Amazon EC2 instance in the same region as the target S3 bucket. Smallest available (default Amazon Linux) instance should be fine, but be sure to give it enough storage space to save your file(s). If you need transfer speeds above ~20MB/s, consider selecting an instance with larger pipes.
Launch an SSH connection to the new EC2 instance, then download the file(s), for instance using wget. (For example, to download an entire directory via FTP, you might use wget -r ftp://name:passwd@ftp.com/somedir/.)
Using AWS CLI (see Amazon's documentation), upload the file(s) to your S3 bucket. For example, aws s3 cp myfolder s3://mybucket/myfolder --recursive (for an entire directory). (Before this command will work you need to add your S3 security credentials to a config file, as described in the Amazon documentation.)
Terminate/destroy your EC2 instance.
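Put together, the commands run on the EC2 instance might look roughly like this (the URL, folder, and bucket names are placeholders):
# download onto the instance's local storage
wget https://example.com/big-file.tar.gz -P myfolder/
# upload the folder to S3 (assumes aws configure has already been run)
aws s3 cp myfolder s3://mybucket/myfolder --recursive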
[2017 edit]
I gave the original answer back in 2013. Today I'd recommend using AWS Lambda to download a file and put it on S3. That achieves the desired effect: placing an object on S3 with no server involved.
[Original answer]
It is not possible to do it directly.
Why not do this with an EC2 instance instead of your local PC? Upload speed from EC2 to S3 within the same region is very good.
Regarding streaming reads/writes from/to S3, I use Python's smart_open.
You can stream the file from the internet to AWS S3 using Python:
import boto3
import urllib3

s3 = boto3.resource('s3')
http = urllib3.PoolManager()
# stream the HTTP response straight into S3; s3Bucket and key are the destination bucket name and object key
s3.meta.client.upload_fileobj(http.request('GET', '<Internet_URL>', preload_content=False), s3Bucket, key,
    ExtraArgs={'ServerSideEncryption': 'aws:kms', 'SSEKMSKeyId': '<alias_name>'})

Amazon RDS automated backup

I can see from the AWS console that my RDS instance is being backed up once a day. From the FAQ I understand that it is backed up to S3, but when I use the console to view my S3 buckets, I don't see the RDS backup.
So:
How do I get my hands on my RDS backup?
Once I have it, how do I use it to restore my DB? I.e. is it a regular mysqldump file or something else?
OK - I see it under DB Snapshots, Automated Snapshots (I had it set to Manual Snapshots and hence could not see it).
RDS snapshots, like EBS snapshots, are stored in S3, but they are not accessible via the S3 interface.
You can restore a whole database by clicking "Restore Snapshot" in the AWS Management Console.
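The same restore can be done from the CLI; a rough sketch, where the instance and snapshot identifiers are placeholders:
# list the available snapshots for the instance
aws rds describe-db-snapshots --db-instance-identifier mydbinstance
# restore a snapshot into a brand new DB instance
aws rds restore-db-instance-from-db-snapshot --db-instance-identifier mydbinstance-restored --db-snapshot-identifier rds:mydbinstance-2017-06-15-09-10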
If you'd like to have .sql backups manually, you can also use the script I've been developing:
https://github.com/Ardakilic/backmeup
This script backs up your SQL databases along with your webhost root to S3 or Dropbox. That means you can dump any SQL database from any host (RDS or any other provider) and upload it to S3. It uses the AWS CLI as its backend.
I had the same issue, so I wrote a simple bash script to do this for me. It works fine in a single region but doesn't work across multiple regions. Here is the script: http://geekospace.com/back-up-and-restore-the-database-between-two-aws-ec2-instances/