How to set up scheduled backups in Amazon S3? - amazon-web-services

I have an S3 bucket "foo.backups", and s3cmd is installed on my DigitalOcean droplet. I need to back up the mydb.sqlite3 database and "myfolder".
How can I make scheduled daily backups of this database and folder with the following structure:
s3://foo.backups/
-30jan15/
--mydb.sqlite3
--myfolder/
---...
-31jan15/
--mydb.sqlite3
--myfolder/
---...
-1feb15/
--mydb.sqlite3
--myfolder/
---...
How can I set it up?
Thanks!

As an alternative to s3cmd and aws-cli, you might consider using https://github.com/minio/mc
mc implements the mc mirror command to recursively sync files and directories to multiple destinations in parallel.
It features a cool progress bar and session management for resumable copy/mirror operations.
$ mc mirror
NAME:
mc mirror - Mirror folders recursively from a single source to many destinations.
USAGE:
mc mirror SOURCE TARGET [TARGET...]
EXAMPLES:
1. Mirror a bucket recursively from Minio cloud storage to a bucket on Amazon S3 cloud storage.
$ mc mirror https://play.minio.io:9000/photos/2014 https://s3.amazonaws.com/backup-photos
2. Mirror a local folder recursively to Minio cloud storage and Amazon S3 cloud storage.
$ mc mirror backup/ https://play.minio.io:9000/archive https://s3.amazonaws.com/archive
3. Mirror a bucket from aliased Amazon S3 cloud storage to multiple folders on Windows.
$ mc mirror s3/documents/2014/ C:\backup\2014 C:\shared\volume\backup\2014
4. Mirror a local folder of non english character recursively to Amazon s3 cloud storage and Minio cloud storage.
$ mc mirror 本語/ s3/mylocaldocuments play/backup
5. Mirror a local folder with space characters to Amazon s3 cloud storage
$ mc mirror 'workdir/documents/Aug 2015' s3/miniocloud
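To get the date-stamped layout from the question, mc mirror can be scheduled from cron. A minimal sketch, assuming an alias named s3 has already been configured for your AWS credentials (the local paths and timing are placeholders):
# crontab entries: copy the database and mirror the folder into a dated prefix every night
# (date +%d%b%y prints e.g. 30Jan15; % must be escaped inside crontab)
0 2 * * * mc cp /path/to/mydb.sqlite3 s3/foo.backups/$(date +\%d\%b\%y)/mydb.sqlite3
5 2 * * * mc mirror /path/to/myfolder s3/foo.backups/$(date +\%d\%b\%y)/myfolder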
Hope this helps.

As an alternative to s3cmd, you might consider using the AWS Command-Line Interface (CLI).
The CLI has an aws s3 sync command that copies directories and sub-directories to/from Amazon S3. You can also nominate which file types are included/excluded, and it only copies files that are new or have been modified since the previous sync.
See: AWS CLI S3 documentation
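For the dated folder structure in the question, you could wrap the upload in a small script and run it daily from cron. A rough sketch, assuming the AWS CLI is already configured on the droplet and the local paths are placeholders:
#!/bin/sh
# daily-backup.sh - upload today's snapshot under a dated prefix such as 30Jan15
PREFIX=$(date +%d%b%y)
aws s3 cp /path/to/mydb.sqlite3 s3://foo.backups/$PREFIX/mydb.sqlite3
aws s3 sync /path/to/myfolder s3://foo.backups/$PREFIX/myfolder
Then add a crontab entry such as 0 2 * * * /path/to/daily-backup.sh to run it at 02:00 every day.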

Related

How to un-tar a file in S3 without passing through the local machine

I have a huge tar file in an S3 bucket that I want to extract while it stays in the bucket. I do not have enough space on my local machine to download the tar file and upload it back to the S3 bucket. What's the best way to do this?
Amazon S3 does not have in-built functionality to manipulate files (such as compressing/decompressing).
I would recommend:
1. Launch an Amazon EC2 instance in the same region as the bucket
2. Log in to the EC2 instance
3. Download the file from S3 using the AWS CLI
4. Untar the file
5. Upload the desired files back to S3 using the AWS CLI
Amazon EC2 instances are charged per second, so choose a small machine (e.g. t3a.micro) and it will be rather low-cost (perhaps under 1 cent).
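A rough sketch of those steps from the instance's shell, with a hypothetical bucket and file name:
# on the EC2 instance, after logging in with ssh
aws s3 cp s3://my-bucket/big-archive.tar .
mkdir extracted && tar -xf big-archive.tar -C extracted
aws s3 cp extracted/ s3://my-bucket/extracted/ --recursive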

How to store bacula (community-edition) backup on Amazon S3?

I am using a CentOS 7 machine, Bacula community edition 11.0.5 and a PostgreSQL database.
Bacula is used to take full and incremental backups.
I followed the document linked below to store the backup in an Amazon S3 bucket:
https://www.bacula.lat/community/bacula-storage-in-any-cloud-with-rclone-and-rclone-changer/?lang=en
I configured the storage daemon as shown in the above link. The backup succeeds and the backed-up file is stored in the given path /mnt/vtapes/tapes, but the backup file is not moving from /mnt/vtapes/tapes to the AWS S3 bucket.
The above document says we need to create schedule routines to the cloud to move the backup file from /mnt/vtapes/tapes to the Amazon S3 bucket.
I am not aware of what a cloud schedule routine is in AWS - is it some kind of Lambda function or something else?
Is there any S3 cloud driver which supports Bacula backups, or any other way to store bacula-community backup files on Amazon S3 other than S3FS-FUSE and libs3?
The link you shared is for bacula-enterprise; we are using bacula-community, so could you suggest any related document for the community edition?
Bacula Community includes the AWS S3 cloud driver starting from 9.6.0. Check https://www.bacula.org/11.0.x-manuals/en/main/main.pdf - Chapter 3, New Features in 9.6.0, and additionally 4.0.1 New Commands, Resource, and Directives for Cloud. This is the exact same driver available in the Enterprise version.
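For completeness: if you stay with the rclone/rclone-changer setup from the linked article, the "schedule routine" is not an AWS feature (and not Lambda); it is just a cron job on the storage daemon host. A minimal sketch, where the remote name s3remote and the bucket name are assumptions:
# crontab entry on the Bacula SD host: push the vtape directory to S3 every night
0 3 * * * rclone sync /mnt/vtapes/tapes s3remote:my-bacula-bucket/tapes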

How can I move my media files stored on my local machine to S3?

I have a Django application running on EC2. Currently, all my media files are stored on the instance, including all the documents I uploaded to the models. Now I want to add S3 as my default storage. What I am worried about is how I am going to move my current media files to S3 after the integration.
I am thinking of running a Python script one time, but I am looking for any built-in solution, or just opinions.
The AWS CLI should do the job:
aws s3 cp path/to/file s3://your-bucket/
or, for a whole directory:
aws s3 cp path/to/dir/ s3://your-bucket/ --recursive
All options can be seen here: https://docs.aws.amazon.com/cli/latest/reference/s3/cp.html
The easiest method would be to use the AWS Command-Line Interface (CLI) aws s3 sync command. It can copy files to/from Amazon S3.
However, if there are complicated rules about where to move the files, then you could certainly use a Python script with boto3's upload_file() method.
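For a typical Django MEDIA_ROOT layout, a one-off sync is usually enough. A sketch with hypothetical paths and bucket name:
# one-time copy of the existing media; safe to re-run, since sync only uploads new or changed files
aws s3 sync /srv/myproject/media s3://your-bucket/media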

Migrating GCP Object storage data to AWS S3 Bucket

We have terabytes of data in Google Cloud Storage and we want to migrate it to AWS S3. What are the best ways to do it? Is there any 3rd-party tool that would be better than a direct transfer?
There are multiple options for a direct cloud-to-cloud migration, without staging the data on any intermediate device, and in less time.
Use gsutil to copy data from a Google Cloud Storage bucket to an Amazon S3 bucket, using a command such as:
gsutil -m rsync -r gs://your-gcp-bucket s3://your-aws-s3-bucket
More details are available at https://cloud.google.com/storage/docs/gsutil/commands/rsync
Note: if you run into speed limits with the default Cloud Shell, you can create a larger VM and execute the above command from there.
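Note that gsutil also needs AWS credentials to write to the s3:// destination; as far as I know it reads them from the boto config file, so something along these lines (with placeholder keys) would go into ~/.boto before running the rsync:
cat >> ~/.boto <<'EOF'
[Credentials]
aws_access_key_id = AKIAXXXXXXXXXXXXXXXX
aws_secret_access_key = your-secret-access-key
EOF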

Downloading a file from the Internet into an S3 bucket

I would like to grab a file straight off the Internet and stick it into an S3 bucket, to then copy it over to a Pig cluster. Due to the size of the file and my not-so-good internet connection, downloading the file first onto my PC and then uploading it to Amazon might not be an option.
Is there any way I could grab a file off the internet and stick it directly into S3?
Download the data via curl and pipe the contents straight to S3. The data is streamed directly to S3 and not stored locally, avoiding any memory issues.
curl "https://download-link-address/" | aws s3 cp - s3://aws-bucket/data-file
As suggested above, if download speed is too slow on your local computer, launch an EC2 instance, ssh in and execute the above command there.
For anyone (like me) less experienced, here is a more detailed description of the process via EC2:
1. Launch an Amazon EC2 instance in the same region as the target S3 bucket. The smallest available (default Amazon Linux) instance should be fine, but be sure to give it enough storage space to save your file(s). If you need transfer speeds above ~20 MB/s, consider selecting an instance with larger pipes.
2. Open an SSH connection to the new EC2 instance, then download the file(s), for instance using wget. (For example, to download an entire directory via FTP, you might use wget -r ftp://name:passwd@ftp.com/somedir/.)
3. Using the AWS CLI (see Amazon's documentation), upload the file(s) to your S3 bucket. For example, aws s3 cp myfolder s3://mybucket/myfolder --recursive (for an entire directory). (Before this command will work, you need to add your S3 security credentials to a config file, as described in the Amazon documentation.)
4. Terminate/destroy your EC2 instance.
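Condensed, the EC2 part of that process looks roughly like this (the URL, file and bucket names are placeholders):
# on the EC2 instance, after connecting over SSH
wget https://example.com/big-file.dat
aws s3 cp big-file.dat s3://mybucket/big-file.dat
# finally, terminate the instance (from the console, or: aws ec2 terminate-instances --instance-ids i-xxxxxxxxxx)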
[2017 edit]
I gave the original answer back in 2013. Today I'd recommend using AWS Lambda to download a file and put it on S3. That's the desired effect: placing an object on S3 with no server involved.
[Original answer]
It is not possible to do it directly.
Why not do this with EC2 instance instead of your local PC? Upload speed from EC2 to S3 in the same region is very good.
Regarding stream reading/writing from/to S3, I use Python's smart_open library.
You can stream the file from the internet to AWS S3 using Python. A cleaned-up version of the snippet (the URL, bucket, key and KMS alias are placeholders to fill in):
import boto3
import urllib3
s3 = boto3.resource('s3')
http = urllib3.PoolManager()
# s3Bucket and key are the target bucket name and object key (define them first);
# the response body is streamed straight into S3 without being written locally.
s3.meta.client.upload_fileobj(
    http.request('GET', '<Internet_URL>', preload_content=False),
    s3Bucket, key,
    ExtraArgs={'ServerSideEncryption': 'aws:kms', 'SSEKMSKeyId': '<alias_name>'})