amazon ec2 free server with persistent data - amazon-web-services

I am going to host a website on the free EC2 tier from Amazon, but I read something worrying. I have a simple website that uses a database. Users visit the site, post information, send comments... If for some reason the instance breaks or Amazon shuts it down, will I lose all the information posted to my website and database? Will all the files users uploaded and all the saved information be gone?
If so, why would anyone use EC2 if you lose all your data whenever a problem happens? Problems always happen, so at some point I will certainly lose my data!
I know I can save an image of my current OS in AWS, but do I need to save the image every time a user posts something to my website? That's ridiculous. I know I am missing something here, but I searched Google and people keep saying I should use EBS, which is not in the free plan. So how is it a good idea to use the AWS EC2 free plan if my data will always be at risk of being lost?

Typically you would want to use an EBS-backed instance. Since the free tier does not support that, but does offer EBS storage, create your database on an EBS partition for data you cannot lose. The free tier page lists:
30 GB of Amazon Elastic Block Storage, plus 2 million I/Os and 1 GB of snapshot storage*
http://aws.amazon.com/free/
You should have a means to quickly launch a new instance, and you should back up the data on your EBS partition because EBS volumes can and do fail from time to time.
UPDATE
It seems that Micro instances are in fact EBS backed.
It is still advisable to attach a separate EBS volume, because it makes it much more convenient to back up the database: you create a snapshot of the EBS volume. You can find scripts online to accomplish that, which vary a bit depending on your choice of database and file system.
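As a rough illustration of the snapshot step, here is a minimal Python sketch. It assumes `ec2` is a boto3 EC2 client (for example `boto3.client("ec2")` with credentials already configured); the function name, the `label` parameter and the volume ID in the usage note are made up for this example. It only requests the snapshot and does not flush or lock the database first, which a real backup script would likely need to do.

```python
from datetime import datetime, timezone


def snapshot_volume(ec2, volume_id, label="db-backup"):
    """Request a snapshot of one EBS volume and return the snapshot ID.

    `ec2` is assumed to be a boto3 EC2 client; `label` is a hypothetical
    prefix used only to make the snapshot description readable.
    """
    # Timestamp the description so successive backups can be told apart.
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%MZ")
    response = ec2.create_snapshot(
        VolumeId=volume_id,
        Description=f"{label} {stamp}",
    )
    return response["SnapshotId"]
```

Something like `snapshot_volume(boto3.client("ec2"), "vol-0123456789abcdef0")` could then be run from cron once a day, with the placeholder volume ID replaced by your own.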

Related

What does EC2 store and why does it even need a storage solution like EBS or Instance Store?

If you use EC2 and launch instances, you can add EBS volumes as a storage option. However, what I still don't understand exactly is why. Why does EC2 even need a storage option like EBS or Instance Store? What does EC2 store anyway? And why does it make sense to have EBS?
I know that EBS volume is persistent block storage and data is not lost after exit, unlike instance store. I just don't really understand what EBS is useful for. For which cases and applications is EBS used? Or does using EBS have more to do with creating snapshots that you can create to cache data and then save it to S3?
I've already read a lot and tried to make it understandable somehow, but somehow I can't get any further here. I would be really happy if someone could shed some light on this for me.
Thank you already!
Think of an Amazon EC2 instance as a normal computer. Inside, there is CPU, RAM and (perhaps) a hard disk.
When an EC2 instance has a hard disk, it is called Instance Store and it behaves just like a normal hard disk in a computer. However, when you turn off the instance and stop paying for it, AWS can give that computer to somebody else. Rather than giving your data to somebody else, the disk is erased. So, anything you stored on Instance Store is gone! (In truth, instance store is also a virtualised disk, but this is close enough.)
In fact, in the early days of EC2, this was the only storage available. If you wanted to keep data after the instance was turned off, you first had to copy it to Amazon S3. People didn't like this, so they invented Amazon EBS.
If you want to keep your data so that it is still there when you turn on the instance in future, it needs to be stored on a network disk and that is what Amazon EBS provides. Think of it a bit like a USB drive that you can plug into one computer, then disconnect it and plug it into another computer. However, rather than being a physical device, it uses a storage service that keeps multiple copies of the data (in case a disk fails) and lets you modify the size of the disk. You are charged based on the amount of storage space assigned and how long the data is kept ("GB-Month").
Amazon EBS Snapshots are simply a backup of the disk. A snapshot contains all the data currently on the disk, allowing you to create a new disk anytime that will contain an exact copy of the disk as it was when the snapshot was created. This is great for backups, but is also very useful for creating multiple EC2 instances with the same disk content. An Amazon Machine Image (AMI) is actually just an Amazon EBS Snapshot plus a bit of metadata. When a new EC2 instance is launched, it uses an AMI to populate the boot disk rather than loading the operating system from scratch every time.
It is possible to create an AMI that populates an Instance Store disk. This way, you don't actually need to use an Amazon EBS volume. This is good for instances that don't need to permanently keep any data -- they could simply store information in a database or Amazon S3 instead of saving it on disk. Instance Store disks can be very fast since they don't send data across the network, so this is very useful in some situations.
In summary:
Instance Store is a normal disk in a computer (but it gets erased when the instance turns off so nobody else sees your data)
Amazon EBS volumes are network-attached storage that stays around until you delete it

Loading large amount of data from local machine into Amazon Elastic Block Store

I am interested in doing some machine learning using an AWS EC2 instance. I have played around with launching instances with an attached EBS volume and I was able to load files onto it via scp from my local command line. I will have several gigabytes of data to load onto this EBS volume (I know that isn't a lot by ML standards but that's not really my point). I would like to know what the appropriate way to load this data is. I'm concerned about racking up large fees because I did something in a silly way.
So far I have just uploaded a few files to the EC2 instance's associated EBS manually via the command line, like this:
scp -i keys/ec2-ml-micro2.pem data/BB000000001.png ubuntu@<my instance ip>:/data
This seems to me to be a rather primitive approach (not that that is always a bad thing). Is it the "right" way? I'm not opposed to letting a batch job run overnight like this, but I am not sure whether it would incur data transfer fees. I've looked around for information on this, and I have read the page on EBS pricing. I didn't see anything on costs associated with loading data, but I wanted to confirm with people who have done something similar that this is the correct approach, and if not, what a better one is.
When managing large objects in AWS, always consider S3 as the first option. It provides unlimited storage capacity and is better suited to storing objects than EBS, which is block storage. EBS bills you for the size of the volume you provisioned, so there is a chance that you over-provision (paying for space you don't use) or under-provision (which can lead to poor performance or even downtime).
With S3 you are billed per GB per month for the storage you actually consume. It is a pay-for-what-you-use model and is very cheap compared to EBS.
Lastly, first evaluate whether one of the AWS Machine Learning services fits your use case; it could save you a lot of time and effort.
Data Transfer from S3 to EBS within the same region is free of charge.
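If the data goes to S3 first, as suggested above, uploading a whole directory instead of one file at a time could look roughly like the sketch below. It assumes `s3` is a boto3 S3 client (for example `boto3.client("s3")`); the function name, the bucket name and the `data/` prefix in the usage note are hypothetical.

```python
from pathlib import Path


def upload_directory(s3, local_dir, bucket, prefix=""):
    """Upload every file under local_dir to the bucket and return the keys.

    `s3` is assumed to be a boto3 S3 client; `prefix` is an optional key
    prefix such as "data/".
    """
    root = Path(local_dir)
    keys = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue  # skip directories, only files become objects
        # Keep the relative folder layout in the object key.
        key = prefix + path.relative_to(root).as_posix()
        s3.upload_file(str(path), bucket, key)
        keys.append(key)
    return keys
```

For example, `upload_directory(boto3.client("s3"), "data", "my-ml-bucket", prefix="data/")` would mirror the local `data` folder into the bucket, and the EC2 instance can then pull the objects from S3 within the same region.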
AWS Pricing Details

How to increase RAM size and database storage capacity in AWS

I have an AWS Linux-based server running one project, and now I want to deploy another project on the same server. I want to know whether my existing memory is enough or whether I have to increase the memory limit. If so, please let me know how to increase it.
Please refer to the images below for the available memory.
There are two approaches to using a database in AWS.
You can install the database on the Amazon EC2 instance. You will then be responsible for configuring and maintaining the database and doing backups. The up-side is that it can run on the same EC2 instance as your application.
Or, you can use Amazon RDS to provide a database. Amazon RDS can install, configure and operate the database for you, including taking backups. It runs on a separate computer so there are additional costs involved, but there are many benefits to keeping a database separate from the application, such as allowing you to scale your application separately to the database. Large applications often run across multiple computers and they can all connect to the one database on Amazon RDS.
From your description, it looks like you are going with the first option. You can increase the disk capacity of the Amazon EC2 instance by increasing the size of the Amazon EBS disk volume (and then do a reboot). If you desire more RAM, then Stop the instance, change the Instance Type to something larger, then Start the instance again.
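The two operations described above can also be scripted. The following is a minimal sketch, assuming `ec2` is a boto3 EC2 client (for example `boto3.client("ec2")`); the function names and the IDs in the usage note are invented for this example.

```python
def change_instance_type(ec2, instance_id, new_type):
    """Stop an instance, change its instance type, then start it again.

    `ec2` is assumed to be a boto3 EC2 client.
    """
    ec2.stop_instances(InstanceIds=[instance_id])
    # The type can only be changed once the instance is fully stopped.
    ec2.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])
    ec2.modify_instance_attribute(
        InstanceId=instance_id,
        InstanceType={"Value": new_type},
    )
    ec2.start_instances(InstanceIds=[instance_id])


def grow_volume(ec2, volume_id, new_size_gib):
    """Ask EBS to enlarge a volume to new_size_gib (EBS volumes cannot shrink)."""
    ec2.modify_volume(VolumeId=volume_id, Size=new_size_gib)
```

A call such as `change_instance_type(ec2, "i-0123456789abcdef0", "t3.medium")` would give the instance more RAM, and `grow_volume(ec2, "vol-0123456789abcdef0", 50)` would enlarge the disk. After the volume grows, the file system inside the instance still has to be extended before the extra space is usable.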

Cost-effectively store volumes that I won't need for a few months?

I have two EC2 instances I created this summer for personal use while learning basic ML concepts and doing Kaggle competitions. I'd like to save the work on them and eventually be able to use them again, if I'm interested in another Kaggle competition, without having to set up a new instance. I probably won't need them for a few months (and when I do need them, it won't be at a moment's notice).
Each instance has a 128 GB EBS gp2 volume that's costing me ~$13/month. I was wondering if there's a way that I could pull these off AWS so that I'm not still paying for them when I don't need them. Is there a feature where I can store a snapshot outside of AWS and eventually upload it to AWS and restore the volumes if I need them?
Or is there a much cheaper (slower) storage method for keeping them on AWS? (sc1 volumes are $0.025/GB-month, but is there something even cheaper?)
Edit: Clarified volume type ($0.10/GB-month gp2)
Edit2: I think my best bet for now is to snapshot them since each only has ~30GB of used space (60GB*$0.05 = $3/month) and delete the original volumes.
If you wish to retain the exact contents of the disk volumes, the choice really comes down to:
Amazon EBS volume snapshots
ISO images
Amazon EBS volume snapshots are only charged for blocks that are used. They are the easiest to create and restore. It is not possible to export an Amazon EBS snapshot.
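To put numbers on this, here is a small sketch of the arithmetic using the figures from the question: $0.10/GB-month for gp2 and $0.05/GB-month for snapshots. Both prices are taken from the question and may differ by region or change over time, and the function names are invented for this example.

```python
# Prices quoted in the question; treat them as assumptions, not current rates.
GP2_PER_GB_MONTH = 0.10
SNAPSHOT_PER_GB_MONTH = 0.05


def monthly_cost_keeping_volumes(volume_count, provisioned_gb):
    """EBS volumes are billed for the full provisioned size."""
    return volume_count * provisioned_gb * GP2_PER_GB_MONTH


def monthly_cost_keeping_snapshots(volume_count, used_gb):
    """Snapshots are billed only for the blocks that hold data."""
    return volume_count * used_gb * SNAPSHOT_PER_GB_MONTH
```

With two 128 GB volumes that each hold about 30 GB, `monthly_cost_keeping_volumes(2, 128)` comes to roughly $25.60 a month, while `monthly_cost_keeping_snapshots(2, 30)` comes to roughly $3 a month.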
If you wish to move a disk image out of Amazon EC2 (eg to download, or to store in Amazon S3), use a standard disk utility to create a .iso image of the disk. This can later be restored to a new disk volume, and can even be directly mounted in read-only mode using disk utilities.
You can put all this data into Amazon Glacier, which is far cheaper (around 10% of the cost).

Amazon instance store

As far as I understand, a newly created Amazon instance uses the ephemeral data store by default, unless an EBS store is configured.
After stopping an instance that uses the ephemeral data store, I will lose all data. Is that correct?
I noticed that an EBS store has been created automatically for my instance. I created a few files in the home directory, but these files were not deleted after a reboot. So where is ephemeral data stored?
I want to install a database on an Amazon host. Should I worry about data loss with the default setup, and what is the common configuration? For example:
Create instance
Install and configure database on ephemeral data store
Make AMI
Create EBS store and configure database to use it as storage
After stopping an instance that uses the ephemeral data store, I will lose all data. Is that correct?
To be specific, after you terminate or stop a node, any data on instance-specific storage will be lost. A reboot is different, and your data is intact in those cases. I am using these terms to match the terms in the AWS console.
To confuse matters slightly, some EBS-backed nodes also have some instance-specific storage. All instance-storage nodes are 100% instance-backed, though. So you really need to understand whether your data is hitting an EBS disk or instance-local storage.
I noticed that an EBS store has been created automatically for my instance. I created a few files in the home directory, but these files were not deleted after a reboot. So where is ephemeral data stored?
Several points here:
For an EBS-backed instance, your /home partition is on the EBS root device, and hence data will persist provided the volume exists.
Again a reboot wouldn't delete your data even if you had an instance-storage node, but it sounds like you chose an EBS-backed node.
If you had instead created these files in /mnt, then stopped your instance and later started it again, you might have lost them. Again it depends exactly which ec2 node type you're running.
Regarding your last point - I would recommend that you just make sure your data is being stored on some EBS backed disk. Whether that is your root device or a separate EBS volume is up to you and depends on your specific needs.
I want to install a database on an Amazon host.
You should give some thought to not installing and maintaining your own database. Doing so is complex, error prone, and can be quite time consuming.
A better option for most folks is a turnkey database solution like RDS. This is a performant database that you don't have to really think about - it'll just work. RDS isn't for everyone, as there are some restrictive permission issues, but generally speaking it's great. I use it every day.
You can run databases on top of EBS and it'll work just fine. But you are biting off being a database admin at that point, and need to worry about all the complexity that comes with it. In my opinion, better to focus your time & energy on things like database schema, queries, and other aspects of your business.