Is volume in aws like a hard drive that the instance uses? - amazon-web-services

I am learning about aws and using ec2 instances. I am trying to understand what a volume is.
I have read from the aws site that:
An Amazon EBS volume is a durable, block-level storage device that you
can attach to your instances. After you attach a volume to an
instance, you can use it as you would use a physical hard drive.
Is it where things are stored when I install things like npm and node? Does it function like the harddrive o my server?

AWS EBS is block storage volume, and for the ease of understanding, yes you can consider it same as hard drive, however with more benefits over traditional hard drive. few of them are:
You can increase/decrease size of the storage as per your requirement
(Hence name Elastic)
You can add multiple ebs to your instances, for example 20 GB of volume1 and 30 GB of volume2
And for the question you asked if you can install npm & node yes you
can as it would be attached to your EC2 instance and your instance
can easily utilised attached data, modules,etc
For further explanation you can refer this user guide from AWS on EBS: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-volumes.html

Yes it is exactly like a hard drive on your server and you can have multiple devices.
The cool thing is that you also can expand them if you need extra space.

where things are stored when I install things like npm and node
Yes, technically ebs volume is virtual storage drive which is
connected to your instance through network, ( flash drive connected over network).
Since network is involved which implicitly means there will be some latency because of data transfer through network.
The data is persistent even if instance stops,terminates, hibernates or hardware failure.
Since it is network drive it can be attached or detached to any other instance.
Adding to this there is another type of storage which you will find called as instance store
You can specify instance store volumes for an instance only when you launch it. You can't detach an instance store volume from one instance and attach it to a different instance.
it gives very high IOPS because it is directly (physically) attached to instance.
The use case for instance store would where data changes rapidly like for cache or buffers.
Your data will be lost if any of these events happens like The underlying disk drive fails The instance stops, instance hibernates instance terminates or drive failure.

Related

What does EC2 store and why does it even need a storage solution like EBS or Instance Store?

If you use EC2 and launch instances, you can add EBS volumes. So a storage option. However, what I still don't understand exactly is why. Why is there or does EC2 even need a storage option like EBS or Instance Store? What does EC2 store anyway? And why it makes sense that there is EBS?
I know that EBS volume is persistent block storage and data is not lost after exit, unlike instance store. I just don't really understand what EBS is useful for. For which cases and applications is EBS used? Or does using EBS have more to do with creating snapshots that you can create to cache data and then save it to S3?
I've already read a lot and tried to make it understandable somehow, but somehow I can't get any further here. I would be really happy if someone could shed some light on this for me.
Thank you already!
Think of an Amazon EC2 instance as a normal computer. Inside, there is CPU, RAM and (perhaps) a hard disk.
When an EC2 instance has a hard disk, it is called Instance Storage and it behaves just like a normal hard disk in a computer. However, when you turn off the instance and stop paying for it, the EC2 instance can give that computer to somebody else. Rather than giving your data to somebody else, the disk is erased. So, anything you stored on Instance Store is gone! (In truth, instance store is also a virtualised disk, but this is close enough.)
In fact, in the early days of EC2, this was the only storage available. If you wanted to keep data after the instance was turned off, you first had to copy it to Amazon S3. People didn't like this, so they invented Amazon EBS.
If you want to keep your data so that it is still there when you turn on the instance in future, it needs to be stored on a network disk and that is what Amazon EBS provides. Think of it a bit like a USB drive that you can plug into one computer, then disconnect it and plug it into another computer. However, rather than being a physical device, it uses a storage service that keeps multiple copies of the data (in case a disk fails) and lets you modify the size of the disk. You are charged based on the amount of storage space assigned and how long the data is kept ("GB-Month").
Amazon EBS Snapshots are simply a backup of the disk. A snapshot contains all the data currently on the disk, allowing you to create a new disk anytime that will contain an exact copy of the disk as it was when the snapshot was created. This is great for backups, but is also very useful for creating multiple EC2 instances with the same disk content. An Amazon Machine Image (AMI) is actually just an Amazon EBS Snapshot plus a bit of metadata. When a new EC2 instance is launched, it uses an AMI to populate the boot disk rather than loading the operating system from scratch every time.
It is possible to create an AMI that populates an Instance Store disk. This way, you don't actually need to use an Amazon EBS volume. This is good for instances that don't need to permanently keep any data -- they could simply store information in a database or Amazon S3 instead of saving it on disk. Instance Store disks can be very fast since they don't send data across the network, so this is very useful in some situations.
In summary:
Instance Store is a normal disk in a computer (but it gets erased when the instance turns off so nobody else sees your data)
Amazon EBS volumes are network-attached storage that stays around until you delete it

Instance Store Volume shared across multiple EC2 Instances

I am trying to understand instance store volume and I understand instance store is ideal for temporary storage and provides massive IOPS. It is retained in case of reboot but lost if you stop and start, hibernation or instance termination.
One question I have here is can Instance store be shared across EC2 instance ?
I am seeing the below in the documentation so asking. Also how to achieve this on AWS console ?
An instance store provides temporary block-level storage for your
instance. This storage is located on disks that are physically
attached to the host computer. Instance store is ideal for temporary
storage of information that changes frequently, such as buffers,
caches, scratch data, and other temporary content, or for data that is
replicated across a fleet of instances, such as a load-balanced pool
of web servers
Documentation taken : https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/InstanceStorage.html
The diagram is showing a physical host computer in an AWS Data Center. The host can be reconfigured to run many different sizes of an Instance Family (eg large, 2xlarge, 4xlarge). Do not be too concerned by the details of what it is showing.
The simple fact is that, no, instance store volumes cannot be shared across multiple Amazon EC2 instances.
The diagram given in the docs is very confusing(at least for me). I am not able to get my head around it. Maybe the InstanceA,B and C are not meant to be EC2 instances but instance store volumes as in the same diagram you see Host Computer 1 and 2.
Also the most important part is
You can specify instance store volumes for an instance only when you launch it. You can't detach an instance store volume from one instance and attach it to a different instance.
Which is what you want to know. It means you cannot share an instance store volume between 2 or more EC2 instances. When the EC2 machine is up and running, there is no way you can attach it and while launching there is no way you can specify which volume to mounted on the EC2 instance when it's created.

Where OS and its settings are stored in EC2 instance

Using ec2 Windows instance with Instance storage (let's say 32GB SSD) - where OS and its settings are stored? Like Program Files, User profiles. Are they all stored on Instance Storage? As far as I understood from other topics Instance storage is not-persistent and doesn't survive shutdowns/terminations. Does that mean I will lose everything under C: drive if I turn it off?
Can I use EBS storage as a default storage for OS (C drive)? Can I map multiple EBS storages to one Windows storage?
If above is true, then I will be charged for the capacity used by OS on EBS instance? It would be around 20GB I believe. Is that correct?
I am quite new in aws, and before paying for such instances or EBS I would like to know how this technical and billing model is working.
Thank you!
The Storage for the Root device is dependent on the AMI (EBS-Backed or Instance Store-Backed) used to launch the instance.
As far as I understood from other topics Instance storage is
not-persistent and doesn't survive shutdowns/terminations.
If the Root storage device is Instance Store, Stopping (shutdown) the instance is not possible. On termination, Both the storage and Instance does not survive. The Instance does not survive once terminated even if the AMI is EBS-Backed, but you can persist the Root Volume by setting the DeleteOnTermination flag set to False.
Does that mean I will lose everything under C: drive if I turn it off?
You cannot turn off (shutdown) an Instance Store-backed instance.
Can I use EBS storage as a default storage for OS (C drive)?
Yes, Choose an EBS backed Windows AMI.
Can I map multiple EBS storages to one Windows storage?
Yes, multiple EBS Volumes can be attached to one EC2 Windows Instance.
If above is true, then I will be charged for the capacity used by OS
on EBS instance?
You will be charged for the total size of the EBS volumes attached to the instance including the Root Device.
It would be around 20GB I believe. Is that correct?
The EBS Volume Size is adjustable. The upper Size limit is 16TiB.
Read Storage for Root Device and Ec2 Root Device Volume
Please spend more time on the AWS documentation, I don't think here is enough to cover all your question.
Only for specify EC2 instance come with attached SSD storage AKA instance storage. Bare in mind that, this instance storage doesn't come with Snapshot capabilities, so you must backup the file yourself. This is mean for people who need fastest disk access to process their data.
Only EBS allow you do multiple snapshot.
You can always create an AMI image for your instance after complete the deployment. AMI image is store inside EBS, so you will not lost the initial instance if you do this, so for new instance, you just trigger load it from AMI.
If you "Terminate" an instance, it will delete the virtual image. There is no way to recover it even with EBS, unless you make a snapshot. However, attached EBS storage will not be deleted.
EBS is calculate by Per GB and give you 1GB x 3 IOPS, with base 100 IOPS given. This is not enough if anyone want to carry out disk I/O intensive task.

When I stop and start an ec2 cents os instance , what data do I loose

I have an ec2 instance that is hosting a CentOS AMI image and the root device is EBS , however it is not EBS optimized.
I have installed a few packages on it now I want to stop and start it again , Amazon documentation says that the EBS data would be available but the instance store data would be lost.
How do I know where(EBS or Instance store) my packages are stored ? I see that package files are in /opr /var /etc directories .
Will I loose my installed packages if I stop and start the Amazon ec2 instance ?
Thanks.
When you create an EBS backed instance (with ephemeral or instance store storage, and it doesn't matter whether it's optimized or not optimized) you don't lose data in your /opt or /var or /etc directory or any of the system data. So you are safe to stop and then restart it. Keep in mind that your internal and public IP addresses change once you restart it.
The only data that you lose is if you have ephemeral volumes which are generally mounted volumes with devices like /dev/sdb, /dev/xvdb, /dev/xvdc, etc.
If you create an instance store "only" instance then you lose everything. However, you will be able to tell if your instance is this type by not having the option to "stop" it. Meaning you can only terminate it. These are the first type of instances that EC2 offered when they started and maybe up until 3-4 years ago were the only ones, so they are not used that much AFAIK unless you need an ephemeral volume as your root volume.
[Edit]
This is what it's supposed to look like for an EBS backed instance (non-optimized):
You will not lose your data if the instance is setup as EBS.
EBS optimised is another option which adds additional IOPS, useful for busy database applications, etc.

AWS can an EBS-backed instance also access "instance store?

I thought I clearly understood the difference between instance-store and EBS backed AMIs.
But http://aws.amazon.com/maintenance-help/ says "if you are running an EBS-backed AMI, you can stop and then restart your instance in order to easily re-launch it. This will cause the loss of any data you have saved on the local instance store of the instance,"
Stop/start does NOT lose the sysvol data, so this confuses me.
I'm assuming that here, by "local instance store", they mean the backing EBS volume (the sysvol), and I'm thinking that they meant to say "terminate" instead of stop. Am I correct?
Terminating an EBS-backed instance will not cause your data to be deleted. You can still access the EBS volume until you delete it (unless you set it to delete when your instance is terminated).
Local instance store refers to hard drive space on the actual physical server that is running your instance. You can see the available instance store by doing sudo fdisk -l. Some images come with some instance store volumes already mounted (see df -h). Otherwise you'll have to mount and format the instance store volumes before you can use them.
Data on an instance store volume is lost when you stop (not terminate) your instance because it is local to a physical server, and your instance might start up on a new server.
Quite simply, EC2 is running your virtual server on some physical server. The root filesystem can either be on a local disk (ephemeral storage) or on network attached storage (EBS). With EBS, they can snapshot it for backups or to make a copy, so EBS is far more flexible, although not as fast as a local disk in the server where your instance is running.
In order to make this all work, when you shutdown an ephemeral server, amazon wipes the disk in order to reallocate it to the next customer. There is no need or reason for them to do that with EBS, since it was not physically attached to that server in the first place.
You might note, that even EBS backed instances (depending on size) come with an allocation of ephemeral storage (2-500gig+) which can be used for swap, logs, or whatever else you want to do with them. The only issue of course is that should the server be shutdown, or should there be a catastrophic disk or hardware error, you'll lose that data. You can still manually back it up, in the same way people have backed up traditional servers over the years.
Making your own AMI from an EBS backed server is trivial now, and can be done easily through the AWS web interface. Making a non-EBS backed AMI is a very complicated task the last time I tried to do it. With that said, there are certain use cases where it makes a lot of sense to consider using purely ephemeral storage. Computation or memory/cache nodes that have no need to persist data will be faster and cost less.