Shared Storage AWS EC2 - amazon-web-services

I have done a lot of reading about sharing files between EC2 instances, and the answer always seems to be EFS, which for now is not available in my region.
I am new to AWS and have my instances set up the way I want, with my app running, but I have a few questions; any help would be appreciated.
I am using logs as an example, but there are many other parts of the application that really need to be centralised, and I don't wish to lose them when an instance is terminated after use.
1) If I have an application writing logs to the local disk and auto scaling set up, a new instance fires up with my app again writing logs to its local disk. Am I correct? If so, when the instance is no longer needed it would be terminated along with its local disk, losing the application logs.
2) I understand S3 is available, but I am worried about performance, as this app will be writing logs continuously and has over a million users.
Alternatives I am considering:
- Write logs to a database
- Have a small instance used purely as a file share, with everything sitting in the share
Any suggestions would be helpful. Also, this is not just for logs; there are security credentials etc. that are shared.

Related

AWS EC2 multiple Instances with redundant data

So I am a noob who has just started learning about cloud computing and how AWS works.
AWS provides EC2 as a service, where I can run a VM and put my data on top of it, or run my web server on top of the newly created instance.
I have seen people creating multiple instances in the same AZ.
Doesn't that lead to redundant data? I mean, we are creating more EC2 instances in the same AZ and putting the same data on each instance, so that when one goes down, the client can access the data from another instance.
My question: is it industry practice to keep redundant data (the same data) across all the instances for better reachability, or do we put only a fraction of the data on the other instances rather than all of it?
Please don't mind my stupid question, I am just learning.
Usually, when you run several instances of the same application, you run them in an Auto Scaling group (ASG). For this, your application should be stateless, as instances in an ASG can be launched and terminated automatically at any time. To protect against data loss and ensure that new instances have access to existing data files, you don't store any user data (e.g. user-uploaded images) on the instances.
Instead, you store the data files outside of your instances. Common choices for that are S3 and EFS. This solves the data redundancy issue, as you only have one copy of your files, which can be accessed from all the instances. It also protects your data from being lost if your instances get terminated, as S3 and EFS are highly available, fault-tolerant data stores managed by AWS.
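As a sketch of the "store data outside the instance" idea, a periodic job on each instance could ship its log file to S3 under a per-instance, date-partitioned key so the logs survive termination. This is only an illustration: the bucket name, instance ID and file path below are hypothetical, and boto3 is assumed to be installed on the instance.

```python
import datetime

def s3_log_key(instance_id, now=None):
    # Partition log objects by date and instance so logs from
    # terminated instances stay separate and easy to find.
    now = now or datetime.datetime.utcnow()
    return f"logs/{now:%Y/%m/%d}/{instance_id}.log"

# A cron job on the instance could then upload the local log file:
#
#   import boto3
#   boto3.client("s3").upload_file(
#       "/var/log/app.log", "my-app-logs", s3_log_key("i-0abc123"))
```

For continuous, high-volume logging, streaming to a log service rather than uploading whole files is the more common design, but the key idea is the same: the durable copy lives outside the instance.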

Major differences of AWS and normal VPS (server)

I have a very basic idea of servers. So far I have only worked with a few Ubuntu VPS servers, which I can easily maintain: install a database, upload my code and run my projects. To store static data like images/videos I use the local SSD storage of my server.
Now I have some projects where AWS is required. In the beginning I thought it would be very similar to my normal Ubuntu-based VPS server, but as I started researching and reading articles, as well as AWS's own docs, I found that it has many more features for servers and at the same time is a little complicated for a beginner. I would be really glad if someone could give their time and answer these questions, to clarify the concepts for me and for people like me.
My plan is to use one EC2 instance to run my project, but I see many experts suggesting Elastic Beanstalk, creating the EC2 instance inside that, even though I can run my project directly on EC2 without Elastic Beanstalk's help. So why is it better, and what else does Elastic Beanstalk provide?
When I check the pricing of EC2 (On-Demand > Linux/Unix), it lists ECU as "Variable". What does that mean, and what is an ECU used for?
Instance Storage (GB) is listed as "EBS only". Does that mean my server comes with no storage and I must buy it separately? With my previous VPS servers I used to get at least some storage included. Storage is needed to install software like MySQL/Redis/Python, and to upload my code or a few static images.
As with storage, do I also need to buy something else for a database? If I want to use PostgreSQL, do I need to buy AWS RDS, or can I install it inside my Linux system?
Lastly, what are the main differences between my normal VPS Linux server and an AWS EC2 Linux server?
Thanks in advance for giving time :)
Let me try to answer your questions inline.
My plan is to use one EC2 instance to run my project, but I see many experts suggesting Elastic Beanstalk, creating the EC2 instance inside that, even though I can run my project directly on EC2 without Elastic Beanstalk's help. So why is it better, and what else does Elastic Beanstalk provide?
If you are planning to use a single server and a database, going with EC2 and RDS is straightforward. However, if you want auto scaling (automatically increasing the number of servers when load increases and scaling back down to one afterwards), load balancing and DevOps support, you have to set those up yourself, which requires more knowledge of the AWS platform. AWS Elastic Beanstalk does this for you automatically: you select the technology of your application and simply upload the code.
When I check the pricing of EC2 (On-Demand > Linux/Unix), it lists ECU as "Variable". What does that mean, and what is an ECU used for?
ECU (EC2 Compute Unit) is simply a rough figure for comparing processing power across the EC2 instance classes, which have different levels of it.
Instance Storage (GB) is listed as "EBS only". Does that mean my server comes with no storage and I must buy it separately? With my previous VPS servers I used to get at least some storage included. Storage is needed to install software like MySQL/Redis/Python, and to upload my code or a few static images.
EBS storage is reliable storage (with internal redundancy) that lasts beyond your instance's lifetime. That means you can upgrade the EC2 class, install software or store files, and everything remains in the EBS volume unless you delete it.
Since you are basically paying per GB, you can also create another EBS volume for static files and mount it to the EC2 instance if you want.
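To put the "paying per GB" point in concrete terms, here is a rough cost estimate. The $0.10 per GB-month figure is approximate for general-purpose volumes and varies by region and volume type, so treat the numbers as illustrative only.

```python
def monthly_ebs_cost(size_gb, price_per_gb_month=0.10):
    # EBS is billed per provisioned GB-month. $0.10/GB-month is a
    # rough general-purpose figure; actual prices vary by region
    # and volume type.
    return round(size_gb * price_per_gb_month, 2)

# e.g. a separate 100 GB volume for static files would add
# roughly monthly_ebs_cost(100) dollars per month.
```

Note that you pay for the provisioned size of the volume, not for how much of it you actually fill.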
As with storage, do I also need to buy something else for a database? If I want to use PostgreSQL, do I need to buy AWS RDS, or can I install it inside my Linux system?
It's not mandatory, but it is recommended, since you can use a smaller instance for the web server and another one for the DB. It's up to you. For example, the cost would be roughly similar whether you use two small EC2 instances for the web server and DB server (or use RDS), or a single medium-size EC2 instance running both the DB and the web server.
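Either way, the application code barely changes: whether PostgreSQL runs on the same instance, on a second EC2 instance, or on RDS, only the host in the connection settings differs. A minimal sketch (all hostnames and database names below are hypothetical):

```python
def pg_dsn(host, dbname, user):
    # The client-side connection string is identical in all three
    # setups; only the host changes.
    return f"host={host} dbname={dbname} user={user}"

# PostgreSQL installed on the same instance:
LOCAL = pg_dsn("localhost", "myapp", "myapp")
# A separate EC2 instance or an RDS endpoint (hypothetical hostname):
REMOTE = pg_dsn("mydb.abc123.us-east-1.rds.amazonaws.com", "myapp", "myapp")
```

This is why starting on a single instance and later moving the database to RDS is a fairly low-friction migration.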
Lastly, what are the main differences between my normal VPS Linux server and an AWS EC2 Linux server?
You get more options for the underlying hardware, since AWS provides many configuration options. In addition, EC2 instances can make use of the AWS ecosystem (networking, security, load balancing, etc.) for solution architectures that are better optimized in terms of reliability, security and performance.
Q1) Beanstalk is a management application. AWS has several: CloudFormation, OpsWorks. Third party vendors have their own: Chef, Ansible, Terraform, etc. I really like Beanstalk and how it makes deploying code very easy for small sites (one command). I can scale up or scale down with a button push. I also use CloudFormation every day for just about everything.
Q2) ECU is an EC2 Compute Unit, used to compare one instance type with another. How does that translate to physical CPUs? Nobody outside AWS knows, as AWS does not publish its absolute meaning. Its only use is to compare EC2 instances.
Q3) When you launch an EC2 instance, you will need storage. This is an additional cost (around $0.10 per GB per month). You will specify the size and type of storage (there are a number of types). There are also Instance Store volumes. Stay away from these unless you really understand how to use them (they don't persist across a stop, so all data is lost). There are good use cases for Instance Store (AI, big data, image processing), but a website is not one of them.
Q4) If your EC2 instance is big enough (2 GB of memory or larger), you can install PostgreSQL, MySQL, etc. on your EC2 instance. Otherwise, AWS has a number of database options: DynamoDB, RDS, Aurora, etc.
Q5) Difficult to answer as each vendor offers its own set of features. EC2 instances are virtual machines. You have control over the raw power of that VM. Most VPS servers have management interfaces that EC2 does not. Usually EC2 is more expensive than VPS servers.
Watch a couple of AWS videos on YouTube. This will help you to understand AWS and why it is so successful in the cloud. Linux Academy, A Cloud Guru, etc. have very good training courses on AWS.
AWS Essentials: EC2 Basics
If you have further questions, open a new StackOverflow question per question. You will seldom get answers to long multi-question questions.

Creating a persistent Link to an EFS drive on a Windows EC2 Server

I have created a Windows EC2 instance on AWS, and I have loaded it up with all of my needed software. My intention is to use this instance to create an image, so that I can (in the very near future) load up a much more powerful instance type using this image, and run a bunch of computations.
However, I also need to have a centralized location to store data. So, I created an EFS drive on AWS, and now I am trying to connect my instance to the EFS using a symbolic link that will persist to every other instance I load up in the future. I want to eventually have an army of instances, all of which use the centralized EFS drive as their primary storage device so that they can all load and save data, which can then be used by other instances.
I've been running Google searches all morning, but I'm coming up empty on how to do this. Any resources or tips would be greatly appreciated.
Thanks!
EFS is basically a managed NFS server. In order to mount it on a Windows instance, you will need to find an NFS client for Windows.
An alternative would be to mount the EFS on a Linux-based instance and export the file system using Samba, which could then be mounted on your Windows instances. Doing this you would lose a lot of the benefits of EFS (your Linux instance becomes a single point of failure, and a bottleneck for high-bandwidth requirements), but it might be possible.
You don't say what you are trying to accomplish, but I would suggest designing a solution that would pull data from S3 as needed. That would also allow you to run multiple instances in parallel.
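One way to sketch the "pull from S3 as needed" design: each instance maps S3 keys to the same relative local paths and downloads objects on demand, so any number of instances can work in parallel without a shared drive. The bucket name and cache directory below are hypothetical, and boto3 is assumed to be installed.

```python
import os

def cache_path(cache_dir, key):
    # Mirror the S3 key layout locally so every instance maps the
    # same key to the same relative path.
    return os.path.join(cache_dir, *key.split("/"))

# Download-on-demand with boto3:
#
#   import boto3
#   def fetch(key, bucket="my-data-bucket", cache_dir="/tmp/cache"):
#       path = cache_path(cache_dir, key)
#       if not os.path.exists(path):
#           os.makedirs(os.path.dirname(path), exist_ok=True)
#           boto3.client("s3").download_file(bucket, key, path)
#       return path
```

Results can be written back to S3 the same way, which also makes it easy to hand work off between instances.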

Choosing a hosting platform that allows file and directory creation

I am trying to launch a project where my server generates user files and directories. Since Heroku doesn't allow that, I am trying to find the platform that best fits my needs without changing a bunch of my code.
My Node server stores data in Firebase, along with some files on the server itself. I realize this is not best practice, but it is what it is for now.
What would you recommend?
You can store your objects in S3. Do not store files on VMs, in case of any failure.
Depending on your needs, an EBS volume would be a good start. It has built-in redundancy and the chances of losing any data are very small. The advantage is that it lives on if you stop an instance (and can survive termination if delete-on-termination is disabled on the volume).
The newer EFS is very fast and can be mounted to multiple machines, much like an NFS file system. It is redundant across availability zones and will also survive a machine stop/termination.
S3 is an object store and isn't really meant for file system I/O. It can easily store files but it doesn't have nearly the performance of either EBS or EFS. It lives on after machine termination - indeed, it can be accessed with HTTP when properly configured.
Ultimately, you can create files normally on the EC2 with instance store, EBS, or EFS. The instance store data is lost if you terminate or even stop the instance. Be careful with that - you can easily lose tons of data when it is on instance store and not properly backed up.
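To make that last point concrete: creating files under an EBS or EFS mount is just ordinary file I/O, and durability depends entirely on what backs the directory. A minimal sketch; the production path such as /mnt/efs is hypothetical, and a temporary directory stands in for it here:

```python
import os
import tempfile

def save_file(base_dir, relpath, data):
    # Ordinary file I/O. Whether this survives a stop/terminate
    # depends on whether base_dir sits on instance store, EBS, or EFS.
    path = os.path.join(base_dir, relpath)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as f:
        f.write(data)
    return path

# In production base_dir might be an EFS mount like /mnt/efs;
# here a temp directory stands in for it:
base = tempfile.mkdtemp()
saved = save_file(base, "user42/avatar.png", b"\x89PNG")
```

Because the application code is identical either way, you can start on EBS and move the directory to EFS later without code changes.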

Best AWS setup for a dedicated FTP server?

My company is looking for a solution for file sharing via FTP - currently, we share one server for client/admin FTP file sharing and serving multiple sites, and are looking to split off our roles so that we have one server dedicated to FTP and one for serving websites.
I have tried to find a good solution with AWS, but cannot find any detailed information regarding EBS and EC2 servers, or whether an EC2 instance will be able to handle FTP storage. For example, a t2.nano instance seems ideal with 1 vCPU and minimal RAM, but I see no information regarding EBS storage limits.
We need around 500GiB at most, and will have transfers happening daily in the neighborhood of 1GiB in and out. We don't need to run a database or http server. We may run services for file cleanup in the background weekly.
EDIT:
I mis-worded the question, which stemmed from a fundamental lack of understanding of AWS EC2 and EBS that I have since remedied. I know EC2 can run FTP services; the question was more about a cost-effective solution with dynamic storage. Thanks for the input!
As others here on SO will tell you: don't bother with EBS. It can be made to work but does not make much sense in the long run. It's also more expensive and trickier to operate (backups/disaster recovery/having multiple ftp server machines).
Go with S3 for storing your files, and use something that can expose S3 over FTP (like s3fs).
See:
http://resources.intenseschool.com/amazon-aws-howto-configure-a-ftp-server-using-amazon-s3/
Setting up FTP on Amazon Cloud Server
http://cloudacademy.com/blog/s3-ftp-server/
If FTP is not a strong requirement you can also look at migrating people to using S3 directly (either initially or after you do the setup and give them the option of both FTP and S3 directly)
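If clients can be moved off FTP entirely, a common S3-native pattern is handing out time-limited presigned URLs instead of FTP credentials. A sketch, assuming boto3 is installed and with a hypothetical bucket and key:

```python
from datetime import timedelta

# Link lifetime for the ExpiresIn parameter below: one hour.
EXPIRES_IN = int(timedelta(hours=1).total_seconds())

# Generating the download link with boto3:
#
#   import boto3
#   url = boto3.client("s3").generate_presigned_url(
#       "get_object",
#       Params={"Bucket": "client-share", "Key": "reports/q1.pdf"},
#       ExpiresIn=EXPIRES_IN)
#
# Anyone with the URL can download the object until it expires,
# with no AWS credentials or FTP account required.
```

Uploads can be handled the same way with a presigned "put_object" URL, so clients never need direct access to the bucket.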
This question is among the most frequently seen on SO for AWS: you can install an FTP server on any EC2 instance type.
There is effectively no limit on EBS for this use case, and you can always increase the storage if you need to, so the best rule is: start low and increase when needed.
The only point to mention is that network performance comes with the instance type, so if you care about speed, a t2.nano (low network performance) might not be sufficient.