AWS CloudEndure migration specifics - amazon-web-services

My company plans on using AWS CloudEndure to migrate a bunch of on-premises Hyper-V servers to the AWS cloud.
I want to know specifically what folder structure is migrated, and I have not been able to find this anywhere. For example, if there's VS Code with very specific configuration and plugins on those servers, is all of that configuration migrated as well? Does that mean that the "/user/appdata/.vscode" folder is migrated?
I understand that the agent migrates all the server volumes to EBS and that they are then replicated to EC2 instances, but can anybody show an example of the file structure that is migrated?

CloudEndure does block-level replication of your disks, so you will end up with an identical replica on your AWS target. Because replication happens below the file-system level, every file and folder on the volume, including application configuration such as your VS Code settings, comes across unchanged.

On a Windows machine, check the log at C:\Program Files (x86)\Cloudendure\agent.log.0 to get the details of what the agent is working on.

CloudEndure works at the disk level: it replicates the data on the disk, following the re-hosting (lift-and-shift) method. Once the replication is done, you can use the migrated server exactly as you did your on-premises server, just on AWS.

Related

Centralized & versioned file based system for a web application deployed in Kubernetes

I am trying to create a centralized, file-based repository where I can upload all the configuration files needed by an application that is deployed as a pod inside Kubernetes. Any suggestions on how to achieve this? Can the file-based repository version the uploaded files?
I see that s3fs-fuse could be used for this, but as far as I can tell it won't support versioning the config files added to the S3 bucket.
https://github.com/s3fs-fuse/s3fs-fuse
Any other suggestions?
You could use Elastic File System (EFS), which is supported by EKS:
Applications running in Kubernetes can use EFS file systems to share data between pods in a scale-out group, or with other applications running within or outside of Kubernetes. EFS can also help Kubernetes applications be highly available because all data written to EFS is written to multiple AWS Availability zones. If a Kubernetes pod is terminated and relaunched, the CSI driver will reconnect the EFS file system, even if the pod is relaunched in a different AWS Availability Zone.
But it's not S3, and it does not have file versioning the way S3 does. You would have to add such functionality yourself, e.g. by keeping everything in a Git repository on the EFS file system.
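If you go the EFS route, a minimal sketch of wiring an EFS file system into pods via the EFS CSI driver could look like the following; the file system ID, names and size are hypothetical, and it assumes the aws-efs-csi-driver is already installed in the cluster:

    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: config-efs-pv
    spec:
      capacity:
        storage: 5Gi               # required by the API, effectively ignored by EFS
      accessModes:
        - ReadWriteMany            # many pods, across AZs, can mount it
      persistentVolumeReclaimPolicy: Retain
      storageClassName: efs-sc
      csi:
        driver: efs.csi.aws.com
        volumeHandle: fs-12345678  # hypothetical EFS file system ID
    ---
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: config-efs-pvc
    spec:
      accessModes:
        - ReadWriteMany
      storageClassName: efs-sc
      resources:
        requests:
          storage: 5Gi

The application pod then mounts config-efs-pvc like any other PersistentVolumeClaim.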
Why not use git?
The following article contains an example which runs a git clone within an initContainer:
https://blog.frankel.ch/versatility-kubernetes-initcontainer/
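A rough sketch of that pattern, assuming a hypothetical Git repository holding the config files and a shared emptyDir volume (image names, URL and paths are placeholders):

    apiVersion: v1
    kind: Pod
    metadata:
      name: app-with-config
    spec:
      initContainers:
        - name: fetch-config
          image: alpine/git                 # small image that ships the git client
          command: ["git"]
          args: ["clone", "--depth=1", "https://example.com/acme/app-config.git", "/config"]
          volumeMounts:
            - name: config
              mountPath: /config
      containers:
        - name: app
          image: my-app:latest              # hypothetical application image
          volumeMounts:
            - name: config
              mountPath: /etc/app-config
              readOnly: true
      volumes:
        - name: config
          emptyDir: {}

Because the repository is cloned on every pod start, versioning comes from Git itself; adding --branch with a tag name to the clone arguments pins the config to a specific version.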

Migrating an on-premises web application to AWS EC2

Can someone please advise on the steps required to migrate a web application, currently running on an on-premises Tomcat server, to an AWS EC2 instance? I understand this is not straightforward and requires a fairly detailed process.
The code is written in Java and the database used is Oracle.
It would be helpful if someone could suggest a relevant document or website with a demo I can refer to in order to proceed with this scenario.
If it's a personal project, then I would recommend Lightsail as the simplest way to deploy an existing Java application.
For the database, use a small MySQL instance, or, if a relational database is not needed, a document database like DynamoDB. https://aws.amazon.com/products/databases/?nc2=h_m1
There are multiple choices for how to migrate a Java application to AWS.
You could potentially use existing AWS services like:
Lightsail - https://aws.amazon.com/lightsail/
Elastic Beanstalk - https://aws.amazon.com/elasticbeanstalk/
or
an EC2 instance with Tomcat installed manually
ECS with Docker (a minimal container sketch follows this list) - https://aws.amazon.com/getting-started/tutorials/deploy-docker-containers/?nc2=type_a
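For the ECS/Docker route, the container image can be as simple as dropping the WAR into the official Tomcat base image; the tag and WAR file name below are just placeholders:

    # Dockerfile - hypothetical sketch of containerizing the Tomcat app
    FROM tomcat:9.0
    # deploy the application as the root web app
    COPY target/mywebapp.war /usr/local/tomcat/webapps/ROOT.war
    EXPOSE 8080

The resulting image can then be pushed to ECR and run on ECS behind a load balancer, as the linked tutorial shows.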
As for the database, Oracle is an option, but quite an expensive one.
When moving to AWS it's better to use one of the RDS managed databases like MySQL or PostgreSQL, or the more expensive Aurora.
In order to propose an architecture, some details would be needed: the predicted load, the size of the application, and the volume of data. Is the product regional or global? Are there any additional issues that need to be addressed while moving to the cloud (performance, availability, etc.)? How are users authenticated (are any other services needed)?

How to set up shared persistent storage for multiple AWS EC2 instances?

I have a service hosted on Amazon Web Services. There I have multiple EC2 instances running with the exact same setup and data, managed by an Elastic Load Balancer and scaling groups.
Those instances are web servers running web applications based on PHP. So currently there are the very same files etc. placed on every instance. But when the ELB / scaling group launches a new instance based on load rules etc., the files might not be up-to-date.
Additionally, I'd rather like to use a shared file system for PHP sessions etc. than sticky sessions.
So, my question is: for those reasons, and maybe more coming up in the future, I would like to have a shared file system that I can attach to my EC2 instances.
What way would you suggest to resolve this? Are there any solutions offered by AWS directly, so I can rely on their services rather than doing it on my own with DRBD and so on? What is the easiest approach? DRBD, NFS, ...? Is S3 also feasible for these purposes?
Thanks in advance.
As mentioned in a comment, AWS has announced EFS (http://aws.amazon.com/efs/), a shared network file system. It is currently in very limited preview, but based on previous AWS services I would hope to see it generally available in the next few months.
In the meantime there are a couple of third-party shared file system solutions for AWS, such as SoftNAS: https://aws.amazon.com/marketplace/pp/B00PJ9FGVU/ref=srh_res_product_title?ie=UTF8&sr=0-3&qid=1432203627313
S3 is possible but not always ideal; the main blocker is that it does not natively support any filesystem protocols, so all interactions need to go through the AWS API or HTTP calls. Additionally, when looking at using it as a session store, the 'eventually consistent' model will likely cause issues.
That being said, if all you need is updated resources, you could create a simple script, run either from cron or on startup, that downloads the files from S3.
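A minimal sketch of such a script, with a hypothetical bucket name and document root; it assumes the AWS CLI is installed and the instance has a role or credentials with read access to the bucket:

    #!/bin/bash
    # pull the latest web resources from S3 into the document root,
    # removing local files that no longer exist in the bucket
    aws s3 sync s3://my-app-resources/webroot/ /var/www/html/ --delete

Run it from the instance's startup scripts (or user data) and optionally from a cron entry to keep long-running instances current.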
Finally, in the case of static resources like CSS and images, don't store them on your web server in the first place: there are plenty of articles covering the benefits of storing and accessing static web resources directly from S3 while keeping the dynamic content on your server.
From what we can tell at this point, EFS is expected to provide basic NFS file sharing on SSD-backed storage. Once available, it will be a v1.0 proprietary file system. There is no encryption and it's AWS-only. The data is completely under AWS control.
SoftNAS is a mature, proven, advanced ZFS-based NAS filer that is full-featured, including encrypted EBS and S3 storage, storage snapshots for data protection, writable clones for DevOps and QA testing, RAM and SSD caching for maximum IOPS and throughput, deduplication and compression, cross-zone HA and a 100% up-time SLA. It supports NFS with LDAP and Active Directory authentication, CIFS/SMB with AD users/groups, iSCSI multi-pathing, FTP and (soon) AFP. SoftNAS instances and all storage are completely under your control, and you have complete control of the EBS and S3 encryption and keys (you can use EBS encryption or any Linux-compatible encryption and key-management approach you prefer or require).
The ZFS filesystem is a proven filesystem that is trusted by thousands of enterprises globally. Customers are running more than 600 million files in production on SoftNAS today - ZFS is capable of scaling into the billions.
SoftNAS is cross-platform and runs on cloud platforms other than AWS, including Azure, CenturyLink Cloud, Faction cloud, VMware vSphere/ESXi, VMware vCloud Air and Hyper-V, so your data is not limited to or locked into AWS. More platforms are planned. It provides cross-platform replication, making it easy to migrate data between any supported public cloud, private cloud, or premises-based data center.
SoftNAS is backed by industry-leading technical support from cloud storage specialists (it's all we do), something you may need or want.
Those are some of the more noteworthy differences between EFS and SoftNAS. For a more detailed comparison chart:
https://www.softnas.com/wp/nas-storage/softnas-cloud-aws-nfs-cifs/how-does-it-compare/
If you are willing to roll your own HA NFS cluster, and be responsible for its care, feeding and support, then you can use Linux and DRBD/corosync or any number of other Linux clustering approaches. You will have to support it yourself and be responsible for whatever happens.
There's also GlusterFS. It does well up to 250,000 files (in our testing) and has been observed to suffer from an IOPS brownout when approaching 1 million files, and IOPS blackouts above 1 million files (according to customers who have used it). For smaller deployments it reportedly works reasonably well.
Hope that helps.
CTO - SoftNAS
For keeping your web server sessions in sync, you can easily switch to Redis or Memcached as your session handler. This is a simple setting in php.ini, and all of your servers can use the same Redis or Memcached server for sessions. You can use Amazon's ElastiCache, which will manage the Redis or Memcached instance for you.
http://phpave.com/redis-as-a-php-session-handler/ <- explains how to set up Redis with PHP pretty easily
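The php.ini change the article describes boils down to something like this, assuming the phpredis extension is installed and an ElastiCache Redis endpoint (the hostname below is a placeholder):

    ; php.ini - store PHP sessions in Redis instead of local files
    session.save_handler = redis
    session.save_path = "tcp://my-cache.abc123.0001.use1.cache.amazonaws.com:6379"

With this in place every web server behind the load balancer reads and writes the same session store, so sticky sessions are no longer needed.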
Keeping your files in sync is a little bit more complicated.
How do I push new code changes to all my web servers?
You could use Git. When you deploy, you can set up multiple servers as push targets so that pushing your branch (master) deploys to all of them. That way every new build goes out to every web server; a sketch follows.
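One way to do that, assuming SSH access and bare repositories on each web server (hostnames and paths are made up), is a single remote with several push URLs:

    # run once on your workstation or CI box
    git remote add production ssh://deploy@web1.example.com/var/repo/site.git
    git remote set-url --add --push production ssh://deploy@web1.example.com/var/repo/site.git
    git remote set-url --add --push production ssh://deploy@web2.example.com/var/repo/site.git

    # afterwards a single push goes to every server
    git push production master

Each server would then use a post-receive hook to check the pushed code out into its web root.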
What about new machines that launch?
I would set up new machines to run an rsync script against a trusted source, your master web server. That way they sync their web folders with the master when they boot and end up identical even if the AMI contained old web files.
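A sketch of that boot-time sync, with hypothetical hostnames and paths; it assumes the new instance has SSH key access to the master:

    #!/bin/bash
    # sync the web root from the master web server at boot,
    # deleting any stale files baked into the AMI
    rsync -az --delete deploy@master.internal.example.com:/var/www/html/ /var/www/html/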
What about files that change and need to be updated live?
Store any user-uploaded files in S3. If a user uploads a document on server 1, the file is stored in S3 and its location is stored in a database. Then, if a different user is on server 2, they can see the same file and access it as if it were on server 2: the file is retrieved from S3 and served to the client.
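Conceptually the flow is only a couple of steps; with the AWS CLI and a hypothetical bucket and key it looks like this (a real application would use an AWS SDK rather than shelling out, but the idea is the same):

    # server 1: push the uploaded file to S3 and record the object key in the database
    aws s3 cp /tmp/uploads/report.pdf s3://my-app-user-files/uploads/42/report.pdf

    # server 2, later: fetch the same object using the key stored in the database
    aws s3 cp s3://my-app-user-files/uploads/42/report.pdf /tmp/cache/report.pdf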
GlusterFS is also an open-source distributed file system used by many to create shared storage across EC2 instances.
Until Amazon EFS hits production, the best approach in my opinion is to build a storage backend exporting NFS from EC2 instances, maybe using Pacemaker/Corosync to achieve HA.
You could create an EBS volume that stores the files and instruct Pacemaker to unmount/detach it from the failed node and then attach/mount the EBS volume on the healthy NFS cluster node.
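Under the hood that failover boils down to moving the volume with the EC2 API; the IDs and device name below are hypothetical, and in practice these calls would live inside a Pacemaker resource agent rather than be run by hand:

    # detach the data volume from the failed node...
    aws ec2 detach-volume --volume-id vol-0123456789abcdef0 --force
    # ...attach it to the healthy NFS node...
    aws ec2 attach-volume --volume-id vol-0123456789abcdef0 \
        --instance-id i-0fedcba9876543210 --device /dev/xvdf
    # ...then mount and re-export it on that node
    mount /dev/xvdf /export/data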
Hi, we currently use a product called SoftNAS in our AWS environment. It allows us to choose between both EBS- and S3-backed storage. It has built-in replication as well as a high-availability option. It may be something you can check out; I believe they offer a free trial you can try out on AWS.
We are using ObjectiveFS and it is working well for us. It uses S3 for storage and is straightforward to set up.
They've also written a doc on how to share files between EC2 instances.
http://objectivefs.com/howto/how-to-share-files-between-ec2-instances

Setting up Spark on an existing EC2 cluster

I have to access some big files in buckets in Amazon S3 and do processing on them. For this I was planning to use Apache Spark. I have 2 EC2 instances for this learning project. They are not used for anything but small cron jobs, so could I use them to install and run Spark? If so, how do I install Spark on existing EC2 boxes so that I can make one the master and the other a slave?
If it helps: I installed Spark in standalone mode on one instance and then on the other as well, setting one as master and the other as slave. The detailed instructions I followed are:
https://spark.apache.org/docs/1.2.0/spark-standalone.html#installing-spark-standalone-to-a-cluster
See the tutorial on Apache Spark Cluster on EC2 here http://www.supergloo.com/fieldnotes/apache-spark-cluster-amazon-ec2-tutorial/
Yes, you can easily create a master/slave setup with 2 AWS instances. Just set SPARK_MASTER_IP to instance 1's private IP in spark-env.sh on both instances, and put instance 2's private IP in the slaves file in the conf folder; these configurations are the same on both machines. Other settings, such as memory and cores, are also set there. Then you can start it from the master. Make sure Spark is installed at the same location on both machines.
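Concretely, with two instances whose (hypothetical) private IPs are 172.31.10.11 (master) and 172.31.10.12 (slave), the standalone setup described above looks roughly like this:

    # conf/spark-env.sh (identical on both instances)
    export SPARK_MASTER_IP=172.31.10.11   # private IP of the master instance
    export SPARK_WORKER_MEMORY=2g         # optional tuning
    export SPARK_WORKER_CORES=2

    # conf/slaves (identical on both instances) lists one worker host per line,
    # here just the private IP of the second instance:
    #   172.31.10.12

    # with Spark installed at the same path on both machines and passwordless
    # SSH from the master to the slave, start everything from the master:
    ./sbin/start-all.sh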

Shared File Systems between multiple AWS EC2 instances

I have a couple of windows server instances running on Amazon EC2 and would like to make them a bit more fault tolerant by running a duplicate instance with load balancers.
The problem is the server-specific data: as an example, it does no good to fail over from one web server to another if the contents of the document root, i.e. C:/htdocs/ (Apache) or C:/Repositories (VisualSVN Server), are not identical.
Is there a way to share a volume across two or more instances?
My idea is to share a folder between EC2 instances:
I read that it's not possible to attach the same EBS volume to multiple instances. I believe AWS is not NFS-friendly either, in case I want to mount the volumes over NFS.
And finally, I've also checked out an S3 bucket mounted with s3fs, but I found out it's not a good option either.
Can anyone help point me in the right direction?
You are right, at the moment it is not possible to add an EBS volume to multiple instances. To create a common storage for all instances, there are options like NFS, mounting S3 buckets or using a distributed cluster filesystem like GlusterFS.
However, in most cases you can simplify your setup. Try to offload static assets to another (static) domain, or even host them on a website-enabled S3 bucket. This way you only have to care about the dynamic application logic or scripts on your app servers.
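A quick sketch of that with the AWS CLI and a hypothetical bucket name:

    # upload the static assets and make them publicly readable
    aws s3 sync ./static s3://my-app-static-assets/ --acl public-read
    # enable static website hosting on the bucket
    aws s3 website s3://my-app-static-assets/ --index-document index.html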
Also try to use some automated deployment and/or configuration management tools. With these you can, for example, create new machines easily, or use them to deploy the latest code to your machines.