We have a Java application running on Tomcat that was hosted on AWS EC2 with an RDS database. We migrated the application to Azure, and performance dropped significantly. On EC2 we had an m5.large machine (2 CPUs, 8 GB) and on Azure we have a P2v2 (2 CPUs, 7 GB).
The database stayed in RDS, so one of our hypotheses is that we are losing performance to database traffic, since the application and the database are now on different hosts. Could that be it? If so, would creating a VPN help in any way?
The short answer is yes: you now have lag between your application server and the DB, and a VPN would not make much of a difference. What you want is to get your DB close to your application server again. You could either migrate your RDS database to Azure as well, or, if it needs to stay in AWS, see whether you can replicate it to Azure (this depends on the DB type).
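Before re-architecting, it's worth measuring the actual round trip between the Azure app server and the RDS endpoint. A minimal sketch using a TCP connect as a first-order latency probe (the hostname in the comment is a placeholder, not a real endpoint):

```python
import socket
import time

def tcp_connect_ms(host, port, timeout=2.0):
    """Return the time in milliseconds to open a TCP connection.

    The TCP handshake is a rough proxy for the network round trip that
    every database query will pay.
    """
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass
    return (time.perf_counter() - start) * 1000.0

# Run this from the Azure app server against your RDS endpoint, e.g.:
#   tcp_connect_ms("mydb.example.us-east-1.rds.amazonaws.com", 3306)
```

If the measured round trip is tens of milliseconds, cross-cloud database traffic is almost certainly the culprit.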
Imagine two cases:
I have a web server running on an EC2 instance, connected to a database in RDS, the managed database service.
I have a web server and a database running on the same EC2 instance.
Is my database in RDS going to be slower to access because it's not in the same machine?
Approximately how many milliseconds of latency does the separation add?
Does this become a bottleneck?
What about other managed database services on Azure, GCP, DigitalOcean, etc.?
Do they behave the same?
Yes, accessing an RDS instance from your web server will be slower than accessing a database on the same host, because the traffic has to cross the network, and that adds latency.
The drawback of running the DB on the same server is that you can't use a managed service to take care of your database and you're mixing largely stateless components (webserver) with stateful components (database). This solution is typically not scalable either. If you add more webservers, things get messy.
I don't know about Azure, GCP or DigitalOcean, but I'd be very surprised if things were different there. There are good reasons to separate these components.
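The "how many milliseconds" question above matters mostly through amplification: a page that issues many sequential queries pays the round trip once per query. A back-of-the-envelope sketch (the RTT figures and query count are illustrative assumptions, not measurements):

```python
def added_page_latency_ms(queries_per_request, rtt_ms):
    """Extra page latency when each sequential query pays one network RTT."""
    return queries_per_request * rtt_ms

# Same host: RTT is effectively ~0 ms, so no added latency.
# Same AZ: assume ~0.5 ms RTT -> 30 sequential queries cost ~15 ms extra.
# Cross-region or cross-cloud: assume ~20 ms RTT -> the same 30 queries
# cost ~600 ms extra, which users will definitely notice.
print(added_page_latency_ms(30, 0.5))   # 15.0
print(added_page_latency_ms(30, 20.0))  # 600.0
```

The arithmetic explains why same-region RDS is rarely a bottleneck while a cross-cloud database usually is.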
We're considering implementing an ELB in our production Amazon environment. It seems it will require that production server instances be synced by a nightly script. Also, there is a Solr search engine which will need to be replicated and maintained on each paired server. There's also the issue of debugging: which server is a request going to? If there's a crash, do you have to search both logs? If a production app isn't behaving, how do you isolate which server it is, or do you just deploy debugging code to both instances?
We aren't having issues with response time or server load. This looks like added complexity in exchange for a limited upside; it may be overkill. Thoughts?
You're enumerating the problems that arise when you need high availability :)
You need to consider how critical the availability of the service is, and take that into account when deciding whether a given design is the right solution or just over-engineering :)
Solutions to some caveats:
To avoid nightly syncs: use an EC2 instance as an NFS server and mount the share on both EC2 instances (or use Amazon EFS where it's available).
Debugging: you can give the EC2 instances behind the ELB public IPs, restricted in the Security Groups to just the developers' machines, and when debugging point your /etc/hosts (or the Windows equivalent) at one particular server.
Logs: store the logs in S3 (or on the NFS server mentioned above).
I am new to Amazon Web Services. I was reading about Amazon ElastiCache and wanted to check whether it is like (or maybe more than) a RAM filesystem in Linux, where a portion of system memory is used as a file system. The AWS documentation says ElastiCache is a web service. Is it like an EC2 instance with a few memory modules attached? I really want to understand how it works.
Our company has decided to migrate our physical servers to the AWS cloud. We use the Apache web server and a MySQL database running on Linux. We provide a SaaS platform for e-mail marketing and event scheduling for our customers, and there is usually high web traffic to our site from 9am to 5pm on weekdays. If we want to use the ElastiCache service, how would it be configured in AWS? We have planned two EC2 instances for our web server and an RDS instance for the database.
Thanks.
ElastiCache is simply managed Redis or Memcached. Depending on which one you choose, you would use the corresponding client in your application.
How you implement it depends on what kind of caching you are trying to accomplish.
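The usual pattern with ElastiCache is cache-aside: check the cache first, fall back to the database on a miss, then store the result with a TTL. A minimal sketch using a plain dict as a stand-in for the cache client (with real ElastiCache you would use a Redis or Memcached client, whose get and set-with-expiry calls play the same roles):

```python
import time

class CacheAside:
    """Cache-aside lookup: check cache, fall back to a loader, store with TTL."""

    def __init__(self, ttl_seconds=300):
        self._store = {}          # key -> (expires_at, value); stand-in for Redis
        self._ttl = ttl_seconds

    def get(self, key, loader):
        entry = self._store.get(key)
        now = time.time()
        if entry is not None and entry[0] > now:
            return entry[1]                       # cache hit
        value = loader(key)                       # cache miss: hit the database
        self._store[key] = (now + self._ttl, value)
        return value

# Usage: the loader would normally run a SQL query against RDS.
cache = CacheAside(ttl_seconds=60)
calls = []
def load_user(key):
    calls.append(key)                 # records each "database" hit
    return {"id": key, "name": "alice"}

cache.get("user:1", load_user)
cache.get("user:1", load_user)        # second call is served from the cache
print(len(calls))                     # 1 -- the database was only hit once
```

For the 9am-5pm traffic peak described above, this is what takes repeated read load off the RDS instance.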
I installed a LAMP stack on my AWS EC2 instance so that I can use the MySQL server. Somebody recommended using RDS, but RDS is not free, and it is also just a MySQL server. My question is: what makes RDS so special compared with my MySQL server on an EC2 instance?
Thanks. By the way, I'm quite new to AWS.
RDS is a managed solution, which means AWS staff will take care of:
Patches
Backups
Maintenance
Making sure it's alive
Hosting your database in a second EC2 instance means that:
You have to manage all of the above yourself
Using a LAMP stack and co-hosting Apache and MySQL is the cheapest, but:
You have to manage all of the above yourself
You're probably hosting a database on an instance exposed to the internet
That said, if you're planning to host a production website or service that's more than a personal website/blog/experiment, you'll probably want to host the web server and database on different instances. Picking RDS is less of a headache.
For anything that's not that important, a LAMP stack makes more sense: less scalability and potentially less security, but also less administrative overhead and lower costs.
I am planning to run a web application and expecting traffic of around 100 to 200 users.
Currently I have set up a single Small instance on Amazon. This instance contains everything: the web server (Apache), the database server (MySQL) and the IMAP server (Dovecot). I am thinking of moving my database server out of this instance and creating a separate instance for it. Now my questions are:
Do I get latency in the communication between my web server and database server (both hosted on separate instances on Amazon)?
If yes, what is the standard way to overcome this? (Or do I need to set up a Virtual Private Cloud?)
If you want your architecture to scale you should separate your web server from your database server.
The small latency penalty you will pay (~1-2 ms, even between multiple availability zones) buys you better performance, as you can scale each tier separately.
You can add small (even micro) instances behind a load balancer to handle more web requests, without needing to duplicate an instance that also has to contain a database
You can add an auto-scaling group for your web tier that scales it automatically based on load
You can scale up your DB instance to have more memory, getting a better cache hit rate
You can add ElastiCache between your web server and your database
You can use Amazon RDS as a managed database service, which removes the need to run an instance for the database at all (you pay only for the actual usage of the database in RDS)
Another benefit is better security for your database. If the database is on a separate instance, you can prevent access to it from the internet: use a security group that allows only SQL connections from your web server and no HTTP connections from the internet.
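In EC2 terms, that lock-down is a security-group ingress rule that allows port 3306 only from the web tier's security group, with no public 0.0.0.0/0 rule. A sketch of the rule structure in the shape boto3 uses (the group IDs are placeholders; applying it would go through EC2's authorize_security_group_ingress call):

```python
# Placeholder IDs -- substitute your real security group IDs.
WEB_SG = "sg-web0001"
DB_SG = "sg-db0001"

def mysql_from_web_only(web_sg_id):
    """Ingress permission: MySQL (3306) allowed only from the web tier's SG."""
    return [{
        "IpProtocol": "tcp",
        "FromPort": 3306,
        "ToPort": 3306,
        # Referencing a security group instead of a CIDR block means only
        # members of the web tier can connect; there is no public rule.
        "UserIdGroupPairs": [{"GroupId": web_sg_id}],
    }]

# Applying it would look roughly like:
#   boto3.client("ec2").authorize_security_group_ingress(
#       GroupId=DB_SG, IpPermissions=mysql_from_web_only(WEB_SG))
```

The database instance then needs no public IP at all, which is exactly the isolation the answer above describes.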
This configuration can run in a regular EC2 environment, without using a VPC. You can certainly add a VPC for an even more controlled environment, without additional cost or much added complexity.
In short, for scalability and high availability you should separate your tiers (web and DB). You will probably find yourself saving on cost as well.
Of course there will be latency when communicating between separate machines. If they are both in the same availability zone it will be extremely low, typically what you'd expect for two servers on the same LAN.
If they are in different availability zones in the same region, expect latency on the order of 2-3 ms (per information provided at the 2012 AWS re:Invent conference). That's still quite low.
Using a VPC will not affect latency. That does not give you different physical connections between instances, just virtual isolation.
Finally, consider using Amazon's RDS (Relational Database Service) instead of a dedicated EC2 instance for your MySQL database. The cost is about the same, and Amazon takes care of the housekeeping.
Do I get latency in the communication between my web server and database server (both hosted on separate instances on Amazon)?
Yes, but it's rather insignificant compared to the benefits gained by separating the roles.
If yes, what is the standard way to overcome this? (Or do I need to set up a Virtual Private Cloud?)
A VPC increases security and ease of management of the resources; it does not affect performance. A latency of a millisecond or two isn't normally a problem for a SQL database. Writes are transactional, so data isn't visible to other requests until it's fully completed and committed. I/O throughput and availability are much bigger concerns, which is why separating the database and the application is important.
I'd highly recommend that you take a look at RDS, AWS's managed MySQL, Oracle, or MS SQL Server database service. It lets you easily set up and manage your database, including cross-availability-zone replication and automated backups. I also wrote a blog post yesterday that's fairly relevant to your question.