Are managed databases (e.g. Amazon RDS) slower to access than databases on the same machine (EC2) as the web server?

Imagine two cases:
I have a web server running in an EC2 instance, connected to a database in RDS, the managed database service.
I have a web server and database running in the same EC2 instance.
Is my database in RDS going to be slower to access because it's not on the same machine?
Approximately how many milliseconds of latency does the network hop add?
Does this become a bottleneck?
What about other managed database services like Azure, GCP, Digital Ocean, etc?
Do they behave the same?

Yes, accessing an RDS instance from your web server will be slower than accessing a database on the same host, because the traffic has to cross the network, and that adds latency.
The drawback of running the DB on the same server is that you can't use a managed service to take care of your database, and you're mixing largely stateless components (the web server) with stateful components (the database). This setup typically doesn't scale either: if you add more web servers, things get messy.
I don't know about Azure, GCP, or DigitalOcean, but I'd be very surprised if things were different there. There are good reasons to separate these components.
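If you want to put a number on it for your own setup, the simplest thing is to measure it. Here is a minimal sketch (assuming a MySQL database reachable via the PyMySQL client; the hostnames and credentials are placeholders) that times trivial round trips against both targets:

```python
# Rough round-trip timer: run it once against localhost and once against the
# RDS endpoint, then compare. Host and credentials are placeholders.
import time
import pymysql

def median_roundtrip_ms(host, user, password, runs=100):
    conn = pymysql.connect(host=host, user=user, password=password)
    samples = []
    with conn.cursor() as cur:
        for _ in range(runs):
            start = time.perf_counter()
            cur.execute("SELECT 1")  # trivial query, so we measure the network, not query cost
            cur.fetchone()
            samples.append((time.perf_counter() - start) * 1000)
    conn.close()
    samples.sort()
    return samples[len(samples) // 2]

print("local:", median_roundtrip_ms("127.0.0.1", "app", "secret"), "ms")
print("rds:  ", median_roundtrip_ms("mydb.abc123.us-east-1.rds.amazonaws.com", "app", "secret"), "ms")
```

Within the same availability zone you can expect the RDS figure to be on the order of a millisecond. That per-query overhead only becomes a bottleneck for chatty workloads that issue many small queries per request.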

Related

Poor performance when migrating from AWS to Azure

We have a Java application running on Tomcat that was hosted on AWS EC2 with an RDS database. We migrated the application to Azure, and the performance dropped significantly. In EC2 we had an m5.large machine (2 CPUs, 8 GB) and in Azure we have a P2v2 (2 CPUs, 7 GB).
The database stayed in RDS, so one of our hypotheses is that we are losing performance on database traffic, since the application and database are now on different hosts. Could that be it? If so, would creating a VPN help in any way?
The short answer is yes, you will now have lag between your application server and the DB, and a VPN would not make much of a difference. What you want is to have your DB close to your application server again. One way to do that is to also migrate your RDS database to Azure, or, if it needs to stay in AWS, to see if you can replicate it to Azure (this depends on the DB type).

memorystore and instances in different regions (GCP)

I'm building a chat app in React Native, and the backend is in Node.js. I'm using GKE to deploy the server code.
I'm using a Cloud SQL PostgreSQL instance, connecting over its internal IP. This works. I also use a Memorystore (Redis) instance, and here is the problem.
For autoscaling, I'm planning to run multiple GKE clusters in different regions (for now, europe-west1 and us-central1). I have configured a load balancer with one backend containing all instance groups. I don't know if this is the correct/ideal solution, but it works. The problem lies in the fact that you can only connect to a Memorystore Redis instance from an instance within the same region. If I use us-central1 as the region for my Memorystore instance, I cannot connect to it from the VMs in the EU cluster I created.
What is the best solution to overcome this problem? I've created an extra VM in the same region as the Redis instance, with HAProxy configured as a reverse proxy to the Memorystore instance, and this way I can connect to the Redis database from all instances, no matter what region they're in. But I don't know if this is the correct solution.
EDIT:
I'm using WebSockets (socket.io) for chat messages. Because I'm planning to use multiple servers, I need a centralized database to store (references to) the socket IDs, so users can send messages to users that are connected to other servers.
I'm thinking Redis is the correct solution for a number of reasons:
I can use socket.io-redis to store the socket IDs in Redis
fast response time
I don't know the exact size of the data stored, but it's definitely not megabytes
I'm using a PostgreSQL database to store other information (like usernames and passwords), but it seems to me that Redis is a far better fit for real-time applications.
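For what it's worth, the shared state that socket.io-redis maintains boils down to a small registry in Redis. A rough Python sketch of the same idea (the key names, TTL, and Memorystore IP are made up for illustration):

```python
# Sketch of the shared state behind a cross-server socket registry in Redis
# (socket.io-redis automates the equivalent for Node.js). Key names, the TTL,
# and the Memorystore IP are illustrative only.
import redis

r = redis.Redis(host="10.0.0.5", port=6379, decode_responses=True)  # placeholder IP

def register(user_id, socket_id, server_id):
    # Map user -> (socket, server) so any server can route a message to the
    # server that actually holds the user's websocket.
    r.hset(f"user:{user_id}", mapping={"socket": socket_id, "server": server_id})
    r.expire(f"user:{user_id}", 3600)  # drop stale entries after an hour

def locate(user_id):
    entry = r.hgetall(f"user:{user_id}")
    return (entry["socket"], entry["server"]) if entry else None

def unregister(user_id):
    r.delete(f"user:{user_id}")
```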

Amazon ElastiCache vs Ramfs in Linux

I am new to Amazon Web Services. I was reading about Amazon ElastiCache and wanted to clarify whether it is like (or maybe more than) using a RAM filesystem in Linux, where we use a portion of system memory as a file system. The AWS documentation says ElastiCache is a web service. Is it like an EC2 instance with a few memory modules attached? I really want to understand how it actually works.
Our company has decided to migrate our physical servers into the AWS cloud. We use an Apache web server and a MySQL database running on Linux. We provide a SaaS platform for e-mail marketing and event scheduling for our customers. There is usually high web traffic to our website during 9am-5pm on weekdays. I would like to understand how ElastiCache would be configured if we decide to use it. We have planned two EC2 instances for our web server and an RDS instance for the database.
Thanks.
ElastiCache is simply managed Redis or Memcached. Depending on which one you choose, you would use the corresponding client library in your application.
How you would implement it depends on what kind of caching you are trying to accomplish.
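For example, with the Redis flavor a common cache-aside pattern looks roughly like this (a sketch assuming the redis-py client; the endpoint is a placeholder and query_database is a hypothetical DB lookup, not a real API):

```python
# Cache-aside against the Redis flavor of ElastiCache: check the cache first,
# fall back to the database on a miss, then populate the cache with a TTL.
import json
import redis

# Placeholder endpoint; in practice this is your ElastiCache node's address.
cache = redis.Redis(host="my-cluster.xxxxxx.use1.cache.amazonaws.com", decode_responses=True)

def get_customer(customer_id):
    key = f"customer:{customer_id}"
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)            # cache hit
    row = query_database(customer_id)        # hypothetical DB lookup
    cache.setex(key, 300, json.dumps(row))   # cache miss: store result for 5 minutes
    return row
```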

AWS multi-zone disaster recovery and load balancing - best approach?

I’m using Amazon Web Services, and trying to set up a modest system for load balancing and disaster recovery. The application is PHP based, with Zend Framework 2 (ZF2) on the front end, a local memcached server and MySQL through RDS. All servers are running Amazon Linux.
I am trying to configure the elastic load balancer to use two servers in two different AWS “availability zones.” To seamlessly allow one server to shut down and another take over, we need shared PHP sessions. So I set up PHP database sessions with ZF2.
In general, I assume the likelihood of an outage of an AWS zone is considerably lower than the chance of a fatal problem in the individual servers or the application itself. So I am considering a different approach:
All the servers in the same availability zone
Separate AWS ElastiCache server (essentially memcached, cannot be used across zones)
PHP sessions stored in the cache (built-in support for memcached)
One emergency server in a different zone – in the rare case of a zone outage, we would change the DNS record to use the different server
Is this a good standard approach to DR and load balancing? I don't like the DR story in the case of a zone outage, but I haven't seen a zone go down often, and we can probably accept that level of risk if it simplifies the design. If the load balancer could weight the servers, I would put all the weight on one zone, with the backup server weighted much lower.
What would be the benefit of keeping all the PHP servers in the same AZ vs distributing them among multiple AZs? I can't think of any, except a very small (3-5ms) latency improvement. Since there's very little downside, why not spread the servers among multiple AZs?
Your ElastiCache Memcached node is still a single point of failure. If the AZ that the ElastiCache instance is running in has a problem, you will lose sessions. You could switch to ElastiCache with Redis (which supports master/slave replication) to get multi-AZ at your cache layer as well.
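To illustrate the sessions-in-cache idea (the question uses PHP, but the pattern is language-agnostic), here is a rough Python sketch against a Redis endpoint; the endpoint name and TTL are placeholders:

```python
# Sessions kept in Redis with a sliding TTL, so any web server behind the
# load balancer can serve any user. Endpoint and TTL are illustrative only.
import json
import redis

sessions = redis.Redis(host="sessions.xxxxxx.use1.cache.amazonaws.com",  # placeholder
                       decode_responses=True)
SESSION_TTL = 1800  # 30 minutes

def save_session(session_id, data):
    sessions.setex(f"session:{session_id}", SESSION_TTL, json.dumps(data))

def load_session(session_id):
    raw = sessions.get(f"session:{session_id}")
    if raw is None:
        return None
    sessions.expire(f"session:{session_id}", SESSION_TTL)  # slide the expiry on access
    return json.loads(raw)
```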

Web server and database server hosted on separate instances of Amazon EC2

I am planning to run a web application and expecting traffic of around 100 to 200 users.
Currently I have set up a single Small instance on Amazon. This instance hosts everything: the web server (Apache), the database server (MySQL), and the IMAP server (Dovecot). I am thinking of moving the database server out of this instance and creating a separate instance for it. Now my questions are:
Do I get latency in the communication between my web server and database server (both hosted on separate instances on Amazon)?
If yes, what is the standard way to overcome it? (Or do I need to set up a Virtual Private Cloud?)
If you want your architecture to scale you should separate your web server from your database server.
The small latency penalty you will pay (~1-2 ms, even between availability zones) buys you better performance overall, because you can scale each tier separately:
You can add small (even micro) instances to handle more web requests behind a load balancer, without having to duplicate an instance that also carries the database
You can add an auto-scaling group for your web server tier that scales automatically based on load (see the sketch after this list)
You can scale up your DB instance to have more memory, getting a better cache hit rate
You can add ElastiCache between your web server and your database
You can use Amazon RDS as a managed database service, which removes the need to run the database on an instance of your own (you pay only for the actual usage of the database in RDS)
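As an illustration of the auto-scaling point above, a minimal boto3 sketch (the group name, launch template, sizes, and target value are all placeholders, not recommendations):

```python
# Create an Auto Scaling group for the web tier and attach a target-tracking
# policy that keeps average CPU around 60%. Every name here is a placeholder.
import boto3

autoscaling = boto3.client("autoscaling", region_name="us-east-1")

autoscaling.create_auto_scaling_group(
    AutoScalingGroupName="web-tier",
    LaunchTemplate={"LaunchTemplateName": "web-server-template", "Version": "$Latest"},
    MinSize=2,
    MaxSize=6,
    AvailabilityZones=["us-east-1a", "us-east-1b"],
)

autoscaling.put_scaling_policy(
    AutoScalingGroupName="web-tier",
    PolicyName="cpu-target",
    PolicyType="TargetTrackingScaling",
    TargetTrackingConfiguration={
        "PredefinedMetricSpecification": {"PredefinedMetricType": "ASGAverageCPUUtilization"},
        "TargetValue": 60.0,
    },
)
```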
Another benefit is better security for your database. If the database is on a separate instance, you can prevent access to it from the internet: use a security group that allows only SQL connections from your web server and no HTTP connections from the internet (see the sketch below).
This configuration can run in a regular EC2 environment, without using a VPC. You can certainly add a VPC for an even more controlled environment, without additional cost or much added complexity.
In short, for scalability and high availability you should separate your tiers (web and DB). You will probably find yourself saving on cost as well.
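A rough boto3 sketch of that security group rule (both group IDs are placeholders; this assumes MySQL on port 3306):

```python
# Open MySQL (3306) on the database security group only to members of the
# web tier's security group; nothing is exposed to the internet.
# Both group IDs are placeholders.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

ec2.authorize_security_group_ingress(
    GroupId="sg-0dbtier0000000",  # database security group (placeholder)
    IpPermissions=[{
        "IpProtocol": "tcp",
        "FromPort": 3306,
        "ToPort": 3306,
        "UserIdGroupPairs": [{"GroupId": "sg-0webtier000000"}],  # web tier SG (placeholder)
    }],
)
```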
Of course there will be latency when communicating between separate machines. If they are both in the same availability zone it will be extremely low, typically what you'd expect for two servers on the same LAN.
If they are in different availability zones in the same region, expect a latency on the order of 2-3ms (per information provided at the 2012 AWS re:Invent conference). That's still quite low.
Using a VPC will not affect latency. That does not give you different physical connections between instances, just virtual isolation.
Finally, consider using Amazon's RDS (Relational Database Service) instead of a dedicated EC2 instance for your MySQL database. The cost is about the same, and Amazon takes care of the housekeeping.
Do I get latency in the communication between my web server and database server (both hosted on separate instances on Amazon)?
Yes, but it's rather insignificant compared to the benefits gained by separating the roles.
If yes, what is the standard way to overcome it? (Or do I need to set up a Virtual Private Cloud?)
A VPC increases security and ease of management of your resources; it does not affect performance. A latency of a millisecond or two isn't normally problematic for a SQL database. Writes are transactional, so data isn't visible to other requests until the write is fully completed and committed. I/O throughput and availability are much more of a concern, which is why separating the database from the application is important.
I'd highly recommend that you take a look at RDS, which is AWS's managed version of MySQL, Oracle, or MS SQL Server. It lets you easily set up and manage your database, including cross-availability-zone replication and automated backups. I also wrote a blog post yesterday that's fairly relevant to your question.