Network latency between GCP server and Linux data centres - google-cloud-platform

What are the best practices to avoid network latency between a GCP server and a Unix server? My client application, which runs on Linux, accesses a GCP endpoint but is experiencing network latency. How can I avoid it?

Do you suspect that part of the latency is caused by something other than the distance between your server and GCP? If not, then all you can really do is (1) place your server closer to your GCP region and (2) perhaps batch or parallelize your GCP requests if you have many of them.
So I suggest you determine the distance between the two sites and compare it to the round-trip time of your requests. Light in fibre covers roughly 200 km per millisecond, so the distance alone implies a theoretical floor for the round trip; if your measured round-trip time is significantly larger than that floor, then you will indeed have to analyze the structure of your requests.
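As a rough check, here is a minimal sketch of that comparison, assuming Python with the requests library; the endpoint URL and the distance figure are placeholders, not values from the question:

# Compare measured round-trip time against the distance-based floor.
import statistics
import time
import requests  # third-party: pip install requests

ENDPOINT = "https://your-gcp-endpoint.example.com/healthz"  # hypothetical URL
DISTANCE_KM = 2000  # straight-line distance between client and GCP region
floor_ms = DISTANCE_KM / 100  # light in fibre: ~200 km/ms, so round trip ~ distance/100 ms

session = requests.Session()  # reuse the TCP/TLS connection after the first request
samples = []
for _ in range(20):
    start = time.perf_counter()
    session.get(ENDPOINT, timeout=5)
    samples.append((time.perf_counter() - start) * 1000)

print(f"theoretical floor ~{floor_ms:.0f} ms")
print(f"median {statistics.median(samples):.1f} ms, p95 {statistics.quantiles(samples, n=20)[18]:.1f} ms")

If the median sits close to the floor, distance is the dominant factor; if it is far above it, look at the request structure (TLS setup, serial calls, payload size).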

Latency is not related to the OS you are using. Network latency is a measure of the time it takes for information to travel across a network. Of all the factors that affect this delay, the one you can manage in the cloud is the distance from the source to the destination. You can find other latency factors in this previous answer.
If you are looking to reduce latency, you could use a Cloud Load Balancer. With the Google Cloud HTTP(S) load balancer, requests are always routed to the instance group closest to the user. Together with the load balancer you can also use Cloud CDN, which reduces latency by serving assets directly from Google's network edge.
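As an illustration, here is a hedged sketch of enabling Cloud CDN on an existing backend service behind the HTTP(S) load balancer, assuming the google-cloud-compute Python client; the project and backend service names are placeholders:

# Enable Cloud CDN on an existing backend service (placeholder names).
from google.cloud import compute_v1

client = compute_v1.BackendServicesClient()
patch = compute_v1.BackendService(enable_cdn=True)  # only the CDN flag is patched
operation = client.patch(
    project="my-project",                  # hypothetical project ID
    backend_service="my-backend-service",  # hypothetical backend service name
    backend_service_resource=patch,
)
operation.result()  # block until the patch operation completes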

Related

AWS ALB - single for all services?

We have many internet-facing services. What are the considerations when deciding between one ALB per service and a single ALB for all of them, using listener rules that point to target groups?
Each service has its own cluster/target group, with different functionality and a different URL.
Can a spike in one service impact the other services?
Is it going to be a single point of failure?
Cost perspective?
Observability, monitoring, logs?
Ease of management?
Personally I would normally use a single ALB with different listener rules for different services.
For example, with service1.domain.com and service2.domain.com I would have two host-based rules on the same ALB listener that route to the different services.
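For illustration, a minimal boto3 sketch of such host-header rules on a shared listener; the ARNs and hostnames are placeholders, not actual resources from the question:

# Route two hostnames on one shared ALB listener to different target groups.
import boto3

elbv2 = boto3.client("elbv2")
LISTENER_ARN = "arn:aws:elasticloadbalancing:region:account:listener/app/shared-alb/..."  # placeholder

rules = [
    ("service1.domain.com", "arn:aws:elasticloadbalancing:region:account:targetgroup/service1/..."),
    ("service2.domain.com", "arn:aws:elasticloadbalancing:region:account:targetgroup/service2/..."),
]
for priority, (hostname, target_group_arn) in enumerate(rules, start=10):
    elbv2.create_rule(
        ListenerArn=LISTENER_ARN,
        Priority=priority,  # must be unique within the listener
        Conditions=[{"Field": "host-header", "Values": [hostname]}],
        Actions=[{"Type": "forward", "TargetGroupArn": target_group_arn}],
    )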
In my experience the ALB is highly available and scales very nicely without any issues. I've never had a service become unreachable due to scaling issues. ALBs scale based on "Load Balancer Capacity Units" (LCUs). As your load balancer requires more capacity, AWS automatically assigns more LCUs, which allows it to handle more traffic.
Source: Own experience working on an international system consisting of monoliths and microservices which have a large degree of scaling between timezones.
A spike in service A will not impact service B, but identifying which service is having a bad time can be a bit of a pain.
From a monitoring perspective it is a bit harder, because it is not easy to quickly identify which service/target is suffering.
For management, as soon as different teams need to create/manage their own targets, it can create some conflicts.
I wouldn't encourage that monolithic architecture.
From a cost perspective you can use one load balancer with multiple forwarding rules, but using a single central load balancer for an entire application ecosystem essentially reproduces the standard monolith architecture while enormously increasing the number of instances served by one load balancer. In addition to being a single point of failure for the entire system should it go down, this single load balancer can very quickly become a major bottleneck, since all traffic to every microservice has to pass through it.
Using a separate load balancer per microservice type adds some overhead, but it confines a failure to a single microservice: in this model, incoming traffic for each type of microservice is sent to a different load balancer.

How to find AWS EC2 with lowest latency to another server

I have a client server located in AWS and I want to reduce latency between his machine and my EC2 instance. I rented two identical servers in one availability zone and started sending requests to the client's API. It turned out that these servers have different latencies: the 95th percentiles differed by about 5 milliseconds (roughly 30% of the mean latency). My aim is to reduce latency.
I think I could rent more servers and repeat this experiment (sketched below), but that will be the next step of my investigation. The first step for me is to understand why servers in the same zone show such a big difference in API response latency, and which metrics could help explain it.
The second option for reducing latency is to rent a bare-metal server instead of EC2, but that seems too expensive, and I am afraid it could make things even worse if the server ends up further from the client server.
So, tell me please:
Do you have any advice on how to reduce latency?
How can I rent the server that is closest to my client within the same AWS zone?
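For reference, a minimal sketch of the measurement described above, to be run on each candidate EC2 instance; it times raw TCP connects so HTTP processing does not skew the comparison. The host, port and sample count are placeholders:

# Measure TCP connect round-trip times to the client's API and report p95.
import socket
import statistics
import time

API_HOST, API_PORT = "client-api.example.com", 443  # hypothetical endpoint

def connect_times(host, port, samples=100):
    times = []
    for _ in range(samples):
        start = time.perf_counter()
        with socket.create_connection((host, port), timeout=3):
            pass  # connection is closed immediately; only the handshake is timed
        times.append((time.perf_counter() - start) * 1000)
        time.sleep(0.05)  # avoid hammering the endpoint
    return times

rtts = connect_times(API_HOST, API_PORT)
print(f"mean {statistics.mean(rtts):.2f} ms, p95 {statistics.quantiles(rtts, n=20)[18]:.2f} ms")

Running the same script on each candidate instance and comparing the p95 values gives a like-for-like view of which placement is closest in network terms.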

API Gateway/NLB/ECS Latency

I have a number of services deployed in ECS. They register with a Network Load Balancer (via a target group). The NLB is private, and is accessed via API Gateway + a VPC link.
Most of the time, requests to my services take ~4-5 seconds, but occasionally < 100ms. The latter should be the standard; the actual requests are served by my node instances in ~10ms or less. I'm starting to dig into this, but was wondering if there was a common bottleneck in setups similar to what I'm currently using.
Any insight would be greatly appreciated!
The answer to this was to enable Cross-Zone Load Balancing on my load balancers. This isn't immediately obvious and took two AWS support sessions to dig it up as the root cause.
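For reference, the fix boils down to a single attribute change; here is a hedged boto3 sketch where the load balancer ARN is a placeholder:

# Enable cross-zone load balancing on an existing Network Load Balancer.
import boto3

elbv2 = boto3.client("elbv2")
elbv2.modify_load_balancer_attributes(
    LoadBalancerArn="arn:aws:elasticloadbalancing:region:account:loadbalancer/net/my-nlb/...",  # placeholder
    Attributes=[{"Key": "load_balancing.cross_zone.enabled", "Value": "true"}],
)

Without cross-zone load balancing, an NLB node only forwards to targets registered in its own Availability Zone, which can leave some zones without a healthy target to serve their share of the traffic.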

Scalable server hosting

I have a simple server now (some Xeon CPU hosted somewhere) running Apache/PHP/MySQL (no Docker, but it's a possibility). I'm expecting some heavy traffic and I need my server to handle it.
Currently the server can handle about 100 concurrent users; I need it to handle possibly a couple of thousand.
What would be the easiest and fastest way to move my app to some scalable hosting?
I have no experience with AWS or anything like that.
I was reading about AWS and similar options, but I'm mostly confused and not sure what I should choose.
The basic choice is:
Scale vertically by using a bigger computer. However, you will eventually hit a limit and you will have a single point of failure (one server!), or
Scale horizontally by adding more servers and spreading the traffic across the servers. This has the added advantage of handling failure because, if one server fails, the others can continue serving traffic.
A benefit of doing horizontal scaling in the cloud is the ability to add/remove servers based on workload. When things are busy, add more servers. When things are quiet, remove servers. This also allows you to lower costs when things are quiet (which is not possible on-premises when you own your own equipment).
The architecture involves putting multiple servers behind a Load Balancer:
Traffic comes into a Load Balancer
The Load Balancer sends the request to a server (often based upon some measure of how "busy" each server is)
The server processes the request and sends a response back to the Load Balancer
The Load Balancer sends the response to the original requester
AWS has several Load Balancers available, which vary by need. If you are simply sending traffic to a single application that is installed on all servers, a Network Load Balancer should be sufficient. For situations where different parts of the application are on different servers (eg mobile interface vs web interface), you could use an Application Load Balancer.
AWS also assists with horizontal scaling by providing the Amazon EC2 Auto Scaling service. This allows you to specify details of the servers to launch (disk image, instance type, network settings) and Auto Scaling can then automatically launch new servers when required and terminate ones that aren't required. (Note that they launch and terminate, not start and stop.)
You can further define scaling policies that tell Auto Scaling when to launch/terminate instances by measuring metrics such as CPU Utilization. This way, the number of servers can approximately match the volume of traffic.
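As an illustration, here is a hedged boto3 sketch of such a scaling policy, keeping average CPU near 50%; the Auto Scaling group and policy names are placeholders:

# Target-tracking scaling policy: keep the group's average CPU around 50%.
import boto3

autoscaling = boto3.client("autoscaling")
autoscaling.put_scaling_policy(
    AutoScalingGroupName="my-web-asg",   # hypothetical Auto Scaling group
    PolicyName="cpu-target-50",
    PolicyType="TargetTrackingScaling",
    TargetTrackingConfiguration={
        "PredefinedMetricSpecification": {"PredefinedMetricType": "ASGAverageCPUUtilization"},
        "TargetValue": 50.0,
    },
)

Auto Scaling then launches instances while the group's average CPU stays above the target and terminates them when it falls below.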
It should be mentioned that if you have a database, it should be stored separately from the application servers so that it does not get terminated. You could use the Amazon Relational Database Service (RDS) to run a database for you, or you could run one on a separate Amazon EC2 instance.
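For example, a hedged boto3 sketch of launching a small managed MySQL instance with RDS; the identifier, credentials and sizes are placeholders:

# Launch a small managed MySQL instance so the database lives outside the
# auto-scaled application servers. All values below are placeholders.
import boto3

rds = boto3.client("rds")
rds.create_db_instance(
    DBInstanceIdentifier="my-app-db",
    Engine="mysql",
    DBInstanceClass="db.t3.micro",
    AllocatedStorage=20,                    # GiB
    MasterUsername="admin",
    MasterUserPassword="change-me-please",  # use a secrets manager in practice
)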
If you want to find out more about any of the above technologies, there are plenty of talks on YouTube or blog posts that can explain and demonstrate their use.

AWS multi-zone disaster recovery and load balancing - best approach?

I’m using Amazon Web Services, and trying to set up a modest system for load balancing and disaster recovery. The application is PHP based, with Zend Framework 2 (ZF2) on the front end, a local memcached server and MySQL through RDS. All servers are running Amazon Linux.
I am trying to configure the elastic load balancer to use two servers in two different AWS “availability zones.” To seamlessly allow one server to shut down and another take over, we need shared PHP sessions. So I set up PHP database sessions with ZF2.
In general, I assume the likelihood of an outage of an AWS zone is considerably lower than chance of a fatal problem in the individual servers or the application itself. So I am considering a different approach:
All the servers in the same availability zone
Separate AWS ElastiCache server (essentially memcached, cannot be used across zones)
PHP sessions stored in the cache (built-in support for memcached)
One emergency server in a different zone – in the rare case of a zone outage, we would change the DNS record to use the different server
Is this a good standard approach to DR and load balancing? I don’t like the DR solution in the case of a zone outage, but I haven’t seen a zone go down often, and we can probably handle that level of risk if it simplifies the design. If the load balancer could weight the servers, I would put all the weight on one zone, with the backup server weighted much lower.
What would be the benefit of keeping all the PHP servers in the same AZ vs distributing them among multiple AZs? I can't think of any, except a very small (3-5ms) latency improvement. Since there's very little downside, why not spread the servers among multiple AZs?
Your ElastiCache memcached cluster is still a single point of failure. If the AZ that the ElastiCache instance is running in has a problem, you will lose sessions. You could switch to ElastiCache for Redis (which supports primary/replica replication) to achieve multi-AZ for your cache layer as well.
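A hedged boto3 sketch of that multi-AZ Redis setup; the replication group id and node type are placeholders:

# Create a Redis replication group with one replica in another AZ and
# automatic failover, so the session cache is no longer tied to a single AZ.
import boto3

elasticache = boto3.client("elasticache")
elasticache.create_replication_group(
    ReplicationGroupId="php-sessions",                     # placeholder
    ReplicationGroupDescription="Multi-AZ session store",
    Engine="redis",
    CacheNodeType="cache.t3.micro",
    NumCacheClusters=2,               # one primary + one replica
    AutomaticFailoverEnabled=True,
    MultiAZEnabled=True,
)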