I have created a Site-to-Site VPN connection between a VPC on Google Cloud Platform and a VPC on AWS, both in the North Virginia region. The problem is that I am getting very high ping and low bandwidth when communicating between the instances. Can anyone tell me the reason for this?
[image showing the ping data]
The ping is very high considering the two regions are geographically very close. Please help.
There are multiple possible reasons:
1) Verify GCP network performance with gcping.
2) Verify the TCP window size and RTT, which bound the achievable bandwidth.
3) Verify throughput with iperf or tcpdump (see the sketch below the link).
https://cloud.google.com/community/tutorials/network-throughput
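If iperf isn't installed on the instances, a minimal raw-socket throughput check in Python can stand in for it. This is only a sketch; the port, transfer size, and the need to open that port in both firewalls/security groups are assumptions, not part of the tutorial linked above.

```python
# Minimal TCP throughput check between two instances (rough stand-in for iperf).
# Port and transfer size are placeholders; open the port in both firewalls.
import socket
import sys
import time

PORT = 5001                   # hypothetical port
CHUNK = 64 * 1024             # 64 KiB per send
TOTAL = 100 * 1024 * 1024     # send 100 MiB in total

def server():
    with socket.create_server(("0.0.0.0", PORT)) as srv:
        conn, _ = srv.accept()
        received, start = 0, time.time()
        while True:
            data = conn.recv(CHUNK)
            if not data:
                break
            received += len(data)
        elapsed = time.time() - start
        print(f"received {received / 1e6:.1f} MB in {elapsed:.2f} s "
              f"= {received * 8 / elapsed / 1e6:.1f} Mbit/s")

def client(host):
    payload = b"\0" * CHUNK
    sent, start = 0, time.time()
    with socket.create_connection((host, PORT)) as conn:
        while sent < TOTAL:
            conn.sendall(payload)
            sent += len(payload)
    elapsed = time.time() - start
    print(f"sent {sent / 1e6:.1f} MB in {elapsed:.2f} s "
          f"= {sent * 8 / elapsed / 1e6:.1f} Mbit/s")

if __name__ == "__main__":
    # Run `python tput.py server` on one instance and
    # `python tput.py client <server-ip>` on the other.
    server() if sys.argv[1] == "server" else client(sys.argv[2])
```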
Be aware that any VPN will traverse the internet, so even though the regions are relatively close to each other, there will be multiple hops before the instances are connected.
Remember that traffic from the instance has to route out of AWS's network, then across whatever hops the internet takes to GCP, and finally to the target instance, and then back again to return the response.
In addition, there is some variation in performance, because the link is not dedicated.
If you want dedicated performance without traversing the internet, you would need to look at AWS Direct Connect. However, the cost might be limiting for your project.
One of the many limits to TCP throughput is:
Throughput <= EffectiveWindowSize / RoundTripTime
If your goal is indeed higher throughput, then you can consider tweaking the TCP window size limits. The default TCP window size under Linux is ~3 MB. However, there is more to EffectiveWindowSize than that: there is also the congestion window, which depends on factors such as packet loss and the congestion control heuristic in use (e.g. CUBIC vs. BBR).
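A quick back-of-the-envelope illustration of that formula (the 3 MB window is the figure quoted above; the 30 ms RTT is just an assumed VPN round trip, not a measurement):

```python
# Back-of-the-envelope TCP throughput ceiling: Throughput <= window / RTT.
window_bytes = 3 * 1024 * 1024    # ~3 MB window, as quoted above
rtt_seconds = 0.030               # assumed 30 ms round trip over the VPN

ceiling = window_bytes * 8 / rtt_seconds
print(f"{ceiling / 1e6:.0f} Mbit/s")                    # ~839 Mbit/s

# Double the RTT (e.g. a longer detour over the public internet)
# and the ceiling halves:
print(f"{window_bytes * 8 / 0.060 / 1e6:.0f} Mbit/s")   # ~419 Mbit/s
```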
As far as sanity-checking the ping RTTs you are seeing, you can compare them with the ping times between an instance in AWS us-east-1 and one in GCP us-east4 when you are not using a VPN.
I need design advice. I need an AWS load balancer that distributes per packet (not per flow).
It's for unidirectional UDP-based streaming.
That means each packet received by the load balancer should be sent to a different target, so that all targets receive the same number of packets.
I don't see any ready-made solution and am considering using EC2 with iptables and a "-m statistic --mode random" PREROUTING chain. Any comments on the performance of that module at 1 up to 10 Gbit/s scale? (How strong an EC2 instance would I need?)
Any other advice / hints on how to achieve this?
Thanks,
AWS Network Load Balancer can be configured to send to "random" targets in the target group, but this behaviour is not documented, only stated (to be exact, it is not defined how the distribution is done). It is general ELB behaviour that targets are chosen by some hidden algorithm. Maybe it's worth an experiment? Make sure that stickiness is turned off, as enabling it is the exact opposite of your use case.
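If you run that experiment, disabling stickiness on the target group can be scripted. A hedged boto3 sketch (the target group ARN and region are placeholders, not from this answer):

```python
# Sketch: make sure stickiness is disabled on the NLB target group.
import boto3

elbv2 = boto3.client("elbv2", region_name="us-east-1")  # region is an example

elbv2.modify_target_group_attributes(
    # Placeholder ARN; substitute your UDP target group.
    TargetGroupArn="arn:aws:elasticloadbalancing:us-east-1:123456789012:"
                   "targetgroup/udp-targets/0123456789abcdef",
    Attributes=[{"Key": "stickiness.enabled", "Value": "false"}],
)
```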
I couldn't find a hard definition of how many Gbit/s an NLB will support. But there is the concept of LCUs (Load Balancer Capacity Units), which also determines billing and needs to be taken into account. LCUs are exposed in CloudWatch.
Custom EC2 instances will work but will also cost a lot, since CPU requirements scale (roughly) with network throughput. Here is a general list of EC2 instance types that you can filter by your network requirements and use to compare pricing.
In general, you should prefer instances with Enhanced Networking and the Nitro system, as the latter has dedicated hardware for fast networking.
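One way to shortlist such instance types programmatically is the EC2 DescribeInstanceTypes API; a hedged boto3 sketch (the region and the specific filter values are assumptions, adjust to your requirements):

```python
# Sketch: list instance types that advertise ENA support and fast networking.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")  # region is an example
paginator = ec2.get_paginator("describe_instance_types")

pages = paginator.paginate(
    Filters=[
        {"Name": "network-info.ena-support", "Values": ["required", "supported"]},
        {"Name": "network-info.network-performance",
         "Values": ["10 Gigabit", "25 Gigabit", "100 Gigabit"]},
    ]
)
for page in pages:
    for itype in page["InstanceTypes"]:
        print(itype["InstanceType"], itype["NetworkInfo"]["NetworkPerformance"])
```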
We've configured some firewall rules to block some bad IPs. This was done in the VPC Network -> Firewall area, NOT on the server via iptables or anything like that.
Everything is fine until we get floods of traffic from these bad IPs. I can see in the firewall log for this rule that it was blocking them, but there appears to be either a connection limit or a bandwidth limit. For 40 minutes I had a solid wall of hit counts of 24,000 every minute; no up or down, just a constant 24,000.
The server was getting no traffic and resource usage was way down. This was a problem because valid traffic was getting bottlenecked somewhere.
The only thing I can see in the docs is a limit of 130,000 maximum stateful connections.
https://cloud.google.com/vpc/docs/firewalls#specifications
Machine type is n1-standard-4
During this attack when I looked at the quotas page, nothing was maxed out.
Is anyone able to shed some light on this?
The answer is to resize the instance and add more cores.
Don't use instances with shared cores.
I went for an n2 with 8 cores and this has now resolved itself.
How does AWS Direct Connect work?
From AWS docs:
AWS Direct Connect establishes a dedicated network connection between your on-premises network and AWS ... bypassing your internet service provider.
Please explain what it actually means in a practical sense. And why would it be any better?
NOTE: I know there are AWS docs (the worst docs I've ever seen) and some other articles and answers on SO, but none of those explain what it actually means in practice, or they miss some notes that are important for somebody who is only used to the public internet and can't imagine how it is possible to "bypass the public internet". So I decided to create this post from my collective knowledge and also provide a real example from our company's case. I hope it will be useful for somebody.
So, from AWS docs:
AWS Direct Connect establishes a dedicated network connection between your on-premises network and AWS ... bypassing your internet service provider.
But to understand what it actually means, let's refresh some knowledge.
Public Internet
An internet cable goes from your PC to your ISP's (Internet Service Provider) switch, usually located somewhere in your apartment building.
And then... it gets connected to another switch, and it goes on and on like that, travelling through other cables and switches until it reaches the target PC you wanted to reach. Each switch knows where to send the signal next (how it knows is a different topic).
So the signal first travels to some main switch of your ISP (where all the pricing/monitoring etc. is done). Your ISP is then connected to another, bigger ISP (they have some sort of partnership where your ISP pays that other ISP for using its cables, just like you do with your own ISP). In total, lots of physical cables literally lie around the world, going undersea/underground/over-the-air, allowing the whole world to be connected.
Problem
There are millions of internet users. You can imagine how many switches and cables are needed to support such a big network, and how much traffic (hello, TikTok) is traveling over it. Because of that:
Your signal doesn't travel via the most optimal route through the switches when it needs to go from your PC to another target machine (some AWS machine in our case), especially when you're on different continents.
Heavy traffic also makes your ping fluctuate, depending on the load on each individual switch.
There are all sorts of switches, and we can't really trust them. Moreover, ISPs aren't required to be compliant with the PCI security standard, or any other. If you want a secure connection, you have to use a VPN (IPsec, OSI layer 3), but it costs you in terms of performance and ping.
AWS Direct Connect
AWS own internet
AWS came along and said: "Let me create my own internet around the world. I'll literally lay down my own physical cables and connect them only to selected (partner) data centers, so you can use those instead of the public internet." AWS probably still leases some cables (especially undersea ones), but most of the cables and switches are its own. Benefits:
It's a much smaller network with fewer, more reliable switches, and the cables go almost straight to AWS. Therefore it's faster.
AWS implements MACsec (OSI layer 2) security at the hardware level, so no VPN is required (though you could still use one). Again, it's faster.
How to connect
Obviously, AWS can't connect to every PC in the world the way public ISPs do; otherwise it would just be the same public internet network, so to speak, and wouldn't make sense.
AWS has partnerships with selected data centers around the world, to which it has run physical connections and in which it has placed its own switches (an "AWS Direct Connect cage"). So if you put your server in such a data center (or at least have your server connect to such a data center from another nearby location via the public internet), you can quickly enter the AWS network, where your signal travels much faster. In that case you don't even need a public ISP.
You do this to:
Reduce latency and improve stability to your AWS environment.
Even reduce latency between non-AWS endpoints: both endpoints use the public internet only to connect to the nearest AWS cage, and the cage-to-cage traffic then goes through AWS's internal network. And voilà!
Results
In our company's case, we managed to decrease the latency roughly 5x (i.e. 0.5 seconds vs 2.5 seconds) for non-AWS connectivity.
I'm currently in Sydney and I have the following scenario:
1 RDS in N. Virginia.
1 EC2 in Sydney.
1 EC2 in N. Virginia.
I need this for redundancy, and this is the simplified scenario.
When my app on the EC2 in Sydney connects to the RDS in N. Virginia, it takes almost 2.5 seconds to give me the result. We could think: OK, that's just the latency.
BUT, when I send the request to the EC2 in N. Virginia, I get the result in less than 500 ms.
Why is the connection so slow when you access RDS from outside its region?
I mean: I experience this slow connection when I'm running the application on my own computer too. But when the application is in the same region as the RDS, it works quicker than on my own computer.
Most likely your request to RDS requires multiple round trips to complete, i.e. first your EC2 instance requests something from RDS, then something else based on the first request, and so on. Without seeing your database code, it's hard to say exactly what the cause might be.
You say that when you talk to the remote EC2 instance instead, you get the response in less than 500 ms. That suggests that setting up a TCP connection and sending a single request with a reply takes about 500 ms. Based on that, my guess is that your database connection requires at least five times that much back-and-forth traffic.
There is no additional penalty with RDS for using it out of region, but most database protocols are not optimized for high-latency conditions. You might be much better off setting up a read replica in Sydney.
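To see how quickly sequential round trips add up over a Sydney-to-Virginia link, a rough back-of-the-envelope sketch (the RTT values and round-trip counts are assumptions, not measurements from the question):

```python
# Rough model: total time ≈ round trips × RTT (ignoring server-side work).
rtt_same_region = 0.001      # ~1 ms inside a region (assumption)
rtt_cross_region = 0.200     # ~200 ms Sydney <-> N. Virginia (assumption)

for round_trips in (1, 5, 12):
    local = round_trips * rtt_same_region
    remote = round_trips * rtt_cross_region
    print(f"{round_trips:>2} round trips: same region ~{local * 1000:.0f} ms, "
          f"cross-region ~{remote * 1000:.0f} ms")

# 12 round trips (TCP/TLS setup, auth, plus a handful of queries) already
# lands in the ~2.4 s range described in the question.
```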
If you are connecting to RDS over the public-facing network, that may be why it is slow. AWS has launched cross-region VPC peering; peer the VPCs across the regions (make sure there are no IP range conflicts) and try to connect over the private connection.
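A hedged boto3 sketch of requesting such a cross-region peering connection (the VPC IDs are placeholders; the Sydney and N. Virginia regions match the scenario above):

```python
# Sketch: peer a Sydney VPC with a N. Virginia VPC using cross-region peering.
import boto3

ec2_sydney = boto3.client("ec2", region_name="ap-southeast-2")

# Request the peering from the Sydney side; VPC IDs are placeholders.
peering = ec2_sydney.create_vpc_peering_connection(
    VpcId="vpc-0aaaaaaaaaaaaaaaa",
    PeerVpcId="vpc-0bbbbbbbbbbbbbbbb",
    PeerRegion="us-east-1",
)
pcx_id = peering["VpcPeeringConnection"]["VpcPeeringConnectionId"]

# Accept it from the N. Virginia side (the request may take a moment to
# reach the pending-acceptance state before this succeeds).
ec2_virginia = boto3.client("ec2", region_name="us-east-1")
ec2_virginia.accept_vpc_peering_connection(VpcPeeringConnectionId=pcx_id)

# You still need route table entries for the peer CIDR in both VPCs and
# security group rules that allow the database traffic; otherwise nothing flows.
```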
I want to run a DPDK experiment using the Amazon EC2 service, but there are a great number of services in AWS and I don't know which one to choose.
My experiment needs two servers connected together with 10 Gbps network adapters that support DPDK. I run pktgen-dpdk on one server to send packets towards the other server, and another DPDK application runs on the second server to process those packets.
I think I can rent servers such as c4.8xlarge or c4.4xlarge, but I don't know how to set up the local network between them. The local network should have low latency.
Any suggestions will be appreciated! Thank you!
You're looking for a Virtual Private Cloud (VPC). An AWS EC2 "instance" like your c4.8xlarge is just a machine. The VPC and several other components let you set up a broader network: routing, security groups (basically a firewall) and other networking capabilities, including, in your case, a gateway that allows your DPDK system to look out onto the internet to fetch dependencies.
The in-network latency is extremely low, < 1 ms in our experience. I think most current EC2 instances support 10 Gbps networking and other speedy network capabilities.
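A hedged boto3 sketch of launching the two test instances into one subnet of a VPC (the AMI, key pair, and subnet IDs are placeholders; the cluster placement group is my addition for low latency, not something this answer prescribes):

```python
# Sketch: launch the two DPDK test servers into the same subnet of a VPC.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")  # region is an example

# A cluster placement group keeps the instances physically close together
# for lower latency; this is an extra step on top of the answer above.
ec2.create_placement_group(GroupName="dpdk-test", Strategy="cluster")

ec2.run_instances(
    ImageId="ami-0123456789abcdef0",      # placeholder AMI with your DPDK setup
    InstanceType="c4.8xlarge",
    MinCount=2,
    MaxCount=2,
    KeyName="my-key",                     # placeholder key pair
    SubnetId="subnet-0123456789abcdef0",  # same subnet -> same VPC network
    Placement={"GroupName": "dpdk-test"},
)
```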