For some reason, download traffic from a virtual machine on GCP (Google Cloud Platform) running Debian 9 is limited to about 50 KB/s. Upload seems to be fine, in line with my local uplink.
It is the same with scp and HTTPS downloads. Any suggestions on what might be wrong or where to look?
Machine type: n1-standard-1 (1 vCPU, 3.75 GB memory)
CPU platform: Intel Skylake
Zone: europe-west4-a
Network interfaces: Premium tier
Thanks,
Mihaelus
Simple test:
wget https://hrcki.primasystems.si/Nova/assets/download.test.html
Output:
--2018-10-18 15:21:00--  https://hrcki.primasystems.si/Nova/assets/download.test.html
Resolving hrcki.primasystems.si (hrcki.primasystems.si)... 35.204.252.248
Connecting to hrcki.primasystems.si (hrcki.primasystems.si)|35.204.252.248|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 541422592 (516M) [text/html]
Saving to: `download.test.html.1'
 0% [                              ] 1,073,152   48.7K/s   eta 2h 59m
It is always good to minimize variables when trying to diagnose a problem. So while it is unlikely that the use of HTTPS is why things are this slow, you might consider using netperf or iperf3 to measure TCP bulk-transfer performance between your VM in GCP and your local system. You can do that either "by hand" or via PerfKit Benchmarker: https://cloud.google.com/blog/products/networking/perfkit-benchmarker-for-evaluating-cloud-network-performance
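A minimal "by hand" sketch with iperf3, assuming iperf3 is installed on both ends and a GCP firewall rule allows TCP port 5201 to the VM (the firewall-rule name below is just a placeholder):

# one-time: open the iperf3 port to the VM (placeholder rule name)
gcloud compute firewall-rules create allow-iperf3 --allow=tcp:5201

# on the GCP VM
iperf3 -s

# on your local machine; -R makes the server (the VM) send,
# which matches the slow VM-to-local download direction
iperf3 -c <vm-external-ip> -R -t 30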
It can be helpful to have packet traces - from both ends when possible - to look at. You want the packet traces to be started before the test - it is important to see the packets used to establish the TCP connection(s). They do not need to be "full packet" traces, and often you don't want them to be; capturing just the first 96 bytes of each packet is sufficient for this sort of investigation.
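For example, a capture on the GCP side might look like this with tcpdump (the interface choice, file name and remote IP are placeholders; run an equivalent capture on your local machine too):

# start before the transfer; -s 96 keeps only the first 96 bytes of each packet
sudo tcpdump -i any -s 96 -w gcp-side.pcap host <local-public-ip>
# ... run the wget/scp test, stop the capture with Ctrl-C, then inspect:
tcpdump -r gcp-side.pcap | head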
You might also consider taking snapshots of the network statistics offered by the OSes running in your GCP VM and local system. For example, if running *nix, take a snapshot of "netstat -s" before and after the test, and perhaps a traceroute from each end towards the other.
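Something along these lines (file names are arbitrary):

netstat -s > netstat-before.txt
# ... run the transfer test ...
netstat -s > netstat-after.txt
diff netstat-before.txt netstat-after.txt   # look for retransmission/drop counters that grew
traceroute <other-endpoint-ip>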
Network statistics and packet traces, along with as many details about the two endpoints as possible, are among the sorts of things support organizations are likely to request when looking to help resolve an issue of this sort.
Related
While load/performance testing an API on DNS in AWS using JMeter, we observed relatively high response times (~230 ms) on the AWS Windows machine. When the same test is performed on my local machine, the response times are around 110 ms. The throughput (number of samples served) varies widely because of this difference in response time.
The tests were run for 1 hour each, with no delay, three times on each machine. The only difference I see is that my RAM is 16 GB while the AWS machine has 4 GB. Will this really make such a big difference, or is there something I am missing?
Can anyone share their thoughts?
I can think of two possible reasons:
Your AWS machine is located in a region that is geographically farther from the endpoint than your local machine is.
It might really be the case that JMeter lacks resources on the AWS instance and hence cannot send requests fast enough, so make sure to:
Monitor the resources available to JMeter using Windows PerfMon, Amazon CloudWatch, or the JMeter PerfMon Plugin, as JMeter can be quite resource-intensive and needs sufficient headroom to operate.
Follow the JMeter Best Practices (a minimal non-GUI launch is sketched after this list).
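As one hedged example of those practices on a 4 GB machine, run the test plan in non-GUI mode and give the JMeter JVM an explicit, modest heap (file names are placeholders; on Windows, set JVM_ARGS as an environment variable and call jmeter.bat):

# non-GUI run, results written to a .jtl file, HTML report generated afterwards
JVM_ARGS="-Xms1g -Xmx2g" jmeter -n -t test-plan.jmx -l results.jtl -e -o report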
I'm trying to host a Spigot Minecraft 1.12.2 server using Ubuntu. The server has been set up properly and is working, but the ping isn't great. I am playing from India and the server VM instance region has been set to Germany (Frankfurt). I should be getting anywhere between 130-200 ms latency, but it's always above 300 ms, or even 1000 ms at times. I ran tracert from the Windows CMD terminal and the packets seem to go to the USA first and then to Germany. I asked several of my friends to ping the server and they all get the same result. How can I fix this? Is there any way to route packets straight to Germany instead of going through the US first?
I made a new instance in the Mumbai region, India, which is where I live. I get a ping of 3 ms on the server select menu, but upon joining it jumps to 200 ms.
I expect around 130-160 ms ping, which is what I get on other servers in that region. Other players who live near Germany are getting high pings. I can't make this server public with a major issue like this.
Have a look at the network map on this page: https://cloud.google.com/about/locations/#network-tab
As you can see, Google's network is not connected between Europe and India, so traffic has to take a detour around the other side of the world, through Asia and the US.
Within a region, however (from Germany to Germany or from India to India), you should achieve low latency.
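If you want to confirm which way your packets actually go, a quick check from the client side (the address is a placeholder for your instance's external IP) is:

traceroute <instance-external-ip>      # tracert <instance-external-ip> on Windows
mtr --report <instance-external-ip>    # per-hop latency summary, if mtr is installed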
You're probably experiencing this issue due to the instance's machine type and vCPU count.
As stated in the documentation:
"Outbound or egress traffic from a virtual machine is subject to maximum network egress throughput caps. These caps are dependent on the number of vCPUs that a virtual machine instance has. Each core is subject to a 2 Gbits/second (Gbps) cap for peak performance. Each additional core increases the network cap, up to a theoretical maximum of 16 Gbps for each virtual machine".
With so little information about your setup, I unfortunately cannot help you further.
Please provide more information about your setup and your customers' needs.
For example, who will your customers be? From which country? Is this the reason why you're using a European region for your services while you live in India?
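In the meantime, a hedged way to check where a given instance stands relative to that egress cap (instance name and zone below are placeholders):

# from any machine with the gcloud SDK:
gcloud compute instances describe my-instance --zone=europe-west4-a --format="value(machineType)"
# on the VM itself:
nproc    # number of vCPUs, e.g. 1 on n1-standard-1, i.e. roughly a 2 Gbit/s egress cap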
Redis can give sub millisecond response times. That's a great promise. I'm testing heroku redis and I get 1ms up to about 8ms, for a zincrby. I'm using microtime() in php to wrap the call. This heroku redis (I'm using the free plan) is a shared instance and there is resource contention so I expect response times for identical queries to vary, and they certainly do.
I'm curious as to the cause of the difference in performance vs. redis installed on my macbook pro via homebrew. There's obviously no network latency there. What I'm curious about is does this mean that any cloud redis (i.e. connecting over the network, say within aws), is always going to be quite a bit slower than if I were to have one cloud server and run a redis inside the same physical machine, thus eliminating network latency?
There is also resource contention in these cloud offerings, unless a private server is chosen which costs a lot more.
Some numbers: my local macbook pro consistently gives 0.2ms for the identical zincrby that takes between 1ms & 8ms on the heroku redis.
Is network latency the cause of this?
No, probably not.
The typical latency of a 1 Gbit/s network is about 200us. That's 0.2ms.
What's more, in AWS you're probably on at least a 10 Gbit/s network.
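One way to see how much of the gap is pure round-trip time is redis-cli's built-in latency mode, run once against the remote instance and once against your local Redis (host, port and password below are placeholders):

redis-cli -h <heroku-redis-host> -p <port> -a <password> --latency
redis-cli -h 127.0.0.1 --latency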
As this page in the Redis manual explains, the main cause of the latency variation between these two environments will almost certainly be the higher intrinsic latency that comes from running Redis in a Linux container. There is a Redis command to test this on any particular system: redis-cli --intrinsic-latency 100 (see the manual page above).
i.e., network latency is not the dominant cause of the variation seen here.
Here is a checklist (from the Redis manual page linked above); a quick illustration of the aggregated-commands point follows the list.
If you can afford it, prefer a physical machine over a VM to host the server.
Do not systematically connect/disconnect to the server (especially true for web based applications). Keep your connections as long lived as possible.
If your client is on the same host than the server, use Unix domain sockets.
Prefer to use aggregated commands (MSET/MGET), or commands with variadic parameters (if possible) over pipelining.
Prefer to use pipelining (if possible) over sequence of roundtrips.
Redis supports Lua server-side scripting to cover cases that are not suitable for raw pipelining (for instance when the result of a command is an input for the following commands).
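For instance, with redis-cli (keys and values are arbitrary):

redis-cli SET page:1 10               # each call is its own connection and round trip
redis-cli SET page:2 20
redis-cli MSET page:1 10 page:2 20    # the aggregated form sets both keys in one round trip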
I have issues with websocket performance on AWS EC2.
I use websockets to listen to a server with incoming network rate 100-300 Kb/sec. Just listening, not sending. On EC2, every 10-20 minutes, I get disconnected (code 1006 - abnormal connection loss - no reason given). I have tested with t2.micro (which I believe should be more than enough for such a small task) and t2.large. I use US East, which should be close to the source.
This is to be compared with only one disconnection every few hours when I run the same app on my personal computer, in a different country. I have used two different libraries (Python aiohttp and websockets) to confirm that I have the same issues.
This points to an issue with network quality on EC2. However I'm not sure if this websockets task is demanding, so this is surprising.
Did anyone experience this before? What other diagnostics can I do to better understand the root cause?
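One diagnostic sketch, assuming you can capture traffic on the instance: record the connection with a short snap length and look at which side sends the FIN or RST when the 1006 close happens (interface, file name and server address are placeholders):

sudo tcpdump -i any -s 96 -w ws.pcap host <websocket-server-ip>
# after a disconnect, read the end of the trace to see who tore the connection down
tcpdump -r ws.pcap | tail -n 50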
When choosing an amazon aws instance type to launch, there is a property of each type which is "Network Performance" which is either "Low", "Moderate", or "High".
I'm wondering what this exactly means. Will my ping be lower if I choose low? Or will it be ok as long as many users aren't logged in at once?
I'm launching a real-time multiplayer game, so I am curious as to exactly what is meant by "network performance". I actually need fairly low memory and processing power, but instances with those criteria usually have "low" network performance.
Has anyone experience with the different network performances or have more information?
It's not official, but Serhiy Topchiy did a benchmark with different instance types:
http://epamcloud.blogspot.com.br/2013/03/testing-amazon-ec2-network-speed.html
For US-EAST-1, it seems that LOW corresponds to 50Mb/s, Moderate corresponds to 300Mb/s and High corresponds to 1Gb/s.
My recent experience here: https://serverfault.com/questions/1094608/benchmarking-aws-outbound-internet-bandwidth-egress-up-to-25-gbps
We ran a live video broadcast on two AWS EC2 servers, hosting 500 viewers, that degraded catastrophically after 10 minutes.
We reproduced the outbound bandwidth throttling with iperf (see link above).
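As a hedged sketch of such a reproduction with iperf3 (the receiver address is a placeholder), a long run is what exposes throttling that only sets in after several minutes of sustained load:

# on a receiver outside AWS
iperf3 -s
# on the EC2 instance: 20 minutes, reporting every 60 seconds
iperf3 -c <receiver-ip> -t 1200 -i 60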
I believe it was mentioned at the re:Invent 2013 conference that the different tiers are related to the underlying network connection: some servers have 10 Gbit/s connections (High), some have 1 Gbit/s (Moderate), and some have 100 Mbit/s (Low).
I cannot find any on-line documentation to confirm this, however.
Edit: There is an interesting article on the packets-per-second limit available here.
Since this question was first posed, AWS has released more information on the networking stack, and many of the newer instance families can support up to 25Gbps with the appropriate ENA drivers. It looks like much of the increased performance is due to the new Nitro system.
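If you want to confirm that a given instance is actually using ENA, a quick check on the instance itself (the interface name may differ, e.g. eth0 vs ens5):

ethtool -i eth0     # "driver: ena" means the ENA driver is in use
modinfo ena | head  # shows the installed ENA module version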