I need to run a performance test against an application based on Elastic Beanstalk in AWS, fronted by an ELB.
I expect traffic to be around 25 Gbit/s.
As per AWS requirements, I am using another account (dedicated to tests) from my AWS organisation.
The application is a production application in another account of my AWS organisation.
My performance test will use the DNS entry of the production website; it will be executed by EC2 instances in a subnet of a VPC that has an internet gateway.
I have a doubt regarding bandwidth: from the AWS documentation I have read, I can't tell whether there will be a bandwidth limitation or not.
From this answer it seems I may face such issues:
https://stackoverflow.com/a/62344703/9565222
In this case, how can I run a performance test that reflects what happens in production, i.e. one that goes through the DNS entry pointing to the ELB?
Let's say I create a peering connection between the test-account VPC and the production VPC: what is the max bandwidth?
My test shows that with 3 c5d.9xlarge instances using a VPC peering connection, I only get around 10 Gbit/s, so that would seem to be the max whatever the number of instances.
Another test shows that with 3 c5d.9xlarge instances using an Internet Gateway, I get varying bandwidth capped around 12 Gbit/s, but I cannot tell what the real limit is.
So what are my options?
- VPC Peering apparently is not one
- Internet Gateway from multiple machines may be, but I would like some kind of guarantee
- Are there better options (Transit Gateway?)?
I need to run a performance test against an application based on Elastic Beanstalk in AWS, fronted by an ELB. I expect traffic to be around 25 Gbit/s
That sounds totally fine, ELB can easily handle 25 Gbps.
Make sure that your test reflects what your production load is going to be like. If your production load is all coming from a very small number of sources, replicate that. If it's coming from a very large number of sources (e.g., lots of users of a client app, each generating a bit of traffic, resulting in a ton of total aggregated traffic), make sure you replicate that. There are differences that may seem nuanced if you're not experienced in this kind of testing, and reproducing the real environment as closely as possible is the easiest way to avoid any of those issues.
For testing with a very large number of relatively low-bandwidth sources, take a look at projects like these:
Bees with Machine Guns
Tsung
I have a doubt regarding bandwidth: from the AWS documentation I have read, I can't tell whether there will be a bandwidth limitation or not.
Some components in AWS have bandwidth limitations, some don't.
Specifically, each EC2 instance has a maximum bandwidth it supports, depending on the instance type. Also, even if a given EC2 instance type supports a certain bandwidth, you need to be sure that the OS running on that instance supports it too. This usually means ensuring that the correct network drivers are being used. In my experience, as long as you use the most recent version of Amazon Linux available, everything should "just work".
Also, as I mention in more detail later, VPC Peering Connections and Internet Gateways do not limit bandwidth.
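If you want to verify the driver side of this on a given instance, a quick check along these lines should work (the instance ID is a placeholder; run the first command on the instance itself):

```shell
# On the instance: check which network driver is in use.
# Modern instance types should report "ena" (Elastic Network Adapter);
# older drivers won't reach the advertised bandwidth.
ethtool -i eth0 | grep driver

# From outside: confirm ENA support is enabled on the instance.
aws ec2 describe-instances --instance-ids i-0abc12345example \
  --query "Reservations[].Instances[].EnaSupport"
```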
Let's say I create a Peering connection between the Test account VPC and production VPC, what is the max bandwidth ?
VPC Peering Connections are not a bandwidth bottleneck. That is, they don't limit the amount of bandwidth you have across the peering connection.
From the Amazon VPC FAQ:
Q. Are there any bandwidth limitations for peering connections?
Bandwidth between instances in peered VPCs is no different than bandwidth between instances in the same VPC.
[nb: there's a note about placement groups in the FAQs, but you didn't mention that, so I removed it; if you are using the feature, please clarify, as it's something you most likely shouldn't be using anyway, based on what you originally described in the question]
My test shows that with 3 c5d.9xlarge using a VPC Peering connection, I only get around 10 Gbit/s
The c5d.9xlarge instance type is limited to 10 Gbps, so if you use it for your test, you'll never see a single instance push more than 10 Gbps.
More info here: Amazon EC2 C5 Instances.
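As a rough sizing sanity check, assuming a hard 10 Gbit/s cap per c5d.9xlarge and some headroom so no single generator becomes the bottleneck, you can estimate how many load generators a 25 Gbit/s target needs (the numbers are illustrative):

```python
import math

# Assumed per-instance cap for c5d.9xlarge (Gbit/s); see the C5 instance docs.
per_instance_gbps = 10
target_gbps = 25

# Leave ~20% headroom so a single generator is never the bottleneck.
usable_gbps = per_instance_gbps * 0.8
instances_needed = math.ceil(target_gbps / usable_gbps)
print(instances_needed)  # generators needed at ~8 Gbit/s each
```

With these assumptions you'd want four generators rather than three, which also matches the observation above that three instances plateaued well below the target.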
Also, take a look at the EC2 C6g instances. I haven't personally used them, but they are supposed to be significantly faster and lower cost; they were released just 2 days ago.
Another test shows that with 3 c5d.9xlarge using an Internet Gateway, I get varying bandwidth capped around 12 Gbits/s [...]
The Internet Gateway isn't a bandwidth bottleneck. In other words, there's no bandwidth limit imposed by the Internet Gateway.
In fact, there's no "single device" that is an Internet Gateway. Think of it more as a "flag" that tells the VPC networking system that your VPC has a path to and from the Internet.
From the Amazon VPC FAQ:
Q. Are there any bandwidth limitations for Internet gateways? Do I need to be concerned about its availability? Can it be a single point of failure?
No. An Internet gateway is horizontally-scaled, redundant, and highly available. It imposes no bandwidth constraints.
So what are my options? - VPC Peering apparently is not one - Internet Gateway from multiple machines may be, but I would like some kind of guarantee - Are there better options (Transit Gateway?)?
VPC Peering is probably the best choice here. As I mentioned, it is not limiting your bandwidth. Check the other things I mentioned before: the instance type, the OS, the drivers, etc.
Using an Internet Gateway for this implies that, from a routing perspective, your traffic is "leaving AWS" and going "out to the Internet" (even though, physically, it probably won't ever truly leave AWS's physical devices). This means that, from a billing perspective, you'll be charged "Data Transfer Out to the Internet" rates. They are significantly higher than what you'd pay for VPC Peering.
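As a rough illustration of that billing difference (the per-GB rates below are ballpark figures, not current pricing; check the AWS pricing pages for real numbers):

```python
# Rough cost comparison for pushing 25 Gbit/s for one hour.
# Rates are illustrative us-east-1 ballpark figures, not current pricing.
GBIT_PER_GB = 8

hours = 1
gbits = 25 * 3600 * hours          # gigabits transferred in that hour
gb = gbits / GBIT_PER_GB           # gigabytes transferred

internet_egress_per_gb = 0.09      # "Data Transfer Out to the Internet"
peering_per_gb = 0.01              # same-region peering / cross-AZ rate

cost_internet = gb * internet_egress_per_gb
cost_peering = gb * peering_per_gb
print(f"{gb:.0f} GB/h -> internet ${cost_internet:.2f}, peering ${cost_peering:.2f}")
```

At a sustained 25 Gbit/s, even one hour of testing moves on the order of 11 TB, so the per-GB rate dominates the bill either way.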
I see no need for a Transit Gateway here, as the scenario you describe is really simple and can be solved with a VPC Peering Connection.
Related
I would like to specify the subnet for a DynamoDB or RDS database to decrease the latency when I access the database from my EC2 server. Is this possible? Right now, the latency is about 0.1 s for reads from DynamoDB by an EC2 server in the same region, which seems much too slow.
How do you measure the latency? I'm extremely surprised to hear that you're getting 100 ms latency with DynamoDB from an EC2 host. In my experience DynamoDB gives pretty consistent latency in the low tens of milliseconds: 10-20 ms at the 99th percentile and around 10 ms at the 90th percentile is pretty typical. The median latency (p50) is even lower, with the majority of requests completing in 4-5 ms.
Of course, how long requests take also depends on how much data you're shuttling around. Writing large items for instance may take longer than simply updating a smaller item.
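If you want to double-check your own measurement, compute percentiles from the raw samples rather than averaging, since a single slow outlier can drag a mean far above the typical request. A minimal sketch with synthetic numbers:

```python
# Compute latency percentiles from raw request timings (milliseconds).
# The sample data is synthetic, for illustration only.
def percentile(samples, p):
    s = sorted(samples)
    # nearest-rank on the sorted samples
    k = max(0, min(len(s) - 1, round(p / 100 * (len(s) - 1))))
    return s[k]

latencies_ms = [4, 5, 5, 6, 4, 7, 12, 5, 6, 90]  # one slow outlier
print(percentile(latencies_ms, 50))  # the typical request
print(percentile(latencies_ms, 99))  # the tail
```

Here the median is a few milliseconds even though the outlier pushes the mean past 14 ms, which is exactly how a "0.1 s" impression can arise from a handful of slow calls.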
If you do want to configure a VPC endpoint for DynamoDB though, that is totally possible and you can accomplish that using the aws ec2 create-vpc-endpoint command as shown in the following guide:
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/vpc-endpoints-dynamodb.html
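For reference, creating the gateway endpoint boils down to a single CLI call (the VPC and route-table IDs, and the region in the service name, are placeholders):

```shell
# Create a gateway VPC endpoint for DynamoDB.
aws ec2 create-vpc-endpoint \
  --vpc-id vpc-0abc123example \
  --service-name com.amazonaws.us-east-1.dynamodb \
  --route-table-ids rtb-0abc123example
```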
Note: In your question you're asking about RDS as well. While RDS does not offer VPC endpoints as a service, an RDS instance already lives inside your VPC, so it doesn't really make sense to talk about a VPC endpoint for RDS databases. You can simply create the instance without even giving it public internet access; requests to the RDS database will flow through the VPC router, straight to the RDS instance endpoint, all within the VPC.
A new api service we use requires that we give them a list of all the IP addresses our calls will be coming from; if we make an api call from any other IP address, the call will fail.
This question has been asked before here, but I'm wondering if in 2019 there is any simpler/easier/lower cost solution.
Our Setup
Elastic Beanstalk, which currently scales to anywhere from 5 - 50 ec2 instances for our web application based on traffic
An Application Load Balancer
Also have a worker tier, which would be available for use if that might be helpful
Typically these api calls would be coming from any of our web tier ec2 instances, as the calls will be based on a user interaction. We can of course set up something different, e.g. have the worker tier make the calls
Solutions I've Found
Give each ec2 instance an elastic (static) IP address. This is not a great solution for us, because as we hopefully continue to scale, the number of IP addresses needed will continue to grow {ref}
Set up two NAT instances (one not being sufficient as it would be a single point of failure). I'm hoping there is something simpler and lower cost than this option. {ref} {ref}
Create new ec2 instances and put them behind a Network Load Balancer. Again, complex and costly. {ref}
Are there any new, easier, less costly solutions? I have never used AWS Lambda before; maybe it is possible to run Lambda functions all from one IP address? I don't have many ideas beyond that at this point. Thanks for your time.
A NAT is the best solution, and shouldn't cost you much more than a web server.
The simplest way to use a NAT is the NAT Gateway. Pricing depends on region, but it's around $0.05/hour, which is a little more than the price of a t3.medium EC2 instance. You're also charged a per-GB rate for data, which can add up quickly. On the positive side, Amazon manages the infrastructure for you, including patches and high-availability.
A NAT Instance is an EC2 instance running a specially-configured AMI. You could probably get away with running this on a t3.micro instance, at $0.01 per hour, which is probably much less than any of your webservers. You will be responsible for applying patches and waking up in the middle of the night if anything goes wrong.
You can probably get away with a single NAT, of either type. You will pay for cross-AZ traffic by doing this ($0.01/GB), so it will be false economy if you move a lot of data across the NAT. It's a tossup on whether you'll get higher availability from two NATs, because you can only reference one at a time in your routing tables. So if one goes down you'll have to update the routing tables to point at the other, which will probably take as much time as bringing up a new instance.
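To put rough numbers on the hourly-cost difference (the rates are illustrative, roughly us-east-1 on-demand pricing at the time of writing; per-GB data charges apply on top of both and dominate at high volume):

```python
# Rough monthly cost comparison: NAT Gateway vs. a t3.micro NAT instance.
# Hourly rates are illustrative, not current pricing.
hours_per_month = 730

nat_gateway_hourly = 0.045
nat_instance_hourly = 0.0104   # t3.micro on-demand

nat_gateway_monthly = nat_gateway_hourly * hours_per_month
nat_instance_monthly = nat_instance_hourly * hours_per_month
print(f"NAT Gateway ~${nat_gateway_monthly:.2f}/mo, "
      f"NAT instance ~${nat_instance_monthly:.2f}/mo")
```

The hourly gap is real but small in absolute terms; whether the managed option is worth it mostly comes down to how much you value not operating the instance yourself.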
You can't use Lambda by itself, because your calls need to come from a fixed IP address and you can't control the outbound address of a plain Lambda function. You could write your own proxy server running on EC2, but the costs for that are the same as a NAT instance.
Here is prescriptive guidance from AWS: https://docs.aws.amazon.com/prescriptive-guidance/latest/patterns/generate-a-static-outbound-ip-address-using-a-lambda-function-amazon-vpc-and-a-serverless-architecture.html
"This pattern describes how to generate a static outbound IP address in the Amazon Web Services (AWS) Cloud by using a serverless architecture..."
Essentially, you have an AWS Lambda function that uses an Elastic IP address as the outbound IP address. In the guidance, you will create "a Lambda function and a virtual private cloud (VPC) that routes outbound traffic through an internet gateway with a static IP address. To use the static IP address, you attach the Lambda function to the VPC and its subnets. "
I recently started reading about and playing around with AWS. I have particular interest in the different high availability architectures that can be acheived using the platform. Specifically, I am looking for a reliable poor man's solution that can be implemented using the least amount of servers.
So far, I am satisfied with solutions for the main HA concerns: load balancing, redundancy, auto recovery, scalability ...
The only sticking point I have is with failover solutions.
Using an ELB might seem great, however ELB actually uses DNS balancing under the hood. See Is AWS's Elastic Load Balancer a single point of failure?. Also from a Netflix blog post: Lessons Netflix Learned from the AWS Outage
This is because the ELB is a two tier load balancing scheme. The first tier consists of basic DNS based round robin load balancing. This gets a client to an ELB endpoint in the cloud that is in one of the zones that your ELB is configured to use.
Now, I have learned DNS failover is not an ideal solution, as others have pointed out, mainly because of unpredictable DNS caching. See for example: Why is DNS failover not recommended?.
Other than ELBs, it seems to me that most AWS HA architectures rely on DNS failover using route 53.
Finally, the floating IP/Elastic IP (EIP) strategy has popped up in a very small number of articles, such as Leveraging Multiple IP Addresses for Virtual IP Address Fail-over and I'm having a hard time figuring out if this is a viable solution for production systems. Also, all examples I came across implemented this using a set of active-passive instances. It seems like a waste to have a passive for every active to achieve this.
In light of this, I would like to ask you what is a faster and more reliable way to perform failover?
More specifically, please discuss how to perform failover without using DNS for the following 2 setups:
2 active-active EC2 instances in separate AZs. Active-active, because this is a budget setup, where we can't afford to have an instance sitting around.
1 ELB with 2 EC2 instances in region A, 1 ELB with 2 EC2 instances in region B. Again, both regions are active and serving traffic. How do you handle the failover from 1 ELB to the other?
You'll understand ELB better by playing with it, if you are the inquisitive type, as I am.
"1" ELB provisioned in 2 availability zones is billed as 1 but deployed as 2. There are 2 IP addresses assigned, one to each balancer, and 2 A records auto-created, one for each, with very short TTLs.
Each of these 2 balancers will forward traffic to the instance in its same AZ, or you can enable cross-AZ load balancing (and you should, if you only have 1 server instance in each AZ).
These IP addresses do not change often and though it stands to reason that ELBs fail like anything else, I have maybe 30 of them and have never knowingly had a dead one on my hands, presumably because the ELB infrastructure will replace a dead instance and change the DNS without your intervention.
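You can see the per-AZ A records and the short TTL for yourself (the load balancer DNS name below is a placeholder):

```shell
# Resolve an ELB's DNS name; expect one A record per enabled AZ, short TTL.
dig +noall +answer my-elb-1234567890.us-east-1.elb.amazonaws.com
```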
For 2 regions, you have little choice other than using DNS at some level. Latency-based routing from Route 53 can send people to the closest site in normal operations and route all traffic to the other site in the event of an outage of an entire region (as detected by Route 53 health checks), but this is somewhat more likely to run into DNS-caching issues when the entire region is unavailable.
Of course, part of the active/passive dilemma in a single region using Elastic IP is easily remedied with HAProxy on both app servers. It's an HTTP request router and load balancer like ELB, but with a broader set of features. The code is so tight that you can likely run it on your app servers with negligible CPU consumption. The instance with the EIP would then balance traffic between its local app server and the peer. Across regions, HAProxy behind ELB could forward traffic to a mate in a remote region, if the local region is up but for whatever reason the application can't serve requests from the local region. (I have used such a setup to increase availability of external services, by bouncing the request to a remote AWS region when the direct Internet path from the local region is not working.)
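A minimal sketch of the HAProxy side of that setup, with hypothetical addresses and ports (each app server would run this; the instance holding the EIP balances between its local app server and the peer in the other AZ):

```
# /etc/haproxy/haproxy.cfg -- minimal sketch; addresses are hypothetical.
defaults
    mode http
    timeout connect 5s
    timeout client  30s
    timeout server  30s

frontend www
    bind *:80
    default_backend app

backend app
    # local app server, plus the peer instance in the other AZ
    server local 127.0.0.1:8080 check
    server peer  10.0.2.10:8080 check
```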
We have a web application running on ec2 instance.
We have added AWS ELB to route all request to application to load balancer.
SSL certificate has been applied to ELB.
I am worried: is the HTTP communication between the ELB and the EC2 instance secure?
or
should I use HTTPS communication between ELB and ec2 instance?
Does AWS guarantee the security of HTTP communication between the ELB and the EC2 instance?
I answered a similar question once but would like to highlight some points:
Use VPC with proper Security Groups setup (must) and network ACL (optional).
Watch how you distribute your private keys. AWS made this easy by storing the key safely in their system and never using it again on your servers. It is probably better to use self-signed certificates on your servers (reducing the chance of leaking your real private keys)
SSL is cheap these days (compute wise)
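A self-signed backend certificate like the one mentioned above can be generated in one step (a sketch; the filenames and subject are arbitrary placeholders):

```shell
# Generate a throwaway self-signed certificate + key for the backend leg.
# 2048-bit RSA, no passphrase, valid for one year.
openssl req -x509 -newkey rsa:2048 -nodes \
  -keyout server.key -out server.crt -days 365 \
  -subj "/CN=backend.internal"

# Quick sanity check of what was produced:
openssl x509 -in server.crt -noout -subject
```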
It all depends on your security requirements, regulations and how much complexity overhead are you willing to take.
AWS does provide some guarantees (see the network section) against spoofing / retrieval of information by other tenants, but the safe assumption is that a multi-tenant public cloud environment is not 100% hygienic, and you should encrypt.
A single-tenant instance (as suggested by @andreimarinescu) will not help, as the attack vector discussed here is the network between the ELB (shared environment) and your instance (however, it might help against Xen zero-days).
Long answer with short summary - encrypt.
Absolute control over security and cloud deployments are in my opinion two things that don't mix very well.
Regarding the security of traffic between the ELB and the EC2 instances, you should probably deploy your resources in a VPC in order to add a new layer of isolation. AWS doesn't offer any security guarantees.
If the data transferred is too sensitive, you might also want to look at deploying in a dedicated data-center where you can have greater control over the networking aspects. Also, you might want to look at single-tenant instances on EC2, since you're probably sharing your physical resources with other EC2 customers.
That being said, there's one aspect you should also take into account: SSL termination is quite an expensive job, so terminating SSL at the ELB level will allow your backend instances to focus resources on actually fulfilling requests. But this also impacts the ELB: it will scale automatically, but it will have to do so faster, and you might see increased latency during traffic spikes while it does.
I'm trying to get my EC2 instances to communicate better with APIs of a 3rd party service. Latency is extremely important as voice communication is heavily involved & lag is intolerable.
I know a few of the providers use EC2, but Amazon's IP system makes it difficult to find which region an instance is in. With non-elastic-IP services I could do a whois and find whether it was in Australia or somewhere in Europe, so I could put a server close by.
With these elastic IPs, how can I find which zone they're in? I can use ping times, but it's a bit of a guess, and I'd have to create instances in different regions just to find the shortest ping time.
Amazon EC2 regularly publishes its Amazon EC2 Public IP Ranges, which clusters them by Region.
It does not cluster them by Availability Zone (AZ) (if you actually meant that literally), but this shouldn't matter much, insofar as cross-AZ latency should typically be in the single-digit-millisecond range.
Other than that you might also be interested in my answer to How could I determine which AWS location is best for serving customers from a particular region?, which outlines two other options for handling this based on external data/algorithms or via the Multi-Region Latency Based Routing now Available for AWS (which would likely only be useful when fully embracing Amazon Route 53 as well).
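The published ip-ranges.json is a list of prefix entries mapping CIDR blocks to regions, so the lookup itself is trivial. A minimal offline sketch (the two sample prefixes are illustrative, not current data; fetch the real file from ip-ranges.amazonaws.com):

```python
import ipaddress

# Tiny illustrative sample in the shape of the "prefixes" entries in
# AWS's published ip-ranges.json (not current data).
sample_prefixes = [
    {"ip_prefix": "52.95.110.0/24", "region": "us-east-1", "service": "EC2"},
    {"ip_prefix": "13.54.0.0/15", "region": "ap-southeast-2", "service": "EC2"},
]

def region_for_ip(ip, prefixes):
    """Return the region whose CIDR block contains ip, or None."""
    addr = ipaddress.ip_address(ip)
    for p in prefixes:
        if addr in ipaddress.ip_network(p["ip_prefix"]):
            return p["region"]
    return None

print(region_for_ip("13.54.1.2", sample_prefixes))  # ap-southeast-2
print(region_for_ip("8.8.8.8", sample_prefixes))    # None (not an AWS range here)
```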
Put your servers behind a Route 53 DNS entry and let Latency Based Routing do the rest for you: it can automatically pick the lowest-latency server.
http://aws.typepad.com/aws/2012/03/latency-based-multi-region-routing-now-available-for-aws.html