AWS EC2 instance works with thousands of HTTP calls simultaneously

I have created a JSP page. Whenever I send an HTTP request to this page, I create an EC2 instance on AWS and send back the public IP in the response, but it takes at least 10 seconds to respond.
I want to test 1000 HTTP calls simultaneously, which means 1000 EC2 instances will be initialized in AWS.
How can I test this? How can I generate 1000 requests to this page and then get the responses back?

There are several tools for simulating traffic. One example is Apache JMeter. It takes some setting up, but once you have it configured you can send as many HTTP requests as the machine generating them can handle.

For a simple test, you can use something like siege. There is also an AWS-based tool called Bees with Machine Guns, which can distribute the load across multiple EC2 instances.

You will have to use some kind of load-testing tool to achieve this. I suggest JMeter. You can also read this useful blog post on distributed load testing with JMeter and EC2:
http://vdaubry.github.io/2015/02/24/Distributed-load-testing-with-jmeter-and-EC2/
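If you just want a quick smoke test before committing to a full JMeter setup, a short script that fires concurrent requests may be enough. Here is a minimal Python sketch; the URL is a placeholder for your JSP page, and the concurrency level is only a guess you should tune to what the sending machine can handle.

```python
# Minimal concurrent load-test sketch (not a JMeter replacement).
# Assumes the JSP page is reachable at the placeholder URL below.
from concurrent.futures import ThreadPoolExecutor, as_completed
import time
import requests

URL = "http://example.com/launch.jsp"  # placeholder: your JSP endpoint
TOTAL_REQUESTS = 1000
CONCURRENCY = 100  # tune to what the client machine can sustain

def call_once(i):
    start = time.time()
    resp = requests.get(URL, timeout=60)
    return i, resp.status_code, resp.text.strip(), time.time() - start

results = []
with ThreadPoolExecutor(max_workers=CONCURRENCY) as pool:
    futures = [pool.submit(call_once, i) for i in range(TOTAL_REQUESTS)]
    for future in as_completed(futures):
        i, status, body, elapsed = future.result()
        results.append((status, elapsed))
        print(f"request {i}: HTTP {status}, response {body!r}, {elapsed:.1f}s")

ok = sum(1 for status, _ in results if status == 200)
print(f"{ok}/{TOTAL_REQUESTS} succeeded")
```

Note that 1000 truly simultaneous requests from a single machine may be limited by local sockets and CPU; that is where JMeter's distributed mode or Bees with Machine Guns comes in.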

Related

How to block strange HTTP requests from AWS ELB

I have a bunch of strange logs in CloudWatch coming from the ALB, and they look like this:
2020-11-03T14:52:57.289+09:00 Not Found: /owa/auth/logon.aspx
2020-11-03T15:23:20.120+09:00 Not Found: /.env
2020-11-03T15:35:39.482+09:00 Not Found: /index.php
I use CloudWatch to log server data, so this really bothers me. I would like to know how to block these requests.
Welcome to the Internet! There are many strange bots running on the Internet that are trying to access systems using known vulnerabilities. Any device connected to the Internet will regularly receive such requests. Take a look at the logs in your home router to see an example of what takes place.
You could add a Web Application Firewall (AWS WAF) to the Load Balancer, which can block defined patterns of requests. However, it might not be worth the effort/expense if your goal is merely to clean up the log file.
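If you do decide WAF is worth it, something along these lines can block the obvious probe paths. This is a rough boto3 sketch, not a drop-in script: the ACL name, ARNs, region and the list of paths are placeholders, and the AWS Managed Rules (e.g. the common rule set) may be a simpler starting point than hand-written byte-match rules.

```python
# Rough sketch: create a WAFv2 web ACL that blocks a few probe paths
# and associate it with an ALB. Names/ARNs/region below are placeholders.
import boto3

wafv2 = boto3.client("wafv2", region_name="ap-northeast-1")  # placeholder region

def uri_ends_with(path: str) -> dict:
    """ByteMatchStatement matching requests whose URI path ends with `path`."""
    return {
        "ByteMatchStatement": {
            "SearchString": path.encode(),
            "FieldToMatch": {"UriPath": {}},
            "TextTransformations": [{"Priority": 0, "Type": "LOWERCASE"}],
            "PositionalConstraint": "ENDS_WITH",
        }
    }

block_probes_rule = {
    "Name": "block-known-probe-paths",
    "Priority": 0,
    "Statement": {
        "OrStatement": {
            "Statements": [uri_ends_with("/.env"), uri_ends_with("/owa/auth/logon.aspx")]
        }
    },
    "Action": {"Block": {}},
    "VisibilityConfig": {
        "SampledRequestsEnabled": True,
        "CloudWatchMetricsEnabled": True,
        "MetricName": "BlockKnownProbePaths",
    },
}

acl = wafv2.create_web_acl(
    Name="alb-probe-blocker",  # placeholder name
    Scope="REGIONAL",          # ALBs use the REGIONAL scope
    DefaultAction={"Allow": {}},
    Rules=[block_probes_rule],
    VisibilityConfig={
        "SampledRequestsEnabled": True,
        "CloudWatchMetricsEnabled": True,
        "MetricName": "AlbProbeBlocker",
    },
)

wafv2.associate_web_acl(
    WebACLArn=acl["Summary"]["ARN"],
    ResourceArn="arn:aws:elasticloadbalancing:...:loadbalancer/app/...",  # your ALB ARN
)
```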

Does anyone know if Cloud Run supports HTTP/2 streaming even though it does NOT support HTTP/1.1 streaming?

We have a streaming endpoint where data streams through our api.domain.com service to our backend.domain.com service, and as chunks are received in backend.domain.com we write them to the database. This way we can stream NDJSON into our servers and it is very fast.
We were very disappointed to find out that the Cloud Run firewalls, at least for HTTP/1.1 (via curl), do NOT support streaming. curl speaks HTTP/2 to the Google Cloud Run firewall, but Google by default hits our servers with HTTP/1.1 (though I saw an option to start the service in HTTP/2 mode that we have not tried).
What I mean by "they don't support streaming" is that Google does not send our servers the request UNTIL the whole request has been received by them (i.e. not just the headers; it needs to receive the entire body). This makes things very slow compared to streaming straight through firewall 1, Cloud Run service 1, firewall 2, Cloud Run service 2, database.
I am wondering if Google's Cloud Run firewall by chance supports HTTP/2 streaming and actually forwards the request as it arrives instead of waiting for the entire body.
I realize Google has body size limits, AND I realize we respond to clients with 200 OK before the entire body is received (i.e. we stream back while the request is being streamed in), so I am totally OK with Google killing the connection if the size limits are exceeded.
So my second question in this post is: if they do support streaming, what will they do when the size limit is exceeded, given that I will already have responded with 200 OK at that point?
In this post, my definition of streaming is 'true streaming': you can stream a request into a system and that system can forward it to the next system and keep reading/forwarding rather than waiting for the whole request. The Google Cloud Run firewall does NOT meet my definition of streaming, since it does not pass through the chunks it receives. Our servers send data as they receive it, so if there are many hops there is no impact, thanks to the webpieces web server.
Unfortunately, Cloud Run doesn't support HTTP/2 end-to-end to the serving instance.
Server-side streaming is in ALPHA. I'm not sure if it helps solve your problem; if it does, please fill out the following form to opt in, thanks!
https://docs.google.com/forms/d/e/1FAIpQLSfjwvwFYFFd2yqnV3m0zCe7ua_d6eWiB3WSvIVk50W0O9_mvQ/viewform
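To make the "true streaming" behaviour from the question concrete: a client that streams a chunked request body looks roughly like the sketch below (the endpoint path and chunk source are made up), and the question is whether the hop in front of the container forwards those chunks as they arrive or buffers the whole body first.

```python
# Minimal chunked-upload sketch: the body is generated lazily, so a hop
# that truly streams can forward each chunk before the upload finishes.
# Endpoint path and chunk source are placeholders.
import requests

def ndjson_chunks():
    """Yield NDJSON lines one at a time instead of building the body up front."""
    for i in range(1_000_000):
        yield (f'{{"row": {i}}}\n').encode()

resp = requests.post(
    "https://api.domain.com/ingest",   # placeholder endpoint
    data=ndjson_chunks(),              # generator body -> chunked transfer encoding
    headers={"Content-Type": "application/x-ndjson"},
    stream=True,                       # also stream the response back
)
for line in resp.iter_lines():
    print(line)
```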

504 Gateway Time-out from google cloud platform, but only sometimes

I'm hosting a single-node Couchbase cluster in GCP and a Flask backend on an OpenShift cluster, which serves an Angular frontend. The problem is that when my Angular app sends a POST to Flask, the connection to the VM (Couchbase) sometimes takes too long, so Flask has to return a "504 Gateway Time-out". But this happens only sometimes; at other times it works with proper speed. I'm not able to troubleshoot it. The total data size is less than 100 MB, and everything is 100% memory-resident in Couchbase, so I guess this is not a problem with Couchbase itself, just connection latency to GCP.
My guess is that when your Flask backend connects to your VM for the first time, it takes longer than usual because it needs to establish the connection, authenticate and possibly do other things depending on your use case.
This is a common problem when hosting your app on App Engine or something similar, and the solution there is to use "warm-up requests". These spin up the whole connection (and, in App Engine's case, the instance) and make a test connection, so that when the real request comes everything is already set up.
So I suggest that you check how warm-up requests work and configure something similar between your Flask app and the VM: basically a route in Flask whose only purpose is to establish a test connection with a small test payload. This way your next connection will be up to speed, with no 504 errors.
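A minimal sketch of such a warm-up route is below. It only does a plain TCP reachability check; the host and port are placeholders, and a real warm-up would ideally go through the same Couchbase SDK client your POST handler uses, so its connection pool and authentication are primed as well.

```python
# Sketch of a Flask warm-up route: it opens a connection to the Couchbase VM
# ahead of real traffic so the first user-facing POST doesn't pay the
# connection cost. Host/port below are placeholders.
import socket
from flask import Flask, jsonify

app = Flask(__name__)

COUCHBASE_HOST = "10.0.0.5"   # placeholder: your Couchbase VM's address
COUCHBASE_PORT = 8091         # Couchbase management/REST port

@app.route("/warmup")
def warmup():
    try:
        with socket.create_connection((COUCHBASE_HOST, COUCHBASE_PORT), timeout=5):
            pass
        return jsonify(status="warm"), 200
    except OSError as exc:
        return jsonify(status="unreachable", error=str(exc)), 503
```

You can then hit /warmup from a scheduler (Cloud Scheduler, a cron job, etc.) every few minutes so the connection is never cold when a real request arrives.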
Try clearing the load balancer's cache in the GCP console.
I already faced the same kind of issue and resolved it using the above technique.

Architecture Design for API of Cloud Service

Background:
I have a local application that processes the user's input for about 3 seconds and then returns an answer (output) to the user.
(I don't want to go into details about my application, so as not to complicate the question and to keep it a pure architecture question.)
My Goal:
I want to turn my application into a cloud service and expose an API
(for the upcoming website and for clients that will use the service without installing the software locally).
Possible Solutions:
1. Deploy WCF in the cloud and run my application there, so clients can invoke the service and use my application in the cloud (RPC style).
2. Use a Web API that inserts the request into a queue; a worker role then dequeues requests and posts the results to a DB. The client sends one request to create the queue entry and another request to get the result (which the Web API reads from the DB).
The Problems:
If I go with the WCF solution (#1), I can't handle large loads of requests, maybe 10-20 simultaneously.
If I go with the WebAPI-Queue-WorkerRole solution (#2), the client will sometimes need to request the results multiple times, which can be a problem.
If I go with the WebAPI-Queue-WorkerRole solution (#2), the process isn't synchronous: the client will not get the result as soon as his request is done; he has to ask for it.
Questions:
In the WebAPI-Queue-WorkerRole solution (#2), can I somehow alert the client once his request has been processed, so I can save the client multiple requests (polling for the result)?
Isn't asking multiple times for the result outdated? I remember that 10-15 years ago it was accepted, but now? I know that the VirusTotal API uses this kind of design.
Is there a better solution, one that will handle large loads and return the result to the client (synchronously or asynchronously) as soon as it is done?
Thank you.
If you're using Azure, why not simply fire up more servers and use load balancing to handle more load? In that way, as your load increases, you have more servers to handle the requests.
Microsoft recently made available the Azure Service Fabric, which gives you a lot of control over spinning up and shutting down these services.
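As for the polling concern in (#2), the usual alternatives are a status endpoint the client polls and a callback/webhook the worker calls when the result is ready. The sketch below illustrates the pattern in Python for brevity (the original stack is WCF/Web API, so treat the endpoint names, the in-memory queue and dict as illustrative stand-ins for a real queue service and database).

```python
# Illustrative sketch of the Web-API + queue + worker pattern, with both
# polling (GET /jobs/<id>) and an optional callback the worker calls when done.
import threading
import time
import uuid
import queue

import requests
from flask import Flask, request, jsonify

app = Flask(__name__)
jobs = {}                   # job_id -> {"status": ..., "result": ...}; stands in for a DB
work_queue = queue.Queue()  # stands in for a real queue service

def worker():
    while True:
        job_id, payload, callback_url = work_queue.get()
        time.sleep(3)  # the ~3 second processing step
        jobs[job_id] = {"status": "done", "result": payload.upper()}  # toy "answer"
        if callback_url:
            # Push the result instead of making the client poll.
            requests.post(callback_url, json={"job_id": job_id, **jobs[job_id]}, timeout=10)

threading.Thread(target=worker, daemon=True).start()

@app.route("/jobs", methods=["POST"])
def submit():
    job_id = str(uuid.uuid4())
    body = request.get_json()
    jobs[job_id] = {"status": "pending", "result": None}
    work_queue.put((job_id, body["input"], body.get("callback_url")))
    return jsonify(job_id=job_id), 202  # 202 Accepted: result comes later

@app.route("/jobs/<job_id>", methods=["GET"])
def status(job_id):
    return jsonify(jobs.get(job_id, {"status": "unknown"}))
```

If you want to notify the client without polling and without asking it to expose a callback URL, a persistent channel (WebSockets, or SignalR in the .NET world) is the other common option.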

How to write test cases to check high availability of web service

I have two servers, server1 and server2, with a load balancer to maintain high availability. Now I have to deploy a web service on server1 and server2, with a single URL for accessing the web service on both servers. I then have to write a few tests to check the HA of the servers, for example:
1. If I switch off or take out server1, the service should not stop; instead the response should come from server2, and in the test script I have to show that the response is coming from server2.
Any help?
There are probably fancier ways to do this, but I simply modified the content on each server a little: I added a hidden string to the web page to indicate which server the content is returned from.
The test script can check this value and do a count to see how many times it is served from each. If you are doing manual testing you can simply view the source of the web page (if HTML is what you are looking at) and check the value. This is helpful to see if you are getting a true balanced system as well as seeing if it works when one server drops off.
This probably won't be an issue, but there are some load balancing systems that glue a user to a particular server so after the first response all additional responses will come from the same place. To handle this you need to make sure requests appear independent to the system handling the balancing. This typically means a clean session but it could depend on other variables (possibly IP address) as well.
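A test along those lines can be as simple as the sketch below: it hits the load-balanced URL repeatedly, reads the hidden marker (here assumed to look like "served-by: server1", which is just one way you might embed it), counts hits per server, and is then rerun with server1 switched off to show that everything is answered by server2.

```python
# Sketch of an HA check against the load-balanced URL. The marker format
# ("served-by: <name>") is an assumption; adapt the regex to however you
# embed the server identity in the page.
import re
from collections import Counter

import requests

URL = "http://your-load-balanced-url/"   # placeholder
MARKER = re.compile(r"served-by:\s*(\S+)")

def count_servers(n=50):
    """Hit the URL n times and count which server answered each request.
    requests.get() uses a fresh session per call, so cookie-based sticky
    sessions don't hide the balancing."""
    counts = Counter()
    for _ in range(n):
        resp = requests.get(URL, timeout=10)
        match = MARKER.search(resp.text)
        counts[match.group(1) if match else "unknown"] += 1
    return counts

# Phase 1: both servers up -- responses should be spread across server1 and server2.
print("both up:", count_servers())

input("Switch off server1, then press Enter to continue...")

# Phase 2: only server2 left -- every response should now come from server2.
after = count_servers()
print("server1 down:", after)
assert after.get("server1", 0) == 0 and after.get("server2", 0) > 0
```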