appConcurrentRequestLimit exceeded on IIS - amazon-web-services

I deployed a .NET Core MVC application on an AWS Windows Server 2019 instance (32 GB RAM, 8 cores). Because it is an online exam application, it needs to serve 100k concurrent requests. Which server should I use?

How many concurrent connections you can handle depends on several settings: the maximum concurrent connections in the site's advanced settings, the queue length and maximum worker processes in the application pool's advanced settings, and the maximum threads in the thread pool. Besides these, note that system.webServer/serverRuntime has an appConcurrentRequestLimit that defaults to 5000. So if you need to achieve a high number of concurrent requests, you can raise it under:
IIS Manager -> site -> Configuration Editor -> system.webServer/serverRuntime -> appConcurrentRequestLimit.
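The same limit can also be set directly in configuration. A minimal sketch of the web.config form, where 100000 is only an illustrative value (the serverRuntime section is locked at the server level by default, so you may need to unlock it first or edit applicationHost.config instead):

<system.webServer>
  <!-- raise the default appConcurrentRequestLimit of 5000 -->
  <serverRuntime appConcurrentRequestLimit="100000" />
</system.webServer>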

Related

pfSense Enterprise Firewall Testing

I am working on firewall performance testing and need to know about pfSense Enterprise Firewall performance: how many concurrent users and sessions does it support at maximum?
I have used JMeter for concurrent users, but I could not find a tool to measure a firewall's maximum sessions, or both maximum concurrent users and maximum sessions together. Is there a tool to test both?
Depending on the type of traffic you want to simulate, the number of sessions may be equal to or higher than the number of threads.
In the majority of cases, the number of users == the number of sessions.
For the HTTP protocol, if you enable embedded resource downloading, then for requests that include embedded resources the number of sessions can be up to 6x higher, since JMeter (like a browser) fetches embedded resources over up to six parallel connections by default. This can be visualized using e.g. the Server Hits Per Second chart (installable via the JMeter Plugins Manager).

ring jetty adapter - limit concurrent requests

Question:
How do you configure ring-jetty-adapter to limit the number of concurrent worker threads? I'm using embedded Jetty here, not creating a WAR file or anything like that.
Context
I only have 20 connections in my database connection pool, and all requests need to do database queries. Currently, when the server load gets too high, say 40 concurrent requests continually, 20 of them will be blocked waiting for the DB. The queue then keeps building up, and the wait spikes out of control (thread starvation).
The :max-threads parameter does not do what I want, as it only limits the size of Jetty's internal thread pool, which is also used for acceptor and selector threads, not just worker threads.
After some research I think what I need is the Jetty QoSFilter, but I can't figure out how to translate a web.xml configuration to my Clojure ring app.
First, you cannot control or limit this behavior by adjusting the threading configuration.
The old-school model of one thread per request is not valid on modern containers, especially on Jetty, which is 100% async internally.
A single request can use 1..n threads through its lifetime. The threading behavior in Jetty is influenced by your technology choices (e.g. OS, JVM, network protocols), your API choices, and even how stressed your server is.
With that out of the way, your solution should instead focus on limiting the number of requests that can concurrently access a specific server resource endpoint.
This is accomplished by tracking the number of active requests to that endpoint, suspending requests that exceed a configured maximum, resuming suspended requests when the active count falls below that maximum, and timing out requests that sit in the suspended state for too long.
This feature set is provided for you in the Jetty QoSFilter.
You can use the Jetty QoSFilter for anything in Jetty that is based on the Jetty ServletContextHandler (which includes the WebAppContext).
See: https://www.eclipse.org/jetty/documentation/jetty-9/index.html#qos-filter
import java.util.EnumSet;
import javax.servlet.DispatcherType;
import org.eclipse.jetty.servlet.FilterHolder;
import org.eclipse.jetty.servlets.QoSFilter;

// servletContextHandler is your existing ServletContextHandler (or WebAppContext)
FilterHolder qosConfig = servletContextHandler.addFilter(QoSFilter.class,
    "/api/backend/*",
    EnumSet.of(DispatcherType.REQUEST));
qosConfig.setInitParameter("maxRequests", "10"); // at most 10 active requests on this path
qosConfig.setInitParameter("waitMs", "50");      // wait up to 50 ms before suspending a request
qosConfig.setInitParameter("suspendMs", "-1");   // -1 = use the container's default suspend timeout

With ring-jetty-adapter the server's handler is not a ServletContextHandler by default, so you would need something like the adapter's :configurator option to install one that wraps your ring handler before adding the filter.

difference between core connections and i/o threads in Datastax Cassandra C++ driver

I set the number of core connections per host using cass_cluster_set_max_connections_per_host() and the number of I/O threads using cass_cluster_set_num_threads_io().
Using the netstat command, I see that my client host establishes (core connections × number of I/O threads) TCP connections to each host in my cluster. I am wondering: what is the difference between an I/O thread and a core connection? Also, if a client communicates with a Cassandra cluster of 10 hosts, core connections is set to 2, and I/O threads is set to 4, then there are essentially 10 × 4 × 2 = 80 connections established from one client host to the cluster, all in a single session. How are those connections utilized? Doesn't that seem extraneous?
I am trying to tune these values so that the speed does not degrade when the cluster is accessed by 100 client hosts simultaneously. Or are these settings unrelated to speed? Any more information or links are appreciated!
This is what the official documentation says about these fields:
cass_cluster_set_num_threads_io: the number of threads that will handle query requests. Default value: 1.
cass_cluster_set_max_connections_per_host: sets the maximum number of connections made to each server in each IO thread. Default value: 2.
I am wondering: what is the difference between an I/O thread and a core connection?
I/O threads are basically responsible for all the network operations between the client and the server. So if you have 1000 messages waiting on network operations, these threads pick up the requests one by one and execute them. The default value is 1.
Once a message is picked up by an I/O thread, it uses one of the connections specified via the max-connections setting to make the request. The default value there is 2, so that the I/O thread can intelligently switch between connections based on server latency and throughput.
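As a concrete illustration, here is a minimal sketch of wiring these knobs up with the C/C++ driver; the contact point and the values 4 and 2 are hypothetical examples, not recommendations:

#include <cassandra.h>

CassCluster* cluster = cass_cluster_new();
cass_cluster_set_contact_points(cluster, "10.0.0.1"); // hypothetical contact point

// 4 I/O threads, each holding up to 2 connections per host: against a
// 10-host cluster this allows up to 10 * 4 * 2 = 80 TCP connections.
cass_cluster_set_num_threads_io(cluster, 4);
cass_cluster_set_max_connections_per_host(cluster, 2);

CassSession* session = cass_session_new();
CassFuture* connect_future = cass_session_connect(session, cluster);
cass_future_wait(connect_future); // block until the session is ready
cass_future_free(connect_future);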
I am trying to tune these values so that the speed does not degrade when the cluster is accessed by 100 client hosts simultaneously.
You can either keep max connections constant and increase the number of I/O threads, or the other way around, to scale. There is no clearly better approach between the two; you will need to benchmark and see which works for your case.
I think that if you have fewer but larger requests, then increasing the number of connections makes more sense, but that still requires benchmarking.
This link also provides some extra info.

Web service request times out after app pool recycles

I have a classic web service that is hosted on IIS 7.5 (Windows Server 2008 R2).
After the application pool recycles (by default after 20 minutes of idle time), the first request to the web service takes about 5 minutes. Once it gets through, every subsequent request takes no time at all.
I read about turning on AlwaysRunning in IIS 7.5 via applicationHost.config. However, I would appreciate it if anybody could explain why this happens and where to look for the cause of the problem.
Thank you in advance.
I avoid a cold start by having a heartbeat execute prior to the app pool recycle interval. However, you still need to let the app pools recycle at some predetermined interval. See this post on cold starts. Generally, the more dependencies your app consumes and the larger your code base is, the longer it will take to "wake up" on a cold start. The delay is not really noticeable for smaller apps.
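If you would rather keep the pool warm than ping it, the AlwaysRunning behavior mentioned in the question is configured in applicationHost.config. A sketch, assuming a hypothetical pool named MyServicePool and illustrative values:

<applicationPools>
  <!-- startMode="AlwaysRunning" (IIS 7.5+) starts the worker process
       together with IIS instead of on the first request -->
  <add name="MyServicePool" startMode="AlwaysRunning">
    <!-- an idleTimeout of 0 disables the default 20-minute idle shutdown -->
    <processModel idleTimeout="00:00:00" />
  </add>
</applicationPools>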

Twisted connections timeout under heavy load

We have a Django web app that serves a moderate number of users, running on an Ubuntu machine with 8 cores and at least 32 GB RAM. We have no problems with users connecting via their browser. However, on the backend (on the same server) we are also running a Twisted server. The Django web app tries to connect to our Twisted server, but after about 1100-1200 such connections (including a bunch of persistent connections to other devices on the backend), all the connections start to time out. Our Twisted server worked fine under low load, but now it seems unable to handle any new connections from Django.

We do not see anything obviously wrong with our code, which we have been working on for a couple of years now, so it should be pretty stable. We have already set our soft and hard ulimits in /etc/security/limits.conf to 50000/65000, and we have upped somaxconn to 65536. The limits for our Twisted process are listed below. The total number of open files across the top 25 processes is just over 5000. Unfortunately, we still cannot get more than roughly 1100-1200 simultaneous connections to our Twisted server.

What should we look at to make our Twisted connections start connecting again? Are there other sysctl or Ubuntu Linux parameters we need to change? Are there Twisted parameters we need to change?
Limit                     Soft Limit     Hard Limit     Units
Max cpu time              unlimited      unlimited      seconds
Max file size             unlimited      unlimited      bytes
Max data size             unlimited      unlimited      bytes
Max stack size            8388608        unlimited      bytes
Max core file size        0              unlimited      bytes
Max resident set          unlimited      unlimited      bytes
Max processes             465901         465901         processes
Max open files            50000          65000          files
Max locked memory         65536          65536          bytes
Max address space         unlimited      unlimited      bytes
Max file locks            unlimited      unlimited      locks
Max pending signals       465901         465901         signals
Max msgqueue size         819200         819200         bytes
Max nice priority         0              0
Max realtime priority     0              0
Max realtime timeout      unlimited      unlimited      us
Twisted is a thin shell around your application. When there is a performance problem, the problem is almost always somewhere inside the application and not in Twisted. So there is no general answer to this question.
That said, there are a couple of investigation techniques you could use. Is your Twisted process consuming 100% CPU? If so, then you are going to need to split it up into multiple processes somehow (using spawnProcess, sendFileDescriptor and adoptStreamPort to allow I/O to be done in subprocesses). If not, then your problem is probably some inadvertent blocking I/O preventing the reactor from servicing requests: you might use something like twisted_hang to diagnose hot-spots where the reactor is getting "stuck".
There's also the possibility that the problem could be on Django's side of the connection. However, with no information about how Django is making those connections, there's little more I can even guess.