Web service request times out after app pool recycles - web-services

I have a classic web service that is hosted on IIS 7.5 (Windows Server 2008 R2).
After application pool recycles (default 20 minutes idle state), the first request to the web service takes about 5 minutes. When it gets through, every other request to the service takes no time at all.
I read about turning on AlwaysRunning in the IIS 7.5 that is in applicationHost.config. However, I would appreciate if anybody can provide explanation why would it happen and where to search for the cause of the problem.
Thank you in advance.

I avoid a cold start by having a heartbeat execute prior to the app pool recycle interval. However, you still need to let the app pools recycle at some pre-determined interval. See this post on cold starts. Generally, the more dependencies your app consumes and the larger your code base is then the longer it will take to "wake up" on a cold start. The delay is not really noticeable for smaller apps.

Related

504 gateway timeout for any requests to Nginx with lot of free resources

We have been maintaining a project internally which has both web and mobile application platform. The backend of the project is developed in Django 1.9 (Python 3.4) and deployed in AWS.
The server stack consists of Nginx, Gunicorn, Django and PostgreSQL. We use Redis based cache server to serve resource intensive heavy queries. Our AWS resources include:
t1.medium EC2 (2 core, 4 GB RAM)
PostgreSQL RDS with one additional read-replica.
Right now Gunicorn is set to create 5 workers (by following the 2*n+1 rule). Load wise, there are like 20-30 mobile users making requests in every minute and there are 5-10 users checking the web panel every hour. So I would say, not very much load.
Now this setup works alright for 80% days. But when something goes wrong (for example, we detect a bug in the live system and we had to switch off the server for maintenance for few hours. In the mean time, the mobile apps have a queue of requests ready in their app. So when we make the backend live, a lot of users hit the system at the same time.), the server stops behaving normally and started responding with 504 gateway timeout error.
Surprisingly every time this happened, we found the server resources (CPU, Memory) to be free by 70-80% and the connection pool in the databases are mostly free.
Any idea where the problem is? How to debug? If you have already faced a similar issue, please share the fix.
Thank you,

Should I use MSMQ or IIS

I have a web site that exposes a web service to all my desktop clients.
Randomly, these clients will invoke the web service which in turn will add a message jpeg in byte array format to the MSMQ.
I have a service application that reads from this queue and performs an enhancement on this jpeg and saves it to the hard drive.
The number of clients uploading at anyone time is unpredictable.
I choose this method because I do not want to put any strain on IIS. The enhancements my service application performs is not much 'erg' but it exists nevertheless.
However, after realizing that my service application had stopped for sometime and required restarting I noticed the RAM leap up to clear the backlog. Whilst I have corrected this and the service is now coded to restart automatically on fail I surmised that a backlog could exists at busy times which again give a higher RAM usage.
Now, should I accept to do the processing all within my web service and then save to the hard drive or am I correct in using a MSMQ?
I am using C# and asp.net

Service blocks windows startup

We have automatically started service which in some cases spends a lot of the time loading necessary data, let's say 10 minutes. During this time it works as expected (processing some huge data files required to start). I report the progess by C++ SetServiceStatus function, it is working fine.
This service is not dependent on anything and has only one dependency which is again our own service. It is started after those 10 minutes, it needs the first "server" service to be fully running to accept the requests.
I thought that windows would start all other automatic services (in less then 10 minutes as usually) and then start working normally but system is completely blocked during startup (i can't login to computer or ping the computer) until this one specific service is started (reports SERVICE_RUNNING by SetServiceStatus). When out service completely starts, the other missing system services (required for network, remote desktop, whatever, it's quite random) are also started. Is this normal behaviour? Why are non-depending processes (as remote desktop, network connections, etc.) waiting for this process? Am I missing something?
I tried to add some dependencies to postpone the startup of my service but I ended up with many dependencies and behaviour still somehow random (as order of services is random). Sometimes I was able to login but for example Start button started working only after those 10 minutes when my service was started. I am not sure what is "the last service" to depend on and what services to include to my depend-list and on some computers this services can be disabled and it can bring new problems... so I don't like this solution very much.
Another option was Delayed start option for our service. This should start service when all other automatic services are running. Well, this works fine, windows boots, computer running and responding, our service is started, but the performance is very bad, many times slower than usually, it seems that delayed started services have much lower priority or something like that.
My only current solution is to report to system that my service is running (by SetServiceStatus function), but to continue loading (this works, I tested it). But then we have problem with our dependent service as it needs to be started when the first one is really ready. It can be solved but I still wonder how is this possible and if there is something I could use to keep the current state of automatic started service which reports "started" when it is really fully started and prepared to work. Thanks for any ideas.
Set SERVICE_RUNNING as soon as possible, and then continue processing in background. Make your other service resilient to the first service being in a running state, but not yet ready to service.
The longer the service is in the starting state the more problems we get from different windows versions.

cpu load and django application that makes long-response-time requests to external API

I'm developing a web application in python for which each user request makes an API call to an external service and takes about 20 seconds to receive response. As a result, in the event of several concurrent requests being made, the CPU load goes crazy (>95%) with several idle processes.
The server consists of a 1.6 GHz dual core Atom 330 with 2GB RAM.
The web app is developed in python which is served through Apache with mod_wsgi
My question is the following. Will a non-blocking webserver such as Tornado improve CPU load and thus handle more concurrent users (I'm also interested why) ? Can you suggest any other scalable solution ?
This really doesn't have anything to do with blocking; it does, but it doesn't. The 20 sec request is blocking one thread, so another has be utilized for the next request. Whereas with quick requests, the threads basically round-robin.
However, this really shouldn't be spiking your CPU output. Webservers have an upward limit of "workers" that get spawned and when they're all tied up, they're all tied up. It won't extend past the limit, so unless you've set or the default setting is higher than the box you have is capable of running, it shouldn't push your CPU that high.
Regardless, all that is merely informational, and doesn't really solve your problem. With such a long running request though, you should be offloading this from your webserver as quick as possible. The webserver should merely hand off the request to another process that can asyncronously handle it and then employ polling to notify the client when the response is ready. Node.js is used a lot in similar scenarios, but I really don't have enough experience with it to give you any real guidance beyond that.
You should look into using message queues to offload tasks so that your user requests are not blocked.
You could look into python libs kombu and celery to handle messages and tasks.
You are likely using prefork MPM with Apache and mod_wsgi embedded mode. This is a bad combination by default because Apache is setup for PHP and not fat Python web applications. Read:
http://blog.dscpl.com.au/2009/03/load-spikes-and-excessive-memory-usage.html
which explains this exact sort of issue.
Use mod_wsgi daemon mode at the minimum, and preferably also change to worker MPM for Apache.

First call to web service each day is slow

While building this web service and the app that calls it, we have noticed that the first call to the web service each day is extremely slow. It even will time out on some days. However, every call after that work great. Can anybody shed light on why this might be and how we can get rid of this pain?
Thanks in advance!
If it's an ASP.NET web service, it may be the CLR initializing and loading and verifying the assemblies for the first time. You may want to consider pre-compilation
Agree with the other answers on caching, initialization, etc. As far as a workaround, one possibility may be to set up some sort of daily task (SQL Server job, Windows service, something else?) to simulate a hit to the service each day, so that your users don't experience this first slow request.
If it is an ASP.NET web service, then you might want to check the settings of the application pool the web service is running in, especially the idle timeout which defaults to 20 minutes in IIS7.
Configuring IIS7 idle-timeout
Even if it is not an ASP.NET web service, other web servers will have equivalent configuration settings you have to tweak to keep your web service alive overnight.
Can you duplicate the same behavior on your database? It could just be the db needing to optimise the query for the first run (Maybe the parameter is today's date?).
Are there a lot of static constructors or set up code in the Global.asax class? Because IIS recycles worker processes periodically, the start up code may be running again.
The rule for optimization is: don't guess. Put in profiling to find out exactly what is slow, and then work to make that faster. Everything already posted provides excellent tips on where to start looking for slowness.