My ColdFusion services are taking many minutes to start. I suspect that the number of datasources may play a part: it seems that ColdFusion attempts to establish a connection with each datasource on start. Is this true, and could it be the source of my slow startup times?
Related
I have a classic web service that is hosted on IIS 7.5 (Windows Server 2008 R2).
After application pool recycles (default 20 minutes idle state), the first request to the web service takes about 5 minutes. When it gets through, every other request to the service takes no time at all.
I read about turning on AlwaysRunning for IIS 7.5 in applicationHost.config. However, I would appreciate it if anybody could explain why this happens and where to look for the cause of the problem.
Thank you in advance.
I avoid a cold start by having a heartbeat execute prior to the app pool recycle interval. However, you still need to let the app pools recycle at some pre-determined interval. See this post on cold starts. Generally, the more dependencies your app consumes and the larger your code base is then the longer it will take to "wake up" on a cold start. The delay is not really noticeable for smaller apps.
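A heartbeat like the one described above could be sketched as follows. This is only a minimal illustration in Python, not the script I actually run: the function names, the injected `ping_fn`/`sleep_fn` parameters, and any URL you pass in are assumptions made for the example.

```python
import time
import urllib.error
import urllib.request


def ping(url, timeout=30.0):
    """Return True if the URL answers with an HTTP status below 400."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status < 400
    except (urllib.error.URLError, OSError):
        # Covers HTTP errors, refused connections, and timeouts.
        return False


def heartbeat(url, interval_seconds, beats, ping_fn=ping, sleep_fn=time.sleep):
    """Ping `url` `beats` times, sleeping `interval_seconds` between pings.

    Returns the list of ping results so the caller can log failures.
    """
    results = []
    for i in range(beats):
        results.append(ping_fn(url))
        if i < beats - 1:
            sleep_fn(interval_seconds)
    return results
```

For a 20 minute idle timeout you would pick an interval comfortably below that, for example `heartbeat(service_url, 15 * 60, beats=4)`, and run it from a scheduled task. The first request timeout is deliberately generous because the first hit after a recycle is the slow one.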
UPDATE
It appears this issue is caused by a bug related specifically to using Axis2 with ColdFusion and we have been able
to replicate the issue in our production environment on two different servers by
switching between Axis1 and Axis2. My original tests to compare the
two were apparently thwarted by an override in an Application.cfc
which forced Axis2.
We ran into a memory leak today which forced us to speed up the resolution to this issue. It resembled the leak
discussed here though we aren't sure if it is the exact same
problem
(https://www.hass.de/content/coldfusion-10-webservice-leaking-memory-trusted-cache-leaks-memory).
Our primary webservices are in Axis1 and we only switched to Axis2 for
this new set of webservices because we needed document literal style
for SalesForce and with Axis1 an invalid wsdl was being created (did
not properly describe all object types in arrays). So now we have it as
Axis1 and using a manually manipulated wsdl. Not entirely sure if it
will work out with SalesForce but as far as a general fix this works.
I am investigating an issue with our coldfusion based soap webservices in our production environment. It appears that the time between the return statement in the webservices method code and actually receiving a response can be significant and appears to directly correspond to the size of the response and/or number of objects.
In development a particular request that returns 1000 records takes about 6 seconds to return. However in production that same hit takes 50+ seconds to return. I added some timing code and found that the actual function code takes less than 1 second to run at the start of the request, meaning that generating the response is taking coldfusion about 50 seconds in production. Hitting the webservice with simple http request does not have the same slowness so seems to be soap/axis specific. The resulting xml is about 1MB which I have compared and found no differences. I also copied out settings from cfadmin in both environments to compare and could find no performance related setting differences.
Both environments are at the same CF 10 update level. The server monitor shows no significant memory usage. I also ran the request from in the server to make sure there wasn't some slow connection issues or https slowing things down but the results are the same.
Any suggestions or solution would be appreciated.
Additional notes...
CPU sits at about 17% for most of the time of the request, which is a lot of work to be doing. Something is happening very inefficiently.
I tried switching instance to Axis1 and back again followed by an instance restart and additional tests with no change in results
One possibility is that you have them throttled - check the "request tuning" in your CF administrator. By default the setting for "number of simultaneous web service requests" is 10. Are you looping and hitting the server? In production is there more traffic?
In server monitor enable profiling and monitoring, then click on "statistics". On the far right there is a little chart icon. click on it and you will see a chart and a counter legend in the top right. Then run your code. Does the "web services running" reach a threshold and cross into "web services queued" - if so you need to increase that threshold.
One more clue - in the server monitor do NOT run the "memory profiling" for more than a few seconds - say 30. If you do, you will have performance problems for sure.
Is it possible that a database (connected to ColdFusion 9 via a datasource connection) being unavailable could cause ColdFusion to become unresponsive? (The database is used for a singular one-off lightly-trafficked app.)
Recently, maintenance on a connected Oracle database (oracle jdbc) has caused that database to be unavailable two different times. Coincidentally, at both these times, ColdFusion pages on our site became unavailable or terribly slow to load (static HTML pages seemed to load fine, for the most part). Restarting the ColdFusion application server service would fix the problem, but only for minutes. The first time, during a time the application server was responsive, we unchecked the "Maintain connections" checkbox. I'm not sure this had any effect, then shortly after the Oracle database came back online, and we didn't seem to have the problem any more.
The second time that database was offline, we experienced a very similar issue with our website - ColdFusion pages becoming reaaaally slow or unavailable altogether. During one of the times when I could access the CF administrator, I updated the datasource and checked "Disable connections". Then I stopped and restarted both the CF ODBC agent and ODBC server services. After that, the problem seemed to stop, but I don't know enough to know if this is causation or coincidence.
Anyone have insights on this?
Server setup: Windows Server 2003 SP2, ColdFusion 9, IIS 6
There are a number of ways to slow a database to a crawl if not stop it completely. If you have hackers, for example, attacking your database through Port 1433 with attempted logins several times a second, that can slow it down, and if they get in they can of course do whatever they want. When this happened to me I found a record of attacks in the Event logs; the solution is better network security intercepting such attacks and never letting them actually talk to the database. Or say your site is vulnerable to SQL injection attacks: hackers could be messing with your database that way too, but network security wouldn't necessarily work in that case.
It doesn't require hackers to degrade the performance of your database, however. You could be having a problem with allocated disk space for transaction logs or indexes filling up, or heaven forbid an imminent hardware failure showing early symptoms. You're backing up your database often I hope, off the server.
To answer your question: yes, ColdFusion can and will become unresponsive when pages are called that call the database, and will usually display error messages when the database finally times out and never sends the requested data to ColdFusion. You can protect against that to some extent with CFTRY tags around your queries that display clean and polite error messages instead of ColdFusion's ugly ones if the database fails to return data; at least your site continues to look professional that way.
One project I worked on used a shared SQL Server database that often got overloaded and slowed down terribly, and there was nothing I could do about improving that situation. What I did to keep the site functioning was to maintain a DB backup in the form of a MS Access database (yeah it was inappropriate but it worked when SQL Server wouldn't), and anytime SQL Server failed I had the application set up to automatically use code that called the Access database instead.
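The try-then-fall-back idea above looks roughly like this. It is a language-neutral sketch written in Python rather than the CFML I actually used, and every name in it (`DataUnavailable`, `query_with_fallback`, the query callables) is hypothetical.

```python
FRIENDLY_MESSAGE = "The data is temporarily unavailable. Please try again later."


class DataUnavailable(Exception):
    """Raised when no data source could answer; carries a user-facing message."""


def query_with_fallback(sql, sources):
    """Try each query function in `sources` in order and return the first result.

    `sources` is a list of callables taking the SQL text, e.g. the primary
    database first and a backup copy second. If all of them fail, a clean
    error is raised instead of leaking the driver's own message.
    """
    for run_query in sources:
        try:
            return run_query(sql)
        except Exception:
            # This source is down or timed out; move on to the next one.
            continue
    raise DataUnavailable(FRIENDLY_MESSAGE)
```

The page then only has to catch `DataUnavailable` and show its message. Note that this hides the failure from visitors but does not stop requests from waiting on the dead database, so a short query timeout on the primary source still matters.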
These are some ideas for you to think about if you are continuing to have problems, I see nobody's even tried to answer your question in the last six months and that's kinda been my experience with the quality of assistance this site has offered me too. I hope my thoughts can be of some use to you.
We have an automatically started service which in some cases spends a lot of time loading necessary data, let's say 10 minutes. During this time it works as expected (processing some huge data files required to start). I report the progress with the C++ SetServiceStatus function, and that is working fine.
This service does not depend on anything, and only one service depends on it, which is again our own. That dependent service is started after those 10 minutes, because it needs the first "server" service to be fully running before it can accept requests.
I thought that Windows would start all other automatic services (in less than 10 minutes, as usual) and then start working normally, but the system is completely blocked during startup (I can't log in to the computer or ping it) until this one specific service is started (reports SERVICE_RUNNING via SetServiceStatus). When our service has completely started, the other missing system services (required for network, remote desktop, whatever, it's quite random) are also started. Is this normal behaviour? Why are non-dependent processes (such as remote desktop, network connections, etc.) waiting for this process? Am I missing something?
I tried to add some dependencies to postpone the startup of my service, but I ended up with many dependencies and behaviour that was still somewhat random (as the order of services is random). Sometimes I was able to log in, but for example the Start button started working only after those 10 minutes, when my service was started. I am not sure what "the last service" to depend on is and what services to include in my dependency list, and on some computers these services can be disabled, which can bring new problems... so I don't like this solution very much.
Another option was Delayed start option for our service. This should start service when all other automatic services are running. Well, this works fine, windows boots, computer running and responding, our service is started, but the performance is very bad, many times slower than usually, it seems that delayed started services have much lower priority or something like that.
My only current solution is to report to system that my service is running (by SetServiceStatus function), but to continue loading (this works, I tested it). But then we have problem with our dependent service as it needs to be started when the first one is really ready. It can be solved but I still wonder how is this possible and if there is something I could use to keep the current state of automatic started service which reports "started" when it is really fully started and prepared to work. Thanks for any ideas.
Set SERVICE_RUNNING as soon as possible, and then continue processing in background. Make your other service resilient to the first service being in a running state, but not yet ready to service.
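The shape of that split between "running" and "ready" can be sketched like this. It is an illustration in Python using threading events, not Windows service code: `Server`, `call_when_ready`, and the two flags are hypothetical names, with `running` standing in for the SERVICE_RUNNING report and `ready` for whatever readiness signal your two services agree on.

```python
import threading


class Server:
    """Reports 'running' immediately, then finishes loading in the background."""

    def __init__(self, load_fn):
        self._load_fn = load_fn
        self.running = threading.Event()  # stands in for SERVICE_RUNNING
        self.ready = threading.Event()    # set only once the data is loaded

    def start(self):
        # Tell the outside world we are running before the slow work begins.
        self.running.set()
        worker = threading.Thread(target=self._load, daemon=True)
        worker.start()
        return worker

    def _load(self):
        self._load_fn()
        self.ready.set()


def call_when_ready(server, request_fn, timeout):
    """Dependent-side helper: wait for readiness instead of assuming it."""
    if not server.ready.wait(timeout):
        raise TimeoutError("server is running but not ready yet")
    return request_fn()
```

In a real pair of services the readiness signal would have to cross process boundaries, for instance a named event or a status request that the dependent service retries until it succeeds.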
The longer the service is in the starting state the more problems we get from different windows versions.
We are trying to use monit to monitor services on our Ubuntu machine. I have successfully set up a host URL check to make sure that ColdFusion can render web pages and, if there is an error, to restart ColdFusion.
I was wondering if there is a way to get more stats into monit by monitoring the coldfusion process. I have been unable to find out if coldfusion creates a pid file.
Does Coldfusion 9 or Jrun create a pid file for monit to use? Is there another way to monitor coldfusion with monit?
ColdFusion can output real-time performance metrics such as:
Page hits per second
Database accesses per second
Number of queued requests
Number of running requests
Number of timed out requests
Average queue time
Average request time
Average database transaction time
Bytes incoming per second
Bytes outgoing per second
You can learn more about the output of this logging here: http://help.adobe.com/en_US/ColdFusion/9.0/Admin/WSc3ff6d0ea77859461172e0811cbf3638e6-7fe0.html#WS9F365555-357A-4a15-AC72-449EF611E342
I would be interested to learn how you set this up once complete. I'll have the same task in a few weeks.
Thanks!
You will need to create the PID file with a wrapper script around your Java application. I'm doing the same thing myself these days. To the best of my understanding monit has to have the PID file to check the life of your service.
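A wrapper along those lines might look like the sketch below. I've written it in Python for illustration (a shell script works just as well); the function names are made up, and the command and PID file path are whatever fits your install.

```python
import os
import subprocess


def start_with_pidfile(command, pidfile):
    """Start `command` and record the child's PID in `pidfile`."""
    proc = subprocess.Popen(command)
    with open(pidfile, "w") as fh:
        fh.write(f"{proc.pid}\n")
    return proc


def wait_and_cleanup(proc, pidfile):
    """Wait for the child to exit, then remove the PID file.

    Returns the child's exit code. Removing the file avoids leaving a
    stale PID behind for the monitor to find.
    """
    try:
        return proc.wait()
    finally:
        if os.path.exists(pidfile):
            os.remove(pidfile)
```

With the PID file in place, a monit `check process ... with pidfile ...` entry can point at it. One caveat: the PID recorded here is the process the wrapper launched, so if your start script forks the JVM and exits, you need to record the JVM's PID instead.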