Limit number of CFHTTP requests sent every x seconds - coldfusion

I'm making an application that will continually send CFHTTP requests to a server to search for items, as well as sending further CFHTTP requests to perform actions on any returned results.
The issue I'm having is that the server has a maximum threshold of 3 requests per second. Even when I try to implement a sleep call every 4 milliseconds, it doesn't work properly: although it delays, the CFHTTP requests can queue up if they take a couple of seconds to return, so several of them are then sent in the same second and the threshold is exceeded.
Is there a way I can ensure that there are never more than 3 active CFHTTP requests?

I think you are going to need to implement some sort of logging widget as part of your process. The log will keep track of request frequency. If the threshold has already been reached, then you would just skip over that iteration of your CFHTTP call. I don't mean a file log or a database log, but something kept in the application or even request scope, depending on your implementation. There is no way to throttle CFHTTP itself. It is basically a very simplistic wrapper around a Java HTTP library, which then goes straight to the underlying operating system.

If you're limiting concurrent requests, then the first part of this answer applies. If you're looking to limit the number of requests per second, then the bit at the end applies. The question kind of asks both things.
If I understand correctly, you've got a number of threads (either as requests CF is processing or threads CF has created itself) which all need to make calls to the same rate-limited domain. What you need is a central way of co-ordinating access, combined with a nice way of controlling program execution.
I don't know of any native limits that CF might support (I'd be happy to be proven wrong), so you're likely to have to implement your own. The cheap'n'nasty way to do this is to increment and decrement an allowed_connections variable in a long-lived scope such as application. The downsides are that you have to implement checking all over the place and that, if there are no spare connections, you'll have to wait somehow.
Really what you have is a resource pool (of allowed HTTP connections) and I'm guessing that you want your code to wait until a connection is free. CF does this kind of thing already for database connections.
In your case, there isn't really a need to keep anything in a pool (as HTTP connections aren't long-lived), other than a permit to use the resource. Java provides a class which ought to provide what you're after, the Semaphore.
I've not tried it but in theory, something like the snippet below ought to work:
// Application.cfc: onApplicationStart()
application.http_pool = CreateObject("java", "java.util.concurrent.Semaphore").init(3);
// Meanwhile, elsewhere in your code
application.http_pool.acquire();
try {
    // Make my HTTP call
} finally {
    // Always hand the permit back, even if the HTTP call throws
    application.http_pool.release();
}
You could even wrap the HTTP object to provide this functionality without having to use the acquire/release each time, which would make it more reliable.
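To illustrate the wrapping idea, here is a minimal sketch in Python rather than CFML (the Java Semaphore works the same way in principle). The names http_permits, http_permit and run_limited are made up for this example, and the limit of 3 simply mirrors the snippet above:

```python
import threading
from contextlib import contextmanager

# Hypothetical shared pool of permits: at most 3 holders at any moment.
http_permits = threading.BoundedSemaphore(3)

@contextmanager
def http_permit():
    """Block until a permit is free and always give it back afterwards."""
    http_permits.acquire()
    try:
        yield
    finally:
        http_permits.release()

def run_limited(task, *args):
    """Run task(*args) while holding one permit; task stands in for the HTTP call."""
    with http_permit():
        return task(*args)
```

Callers then go through run_limited (or the with block) and can no longer forget the release. Note that this only caps how many calls are in flight at once, not how many start per second.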
EDIT
If you're looking to limit rates, look at Guava's RateLimiter, which has the same general interface as Semaphore above but implements rate limiting for you. You'd need to add Guava to ColdFusion's classpath, use JavaLoader, or use CF10, which has classloading facilities built in.
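If pulling in a library isn't an option, the rate-limiting idea can be hand-rolled. Below is a rough sketch in Python of a sliding-window limiter; this is not Guava's algorithm, just a simple stand-in, and the class name and parameters are invented for the example. It assumes the server counts request starts per second:

```python
import threading
import time
from collections import deque

class SlidingWindowLimiter:
    """Allow at most max_calls starts within any window of period seconds."""

    def __init__(self, max_calls, period, clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self._clock = clock          # injectable so the limiter can be tested
        self._sleep = sleep
        self._starts = deque()       # start times of the most recent calls
        self._lock = threading.Lock()

    def acquire(self):
        """Block until a call may start, then record its start time."""
        while True:
            with self._lock:
                now = self._clock()
                # Forget starts that have fallen out of the window.
                while self._starts and now - self._starts[0] >= self.period:
                    self._starts.popleft()
                if len(self._starts) < self.max_calls:
                    self._starts.append(now)
                    return
                # Time until the oldest recorded start leaves the window.
                wait = self.period - (now - self._starts[0])
            self._sleep(wait)
```

You would create one shared SlidingWindowLimiter(3, 1.0) and call acquire() immediately before each HTTP request. Slow responses no longer matter, because only the start times are counted.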

Related

What is optimal value for Phusion passenger PassengerMaxRequestQueueSize

I know this depends on the box hardware, but for example, if 100 processes are configured, the default queue size is also 100. Does it make sense to increase PassengerMaxRequestQueueSize to 200 or 300? Probably this depends on free memory. Thoughts?
The best answer would explain the setting and give one or two examples, assuming the server takes 2-3 seconds to process each request.
Thanks in advance!
Why you should limit queuing
Any requests that aren't immediately handled by an application process are queued. Queuing is usually bad: it often means that your server cannot handle the requests quickly enough.
A larger queue means that requests are less likely to be dropped. But this comes with a drawback: during busy times, the larger the queue, the longer your visitors have to wait before they see a response. This causes them to click reload, making the queue even longer (their previous request will stay in the queue; the OS does not know that they've disconnected until it tries to send data back to the visitor), or causes them to leave in frustration.
So having a limit on the queue is a good thing. It limits the impact of the above situation.
You should ensure that requests are queued as little as possible. That could mean:
Making your app faster (if your workload is CPU bound).
Upgrading to faster hardware (if your workload is CPU bound).
Increasing your app's concurrency settings (if your workload is I/O bound), e.g. by increasing the number of processes or threads.
If you cannot prevent requests from being queued, then the next best thing to do is to keep the queue short, and to display a friendly error message upon reaching the queue limit. Something like, "We're sorry, a lot of people are visiting us right now. Please try again later." The documentation for PassengerMaxRequestQueueSize tells you how to do that.
Optimal value for the queue size
It's hard to say what the optimal queue size should be. A good rule of thumb is: set the request queue size to the maximum number of requests you can handle in one second. Depending on your situation you may have to tweak things a little bit.
This rule of thumb comes from the notion of expected burst traffic. How many simultaneous requests do you expect on your server?
Suppose that your queue size is 100, and that for whatever reason you receive 150 requests at the same time. Suppose that your server is fast enough to handle 150 requests in half a second, so you know it's not a performance problem. But if you have a request queue size of 100, then 50 of those requests will be dropped with a "Request queue full" error.
In such a situation, you should set the queue size to the maximum number of concurrent requests that you think you can safely handle without performance issues.
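Applied to the numbers in the question (100 processes, 2-3 seconds per request), the rule of thumb works out roughly as follows. This is only a back-of-the-envelope sketch in Python; it assumes each process handles one request at a time, and the function names are made up:

```python
def requests_per_second(processes, seconds_per_request):
    """Throughput if every process handles one request at a time."""
    return processes / seconds_per_request

def suggested_queue_size(processes, seconds_per_request):
    """Rule of thumb: queue about one second's worth of requests."""
    return round(requests_per_second(processes, seconds_per_request))

if __name__ == "__main__":
    for seconds in (2, 3):
        print(seconds, suggested_queue_size(100, seconds))
```

So by this rule the queue for such a server would be somewhere around 33 to 50 entries, which suggests raising it to 200 or 300 would mostly lengthen waits rather than help.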
This SO question and the Passenger docs here talk more about working with this. If you want more information about why this is happening on your server you can try running passenger-status (usually you need to run this as root).
If you would like visitors to see a custom error page when they hit this limit, you can use the following (in Apache):
PassengerErrorOverride on
ErrorDocument 503 /error503.html
As mentioned by Hongli you can also change the setting PassengerMaxRequestQueueSize to a higher number to queue more requests. You can also set this to 0 and disable it (for most situations this is not an optimal solution however).
For reference, the default error message a visitor to your site will see when bumping against this limit is:
This website is under heavy load
We're sorry, too many people are accessing this website at the same time. We're working on this problem. Please try again later.

How do I detect an aborted connection in Django?

I have a Django view that does some pretty heavy processing and takes around 20-30 seconds to return a result.
Sometimes the user will end up closing the browser window (terminating the connection) before the request completes -- in that case, I'd like to be able to detect this and stop working. The work I do is read-only on the database so there isn't any issue with transactions.
In PHP the connection_aborted function does exactly this. Is this functionality available in Django?
Here's example code I'd like to write:
def myview(request):
    while not connection_aborted():
        # do another bit of work...
        if work_complete:
            return HttpResponse('results go here')
Thanks.
I don't think Django provides it because it basically can't. More than Django itself, this depends on the way Django interfaces with your web server. All this depends on your software stack (which you have not specified). I don't think it's even part of the FastCGI and WSGI protocols!
Edit: I'm also pretty sure that Django does not start sending any data to the client until your view finishes execution, so it can't possibly know if the connection is dead. The underlying socket won't trigger an error unless the server tries to send some data back to the user.
That connection_aborted method in PHP doesn't do what you think it does. It will tell you if the client disconnected, but only if the buffer has been flushed, i.e. some sort of response has been sent from the server back to the client. The PHP version wouldn't even work as you've written it above. You'd have to add a call to something like flush within your loop to have the server attempt to send data.
HTTP is a stateless protocol. It's designed so that neither the client nor the server depends on the other. As a result, the state of either is only known when a connection is created, and that only occurs when there's some data to send one way or the other.
Your best bet is to do as @MattH suggested and do this through a bit of AJAX, and if you'd like you can integrate something like Node.js to make client "check-ins" during processing. How to set that up properly is beyond my area of expertise, though.
So you have an AJAX view that runs a query that takes 20-30 seconds to process requested in the background of a rendered page and you're concerned about wasted resources for when someone cancels the page load.
I see that you've got options in three broad categories:
Live with it. Improve the situation by caching the results in case the user comes back.
Make it faster. Throw more space at a time/space trade-off. Maintain intermediate tables. Precalculate the entire thing, etc.
Do something clever with the browser fast-polling a "is it ready yet?" query and the server cancelling the query if it doesn't receive a nag within interval * 2 or similar. If you're really clever, you could return progress / ETA to the nags. However, this might not have particularly useful behaviour when the system is under load or your site is being accessed over limited bandwidth.
I don't think you should go for option 3 because it's increasing complexity and resource usage for not much gain.
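For what it's worth, the server side of option 3 could be sketched roughly like this. It is plain Python with no Django wiring; NagWatchdog and run_job are invented names, and it assumes the work can be split into small steps so there is somewhere to check between them:

```python
import time

class NagWatchdog:
    """Remembers when the client last asked "is it ready yet?" for one job."""

    def __init__(self, interval, clock=time.monotonic):
        self.interval = interval     # how often the browser is expected to poll
        self._clock = clock
        self._last_nag = clock()

    def nag(self):
        """Call this from the polling view each time the client checks in."""
        self._last_nag = self._clock()

    def abandoned(self):
        """True once no nag has arrived within interval * 2."""
        return self._clock() - self._last_nag > 2 * self.interval

def run_job(steps, watchdog):
    """Run the work step by step, giving up if the client has gone quiet."""
    results = []
    for step in steps:
        if watchdog.abandoned():
            return None
        results.append(step())
    return results
```

The job has to run outside the request that started it (a worker thread or task queue), with the polling view calling nag() on the shared watchdog, which is exactly the extra complexity mentioned above.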

XMLRPCPP asynchronously handling multiple calls?

I have a remote server which handles various different commands, one of which is an event fetching method.
The event fetch returns right away if there is 1 or more events listed in the queue ready for processing. If the event queue is empty, this method does not return until a timeout of a few seconds. This way I don't run into any HTTP/socket timeouts. The moment an event becomes available, the method returns right away. This way the client only ever makes connections to the server, and the server does not have to make any connections to the client.
This event mechanism works nicely. I'm using the boost library to handle queues, event notifications, etc.
Here's the problem. While the server is holding back on returning from the event fetch method, during that time, I can't issue any other commands.
In the source code, XmlRpcDispatch.cpp, I'm seeing in the "work" method, a simple loop that uses a blocking call to "select".
Seems like while the handling of a method is busy, no other requests are processed.
Question: am I not seeing something and can XmlRpcpp (xmlrpc++) handle multiple requests asynchronously? Does anyone know of a better xmlrpc library for C++? I don't suppose the Boost library has a component that lets me issue remote commands?
I actually don't care about the XML or over-HTTP feature. I simply need to issue (asynchronous) commands over TCP in any shape or form.
I look forward to any input anyone might offer.
I had some problems with XMLRPC also, and investigated many solutions like GSoap and XMLRPC++, but in the end I gave up and wrote the whole HTTP+XMLRPC from scratch using Boost.ASIO and TinyXML++ (later I swapped TinyXML for expat). It wasn't really that much work; I did it myself in about a week, starting from scratch and ending up with many RPC calls fully implemented.
Boost.ASIO gave great results. It is, as its name says, totally async, and with excellent performance with little overhead, which to me was very important because it was running in an embedded environment (MIPS).
Later, and this might be your case, I changed XML to Google's Protocol Buffers, and was even happier. Its API, as well as its message containers, are all type safe (i.e. you send an int and a float, and they never get converted to string and back, as is the case with XML), and once you get the hang of it, which doesn't take very long, it's a very productive solution.
My recommendation: if you can ditch XML, go with Boost.ASIO + Protobuf.
If you need XML: Boost.ASIO + Expat.
Doing this stuff from scratch is really worth it.
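To show why an async design removes the blocking problem from the question, here is a small sketch of the same idea using Python's asyncio instead of Boost.ASIO (so the code is only an analogy, not something you can drop into a C++ server). The fetch_events, ping and demo names are invented for the illustration:

```python
import asyncio

async def fetch_events(events, timeout):
    """Long poll: return the next event, or None once timeout seconds pass."""
    try:
        return await asyncio.wait_for(events.get(), timeout)
    except asyncio.TimeoutError:
        return None

async def ping():
    """Stands in for any other quick command."""
    return "pong"

async def demo():
    events = asyncio.Queue()
    # Start the event fetch; it parks itself waiting on the empty queue.
    pending = asyncio.create_task(fetch_events(events, timeout=0.5))
    await asyncio.sleep(0)                 # let the fetch begin waiting
    reply = await ping()                   # served while the fetch is parked
    still_waiting = not pending.done()
    await events.put("door-opened")        # an event arrives, fetch returns
    return reply, still_waiting, await pending

if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The waiting fetch costs nothing while it is parked, so other commands are answered in the meantime. That is the behaviour the select loop in XmlRpcDispatch.cpp doesn't give you while a handler is busy.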

HttpSendRequest blocking when more than two downloads are already in progress

In our program, a new thread is created each time an HTTP request needs to be made, and there can be several running simultaneously. The problem I am having is that if I've got two threads already running, where they are looping on reading from InternetReadFile() after having called HttpSendRequest(), any subsequent attempts to call HttpSendRequest() just hang on that call, so I end up with the previously mentioned two threads continuing to read from their connections just fine, but the third just blocks on HttpSendRequest() until it times out.
From what I've been able to find on my own, this seems like it could just be the way wininet works, as the HTTP spec recommends: "A single-user client SHOULD NOT maintain more than 2 connections with any server or proxy."
I've seen various programs handle multiple simultaneous downloads to the same server, but I'd imagine they need to do a lot of extra work to do that, in terms of managing the various connections, or writing their own http interface.
If it would require a lot of extra complexity to set it up to handle more than two active sessions, then I would just change things to only handle one or two files at a time, leaving the rest queued. However, if there were some low-complexity way to allow more than two at a time (off the top of my head, I'd guess using a new process per download might work, but would be messier), that would be preferable; it's not like it would be downloading more than 3-5 simultaneously anyway, and each download is at the user's request. I read some mentions of registry hacks to change the limit, but that's definitely not something I'd do. Any ideas?
The HTTP 1.1 standard recommends a maximum of 2 simultaneous connections per server (it is a SHOULD NOT, as quoted in the question, rather than a hard requirement). If you have IE5, IE6, or IE7 installed, the versions of WinInet they install allow you to use InternetSetOption() to increase the limit (look at the INTERNET_OPTION_MAX_CONNS_PER_SERVER and INTERNET_OPTION_MAX_CONNS_PER_1_0_SERVER options). However, the version of WinInet that is installed with IE8 apparently disables that functionality (see http://connect.microsoft.com/WNDP/feedback/ViewFeedback.aspx?FeedbackID=434396 and http://connect.microsoft.com/WNDP/feedback/ViewFeedback.aspx?FeedbackID=481485).
If you call InternetOpen() multiple times, you should be able to simultaneously download two files on each HINTERNET returned by InternetOpen().

Approach for REST request with long execution time?

We are building a REST service that will take about 5 minutes to execute. It will be only called a few times a day by an internal app. Is there an issue using a REST (ie: HTTP) request that takes 5 minutes to complete?
Do we have to worry about timeouts? Should we be starting the request in a separate thread on the server and have the client poll for the status?
This is one approach.
Create a new request to perform ProcessXYZ
POST /ProcessXYZRequests
201-Created
Location: /ProcessXYZRequest/987
If you want to see the current status of the request:
GET /ProcessXYZRequest/987
<ProcessXYZRequest Id="987">
    <Status>In progress</Status>
    <Cancel method="DELETE" href="/ProcessXYZRequest/987"/>
</ProcessXYZRequest>
When the request is finished, you would see something like:
GET /ProcessXYZRequest/987
<ProcessXYZRequest>
    <Status>Completed</Status>
    <Results href="/ProcessXYZRequest/Results"/>
</ProcessXYZRequest>
Using this approach you can easily imagine what the following requests would give
GET /ProcessXYZRequests/Pending
GET /ProcessXYZRequests/Completed
GET /ProcessXYZRequests/Failed
GET /ProcessXYZRequests/Today
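A minimal sketch of the bookkeeping behind those endpoints, in Python and without any web framework, might look like the following. JobRegistry and its method names are hypothetical; only the URLs and status strings come from the example above:

```python
import itertools

class JobRegistry:
    """In-memory stand-in for the server side of the exchange above."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._jobs = {}

    def create(self):
        """POST /ProcessXYZRequests: register a job, answer 201 with its URL."""
        job_id = next(self._ids)
        self._jobs[job_id] = {"status": "In progress", "results": None}
        return 201, {"Location": f"/ProcessXYZRequest/{job_id}"}

    def finish(self, job_id, results):
        """Called by the worker once the long-running process is done."""
        self._jobs[job_id] = {"status": "Completed", "results": results}

    def get(self, job_id):
        """GET /ProcessXYZRequest/{id}: 200 with the job, or 404 if unknown."""
        job = self._jobs.get(job_id)
        if job is None:
            return 404, None
        return 200, dict(job)

    def with_status(self, status):
        """Collection views such as .../Completed: ids of jobs in that state."""
        return [jid for jid, job in self._jobs.items() if job["status"] == status]
```

The actual five-minute process would run in a background worker that calls finish() when it is done; the HTTP handlers only ever touch the registry, so every request returns quickly.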
Assuming that you can configure HTTP timeouts using whatever framework you choose, then you could request via a GET and just hang for 5 mins.
However, it may be more flexible to initiate an execution via a POST, get a receipt (a number/id, whatever), and then perform a GET using that 5 mins later (and perhaps retry, given that your procedure won't take exactly 5 mins every time). If the request is still ongoing, then return an appropriate HTTP error code (404 perhaps, but what would you return for a GET with a non-existent receipt?), or return the results if available.
As Brian Agnew points out, 5 minutes is entirely manageable, if somewhat wasteful of resources, if one can control timeout settings. Otherwise, at least two requests must be made: The first to get the result-producing process rolling, and the second (and third, fourth, etc., if the result takes longer than expected to compile) to poll for the result.
Brian Agnew and Darrel Miller both suggest similar approaches for the two(+)-step approach: POST a request to a factory endpoint, starting a job on the server, and later GET the result from the returned result endpoint.
While the above is a very common solution, and indeed adheres to the letter of the REST constraints, it smells very much of RPC. That is, rather than saying, "provide me a representation of this resource", it says "run this job" (RPC) and then "provide me a representation of the resource that is the result of running the job" (REST). EDIT: I'm speaking very loosely here. To be clear, none of this explicitly defies the REST constraints, but it does very much resemble dressing up a non-RESTful approach in REST's clothing, losing out on its benefits (e.g. caching, idempotency) in the process.
As such, I would rather suggest that when the client first attempts to GET the resource, the server should respond with 202 "Accepted" (http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.3), perhaps with "try back in 5 minutes" somewhere in the response entity. Thereafter, the client can poll the same endpoint to GET the result, if available (otherwise return another 202, and try again later).
Some additional benefits of this approach are that single-use resources (such as jobs) are not unnecessarily created, two separate endpoints need not be queried (factory and result), and likewise the second endpoint need not be determined from parsing the response from the first, thus simpler. Moreover, results can be cached, "for free" (code-wise). Set the cache expiration time in the result header according to how long the results are "valid", in some sense, for your problem domain.
I wish I could call this a textbook example of a "resource-oriented" approach, but, perhaps ironically, Chapter 8 of "RESTful Web Services" suggests the two-endpoint, factory approach. Go figure.
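On the client side, the 202 approach boils down to a small polling loop. Here is a sketch in Python; fetch is a placeholder for whatever performs the GET and returns a (status, body) pair, and the delay and attempt limit are arbitrary assumptions:

```python
import time

def poll_until_ready(fetch, delay, max_attempts, sleep=time.sleep):
    """Call fetch() until it stops answering 202; return the last (status, body)."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = fetch()
        if status != 202:
            return status, body          # result (or an error) is available
        if attempt < max_attempts - 1:
            sleep(delay)                 # "try back later", then ask again
    return status, body                  # still 202 after the final attempt
```

If the last answer is still 202, the caller gets that back and can decide whether to keep waiting or give up.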
If you control both ends, then you can do whatever you want. E.g. browsers tend to launch HTTP requests with "connection close" headers so you are left with fewer options ;-)
Bear in mind that if you've got some NAT/firewalls in between, you might have some connections dropped if they are inactive for some time.
Could I suggest registering a "callback" procedure? The client issues the request with a "callback end-point" to the server and gets a "ticket". Once the server finishes, it calls the client back... or the client can check the request's status through the ticket identifier.