HttpSendRequest blocking when more than two downloads are already in progress

In our program, a new thread is created each time an HTTP request needs to be made, and several can be running simultaneously. The problem is that once two threads have called HttpSendRequest() and are looping on InternetReadFile(), any subsequent call to HttpSendRequest() hangs. The first two threads continue reading from their connections just fine, but the third blocks in HttpSendRequest() until it times out.
From what I've been able to find on my own, this seems like it could just be the way WinInet works, as the HTTP spec recommends: "A single-user client SHOULD NOT maintain more than 2 connections with any server or proxy."
I've seen various programs handle multiple simultaneous downloads to the same server, but I'd imagine they need to do a lot of extra work to do that, in terms of managing the various connections, or writing their own http interface.
If it would require a lot of extra complexity to set it up to handle more than two active sessions, then I would just change things to only handle one or two files at a time, leaving the rest queued. However, if there were some low-complexity way to allow more than two at a time (off the top of my head, I'd guess using a new process per download might work, but would be messier), that would be preferable; it's not like it would be downloading more than 3-5 simultaneously anyway, and each download is at the user's request. I read some mentions of registry hacks to change the limit, but that's definitely not something I'd do. Any ideas?

The HTTP 1.1 standard recommends a maximum of 2 simultaneous connections per server (it is a SHOULD NOT rather than a hard requirement), and WinInet follows that recommendation. If you have IE5, IE6, or IE7 installed, the versions of WinInet they install allow you to use InternetSetOption() to increase the limit (look at the INTERNET_OPTION_MAX_CONNS_PER_SERVER and INTERNET_OPTION_MAX_CONNS_PER_1_0_SERVER options). However, the version of WinInet that is installed with IE8 apparently disables that functionality (see http://connect.microsoft.com/WNDP/feedback/ViewFeedback.aspx?FeedbackID=434396 and http://connect.microsoft.com/WNDP/feedback/ViewFeedback.aspx?FeedbackID=481485).

If you call InternetOpen() multiple times, you should be able to simultaneously download two files on each HINTERNET returned by InternetOpen().

Related

Limit number of CFHTTP requests sent every x seconds

I'm making an application that will continually send CFHTTP requests to a server to search for items, as well as sending further CFHTTP requests to perform actions on any returned results.
The issue I'm having is that the server has a maximum threshold of 3 requests per second. Even when I add a sleep call every 4 milliseconds it doesn't work properly: although it delays, the CFHTTP requests can queue up if they take a couple of seconds to return, so several are then sent in the same second and the threshold is exceeded.
Is there a way I can ensure that there are never more than 3 active CFHTTP requests?

I think you are going to need to implement some sort of logging widget as part of your process. The log will keep track of request frequency. If the threshold is not met, then you would just skip over that iteration of your CFHTTP call. I don't mean a file log or a database log, but something implemented in the application or even request scope depending on your implementation. There is no way to throttle CFHTTP itself. It is basically a very simplistic wrapper around a Java HTTP library which then goes straight to the underlying operating system.

If you're limiting concurrent requests, then the first part of this answer applies. If you're looking to limit the number of requests per second, then the bit at the end applies. The question kind of asks both things.
If I understand correctly, you've got a number of threads (either as requests CF is processing or threads CF has created itself) which all need to make calls to the same rate-limited domain. What you need is a central way of co-ordinating access, combined with a nice way of controlling program execution.
I don't know of any native limits that CF might support (I'd be happy to be proven wrong) so you're likely to have to implement your own. The cheap'n'nasty way to do this is to increment and decrement an allowed_connections variable in a long-lived scope such as application. The downsides are that you have to implement checking all over the place and that if there are no spare connections, you'll have to wait somehow.
Really what you have is a resource pool (of allowed HTTP connections) and I'm guessing that you want your code to wait until a connection is free. CF does this kind of thing already for database connections.
In your case, there isn't really a need to keep anything in a pool (as HTTP connections aren't long-lived), other than a permit to use the resource. Java provides a class which ought to provide what you're after, the Semaphore.
I've not tried it but in theory, something like the snippet below ought to work:
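// Application.cfc: onApplicationStart()
// JavaCast makes sure the Semaphore(int) constructor is the one selected
application.http_pool = CreateObject("java", "java.util.concurrent.Semaphore").init(JavaCast("int", 3));
// Meanwhile, elsewhere in your code
application.http_pool.acquire();
try {
    // Make my HTTP call
} finally {
    // Always hand the permit back, even if the HTTP call throws
    application.http_pool.release();
}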
You could even wrap the HTTP object to provide this functionality without having to use the acquire/release each time, which would make it more reliable.
EDIT
If you're looking to limit rates, look at Guava's RateLimiter, which has the same general interface as the Semaphore above but implements rate limiting for you. You'd need to add Guava to ColdFusion's classpath, or use JavaLoader, or use CF10, which has classloading facilities built in.
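The permit idea is not specific to ColdFusion. Purely as an illustration, here is the same pattern sketched in Python, with threading.BoundedSemaphore standing in for java.util.concurrent.Semaphore. The names MAX_CONCURRENT, limited_call, fake_http_call and run_demo are invented for this sketch, and the sleep is only a placeholder for the real HTTP request.

```python
import threading
import time

MAX_CONCURRENT = 3  # assumed limit, matching the 3 permits in the snippet above

# One shared pool of permits, analogous to application.http_pool.
http_pool = threading.BoundedSemaphore(MAX_CONCURRENT)

_lock = threading.Lock()
_active = 0       # calls currently holding a permit
peak_active = 0   # highest value _active has ever reached


def fake_http_call(duration=0.05):
    """Placeholder for the real HTTP request (hypothetical)."""
    time.sleep(duration)


def limited_call(func, *args, **kwargs):
    """Run func while holding a permit; the permit is always released."""
    global _active, peak_active
    with http_pool:  # blocks until a permit is free, releases on exit
        with _lock:
            _active += 1
            peak_active = max(peak_active, _active)
        try:
            return func(*args, **kwargs)
        finally:
            with _lock:
                _active -= 1


def run_demo(n_threads=10):
    """Start n_threads callers at once and return the peak concurrency seen."""
    threads = [threading.Thread(target=limited_call, args=(fake_http_call,))
               for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return peak_active
```

Wrapping the call in one helper is the "wrap the HTTP object" idea: callers cannot forget to release. Note that this caps concurrent requests only; it does not by itself cap requests per second.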

How to smooth restart a c++ program without shut down the running program?

I have a server program that should run around the clock. If I want to change some of its parameters, is there any way to do so other than shutting it down and restarting it?

There are quite a few ways of doing this, including, but almost certainly not limited to:
1. You can maintain the parameters in a separate file so that the program will periodically check that file and update its internal information.
2. Similar to (1) but you can send some sort of signal to the application to get it to immediately re-read the file.
3. You can do either (1) or (2) but using shared memory rather than a configuration file.
4. You can have your program sit at the server end of an IPC conversation, so that a client can open up a connection to it to provide new parameters. Anything from a simple message queue to a full-blown HTTP server and associated pages.
Of course, all of these tend to need a fair amount of work in your program to get it to look for the new information.
You should take that into account when making your decision. By far the quickest solution to implement is to just (cleanly) kill off the process at something like 11:55pm then immediately restart it. It's simpler because your code probably already has the ability to load the information on startup, so this could be a simple cron one-liner.
Some people speak of laziness as a bad thing but that's not always the case :-)
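For what it's worth, the first two options (periodic file check, and a signal that forces an immediate re-read) can be sketched roughly as follows. This is Python rather than C++ to keep it short, and it assumes a JSON parameter file and a POSIX system for SIGHUP; ReloadableConfig, poll and install_sighup are names made up for the sketch.

```python
import json
import os
import signal


class ReloadableConfig:
    """Holds parameters loaded from a JSON file and reloads them on demand."""

    def __init__(self, path):
        self.path = path
        self.values = {}
        self._mtime = None
        self._reload_requested = False
        self.reload()

    def reload(self):
        """Re-read the file unconditionally and remember its timestamp."""
        with open(self.path) as fh:
            self.values = json.load(fh)
        self._mtime = os.stat(self.path).st_mtime_ns
        self._reload_requested = False

    def request_reload(self, signum=None, frame=None):
        """Signal handler: only set a flag; the main loop does the real work."""
        self._reload_requested = True

    def poll(self):
        """Call once per main-loop iteration; returns True if values were reloaded."""
        changed = os.stat(self.path).st_mtime_ns != self._mtime
        if changed or self._reload_requested:
            self.reload()
            return True
        return False


def install_sighup(config):
    """Reload on the next poll() after the process receives SIGHUP (POSIX only)."""
    signal.signal(signal.SIGHUP, config.request_reload)
```

The server's main loop would call poll() every so often, and an operator could run kill -HUP on the process to force a re-read. Whoever edits the file should replace it atomically (write a temporary file, then rename it) so the server never reads a half-written file.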

If the server maintains many live connections from clients, restarting the server process is the last thing you should consider. Besides reloading configuration files, inserting a proxy process between the clients and the server is another option.
The proxy process is responsible for two things:
a. Maintaining the connections from clients and forwarding packets to the server for handling.
b. Judging whether the current server process (Server A) is alive and, if it is not, switching to another server (Server B) automatically.
Then you can change parameters by restarting a server without worrying about interrupting clients, since there are always two (or more) servers running.

Best approach for writing a Linux Server in C (pthreads, select or fork?)

I've got a very specific question about server programming in UNIX (Debian, kernel 2.6.32). My goal is to learn how to write a server which can handle a huge number of clients. My target is more than 30 000 concurrent clients (a colleague even mentions that 500 000 are possible, which seems QUIIITEEE a huge amount :-)), but I really don't know what's possible, and that is why I ask here. So my first question: how many simultaneous clients are possible? Clients can connect whenever they want, get in contact with other clients and form a group (1 group contains a maximum of 12 clients). They can chat with each other, so the TCP/IP packet size varies depending on the message sent.
Clients can also send mathematical formulas to the server. The server will solve them and broadcast the answer back to the group. This is a quite heavy operation.
My current approach is to start up the server, then use fork to create a daemon process. The daemon process binds the socket fd_listen and starts listening. It is a while (1) loop. I use accept() to get incoming connections.
Once a client connects I create a pthread for that client, which runs the communication. Clients get added to a group and share some memory (needed to keep the group running), but every client still runs on a different thread. Getting the access to the memory right was quite a hassle but it works fine now.
At the beginning of the program I read the /proc/sys/kernel/threads-max file and create my threads according to it. The number of possible threads according to that file is around 5000, far from the number of clients I want to be able to serve.
Another approach I am considering is to use select() and create sets. But the access time to find a socket within a set is O(N). This can take quite long if I have more than a couple of thousand clients connected. Please correct me if I am wrong.
Well, i guess i need some ideas :-)
Groetjes
Markus
P.S. i tag it for C++ and C because it applies to both languages.

The best approach as of today is an event loop like libev or libevent.
In most cases you will find that one thread is more than enough, but even if it isn't, you can always have multiple threads with separate loops (at least with libev).
Libev[ent] uses the most efficient polling solution for each OS (and anything is more efficient than select or a thread per socket).
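libev and libevent are C libraries, so the following is only meant to show the shape of such a loop. It is a Python sketch built on the standard selectors module, whose DefaultSelector uses epoll on Linux; with epoll the loop is handed only the sockets that are ready, so there is no O(N) scan over every client as with select(). EchoLoop and its methods are names invented here, and upper-casing the data is a placeholder for real message handling.

```python
import selectors
import socket


class EchoLoop:
    """Single-threaded event loop: one selector watches every client socket."""

    def __init__(self):
        # DefaultSelector picks the best mechanism available (epoll on Linux).
        self.sel = selectors.DefaultSelector()

    def add_listener(self, host="127.0.0.1", port=0):
        """Open a non-blocking listening socket, register it, return its address."""
        srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        srv.bind((host, port))
        srv.listen()
        srv.setblocking(False)
        self.sel.register(srv, selectors.EVENT_READ, self._accept)
        return srv.getsockname()

    def add_client(self, conn):
        """Register an already-connected socket for read events."""
        conn.setblocking(False)
        self.sel.register(conn, selectors.EVENT_READ, self._read)

    def _accept(self, srv):
        conn, _addr = srv.accept()
        self.add_client(conn)

    def _read(self, conn):
        data = conn.recv(4096)
        if data:
            # Placeholder handling. A real server would buffer unsent data
            # and wait for EVENT_WRITE instead of calling sendall directly.
            conn.sendall(data.upper())
        else:
            # An empty read means the peer closed the connection.
            self.sel.unregister(conn)
            conn.close()

    def run_once(self, timeout=1.0):
        """Handle every socket that is ready right now; return how many were."""
        events = self.sel.select(timeout)
        for key, _mask in events:
            key.data(key.fileobj)  # call the callback stored at registration
        return len(events)

    def run_forever(self):
        while True:
            self.run_once(timeout=None)
```

A server would call add_listener() once and then run_forever(). Heavy work, such as solving the formulas mentioned in the question, would have to be handed to separate worker threads so that it does not stall the loop.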

You'll run into a couple of limits:
fd_set size: This is changeable at compile time, but has quite a low limit by default; this affects select-based solutions.
Thread-per-socket will run out of steam far earlier - I suggest putting the long calculations in separate threads (with pooling if required), but otherwise a single thread approach will probably scale.
To reach 500,000 you'll need a set of machines, and round-robin DNS I suspect.
TCP ports shouldn't be a problem, as long as the server doesn't connect back to the clients. I always seem to forget this, and have to be reminded.
File descriptors themselves shouldn't be too much of a problem, I think, but getting them into your polling solution may be more difficult - certainly you don't want to be passing them in each time.

I think you can use the event model (epoll + a worker thread pool) to solve this problem.
First, listen and accept in the main thread. When a client connects to the server, the main thread hands the client_fd to one worker thread and adds it to the epoll list; that worker thread then handles the requests from the client.
The number of worker threads can be configured to suit the problem, and it must be no more than 5000.

How do I detect an aborted connection in Django?

I have a Django view that does some pretty heavy processing and takes around 20-30 seconds to return a result.
Sometimes the user will end up closing the browser window (terminating the connection) before the request completes -- in that case, I'd like to be able to detect this and stop working. The work I do is read-only on the database so there isn't any issue with transactions.
In PHP the connection_aborted function does exactly this. Is this functionality available in Django?
Here's example code I'd like to write:
def myview(request):
    while not connection_aborted():
        # do another bit of work...
        if work_complete:
            return HttpResponse('results go here')
Thanks.

I don't think Django provides it because it basically can't. More than Django itself, this depends on the way Django interfaces with your web server. All this depends on your software stack (which you have not specified). I don't think it's even part of the FastCGI and WSGI protocols!
Edit: I'm also pretty sure that Django does not start sending any data to the client until your view finishes execution, so it can't possibly know if the connection is dead. The underlying socket won't trigger an error unless the server tries to send some data back to the user.

That connection_aborted method in PHP doesn't do what you think it does. It will tell you if the client disconnected, but only if the buffer has been flushed, i.e. some sort of response has been sent from the server back to the client. The PHP version wouldn't even work as you've written it above. You'd have to add a call to something like flush within your loop to have the server attempt to send data.
HTTP is a stateless protocol. It's designed so that neither the client nor the server depends on the other. As a result, the state of either is only known when a connection is created, and that only occurs when there's some data to send one way or another.
Your best bet is to do as @MattH suggested and do this through a bit of AJAX, and if you'd like you can integrate something like Node.js to make client "check-ins" during processing. How to set that up properly is beyond my area of expertise, though.

So you have an AJAX view, requested in the background of a rendered page, that runs a query taking 20-30 seconds to process, and you're concerned about wasted resources when someone cancels the page load.
I see that you've got options in three broad categories:
1. Live with it. Improve the situation by caching the results in case the user comes back.
2. Make it faster. Throw more space at a time/space trade-off. Maintain intermediate tables. Precalculate the entire thing, etc.
3. Do something clever with the browser fast-polling a "is it ready yet?" query and the server cancelling the query if it doesn't receive a nag within interval * 2 or similar. If you're really clever, you could return progress / ETA to the nags. However, this might not have particularly useful behaviour when the system is under load or your site is being accessed over limited bandwidth.
I don't think you should go for option 3 because it's increasing complexity and resource usage for not much gain.
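For anyone who wants to try the third option anyway, the bookkeeping could look something like this framework-independent sketch. It assumes the work can be split into small steps and that the job runs somewhere outside the request (a worker thread or task queue), with the polling view calling nag() on each "is it ready yet?" request. NaggedJob, nag and interval are names invented for the sketch, not Django APIs.

```python
import threading
import time


class NaggedJob:
    """Long-running job that gives up when the client stops checking in."""

    def __init__(self, steps, interval, clock=time.monotonic):
        self.steps = steps            # iterable of zero-argument callables
        self.interval = interval      # expected seconds between client polls
        self.clock = clock            # injectable so the logic can be tested
        self.last_nag = clock()
        self.done = 0                 # number of steps completed so far
        self.state = "pending"        # pending -> running -> finished/cancelled
        self._lock = threading.Lock()

    def nag(self):
        """Called by the polling view: record the check-in, report progress."""
        with self._lock:
            self.last_nag = self.clock()
            return {"state": self.state, "done": self.done}

    def _abandoned(self):
        """True when no nag has arrived for more than two poll intervals."""
        with self._lock:
            return self.clock() - self.last_nag > 2 * self.interval

    def run(self):
        """Do the work one step at a time, checking for abandonment between steps."""
        self.state = "running"
        for step in self.steps:
            if self._abandoned():
                self.state = "cancelled"
                return
            step()
            self.done += 1
        self.state = "finished"
```

Cancellation can only happen between steps, so a single slow database query would still run to completion; that limitation is part of why this option adds complexity for modest gain.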

XMLRPCPP asynchronously handling multiple calls?

I have a remote server which handles various different commands, one of which is an event fetching method.
The event fetch returns right away if there are one or more events in the queue ready for processing. If the event queue is empty, this method does not return until a timeout of a few seconds has passed. This way I don't run into any HTTP/socket timeouts. The moment an event becomes available, the method returns right away. This way the client only ever makes connections to the server, and the server does not have to make any connections to the client.
This event mechanism works nicely. I'm using the boost library to handle queues, event notifications, etc.
Here's the problem. While the server is holding back on returning from the event fetch method, during that time, I can't issue any other commands.
In the source code, XmlRpcDispatch.cpp, I'm seeing in the "work" method, a simple loop that uses a blocking call to "select".
Seems like while the handling of a method is busy, no other requests are processed.
Question: am I not seeing something and can XmlRpcpp (xmlrpc++) handle multiple requests asynchronously? Does anyone know of a better xmlrpc library for C++? I don't suppose the Boost library has a component that lets me issue remote commands?
I actually don't care about the XML or over-HTTP feature. I simply need to issue (asynchronous) commands over TCP in any shape or form?
I look forward to any input anyone might offer.

I had some problems with XMLRPC also, and investigated many solutions like GSoap and XMLRPC++, but in the end I gave up and wrote the whole HTTP+XMLRPC from scratch using Boost.ASIO and TinyXML++ (later I swapped TinyXML for expat). It wasn't really that much work; I did it myself in about a week, starting from scratch and ending up with many RPC calls fully implemented.
Boost.ASIO gave great results. It is, as its name says, totally async, and with excellent performance with little overhead, which to me was very important because it was running in an embedded environment (MIPS).
Later, and this might be your case, I changed XML to Google's Protocol Buffers, and was even happier. Its API, as well as its message containers, are all type safe (i.e. you send an int and a float, and it never gets converted to string and back, as is the case with XML), and once you get the hang of it, which doesn't take very long, it's a very productive solution.
My recommendation: if you can ditch XML, go with Boost.ASIO + Protobuf. If you need XML: Boost.ASIO + Expat.
Doing this stuff from scratch is really worth it.
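To make the asynchronous model concrete without pulling in Boost, here is a small Python asyncio sketch of the behaviour the question asks for: an event fetch that parks on an empty queue without stopping other commands from being served. It illustrates the concurrency model only; the transport (TCP, HTTP, Protobuf framing) is left out, and fetch_events, ping and demo are names invented for the sketch.

```python
import asyncio


async def fetch_events(queue, timeout):
    """Long poll: return queued events at once, or [] after `timeout` seconds."""
    try:
        first = await asyncio.wait_for(queue.get(), timeout)
    except asyncio.TimeoutError:
        return []
    events = [first]
    while not queue.empty():          # drain anything else already queued
        events.append(queue.get_nowait())
    return events


async def ping():
    """Stand-in for any other quick command."""
    return "pong"


async def demo():
    queue = asyncio.Queue()
    # Start a long poll on an empty queue; it waits without blocking the loop.
    poll = asyncio.create_task(fetch_events(queue, timeout=5.0))
    await asyncio.sleep(0)            # let the poll start waiting
    # Other commands are still served while the poll is pending.
    reply = await ping()
    still_waiting = not poll.done()
    # An event arrives; the pending poll returns right away.
    await queue.put("door-opened")
    events = await poll
    return reply, still_waiting, events
```

Running asyncio.run(demo()) shows the ping being answered while the fetch is still pending. Boost.ASIO gives the same property in C++ through asynchronous handlers and timers rather than coroutines.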
