TCP/IP and designing networking application - c++

i'm reading about way to implemnt client-server in the most efficient manner, and i bumped into that link :
http://msdn.microsoft.com/en-us/library/ms740550(VS.85).aspx
saying :
"Concurrent connections should not exceed two, except in special purpose applications. Exceeding two concurrent connections results in wasted resources. A good rule is to have up to four short lived connections, or two persistent connections per destination "
i can't quite get what they mean by 2... and what do they mean by persistent?
let's say i have a server who listens to many clients , whom suppose to do some work with the server, how can i keep just 2 connections open ?
what's the best way to implement it anyway ? i read a little about completion port , but couldn't find a good examples of code, or at least a decent explanation.
thanks

Did you read the last sentence:
A good rule is to have up to four
short lived connections, or two
persistent connections per
destination.
Hard to say from the article, but by destination I think they mean client. This isn't a very good article.

A persistent connection is where a client connects to the server and then performs all its actions without ever dropping the connection. Even if the client has periods of time when it does not need the server, it maintains its connection to the server ready for when it might need it again.
A short lived connection would be one where the client connects, performs its action and then disconnects. If it needs more help from the server it would re-connect to the server and perform another single action.
As the server implementing the listening end of the connection, you can set options in the listening TCP/IP socket to limit the number of connections that will be held at the socket level and decide how many of those connections you wish to accept - this would allow you to accept 2 persistent connections or 4 short lived connections as required.

What they mean by, "persistent," is a connection that is opened, and then held open. It's pretty common problem to determine whether it's more expensive to tie up resources with an "always on" connection, or suffer the overhead of opening and closing a connection every time you need it.
It may be worth taking a step back, though.
If you have a server that has to listen for requests from a bunch of clients, you may have a perfect use case for a message-based architecture. If you use tightly-coupled connections like those made with TCP/IP, your clients and servers are going to have to know a lot about each other, and you're going to have to write a lot of low-level connection code.
Under a message-based architecture, your clients could place messages on a queue. The server could then monitor that queue. It could take messages off the queue, perform work, and place the responses back on the queue, where the clients could pick them up.
With such a design, the clients and servers wouldn't have to know anything about each other. As long as they could place properly-formed messages on the queue, and connect to the queue, they could be implemented in totally different languages, and run on different OS's.
Messaging-oriented-middleware like Apache ActiveMQ and Weblogic offer API's you could use from C++ to manage and use queues, and other messaging objects. ActiveMQ is open source, and Weblogic is sold by Oracle (who bought BEA). There are many other great messaging servers out there, so use these as examples, to get you started, if messaging sounds like it's worth exploring.

I think key words are "per destination". Single tcp connection tries to accelerate up to available bandwidth. So if you allow more connections to same destination, they have to share same bandwidth.
This means that each transfer will be slower than it could be and server has to allocate more resources for longer time - data structures for each connection.
Because establishing tcp connection is "time consuming", it makes sense to allow establish second connection in time when you are serving first one, so they are overlapping each other. for short connections setup time could be same as for serving the connection itself (see poor performance example), so more connections are needed for filling all bandwidth effectively.
(sorry I cannot post hyperlinks yet)
here msdn.microsoft.com/en-us/library/ms738559%28VS.85%29.aspx you can see, what is poor performance.
here msdn.microsoft.com/en-us/magazine/cc300760.aspx is some example of threaded server what performs reasonably well.
you can limit number of open connections by limiting number of accept() calls. you can limit number of connections from same source just by canceling connection when you find out, that you allready have more then two connections from this location (just count them).
For example SMTP works in similar way. When there are too many connections, it returns 4xx code and closes your connection.
Also see this question:
What is the best epoll/kqueue/select equvalient on Windows?

Related

Pinging by creating new sockets per each peer

I created a small cross-platform app using Qt sockets in C++ (although this is not a C++ or Qt specific question).
The app has a small "ping" feature that tries to connect to a peer and asks for a small challenge (i.e. some custom data sent and some custom data replied) to see if it's alive.
I'm opening one socket per each peer so as soon as the ping starts we have several sockets in SYN_SENT.
Is this a proper way to implement a ping-like protocol with challenge? Am I wasting sockets? Is there a better way I should be doing this?
I'd say your options are:
An actual ping (using ICMP echo packets). This has low overhead, but only tells you whether the host is up. And it requires you to handle lost packets, timeouts, and retransmits.
A UDP-based protocol. This also has lower kernel overhead, but again you'll be responsible for setting up timeouts, handling lost packets, and retransmits. It has the advantage of allowing you to positively affirm that your program is running on the peer. It can be implemented with only a single socket endpoint no matter how many peers you add. (It is also possible that you could send to multiple peers at once with a broadcast if all are on a local network, or a multicast [complicated set-up required for that].)
TCP socket as you're doing now. This is much easier to code, extremely reliable and will automatically provide a timeout (i.e. your connect will eventually fail if the peer doesn't respond). It lets you know positively that your peer is there and running your program. Although there is more kernel overhead to this, and you will use one socket endpoint on your host per peer system, I wouldn't call it a significant issue unless you think you'll be having thousands of peers.
So, in the end, you have to judge: If thousands of hosts will be participating and this pinging is going to happen frequently, you may be better off coding up a UDP solution. If the pinging is rare or you don't expect so many peers, I would go the TCP route. (And I wouldn't consider that a "waste of sockets" -- those advantages are why TCP is so commonly used.)
The technique described in the question doesn't really implement ping for the connection and doesn't test if the connection itself is alive. The technique only checks that the peer is listening for (and is responsive to) new connections...
What you're describing is more of an "is the server up?" test than a "keep-alive" ping.
If we're discussing "keep-alive" pings, than this technique will fail.
For example, if just the read or the write aspect of the connection is closed, you wouldn't know. Also, if the connection was closed improperly (i.e., due to an intermediary dropping the connection), this ping will not expose the issue.
Most importantly, for some network connections and protocols, you wouldn't be resetting the connection's timeout... so if your peer is checking for connection timeouts, this ping won't help.
For a "keep-alive" ping, I would recommend that you implement a protocol specific ping.
Make sure that the ping is performed within the existing (same) connection and never requires you to open a new connection.

Design a multi client - server application, where client send messages infrequent

I have to design a server which can able to send a same objects to many clients. clients may send some request to the server if it wants to update something in the database.
Things which are confusing:
My server should start the program (where I perform some operation and produce 'results' , this will be send to the client).
My server should listen to the incoming connection from the client, if any it should accept and start sending the ‘results’.
Server should accept as many clients as possible (Not more than 100).
My ‘result' should be secured. I don’t want some one take my ‘result' and see what my program logics look like.
I thought point 1. is one thread. And point 2. is another thread and it going to create multiple threads within its scope to serve point 3. Point 4 should be taken by my application logic while serialising the 'result' rather the server.
Is it a bad idea? If so where can i improve?
Thanks
Putting every connection on a thread is very bad, and is apparently a common mistake that beginners do. Every thread costs about 1 MB of memory, and this will overkill your program for no good reason. I did ask the very same question before, and I got a very good answer. I used boost ASIO, and the server/client project is finished since months, and it's a running project now beautifully.
If you use C++ and SSL (to secure your connection), no one will see your logic, since your programs are compiled. But you have to write your own communication protocol/serialization in that case.

Connection per session or multiplexing multiple sessions through one connection

When designing a client/server architecture, is there any advantage to multiplexing multiple connections from the same process to the remote server (i.e. sharing one connection) vs opening one connection per thread/session in the client (as is typically done when connecting to memcached or database servers.)
I know there's a bit of overhead associated with each connection (e.g. if a server has 50,000 open connections that uses up a lot of RAM) this was one major reason why facebook made a UDP patch for memcached. But I don't expect to have anywhere near that number. Maybe 10,000 at the most. There's also savings in establishing a tcp/ip connection and doing authorization, but for now I'd rather leave authorization to firewall software as memcached does.
Are there any reasons to implement multiplexing connections in a tcp/ip client/server application with less than 10K connections?
Edit - Details:
This is for a database server/client I'm working on. I think that Informix and Oracle do actually allow for session multiplexing over one tcp/ip connection. In the Informix documentation they say you may get a performance improvement for nonthreaded clients (no mention of multi-threaded clients, perhaps it's not a thread-safe implementation.)
is there any advantage to multiplexing multiple connections vs opening one connection per thread/session
Yes, though it depends on the implementation of the simplex. You probably know about the firewall hassle with e.g. FTP, SIP et al, especially when encryption is used partway. This is what influences the decision whether to use multiple, or just one connection.

C++ TCP server Class Design

I am developing TCP server in C++(win32/linux) which cater multiple client.The server is for Video Streaming.Client request video to server and Server get it from Gateway connected with camera.
I am stuck up in the class Design.I found three classes by
Peer
Session and
ConnectionMgr.
So here ConnectionMgr is responsible for managing other Classes.
I wanted your feedback on this.
what info Peer and session need to have;
How Peer and session is related
what information needs to be modeled here.
how to do Session maintainer.
Managing multiple client will require Threads what information thoses may need.
Please give your feedback so that I can upgrade my design.
Looking at the problem space from scratch:
there's some state associated with each client that connects - you seem to split this between Peer and Session and I see no real value in that if they're 1:1 - can omit such trivia from the high-level design stage.
"what info Peer and session need to have": socket descriptor is the only crucial thing, assuming you have only one camera and stream to all clients at the same pace (losing data when socket send() blocks/can't complete due to full buffer), otherwise, a buffer too...
you have a ConnectionMgr, well - yes... it must listen and accept clients on the server socket, possibly launch a new thread per client or monitor the set of current client connections and dispatch events
you'll need to make some decisions about the I/O and concurrency model (e.g. select/poll/non-blocking, async, blocking, single threaded, thread-per-client, thread-pool etc)
this will obviously affect your design: you should decide which - or which choices - you need to support...
To get a feel for this problem space, I suggest you create a very simple client/server program - probably using threads if you're familiar and comfortable with multithreading, otherwise you can hack upon the GCC libc TCP client/server examples for a select() based solution (http://www.gnu.org/s/libc/manual/html_node/Server-Example.html#Server-Example) or try boost::asio or ACE or whatever. To start, just get it working so you can telnet to the server and whatever you type in any connection is echoed out on all the connections. That should give you enough insight to start asking more concrete questions.
As #nabulke and #Jan Hudec stated in their comments, Boost.Asio is very good solution for your problem. Have a look at pretty simple example "Async TCP Echo Server". It uses just 2 classes: server and session. No session_manager. Sessions are managed automatically by smart pointers, very convenient and simple approach.
Using Boost.Asio you can keep the network part simple (and almost optimal by efficiency using asynchronous processing). As a bonus, adding couple of code lines, you receive multithreaded server w/o headache (I would recommend this example: "An HTTP server using a single io_service and a thread pool calling io_service::run().", just ignore HTTP stuff. pay attention to boost::asio::io_service::strand used in connection class)

How to design a client server architect

I like to know the server (TCP based) architecture to support large scale of clients(at least10K) to implement Fix server. My points are
How we design it.
How to listen on the open port? Use select or poll or any other function.
How to process the response of the client? On large scale we cannot create the one thread for each client.
Should the processing of response is in the different executable and share the request and response to the server executable through IPC.
There is much more on it. I would appreciate if anyone explains it or provide any link.
Thanks
An excellent resource for information on this topic is The C10K problem. Although the dimensions there seem a little old, the techniques are still applicable today.
The architecture depends on what you want to do with the clients incoming data. My guess is that for every incoming message you would perform some computations and probably also return a response.
In that case I would create 1 main listener thread that receives all the incoming messages (Actually, if your hardware has more than 1 physical network device, I would use a listener thread per device and make sure each one is listening to a specific device).
Get the number of CPUs that you have on your machine and create worker threads for each CPU and bind them each thread to one cpu (Maybe number of working thread should be num_of_cpu-1, to leave an availalbe cpu for the listener and dispatcher).
Each thread has a queue and semaphore, the main listener thread just push the incoming data into those queues. There are many way to perform load balancing (Will talk about it later).
Each working thread just works on the requests given to it, and put the response on another queue that is read by the dispatcher.
The dispatcher - there are 2 options here, use a thread for dispatcher (or thread per network device as for listeners), or have the dispatcher actually be the same thread as the listener.
There is some advantage to put them both on the same thread, since it makes it easier to detect lost socket connection and use the same fds for both reading and writing without thread synchronization. However, it could be that using 2 different threads would give better performance, it need to be tested.
Note about load balancing:
This is a topic of its own.
The simplest thing is to use 1 queue for all working threads, but the problem is that they have to lock in order to pop items and the locking can damage performance. (But you get the most balanced load).
Another quite simple approach would be to have a private queue for every worker and perform round-robin when inserting. After every X cycles check the size of all the queues. If some queues are much larger than others then leave them out for the next X cycles and then recheck them again. This is not the best approach, but a simple one to implement and gives some load balancing while no locking is needed.
By the way - There is a way to implement queue between 2 threads without blocking - but this is also another topic.
I hope it helps,
Guy
If the client and server are on a secure network then the security aspect is to be minimal - to the extent that the transfers are encrypted. If the clients and the server are not on a secure network - you first want the server and client to authenticate each other and then initiate encrypted data transfer. For data transfer, server-side authentication should suffice. At the end of this authentication use the session key to generate encrypted data stream (symmetric). consider using TFTP it is simple to implement and scales reasonably well.