I have a winsock-server, accepting packets from a local IP, which currently works without using IOCP. I want it to be non-blocking though, working through IOCP. Yes I know about the alternatives (select, WSAAsync etc.), but this won't do it for developing an MMO server.
So here's the question - how do I do this using std::thread and IOCP?
I already know that GetQueuedCompletionStatus() dequeues packets, while PostQueuedCompletionStatus() queues those to the IOCP.
Is this the proper way to do it async though?
How can I treat all clients equally on about 10 threads? I thought about receiving UDP packets and processing those while IOCP has something in its queue, but then at most 10 packets will be processed at a time, and I also have an infinite loop in each thread.
The target is creating a game server, capable of holding thousands of clients at the same time.
About the code: netListener() is a class, holding packets received from the listening network interface in a vector. All it does in Receive() is
WSARecvFrom(sockfd, &buffer, 1, &bytesRecv, &flags, (SOCKADDR*)&senderAddr, &size, &overl, 0);
std::cout << "\n\nReceived " << bytesRecv << " bytes.\n" << "Packet [" << std::string(buffer.buf, bytesRecv)<< "]\n";
The code works, buffer shows what I've sent to myself, but I'm not sure whether having only ONE receive() will suffice.
About blocking - yes, I realized that putting listener.Receive() into a separate thread doesn't block the main thread. But imagine this - lots of clients try to send packets, can one receive process them all? Not to mention I was planning to queue an IOCP packet on each receive, but still not sure how to do this properly.
And another question - is it possible to establish a direct connection between a client and another client? If you host a server on a local machine behind NAT and you want it to be accessible from the internet, for example.
Threads:
void Host::threadFunc(int i) {
threadMutex.lock();
for (;;) {
if (m_Init) {
if (GetQueuedCompletionStatus(iocp, &bytesReceived, &completionKey, (LPOVERLAPPED*)&overl, WSA_INFINITE)) {
std::cout << "1 completion packet dequeued, bytes: " << bytesReceived << std::endl;
}
}
}
threadMutex.unlock(); }
void Host::createThreads() {
//Create threads
for (unsigned int i = 0; i < SystemInfo.dwNumberOfProcessors; ++i) {
threads.push_back(std::thread(&Host::threadFunc, this, i));
if (threads[i].joinable()) threads[i].detach();
}
std::cout << "Threads created: " << threads.size() << std::endl; }
Host
Host::Host() {
using namespace std;
InitWSA();
createThreads();
m_Init = true;
SecureZeroMemory((PVOID)&overl, sizeof(WSAOVERLAPPED));
overl.hEvent = WSACreateEvent();
iocp = CreateIoCompletionPort((HANDLE)sockfd, iocp, 0, threads.size());
listener = netListener(sockfd, overl, 12); //12 bytes buffer size
for (int i = 0; i < 4; ++i) { //IOCP queue test
if (PostQueuedCompletionStatus(iocp, 150, completionKey, &overl)) {
std::cout << "1 completion packet queued\n";
}
}
std::cin.get();
listener.Receive(); //Packet receive test - adds a completion packet n bytes long if client sent one
std::cin.get();}
Related
I was digging through the Asio documentation for sockets but I couldn't find anything useful on how I can handle the following situation:
I assume to have a lot of servers in a peer to peer network (up to 1000).
Servers will have to communicate regularly with each other so I do not want to open a new client connection to send a message to another server every time this is needed (huge overhead).
At the same time, creating n threads that each correspond to a client -> server connection is also not really viable.
I'll implement different communication schemes (all-to-all, star and tree) so 1, log(n) and n of the servers will have to instantiate those n socket clients to create a connection to the other servers.
Is there a good way I can simply do something like this (pseudocode)?
pool = ConnectionPool.create(vector<IP>);
pool.sendMessage(ip, message);
I know on the server side I can use an async connection. However, I don't really know how to handle it from the "client" (sender) perspective in C++/Asio.
TL;DR:
Which APIs and classes am I supposed to use when I want to "send" messages to N servers without having to open N connections every time I do that, and without using N threads?
Yes, each process will need a server side (to receive messages from any of the n participants) and one client side (to send messages to any of the n participants). However, as far as I could find in Asio, the only way to send messages to k of the n participants is by creating k threads with k connections.
Then you must not have looked in the right place, or not very far at all.
A core tenet of async IO is multiplexing IO on a single thread (all of the kqueue/epoll/select/IO completion ports etc. abstractions are geared towards that goal).
Here's an absolutely lazy-coded demonstration that shows:
single threaded everything
a listener that accepts unbounded clients (we could easily add additional listeners)
we connect to a collection of "peers"
on a heartbeat interval we send all the peers a heartbeat message
for (auto& peer : peers)
async_write(peer, buffer(message), [ep=peer.remote_endpoint(ec)](error_code ec, size_t xfr) {
std::cout << "(sent " << xfr << " bytes to " << ep << "(" << ec.message() << ")" << std::endl;
});
additionally it handles asynchronous process signals (INT, TERM) to shutdown all the async operations
"Live¹" On Coliru
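#include <boost/asio.hpp>
#include <cstring>     // strsignal (POSIX)
#include <functional>  // std::function
#include <list>
#include <iostream>
#include <string>
#include <tuple>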
using std::tuple;
using namespace std::literals;
template <typename T>
static auto reference_eq(T const& obj) {
return [p=&obj](auto& ref) { return &ref == p; };
}
int main() {
using namespace boost::asio; // don't be this lazy please
using boost::system::error_code;
using ip::tcp;
io_context ioc;
tcp::acceptor listener(ioc, {{}, 6868});
listener.set_option(tcp::acceptor::reuse_address(true));
listener.listen();
using Loop = std::function<void()>;
std::list<tcp::socket> clients, peers;
// accept unbounded clients
Loop accept_loop = [&] {
listener.async_accept([&](error_code const& ec, tcp::socket s) {
if (!ec) {
std::cout << "New session " << s.remote_endpoint() << std::endl;
clients.push_back(std::move(s));
accept_loop();
}
});
};
tcp::resolver resolver(ioc);
for (auto [host,service] : {
tuple{"www.example.com", "http"},
{"localhost", "6868"},
{"::1", "6868"},
// ...
})
{
auto& p = peers.emplace_back(ioc);
async_connect(p, resolver.resolve(host,service), [&,spec=(host+":"s+service)](error_code ec, auto...) {
std::cout << "For " << spec << " (" << ec.message() << ")";
if (!ec)
std::cout << " " << p.remote_endpoint();
else
peers.remove_if(reference_eq(p));
std::cout << std::endl;
});
}
std::string const& message = "heartbeat\n";
high_resolution_timer timer(ioc);
Loop heartbeat = [&]() mutable {
timer.expires_from_now(2s);
timer.async_wait([&](error_code ec) {
std::cout << "heartbeat " << ec.message() << std::endl;
if (ec)
return;
for (auto& peer : peers)
async_write(peer, buffer(message), [ep=peer.remote_endpoint(ec)](error_code ec, size_t xfr) {
std::cout << "(sent " << xfr << " bytes to " << ep << "(" << ec.message() << ")" << std::endl;
});
heartbeat();
});
};
signal_set sigs(ioc, SIGINT, SIGTERM);
sigs.async_wait([&](error_code ec, int sig) {
if (!ec) {
std::cout << "signal: " << strsignal(sig) << std::endl;
listener.cancel();
timer.cancel();
} });
accept_loop();
heartbeat();
ioc.run_for(10s); // max time for Coliru, or just `run()`
}
Prints (on my system):
New session 127.0.0.1:46730
For localhost:6868 (Success) 127.0.0.1:6868
For ::1:6868 (Connection refused)
For www.example.com:http (Success) 93.184.216.34:80
heartbeat Success
(sent 10 bytes to 93.184.216.34:80(Success)
(sent 10 bytes to 127.0.0.1:6868(Success)
heartbeat Success
(sent 10 bytes to 93.184.216.34:80(Success)
(sent 10 bytes to 127.0.0.1:6868(Success)
heartbeat Success
(sent 10 bytes to 93.184.216.34:80(Success)
(sent 10 bytes to 127.0.0.1:6868(Success)
^Csignal: Interrupt
heartbeat Operation canceled
Note how the one client ("New session") is our own peer connection on localhost:6868 :)
Of course, in real life you would have a class to represent a client session, perhaps have queues for messages pending sending, and optionally run on multiple threads (using strands to synchronize access to shared objects).
OTHER SAMPLES
If you really wish to avoid an explicit collection of clients, see this very similar demo: How to pass a boost asio tcp socket to a thread for sending heartbeat to client or server, which
also starts out single-threaded (but adds a thread pool for strand demonstration purposes)
has a heartbeat timer per session, meaning that each session can have its own frequency
¹ it's not working on coliru because of limited access to network. A loop-back only version without resolver use works: Live On Coliru
Since you stated you want to use TCP, i.e. a connection-based protocol, you can use the asynchronous Asio API and rely on a single thread, because asynchronous calls return immediately instead of blocking.
Your server would call boost::asio::async_write on a boost::asio::ip::tcp::socket, which represents one TCP connection. The callback you pass to async_write is called once the data has been sent, but async_write itself returns immediately. Receiving works the same way, as it does on the client. To establish a connection, the client uses a boost::asio::ip::tcp::resolver (boost::asio::ip::tcp::resolver::async_resolve) to turn a host name into endpoints and then connects a socket to one of them, while the server uses a boost::asio::ip::tcp::acceptor initialized with a boost::asio::ip::tcp::endpoint and calls boost::asio::ip::tcp::acceptor::async_accept. Actually you would need two acceptors, one each for IPv4 and IPv6.
Since each TCP connection carries some state on the server side, you would ordinarily have to track it in a central place. To avoid that contention and simplify the pattern, it's common to use a session class that inherits from std::enable_shared_from_this and passes a std::shared_ptr to itself into the callback given to boost::asio::async_write. That way the session is not forgotten, i.e. deleted, between sending and receiving, while the thread is not blocked in the usual sense.
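The keep-alive idiom can be sketched without Asio itself. In the following minimal sketch, `FakeLoop` is a hypothetical stand-in for `io_context` (it only stores handlers and runs them later) and `Session` is an illustrative class name; neither comes from Boost.Asio. The point is only that the `shared_ptr` captured by the handler keeps the session alive until the handler has run:

```cpp
#include <cstddef>
#include <functional>
#include <memory>
#include <queue>
#include <string>
#include <utility>

// Hypothetical stand-in for an event loop: stores handlers and runs them later.
struct FakeLoop {
    std::queue<std::function<void()>> pending;

    void post(std::function<void()> handler) { pending.push(std::move(handler)); }

    void run() {
        while (!pending.empty()) {
            auto handler = std::move(pending.front());
            pending.pop();
            handler();  // the handler and its captures die at the end of this iteration
        }
    }
};

class Session : public std::enable_shared_from_this<Session> {
public:
    Session(FakeLoop& loop, std::size_t& bytes_written)
        : loop_(loop), bytes_written_(bytes_written) {}

    // Must be called on an object that is owned by a std::shared_ptr.
    void start_write(std::string message) {
        auto self = shared_from_this();  // extra owner, captured by the handler
        loop_.post([self, message = std::move(message)] {
            self->on_write_done(message.size());
        });
    }

private:
    void on_write_done(std::size_t n) { bytes_written_ += n; }

    FakeLoop& loop_;
    std::size_t& bytes_written_;
};
```

With real Asio, the lambda would be the completion handler passed to boost::asio::async_write, and the session would be destroyed once its last pending handler has finished.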
For reading I recommend boost::asio::async_read_until and in general a boost::asio::streambuf.
With this, one thread running boost::asio::io_context::run would suffice; it unblocks every time one of the many connections needs its received data processed or something new has to be sent.
The overall project is a bit out of scope here; it would help if you could narrow your question, or better yet, go through the talks and examples. I have written something similar to what you intend, a resilient overlay network: https://github.com/Superlokkus/code
I have set up two Raspberry Pis to use UDP sockets, one as the client and one as the server. The kernel has been patched with RT-PREEMPT (4.9.43-rt30+). The client acts as an echo to the server to allow for the calculation of Round-Trip Latency (RTL). At the moment a send frequency of 10Hz is being used on the server side with 2 threads: one for sending the messages to the client and one for receiving the messages from the client. The threads are set up to have a scheduling priority of 95 using Round-Robin scheduling.
The server constructs a message containing the time the message was sent and the time elapsed since messages started being sent. This message is sent from the server to the client and then immediately returned to the server. Upon receiving the message back from the client, the server calculates the Round-Trip Latency and stores it in a .txt file, to be used for plotting with Python.
The problem is that when analysing the graphs I noticed a periodic spike in the RTL; see the top graph of the image (RTL latency and sendto() + recvfrom() times; the legend says RTT instead of RTL). These spikes are directly related to the spikes shown in the server-side sendto() and recvfrom() calls. Any suggestions on how to remove these spikes? My application relies heavily on consistent latency.
Things I have tried and noticed:
The size of the message being sent has no effect. I have tried larger messages (1024 bytes) and smaller messages (0 bytes) and the periodic delay does not change. This suggests to me that it is not a buffer issue as there is nothing filling up?
The frequency at which the messages are sent does play a big role, if the frequency is doubled then the latency spikes occur twice as often. This then suggests that something is filling up and while it empties the sendto()/recvfrom() functions experience a delay?
Changes to the buffer size with setsockopt() have no effect.
I have tried quite a few other settings (MSG_DONTWAIT, etc) to no avail.
I am by no means an expert in sockets/C++ programming/Linux so any suggestions given will be greatly appreciated as I am out of ideas. Below is the code used to create the socket and start the server threads for sending and receiving the messages. Below that is the code for sending the messages from the server, if you need the rest please let me know but for now my concern is centred around the delay caused by the sendto() function. If you need anything else please let me know. Thanks.
thread_priority = priority;
recv_buff = recv_buff_len;
std::cout << del << " Second start-up delay..." << std::endl;
sleep(del);
std::cout << "Delay complete..." << std::endl;
// Master socket creation
master = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);// Try to create the UDP socket
if(master < 0){// socket() returns -1 on failure; 0 is a valid descriptor
    perror("Could not create the socket: ");
    exit(EXIT_FAILURE);
}
std::cout << "Master Socket Created..." << std::endl;
std::cout << "Adjusting send and receive buffers..." << std::endl;
setBuff();
// Server address and port creation
serv.sin_family = AF_INET;// Address family
serv.sin_addr.s_addr = INADDR_ANY;// Server IP address, INADDR_ANY will work on the server side only
serv.sin_port = htons(portNum);
server_len = sizeof(serv);
// Binding of master socket to specified address and port
if (bind(master, (struct sockaddr *) &serv, sizeof (serv)) < 0) {
//Attempt to bind master socket to address
perror("Could not bind socket...");
exit(EXIT_FAILURE);
}
// Show what address and port is being used
char IP[INET_ADDRSTRLEN];
inet_ntop(AF_INET, &(serv.sin_addr), IP, INET_ADDRSTRLEN);// INADDR_ANY allows all network interfaces so it will always show 0.0.0.0
std::cout << "Listening on port: " << ntohs(serv.sin_port) << ", and address: " << IP << "..." << std::endl;
// Options specific to the server RPi
if(server){
std::cout << "Run Time: " << duration << " seconds." << std::endl;
client.sin_family = AF_INET;// Address family
inet_pton(AF_INET, clientIP.c_str(), &(client.sin_addr));
client.sin_port = htons(portNum);
client_len = sizeof(client);
serv_send = std::thread(&SocketServer::serverSend, this);
serv_send.detach();// The server send thread just runs continuously
serv_receive = std::thread(&SocketServer::serverReceive, this);
serv_receive.join();
}else{// Specific to client RPi
SocketServer::clientReceiveSend();
}
And the code for sending the messages:
// Setup the priority of this thread
param.sched_priority = thread_priority;
int result = sched_setscheduler(getpid(), SCHED_RR, &param);
if(result){
perror ("The following error occurred while setting serverSend() priority");
}
int ched = sched_getscheduler(getpid());
printf("serverSend() priority result %i : Scheduler priority id %i \n", result, ched);
std::ofstream Out;
std::ofstream Out1;
Out.open(file_name);
Out << duration << std::endl;
Out << frequency << std::endl;
Out << thread_priority << std::endl;
Out.close();
Out1.open("Server Side Send.txt");
packets_sent = 0;
Tbegin = std::chrono::high_resolution_clock::now();
// Send messages for a specified time period at a specified frequency
while(!stop){
// Setup the message to be sent
Tstart = std::chrono::high_resolution_clock::now();
TDEL = std::chrono::duration_cast< std::chrono::duration<double>>(Tstart - Tbegin); // Total time passed before sending message
memcpy(&message[0], &Tstart, sizeof(Tstart));// Send the time the message was sent with the message
memcpy(&message[8], &TDEL, sizeof(TDEL));// Send the time that had passed since Tstart
// Send the message to the client
T1 = std::chrono::high_resolution_clock::now();
sendto(master, &message, 16, MSG_DONTWAIT, (struct sockaddr *)&client, client_len);
T2 = std::chrono::high_resolution_clock::now();
T3 = std::chrono::duration_cast< std::chrono::duration<double>>(T2-T1);
Out1 << T3.count() << std::endl;
packets_sent++;
// Pause so that the required message send frequency is met
while(true){
Tend = std::chrono::high_resolution_clock::now();
Tdel = std::chrono::duration_cast< std::chrono::duration<double>>(Tend - Tstart);
if(Tdel.count() > 1.0/frequency){// 1.0 avoids integer division if frequency is an integer type
break;
}
}
TDEL = std::chrono::duration_cast< std::chrono::duration<double>>(Tend - Tbegin);
// Check to see if the program has run as long as required
if(TDEL.count() > duration){
stop = true;
break;
}
}
std::cout << "Exiting serverSend() thread..." << std::endl;
// Save extra results to the end of the last file
Out.open(file_name, std::ios_base::app);
Out << packets_sent << "\t\t " << packets_returned << std::endl;
Out.close();
Out1.close();
std::cout << "^C to exit..." << std::endl;
I have sorted out the problem. It was not the ARP tables as even with the ARP functionality disabled there was a periodic spike. With the ARP functionality disabled there would only be a single spike in latency as opposed to a series of latency spikes.
It turned out to be a problem with the threads I was using as there were two threads on a CPU only capable of handling one thread at a time. The one thread that was sending the information was being affected by the second thread that was receiving information. I changed the thread priorities around a lot (send priority higher than receive, receive higher than send and send equal to receive) to no avail. I have now bought a Raspberry Pi that has 4 cores and I have set the send thread to run on core 2 while the receive thread runs on core 3, preventing the threads from interfering with each other. This has not only removed the latency spikes but also reduced the mean latency of my setup.
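The post does not show how the threads were pinned. One way to do it on Linux is sketched below, assuming glibc and the sched_setaffinity() call; both function names are hypothetical helpers made up for this illustration, not part of the original program.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  // exposes cpu_set_t and the CPU_* macros on glibc
#endif
#include <sched.h>

// Restrict the calling thread to a single CPU core (Linux only).
// Returns true on success.
bool pin_current_thread_to_cpu(int cpu) {
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    // A pid of 0 means "the calling thread".
    return sched_setaffinity(0, sizeof(set), &set) == 0;
}

// Hypothetical helper: index of the first CPU this thread may run on, or -1.
int first_allowed_cpu() {
    cpu_set_t set;
    CPU_ZERO(&set);
    if (sched_getaffinity(0, sizeof(set), &set) != 0)
        return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu)
        if (CPU_ISSET(cpu, &set))
            return cpu;
    return -1;
}
```

In a setup like the one described, serverSend() would call pin_current_thread_to_cpu(2) and serverReceive() would call pin_current_thread_to_cpu(3) as their first statement, before the scheduling priority is set.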
I'm using a MacBook Pro to send a series of 1024-byte UDP datagrams in my main thread over a socket using a multicast address and port every 12 ms (ugly but illustrative):
for(;;) {
//-------------- Send ----------------
try {
sock.sendTo(filebufPos, readsize, mcAddr, mcPort);
sendCount++;
if(sendCount < sends_needed) {
filebufPos += readsize;
} else {
sendCount = 0; //Reset send counter
filebufPos = filebuf; //Reset pointer to start of file buffer
}
} catch (SocketException &e) {
cerr << e.what() << endl;
}
usleep(12000); //------------ Pause between sends-----------
}
On my iPhone 5, I try to receive the datagrams using a non-blocking 'recvFrom' call on the same multicast address and port within a callback routine that gets called every 1.5 ms, à la:
try {
nBytesReceived = sock->recvFrom((void *)buf, nBytesCount, mcAddr, mcPort);
} catch (SocketException &e) {
cerr << e.what() << endl;
}
I measure the time between successful UDP socket recvs on the iPhone client side. Ideally, I should receive the UDP datagrams every 8 callbacks (12 ms), and for the most part this is the case. However, sometimes the time between recvs is very short, while at other times it can be as long as 100-150 ms.
Any ideas why this might be happening?
Thanks!
I'm working on a vision application, which has two modes:
1) parameter setting
2) automatic
The problem is in 2), when my app waits for a signal via TCP/IP. The program freezes while accept() is being called. I want to let the user change the mode from the GUI; a mode change is delivered by another signal (message_queue). So I want to interrupt the blocked accept() call.
Is there a simple way to interrupt the accept?
std::cout << "TCPIP " << std::endl;
client = accept(slisten, (struct sockaddr*)&clientinfo, &clientinfolen);
if (client != SOCKET_ERROR)
cout << "client accepted: " << inet_ntoa(clientinfo.sin_addr) << ":"
<< ntohs(clientinfo.sin_port) << endl;
//receive the message from client
//recv returns the number of bytes received!!
//buf contains the data received
int rec = recv(client, buf, sizeof(buf), 0);
cout << "Message: " << rec << " bytes and the message " << buf << endl;
I read about select() but I have no clue how to use it. Could anybody give me a hint how to implement for example select() in my code?
Thanks.
Best regards,
T
The solution is to call accept() only when there is an incoming connection request. You do that by polling on the listen socket, where you can also add other file descriptors, use a timeout etc.
You did not mention your platform. On Linux, see epoll(); on UNIX in general, see poll()/select(); for Windows I don't know.
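On a POSIX system, the polling step could look roughly like the sketch below. connection_pending() is a hypothetical name for the check, and make_loopback_listener() is a helper added only so that the example is self-contained; the timeout value is up to the caller.

```cpp
#include <netinet/in.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

// Wait up to timeout_ms for an incoming connection request on a listening
// socket. Returns true if accept() can be called without blocking.
bool connection_pending(int listen_fd, int timeout_ms) {
    pollfd pfd{};
    pfd.fd = listen_fd;
    pfd.events = POLLIN;  // a listening socket becomes readable when a client is waiting
    int rc = poll(&pfd, 1, timeout_ms);
    return rc > 0 && (pfd.revents & POLLIN);
}

// Hypothetical helper for the demo: TCP listener on 127.0.0.1 with an
// OS-chosen port. On success, addr holds the address clients can connect to.
int make_loopback_listener(sockaddr_in& addr) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    addr = sockaddr_in{};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;  // let the OS pick a free port
    socklen_t len = sizeof(addr);
    if (bind(fd, reinterpret_cast<sockaddr*>(&addr), len) < 0 ||
        listen(fd, 4) < 0 ||
        getsockname(fd, reinterpret_cast<sockaddr*>(&addr), &len) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

The waiting thread would call connection_pending(slisten, 100) in a loop, check between calls whether a mode change was requested, and call accept() only after the function has returned true.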
A general way would be to use a local TCP connection by which the UI thread could interrupt the select call. The general architecture would use:
a dedicated thread waiting with select on both slisten and the local TCP connection
a TCP connection (Unix domain socket on a Unix or Unix-like system, or 127.0.0.1 on Windows) between the UI thread and the waiting one
various synchronizations/messages between both threads as required
Just declare that select should read slisten and the local socket. It will return as soon as one is ready, and you will be able to know which one is ready.
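A minimal POSIX sketch of this architecture follows, assuming a Unix domain socket pair as the local connection. The names Ready, wait_for_listener_or_wakeup, make_wakeup_pair and send_wakeup are hypothetical and exist only for this illustration.

```cpp
#include <algorithm>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

enum class Ready { Listener, Wakeup, Error };

// Block until either the listening socket or the wake-up socket is readable.
Ready wait_for_listener_or_wakeup(int listen_fd, int wake_fd) {
    fd_set read_fds;
    FD_ZERO(&read_fds);
    FD_SET(listen_fd, &read_fds);
    FD_SET(wake_fd, &read_fds);
    int rc = select(std::max(listen_fd, wake_fd) + 1, &read_fds, nullptr, nullptr, nullptr);
    if (rc <= 0)
        return Ready::Error;
    // Check the wake-up socket first so that a mode change is never starved.
    if (FD_ISSET(wake_fd, &read_fds))
        return Ready::Wakeup;
    return Ready::Listener;
}

// Create the connected local pair: one end for the waiting thread,
// the other end for the UI thread.
bool make_wakeup_pair(int fds[2]) {
    return socketpair(AF_UNIX, SOCK_STREAM, 0, fds) == 0;
}

// Called from the UI thread: a single byte is enough to make select() return.
bool send_wakeup(int fd) {
    char byte = 1;
    return write(fd, &byte, 1) == 1;
}
```

On Ready::Listener the waiting thread calls accept(); on Ready::Wakeup it reads the byte from its end of the pair and switches mode instead.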
As you haven't specified your platform, and networking, especially async, is platform-specific, I suppose you need a cross-platform solution. Boost.Asio fits perfectly here: http://www.boost.org/doc/libs/1_39_0/doc/html/boost_asio/reference/basic_socket_acceptor/async_accept/overload1.html
Example from the link:
void accept_handler(const boost::system::error_code& error)
{
if (!error)
{
// Accept succeeded.
}
}
...
boost::asio::ip::tcp::acceptor acceptor(io_service);
...
boost::asio::ip::tcp::socket socket(io_service);
acceptor.async_accept(socket, accept_handler);
If Boost is a problem, Asio can be a header-only lib and used w/o Boost: http://think-async.com/Asio/AsioAndBoostAsio.
One way would be to run select in a loop with a timeout.
Put slisten into nonblocking mode (this isn't strictly necessary but sometimes accept blocks even when select says otherwise) and then:
fd_set read_fds;
struct timeval timeout;
int select_status;
while (true) {
    // select() may modify both the fd set and the timeout, so reset them on every iteration
    FD_ZERO(&read_fds);
    FD_SET(slisten, &read_fds);
    timeout.tv_sec = 1; // 1s timeout
    timeout.tv_usec = 0;
    select_status = select(slisten+1, &read_fds, NULL, NULL, &timeout);
    if (select_status == -1) {
        // ERROR: do something
    } else if (select_status > 0) {
        break; // a connection is pending, we can accept now
    }
    // otherwise (i.e. select_status==0) timeout, continue
}
client = accept(slisten, ...);
This will allow you to catch signals once per second. More info here:
http://man7.org/linux/man-pages/man2/select.2.html
and Windows version (pretty much the same):
https://msdn.microsoft.com/pl-pl/library/windows/desktop/ms740141(v=vs.85).aspx
I have used C++ & Winsock2 to create both server and client applications. It currently handles multiple client connections by creating separate threads.
Two clients connect to the server. After both have connected, I need to send a message ONLY to the first client which connected, then wait until a response has been received, send a separate message to the second client.
The trouble is, I don't know how I can target the first client which connected.
The code I have at the moment accepts two connections but the message is sent to client 2.
Can someone please give me some ideas on how I can use Send() to a specific client? Thanks
Code which accepts the connections and starts the new threads
SOCKET TempSock = SOCKET_ERROR; // create a socket called Tempsock and assign it the value of SOCKET_ERROR
while (TempSock == SOCKET_ERROR && numCC !=2) // Until a client has connected, wait for client connections
{
cout << "Waiting for clients to connect...\n\n";
while ((ClientSocket = accept(Socket, NULL, NULL)))
{
// Create a new thread for the accepted client (also pass the accepted client socket).
unsigned threadID;
HANDLE hThread = (HANDLE)_beginthreadex(NULL, 0, &ClientSession, (void*)ClientSocket, 0, &threadID);
}
}
ClientSession()
unsigned __stdcall ClientSession(void *data)
{
SOCKET ClientSocket = (SOCKET)data;
numCC ++; // increment the number of connected clients
cout << "Clients Connected: " << numCC << endl << endl; // output number of clients currently connected to the server
if (numCC <2)
{
cout << "Waiting for additional clients to connect...\n\n";
}
if (numCC ==2)
{
SendRender(); // ONLY TO CLIENT 1???????????
// wait for client render to complete and receive Done message back
memset(bufferReply, 0, 999); // set the memory of the buffer
int inDataLength = recv(ClientSocket,bufferReply,1000,0); // receive data from the server and store in the buffer
response = bufferReply; // assign contents of buffer to string var 'message'
cout << response << ". " << "Client 1 Render Cycle complete.\n\n";
SendRender(); // ONLY TO CLIENT 2????????????
}
return 0;
}
Sendrender() function (sends render command to the client)
int SendRender()
{
// Create message to send to client which will initialise rendering
char *szMessage = "Render";
// Send the Render message to the first client
iSendResult = send(ClientSocket, szMessage, strlen(szMessage), 0); // HOW TO SEND ONLY TO CLIENT 1???
if (iSendResult == SOCKET_ERROR)
{
// Display error if unable to send message
cout << "Failed to send message to Client " << numCC << ": " << WSAGetLastError();
closesocket(Socket);
WSACleanup();
return 1;
}
// notify user that Render command has been sent
cout << "Render command sent to Client " << numCC << endl << endl;
return 0;
}
You can provide both a wait function and a control function to the thread by adding a WaitForSingleObject (or WaitForMultipleObjects) call. Those API calls suspend the thread until some other thread sets an event handle. The API return value tells you which event handle was set, which you can use to determine which action to take.
Use a different event handle for each thread. To pass it in to a thread you will need a struct that contains both the event handle and the socket handle you are passing now. Passing a pointer to this struct into the thread is a way to, in effect, pass two parameters.
Your main thread will need to use CreateEvent to initialize the event handles. Then, after both sockets are connected, it would set one event (SetEvent), triggering the first thread.
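The same control flow can be sketched in portable C++. This is not the Windows API: std::promise/std::future stand in for the event handles (set_value() plays the role of SetEvent, wait() the role of WaitForSingleObject), a plain int stands in for the SOCKET, and all names are hypothetical.

```cpp
#include <functional>
#include <future>
#include <thread>
#include <utility>
#include <vector>

// The struct passed to each client thread, so that it receives
// "two parameters" at once.
struct ClientContext {
    int client_id;             // stand-in for the SOCKET handle
    std::future<void> start;   // stand-in for this thread's event handle
    std::promise<void>* done;  // signalled once this client has replied
};

void client_session(ClientContext ctx, std::vector<int>& order) {
    ctx.start.wait();                // suspended until the main thread triggers us
    order.push_back(ctx.client_id);  // real code would send() "Render" and recv() here
    ctx.done->set_value();
}

// Starts one thread per client, then triggers the clients strictly in
// connection order. Returns the order in which the clients were served.
std::vector<int> render_in_connect_order(int client_count) {
    std::vector<std::promise<void>> start_events(client_count);
    std::vector<std::promise<void>> done_events(client_count);
    std::vector<std::future<void>> done_futures;
    std::vector<std::thread> threads;
    std::vector<int> order;

    for (int i = 0; i < client_count; ++i) {
        done_futures.push_back(done_events[i].get_future());
        ClientContext ctx{i, start_events[i].get_future(), &done_events[i]};
        threads.emplace_back(client_session, std::move(ctx), std::ref(order));
    }
    for (int i = 0; i < client_count; ++i) {
        start_events[i].set_value();  // trigger client i only
        done_futures[i].wait();       // wait for its reply before the next one
    }
    for (auto& t : threads)
        t.join();
    return order;
}
```

Because each thread waits on its own event, the main thread decides which client receives the message, regardless of which connection was accepted last.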