I am trying to implement two way communication using sockets and not quite sure where I'm going wrong. I have an application that launches a child application, the child application then tries to communicate with the application that launched it, but I am not getting anything.
In the application that launches the child:
int clsSocketThread::initialiseSocket(bool blnIsModule, QString strPurpose) {
const char* cpszLocalHost = "localhost";
//Get the socket
int intSocket = socket(AF_INET, SOCK_STREAM, 0);
if ( intSocket == 0 ) {
clsDebugService::exitWhenDebugQueueEmpty("Failed to create socket!");
}
struct hostent* pHostEntry = gethostbyname(cpszLocalHost);
if ( pHostEntry == nullptr ) {
clsDebugService::exitWhenDebugQueueEmpty("Unable to resolve ip address!");
}
//Initliase and get address of localhost
struct sockaddr_in srvAddr;
bzero((char*)&srvAddr, sizeof(srvAddr));
//Set-up server address
memcpy(&srvAddr.sin_addr, pHostEntry->h_addr_list[0], pHostEntry->h_length);
srvAddr.sin_family = AF_INET;
srvAddr.sin_port = htons(clsSocketThread::mscuint16Port);
char* pszIP = inet_ntoa(srvAddr.sin_addr);
if ( pszIP != nullptr ) {
qdbg() << "Setting up socket on ip: " << pszIP
<< ", port: " << clsSocketThread::mscuint16Port
<< ((strPurpose.isEmpty() == true) ? "" : strPurpose);
}
socklen_t tSvrAddr = sizeof(srvAddr);
int intRC;
#if !defined(STANDALONE)
if ( blnIsModule == true ) {
intRC = inet_pton(srvAddr.sin_family, pszIP, &srvAddr.sin_addr);
if ( intRC <= 0 ) {
clsDebugService::exitWhenDebugQueueEmpty("Invalid address not supported!");
}
intRC = ::connect(intSocket, (const struct sockaddr*)&srvAddr, tSvrAddr);
} else
#endif
{
intRC = bind(intSocket, (const struct sockaddr*)&srvAddr, tSvrAddr);
}
if ( intRC < 0 ) {
clsDebugService::exitWhenDebugQueueEmpty("Socket operation failed!");
}
if ( blnIsModule != true && listen(intSocket, 5) < 0 ) {
clsDebugService::exitWhenDebugQueueEmpty("Cannot listen to socket!");
}
return intSocket;
}
This function is used by both the launcher and the child, when the child calls it the first parameter is true. I've run both in debuggers and all the function calls are successful and there are no errors.
In the launching application I have a thread:
void clsSocketThread::serverSocketBody() {
if ( mintSocket == 0 ) {
mintSocket = clsSocketThread::initialiseSocket();
}
QByteArray qarybytBuffer;
char arycBuffer[2048];
int intNewSocket = 0;
size_t tBufferSize = sizeof(arycBuffer);
QJsonObject objJSON;
while( mpThread != nullptr ) {
if ( intNewSocket <= 0 ) {
struct sockaddr_in cliAddr;
socklen_t tCliLen = sizeof(cliAddr);
intNewSocket = accept(mintSocket, (struct sockaddr*)&cliAddr, &tCliLen);
if ( intNewSocket < 0 ) {
continue;
}
}
//Read from the other socket!
bzero(arycBuffer, tBufferSize);
ssize_t tRead = read(intNewSocket, arycBuffer, tBufferSize);
if ( tRead <= 0 ) {
continue;
}
qarybytBuffer = QByteArray(arycBuffer, (int)tRead);
int intIdx = qarybytBuffer.indexOf(clsJSON::msccOpenCurlyBracket)
,intIdx2 = qarybytBuffer.lastIndexOf(clsJSON::msccCloseCurlyBracket);
if ( intIdx >= 0 && intIdx2 > intIdx ) {
qarybytBuffer = qarybytBuffer.mid(intIdx, intIdx2 - intIdx + 1).trimmed();
QJsonObject objJSON(clsJSON(&qarybytBuffer).toQJsonObject());
if ( objJSON.contains(clsJSON::mscszMsgType) == true ) {
qdbg() << "[RX]Data: " << arycBuffer;//HACK
clsJSON::blnDecodeAccordingToType(&objJSON);
}
}
}
}
I'm not receiving any messages from the child. Both applications are set-up to communicate on localhost:8123
This is operating system specific.
For Linux, read Advanced Linux Programming then syscalls(2), socket(7), unix(7), fifo(7), pipe(7)
With Qt, consider using QSocketNotifier in your main thread.
You might also want to use POCO, ONCRPC, JSONRPC, Wt (perhaps with libcurl) or libonion.
You could get some inspiration by studying the C++ source code of Qt, of POCO, of VMIME.
Be aware that in many (but not all) cases, a single send(2) -or write(2)- on emitter side may correspond to several recv(2) -or read(2)- on the receiving side (and vice versa), at least with TCP on different machines. So you need some event loop (often around poll(2)...) and documented conventions on application-level message formats. Then SMTP or HTTP could be inspirational (and in some cases, useful).
Related
I am using C++ code snippet for port forwarding. The requirement is to do the hand shake between two ports. It should be two way communication. That is to forward what ever iscoming on the source port to destination port. And then to forward the response of the destination port to the source port.
This piece of code is working as expected on my mac system. But when I am running this code on Linux system I am facing one issue.
Issue:
The C++ code that I am using is having 3 parts:
establish_connection_to_source();
open_connection_to_destination();
processconnetion();
On Linux: establish_connection_to_source(); and open_connection_to_destination(); is working perfectly fine. But processconnetion(); is havng one issue.
Following is the process connection method:
void processconnetion()
{
buffer *todest = new buffer(socket_list[e_source].fd,socket_list[e_dest].fd);
buffer *tosrc = new buffer(socket_list[e_dest].fd,socket_list[e_source].fd);
if (todest == NULL || tosrc == NULL){
fprintf(stderr,"out of mememory\n");
exit(-1);
}
unsigned int loopcnt;
profilecommuncation srcprofile(COMM_BUFSIZE);
profilecommuncation destprofile(COMM_BUFSIZE);
while (true) {
int withevent = poll(socket_list, 2, -1);
loopcnt++;
fprintf(stderr,"loopcnt %d socketswith events = %d source:0x%x dest:0x%x\n", loopcnt, withevent, socket_list[e_source].revents, socket_list[e_dest].revents);
if ((socket_list[e_source].revents | socket_list[e_dest].revents) & (POLLHUP | POLLERR)) {
// one of the connections has a problem or has Hungup
fprintf(stderr,"socket_list[e_source].revents= 0x%X\n", socket_list[e_source].revents);
fprintf(stderr,"socket_list[e_dest].revents= 0x%X\n", socket_list[e_dest].revents);
fprintf(stderr,"POLLHUP= 0x%X\n", POLLHUP);
fprintf(stderr,"POLLERR= 0x%X\n", POLLERR);
int result;
socklen_t result_len = sizeof(result);
getsockopt(socket_list[e_dest].fd, SOL_SOCKET, SO_ERROR, &result, &result_len);
fprintf(stderr, "result = %d\n", result);
fprintf(stderr,"exiting as one connection had an issue\n");
break;
}
if (socket_list[e_source].revents & POLLIN) {
srcprofile.increment_size(todest->copydata());
}
if (socket_list[e_dest].revents & POLLIN) {
destprofile.increment_size(tosrc->copydata());
}
}
delete todest;
delete tosrc;
close(socket_list[e_source].fd);
close(socket_list[e_dest].fd);
srcprofile.dumpseensizes("source");
destprofile.dumpseensizes("destination");
}
Here it is giving error - exiting as one connection had an issue that means that if ((socket_list[e_source].revents | socket_list[e_dest].revents) & (POLLHUP | POLLERR)) is returning true. The issue is with the destination port and not in case of source.
Note:
Variales used in the processconnetion(); method:
socket_list is a structure of type pollfd. Following is the description:
struct pollfd {
int fd;
short events;
short revents;
};
pollfd socket_list[3];
#define e_source 0
#define e_dest 1
#define e_listen 2
Following is the output at the time for exit:
connecting to destination: destination IP / 32001.
connected...
loopcnt 1 socketswith events = 1 source:0x0 dest:0x10
socket_list[e_source].revents= 0x0
socket_list[e_dest].revents= 0x10
POLLHUP= 0x10
POLLERR= 0x8
result = 0
exiting as one connection had an issue
int withevent = poll(socket_list, 2, -1); here the withevent value returned is 1
Socket List Initialisation:
guard( (socket_list[e_listen].fd = socket( PF_INET, SOCK_STREAM, IPPROTO_TCP )), "Failed to create socket listen, error: %s\n", "created listen socket");
void guard(int n, char *msg, char *success)
{
if (n < 0) {
fprintf(stderr, msg, strerror(errno) );
exit(-1);
}
fprintf(stderr,"n = %d %s\n",n, success);
}
I am not able to figure out the issue as it is working fine in mac. Any leads why this behaviour in Linux is highly appreciated. Thanks in advance.
I want to try and correlate an IP packet (using libpcap) to a process. I have had some limited success using the relevant /proc/net/ files but found that on some of the machines i'm using, this file can be many thousands of lines and parsing it is not efficient (caching has alleviated some performance problems).
I read that using sock_diag netlink subsystem could help by directly querying the kernel about the socket I am interested in. I've had limited success with my attempts but have hit a mental block on what is wrong.
For the initial query I have:
if (query_fd_) {
struct {
nlmsghdr nlh;
inet_diag_req_v2 id_req;
} req = {
.nlh = {
.nlmsg_len = sizeof(req),
.nlmsg_type = SOCK_DIAG_BY_FAMILY,
.nlmsg_flags = NLM_F_REQUEST | NLM_F_DUMP
},
.id_req = {
.sdiag_family = packet.l3_protocol,
.sdiag_protocol = packet.l4_protocol,
.idiag_ext = 0,
.pad = 0,
.idiag_states = -1,
.id = {
.idiag_sport = packet.src_port,
.idiag_dport = packet.dst_port
}
}
};
//packet ips are just binary data stored as strings!
memcpy(req.id_req.id.idiag_src, packet.src_ip.c_str(), 4);
memcpy(req.id_req.id.idiag_dst, packet.dst_ip.c_str(), 4);
struct sockaddr_nl nladdr = {
.nl_family = AF_NETLINK
};
struct iovec iov = {
.iov_base = &req,
.iov_len = sizeof(req)
};
struct msghdr msg = {
.msg_name = (void *) &nladdr,
.msg_namelen = sizeof(nladdr),
.msg_iov = &iov,
.msg_iovlen = 1
};
// Send message to kernel
for (;;) {
if (sendmsg(query_fd_, &msg, 0) < 0) {
if (errno == EINTR)
continue;
perror("sendmsg");
return false;
}
return true;
}
}
return false;
For the receive code I have:
long buffer[8192];
struct sockaddr_nl nladdr = {
.nl_family = AF_NETLINK
};
struct iovec iov = {
.iov_base = buffer,
.iov_len = sizeof(buffer)
};
struct msghdr msg = {
.msg_name = (void *) &nladdr,
.msg_namelen = sizeof(nladdr),
.msg_iov = &iov,
.msg_iovlen = 1
};
int flags = 0;
for (;;) {
ssize_t rv = recvmsg(query_fd_, &msg, flags);
// error handling
if (rv < 0) {
if (errno == EINTR)
continue;
if ((errno == EAGAIN) || (errno == EWOULDBLOCK))
break;
perror("Failed to recv from netlink socket");
return 0;
}
if (rv == 0) {
printf("Unexpected shutdown of NETLINK socket");
return 0;
}
for (const struct nlmsghdr* header = reinterpret_cast<const struct nlmsghdr*>(buffer);
rv >= 0 && NLMSG_OK(header, static_cast<uint32_t>(rv));
header = NLMSG_NEXT(header, rv)) {
// The end of multipart message
if (header->nlmsg_type == NLMSG_DONE)
return 0;
if (header->nlmsg_type == NLMSG_ERROR) {
const struct nlmsgerr *err = reinterpret_cast<nlmsgerr*>(NLMSG_DATA(header));
if (err == NULL)
return 100;
errno = -err->error;
perror("NLMSG_ERROR");
return 0;
}
if (header->nlmsg_type != SOCK_DIAG_BY_FAMILY) {
printf("unexpected nlmsg_type %u\n", (unsigned)header->nlmsg_type);
continue;
}
// Get the details....
const struct inet_diag_msg* diag = reinterpret_cast<inet_diag_msg*>(NLMSG_DATA(header));
if (header->nlmsg_len < NLMSG_LENGTH(sizeof(*diag))) {
printf("Message too short %d vs %d\n", header->nlmsg_len, NLMSG_LENGTH(sizeof(*diag)));
return 0;
}
if (diag->idiag_family != PF_INET) {
printf("unexpected family %u\n", diag->idiag_family);
return 1;
}
return diag->idiag_inode;
The Problem:
The diag->udiag_inode value doesn't match the one I see in netstat output or in the /proc/net/ files. Is it supposed too? If not, is it possible to use this approach to retrieve the inode number for the process so that I can then query /proc for the corresponding PID?
Another thing I didn't quite understand is the NLMSG_DONE when checking the nlmsg_type in the header. What I am seeing:
1 - TCP 10.0.9.15:51002 -> 192.168.64.11:3128 [15047]
2 - TCP 192.168.64.11:3128 -> 10.0.9.15:51002 [0]
3 - TCP 10.0.9.15:51002 -> 192.168.64.11:3128 [0]
4 - TCP 192.168.64.11:3128 -> 10.0.9.15:51002 [15047]
5 - TCP 10.0.9.15:51002 -> 192.168.64.11:3128 [0]
6 - TCP 192.168.64.11:3128 -> 10.0.9.15:51002 [0]
7 - TCP 10.0.9.15:51002 -> 192.168.64.11:3128 [15047]
So I get an inode number on first query, then some NLMSG_DONE returns (stepping through code confirmed this was the path). Why don't I get the same result for say lines 1 and 3?
Appreciate any help or advice.
Found the answer and posting in case anyone stumbles across it:
I had a uint16_t as the return type from the recv code when in fact it should have been ino_t or uint32_t. I discovered this when I noticed that a few of the inodes matched correctly after a fresh reboot and then after a while stopped matching with no code changed (inode count obviously incrementing). Using the correct type in the function return sorted the problem (so the code I posted is actually correct!)
I was getting multi part messages. I should have looped whilst NLM_F_MULTI was set in the flags and then left the loop when receiving NLMSG_DONE.
I modified thegeekinthecorner examples to be able to continuously send data.
I am using g++4.9.2.
I tried uninstalling the oficial latest OFED from here http://downloads.openfabrics.org/OFED/
OFED Distribution Software Installation Menu
1) View OFED Installation Guide
2) Install OFED Software
3) Show Installed Software
4) Configure IPoIB
5) Uninstall OFED Software
Q) Exit
Select Option [1-5]:5
Uninstalling the previous version of OFED
Running rpm -e --allmatches libibverbs libibverbs-devel libibverbs-utils libmthca libmlx4 libcxgb3 libnes libipathverbs libibcm libibumad libibumad-devel libibmad ibacm librdmacm librdmacm-utils librdmacm-devel opensm opensm-libs dapl perftest mstflint ibutils infiniband-diags qperf infinipath-psm opensm opensm-libs libipathverbs dapl libibcm libibmad libibumad libibumad-devel libibverbs libibverbs-devel libibverbs-utils libipathverbs libmthca libmlx4 librdmacm librdmacm-devel librdmacm-utils ibacm ibutils ibutils-libs libnes infinipath-psm
Failed to uninstall the previous installation
See /tmp/OFED.22320.logs/ofed_uninstall.log
[idf#node1 OFED-1.5.4-20110726-0732]$
[idf#node1 OFED-1.5.4-20110726-0732]$
If instead I just try to install it, I get this:
OFED Distribution Software Installation Menu
1) Basic (OFED modules and basic user level libraries)
2) HPC (OFED modules and libraries, MPI and diagnostic tools)
3) All packages (all of Basic, HPC)
4) Customize
Q) Exit
Select Option [1-4]:3
Please choose an implementation of MVAPICH2:
1) OFA (IB and iWARP)
2) uDAPL
Implementation [1]: 1
Enable ROMIO support [Y/n]:
Enable shared library support [Y/n]:
Enable Checkpoint-Restart support [y/N]:
Kernel 3.10.0-229.7.2.el7.x86_64 is not supported.
For the list of Supported Platforms and Operating Systems see
/mnt/gluster/Downloads/OFED-1.5.4-20110726-0732/docs/OFED_release_notes.txt
[idf#node1 OFED-1.5.4-20110726-0732]$
[idf#node2 Release]$ lspci | grep -i mel
02:00.0 InfiniBand: Mellanox Technologies MT26428 [ConnectX VPI PCIe 2.0 5GT/s - IB QDR / 10GigE] (rev b0)
[idf#node2 Release]$
[idf#node1 Release]$ ibv_devinfo
hca_id: mlx4_0
transport: InfiniBand (0)
fw_ver: 2.7.200
node_guid: 0025:90ff:ff1a:081c
sys_image_guid: 0025:90ff:ff1a:081f
vendor_id: 0x02c9
vendor_part_id: 26428
hw_ver: 0xB0
board_id: SM_2092000001000
phys_port_cnt: 1
port: 1
state: PORT_ACTIVE (4)
max_mtu: 4096 (5)
active_mtu: 4096 (5)
sm_lid: 1
port_lid: 2
port_lmc: 0x00
link_layer: InfiniBand
[idf#node1 Release]$ ifconfig -a
ib0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 2044
inet 192.168.0.1 netmask 255.255.255.0 broadcast 192.168.0.255
inet6 fe80::225:90ff:ff1a:71 prefixlen 64 scopeid 0x20<link>
Infiniband hardware address can be incorrect! Please read BUGS section in ifconfig(8).
infiniband 80:00:00:48:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00 txqueuelen 256 (InfiniBand)
RX packets 5 bytes 280 (280.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 27 overruns 0 carrier 0 collisions 0
Below is the client and server. When I run this programs, the clients will send messages, but the number of messages it sends is erratic, error messages are often
Client:
#include <iostream>
#include <thread>
#include <netdb.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <rdma/rdma_cma.h>
#define TEST_NZ(x) do { if ( (x)) die("error: " #x " failed (returned non-zero)." ); } while (0)
#define TEST_Z(x) do { if (!(x)) die("error: " #x " failed (returned zero/null)."); } while (0)
const int BUFFER_SIZE = 2048;
const int TIMEOUT_IN_MS = 500; /* ms */
struct context
{
struct ibv_context *ctx;
struct ibv_pd *pd;
struct ibv_cq *cq;
struct ibv_comp_channel *comp_channel;
pthread_t cq_poller_thread;
};
struct connection
{
struct rdma_cm_id *id;
struct ibv_qp *qp;
struct ibv_mr *recv_mr;
struct ibv_mr *send_mr;
char *recv_region;
char *send_region;
int num_completions;
};
static pthread_t msgThread;
static void die(const char *reason);
static void build_context(struct ibv_context *verbs);
static void build_qp_attr(struct ibv_qp_init_attr *qp_attr);
static void * poll_cq(void *);
static void post_receives(struct connection *conn);
static void register_memory(struct connection *conn);
static int on_addr_resolved(struct rdma_cm_id *id);
static void on_completion(struct ibv_wc *wc);
static int on_connection(void *context);
static int on_disconnect(struct rdma_cm_id *id);
static int on_event(struct rdma_cm_event *event);
static int on_route_resolved(struct rdma_cm_id *id);
static struct context *s_ctx = NULL;
#include <mutex> // std::mutex, std::unique_lock
#include <condition_variable> // std::condition_variable
std::mutex mtx;
std::condition_variable cv;
bool ok_to_send_next_message = 1;
bool message_available()
{
return 0 != ok_to_send_next_message;
}
int main(int argc, char **argv)
{
struct addrinfo *addr;
struct rdma_cm_event *event = NULL;
struct rdma_cm_id *conn= NULL;
struct rdma_event_channel *ec = NULL;
if (argc != 3)
die("usage: client <server-address> <server-port>");
TEST_NZ(getaddrinfo(argv[1], argv[2], NULL, &addr));
TEST_Z(ec = rdma_create_event_channel());
TEST_NZ(rdma_create_id(ec, &conn, NULL, RDMA_PS_TCP));
TEST_NZ(rdma_resolve_addr(conn, NULL, addr->ai_addr, TIMEOUT_IN_MS));
freeaddrinfo(addr);
while (0 == rdma_get_cm_event(ec, &event))
//while (rdma_get_cm_event(ec, &event))
{
std::cout << "rdma_get_cm_event\n";
struct rdma_cm_event event_copy;
memcpy(&event_copy, event, sizeof(*event));
rdma_ack_cm_event(event);
if (on_event(&event_copy))
break;
}
rdma_destroy_event_channel(ec);
return 0;
}
void die(const char *reason)
{
fprintf(stderr, "%s\n", reason);
exit(EXIT_FAILURE);
}
void build_context(struct ibv_context *verbs)
{
if (s_ctx)
{
if (s_ctx->ctx != verbs)
die("cannot handle events in more than one context.");
return;
}
s_ctx = (struct context *)malloc(sizeof(struct context));
s_ctx->ctx = verbs;
TEST_Z(s_ctx->pd = ibv_alloc_pd(s_ctx->ctx));
TEST_Z(s_ctx->comp_channel = ibv_create_comp_channel(s_ctx->ctx));
TEST_Z(s_ctx->cq = ibv_create_cq(s_ctx->ctx, 100, NULL, s_ctx->comp_channel, 0)); /* cqe=10 is arbitrary */
TEST_NZ(ibv_req_notify_cq(s_ctx->cq, 0));
TEST_NZ(pthread_create(&s_ctx->cq_poller_thread, NULL, poll_cq, NULL));
}
void *SendMessages(void *context)
{
static int loopcount = 0;
while(1)
{
std::unique_lock<std::mutex> lck(mtx);
cv.wait(lck, message_available);
//std::this_thread::sleep_for(std::chrono::microseconds(50));
ok_to_send_next_message = 0;
struct connection *conn = (struct connection *)context;
struct ibv_send_wr wr, *bad_wr = NULL;
struct ibv_sge sge;
std::cout << "looping send..." << loopcount << '\n' << std::flush;
memset(&wr, 0, sizeof(wr));
wr.wr_id = (uintptr_t)conn;
wr.opcode = IBV_WR_SEND;
wr.sg_list = &sge;
wr.num_sge = 1;
wr.send_flags = IBV_SEND_SIGNALED;
sge.addr = (uintptr_t)conn->send_region;
sge.length = BUFFER_SIZE;
sge.lkey = conn->send_mr->lkey;
snprintf(conn->send_region, BUFFER_SIZE, "message from active/client side with count %d", loopcount++);
TEST_NZ(ibv_post_send(conn->qp, &wr, &bad_wr));
}
}
void build_qp_attr(struct ibv_qp_init_attr *qp_attr)
{
std::cout << "build_qp_attr\n";
memset(qp_attr, 0, sizeof(*qp_attr));
qp_attr->send_cq = s_ctx->cq;
qp_attr->recv_cq = s_ctx->cq;
qp_attr->qp_type = IBV_QPT_RC;
qp_attr->cap.max_send_wr = 100;
qp_attr->cap.max_recv_wr = 100;
qp_attr->cap.max_send_sge = 1;
qp_attr->cap.max_recv_sge = 1;
}
void * poll_cq(void *ctx)
{
struct ibv_cq *cq;
struct ibv_wc wc;
while (1)
{
TEST_NZ(ibv_get_cq_event(s_ctx->comp_channel, &cq, &ctx));
ibv_ack_cq_events(cq, 1);
TEST_NZ(ibv_req_notify_cq(cq, 0));
int ne;
struct ibv_wc wc;
do
{
std::cout << "polling\n";
ne = ibv_poll_cq(cq, 1, &wc);
}
while(ne == 0);
on_completion(&wc);
//if (wc.opcode == IBV_WC_SEND)
if (wc.status == IBV_WC_SUCCESS)
{
{
ok_to_send_next_message = 1;
//while (message_available()) std::this_thread::yield();
//std::cout << "past yield\n";
std::unique_lock<std::mutex> lck(mtx);
cv.notify_one();
}
}
}
return NULL;
}
void post_receives(struct connection *conn)
{
std::cout << "post_receives\n";
struct ibv_recv_wr wr, *bad_wr = NULL;
struct ibv_sge sge;
wr.wr_id = (uintptr_t)conn;
wr.next = NULL;
wr.sg_list = &sge;
wr.num_sge = 1;
sge.addr = (uintptr_t)conn->recv_region;
sge.length = BUFFER_SIZE;
sge.lkey = conn->recv_mr->lkey;
TEST_NZ(ibv_post_recv(conn->qp, &wr, &bad_wr));
}
void register_memory(struct connection *conn)
{
std::cout << "register_memory\n";
conn->send_region = (char *)malloc(BUFFER_SIZE);
conn->recv_region = (char *)malloc(BUFFER_SIZE);
TEST_Z(conn->send_mr = ibv_reg_mr(
s_ctx->pd,
conn->send_region,
BUFFER_SIZE,
IBV_ACCESS_LOCAL_WRITE | IBV_ACCESS_REMOTE_WRITE));
TEST_Z(conn->recv_mr = ibv_reg_mr(
s_ctx->pd,
conn->recv_region,
BUFFER_SIZE,
IBV_ACCESS_LOCAL_WRITE | IBV_ACCESS_REMOTE_WRITE));
}
int on_addr_resolved(struct rdma_cm_id *id)
{
std::cout << "on_addr_resolved\n";
struct ibv_qp_init_attr qp_attr;
struct connection *conn;
build_context(id->verbs);
build_qp_attr(&qp_attr);
TEST_NZ(rdma_create_qp(id, s_ctx->pd, &qp_attr));
id->context = conn = (struct connection *)malloc(sizeof(struct connection));
conn->id = id;
conn->qp = id->qp;
conn->num_completions = 0;
register_memory(conn);
post_receives(conn);
TEST_NZ(rdma_resolve_route(id, TIMEOUT_IN_MS));
return 0;
}
void on_completion(struct ibv_wc *wc)
{
std::cout << "on_completion\n";
struct connection *conn = (struct connection *)(uintptr_t)wc->wr_id;
if (wc->status != IBV_WC_SUCCESS)
{
//die("\ton_completion: status is not IBV_WC_SUCCESS.");
printf("\ton_completion: status is not IBV_WC_SUCCESS.");
printf("\t it is %d ", wc->status);
}
printf("\n");
if (wc->opcode & IBV_WC_RECV)
printf("\treceived message: %s\n", conn->recv_region);
else if (wc->opcode == IBV_WC_SEND)
printf("\tsend completed successfully.\n");
else
die("\ton_completion: completion isn't a send or a receive.");
if (5 == ++conn->num_completions)
rdma_disconnect(conn->id);
}
int on_connection(void *context)
{
std::cout << "on_connection\n";
TEST_NZ(pthread_create(&msgThread, NULL, SendMessages, context));
return 0;
}
int on_disconnect(struct rdma_cm_id *id)
{
struct connection *conn = (struct connection *)id->context;
printf("disconnected.\n");
rdma_destroy_qp(id);
ibv_dereg_mr(conn->send_mr);
ibv_dereg_mr(conn->recv_mr);
free(conn->send_region);
free(conn->recv_region);
free(conn);
rdma_destroy_id(id);
return 1; /* exit event loop */
}
int on_route_resolved(struct rdma_cm_id *id)
{
struct rdma_conn_param cm_params;
printf("route resolved.\n");
memset(&cm_params, 0, sizeof(cm_params));
TEST_NZ(rdma_connect(id, &cm_params));
return 0;
}
int on_event(struct rdma_cm_event *event)
{
int r = 0;
if (event->event == RDMA_CM_EVENT_ADDR_RESOLVED)
r = on_addr_resolved(event->id);
else if (event->event == RDMA_CM_EVENT_ROUTE_RESOLVED)
r = on_route_resolved(event->id);
else if (event->event == RDMA_CM_EVENT_ESTABLISHED)
r = on_connection(event->id->context);
else if (event->event == RDMA_CM_EVENT_DISCONNECTED)
r = on_disconnect(event->id);
else
die("on_event: unknown event.");
return r;
}
Server:
#include <iostream>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <inttypes.h>
#include <rdma/rdma_cma.h>
#define TEST_NZ(x) do { if ( (x)) die("error: " #x " failed (returned non-zero)." ); } while (0)
#define TEST_Z(x) do { if (!(x)) die("error: " #x " failed (returned zero/null)."); } while (0)
const int BUFFER_SIZE = 2048;
struct context
{
struct ibv_context *ctx;
struct ibv_pd *pd;
struct ibv_cq *cq;
struct ibv_comp_channel *comp_channel;
pthread_t cq_poller_thread;
};
struct connection
{
struct ibv_qp *qp;
struct ibv_mr *recv_mr;
struct ibv_mr *send_mr;
char *recv_region;
char *send_region;
};
static void die(const char *reason);
static void build_context(struct ibv_context *verbs);
static void build_qp_attr(struct ibv_qp_init_attr *qp_attr);
static void * poll_cq(void *);
static void post_receives(struct connection *conn);
static void register_memory(struct connection *conn);
static void on_completion(struct ibv_wc *wc);
static int on_connect_request(struct rdma_cm_id *id);
static int on_connection(void *context);
static int on_disconnect(struct rdma_cm_id *id);
static int on_event(struct rdma_cm_event *event);
static struct context *s_ctx = NULL;
int main(int argc, char **argv)
{
struct sockaddr_in6 addr;
struct rdma_cm_event *event = NULL;
struct rdma_cm_id *listener = NULL;
struct rdma_event_channel *ec = NULL;
uint16_t port = 0;
memset(&addr, 0, sizeof(addr));
addr.sin6_family = AF_INET6;
TEST_Z(ec = rdma_create_event_channel());
TEST_NZ(rdma_create_id(ec, &listener, NULL, RDMA_PS_TCP));
TEST_NZ(rdma_bind_addr(listener, (struct sockaddr *)&addr));
TEST_NZ(rdma_listen(listener, 100)); /* backlog=10 is arbitrary */
//printf("[ %"PRIu32" ]\n", *addr.sin6_addr.s6_addr32);
port = ntohs(rdma_get_src_port(listener));
printf("listening on port %d.\n", port);
while (rdma_get_cm_event(ec, &event) == 0)
{
struct rdma_cm_event event_copy;
memcpy(&event_copy, event, sizeof(*event));
rdma_ack_cm_event(event);
if (on_event(&event_copy))
break;
}
rdma_destroy_id(listener);
rdma_destroy_event_channel(ec);
return 0;
}
void die(const char *reason)
{
fprintf(stderr, "%s\n", reason);
exit(EXIT_FAILURE);
}
void build_context(struct ibv_context *verbs)
{
if (s_ctx)
{
if (s_ctx->ctx != verbs)
die("cannot handle events in more than one context.");
return;
}
s_ctx = (struct context *)malloc(sizeof(struct context));
s_ctx->ctx = verbs;
TEST_Z(s_ctx->pd = ibv_alloc_pd(s_ctx->ctx));
TEST_Z(s_ctx->comp_channel = ibv_create_comp_channel(s_ctx->ctx));
TEST_Z(s_ctx->cq = ibv_create_cq(s_ctx->ctx, 100, NULL, s_ctx->comp_channel, 0)); /* cqe=10 is arbitrary */
TEST_NZ(ibv_req_notify_cq(s_ctx->cq, 0));
TEST_NZ(pthread_create(&s_ctx->cq_poller_thread, NULL, poll_cq, NULL));
}
void build_qp_attr(struct ibv_qp_init_attr *qp_attr)
{
memset(qp_attr, 0, sizeof(*qp_attr));
qp_attr->send_cq = s_ctx->cq;
qp_attr->recv_cq = s_ctx->cq;
qp_attr->qp_type = IBV_QPT_RC;
qp_attr->cap.max_send_wr = 100;
qp_attr->cap.max_recv_wr = 100;
qp_attr->cap.max_send_sge = 1;
qp_attr->cap.max_recv_sge = 1;
}
void * poll_cq(void *ctx)
{
struct ibv_cq *cq;
struct ibv_wc wc;
while (1)
{
TEST_NZ(ibv_get_cq_event(s_ctx->comp_channel, &cq, &ctx));
ibv_ack_cq_events(cq, 1);
TEST_NZ(ibv_req_notify_cq(cq, 0));
while (ibv_poll_cq(cq, 1, &wc))
{
std::cout << "polling\n";
on_completion(&wc);
}
}
return NULL;
}
void post_receives(struct connection *conn)
{
std::cout << "post_receives\n";
struct ibv_recv_wr wr, *bad_wr = NULL;
struct ibv_sge sge;
wr.wr_id = (uintptr_t)conn;
wr.next = NULL;
wr.sg_list = &sge;
wr.num_sge = 1;
sge.addr = (uintptr_t)conn->recv_region;
sge.length = BUFFER_SIZE;
sge.lkey = conn->recv_mr->lkey;
TEST_NZ(ibv_post_recv(conn->qp, &wr, &bad_wr));
}
void register_memory(struct connection *conn)
{
conn->send_region = (char *)malloc(BUFFER_SIZE);
conn->recv_region = (char *)malloc(BUFFER_SIZE);
TEST_Z(conn->send_mr = ibv_reg_mr(
s_ctx->pd,
conn->send_region,
BUFFER_SIZE,
IBV_ACCESS_LOCAL_WRITE | IBV_ACCESS_REMOTE_WRITE));
TEST_Z(conn->recv_mr = ibv_reg_mr(
s_ctx->pd,
conn->recv_region,
BUFFER_SIZE,
IBV_ACCESS_LOCAL_WRITE | IBV_ACCESS_REMOTE_WRITE));
}
void on_completion(struct ibv_wc *wc)
{
if (wc->status != IBV_WC_SUCCESS)
die("on_completion: status is not IBV_WC_SUCCESS.");
if (wc->opcode & IBV_WC_RECV)
{
struct connection *conn = (struct connection *)(uintptr_t)wc->wr_id;
post_receives(conn);
printf("received message: %s\n", conn->recv_region);
}
else if (wc->opcode == IBV_WC_SEND)
{
printf("send completed successfully.\n");
}
}
int on_connect_request(struct rdma_cm_id *id)
{
struct ibv_qp_init_attr qp_attr;
struct rdma_conn_param cm_params;
struct connection *conn;
printf("received connection request.\n");
build_context(id->verbs);
build_qp_attr(&qp_attr);
TEST_NZ(rdma_create_qp(id, s_ctx->pd, &qp_attr));
id->context = conn = (struct connection *)malloc(sizeof(struct connection));
conn->qp = id->qp;
register_memory(conn);
post_receives(conn);
memset(&cm_params, 0, sizeof(cm_params));
TEST_NZ(rdma_accept(id, &cm_params));
return 0;
}
int on_connection(void *context)
{
struct connection *conn = (struct connection *)context;
struct ibv_send_wr wr, *bad_wr = NULL;
struct ibv_sge sge;
snprintf(conn->send_region, BUFFER_SIZE, "message from passive/server side with pid %d", getpid());
printf("connected. posting send...\n");
memset(&wr, 0, sizeof(wr));
wr.opcode = IBV_WR_SEND;
wr.sg_list = &sge;
wr.num_sge = 1;
wr.send_flags = IBV_SEND_SIGNALED;
sge.addr = (uintptr_t)conn->send_region;
sge.length = BUFFER_SIZE;
sge.lkey = conn->send_mr->lkey;
TEST_NZ(ibv_post_send(conn->qp, &wr, &bad_wr));
return 0;
}
int on_disconnect(struct rdma_cm_id *id)
{
struct connection *conn = (struct connection *)id->context;
printf("peer disconnected.\n");
rdma_destroy_qp(id);
ibv_dereg_mr(conn->send_mr);
ibv_dereg_mr(conn->recv_mr);
free(conn->send_region);
free(conn->recv_region);
free(conn);
rdma_destroy_id(id);
return 0;
}
int on_event(struct rdma_cm_event *event)
{
std::cout << "on_event\n";
int r = 0;
if (event->event == RDMA_CM_EVENT_CONNECT_REQUEST)
r = on_connect_request(event->id);
else if (event->event == RDMA_CM_EVENT_ESTABLISHED)
r = on_connection(event->id->context);
else if (event->event == RDMA_CM_EVENT_DISCONNECTED)
r = on_disconnect(event->id);
else
die("on_event: unknown event.");
return r;
}
Here are a couple of runs. Totally random the number of message sent:
[idf#node1 Release]$ ./TGKITCClient 192.168.0.1 47819
rdma_get_cm_event
on_addr_resolved
build_qp_attr
register_memory
post_receives
rdma_get_cm_event
route resolved.
rdma_get_cm_event
on_connection
looping send...0
polling
on_completion
received message: message from passive/server side with pid 4188
polling
on_completion
send completed successfully.
looping send...1
polling
on_completion
send completed successfully.
^C
[idf#node1 Release]$
And then
[idf#node1 Release]$ ./TGKITCClient 192.168.0.1 55148
rdma_get_cm_event
on_addr_resolved
build_qp_attr
register_memory
post_receives
rdma_get_cm_event
route resolved.
rdma_get_cm_event
on_connection
looping send...0
polling
on_completion
received message: message from passive/server side with pid 4279
polling
on_completion
send completed successfully.
looping send...1
polling
on_completion
send completed successfully.
looping send...2
polling
on_completion
send completed successfully.
looping send...3
polling
on_completion
send completed successfully.
looping send...4
polling
on_completion
send completed successfully.
looping send...5
polling
on_completion
send completed successfully.
looping send...6
polling
on_completion
send completed successfully.
looping send...7
polling
on_completion
send completed successfully.
looping send...8
rdma_get_cm_event
disconnected.
polling
on_completion
send completed successfully.
on_completion: status is not IBV_WC_SUCCESS. it is 5 [idf#node1 Release]$
Here is the server side:
on_event
peer disconnected.
on_event
received connection request.
post_receives
on_event
connected. posting send...
polling
send completed successfully.
polling
post_receives
received message: message from active/client side with count 0
polling
post_receives
received message: message from active/client side with count 1
polling
post_receives
received message: message from active/client side with count 2
polling
post_receives
received message: message from active/client side with count 3
polling
post_receives
received message: message from active/client side with count 4
polling
post_receives
received message: message from active/client side with count 5
polling
post_receives
received message: message from active/client side with count 6
polling
post_receives
received message: message from active/client side with count 7
on_event
peer disconnected.
Make sure that the most recent drivers and firmware on the cards are installed. Beyond that, using the RDMA packages included with most OS distributions when trying to run IB is a dangerous game to play.
It is strongly recommended that for applications like these the Open Fabrics Enterprise Distribution should be used to provide openib, opensm and a variety of other useful infiniband related packages for analysis diagnostics and tuning of a network. The official OFED packages can be found on the OpenFabrics website.
Based on the question it looks like IPoIB is being used but the specific configuration is not mentioned. IPoIB is not necessarily the best way to take advantages of the hardware resources available in the IB cards.
In addition to those considerations making sure that the subnetmanager is setup and configured correctly. Some switches have built-in subnet managers that can be access and configured through a management interface, in other cases it might make more sense to run and configure the subnet manager on one of the nodes that you are using. OpenSM is a common subnet manager that is included with OFED distributions and there are many online guides available for setting up and configuring a subnet manager based on the type of network being setup.
OFED also includes a variety of IB testing and profiling tools. ibdiagnet is a useful tool for debugging IB network issues. And there are many guides available online that show different ways to make use of the tool as well as the other tools included in OFED.
Depending on the type of IB switch used there may additionally be some network management and diagnostic tools that would allow for further analysis of the network. The configuration of IB hardware and the low-level software that manages it is sometimes more critical to overall performance than the actual code being run. But with that being said recompiling and linking to relevant libraries from the correct version of OFED might be advisable if significant software of hardware configuration changes are made.
I'm trying to make a select-server in order to receive connection from several clients (all clients will connect to the same port).
The server accepts the first 2 clients, but unless one of them disconnects, it will not accept a new one.
I'm starting to listen the the server port like this:
listen(m_socketId, SOMAXCONN);
and using the select command like this:
int selected = select(m_maxSocketId + 1, &m_socketReadSet, NULL, NULL, 0);
I've added some code.
bool TcpServer::Start(char* ipAddress, int port)
{
m_active = true;
FD_ZERO(&m_socketMasterSet);
bool listening = m_socket->Listen(ipAddress, port);
// Start listening.
m_maxSocketId = m_socket->GetId();
FD_SET(m_maxSocketId, &m_socketMasterSet);
if (listening == true)
{
StartThread(&InvokeListening);
StartReceiving();
return true;
}
else
{
return false;
}
}
void TcpServer::Listen()
{
while (m_active == true)
{
m_socketReadSet = m_socketMasterSet;
int selected = select(m_maxSocketId + 1, &m_socketReadSet, NULL, NULL, 0);
if (selected <= 0)
continue;
bool accepted = Accept();
if (accepted == false)
{
ReceiveFromSockets();
}
}
}
bool TcpServer::Accept()
{
int listenerId = m_socket->GetId();
if (FD_ISSET(listenerId, &m_socketReadSet) == true)
{
struct sockaddr_in remoteAddr;
int addrSize = sizeof(remoteAddr);
unsigned int newSockId = accept(listenerId, (struct sockaddr *)&remoteAddr, &addrSize);
if (newSockId == -1) // Invalid socket...
{
return false;
}
if (newSockId > m_maxSocketId)
{
m_maxSocketId = newSockId;
}
m_clientUniqueId++;
// Remembering the new socket, so we'll be able to check its state
// the next time.
FD_SET(newSockId, &m_socketMasterSet);
CommEndPoint remote(remoteAddr);
CommEndPoint local = m_socket->GetLocalPoint();
ClientId* client = new ClientId(m_clientUniqueId, newSockId, local, remote);
m_clients.Add(client);
StoreNewlyAcceptedClient(client);
char acceptedMsg = CommInternalServerMsg::ConnectionAccepted;
Server::Send(CommMessageType::Internal, client, &acceptedMsg, sizeof(acceptedMsg));
return true;
}
return false;
}
I hope it's enough :)
what's wrong with it?
The by far most common error with select() is not re-initializing the fd sets on every iteration. The second, third, and forth arguments are updated by the call, so you have to populate them again every time.
Post more code, so people can actually help you.
Edit 0:
fd_set on Windows is a mess :)
It's not allowed to copy construct fd_set objects:
m_socketReadSet = m_socketMasterSet;
This combined with Nikolai's correct statement that select changes the set passed in probably accounts for your error.
poll (On Windows, WSAPoll) is a much friendlier API.
Windows also provides WSAEventSelect and (Msg)WaitForMultipleObjects(Ex), which doesn't have a direct equivalent on Unix, but allows you to wait on sockets, files, thread synchronization events, timers, and UI messages at the same time.
This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
Want to know the ESSID of wireless network via C++ in UBUNTU
Hello I have written the following code which is a part of a project. It is used to find the ESSID of the current associated network. But it has a flaw that it also the displays the ESSID of the network with which I am not associated i.e. if I try to associate myself with a wireless n/w and if it is unsuccessfull i.e. NO DHCP OFFERS ARE RECEIVED, then also it will display the that ESSID with which I have made my attempt.
Is it possible to find the BSSID of current associated wireless network as it is the only way with which I can mark b/w associated and non associated, e.g. with an ioctl call?
int main (void)
{
int errno;
struct iwreq wreq;
CStdString result = "None";
int sockfd;
char * id;
char ESSID[20];
memset(&wreq, 0, sizeof(struct iwreq));
if((sockfd = socket(AF_INET, SOCK_DGRAM, 0)) == -1) {
fprintf(stderr, "Cannot open socket \n");
fprintf(stderr, "errno = %d \n", errno);
fprintf(stderr, "Error description is : %s\n",strerror(errno));
return result ;
}
CLog::Log(LOGINFO,"Socket opened successfully");
FILE* fp = fopen("/proc/net/dev", "r");
if (!fp)
{
// TBD: Error
return result;
}
char* line = NULL;
size_t linel = 0;
int n;
char* p;
int linenum = 0;
while (getdelim(&line, &linel, '\n', fp) > 0)
{
// skip first two lines
if (linenum++ < 2)
continue;
p = line;
while (isspace(*p))
++p;
n = strcspn(p, ": \t");
p[n] = 0;
strcpy(wreq.ifr_name, p);
id = new char[IW_ESSID_MAX_SIZE+100];
wreq.u.essid.pointer = id;
wreq.u.essid.length = 100;
if ( ioctl(sockfd,SIOCGIWESSID, &wreq) == -1 ) {
continue;
}
else
{
strcpy(ESSID,id);
return ESSID;
}
free(id);
}
free(line);
fclose(fp);
return result;
}
Note: Since this question seems to be duplicated in two places, I'm repeating my answer here as well.
You didn't mention whether you were using an independent basic service set or not (i.e., an ad-hoc network with no controlling access point), so if you're not trying to create an ad-hoc network, then the BSSID should be the MAC address of the local access point. The ioctl() constant you can use to access that information is SIOCGIWAP. The ioctl payload information will be stored inside of your iwreq structure at u.ap_addr.sa_data.