get process inode using netlink - c++

I want to try and correlate an IP packet (using libpcap) to a process. I have had some limited success using the relevant /proc/net/ files but found that on some of the machines i'm using, this file can be many thousands of lines and parsing it is not efficient (caching has alleviated some performance problems).
I read that using sock_diag netlink subsystem could help by directly querying the kernel about the socket I am interested in. I've had limited success with my attempts but have hit a mental block on what is wrong.
For the initial query I have:
if (query_fd_) {
struct {
nlmsghdr nlh;
inet_diag_req_v2 id_req;
} req = {
.nlh = {
.nlmsg_len = sizeof(req),
.nlmsg_type = SOCK_DIAG_BY_FAMILY,
.nlmsg_flags = NLM_F_REQUEST | NLM_F_DUMP
},
.id_req = {
.sdiag_family = packet.l3_protocol,
.sdiag_protocol = packet.l4_protocol,
.idiag_ext = 0,
.pad = 0,
.idiag_states = -1,
.id = {
.idiag_sport = packet.src_port,
.idiag_dport = packet.dst_port
}
}
};
//packet ips are just binary data stored as strings!
memcpy(req.id_req.id.idiag_src, packet.src_ip.c_str(), 4);
memcpy(req.id_req.id.idiag_dst, packet.dst_ip.c_str(), 4);
struct sockaddr_nl nladdr = {
.nl_family = AF_NETLINK
};
struct iovec iov = {
.iov_base = &req,
.iov_len = sizeof(req)
};
struct msghdr msg = {
.msg_name = (void *) &nladdr,
.msg_namelen = sizeof(nladdr),
.msg_iov = &iov,
.msg_iovlen = 1
};
// Send message to kernel
for (;;) {
if (sendmsg(query_fd_, &msg, 0) < 0) {
if (errno == EINTR)
continue;
perror("sendmsg");
return false;
}
return true;
}
}
return false;
For the receive code I have:
long buffer[8192];
struct sockaddr_nl nladdr = {
.nl_family = AF_NETLINK
};
struct iovec iov = {
.iov_base = buffer,
.iov_len = sizeof(buffer)
};
struct msghdr msg = {
.msg_name = (void *) &nladdr,
.msg_namelen = sizeof(nladdr),
.msg_iov = &iov,
.msg_iovlen = 1
};
int flags = 0;
for (;;) {
ssize_t rv = recvmsg(query_fd_, &msg, flags);
// error handling
if (rv < 0) {
if (errno == EINTR)
continue;
if ((errno == EAGAIN) || (errno == EWOULDBLOCK))
break;
perror("Failed to recv from netlink socket");
return 0;
}
if (rv == 0) {
printf("Unexpected shutdown of NETLINK socket");
return 0;
}
for (const struct nlmsghdr* header = reinterpret_cast<const struct nlmsghdr*>(buffer);
rv >= 0 && NLMSG_OK(header, static_cast<uint32_t>(rv));
header = NLMSG_NEXT(header, rv)) {
// The end of multipart message
if (header->nlmsg_type == NLMSG_DONE)
return 0;
if (header->nlmsg_type == NLMSG_ERROR) {
const struct nlmsgerr *err = reinterpret_cast<nlmsgerr*>(NLMSG_DATA(header));
if (err == NULL)
return 100;
errno = -err->error;
perror("NLMSG_ERROR");
return 0;
}
if (header->nlmsg_type != SOCK_DIAG_BY_FAMILY) {
printf("unexpected nlmsg_type %u\n", (unsigned)header->nlmsg_type);
continue;
}
// Get the details....
const struct inet_diag_msg* diag = reinterpret_cast<inet_diag_msg*>(NLMSG_DATA(header));
if (header->nlmsg_len < NLMSG_LENGTH(sizeof(*diag))) {
printf("Message too short %d vs %d\n", header->nlmsg_len, NLMSG_LENGTH(sizeof(*diag)));
return 0;
}
if (diag->idiag_family != PF_INET) {
printf("unexpected family %u\n", diag->idiag_family);
return 1;
}
return diag->idiag_inode;
The Problem:
The diag->udiag_inode value doesn't match the one I see in netstat output or in the /proc/net/ files. Is it supposed too? If not, is it possible to use this approach to retrieve the inode number for the process so that I can then query /proc for the corresponding PID?
Another thing I didn't quite understand is the NLMSG_DONE when checking the nlmsg_type in the header. What I am seeing:
1 - TCP 10.0.9.15:51002 -> 192.168.64.11:3128 [15047]
2 - TCP 192.168.64.11:3128 -> 10.0.9.15:51002 [0]
3 - TCP 10.0.9.15:51002 -> 192.168.64.11:3128 [0]
4 - TCP 192.168.64.11:3128 -> 10.0.9.15:51002 [15047]
5 - TCP 10.0.9.15:51002 -> 192.168.64.11:3128 [0]
6 - TCP 192.168.64.11:3128 -> 10.0.9.15:51002 [0]
7 - TCP 10.0.9.15:51002 -> 192.168.64.11:3128 [15047]
So I get an inode number on first query, then some NLMSG_DONE returns (stepping through code confirmed this was the path). Why don't I get the same result for say lines 1 and 3?
Appreciate any help or advice.

Found the answer and posting in case anyone stumbles across it:
I had a uint16_t as the return type from the recv code when in fact it should have been ino_t or uint32_t. I discovered this when I noticed that a few of the inodes matched correctly after a fresh reboot and then after a while stopped matching with no code changed (inode count obviously incrementing). Using the correct type in the function return sorted the problem (so the code I posted is actually correct!)
I was getting multi part messages. I should have looped whilst NLM_F_MULTI was set in the flags and then left the loop when receiving NLMSG_DONE.

Related

Two way communication using sockets

I am trying to implement two way communication using sockets and not quite sure where I'm going wrong. I have an application that launches a child application, the child application then tries to communicate with the application that launched it, but I am not getting anything.
In the application that launches the child:
int clsSocketThread::initialiseSocket(bool blnIsModule, QString strPurpose) {
const char* cpszLocalHost = "localhost";
//Get the socket
int intSocket = socket(AF_INET, SOCK_STREAM, 0);
if ( intSocket == 0 ) {
clsDebugService::exitWhenDebugQueueEmpty("Failed to create socket!");
}
struct hostent* pHostEntry = gethostbyname(cpszLocalHost);
if ( pHostEntry == nullptr ) {
clsDebugService::exitWhenDebugQueueEmpty("Unable to resolve ip address!");
}
//Initliase and get address of localhost
struct sockaddr_in srvAddr;
bzero((char*)&srvAddr, sizeof(srvAddr));
//Set-up server address
memcpy(&srvAddr.sin_addr, pHostEntry->h_addr_list[0], pHostEntry->h_length);
srvAddr.sin_family = AF_INET;
srvAddr.sin_port = htons(clsSocketThread::mscuint16Port);
char* pszIP = inet_ntoa(srvAddr.sin_addr);
if ( pszIP != nullptr ) {
qdbg() << "Setting up socket on ip: " << pszIP
<< ", port: " << clsSocketThread::mscuint16Port
<< ((strPurpose.isEmpty() == true) ? "" : strPurpose);
}
socklen_t tSvrAddr = sizeof(srvAddr);
int intRC;
#if !defined(STANDALONE)
if ( blnIsModule == true ) {
intRC = inet_pton(srvAddr.sin_family, pszIP, &srvAddr.sin_addr);
if ( intRC <= 0 ) {
clsDebugService::exitWhenDebugQueueEmpty("Invalid address not supported!");
}
intRC = ::connect(intSocket, (const struct sockaddr*)&srvAddr, tSvrAddr);
} else
#endif
{
intRC = bind(intSocket, (const struct sockaddr*)&srvAddr, tSvrAddr);
}
if ( intRC < 0 ) {
clsDebugService::exitWhenDebugQueueEmpty("Socket operation failed!");
}
if ( blnIsModule != true && listen(intSocket, 5) < 0 ) {
clsDebugService::exitWhenDebugQueueEmpty("Cannot listen to socket!");
}
return intSocket;
}
This function is used by both the launcher and the child, when the child calls it the first parameter is true. I've run both in debuggers and all the function calls are successful and there are no errors.
In the launching application I have a thread:
void clsSocketThread::serverSocketBody() {
if ( mintSocket == 0 ) {
mintSocket = clsSocketThread::initialiseSocket();
}
QByteArray qarybytBuffer;
char arycBuffer[2048];
int intNewSocket = 0;
size_t tBufferSize = sizeof(arycBuffer);
QJsonObject objJSON;
while( mpThread != nullptr ) {
if ( intNewSocket <= 0 ) {
struct sockaddr_in cliAddr;
socklen_t tCliLen = sizeof(cliAddr);
intNewSocket = accept(mintSocket, (struct sockaddr*)&cliAddr, &tCliLen);
if ( intNewSocket < 0 ) {
continue;
}
}
//Read from the other socket!
bzero(arycBuffer, tBufferSize);
ssize_t tRead = read(intNewSocket, arycBuffer, tBufferSize);
if ( tRead <= 0 ) {
continue;
}
qarybytBuffer = QByteArray(arycBuffer, (int)tRead);
int intIdx = qarybytBuffer.indexOf(clsJSON::msccOpenCurlyBracket)
,intIdx2 = qarybytBuffer.lastIndexOf(clsJSON::msccCloseCurlyBracket);
if ( intIdx >= 0 && intIdx2 > intIdx ) {
qarybytBuffer = qarybytBuffer.mid(intIdx, intIdx2 - intIdx + 1).trimmed();
QJsonObject objJSON(clsJSON(&qarybytBuffer).toQJsonObject());
if ( objJSON.contains(clsJSON::mscszMsgType) == true ) {
qdbg() << "[RX]Data: " << arycBuffer;//HACK
clsJSON::blnDecodeAccordingToType(&objJSON);
}
}
}
}
I'm not receiving any messages from the child. Both applications are set-up to communicate on localhost:8123
This is operating system specific.
For Linux, read Advanced Linux Programming then syscalls(2), socket(7), unix(7), fifo(7), pipe(7)
With Qt, consider using QSocketNotifier in your main thread.
You might also want to use POCO, ONCRPC, JSONRPC, Wt (perhaps with libcurl) or libonion.
You could get some inspiration by studying the C++ source code of Qt, of POCO, of VMIME.
Be aware that in many (but not all) cases, a single send(2) -or write(2)- on emitter side may correspond to several recv(2) -or read(2)- on the receiving side (and vice versa), at least with TCP on different machines. So you need some event loop (often around poll(2)...) and documented conventions on application-level message formats. Then SMTP or HTTP could be inspirational (and in some cases, useful).

Xcode app for macOS. This is how I setup to get audio from usb mic input. Worked a year ago, now doesn't. Why

Here is my audio init code. My app responds when queue buffers are ready, but all data in buffer is zero. Checking sound in system preferences shows that USB Audio CODEC in sound input dialog is active. AudioInit() is called right after app launches.
{
#pragma mark user data struct
typedef struct MyRecorder
{
AudioFileID recordFile;
SInt64 recordPacket;
Float32 *pSampledData;
MorseDecode *pMorseDecoder;
} MyRecorder;
#pragma mark utility functions
void CheckError(OSStatus error, const char *operation)
{
if(error == noErr) return;
char errorString[20];
// see if it appears to be a 4 char code
*(UInt32*)(errorString + 1) = CFSwapInt32HostToBig(error);
if (isprint(errorString[1]) && isprint(errorString[2]) &&
isprint(errorString[3]) && isprint(errorString[4]))
{
errorString[0] = errorString[5] = '\'';
errorString[6] = '\0';
}
else
{
sprintf(errorString, "%d", (int)error);
}
fprintf(stderr, "Error: %s (%s)\n", operation, errorString);
}
OSStatus MyGetDefaultInputDeviceSampleRate(Float64 *outSampleRate)
{
OSStatus error;
AudioDeviceID deviceID = 0;
AudioObjectPropertyAddress propertyAddress;
UInt32 propertySize;
propertyAddress.mSelector = kAudioHardwarePropertyDefaultInputDevice;
propertyAddress.mScope = kAudioObjectPropertyScopeGlobal;
propertyAddress.mElement = 0;
propertySize = sizeof(AudioDeviceID);
error = AudioObjectGetPropertyData(kAudioObjectSystemObject,
&propertyAddress,
0,
NULL,
&propertySize,
&deviceID);
if(error)
return error;
propertyAddress.mSelector = kAudioDevicePropertyNominalSampleRate;
propertyAddress.mScope = kAudioObjectPropertyScopeGlobal;
propertyAddress.mElement = 0;
propertySize = sizeof(Float64);
error = AudioObjectGetPropertyData(deviceID,
&propertyAddress,
0,
NULL,
&propertySize,
outSampleRate);
return error;
}
static int MyComputeRecordBufferSize(const AudioStreamBasicDescription *format,
AudioQueueRef queue,
float seconds)
{
int packets, frames, bytes;
frames = (int)ceil(seconds * format->mSampleRate);
if(format->mBytesPerFrame > 0)
{
bytes = frames * format->mBytesPerFrame;
}
else
{
UInt32 maxPacketSize;
if(format->mBytesPerPacket > 0)
{
// constant packet size
maxPacketSize = format->mBytesPerPacket;
}
else
{
// get the largest single packet size possible
UInt32 propertySize = sizeof(maxPacketSize);
CheckError(AudioQueueGetProperty(queue,
kAudioConverterPropertyMaximumOutputPacketSize,
&maxPacketSize,
&propertySize),
"Couldn't get queues max output packet size");
}
if(format->mFramesPerPacket > 0)
packets = frames / format->mFramesPerPacket;
else
// worst case scenario: 1 frame in a packet
packets = frames;
// sanity check
if(packets == 0)
packets = 1;
bytes = packets * maxPacketSize;
}
return bytes;
}
extern void bridgeToMainThread(MorseDecode *pDecode);
static int callBacks = 0;
// ---------------------------------------------
static void MyAQInputCallback(void *inUserData,
AudioQueueRef inQueue,
AudioQueueBufferRef inBuffer,
const AudioTimeStamp *inStartTime,
UInt32 inNumPackets,
const AudioStreamPacketDescription *inPacketDesc)
{
MyRecorder *recorder = (MyRecorder*)inUserData;
Float32 *pAudioData = (Float32*)(inBuffer->mAudioData);
recorder->pMorseDecoder->pBuffer = pAudioData;
recorder->pMorseDecoder->bufferSize = inNumPackets;
bridgeToMainThread(recorder->pMorseDecoder);
CheckError(AudioQueueEnqueueBuffer(inQueue,
inBuffer,
0,
NULL),
"AudioQueueEnqueueBuffer failed");
printf("packets = %ld, bytes = %ld\n",(long)inNumPackets,(long)inBuffer->mAudioDataByteSize);
callBacks++;
//printf("\ncallBacks = %d\n",callBacks);
//if(callBacks == 0)
//audioStop();
}
static AudioQueueRef queue = {0};
static MyRecorder recorder = {0};
static AudioStreamBasicDescription recordFormat;
void audioInit()
{
// set up format
memset(&recordFormat,0,sizeof(recordFormat));
recordFormat.mFormatID = kAudioFormatLinearPCM;
recordFormat.mChannelsPerFrame = 2;
recordFormat.mBitsPerChannel = 32;
recordFormat.mBytesPerPacket = recordFormat.mBytesPerFrame = recordFormat.mChannelsPerFrame * sizeof(Float32);
recordFormat.mFramesPerPacket = 1;
//recordFormat.mFormatFlags = kAudioFormatFlagsCanonical;
recordFormat.mFormatFlags = kAudioFormatFlagsNativeFloatPacked;
MyGetDefaultInputDeviceSampleRate(&recordFormat.mSampleRate);
UInt32 propSize = sizeof(recordFormat);
CheckError(AudioFormatGetProperty(kAudioFormatProperty_FormatInfo,
0,
NULL,
&propSize,
&recordFormat),
"AudioFormatProperty failed");
recorder.pMorseDecoder = MorseDecode::pInstance();
recorder.pMorseDecoder->m_sampleRate = recordFormat.mSampleRate;
// recorder.pMorseDecoder->setCircularBuffer();
//set up queue
CheckError(AudioQueueNewInput(&recordFormat,
MyAQInputCallback,
&recorder,
NULL,
kCFRunLoopCommonModes,
0,
&queue),
"AudioQueueNewInput failed");
UInt32 size = sizeof(recordFormat);
CheckError(AudioQueueGetProperty(queue,
kAudioConverterCurrentOutputStreamDescription,
&recordFormat,
&size), "Couldn't get queue's format");
// set up buffers and enqueue
const int kNumberRecordBuffers = 3;
int bufferByteSize = MyComputeRecordBufferSize(&recordFormat, queue, AUDIO_BUFFER_DURATION);
for(int bufferIndex = 0; bufferIndex < kNumberRecordBuffers; bufferIndex++)
{
AudioQueueBufferRef buffer;
CheckError(AudioQueueAllocateBuffer(queue,
bufferByteSize,
&buffer),
"AudioQueueAllocateBuffer failed");
CheckError(AudioQueueEnqueueBuffer(queue,
buffer,
0,
NULL),
"AudioQueueEnqueueBuffer failed");
}
}
void audioRun()
{
CheckError(AudioQueueStart(queue, NULL), "AudioQueueStart failed");
}
void audioStop()
{
CheckError(AudioQueuePause(queue), "AudioQueuePause failed");
}
}
This sounds like the new macOS 'microphone privacy' setting, which, if set to 'no access' for your app, will cause precisely this behaviour. So:
Open the System Preferences pane.
Click on 'Security and Privacy'.
Select the Privacy tab.
Click on 'Microphone' in the left-hand pane.
Locate your app in the right-hand pane and tick the checkbox next to it.
Then restart your app and test it.
Tedious, no?
Edit: As stated in the comments, you can't directly request microphone access, but you can detect whether it has been granted to your app or not by calling [AVCaptureDevice authorizationStatusForMediaType: AVMediaTypeAudio].

Recv function for TCP Socket programming

I am new in Socket Programming. I am trying to create a client application. The server is a camera which communicates using TCP. The camera is sending continuous data. Using Wireshark, I can see that the camera is sending continuous packets of different sizes, but not more than 1514 bytes. But my recv function is always returning 2000 which is the size of my buffer.
unsigned char buf[2000];
int bytesIn = recv(sock, (char*)buf, sizeof(buf) , 0);
if (bytesIn > 0)
{
std::cout << bytesIn << std::endl;
}
The first packet I receive is of size 9 bytes, which recv returns correct, but after that it always returns 2000.
Can anyone please tell me the solution so that I can get the correct size of the actual data payload?
EDIT
int bytesIn = recv(sock, (char*)buf, sizeof(buf) , 0);
if (bytesIn > 0)
{
while (bytes != 1514)
{
if (count == 221184)
{
break;
}
buffer[count++] = buf[bytes++];
}
std::cout << count;
}
EDIT:
Here is my Wireshark capture:
My Code to handle packets
int bytesIn = recv(sock, (char*)&buf, sizeof(buf) , 0);
if (bytesIn > 0)
{
if (flag1 == true)
{
while ((bytes != 1460 && (buf[bytes] != 0)) && _fillFlag)
{
buffer[fill++] = buf[bytes++];
if (fill == 221184)
{
flag1 = false;
_fillFlag = false;
fill = 0;
queue.Enqueue(buffer, sizeof(buffer));
break;
}
}
}
if ((strncmp(buf, _string2, 10) == 0))
{
flag1 = true;
}
}
For each frame camera is sending 221184 bytes and after each frame it sends a packet of data 9 bytes which I used to compare this 9 bytes are constant.
This 221184 bytes send by camera doesn't have 0 so I use this condition in while loop. This code is working and showing the frame but after few frame it shows fully black frame. I think the mistake is in receiving the packet.
Size of per frame is : 221184 (fixed)
Size of per recv is : 0 ~ 1514
My implementation here :
DWORD MakeFrame(int socket)
{
INT nFrameSize = 221184;
INT nSizeToRecv = 221184;
INT nRecvSize = 2000;
INT nReceived = 0;
INT nTotalReceived = 0;
BYTE byCamera[2000] = { 0 }; // byCamera size = nRecvSize
BYTE byFrame[221184] = { 0 }; // byFrame size = nFrameSize
while(0 != nSizeToRecv)
{
nRecvSize = min(2000, nSizeToRecv);
nReceived = recv(socket, (char*)byCamera, nRecvSize, 0);
memcpy_s(byFrame + nTotalReceived, nFrameSize, byCamera, nReceived);
nSizeToRecv -= nReceived;
nTotalReceived += nReceived;
}
// byFrame is ready to use.
// ...
// ...
return WSAGetLastError();
}
The first packet I receive is of size 9 bytes which it print correct after that it always print 2000. So can anyone please tell me the solution that I only get the size of actual data payload.
TCP is no packet-oriented, but a stream-oriented transport protocol. There is no notion of packets in TCP (apart maybe from a MTU). If you want to work in packets, you have to either use UDP (which is in fact packet-oriented, but by default not reliable concerning order, discarding and alike) or you have to implement your packet logic in TCP, i.e. reading from a stream and partition the data into logical packets once received.

Detect USB hardware keylogger

I need to determine is there hardware keylogger that was plugged to PC with USB keyboard. It needs to be done via software method, from user-land. However wiki says that it is impossible to detect HKL using soft, there are several methods exists. The best and I think only one overiew that present in net relating that theme is "Detecting Hardware Keyloggers, by Fabian Mihailowitsch - youtube".
Using this overview I am developing a tool to detect USB hardware keyloggers. The sources for detecting PS/2 keyloggers was already shared by author and available here. So my task is to make it worked for USB only.
As suggested I am using libusb library to interfere with USB devices in system.
So, there are methods I had choosen in order to detect HKL:
Find USB keyboard that bugged by HKL. Note that HKL is usually
invisible from device list in system or returned by libusb.
Detect Keyghost HKL by: Interrupt read from USB HID device, send usb reset (libusb_reset_device), read interrupt again. If data returned on last read is not nulls then keylogger detected. It is described on page 45 of Mihailowitsch's presentation
Time measurement. The idea is measure time of send/receive packets using control transfer for original keyboard for thousands times. In case HKL has been plugged, program will measure time again and then compare the time with the original value. For HKL it have to be much(or not so much) greater.
Algorithm is:
Send an output report to Keyboard(as Control transfer) (HID_REPORT_TYPE_OUTPUT 0x02 )
Wait for ACKed packet
Repeat Loop (10.000 times)
Measure time
Below is my code according to steps of detection.
1. Find USB keyboard
libusb_device * UsbKeyboard::GetSpecifiedDevice(PredicateType pred)
{
if (_usbDevices == nullptr) return nullptr;
int i = 0;
libusb_device *dev = nullptr;
while ((dev = _usbDevices[i++]) != NULL)
{
struct libusb_device_descriptor desc;
int r = libusb_get_device_descriptor(dev, &desc);
if (r >= 0)
{
if (pred(desc))
return dev;
}
}
return nullptr;
}
libusb_device * UsbKeyboard::FindKeyboard()
{
return GetSpecifiedDevice([&](libusb_device_descriptor &desc) {
bool isKeyboard = false;
auto dev_handle = libusb_open_device_with_vid_pid(_context, desc.idVendor, desc.idProduct);
if (dev_handle != nullptr)
{
unsigned char buf[255] = "";
// product description contains 'Keyboard', usually string is 'USB Keyboard'
if (libusb_get_string_descriptor_ascii(dev_handle, desc.iProduct, buf, sizeof(buf)) >= 0)
isKeyboard = strstr((char*)buf, "Keyboard") != nullptr;
libusb_close(dev_handle);
}
return isKeyboard;
});
}
Here we're iterating through all USB devices in system and checks their Product string. In my system this string for keyboard is 'USB keyboard' (obviously).
Is it stable way to detect keyboard through Product string? Is there other ways?
2. Detect Keyghost HKL using Interrupt read
int UsbKeyboard::DetectKeyghost(libusb_device *kbdev)
{
int r, i;
int transferred;
unsigned char answer[PACKET_INT_LEN];
unsigned char question[PACKET_INT_LEN];
for (i = 0; i < PACKET_INT_LEN; i++) question[i] = 0x40 + i;
libusb_device_handle *devh = nullptr;
if ((r = libusb_open(kbdev, &devh)) < 0)
{
ShowError("Error open device", r);
return r;
}
r = libusb_set_configuration(devh, 1);
if (r < 0)
{
ShowError("libusb_set_configuration error ", r);
goto out;
}
printf("Successfully set usb configuration 1\n");
r = libusb_claim_interface(devh, 0);
if (r < 0)
{
ShowError("libusb_claim_interface error ", r);
goto out;
}
r = libusb_interrupt_transfer(devh, 0x81 , answer, PACKET_INT_LEN,
&transferred, TIMEOUT);
if (r < 0)
{
ShowError("Interrupt read error ", r);
goto out;
}
if (transferred < PACKET_INT_LEN)
{
ShowError("Interrupt transfer short read %", r);
goto out;
}
for (i = 0; i < PACKET_INT_LEN; i++) {
if (i % 8 == 0)
printf("\n");
printf("%02x, %02x; ", question[i], answer[i]);
}
printf("\n");
out:
libusb_close(devh);
return 0;
}
I've got such error on libusb_interrupt_transfer:
libusb: error [hid_submit_bulk_transfer] HID transfer failed: [5] Access denied
Interrupt read error - Input/Output Error (LIBUSB_ERROR_IO) (GetLastError() - 1168)
No clue why 'access denied', then IO error, and GetLastError() returns 1168, which means - Element not found (What element?). Looking for help here.
Time measurement. Send output report and wait for ACK packet.
int UsbKeyboard::SendOutputReport(libusb_device *kbdev)
{
const int PACKET_INT_LEN = 1;
int r, i;
unsigned char answer[PACKET_INT_LEN];
unsigned char question[PACKET_INT_LEN];
for (i = 0; i < PACKET_INT_LEN; i++) question[i] = 0x30 + i;
for (i = 1; i < PACKET_INT_LEN; i++) answer[i] = 0;
libusb_device_handle *devh = nullptr;
if ((r = libusb_open(kbdev, &devh)) < 0)
{
ShowError("Error open device", r);
return r;
}
r = libusb_set_configuration(devh, 1);
if (r < 0)
{
ShowError("libusb_set_configuration error ", r);
goto out;
}
printf("Successfully set usb configuration 1\n");
r = libusb_claim_interface(devh, 0);
if (r < 0)
{
ShowError("libusb_claim_interface error ", r);
goto out;
}
printf("Successfully claim interface\n");
r = libusb_control_transfer(devh, CTRL_OUT, HID_SET_REPORT, (HID_REPORT_TYPE_OUTPUT << 8) | 0x00, 0, question, PACKET_INT_LEN, TIMEOUT);
if (r < 0) {
ShowError("Control Out error ", r);
goto out;
}
r = libusb_control_transfer(devh, CTRL_IN, HID_GET_REPORT, (HID_REPORT_TYPE_INPUT << 8) | 0x00, 0, answer, PACKET_INT_LEN, TIMEOUT);
if (r < 0) {
ShowError("Control In error ", r);
goto out;
}
out:
libusb_close(devh);
return 0;
}
Error the same as for read interrupt:
Control Out error - Input/Output Error (LIBUSB_ERROR_IO) (GetLastError() - 1168
)
How to fix please? Also how to wait for ACK packet?
Thank you.
UPDATE:
I've spent a day on searching and debbuging. So currently my problem is only to
send Output report via libusb_control_transfer. The 2nd method with interrupt read is unnecessary to implement because of Windows denies access to read from USB device using ReadFile.
It is only libusb stuff left, here is the code I wanted to make work (from 3rd example):
// sending Output report (LED)
// ...
unsigned char buf[65];
buf[0] = 1; // First byte is report number
buf[1] = 0x80;
r = libusb_control_transfer(devh, CTRL_OUT,
HID_SET_REPORT/*0x9*/, (HID_REPORT_TYPE_OUTPUT/*0x2*/ << 8) | 0x00,
0, buf, (uint16_t)2, 1000);
...
The error I've got:
[ 0.309018] [00001c0c] libusb: debug [_hid_set_report] Failed to Write HID Output Report: [1] Incorrect function
Control Out error - Input/Output Error (LIBUSB_ERROR_IO) (GetLastError() - 1168)
This error occures right after DeviceIoControl call in libusb internals.
What means "Incorrect function" there?

Communication between client and server is erratic

I modified thegeekinthecorner examples to be able to continuously send data.
I am using g++4.9.2.
I tried uninstalling the oficial latest OFED from here http://downloads.openfabrics.org/OFED/
OFED Distribution Software Installation Menu
1) View OFED Installation Guide
2) Install OFED Software
3) Show Installed Software
4) Configure IPoIB
5) Uninstall OFED Software
Q) Exit
Select Option [1-5]:5
Uninstalling the previous version of OFED
Running rpm -e --allmatches libibverbs libibverbs-devel libibverbs-utils libmthca libmlx4 libcxgb3 libnes libipathverbs libibcm libibumad libibumad-devel libibmad ibacm librdmacm librdmacm-utils librdmacm-devel opensm opensm-libs dapl perftest mstflint ibutils infiniband-diags qperf infinipath-psm opensm opensm-libs libipathverbs dapl libibcm libibmad libibumad libibumad-devel libibverbs libibverbs-devel libibverbs-utils libipathverbs libmthca libmlx4 librdmacm librdmacm-devel librdmacm-utils ibacm ibutils ibutils-libs libnes infinipath-psm
Failed to uninstall the previous installation
See /tmp/OFED.22320.logs/ofed_uninstall.log
[idf#node1 OFED-1.5.4-20110726-0732]$
[idf#node1 OFED-1.5.4-20110726-0732]$
If instead I just try to install it, I get this:
OFED Distribution Software Installation Menu
1) Basic (OFED modules and basic user level libraries)
2) HPC (OFED modules and libraries, MPI and diagnostic tools)
3) All packages (all of Basic, HPC)
4) Customize
Q) Exit
Select Option [1-4]:3
Please choose an implementation of MVAPICH2:
1) OFA (IB and iWARP)
2) uDAPL
Implementation [1]: 1
Enable ROMIO support [Y/n]:
Enable shared library support [Y/n]:
Enable Checkpoint-Restart support [y/N]:
Kernel 3.10.0-229.7.2.el7.x86_64 is not supported.
For the list of Supported Platforms and Operating Systems see
/mnt/gluster/Downloads/OFED-1.5.4-20110726-0732/docs/OFED_release_notes.txt
[idf#node1 OFED-1.5.4-20110726-0732]$
[idf#node2 Release]$ lspci | grep -i mel
02:00.0 InfiniBand: Mellanox Technologies MT26428 [ConnectX VPI PCIe 2.0 5GT/s - IB QDR / 10GigE] (rev b0)
[idf#node2 Release]$
[idf#node1 Release]$ ibv_devinfo
hca_id: mlx4_0
transport: InfiniBand (0)
fw_ver: 2.7.200
node_guid: 0025:90ff:ff1a:081c
sys_image_guid: 0025:90ff:ff1a:081f
vendor_id: 0x02c9
vendor_part_id: 26428
hw_ver: 0xB0
board_id: SM_2092000001000
phys_port_cnt: 1
port: 1
state: PORT_ACTIVE (4)
max_mtu: 4096 (5)
active_mtu: 4096 (5)
sm_lid: 1
port_lid: 2
port_lmc: 0x00
link_layer: InfiniBand
[idf#node1 Release]$ ifconfig -a
ib0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 2044
inet 192.168.0.1 netmask 255.255.255.0 broadcast 192.168.0.255
inet6 fe80::225:90ff:ff1a:71 prefixlen 64 scopeid 0x20<link>
Infiniband hardware address can be incorrect! Please read BUGS section in ifconfig(8).
infiniband 80:00:00:48:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00 txqueuelen 256 (InfiniBand)
RX packets 5 bytes 280 (280.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 27 overruns 0 carrier 0 collisions 0
Below is the client and server. When I run this programs, the clients will send messages, but the number of messages it sends is erratic, error messages are often
Client:
#include <iostream>
#include <thread>
#include <netdb.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <rdma/rdma_cma.h>
#define TEST_NZ(x) do { if ( (x)) die("error: " #x " failed (returned non-zero)." ); } while (0)
#define TEST_Z(x) do { if (!(x)) die("error: " #x " failed (returned zero/null)."); } while (0)
const int BUFFER_SIZE = 2048;
const int TIMEOUT_IN_MS = 500; /* ms */
struct context
{
struct ibv_context *ctx;
struct ibv_pd *pd;
struct ibv_cq *cq;
struct ibv_comp_channel *comp_channel;
pthread_t cq_poller_thread;
};
struct connection
{
struct rdma_cm_id *id;
struct ibv_qp *qp;
struct ibv_mr *recv_mr;
struct ibv_mr *send_mr;
char *recv_region;
char *send_region;
int num_completions;
};
static pthread_t msgThread;
static void die(const char *reason);
static void build_context(struct ibv_context *verbs);
static void build_qp_attr(struct ibv_qp_init_attr *qp_attr);
static void * poll_cq(void *);
static void post_receives(struct connection *conn);
static void register_memory(struct connection *conn);
static int on_addr_resolved(struct rdma_cm_id *id);
static void on_completion(struct ibv_wc *wc);
static int on_connection(void *context);
static int on_disconnect(struct rdma_cm_id *id);
static int on_event(struct rdma_cm_event *event);
static int on_route_resolved(struct rdma_cm_id *id);
static struct context *s_ctx = NULL;
#include <mutex> // std::mutex, std::unique_lock
#include <condition_variable> // std::condition_variable
std::mutex mtx;
std::condition_variable cv;
bool ok_to_send_next_message = 1;
bool message_available()
{
return 0 != ok_to_send_next_message;
}
int main(int argc, char **argv)
{
struct addrinfo *addr;
struct rdma_cm_event *event = NULL;
struct rdma_cm_id *conn= NULL;
struct rdma_event_channel *ec = NULL;
if (argc != 3)
die("usage: client <server-address> <server-port>");
TEST_NZ(getaddrinfo(argv[1], argv[2], NULL, &addr));
TEST_Z(ec = rdma_create_event_channel());
TEST_NZ(rdma_create_id(ec, &conn, NULL, RDMA_PS_TCP));
TEST_NZ(rdma_resolve_addr(conn, NULL, addr->ai_addr, TIMEOUT_IN_MS));
freeaddrinfo(addr);
while (0 == rdma_get_cm_event(ec, &event))
//while (rdma_get_cm_event(ec, &event))
{
std::cout << "rdma_get_cm_event\n";
struct rdma_cm_event event_copy;
memcpy(&event_copy, event, sizeof(*event));
rdma_ack_cm_event(event);
if (on_event(&event_copy))
break;
}
rdma_destroy_event_channel(ec);
return 0;
}
void die(const char *reason)
{
fprintf(stderr, "%s\n", reason);
exit(EXIT_FAILURE);
}
void build_context(struct ibv_context *verbs)
{
if (s_ctx)
{
if (s_ctx->ctx != verbs)
die("cannot handle events in more than one context.");
return;
}
s_ctx = (struct context *)malloc(sizeof(struct context));
s_ctx->ctx = verbs;
TEST_Z(s_ctx->pd = ibv_alloc_pd(s_ctx->ctx));
TEST_Z(s_ctx->comp_channel = ibv_create_comp_channel(s_ctx->ctx));
TEST_Z(s_ctx->cq = ibv_create_cq(s_ctx->ctx, 100, NULL, s_ctx->comp_channel, 0)); /* cqe=10 is arbitrary */
TEST_NZ(ibv_req_notify_cq(s_ctx->cq, 0));
TEST_NZ(pthread_create(&s_ctx->cq_poller_thread, NULL, poll_cq, NULL));
}
void *SendMessages(void *context)
{
static int loopcount = 0;
while(1)
{
std::unique_lock<std::mutex> lck(mtx);
cv.wait(lck, message_available);
//std::this_thread::sleep_for(std::chrono::microseconds(50));
ok_to_send_next_message = 0;
struct connection *conn = (struct connection *)context;
struct ibv_send_wr wr, *bad_wr = NULL;
struct ibv_sge sge;
std::cout << "looping send..." << loopcount << '\n' << std::flush;
memset(&wr, 0, sizeof(wr));
wr.wr_id = (uintptr_t)conn;
wr.opcode = IBV_WR_SEND;
wr.sg_list = &sge;
wr.num_sge = 1;
wr.send_flags = IBV_SEND_SIGNALED;
sge.addr = (uintptr_t)conn->send_region;
sge.length = BUFFER_SIZE;
sge.lkey = conn->send_mr->lkey;
snprintf(conn->send_region, BUFFER_SIZE, "message from active/client side with count %d", loopcount++);
TEST_NZ(ibv_post_send(conn->qp, &wr, &bad_wr));
}
}
void build_qp_attr(struct ibv_qp_init_attr *qp_attr)
{
std::cout << "build_qp_attr\n";
memset(qp_attr, 0, sizeof(*qp_attr));
qp_attr->send_cq = s_ctx->cq;
qp_attr->recv_cq = s_ctx->cq;
qp_attr->qp_type = IBV_QPT_RC;
qp_attr->cap.max_send_wr = 100;
qp_attr->cap.max_recv_wr = 100;
qp_attr->cap.max_send_sge = 1;
qp_attr->cap.max_recv_sge = 1;
}
void * poll_cq(void *ctx)
{
struct ibv_cq *cq;
struct ibv_wc wc;
while (1)
{
TEST_NZ(ibv_get_cq_event(s_ctx->comp_channel, &cq, &ctx));
ibv_ack_cq_events(cq, 1);
TEST_NZ(ibv_req_notify_cq(cq, 0));
int ne;
struct ibv_wc wc;
do
{
std::cout << "polling\n";
ne = ibv_poll_cq(cq, 1, &wc);
}
while(ne == 0);
on_completion(&wc);
//if (wc.opcode == IBV_WC_SEND)
if (wc.status == IBV_WC_SUCCESS)
{
{
ok_to_send_next_message = 1;
//while (message_available()) std::this_thread::yield();
//std::cout << "past yield\n";
std::unique_lock<std::mutex> lck(mtx);
cv.notify_one();
}
}
}
return NULL;
}
void post_receives(struct connection *conn)
{
std::cout << "post_receives\n";
struct ibv_recv_wr wr, *bad_wr = NULL;
struct ibv_sge sge;
wr.wr_id = (uintptr_t)conn;
wr.next = NULL;
wr.sg_list = &sge;
wr.num_sge = 1;
sge.addr = (uintptr_t)conn->recv_region;
sge.length = BUFFER_SIZE;
sge.lkey = conn->recv_mr->lkey;
TEST_NZ(ibv_post_recv(conn->qp, &wr, &bad_wr));
}
void register_memory(struct connection *conn)
{
std::cout << "register_memory\n";
conn->send_region = (char *)malloc(BUFFER_SIZE);
conn->recv_region = (char *)malloc(BUFFER_SIZE);
TEST_Z(conn->send_mr = ibv_reg_mr(
s_ctx->pd,
conn->send_region,
BUFFER_SIZE,
IBV_ACCESS_LOCAL_WRITE | IBV_ACCESS_REMOTE_WRITE));
TEST_Z(conn->recv_mr = ibv_reg_mr(
s_ctx->pd,
conn->recv_region,
BUFFER_SIZE,
IBV_ACCESS_LOCAL_WRITE | IBV_ACCESS_REMOTE_WRITE));
}
int on_addr_resolved(struct rdma_cm_id *id)
{
std::cout << "on_addr_resolved\n";
struct ibv_qp_init_attr qp_attr;
struct connection *conn;
build_context(id->verbs);
build_qp_attr(&qp_attr);
TEST_NZ(rdma_create_qp(id, s_ctx->pd, &qp_attr));
id->context = conn = (struct connection *)malloc(sizeof(struct connection));
conn->id = id;
conn->qp = id->qp;
conn->num_completions = 0;
register_memory(conn);
post_receives(conn);
TEST_NZ(rdma_resolve_route(id, TIMEOUT_IN_MS));
return 0;
}
void on_completion(struct ibv_wc *wc)
{
std::cout << "on_completion\n";
struct connection *conn = (struct connection *)(uintptr_t)wc->wr_id;
if (wc->status != IBV_WC_SUCCESS)
{
//die("\ton_completion: status is not IBV_WC_SUCCESS.");
printf("\ton_completion: status is not IBV_WC_SUCCESS.");
printf("\t it is %d ", wc->status);
}
printf("\n");
if (wc->opcode & IBV_WC_RECV)
printf("\treceived message: %s\n", conn->recv_region);
else if (wc->opcode == IBV_WC_SEND)
printf("\tsend completed successfully.\n");
else
die("\ton_completion: completion isn't a send or a receive.");
if (5 == ++conn->num_completions)
rdma_disconnect(conn->id);
}
int on_connection(void *context)
{
std::cout << "on_connection\n";
TEST_NZ(pthread_create(&msgThread, NULL, SendMessages, context));
return 0;
}
int on_disconnect(struct rdma_cm_id *id)
{
struct connection *conn = (struct connection *)id->context;
printf("disconnected.\n");
rdma_destroy_qp(id);
ibv_dereg_mr(conn->send_mr);
ibv_dereg_mr(conn->recv_mr);
free(conn->send_region);
free(conn->recv_region);
free(conn);
rdma_destroy_id(id);
return 1; /* exit event loop */
}
int on_route_resolved(struct rdma_cm_id *id)
{
struct rdma_conn_param cm_params;
printf("route resolved.\n");
memset(&cm_params, 0, sizeof(cm_params));
TEST_NZ(rdma_connect(id, &cm_params));
return 0;
}
int on_event(struct rdma_cm_event *event)
{
int r = 0;
if (event->event == RDMA_CM_EVENT_ADDR_RESOLVED)
r = on_addr_resolved(event->id);
else if (event->event == RDMA_CM_EVENT_ROUTE_RESOLVED)
r = on_route_resolved(event->id);
else if (event->event == RDMA_CM_EVENT_ESTABLISHED)
r = on_connection(event->id->context);
else if (event->event == RDMA_CM_EVENT_DISCONNECTED)
r = on_disconnect(event->id);
else
die("on_event: unknown event.");
return r;
}
Server:
#include <iostream>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <inttypes.h>
#include <rdma/rdma_cma.h>
#define TEST_NZ(x) do { if ( (x)) die("error: " #x " failed (returned non-zero)." ); } while (0)
#define TEST_Z(x) do { if (!(x)) die("error: " #x " failed (returned zero/null)."); } while (0)
const int BUFFER_SIZE = 2048;
struct context
{
struct ibv_context *ctx;
struct ibv_pd *pd;
struct ibv_cq *cq;
struct ibv_comp_channel *comp_channel;
pthread_t cq_poller_thread;
};
struct connection
{
struct ibv_qp *qp;
struct ibv_mr *recv_mr;
struct ibv_mr *send_mr;
char *recv_region;
char *send_region;
};
static void die(const char *reason);
static void build_context(struct ibv_context *verbs);
static void build_qp_attr(struct ibv_qp_init_attr *qp_attr);
static void * poll_cq(void *);
static void post_receives(struct connection *conn);
static void register_memory(struct connection *conn);
static void on_completion(struct ibv_wc *wc);
static int on_connect_request(struct rdma_cm_id *id);
static int on_connection(void *context);
static int on_disconnect(struct rdma_cm_id *id);
static int on_event(struct rdma_cm_event *event);
static struct context *s_ctx = NULL;
int main(int argc, char **argv)
{
struct sockaddr_in6 addr;
struct rdma_cm_event *event = NULL;
struct rdma_cm_id *listener = NULL;
struct rdma_event_channel *ec = NULL;
uint16_t port = 0;
memset(&addr, 0, sizeof(addr));
addr.sin6_family = AF_INET6;
TEST_Z(ec = rdma_create_event_channel());
TEST_NZ(rdma_create_id(ec, &listener, NULL, RDMA_PS_TCP));
TEST_NZ(rdma_bind_addr(listener, (struct sockaddr *)&addr));
TEST_NZ(rdma_listen(listener, 100)); /* backlog=10 is arbitrary */
//printf("[ %"PRIu32" ]\n", *addr.sin6_addr.s6_addr32);
port = ntohs(rdma_get_src_port(listener));
printf("listening on port %d.\n", port);
while (rdma_get_cm_event(ec, &event) == 0)
{
struct rdma_cm_event event_copy;
memcpy(&event_copy, event, sizeof(*event));
rdma_ack_cm_event(event);
if (on_event(&event_copy))
break;
}
rdma_destroy_id(listener);
rdma_destroy_event_channel(ec);
return 0;
}
void die(const char *reason)
{
fprintf(stderr, "%s\n", reason);
exit(EXIT_FAILURE);
}
void build_context(struct ibv_context *verbs)
{
if (s_ctx)
{
if (s_ctx->ctx != verbs)
die("cannot handle events in more than one context.");
return;
}
s_ctx = (struct context *)malloc(sizeof(struct context));
s_ctx->ctx = verbs;
TEST_Z(s_ctx->pd = ibv_alloc_pd(s_ctx->ctx));
TEST_Z(s_ctx->comp_channel = ibv_create_comp_channel(s_ctx->ctx));
TEST_Z(s_ctx->cq = ibv_create_cq(s_ctx->ctx, 100, NULL, s_ctx->comp_channel, 0)); /* cqe=10 is arbitrary */
TEST_NZ(ibv_req_notify_cq(s_ctx->cq, 0));
TEST_NZ(pthread_create(&s_ctx->cq_poller_thread, NULL, poll_cq, NULL));
}
void build_qp_attr(struct ibv_qp_init_attr *qp_attr)
{
memset(qp_attr, 0, sizeof(*qp_attr));
qp_attr->send_cq = s_ctx->cq;
qp_attr->recv_cq = s_ctx->cq;
qp_attr->qp_type = IBV_QPT_RC;
qp_attr->cap.max_send_wr = 100;
qp_attr->cap.max_recv_wr = 100;
qp_attr->cap.max_send_sge = 1;
qp_attr->cap.max_recv_sge = 1;
}
void * poll_cq(void *ctx)
{
struct ibv_cq *cq;
struct ibv_wc wc;
while (1)
{
TEST_NZ(ibv_get_cq_event(s_ctx->comp_channel, &cq, &ctx));
ibv_ack_cq_events(cq, 1);
TEST_NZ(ibv_req_notify_cq(cq, 0));
while (ibv_poll_cq(cq, 1, &wc))
{
std::cout << "polling\n";
on_completion(&wc);
}
}
return NULL;
}
void post_receives(struct connection *conn)
{
std::cout << "post_receives\n";
struct ibv_recv_wr wr, *bad_wr = NULL;
struct ibv_sge sge;
wr.wr_id = (uintptr_t)conn;
wr.next = NULL;
wr.sg_list = &sge;
wr.num_sge = 1;
sge.addr = (uintptr_t)conn->recv_region;
sge.length = BUFFER_SIZE;
sge.lkey = conn->recv_mr->lkey;
TEST_NZ(ibv_post_recv(conn->qp, &wr, &bad_wr));
}
void register_memory(struct connection *conn)
{
conn->send_region = (char *)malloc(BUFFER_SIZE);
conn->recv_region = (char *)malloc(BUFFER_SIZE);
TEST_Z(conn->send_mr = ibv_reg_mr(
s_ctx->pd,
conn->send_region,
BUFFER_SIZE,
IBV_ACCESS_LOCAL_WRITE | IBV_ACCESS_REMOTE_WRITE));
TEST_Z(conn->recv_mr = ibv_reg_mr(
s_ctx->pd,
conn->recv_region,
BUFFER_SIZE,
IBV_ACCESS_LOCAL_WRITE | IBV_ACCESS_REMOTE_WRITE));
}
void on_completion(struct ibv_wc *wc)
{
if (wc->status != IBV_WC_SUCCESS)
die("on_completion: status is not IBV_WC_SUCCESS.");
if (wc->opcode & IBV_WC_RECV)
{
struct connection *conn = (struct connection *)(uintptr_t)wc->wr_id;
post_receives(conn);
printf("received message: %s\n", conn->recv_region);
}
else if (wc->opcode == IBV_WC_SEND)
{
printf("send completed successfully.\n");
}
}
int on_connect_request(struct rdma_cm_id *id)
{
struct ibv_qp_init_attr qp_attr;
struct rdma_conn_param cm_params;
struct connection *conn;
printf("received connection request.\n");
build_context(id->verbs);
build_qp_attr(&qp_attr);
TEST_NZ(rdma_create_qp(id, s_ctx->pd, &qp_attr));
id->context = conn = (struct connection *)malloc(sizeof(struct connection));
conn->qp = id->qp;
register_memory(conn);
post_receives(conn);
memset(&cm_params, 0, sizeof(cm_params));
TEST_NZ(rdma_accept(id, &cm_params));
return 0;
}
int on_connection(void *context)
{
struct connection *conn = (struct connection *)context;
struct ibv_send_wr wr, *bad_wr = NULL;
struct ibv_sge sge;
snprintf(conn->send_region, BUFFER_SIZE, "message from passive/server side with pid %d", getpid());
printf("connected. posting send...\n");
memset(&wr, 0, sizeof(wr));
wr.opcode = IBV_WR_SEND;
wr.sg_list = &sge;
wr.num_sge = 1;
wr.send_flags = IBV_SEND_SIGNALED;
sge.addr = (uintptr_t)conn->send_region;
sge.length = BUFFER_SIZE;
sge.lkey = conn->send_mr->lkey;
TEST_NZ(ibv_post_send(conn->qp, &wr, &bad_wr));
return 0;
}
int on_disconnect(struct rdma_cm_id *id)
{
struct connection *conn = (struct connection *)id->context;
printf("peer disconnected.\n");
rdma_destroy_qp(id);
ibv_dereg_mr(conn->send_mr);
ibv_dereg_mr(conn->recv_mr);
free(conn->send_region);
free(conn->recv_region);
free(conn);
rdma_destroy_id(id);
return 0;
}
int on_event(struct rdma_cm_event *event)
{
std::cout << "on_event\n";
int r = 0;
if (event->event == RDMA_CM_EVENT_CONNECT_REQUEST)
r = on_connect_request(event->id);
else if (event->event == RDMA_CM_EVENT_ESTABLISHED)
r = on_connection(event->id->context);
else if (event->event == RDMA_CM_EVENT_DISCONNECTED)
r = on_disconnect(event->id);
else
die("on_event: unknown event.");
return r;
}
Here are a couple of runs. Totally random the number of message sent:
[idf#node1 Release]$ ./TGKITCClient 192.168.0.1 47819
rdma_get_cm_event
on_addr_resolved
build_qp_attr
register_memory
post_receives
rdma_get_cm_event
route resolved.
rdma_get_cm_event
on_connection
looping send...0
polling
on_completion
received message: message from passive/server side with pid 4188
polling
on_completion
send completed successfully.
looping send...1
polling
on_completion
send completed successfully.
^C
[idf#node1 Release]$
And then
[idf#node1 Release]$ ./TGKITCClient 192.168.0.1 55148
rdma_get_cm_event
on_addr_resolved
build_qp_attr
register_memory
post_receives
rdma_get_cm_event
route resolved.
rdma_get_cm_event
on_connection
looping send...0
polling
on_completion
received message: message from passive/server side with pid 4279
polling
on_completion
send completed successfully.
looping send...1
polling
on_completion
send completed successfully.
looping send...2
polling
on_completion
send completed successfully.
looping send...3
polling
on_completion
send completed successfully.
looping send...4
polling
on_completion
send completed successfully.
looping send...5
polling
on_completion
send completed successfully.
looping send...6
polling
on_completion
send completed successfully.
looping send...7
polling
on_completion
send completed successfully.
looping send...8
rdma_get_cm_event
disconnected.
polling
on_completion
send completed successfully.
on_completion: status is not IBV_WC_SUCCESS. it is 5 [idf#node1 Release]$
Here is the server side:
on_event
peer disconnected.
on_event
received connection request.
post_receives
on_event
connected. posting send...
polling
send completed successfully.
polling
post_receives
received message: message from active/client side with count 0
polling
post_receives
received message: message from active/client side with count 1
polling
post_receives
received message: message from active/client side with count 2
polling
post_receives
received message: message from active/client side with count 3
polling
post_receives
received message: message from active/client side with count 4
polling
post_receives
received message: message from active/client side with count 5
polling
post_receives
received message: message from active/client side with count 6
polling
post_receives
received message: message from active/client side with count 7
on_event
peer disconnected.
Make sure that the most recent drivers and firmware on the cards are installed. Beyond that, using the RDMA packages included with most OS distributions when trying to run IB is a dangerous game to play.
It is strongly recommended that for applications like these the Open Fabrics Enterprise Distribution should be used to provide openib, opensm and a variety of other useful infiniband related packages for analysis diagnostics and tuning of a network. The official OFED packages can be found on the OpenFabrics website.
Based on the question it looks like IPoIB is being used but the specific configuration is not mentioned. IPoIB is not necessarily the best way to take advantages of the hardware resources available in the IB cards.
In addition to those considerations making sure that the subnetmanager is setup and configured correctly. Some switches have built-in subnet managers that can be access and configured through a management interface, in other cases it might make more sense to run and configure the subnet manager on one of the nodes that you are using. OpenSM is a common subnet manager that is included with OFED distributions and there are many online guides available for setting up and configuring a subnet manager based on the type of network being setup.
OFED also includes a variety of IB testing and profiling tools. ibdiagnet is a useful tool for debugging IB network issues. And there are many guides available online that show different ways to make use of the tool as well as the other tools included in OFED.
Depending on the type of IB switch used there may additionally be some network management and diagnostic tools that would allow for further analysis of the network. The configuration of IB hardware and the low-level software that manages it is sometimes more critical to overall performance than the actual code being run. But with that being said recompiling and linking to relevant libraries from the correct version of OFED might be advisable if significant software of hardware configuration changes are made.