I have a socket where I set a timeout for recv().
I handle recv() in two steps: first, I check whether the received data is complete, using MSG_PEEK | MSG_DONTWAIT.
recvTimeout.tv_sec = mRecvTimeoutSecs;
recvTimeout.tv_usec = mRecvTimeoutUSecs;
sendTimeout.tv_sec = mSendTimeoutSecs;
sendTimeout.tv_usec = mSendTimeoutUSecs;
result = enableSocketOption(SOL_SOCKET, SO_RCVTIMEO, &recvTimeout, sizeof(recvTimeout));
peekdLen = ::recv(mSocket, peekDataBuffer, MAX_RECV_LENGTH, MSG_PEEK | MSG_DONTWAIT);
I'm just wondering whether recv() will still time out if I use MSG_PEEK | MSG_DONTWAIT.
No, the socket will not time out: MSG_DONTWAIT causes recv() to return immediately (with EAGAIN/EWOULDBLOCK if no data is available), so the SO_RCVTIMEO timeout never gets a chance to expire. Note that with a very short timeout, say 1 ms, you might still see a timeout; that would depend on the implementation (on which OS your code runs).
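A small sketch of this behaviour (Linux; uses a Unix socketpair and a 64-bit `struct timeval` layout): even with a 5-second SO_RCVTIMEO configured, recv() with MSG_PEEK | MSG_DONTWAIT on an empty socket fails immediately with EAGAIN/EWOULDBLOCK instead of waiting.

```python
import errno
import socket
import struct
import time

# Two connected sockets; nothing has been sent yet.
a, b = socket.socketpair()

# Configure a 5-second receive timeout, exactly as SO_RCVTIMEO in the
# question (struct timeval packed as two longs, 64-bit Linux layout).
a.setsockopt(socket.SOL_SOCKET, socket.SO_RCVTIMEO,
             struct.pack("ll", 5, 0))

start = time.monotonic()
err = None
try:
    # MSG_DONTWAIT makes this non-blocking despite the timeout above.
    a.recv(1024, socket.MSG_PEEK | socket.MSG_DONTWAIT)
except BlockingIOError as e:
    err = e.errno
elapsed = time.monotonic() - start

a.close()
b.close()
```

Running this shows recv() returning in microseconds rather than after the 5-second timeout, which is the behaviour described in the answer above.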
Context:
We are working on migration of the driver, which is currently represented as a kernel extension, to the DriverKit framework.
The driver works with Thunderbolt RAID storage devices.
When connected through the Thunderbolt interface to the host, the device itself presents in the OS as a PCI device. The main function of our driver (.kext) is to create a "virtual" SCSI device in the OS for each virtual RAID array. So that the OS can work with these SCSI drives as usual disk storage.
We use https://developer.apple.com/documentation/scsicontrollerdriverkit to migrate this functionality in the dext version of the driver.
Current issue:
When a device is connected - the dext driver cannot create a SCSI drive in the OS.
Technically our dext tries to create a SCSI drive using the UserCreateTargetForID() method.
On this step the OS sends the first SCSI command "Test Unit Ready" to the device to check if it is a SCSI device.
We process this command in a separate thread, apart from the main dext process (as recommended in the DriverKit documentation).
We can see in the logs that the device receives this command and responds, but when our dext sends this response to the OS, the process gets stuck waiting. How can we understand why this happens and fix it?
More details:
We are migrating the functionality of an already existing “.kext” driver. Here are the kext driver's logs for this step:
15:06:17.902539+0700 Target device try to create for idx:0
15:06:17.902704+0700 Send command 0 for target 0 len 0
15:06:18.161777+0700 Complete command: 0 for target: 0 Len: 0 status: 0 flags: 0
15:06:18.161884+0700 Send command 18 for target 0 len 6
15:06:18.161956+0700 Complete command: 18 for target: 0 Len: 6 status: 0 flags: 0
15:06:18.162010+0700 Send command 18 for target 0 len 44
15:06:18.172972+0700 Complete command: 18 for target: 0 Len: 44 status: 0 flags: 0
15:06:18.275501+0700 Send command 18 for target 0 len 36
15:06:18.275584+0700 Complete command: 18 for target: 0 Len: 36 status: 0 flags: 0
15:06:18.276257+0700 Target device created for idx:0
We can see the success message “Target device created for idx:0”.
In the dext logs of the same step:
we do not see the “Send command 18 for target 0 len 6” line that we have in the kext logs;
there is no log of the successful result “Target device created for idx:0”.
I'll add a thread name to each line of the dext log (CustomThread, DefaultQueue, SendCommandCustomThread, InterruptQueue):
15:54:10.903466+0700 Try to create target for 0 UUID 432421434863538456 - CustomThread
15:54:10.903633+0700 UserDoesHBAPerformAutoSense - DefaultQueue
15:54:10.903763+0700 UserInitializeTargetForID - DefaultQueue
15:54:10.903876+0700 UserDoesHBASupportMultiPathing - DefaultQueue
15:54:10.904200+0700 UserProcessParallelTask start - DefaultQueue
15:54:10.904298+0700 Sent command : 0 len 0 for target 0 - SendCommandCustomThread
15:54:11.163003+0700 Disable interrupts - InterruptQueue
15:54:11.163077+0700 Complete cmd : 0 for target: 0 len: 0 status: 0 flags: 0 - InterruptQueue
15:54:11.163085+0700 Enable interrupts - InterruptQueue
Code for completing the task:
SCSIUserParallelResponse osRsp = {0};
osRsp.fControllerTaskIdentifier = osTask->taskId;
osRsp.fTargetID = osTask->targetId;
osRsp.fServiceResponse = kSCSIServiceResponse_TASK_COMPLETE;
osRsp.fCompletionStatus = (SCSITaskStatus) response->status;
// Transfer length computation.
osRsp.fBytesTransferred = transferLength; // === 0 for this case.
ParallelTaskCompletion(osTask->action, osRsp);
osTask->action->release();
Any help will be appreciated.
This is effectively a deadlock, which you seem to have already worked out. It's not 100% clear from your question, but as I initially had the same problem, I assume you're calling UserCreateTargetForID from the driver's default queue. This won't work: you must call it from a non-default queue, because SCSIControllerDriverKit assumes that your default queue is idle and ready to handle requests from the kernel while you are calling this function. The header docs are very ambiguous on this, though they do mention it:
The dext class should call this method to create a new target for the
targetID. The framework ensures that the new target is created before the call returns.
Note that this call to the framework runs on the Auxiliary queue.
SCSIControllerDriverKit expects your driver to use 3 different dispatch queues (default, auxiliary, and interrupt), although I think it can be done with 2 as well. I recommend you (re-)watch the relevant part of the WWDC2020 session video about how Apple wants you to use the 3 dispatch queues, exactly. The framework does not seem to be very flexible on this point.
Good luck with the rest of the driver port, I found this DriverKit framework even more fussy than the other ones.
Thanks to pmdj for pointing me in the right direction. In my case the answer was simply to add initialization of the response's version field:
osRsp.version = kScsiUserParallelTaskResponseCurrentVersion1;
It looks obvious, but there is no information in the docs or the WWDC2020 video about initializing the version field.
My project is a hardware RAID user-space driver, and it has now passed the I/O stress test. Your problem is likely in SCSI commands that transfer data: your driver has to hand data back to the system to complete the SCSI INQUIRY command. I think you also used UserGetDataBuffer; it is quite different from the corresponding IOKit function.
kern_return_t IMPL(XXXXUserSpaceDriver, UserProcessParallelTask)
{
    /*
    ** UserGetDataBuffer:
    ** virtual kern_return_t UserGetDataBuffer(SCSIDeviceIdentifier fTargetID,
    **     uint64_t fControllerTaskIdentifier, IOBufferMemoryDescriptor **buffer);
    */
    if (parallelTask.fCommandDescriptorBlock[0] == SCSI_CMD_INQUIRY)
    {
        IOBufferMemoryDescriptor *data_buffer_memory_descriptor = nullptr;
        if ((UserGetDataBuffer(parallelTask.fTargetID, parallelTask.fControllerTaskIdentifier, &data_buffer_memory_descriptor) == kIOReturnSuccess) && (data_buffer_memory_descriptor != NULL))
        {
            IOAddressSegment data_buffer_virtual_address_segment = {0};
            if (data_buffer_memory_descriptor->GetAddressRange(&data_buffer_virtual_address_segment) == kIOReturnSuccess)
            {
                IOAddressSegment data_buffer_physical_address_segment = {0};
                IODMACommandSpecification dmaSpecification;
                IODMACommand *data_buffer_iodmacommand = {0};
                bzero(&dmaSpecification, sizeof(dmaSpecification));
                dmaSpecification.options = kIODMACommandSpecificationNoOptions;
                dmaSpecification.maxAddressBits = 64;
                if (IODMACommand::Create(ivars->pciDevice, kIODMACommandCreateNoOptions, &dmaSpecification, &data_buffer_iodmacommand) == kIOReturnSuccess)
                {
                    uint64_t dmaFlags = kIOMemoryDirectionInOut;
                    uint32_t dmaSegmentCount = 1;
                    pCCB->data_buffer_iodmacommand = data_buffer_iodmacommand;
                    if (data_buffer_iodmacommand->PrepareForDMA(kIODMACommandPrepareForDMANoOptions, data_buffer_memory_descriptor, 0 /*offset*/, parallelTask.fRequestedTransferCount /*length*/, &dmaFlags, &dmaSegmentCount, &data_buffer_physical_address_segment) == kIOReturnSuccess)
                    {
                        parallelTask.fBufferIOVMAddr = (uint64_t)data_buffer_physical_address_segment.address; /* overwrite the original fBufferIOVMAddr with the DMA address */
                        pCCB->OSDataBuffer = reinterpret_cast<uint8_t *>(data_buffer_virtual_address_segment.address); /* data buffer virtual address */
                    }
                }
            }
        }
    }
}

response.fBytesTransferred = dataxferlen;
response.version = kScsiUserParallelTaskResponseCurrentVersion1;
response.fTargetID = TARGETLUN2SCSITARGET(TargetID, 0);
response.fControllerTaskIdentifier = pCCB->fControllerTaskIdentifier;
response.fCompletionStatus = taskStatus;
response.fServiceResponse = serviceResponse;
response.fSenseLength = taskStatus;
IOUserSCSIParallelInterfaceController::ParallelTaskCompletion(pCCB->completion, response);
pCCB->completion->release();
pCCB->completion = NULL;
pCCB->ccb_flags.start = 0; /* reset start/done for the outstanding CCB check */
if (pCCB->data_buffer_iodmacommand != NULL)
{
    pCCB->data_buffer_iodmacommand->CompleteDMA(kIODMACommandCompleteDMANoOptions);
    OSSafeReleaseNULL(pCCB->data_buffer_iodmacommand);
    pCCB->OSDataBuffer = NULL;
}
I am using the EIPScanner library to talk to two TURCK I/O modules via EtherNet/IP. The library works fine with one module, but the moment I instantiate the driver for the second one, the first one times out.
I tried to modify the library by passing a port number to listen on, but it didn't seem to make a difference:
In ConnectionManager.cpp
IOConnection::WPtr
ConnectionManager::forwardOpen(const SessionInfoIf::SPtr& si, ConnectionParameters connectionParameters, int port, bool isLarge) {
....
....
....
if (o2tSockAddrInfo != additionalItems.end())
{
Buffer sockAddrBuffer(o2tSockAddrInfo->getData());
sockets::EndPoint endPoint("",0);
sockAddrBuffer >> endPoint;
endPoint.setPort(port);
if (endPoint.getHost() == "0.0.0.0") {
ioConnection->_socket = std::make_unique<UDPSocket>(
si->getRemoteEndPoint().getHost(), endPoint.getPort());
} else {
ioConnection->_socket = std::make_unique<UDPSocket>(endPoint);
}
}
I tried adding a timeout to the recvfrom() function, but that also didn't make a difference.
In UDPSocket.cpp
UDPSocket::UDPSocket(EndPoint endPoint)
: BaseSocket(EndPoint(std::move(endPoint))) {
_sockedFd = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);
if (_sockedFd < 0) {
throw std::system_error(BaseSocket::getLastError(), BaseSocket::getErrorCategory());
}
/*struct timeval tv;
tv.tv_sec = 0;
tv.tv_usec = 1;
setsockopt(_sockedFd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));*/
Logger(LogLevel::DEBUG) << "Opened UDP socket fd=" << _sockedFd;
}
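An aside on the commented-out timeout above: tv_usec = 1 is one microsecond, not one millisecond, so that setting would make recvfrom() give up almost immediately with EAGAIN/EWOULDBLOCK rather than wait for data. A plain-socket sketch (Linux, 64-bit `struct timeval` layout) showing why that particular timeout could not have helped:

```python
import errno
import socket
import struct
import time

# UDP socket bound to an ephemeral loopback port, nothing sent to it.
s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
s.bind(("127.0.0.1", 0))

# Same timeout as the commented-out snippet: 0 seconds + 1 microsecond.
s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVTIMEO,
             struct.pack("ll", 0, 1))

start = time.monotonic()
err = None
try:
    s.recvfrom(1024)  # expires essentially at once
except OSError as e:
    err = e.errno
elapsed = time.monotonic() - start
s.close()
```

The call fails with EAGAIN/EWOULDBLOCK within a fraction of a second instead of waiting for a datagram, so such a short timeout behaves like a non-blocking read.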
Output:
[DEBUG] Opened TCP socket fd=7
[DEBUG] Connecting to 192.168.1.130:44818
[INFO] Registered session 103
[DEBUG] Opened TCP socket fd=8
[DEBUG] Connecting to 192.168.1.131:44818
[INFO] Registered session 103
2021-06-18 13:03:02,996 Process-1
2021-06-18 13:03:02,997 Process-1 [INFO] Send request: service=0x54 epath=[classId=6 objectId=1]
[INFO] Open IO connection O2T_ID=1781321728 T2O_ID=915406849 SerialNumber 1
[DEBUG] Opened UDP socket fd=15
[INFO] Open UDP socket to send data to 192.168.1.130:2223
[DEBUG] Opened UDP socket fd=16
[INFO] Will look for id: 915406849 from endPoint: 192.168.1.130:2222
[DEBUG] Received data from connection T2O_ID=915406849(host: 192.168.1.130:2223)
[DEBUG] Received: seq=2 data=[0][0][0][0]
[DEBUG] Received data from connection T2O_ID=915406849(host: 192.168.1.130:2223)
[DEBUG] Received: seq=3 data=[0][0][0][0]
2021-06-18 13:03:03,112 Process-2
2021-06-18 13:03:03,113 Process-2
[INFO] Send request: service=0x54 epath=[classId=6 objectId=1]
[INFO] Open IO connection O2T_ID=1782571776 T2O_ID=1911881729 SerialNumber 1
[DEBUG] Received data from connection T2O_ID=1911881729(host: 192.168.1.130:2223)
[DEBUG] Received: seq=1 data=[0][0][4][0]
[DEBUG] Opened UDP socket fd=20
[INFO] Open UDP socket to send data to 192.168.1.131:2224
[DEBUG] Opened UDP socket fd=21
[INFO] Will look for id: 1911881729 from endPoint: 192.168.1.131:2222
[DEBUG] Received data from connection T2O_ID=915406849(host: 192.168.1.131:2224)
[DEBUG] Received: seq=4 data=[0][0][0][0]
[DEBUG] Received data from connection T2O_ID=1911881729(host: 192.168.1.131:2224)
[DEBUG] Received: seq=2 data=[0][0][4][0]
As you can see, once the device on 192.168.1.131 starts streaming, the 1.130 device stops, and eventually it times out and shuts down.
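For reference, here is a plain-socket sketch (not the EIPScanner API) of what passing a distinct local port per connection is meant to achieve: two UDP sockets bound to different loopback ports each see only the datagrams addressed to their own port, with no cross-talk between them.

```python
import socket

# Two receive sockets, each bound to its own ephemeral loopback port,
# standing in for the two I/O connections.
rx1 = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
rx2 = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
rx1.bind(("127.0.0.1", 0))  # port 0 = let the OS pick a free port
rx2.bind(("127.0.0.1", 0))
port1 = rx1.getsockname()[1]
port2 = rx2.getsockname()[1]

# One sender addressing each receiver on its own port.
tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
tx.sendto(b"module-1", ("127.0.0.1", port1))
tx.sendto(b"module-2", ("127.0.0.1", port2))

msg1 = rx1.recv(64)  # only the datagram addressed to port1
msg2 = rx2.recv(64)  # only the datagram addressed to port2

tx.close()
rx1.close()
rx2.close()
```

If both connections instead read from the same local port, whichever socket drains the queue first starves the other, which matches the symptom of one module timing out as soon as the second starts streaming.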
I have a test which constructs another event-sourcing actor from inside a message handler, and this construction takes more than 3 seconds. Below is the current configuration; how can I increase the default timeout?
extends ScalaTestWithActorTestKit(
ConfigFactory.parseString("""
akka.persistence.testkit.events.serialize = off
akka.actor.allow-java-serialization = on
akka.test.single-expect-default = 999s
""").withFallback(PersistenceTestKitPlugin.config).withFallback(ManualTime.config)
Here is the error message:
Timeout (3 seconds) during receiveMessage while waiting for message.
java.lang.AssertionError: Timeout (3 seconds) during receiveMessage while waiting for message.
Try the constructor public ScalaTestWithActorTestKit(com.typesafe.config.Config config); a similar constructor works for me:
extends TestKit(ActorSystem("AliveActorSpec", ConfigFactory.load(ConfigFactory.parseString("""
| akka {
| test {
| single-expect-default = 6s
| }
| }
""".stripMargin))))
I use Tornado 5.1.
There is a socket iostream:
self.socket = socket.socket()
self.stream = tornado.iostream.IOStream(self.socket)
self.stream.connect((self.ip, self.port), self.connect_callback)
I need to reconnect it if it gets WSAECONNREFUSED or any other error that results in it being closed. And of course, if the connection is alive it does not need to be reconnected; the reconnect attempt must be executed only if the stream is closed.
Yes, all that is required is to set up a PeriodicCallback:
server = MainServer() # an instance of tornado.web.Application
server.listen(8888)
reconnect_periodic_callback = tornado.ioloop.PeriodicCallback(
server.reconnect_closed_clients, 5000
)
reconnect_periodic_callback.start()
tornado.ioloop.IOLoop.instance().start()
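To make the "only if the stream is closed" condition explicit, here is a minimal sketch of what such a periodic reconnect check can look like. FakeStream is a stand-in used so the sketch is self-contained; only closed() and connect() mirror real IOStream methods, and reconnect_closed_clients is the method name assumed in the answer above.

```python
class FakeStream:
    """Stand-in for tornado.iostream.IOStream in this sketch."""
    def __init__(self, closed):
        self._closed = closed
        self.reconnected = False

    def closed(self):
        # IOStream.closed() reports whether the stream has been closed.
        return self._closed

    def connect(self):
        # Stands in for IOStream.connect((ip, port), callback).
        self._closed = False
        self.reconnected = True


def reconnect_closed_clients(streams):
    """Body of the periodic callback: reconnect only closed streams."""
    for stream in streams:
        if stream.closed():      # live connections are left untouched
            stream.connect()


alive = FakeStream(closed=False)
dead = FakeStream(closed=True)
reconnect_closed_clients([alive, dead])
```

Scheduled via PeriodicCallback as shown above, this touches only streams whose closed() returns True, so healthy connections are never disturbed.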
What does a result of 0x000000be from the send command mean?
iResult = send( ConnectSocket, dataToSend, (int) strlen(dataToSend), 0 );
I didn't find this return code here: http://msdn.microsoft.com/en-us/library/windows/desktop/ms740668(v=vs.85).aspx
Any ideas?
Thanks
The value returned by send is the number of bytes it sent, so 0xBE simply means 190 bytes were transmitted. If it fails, it returns SOCKET_ERROR, and you use WSAGetLastError to get the error code; those are the codes listed in your link. Read the manual page for send instead.
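A quick POSIX sketch of the same behaviour (Python's send maps to the same syscall; on Windows the failure path is SOCKET_ERROR plus WSAGetLastError(), on POSIX it is -1 plus errno, surfaced in Python as an OSError):

```python
import socket

# A connected pair of stream sockets.
a, b = socket.socketpair()

# 0xBE == 190: send that many bytes and observe the return value.
payload = b"x" * 0xBE
sent = a.send(payload)       # on success: number of bytes sent
received = b.recv(256)       # the same 190 bytes arrive at the peer

a.close()
b.close()
```

Here `sent` comes back as 190 (0xBE), confirming that the value in the question is a byte count, not an error code.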