IcmpSendEcho fails but "ping" succeeds - C++

I have been looking at using IcmpSendEcho and found that it fails to ping certain devices (e.g. my BT Home Hub 4), with GetLastError reporting 11010 (IP_REQ_TIMED_OUT), while for other devices it works fine (when executing on the same system). In comparison, ping.exe succeeds on all these devices, but I have no idea how its implementation differs. All cases I have tried so far have been IPv4 addresses, which I provided directly (so no DNS etc.).
hIcmpFile = IcmpCreateFile();
ipAddress = inet_addr(ipAddressStr);
// ...hIcmpFile is reused

static const WORD sendSize = 32;
static const DWORD replySize = sizeof(ICMP_ECHO_REPLY) + sendSize;
char sendData[sendSize] = { 0 };
char replyBuffer[replySize];
auto ret = IcmpSendEcho(hIcmpFile, ipAddress, sendData, sendSize, NULL, replyBuffer, replySize, 1000);
if (ret == 0)
{
    auto error = GetLastError();
    // ...
}
The only other report I have found is "What would cause IcmpSendEcho to fail when ping.exe succeeds?"; however, those answers appear to differ from my problem. I have tried using different payload sizes, and I have tried IcmpSendEcho2, which also failed for the same devices.

Try running with Administrator rights.

I've been having a similar problem, but I think the issue is that the ICMP request times out before you get a reply.
My code is based heavily on the example code from the MSDN page for IcmpSendEcho, except that I added a number of retries on failure. My code runs in the evenings, when machines are likely to have gone to sleep or into some other low-power state, which means they take a few seconds to wake up and reply.
Usually my output logs show that the first ping attempt fails with error 11010. The second attempt always succeeds. So I'm guessing the first ping gives the machine a poke and wakes it up, but I miss the delayed reply; the second ping then succeeds.
So try either a longer timeout or a few retries.
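A minimal sketch of that suggestion (the helper name, the retry count and the 4-second timeout are arbitrary choices, not from the original code; link against Iphlpapi.lib):

#include <winsock2.h>
#include <iphlpapi.h>
#include <icmpapi.h>
#include <cstdio>

bool PingWithRetries(IPAddr ipAddress, int attempts = 3, DWORD timeoutMs = 4000)
{
    HANDLE hIcmpFile = IcmpCreateFile();
    if (hIcmpFile == INVALID_HANDLE_VALUE)
        return false;

    char sendData[32] = { 0 };
    // Room for the reply header, the echoed payload and a possible ICMP error message.
    char replyBuffer[sizeof(ICMP_ECHO_REPLY) + sizeof(sendData) + 8];

    bool success = false;
    for (int attempt = 0; attempt < attempts && !success; ++attempt)
    {
        DWORD ret = IcmpSendEcho(hIcmpFile, ipAddress, sendData, sizeof(sendData),
                                 NULL, replyBuffer, sizeof(replyBuffer), timeoutMs);
        if (ret != 0)
            success = true;
        else
            printf("attempt %d failed, GetLastError() = %lu\n", attempt + 1, GetLastError());
    }

    IcmpCloseHandle(hIcmpFile);
    return success;
}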

Related

Winsock sendto returns error 10049 (WSAEADDRNOTAVAIL) for broadcast address after network adapter is disabled or physically disconnected

I am working on a P2P application, and to make testing simple I am currently using UDP broadcast for peer discovery in my local network. Each peer binds one UDP socket to port 29292 of the IP address of each local network interface (discovered via GetAdaptersInfo), and each socket periodically sends a packet to the broadcast address of its network interface/local address. The sockets are set to allow port reuse (via setsockopt SO_REUSEADDR), which enables me to run multiple peers on the same local machine without any conflicts. In this case there is only a single peer on the entire network, though.
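For reference, the per-interface setup described above boils down to something like the following sketch (the helper name is made up; the interface address and port 29292 are the values from the question; error handling is omitted; link against Ws2_32.lib):

#include <winsock2.h>
#include <ws2tcpip.h>

SOCKET make_broadcast_socket(const char* interface_ip, unsigned short port)
{
    SOCKET s = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);

    BOOL yes = TRUE;
    // Allow several peers on the same machine to share the port.
    setsockopt(s, SOL_SOCKET, SO_REUSEADDR, (const char*)&yes, sizeof(yes));
    // Required before sending to a broadcast address.
    setsockopt(s, SOL_SOCKET, SO_BROADCAST, (const char*)&yes, sizeof(yes));

    sockaddr_in local = {};
    local.sin_family = AF_INET;
    local.sin_port = htons(port);
    inet_pton(AF_INET, interface_ip, &local.sin_addr);   // e.g. "192.168.178.35"
    bind(s, (sockaddr*)&local, sizeof(local));
    return s;
}

// Discovery then periodically sends to the interface's broadcast address, e.g. 192.168.178.255:29292.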
This all works perfectly fine (tested with 2 peers on 1 machine and 2 peers on 2 machines) UNTIL a network interface is disconnected. When deactivating the network adapter of either my Wi-Fi or a USB-to-LAN adapter in the Windows dialog, or just unplugging the USB cable of the adapter, the next call to sendto fails with return code 10049. It doesn't matter whether the other adapter is still connected, or was at the beginning; it will fail. The only thing that doesn't make it fail is deactivating Wi-Fi through the fancy Win10 dialog in the taskbar, but that isn't really a surprise because that doesn't deactivate or remove the adapter itself.
I initially thought that this makes sense, because when the NIC is gone, how should the system route the packet? But: the fact that the packet can't reach its target has absolutely nothing to do with the address itself being invalid (which is what the error means), so I suspect I am missing something here. I was looking for any information I could use to detect this case and distinguish it from simply trying to sendto INADDR_ANY, but I couldn't find anything. I started to log every bit of information which I suspected could have changed, but it's all the same on a successful sendto and on the one that fails (retrieved via getsockopt):
250 16.24746[886] [debug|debug] local address: 192.168.178.35
251 16.24812[886] [debug|debug] no remote address
252 16.25333[886] [debug|debug] type: SOCK_DGRAM
253 16.25457[886] [debug|debug] protocol: IPPROTO_UDP
254 16.25673[886] [debug|debug] broadcast: 1, dontroute: 0, max_msg_size: 65507, rcv_buffer: 65536, rcv_timeout: 0, reuse_addr: 1, snd_buffer: 65536, sdn_timeout: 0
255 16.25806[886] [debug|debug] Last WSA error on socket was WSA Error Code 0: The operation completed successfully.
256 16.25916[886] [debug|debug] target address windows formatted: 192.168.178.255
257 16.25976[886] [debug|debug] target address 192.168.178.255:29292
258 16.26138[886] [debug|assert] ASSERT FAILED at D:\Workspaces\spaced\source\platform\win32_platform.cpp:4141: sendto failed with (unhandled) WSA Error Code 10049: The requested address is not valid in its context.
The nic that got removed is this one:
1.07254[0] [platform|info] Discovered Network Interface "Realtek USB GbE Family Controller" with IP 192.168.178.35 and Subnet 255.255.255.0
And this is the code that does the sending (dlog_socket_information_and_last_wsaerror generates all the output that is gathered using getsockopt):
void send_slice_over_udp_socket(Socket_Handle handle, Slice<d_byte> buffer, u32 remote_ip, u16 remote_port){
    PROFILE_FUNCTION();
    auto socket = (UDP_Socket*) sockets[handle.handle];
    ASSERT_VALID_UDP_SOCKET(socket);
    dlog_socket_information_and_last_wsaerror(socket);

    if(socket->is_dummy)
        return;
    if(buffer.size == 0)
        return;

    DASSERT(socket->state == Socket_State::created);
    u64 bytes_left = buffer.size;
    sockaddr_in target_socket_address = create_socket_address(remote_ip, remote_port);

#pragma warning(push)
#pragma warning(disable: 4996)
    dlog("target address windows formatted: %s", inet_ntoa(target_socket_address.sin_addr));
#pragma warning(pop)

    unsigned char* parts = (unsigned char*)&remote_ip;
    dlog("target address %hhu.%hhu.%hhu.%hhu:%hu", parts[3], parts[2], parts[1], parts[0], remote_port);

    int sent_bytes = sendto(socket->handle, (char*) buffer.data, bytes_left > (u64) INT32_MAX ? INT32_MAX : (int) bytes_left, 0, (sockaddr*)&target_socket_address, sizeof(target_socket_address));

    if(sent_bytes == SOCKET_ERROR){
#define LOG_WARNING(message) log_nonreproducible(message, Category::platform_network, Severity::warning, socket->handle); return;
        switch(WSAGetLastError()){
        //#TODO handle all (more? I guess many should just be asserted since they should never happen) cases
        case WSAEHOSTUNREACH: LOG_WARNING("socket %lld, send failed: The remote host can't be reached at this time.");
        case WSAECONNRESET:   LOG_WARNING("socket %lld, send failed: Multiple UDP packet deliveries failed. According to documentation we should close the socket. Not sure if this makes sense, this is a UDP port after all. Closing the socket wont change anything, right?");
        case WSAENETUNREACH:  LOG_WARNING("socket %lld, send failed: the network cannot be reached from this host at this time.");
        case WSAETIMEDOUT:    LOG_WARNING("socket %lld, send failed: The connection has been dropped, because of a network failure or because the system on the other end went down without notice.");
        case WSAEADDRNOTAVAIL:
        case WSAENETRESET:
        case WSAEACCES:
        case WSAEWOULDBLOCK: //can this even happen on a udp port? I expect this to be fire-and-forget-style.
        case WSAEMSGSIZE:
        case WSANOTINITIALISED:
        case WSAENETDOWN:
        case WSAEINVAL:
        case WSAEINTR:
        case WSAEINPROGRESS:
        case WSAEFAULT:
        case WSAENOBUFS:
        case WSAENOTCONN:
        case WSAENOTSOCK:
        case WSAEOPNOTSUPP:
        case WSAESHUTDOWN:
        case WSAECONNABORTED:
        case WSAEAFNOSUPPORT:
        case WSAEDESTADDRREQ:
            ASSERT(false, tprint_last_wsa_error_as_formatted_message("sendto failed with (unhandled) ")); break;
        default: ASSERT(false, tprint_last_wsa_error_as_formatted_message("sendto failed with (undocumented) ")); //The switch case above should have been exhaustive. This is a bug. We either forgot a case, or maybe the docs were lying? (That happened to me on android. Fun times. Well. Not really.)
        }
#undef LOG_WARNING
    }

    DASSERT(sent_bytes >= 0);
    total_bytes_sent += (u64) sent_bytes;
    bytes_left -= (u64) sent_bytes;
    DASSERT(bytes_left == 0);
}
The code that generates the address from ip and port looks like this:
sockaddr_in create_socket_address(u32 ip, u16 port){
    sockaddr_in address_info;
    address_info.sin_family = AF_INET;
    address_info.sin_port = htons(port);
    address_info.sin_addr.s_addr = htonl(ip);
    memset(address_info.sin_zero, 0, 8);
    return address_info;
}
The error seems to be a little flaky. It reproduces 100% of the time until it decides not to anymore. After a restart it's usually back.
I am looking for a solution to handle this case correctly. I could of course just re-do the network interface discovery when the error occurs, because I "know" that I don't give any broken IPs to sendto, but that would just be a heuristic. I want to solve the actual problem.
I also don't quite understand when error 10049 is supposed to fire exactly anyway. Is it just if I pass an IPv6 address to an IPv4 socket, or send to 0.0.0.0? There is no flat-out "illegal" IPv4 address after all, just ones that don't make sense in context.
If you know what I am missing here, please let me know!
This is an issue people have been facing for a while, and the suggestion has been to read the Microsoft documentation/thread on the issue linked below.
"By the way, I don't know whether these are the same issue or not, but the error code thrown back is the same, which is why I have attached the link."
https://learn.microsoft.com/en-us/answers/questions/537493/binding-winsock-shortly-after-boot-results-in-erro.html
I found a solution (workaround?)
I used NotifyAddrChange to receive changes to the NICs and thought that for some reason it didn't trigger when I disabled the NIC. Turns out it does; I'm just stupid and stopped debugging too early: there was a bug in the code that diffs the results of GetAdaptersInfo against the last known state to figure out the differences, so the application missed the NIC disconnecting. Now that it observes the disconnect, it can kill the sockets before they try to send on the disabled NIC, thus preventing the error from happening. This is not really a solution though, since there is a race condition here (the NIC could be disabled after the check for changes but before the send), so I'll still have to handle error 10049.
The bug was this:
My expectation was that, when I disable a NIC, iterating over all existing NICs would show the disabled NIC as disabled. That is not what happens. What happens is that the NIC is simply no longer in the list of existing NICs, even though the Windows dialog will still show it (as disabled). That is somewhat surprising to me, but not all that unreasonable I guess.
Before, I had these checks to detect changes in the NICs:
Did the NIC exist before, was enabled and is now disabled -> disable notification
Did the NIC exist before, was disabled and is now enabled -> enable notification
Did the NIC not exist before and is now enabled -> enable notification
And the fix was adding a fourth one:
Is there a previously known NIC that is no longer in the list of NICs at all -> disable notification
I'm still not 100% happy that there is the possibility of getting a somewhat ambiguous error on a race condition, but I might call it a day here.
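As a rough sketch of that extra check (assuming the previous snapshot is kept as a set of interface IP strings; the function names are made up, error handling is minimal, and the printf calls stand in for the real enable/disable notifications; link against Iphlpapi.lib):

#include <winsock2.h>
#include <iphlpapi.h>
#include <cstdio>
#include <set>
#include <string>
#include <vector>

std::set<std::string> snapshot_adapter_ips()
{
    std::set<std::string> ips;

    ULONG size = 0;
    GetAdaptersInfo(nullptr, &size);                 // first call only reports the required buffer size
    if (size == 0)
        return ips;

    std::vector<char> buffer(size);
    auto* info = reinterpret_cast<IP_ADAPTER_INFO*>(buffer.data());
    if (GetAdaptersInfo(info, &size) != NO_ERROR)
        return ips;

    for (auto* adapter = info; adapter != nullptr; adapter = adapter->Next)
        ips.insert(adapter->IpAddressList.IpAddress.String);
    return ips;
}

void diff_adapters(const std::set<std::string>& before, const std::set<std::string>& now)
{
    for (const auto& ip : before)
        if (now.count(ip) == 0)
            printf("adapter %s disappeared -> close the socket bound to it\n", ip.c_str());
    for (const auto& ip : now)
        if (before.count(ip) == 0)
            printf("adapter %s appeared -> create a socket for it\n", ip.c_str());
}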

BlueZ over DBus stops responding within varying intervals

I am currently using the BlueZ DBus API to scan for BLE devices, but it stops responding completely after varying intervals. Sometimes it's minutes, other times it's one or two hours.
My assumption is that I am forgetting to do something. The strange thing is that after I exit the application, tools like bluetoothctl and hciconfig also no longer respond. Sometimes a reboot is not enough to get it working again and I need to power-cycle the machines. It happens on several different machines as well.
I am acquiring the bus using:
GError* error = nullptr;
mConnection = g_bus_get_sync(G_BUS_TYPE_SYSTEM, nullptr, &error);
Starting the loop:
mLoop = g_main_loop_new(nullptr, false);
g_main_loop_run(mLoop);
Then I power on the adapter by setting the Powered property to true and calling StartDiscovery. Devices are then reported through:
guint iface_add_sub = g_dbus_connection_signal_subscribe(mConnection, "org.bluez",
    "org.freedesktop.DBus.ObjectManager", "InterfacesAdded", nullptr, nullptr,
    G_DBUS_SIGNAL_FLAGS_NONE, device_appeared, this, nullptr);
guint iface_remove_sub = g_dbus_connection_signal_subscribe(mConnection, "org.bluez",
    "org.freedesktop.DBus.ObjectManager", "InterfacesRemoved", nullptr, nullptr,
    G_DBUS_SIGNAL_FLAGS_NONE, device_disappeared, this, nullptr);
Is there anything I am missing to prevent BlueZ from becoming unresponsive?
bluetoothd -v = 5.53
ldd --version = GLIBC 2.31-0ubuntu9.2
BlueZ cannot scan indefinitely because it will crash, as you are observing. In my opinion it is a bug…
What I do is scan for 8 seconds and then stop the scan. BlueZ will then clean up its internal cache. After allowing 1 second for cleanup, I restart the scan.
This way it will hold up much longer. Nonetheless, after a week or so it might still crash…
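The workaround can be sketched roughly like this, reusing the GDBus connection from the question (the adapter object path /org/bluez/hci0 is an assumption, and error handling is minimal):

#include <gio/gio.h>

// Call a parameterless method on org.bluez.Adapter1 (StartDiscovery / StopDiscovery).
static void call_adapter(GDBusConnection* conn, const char* method)
{
    GError* error = nullptr;
    GVariant* ret = g_dbus_connection_call_sync(conn, "org.bluez", "/org/bluez/hci0",
        "org.bluez.Adapter1", method, nullptr, nullptr,
        G_DBUS_CALL_FLAGS_NONE, -1, nullptr, &error);
    if (ret)
        g_variant_unref(ret);
    else {
        g_printerr("%s failed: %s\n", method, error->message);
        g_error_free(error);
    }
}

static gboolean stop_scan(gpointer user_data)
{
    call_adapter(static_cast<GDBusConnection*>(user_data), "StopDiscovery");
    return G_SOURCE_REMOVE;                      // one-shot timer
}

static gboolean start_scan(gpointer user_data)
{
    auto* conn = static_cast<GDBusConnection*>(user_data);
    call_adapter(conn, "StartDiscovery");
    g_timeout_add_seconds(8, stop_scan, conn);   // stop after 8 s of scanning
    return G_SOURCE_CONTINUE;                    // keep the 9 s cycle alive
}

// After powering the adapter, instead of a single StartDiscovery call:
//   g_timeout_add_seconds(9, start_scan, mConnection);  // 8 s scan + 1 s pause per cycle
//   start_scan(mConnection);                             // kick off the first cycle immediately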

ASIOCallbacks::bufferSwitchTimeInfo is called very slowly at 2.8 MHz sample rate with DSD format on Sony PHA-3

I bought a Sony PHA-3 and am trying to write an app to play DSD in native mode. (I've succeeded in DoP mode.)
However, when I set the sample rate to 2.8 MHz, I found that ASIOCallbacks::bufferSwitchTimeInfo is not called as often as it should be.
It takes nearly 8 seconds to request one second's worth of samples at 2.8 MHz.
The code is only slightly modified from the host sample of the ASIO SDK 2.3, so I'll post the key parts to help complete my question.
After ASIOStart, the host sample keeps printing the progress, indicating the time info like this:
fprintf(stdout, "%d ms / %d ms / %d samples %ds", asioDriverInfo.sysRefTime,
    (long)(asioDriverInfo.nanoSeconds / 1000000.0),
    (long)asioDriverInfo.samples,
    (long)(asioDriverInfo.samples / asioDriverInfo.sampleRate));
The final expression tells me how many seconds have elapsed (asioDriverInfo.samples / asioDriverInfo.sampleRate), where asioDriverInfo.sampleRate is 2822400 Hz.
asioDriverInfo.samples is assigned in ASIOCallbacks::bufferSwitchTimeInfo as below:
if (timeInfo->timeInfo.flags & kSamplePositionValid)
    asioDriverInfo.samples = ASIO64toDouble(timeInfo->timeInfo.samplePosition);
else
    asioDriverInfo.samples = 0;
This is the original code of the sample, so I can easily see that the elapsed time advances very slowly.
I've tried raising the sample rate even higher, say 2.8 MHz * 4; then it takes even longer for the time to advance by 1 second.
When I tried to lower the sample rate below 2.8 MHz, the API failed.
I have certainly set the sample format according to the SDK guide:
ASIOIoFormat aif;
memset(&aif, 0, sizeof(aif));
aif.FormatType = kASIODSDFormat;
ASIOSampleRate finalSampleRate = 176400;
if(ASE_SUCCESS == ASIOFuture(kAsioSetIoFormat, &aif)){
    finalSampleRate = 2822400;
}
In fact, without setting the sample format to DSD, setting the sample rate to 2.8 MHz leads to an API failure.
Finally, I remembered that DAWs (Cubase / Reaper, ...) have an option to set the thread priority, so I suspected the callback thread's priority was not high enough and tried to raise it to see if that would help. However, when I checked the thread priority, it was already THREAD_PRIORITY_TIME_CRITICAL.
static double processedSamples = 0;
if (processedSamples == 0)
{
    HANDLE t = GetCurrentThread();
    int p = GetThreadPriority(t);                  // I get THREAD_PRIORITY_TIME_CRITICAL here
    SetThreadPriority(t, THREAD_PRIORITY_HIGHEST); // So the priority doesn't need raising anymore... (SAD)
}
It's the same for the thread priority boost property: it's not disabled (already boosted).
Has anybody tried to write an ASIO host demo and can help me resolve this issue?
Thanks very much in advance.
Issue resolved.
I should call ASIOGetBufferSize and ASIOCreateBuffers after kAsioSetIoFormat.
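For reference, the corrected order looks roughly like this (a sketch based on the ASIO SDK 2.3 host sample; asioDriverInfo, asioCallbacks and the channel counts follow that sample and may need adjusting to your own code):

// 1. Switch the driver to DSD first.
ASIOIoFormat aif;
memset(&aif, 0, sizeof(aif));
aif.FormatType = kASIODSDFormat;

if (ASIOFuture(kAsioSetIoFormat, &aif) == ASE_SUCCESS)
{
    ASIOSetSampleRate(2822400.0);   // DSD64 rate

    // 2. Only now query the buffer sizes: after the format switch the driver may
    //    report different values, which is presumably why buffers created earlier
    //    behave incorrectly.
    long minSize, maxSize, preferredSize, granularity;
    ASIOGetBufferSize(&minSize, &maxSize, &preferredSize, &granularity);

    // 3. Create the buffers with the freshly queried preferred size, then start.
    //    (bufferInfos[i].isInput / channelNum are filled in beforehand, as in the
    //    sample's buffer-setup code.)
    ASIOCreateBuffers(asioDriverInfo.bufferInfos,
                      asioDriverInfo.inputChannels + asioDriverInfo.outputChannels,
                      preferredSize, &asioCallbacks);
    ASIOStart();
}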

libircclient : Selective connection absolutely impossible to debug

I'm not usually the type to post a question; I prefer to search for why something doesn't work first, but this time I have done everything I could and I just can't figure out what is wrong.
So here's the thing:
I'm currently programming an IRC Bot, and I'm using libircclient, a small C library to handle IRC connections. It's working pretty great, it does the job and is kinda easy to use, but ...
I'm connecting to two different servers, and so I'm using the custom networking loop, which uses the select function. On my personal computer, there's no problem with this loop, and everything works great.
But (Here's the problem), on my remote server, where the bot will be hosted, I can connect to one server but not the other.
I tried to debug everything I could. I even examined the sources of libircclient to see how it works and put some printfs where I could, so I can see where the failure comes from, but I don't understand why it happens.
So here's the code for the server (the irc_session_t objects are encapsulated, but it's normally quite easy to understand; feel free to ask for more information if you want):
// Connect the first session
first.connect();
// Connect the osu! session
second.connect();

// Initialize sockets sets
fd_set sockets, out_sockets;
// Initialize sockets count
int sockets_count;
// Initialize timeout struct
struct timeval timeout;

// Set running as true
running = true;

// While the server is running (Which means always)
while (running)
{
    // First session has disconnected
    if (!first.connected())
        // Reconnect it
        first.connect();

    // Second session has disconnected
    if (!second.connected())
        // Reconnect it
        second.connect();

    // Reset timeout values
    timeout.tv_sec = 1;
    timeout.tv_usec = 0;

    // Reset sockets count
    sockets_count = 0;

    // Reset sockets and out sockets
    FD_ZERO(&sockets);
    FD_ZERO(&out_sockets);

    // Add sessions descriptors
    irc_add_select_descriptors(first.session(), &sockets, &out_sockets, &sockets_count);
    irc_add_select_descriptors(second.session(), &sockets, &out_sockets, &sockets_count);

    // Select something. If it went wrong
    int available = select(sockets_count + 1, &sockets, &out_sockets, NULL, &timeout);

    // Error
    if (available < 0)
        // Error
        Utils::throw_error("Server", "run", "Something went wrong when selecting a socket");

    // We have a socket
    if (available > 0)
    {
        // If there was something wrong when processing the first session
        if (irc_process_select_descriptors(first.session(), &sockets, &out_sockets))
            // Error
            Utils::throw_error("Server", "run", Utils::string_format("Error with the first session: %s", first.get_error()));

        // If there was something wrong when processing the second session
        if (irc_process_select_descriptors(second.session(), &sockets, &out_sockets))
            // Error
            Utils::throw_error("Server", "run", Utils::string_format("Error with the second session: %s", second.get_error()));
    }
}
The problem in this code is that this line:
irc_process_select_descriptors(second.session(), &sockets, &out_sockets)
always returns an error the first time it is called, and only for one server. The weird thing is that on my Windows computer it works perfectly, while on the Ubuntu server it just doesn't want to, and I can't understand why.
I did some in-depth debugging, and I saw that libircclient does this:
if (session->state == LIBIRC_STATE_CONNECTING && FD_ISSET(session->sock, out_set))
And this is where everything goes wrong. The session state is correctly set to LIBIRC_STATE_CONNECTING, but the second condition, FD_ISSET(session->sock, out_set), always returns false. It returns true for the first session, but never for the second.
The two servers are irc.twitch.tv:6667 and irc.ppy.sh:6667. The server addresses are set correctly, and the server passwords are correct too, since everything works fine on my personal computer.
Sorry for the very long post.
Thanks in advance!
Alright, after some hours of debugging, I finally found the problem.
When a session starts connecting, it enters the LIBIRC_STATE_CONNECTING state, and then irc_process_select_descriptors checks this:
if (session->state == LIBIRC_STATE_CONNECTING && FD_ISSET(session->sock, out_set))
The problem is that select() modifies the fd sets and clears every descriptor that is not ready.
So if the socket wasn't ready by the time irc_process_select_descriptors was called, FD_ISSET returns 0, because select() had cleared that descriptor from the set.
I fixed it by just writing
if (session->state == LIBIRC_STATE_CONNECTING)
{
    if (!FD_ISSET(session->sock, out_set))
        return 0;
    ...
}
So the program simply waits until select() actually reports the socket as ready.
Sorry for not having checked everything!

C++ networking issue - my OS does not send ACK

I wrote an application in C++ which sends data over a TCP connection to several machines. As part of the protocol I use in my application, the other side sends heartbeat messages from time to time, so I know the connection is still alive.
Now I want this application to work with 100 machines or more at the same time. But I see that sometimes I don't get these heartbeat messages although they were sent: I can see in Wireshark that the packet arrived, yet my OS doesn't ACK the message, so there are retransmissions without any ACK from my OS. If I look at the window size, there is no issue in that part. What can be the root cause of this behavior? Is it something in my code that I should change?
This is my select code:
while(it != sockets.end()){
    FD_SET((*it), &readFds);
    if((int)(*it) > fdCount){
        fdCount = *it;
    }
    it++;
}
int res = select(fdCount, &readFds, NULL, NULL, NULL);
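For comparison, the conventional shape of such a loop is sketched below (assuming `sockets` is a std::vector<SOCKET> and `running` a flag, which is not shown in the question; note that Winsock ignores the first argument of select(), while POSIX expects the highest descriptor plus one, and that the read set must be rebuilt before every call):

#include <winsock2.h>
#include <vector>

void heartbeat_loop(std::vector<SOCKET>& sockets, volatile bool& running)
{
    while (running)
    {
        fd_set readFds;
        FD_ZERO(&readFds);                      // the set must be rebuilt every iteration
        int fdCount = 0;
        for (SOCKET s : sockets)
        {
            FD_SET(s, &readFds);
            if ((int)s > fdCount)
                fdCount = (int)s;
        }

        int res = select(fdCount + 1, &readFds, NULL, NULL, NULL);
        if (res == SOCKET_ERROR)
            break;                              // inspect WSAGetLastError() here

        for (SOCKET s : sockets)
            if (FD_ISSET(s, &readFds))
            {
                char buf[512];
                recv(s, buf, sizeof(buf), 0);   // read the heartbeat / data
            }
    }
}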
I'm using Windows Server 2008 R2.
I don't see any stress on the network card or on the switch.
Please help me!
Thanks!