I am having trouble figuring out sockets. I am just asking the server for data at a position (a glm::i64vec4) and expecting a response, but the position is way off when I get the response, and the data for that position reflects that (i.e. my voxel game makes a kind of cool-looking but useless mess).
It's probably just me not understanding sockets whatsoever, or maybe something weird with this library.
One thought I had was that it might have something to do with mismatched blocking and non-blocking modes on the server and client,
but when I switched the server to blocking (and put each client in a separate thread from each other and from the accepting process), it changed nothing.
If I'm doing something really stupid, please tell me; I know next to nothing about sockets.
Here is some code that probably looks horrible:
Server Code
std::deque <CActiveSocket*> clients;
CPassiveSocket socket;
socket.Initialize();
socket.SetNonblocking();//I'm doing this so i don't need multiple threads for clients
socket.Listen("0.0.0.0",port);
while (1){
    {
        CActiveSocket* c;
        if ((c = socket.Accept()) != NULL){
            clients.emplace_back(c);
        }
    }
    for (CActiveSocket*& c : clients){
        c->Receive(sizeof(glm::i64vec4));
        if (c->GetBytesReceived() == sizeof(glm::i64vec4)){
            chkpkt chk;
            chk.pos = *(glm::i64vec4*)c->GetData();
            LOOP3D(chksize+2){
                chk.data(i,j,k).val = chk.pos.y*chksize+j;
                chk.data(i,j,k).id=0;
            }
            while (c->Send((uint8*)&chk,sizeof(chkpkt)) != sizeof(chkpkt)){}
        }
    }
}
Client Code
//v is a glm::i64vec4
//fsock is set to Blocking
if(fsock.Send((uint8*)&v,sizeof(glm::i64vec4)))
    if (fsock.Receive(sizeof(chkpkt))){
        tthread::lock_guard<tthread::fast_mutex> lock(wld->filemut);
        wld->ichks[v]=(*(chkpkt*)fsock.GetData()).data;//i tried using the position i get back from the server to set this (instead of v) but that made it to where nothing loaded
        //i checked it and the chunks position never lines up with what i sent
    }
Without your complete application code, it's unrealistic to suggest corrections to particular lines.
But it seems like you are using this library. It doesn't matter if not, because in network programming the quirks of socket behavior make some problems more or less universal. So here are a few suggestions for the socket portion of your project:
It suffices to have BLOCKING sockets.
Most of the time, a socket read behaves somewhat unexpectedly: it might not receive the requested number of bytes in a single call. Because of this, you need to call read repeatedly until the whole message has been received. For a complete and robust solution, you can refer to Stevens's readn routine ([Ref.1], page 122).
If you are using exactly the library mentioned above, you can see that your fsock.Receive eventually calls recv. And recv is just a variant of read [Ref.2], so the solution is the same for both. This pattern might help:
while(fsock.Receive(sizeof(chkpkt))>0)
{
// ...
}
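To make the readn idea concrete, here is a minimal sketch. It assumes a hypothetical helper named read_exact and a caller-supplied callable read_some that behaves like recv (returns the number of bytes read, 0 when the peer closed the connection, a negative value on error). Neither name comes from the library in the question, and the sketch assumes a blocking socket.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper in the spirit of Stevens's readn: keep calling a
// recv-like function until exactly `len` bytes have arrived.
// `read_some(dst, max)` is an assumption: it must return the number of bytes
// read (> 0), 0 if the peer closed the connection, or < 0 on error.
template <typename ReadSome>
bool read_exact(ReadSome&& read_some, std::uint8_t* dst, std::size_t len) {
    std::size_t got = 0;
    while (got < len) {
        long n = read_some(dst + got, len - got);
        if (n <= 0) {
            return false;  // closed or failed before the full message arrived
        }
        got += static_cast<std::size_t>(n);
    }
    return true;
}
```

With a plain POSIX socket the callable could simply wrap recv(fd, dst, max, 0). The same loop would be needed on both sides: the server before it interprets the bytes as a position, and the client before it interprets them as a chkpkt.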
Ref.1: https://mathcs.clarku.edu/~jbreecher/cs280/UNIX%20Network%20Programming(Volume1,3rd).pdf
Ref.2: https://man7.org/linux/man-pages/man2/recv.2.html#DESCRIPTION
Related
Is there simple example code that shows how to create a non-blocking network-based bio from scratch? I do not need to verify right now or renegotiate, just get data flowing back and forth first.
I'm trying to use openssl on top of an already existing abstraction of epoll, sockets, and buffers. I already have existing machinery for all of that and am trying to create a BIO over it, but I cannot for the life of me get it to work. I already have an established TCP connection, so I need to insert that into a source/sink bio and then do the handshake.
The current state is the dreaded scenario where SSL_connect returns -1, SSL_get_error returns 5 (syscall error), and errno reads SUCCESS. I have seen others have the same problem, but not a single answer.
The reason for doing this instead of just using a mem bio to shuttle bytes back and forth is that the rest of the stack is fairly well optimized, and I don't want to do the extra copying.
My first idea was just to implement a bio over the in and out buffers I already maintain, but I cannot get that to work either. There are a lot of questions floating around this site and others. Some have outdated answers, but most either have no answers at all or have answers that don't work.
Well, interesting. I would have made this a comment but it's too long.
I think this might boil down to the version of OpenSSL you are using. I took my existing blocking socket implementation and did this (sorry, it's a bit quick and dirty and doesn't bother to clear the error stack as it should, but that doesn't seem to be causing a problem. The busy loop is intentional, to hammer OpenSSL as hard as possible, but I also tried it with a short sleep and the result was the same):
u_long non_blocking = 1;
int ioctl_err = ioctlsocket (skt, FIONBIO, &non_blocking);
assert (ioctl_err == 0);
int connect_result = 0;
int connect_err = 0;
for ( ; ; )
{
    connect_result = SSL_connect (ssl);
    if (connect_result >= 0)
        break;
    connect_err = SSL_get_error (ssl, connect_result);
    if (connect_err != SSL_ERROR_WANT_READ && connect_err != SSL_ERROR_WANT_WRITE)
        break; // I put a breakpoint here; it was never hit
}
non_blocking = 0; // back to blocking, so that the rest of my code still works
ioctl_err = ioctlsocket (skt, FIONBIO, &non_blocking);
if (connect_result <= 0) // this never happened either
    do_something_appropriate ();
// Various (synchronous) calls to `SSL_read` and `SSL_write`, these all worked fine
Now I know this isn't exactly what you are doing, but from OpenSSL's point of view that shouldn't matter. So, for me, I can get it to work.
Testing environment:
Windows (sorry!)
OpenSSL 3.0.1 (which, if not the latest, is not far off)
tl;dr If you're using the OpenSSL libraries that came with your Linux installation, it might be time to move on. Plus, if you build it from source you can build it with debug info, which might come in handy some time (I've actually built two versions for exactly this reason - one optimised and one not).
So that's it. HTH. Looks like you might be OK after all.
PS: I'm obviously not doing anything with BIOs here, and maybe your problem lies there instead. If so, we need some self-contained sample code using those which exhibits the problems you are having. Then, perhaps, someone can suggest a solution.
Define BIO method:
BIO_meth_new
BIO_meth_set_write_ex
BIO_meth_set_read_ex
BIO_meth_set_ctrl
BIO_meth_free
Create BIO:
BIO_new
BIO_set_data
BIO_free
Associate BIO with SSL:
SSL_set_bio
I am using linux epoll in edge trigger mode.
Each time a new connection is incoming, I add the file descriptor to epoll with EPOLLIN|EPOLLOUT|EPOLLET flag. My first question is: What's the right way to check which kind of event(s) occur for each ready file descriptor after the epoll_wait returns? I mean, I see some example code e.g from https://github.com/yedf/handy/blob/master/raw-examples/epoll-et.cc line 124 do it like this:
for (int i = 0; i < n; i++) {
    //...
    if (events & (EPOLLIN | EPOLLERR)) {
        if (fd == lfd) {
            handleAccept(efd, fd);
        } else {
            handleRead(efd, fd);
        }
    } else if (events & EPOLLOUT) {
        if (output_log)
            printf("handling epollout\n");
        handleWrite(efd, fd);
    } else {
        exit_if(1, "unknown event");
    }
}
What caught my attention is that it uses "if / else if / else" to check which event occurred, which means that if it calls handleRead, it can't call handleWrite in the same iteration. I think this may lose an event in the following situation: both the socket read and write operations have hit EAGAIN, and then the remote end both reads and sends some data, so epoll_wait may set both EPOLLIN and EPOLLOUT. But only handleRead runs, and the data remaining in the output buffer can't be sent since handleWrite is not called.
So is the above usage wrong?
According to the Q&A section of man 7 epoll:
If more than one event occurs between epoll_wait(2) calls, are
they combined or reported separately?
They will be combined.
If I got it right, several events can occur on a single file descriptor between epoll_wait calls. So I think I should use a series of independent "if" checks to test one by one whether readable/writable/error events occurred, instead of using "if / else if". I went to see how the nginx epoll module does it; at https://github.com/nginx/nginx/blob/953f53921505a884f3912f2d8db5217a71c0479a/src/event/modules/ngx_epoll_module.c#L867 I see the following code:
if (revents & (EPOLLERR|EPOLLHUP)) {
    //...
}
if ((revents & EPOLLIN) && rev->active) {
    //....
    rev->handler(rev);
}
if ((revents & EPOLLOUT) && wev->active) {
    //....
    wev->handler(wev);
}
It seems to match my idea of checking the EPOLLERR/EPOLLHUP, EPOLLIN, and EPOLLOUT events one after another.
Then I did the same kind of thing as nginx in my application. But what I realized after experimenting is this: if I add the file descriptor to epoll with the EPOLLIN|EPOLLOUT|EPOLLET flags and I haven't filled up the output buffer, I always get the EPOLLOUT flag set when epoll_wait returns because some data arrived and the fd became readable. As a result, a redundant write_handler call is made, which is not what I expect.
I did some searching and found that this situation really exists and is not caused by a bug in my application. The top-voted answer to "epoll with edge triggered event" says:
On a somewhat related note: if you register for EPOLLIN and EPOLLOUT events and assuming you never fill up the send buffer, you still get the EPOLLOUT flag set in the event returned by epoll_wait each time EPOLLIN is triggered - see https://lkml.org/lkml/2011/11/17/234 for a more detailed explanation.
And the link in this answer says:
It's doesn't mean there's an EPOLLOUT "event", it just means a message
is triggered (by the socket becoming readable) so you get a status
update. In theory the program doesn't need to be told about EPOLLOUT
here (it should be assuming the socket is writable already), but it
doesn't do any harm.
So far, what I understand about epoll edge-triggered mode is:
epoll_wait returns when the state of any monitored fd has changed, e.g. from nothing to read -> readable, or from buffer full -> buffer writable.
epoll_wait may return one or several event flags for each fd in the ready list.
the flags in the struct epoll_event.events field indicate the current state of the fd. Even if we haven't filled up the output buffer, the EPOLLOUT flag will be set when epoll_wait returns because of readability, since the current state of the fd is simply writable.
Please correct me if I am wrong.
Then my question would be: should I maintain a flag in each connection to indicate whether EAGAIN occurred when writing to the output buffer, and, if it is not set, skip calling write_handler/handleWrite in the "if (events & EPOLLOUT)" branch, so that my upper-layer program is not told about EPOLLOUT here?
What a great question (since I had pretty much the same question)! I'll just summarize what I think I know now wrt your informative question/description and your helpful links, and hopefully smarter folk will correct any mistakes.
Yes, the if/else handling of event flags is definitely bogus. For sure, at least two events can arrive at effectively the same time. E.g., both the read and write sides might have become unblocked since you last called epoll_wait(). And, of course, as soon as you accept() the connection, both reading and writing suddenly become possible, so you get an "event" of EPOLLIN|EPOLLOUT.
I really didn't grok that epoll_wait() is always delivering the entire current state, rather than only the parts of the state that changed -- thanks for clearing that up. To be perhaps clearer, epoll_wait() won't return an fd unless something changed on that socket, but if something did change, it returns all the flags representing the current state. So, I found myself staring at a stream of EPOLLIN|EPOLLOUT events wondering why it was claiming there was an "output" event, even though I hadn't written anything yet. Your answer being correct: it's just telling me the output side is still writeable.
"Should I maintain a flag..." Yes, but I would imagine that in all but the most trivial situations you were probably going to end up maintaining at least one bit of "am I currently blocked" state for your readers/writers anyway. For example, if you ever want to process data in an order different than how it arrives (e.g., prioritize responses over requests to make your server more resistant to overload) you instantly have to give up the simplicity of just having the arrival of I/O drive everything. In the particular case of writing, epoll simply doesn't have enough information to notify you at the "right" time. As soon as you accept a connection, there's an event that says "you can write now"--but you probably have nothing to write if you're a server who couldn't possibly have already gotten a request from the client. epoll just can't know whether you have something to write or not, so you were always going to have to either suffer essentially "extraneous" events, or maintain your own state.
In all but the simplest cases, the socket file descriptor ends up being insufficient information for handling I/O events, so you invariably have to associate some data structure with it, or object if you prefer. So, my C++ looks something like:
nAwake = epoll_wait(epollFd, events, 100, milliseconds);
if(nAwake < 0)
{
    perror("epoll_wait failed");
    assert(false);
}
for(int iSocket=0; iSocket < nAwake; ++iSocket)
{
    auto This = static_cast<Eventable*>(events[iSocket].data.ptr);
    auto eventFlags = events[iSocket].events;
    fprintf(stderr, "%s event on socket [%d] -> %s\n",
            This->ClassName(), This->fd, DumpEvent(eventFlags));
    This->Event(eventFlags);
}
Where Eventable is a C++ class (or derivative thereof) that has all the state needed to decide how to handle the flags epoll delivers. (Of course, this is letting the kernel store a pointer to a C++ object, requiring a design that is very clear about pointer ownership/lifetimes.)
And since you're writing low-level code on Linux, you may also care about EPOLLRDHUP. This not-highly-portable flag lets you save one call to read(). If the client (curl seems pretty good at evoking this behavior) closes its write side of the connection (sends a FIN), you normally discover that when epoll tells you EPOLLIN, but read() returns zero bytes. However, Linux maintains an extra bit to indicate your client's write side (your read side) has been closed. So, if you tell epoll you want the EPOLLRDHUP event you can use it to avoid doing a read() whose sole purpose will turn out to be telling you the writer closed their side.
Note that EPOLLIN will still be turned on whenever EPOLLRDHUP is, AFAIK. Even after you do a shutdown(fd, SHUT_RD). Another example of how you will usually be driven to maintain your own idea of the state of the connection. You care more about clients who are kind enough to do half-shutdowns if you are implementing HTTP.
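The "maintain your own state" advice above can be sketched roughly as follows. This is only an illustration: the Connection struct, its write_blocked flag, and the dispatch function are hypothetical names, not taken from the question, nginx, or any library. It checks each flag independently and acts on EPOLLOUT only when an earlier write hit EAGAIN.

```cpp
#include <sys/epoll.h>
#include <cstdint>

// Hypothetical per-connection state; all names here are illustrative only.
struct Connection {
    bool write_blocked = false;  // set by the writer when write() returned EAGAIN
    bool closed = false;
    int reads_handled = 0;
    int writes_handled = 0;

    void on_readable() { ++reads_handled; }  // real code: read until EAGAIN
    void on_writable() {                     // real code: flush queued output,
        ++writes_handled;                    // set write_blocked again on EAGAIN
        write_blocked = false;
    }
    void on_error() { closed = true; }
};

// Check every flag independently (no else-if), and treat EPOLLOUT as
// interesting only while this connection is waiting to become writable.
inline void dispatch(Connection& c, std::uint32_t events) {
    if (events & (EPOLLERR | EPOLLHUP)) {
        // Simplification: a real server may want to drain readable data first.
        c.on_error();
        return;
    }
    if (events & EPOLLIN) {
        c.on_readable();
    }
    if ((events & EPOLLOUT) && c.write_blocked) {
        c.on_writable();
    }
}
```

With this shape, the EPOLLIN|EPOLLOUT status updates that arrive whenever data comes in cost one extra bit test per event instead of a redundant write handler call.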
When used as an edge-triggered interface, for performance reasons, it
is possible to add the file descriptor inside the epoll interface
(EPOLL_CTL_ADD) once by specifying (EPOLLIN|EPOLLOUT). This allows you
to avoid continuously switching between EPOLLIN and EPOLLOUT calling
epoll_ctl(2) with EPOLL_CTL_MOD.
First of all, a little background on my situation:
- Qt/C++ UI desktop application
- embedded device (Stm32l4xx family) + ATWINC1500 wifi module
I'm developing the GUI application in order to send commands and files to the embedded device via sockets.
For simple commands everything works, but for sending files (text files in GCODE format) I am stuck with some issues.
The embedded device already has its socket management (written by a third-party company, so I cannot modify the way sockets are managed), and the reception of that type of file is handled so that the API waits for every single line of the file being sent and then writes it into a reserved portion of the flash.
My problem is that when I send a file from the Qt application (by reading each line and calling write() on it), in reality my socket sends an entire chunk of the file, around 50 lines, and my device fails to handle the file reception.
My sending code is this:
void sendGCODE(const QString fileName)
{
    QFile *file = new QFile(fileName,this);
    if (file->open(QIODevice::ReadOnly))
    {
        while (!file->atEnd())
        {
            QByteArray bytes(file->readLine());
            // write() returns the number of bytes written, or -1 on error
            qint64 result = communicationSocket->write(bytes);
            communicationSocket->flush();
            if(result != -1)
            {
                console->append("-> GCODE line sent:"+ QString(bytes));
            }
            else
            {
                console->append("-> Error sending GCODE line!");
            }
        }
        file->close();
    }
}
Do any of you have hints on what I am doing wrong?
I've already searched, and someone suggested in another topic that it would be better to use UDP instead of TCP sockets for this purpose, but unfortunately I cannot touch the embedded-device-side code.
Thank you all!
EDIT
After suggestions from the comments, I've sniffed the TCP packets and they are sent correctly (i.e. each packet contains a single line). BUT... at the receiver (device), I found that something regarding memory is not managed well. An example:
sender sends the line "G1 X470.492 Y599.623 F1000"; receiver correctly receives the string "G1 X470.492 Y599.623 F1000"
next, if the line is shorter than the previous one, e.g. sending "G1 Z5", the receiver receives "G1 Z5\n\n.492 Y599.623 F1000". So it is clear that the buffer used to store the packet data is not re-initialized between packets: the new data overwrites the start of the buffer and the rest is left over from the previous packet.
I'm trying to figure out how I could reset that part of memory.
This is all wrong. TCP is not a message-oriented protocol. There is no way to ensure that the TCP packets contain any particular amount of data. The receiver code on the device mustn't expect that either - you perhaps misunderstood the receiver's code, or are otherwise doing something wrong (or the vendor is). What the receiver must do is wait for a packet, add the packet's data to a buffer, then extract and process as many complete lines as it can, then move the remaining data to the beginning of the buffer. And repeat that on every packet.
Thus you're looking for the wrong problem at the wrong place, unless your device never ever had a chance of working. If that device works OK with other software, then your "packetized" TCP assumption doesn't hold any water.
Here's how to proceed:
If the device is commercially available and has been tested to work, then you're looking in the wrong place.
If the device is a new product and still in development, then someone somewhere did something particularly stupid and you either need to fix that stupidity, or have the vendor fix it, or hire a consultant to fix it. But just to be completely clear: that's not how TCP works, and you cannot just accept that "it's how it is".
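The receiver-side buffering described above (append whatever arrives, extract every complete line, keep the remainder) can be sketched in a few lines of C++. The class name LineBuffer and its feed method are hypothetical, and the sketch assumes lines are terminated by '\n'; the real device firmware is not shown in the question and may use a different framing.

```cpp
#include <string>
#include <vector>

// Hypothetical receiver-side helper: accumulates arbitrary TCP chunks and
// returns only complete '\n'-terminated lines, keeping any partial tail.
class LineBuffer {
public:
    std::vector<std::string> feed(const std::string& chunk) {
        pending_ += chunk;
        std::vector<std::string> lines;
        std::string::size_type start = 0;
        for (;;) {
            std::string::size_type nl = pending_.find('\n', start);
            if (nl == std::string::npos) {
                break;  // no more complete lines in the buffer
            }
            lines.push_back(pending_.substr(start, nl - start));
            start = nl + 1;
        }
        pending_.erase(0, start);  // move the unfinished remainder to the front
        return lines;
    }

private:
    std::string pending_;
};
```

Because the buffer only ever hands out complete lines, it does not matter whether the sender's lines arrive one per packet, fifty per packet, or split across packets.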
I am streaming data as a string over UDP, into a Socket class inside Unreal engine. This is threaded, and runs in the background.
My read function is:
float translate;
void FdataThread::ReceiveUDP()
{
    uint32 Size;
    TArray<uint8> ReceivedData;
    if (ReceiverSocket->HasPendingData(Size))
    {
        int32 Read = 0;
        ReceivedData.SetNumUninitialized(FMath::Min(Size, 65507u));
        ReceiverSocket->RecvFrom(ReceivedData.GetData(), ReceivedData.Num(), Read, *targetAddr);
    }
    FString str = FString(bytesRead, UTF8_TO_TCHAR((const UTF8CHAR *)ReceivedData));
    translate = FCString::Atof(*str);
}
I then call the translate variable from another class, on a Tick, or timer.
My test case sends an incrementing number from another application.
If I print this number from inside the above Read function, it looks as expected, counting up incrementally.
When I print it from the other thread, some of the numbers are missing.
I believe this is because I read it on the Tick, so it misses some data due to processing time.
My question is:
Is there a way to queue the incoming data, so that when I pull the value, it is the next incremental value and not the current one? What is the best way to go about this?
Thank you, please let me know if I have not been clear.
Is this the complete code? ReceivedData isn't used after it's filled with data from the socket. Instead, an (in this code) undefined variable 'buffer' is being used.
Also, it seems that the while loop could run multiple times, overwriting old data in the ReceivedData buffer. Add some debugging messages to see whether RecvFrom actually reads all bytes from the socket. I believe it reads only one 'packet'.
Finally, especially when you're using UDP sockets over the network, note that the UDP protocol isn't guaranteed to actually deliver its packets. However, I doubt this is causing your problems if you're using it on a single computer or a local network.
Your read loop doesn't make sense. You are reading and throwing away all datagrams but the last in any given sequence that happen to be in the socket receive buffer at the same time. The translate call should be inside the loop, and the loop should be while(true), or while (running), or similar.
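As for queueing the values so that the consumer sees every one of them in order, one possible shape is a small mutex-protected FIFO shared between the receive thread and the Tick. This is a sketch in standard C++ rather than Unreal types; the ValueQueue class and its method names are hypothetical. Unreal also ships its own TQueue container, which may be a better fit inside the engine.

```cpp
#include <deque>
#include <mutex>

// Hypothetical hand-off queue between the receive thread and the game thread.
class ValueQueue {
public:
    // Called by the receive thread for every datagram it has parsed.
    void push(float v) {
        std::lock_guard<std::mutex> lock(mutex_);
        values_.push_back(v);
    }

    // Called from Tick: returns false when nothing is waiting, otherwise
    // stores the oldest unread value in `out` and removes it from the queue.
    bool try_pop(float& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (values_.empty()) {
            return false;
        }
        out = values_.front();
        values_.pop_front();
        return true;
    }

private:
    std::mutex mutex_;
    std::deque<float> values_;
};
```

The receive loop would call push once per datagram instead of overwriting a single translate variable, and the Tick would call try_pop in a loop (or once per frame) to consume values in arrival order. If the sender is faster than the consumer, the queue grows, so a size cap may be needed.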
Ok, one for the SO hive mind...
I have code which has - until today - run just fine on many systems and is deployed at many sites. It involves threads reading and writing data from a serial port.
Trying to check out a new device, my code was swamped with 995 ERROR_OPERATION_ABORTED errors when calling GetOverlappedResult after the ReadFile. Sometimes the read would work, other times I'd get this error. Just ignoring the error and retrying would - amazingly - work without dropping any data. No ClearCommError required.
Here's the snippet.
if (!ReadFile(handle,&c,1,&read, &olap))
{
    if (GetLastError() != ERROR_IO_PENDING)
    {
        logger().log_api(LOG_ERROR,"ser_rx_char:ReadFile");
        throw Exception("ser_rx_char:ReadFile");
    }
}
WaitForSingleObjectEx(r_event, INFINITE, true); // alertable, so, thread can be closed correctly.
if (GetOverlappedResult(handle,&olap,&read, TRUE) != 0)
{
    if (read != 1)
        throw Exception("ser_rx_char: no data");
    logger().log(LOG_VERBOSE,"read char %d ( read = %d) ",c, read);
}
else
{
    DWORD err = GetLastError();
    if (err != 995) //Filters out ERROR_OPERATION_ABORTED
    {
        logger().log_api(LOG_ERROR,"ser_rx_char: GetOverlappedResult");
        throw Exception("ser_rx_char:GetOverlappedResult");
    }
}
My first guess is to blame the COM port driver, which I haven't used before (it's an RS422 port on a Blackmagic Decklink, FYI), but that feels like a cop-out.
Oh, and Vista SP1 Business 32-bit, for my sins.
Before I just put this down to "Someone else's problem", does anyone have any ideas of what might cause this?
How are you setting up the OVERLAPPED structure before the ReadFile? I always zero them (other than the hEvent, obviously), which is perhaps part superstition, but I have a feeling that not doing so has caused me a problem in the past.
I'm afraid blaming the driver (if it's non-MS and not just a tiny tweak from the reference) is not completely unrealistic. To write a COM driver is an incredibly complex thing, and the difficulty with testing it is that every application ever written uses the serial ports and their IOCTLs slightly differently.
Another common problem is not to set the whole port up - for example not calling SetCommTimeouts or SetupComm. I've no idea if you're making this sort of mistake, but I have met people who say they're not using timeouts when they actually mean that they didn't call SetCommTimeouts so they're using them but don't have a notion what they're set to...
This kind of stuff can be murder for 3rd-party COM drivers, because people have often got away with any old crap with the MS driver, and it doesn't always work the same with another device.
In addition to zeroing the OVERLAPPED, you might also check how you're setting olap.hEvent, that is, what are your arguments to CreateEvent? If you're creating an event that's pre-signalled (i.e. the third argument to CreateEvent is TRUE) I would expect an immediate return. Also, don't forget that if you specify manualReset (the second argument to CreateEvent) as FALSE, GetOverlappedResult() will helpfully clear the event for you - which might explain why it works the second time around.
Can't really tell from your snippet whether either of these affect you - hope this helps.