libwebsocket: unable to write frame bigger than 7160 bytes - c++

I'm dealing with a libwebsockets issue that I'm not able to understand.
Please use the code below as a reference:
int write_buffer_size = 8000 +
LWS_SEND_BUFFER_PRE_PADDING +
LWS_SEND_BUFFER_POST_PADDING;
unsigned char *write_buffer = new unsigned char[write_buffer_size];
/* ... other code
write_buffer is filled in some way that is not important for the question
*/
n = libwebsocket_write(wsi, &write_buffer[LWS_SEND_BUFFER_PRE_PADDING], write_len,
(libwebsocket_write_protocol)write_mode);
if (n < 0) {
cerr << "ERROR " << n << " writing to socket, hanging up" << endl;
if (utils) {
log = "wsmanager::error: hanging up writing to websocket";
utils->writeLog(log);
}
return -1;
}
if (n < write_len) {
cerr << "Partial write: " << n << " < " << write_len << endl;
if (utils) {
log = "wsmanager-error: websocket partial write";
utils->writeLog(log);
}
return -1;
}
When I try to send more than 7160 bytes I always receive the same error, e.g. Partial write: 7160 < 8000.
Do you have any kind of explanation for that behavior?
I allocated a buffer with 8000 bytes reserved for the payload, so I was expecting to be able to send up to 8K, but 7160 bytes seems to be the maximum amount of data I can send.
Any help is appreciated, thanks!

I have encountered a similar problem with an older version of libwebsockets. Although I didn't track the exact limit, it was pretty much the same thing: n < write_len. I think my limit was much lower, below 2048 bytes, and I knew that the same code worked fine with a newer version of libwebsockets (on a different machine).
Since Debian Jessie doesn't have lws v1.6 in its repositories, I built it from the GitHub sources. Consider upgrading, it may help solve your problem. Beware, they have changed the API. It was mostly renaming methods from libwebsocket_* to lws_*, but some arguments changed as well. Check this pull request, which migrates a boilerplate libwebsockets server to version 1.6. Most of these changes will affect your code.

We solved the issue by updating libwebsockets to version 1.7.3.
We also optimized the code using a custom callback invoked when the channel is writable:
void
WSManager::onWritable() {
    int ret, n;
    struct fragment *frg;
    pthread_mutex_lock(&send_queue_mutex);
    if (!send_queue.empty() && !lws_partial_buffered(wsi)) {
        frg = send_queue.front();
        // the payload starts LWS_PRE bytes into the buffer, as lws_write() requires
        n = lws_write(wsi, frg->content + LWS_PRE, frg->len, (lws_write_protocol)frg->mode);
        ret = checkWsWrite(n, frg->len);
        if (ret >= 0 && !lws_partial_buffered(wsi)) {
            if (frg->mode == WS_SINGLE_FRAGMENT || frg->mode == WS_LAST_FRAGMENT)
                signalResponseSent();
            // pop the fragment and free its memory only if lws_write was successful
            send_queue.pop();
            delete frg;
        }
    }
    pthread_mutex_unlock(&send_queue_mutex);
}
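For completeness, a hedged sketch of the sending side under the same design: the fragment struct, send_queue, send_queue_mutex and wsi members are the ones used by the callback above, WSManager::send is a hypothetical name, and lws_callback_on_writable() is the real libwebsockets call that makes the writable callback fire.
// Hypothetical enqueue-and-request-writable helper; everything except the lws_* calls
// and LWS_PRE is an assumption based on the callback above.
void
WSManager::send(const unsigned char *payload, size_t len, int mode) {
    struct fragment *frg = new fragment;
    frg->len = len;
    frg->mode = mode;
    // lws_write() needs LWS_PRE bytes of headroom in front of the payload
    frg->content = new unsigned char[LWS_PRE + len];
    memcpy(frg->content + LWS_PRE, payload, len);

    pthread_mutex_lock(&send_queue_mutex);
    send_queue.push(frg);
    pthread_mutex_unlock(&send_queue_mutex);

    // ask libwebsockets to call us back (LWS_CALLBACK_SERVER_WRITEABLE -> onWritable())
    lws_callback_on_writable(wsi);
}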

Related

Using move_pages() to move hugepages?

This question is for:
kernel 3.10.0-1062.4.3.el7.x86_64
non-transparent hugepages allocated via boot parameters, which might or might not be mapped to a file (e.g. mounted hugepages)
x86_64
According to this kernel source, move_pages() will call do_pages_move() to move a page, but I don't see how it indirectly calls migrate_huge_page().
So my questions are:
Can move_pages() move hugepages? If yes, should the page boundary be 4KB or 2MB when passing an array of page addresses? It seems like there was a patch for supporting hugepage migration 5 years ago.
If move_pages() cannot move hugepages, how can I move hugepages?
After moving hugepages, can I query the NUMA IDs of hugepages the same way I query regular pages, like in this answer?
In the code below, it seems like I can move hugepages via move_pages() with page size = 2MB, but is that the correct way?
#include <cstdint>
#include <cstdlib>    // strtoul, exit
#include <algorithm>  // std::fill_n
#include <iostream>
#include <numaif.h>
#include <sys/mman.h>
#include <fcntl.h>
#include <errno.h>
#include <unistd.h>
#include <string.h>
#include <limits>
int main(int argc, char** argv) {
const int32_t dst_node = strtoul(argv[1], nullptr, 10);
const constexpr uint64_t size = 4lu * 1024 * 1024;
const constexpr uint64_t pageSize = 2lu * 1024 * 1024;
const constexpr uint32_t nPages = size / pageSize;
int32_t status[nPages];
std::fill_n(status, nPages, std::numeric_limits<int32_t>::min());
void* pages[nPages];
int32_t dst_nodes[nPages];
void* ptr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_ANONYMOUS | MAP_PRIVATE | MAP_HUGETLB, -1, 0);
if (ptr == MAP_FAILED) {
throw "failed to map hugepages";
}
memset(ptr, 0x41, nPages*pageSize);
for (uint32_t i = 0; i < nPages; i++) {
pages[i] = &((char*)ptr)[i*pageSize];
dst_nodes[i] = dst_node;
}
std::cout << "Before moving" << std::endl;
if (0 != move_pages(0, nPages, pages, nullptr, status, 0)) {
std::cout << "failed to inquiry pages because " << strerror(errno) << std::endl;
}
else {
for (uint32_t i = 0; i < nPages; i++) {
std::cout << "page # " << i << " locates at numa node " << status[i] << std::endl;
}
}
// real move
if (0 != move_pages(0, nPages, pages, dst_nodes, status, MPOL_MF_MOVE_ALL)) {
std::cout << "failed to move pages because " << strerror(errno) << std::endl;
exit(-1);
}
const constexpr uint64_t smallPageSize = 4lu * 1024;
const constexpr uint32_t nSmallPages = size / smallPageSize;
void* smallPages[nSmallPages];
int32_t smallStatus[nSmallPages] = {std::numeric_limits<int32_t>::min()};
for (uint32_t i = 0; i < nSmallPages; i++) {
smallPages[i] = &((char*)ptr)[i*smallPageSize];
}
std::cout << "after moving" << std::endl;
if (0 != move_pages(0, nSmallPages, smallPages, nullptr, smallStatus, 0)) {
std::cout << "failed to inquiry pages because " << strerror(errno) << std::endl;
}
else {
for (uint32_t i = 0; i < nSmallPages; i++) {
std::cout << "page # " << i << " locates at numa node " << smallStatus[i] << std::endl;
}
}
}
And should I query the NUMA IDs based on a 4KB page size (like the code above does), or 2MB?
For the original 3.10 Linux kernel (not the Red Hat patched one, as I have no LXR for RHEL kernels), the move_pages syscall will force splitting of a huge page (2MB; both THP and hugetlbfs styles) into small pages (4KB). move_pages works in rather short chunks (around 0.5MB if I calculated correctly) and the call graph looks like:
move_pages .. -> migrate_pages -> unmap_and_move ->
static int unmap_and_move(new_page_t get_new_page, unsigned long private,
struct page *page, int force, enum migrate_mode mode)
{
struct page *newpage = get_new_page(page, private, &result);
....
if (unlikely(PageTransHuge(page)))
if (unlikely(split_huge_page(page)))
goto out;
PageTransHuge returns true for both kinds of hugepages (THP and hugetlbfs):
https://elixir.bootlin.com/linux/v3.10/source/include/linux/page-flags.h#L411
PageTransHuge() returns true for both transparent huge and hugetlbfs pages, but not normal pages.
And split_huge_page will call split_huge_page_to_list which:
Split a hugepage into normal pages. This doesn't change the position of head page.
The split will also increment the vm_event counter of kind THP_SPLIT. The counters are exported in /proc/vmstat ("this file displays various virtual memory statistics"). You can check this counter with the (UUOC) command cat /proc/vmstat | grep thp_split before and after your test.
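For instance, a minimal sketch (not from the original answer) that reads this counter from C++; it assumes the 3.10-era counter name thp_split, while newer kernels split it into thp_split_page and thp_split_pmd:
#include <fstream>
#include <string>

// Read the thp_split counter from /proc/vmstat; returns -1 if the counter is absent.
long readThpSplit() {
    std::ifstream vmstat("/proc/vmstat");
    std::string key;
    long value;
    while (vmstat >> key >> value)
        if (key == "thp_split")
            return value;
    return -1;
}
// Call it before and after the move_pages() test and compare the two values.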
There was some code for hugepage migration in 3.10, the unmap_and_move_huge_page function, but it is not called from move_pages. Its only user in 3.10 was migrate_huge_page, which is called only from the memory-failure handler soft_offline_huge_page (__soft_offline_page) (added in 2010):
Soft offline a page, by migration or invalidation,
without killing anything. This is for the case when
a page is not corrupted yet (so it's still valid to access),
but has had a number of corrected errors and is better taken
out.
Answers:
Can move_pages() move hugepages? If yes, should the page boundary be 4KB or 2MB when passing an array of page addresses? It seems like there was a patch for supporting hugepage migration 5 years ago.
The standard 3.10 kernel has a move_pages that accepts an array "pages" of 4KB page pointers; it will break (split) the huge page into 512 small pages and then migrate the small pages. There is very little chance of them being merged back by THP, as move_pages makes separate requests for physical memory pages and they will almost always be non-contiguous.
Don't pass pointers at "2MB" granularity; it will still split every huge page mentioned and migrate only the first 4KB small page of that memory.
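To illustrate that point, a hedged sketch reusing ptr, size and dst_node from the question's program: on such a kernel you would pass one entry per 4KB page to the migrating call as well, not only to the status query.
// One pages[] / dst_nodes[] entry per 4KB page, so every small page the huge page is
// split into actually gets migrated (ptr, size and dst_node as in the question's code).
const constexpr uint64_t smallPageSize = 4lu * 1024;
const constexpr uint32_t nSmallPages = size / smallPageSize;
void* smallPages[nSmallPages];
int32_t smallDstNodes[nSmallPages];
int32_t smallStatus[nSmallPages];
for (uint32_t i = 0; i < nSmallPages; i++) {
    smallPages[i] = &((char*)ptr)[i * smallPageSize];
    smallDstNodes[i] = dst_node;
}
if (0 != move_pages(0, nSmallPages, smallPages, smallDstNodes, smallStatus, MPOL_MF_MOVE_ALL)) {
    std::cout << "failed to move pages because " << strerror(errno) << std::endl;
}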
The 2013 patch was not added to the original 3.10 kernel:
v2 https://lwn.net/Articles/544044/ "extend hugepage migration" (3.9);
v3 https://lwn.net/Articles/559575/ (3.11)
v4 https://lore.kernel.org/patchwork/cover/395020/ (click on Related to get to the individual patches, for example the move_pages patch)
The patch seems to have been accepted in September 2013: https://github.com/torvalds/linux/search?q=+extend+hugepage+migration&type=Commits
If move_pages() cannot move hugepages, how can I move hugepages?
move_pages will move the data of hugepages, but as small pages. You can either allocate a huge page manually on the correct NUMA node and copy your data (copy it twice if you want to keep the virtual address), or update the kernel to a version with the patch and use the methods and tests of the patch author, Naoya Horiguchi (JP). There is a copy of his tests here: https://github.com/srikanth007m/test_hugepage_migration_extension
(https://github.com/Naoya-Horiguchi/test_core is required)
https://github.com/srikanth007m/test_hugepage_migration_extension/blob/master/test_move_pages.c
I'm not sure how to start the test and how to check that it works correctly, but for ./test_move_pages -v -m private -h 2048 runs with a recent kernel it does not increment the THP_SPLIT counter.
His test looks very similar to ours: mmap, memset to fault in the pages, filling the pages array with pointers to small pages, and numa_move_pages.
After moving hugepages, can I query the NUMA IDs of hugepages the same way I query regular pages, like in this answer?
You can query the status of any memory by providing a correct "pages" array to the move_pages syscall in query mode (with null nodes). The array should list every small page of the memory region you want to check.
If you know a reliable method to check whether the memory is mapped to a huge page or not, you can query any small page of the huge page. I think there can be a probabilistic method if you can export the physical address from the kernel to user space (using some LKM module, for example): for a huge page the virtual and physical addresses will always share their 21 low bits, while for small pages the bits will coincide only about once in a million tests. Or just write an LKM to export the PMD directory.
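As a hedged alternative to a custom LKM, the page frame number can also be read from /proc/self/pagemap (on recent kernels the PFN field is zeroed unless the process has CAP_SYS_ADMIN), which allows exactly the 21-low-bit comparison described above; probablyHugeMapped is an illustrative name, not something from the original answer.
#include <cstdint>
#include <fcntl.h>
#include <unistd.h>

// Returns true if the 21 low bits of the virtual and physical addresses match,
// which makes it very likely the mapping is backed by a 2MB page.
bool probablyHugeMapped(const void* addr) {
    const uint64_t pageSize = 4096;
    const uint64_t vaddr = reinterpret_cast<uintptr_t>(addr);

    int fd = open("/proc/self/pagemap", O_RDONLY);
    if (fd < 0)
        return false;

    uint64_t entry = 0;
    off_t offset = (vaddr / pageSize) * sizeof(entry);   // one 8-byte entry per 4KB page
    bool ok = pread(fd, &entry, sizeof(entry), offset) == sizeof(entry);
    close(fd);

    if (!ok || !(entry & (1ull << 63)))                  // bit 63: page present
        return false;

    uint64_t pfn = entry & ((1ull << 55) - 1);           // bits 0-54: page frame number
    uint64_t paddr = pfn * pageSize + (vaddr % pageSize);
    return (paddr & 0x1fffff) == (vaddr & 0x1fffff);     // compare the 21 low bits
}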

C++ TCP Socket communication - Connection is working as expected, fails after a couple of seconds, no new data is received and read() and recv() block

I am using 64-bit Ubuntu 16.04 LTS. Like I said, I am attempting to make a TCP socket connection to another device. The program starts by reading data from the socket to initialize the last_recorded_data variable (as seen below, towards the bottom of myStartProcedure()), and I know that this is working exactly as expected. Then the rest of the program, which is driven by callbacks, starts. When I make UPDATE_BUFFER_MS something small like 8, it fails after a couple of seconds. That frequency is the one I actually need, but if I make it larger for testing purposes (something like 500), then it works a little bit longer, but also eventually fails in the same way.
The failure is as follows: The device I'm attempting to read from consistently sends data every 8 milliseconds, and within this packet of data, the first few bytes are reserved for telling the client how large the packet is, in bytes. During normal operation, the received number of bytes and the size as described by these first few bytes are equal. However, the packet received directly before the read() call starts to block is always 24 bytes less than the expected size, but the packet says the data packet sent should still be the expected size. When the next attempt to get the data is made, the read() call blocks and upon timeout sets errno to be EAGAIN (Resource temporarily unavailable).
I tried communicating with this same device with a Python application and it is not experiencing the same issue. Furthermore, I tried this C++ application on another one of these devices and I'm seeing the same behavior, so I think it's a problem on my end. My code (simplified) is below. Please let me know if you see any obvious errors, thank you!!
#include <string>
#include <string.h>   // memset
#include <unistd.h>
#include <iostream>
#include <stdio.h>
#include <errno.h>
#include <sys/socket.h>
#include <stdlib.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#define COMM_DOMAIN AF_INET
#define PORT 8008
#define TIMEOUT_SECS 3
#define TIMEOUT_USECS 0
#define UPDATE_BUFFER_MS 8
#define PACKET_SIZE_BYTES_MAX 1200
//
// Global variables
//
// Socket file descriptor
int socket_conn;
// Tracks the timestamp of the last time data was recorded
// The data packet from the TCP connection is sent every UPDATE_BUFFER_MS milliseconds
unsigned long last_process_cycle_timestamp;
// The most recently heard data, cast to a double
double last_recorded_data;
// The number of bytes expected from a full packet
int full_packet_size;
// The minimum number of bytes needed from the packet, as I don't need all of the data
int min_required_packet_size;
// Helper to cast the packet data to a double
union PacketAsFloat
{
unsigned char byte_values[8];
double decimal_value;
};
// Simple struct to package the data read from the socket
struct SimpleDataStruct
{
// Whether or not the struct was properly populated
bool valid;
// Some data that we're interested in right now
double important_data;
//
// Other, irrelevant members removed for simplicity
//
};
// Procedure to read the next data packet
SimpleDataStruct readCurrentData()
{
SimpleDataStruct data;
data.valid = false;
unsigned char socket_data_buffer[PACKET_SIZE_BYTES_MAX] = {0};
int read_status = read(socket_conn, socket_data_buffer, PACKET_SIZE_BYTES_MAX);
if (read_status < min_required_packet_size)
{
return data;
}
//for (int i = 0; i < read_status - 1; i++)
//{
// std::cout << static_cast<int>(socket_data_buffer[i]) << ", ";
//}
//std::cout << static_cast<int>(socket_data_buffer[read_status - 1]) << std::endl;
PacketAsFloat packet_union;
for (int j = 0; j < 8; j++)
{
packet_union.byte_values[7 - j] = socket_data_buffer[j + 252];
}
data.important_data = packet_union.decimal_value;
data.valid = true;
return data;
}
// This acts as the main entry point
void myStartProcedure(std::string host)
{
//
// Code to determine the value for full_packet_size and min_required_packet_size (because it can vary) was removed
// Simplified version is below
//
full_packet_size = some_known_value;
min_required_packet_size = some_other_known_value;
//
// Create socket connection
//
if ((socket_conn = socket(COMM_DOMAIN, SOCK_STREAM, 0)) < 0)
{
std::cout << "socket_conn heard a bad value..." << std::endl;
return;
}
struct sockaddr_in socket_server_address;
memset(&socket_server_address, '0', sizeof(socket_server_address));
socket_server_address.sin_family = COMM_DOMAIN;
socket_server_address.sin_port = htons(PORT);
// Create and set timeout
struct timeval timeout_chars;
timeout_chars.tv_sec = TIMEOUT_SECS;
timeout_chars.tv_usec = TIMEOUT_USECS;
setsockopt(socket_conn, SOL_SOCKET, SO_RCVTIMEO, (const char*)&timeout_chars, sizeof(timeout_chars));
if (inet_pton(COMM_DOMAIN, host.c_str(), &socket_server_address.sin_addr) <= 0)
{
std::cout << "Invalid address heard..." << std::endl;
return;
}
if (connect(socket_conn, (struct sockaddr *)&socket_server_address, sizeof(socket_server_address)) < 0)
{
std::cout << "Failed to make connection to " << host << ":" << PORT << std::endl;
return;
}
else
{
std::cout << "Successfully brought up socket connection..." << std::endl;
}
// Sleep for half a second to let the networking setup properly
sleepMilli(500); // A sleep function I defined elsewhere
SimpleDataStruct initial = readCurrentData();
if (initial.valid)
{
last_recorded_data = initial.important_data;
}
else
{
// Error handling
return; // myStartProcedure() returns void, so no value can be returned here
}
//
// Start the rest of the program, which is driven by callbacks
//
}
void updateRequestCallback()
{
unsigned long now_ns = currentTime(); // A function I defined elsewhere that gets the current system time in nanoseconds
if (now_ns - last_process_cycle_timestamp >= 1000000 * UPDATE_BUFFER_MS)
{
SimpleDataStruct current_data = readCurrentData();
if (current_data.valid)
{
last_recorded_data = current_data.important_data;
last_process_cycle_timestamp = now_ns;
}
else
{
// Error handling
std::cout << "ERROR setting updated data, SimpleDataStruct was invalid." << std:endl;
return;
}
}
}
EDIT #1
I should be receiving a certain number of bytes every time, and I would expect the return value of read() to be that value as well. However, I just tried changing the value of PACKET_SIZE_BYTES_MAX to 2048, and the return value of read() is now 2048, when it should be the size of the packet that the device is sending back (NOT 2048). The Python application also sets the max to 2048 and its returned packet size is the correct/expected size...
Try commenting out the timeout setup. I never use that on my end and I don't experience the problem you're talking about.
// Create and set timeout
struct timeval timeout_chars;
timeout_chars.tv_sec = TIMEOUT_SECS;
timeout_chars.tv_usec = TIMEOUT_USECS;
setsockopt(socket_conn, SOL_SOCKET, SO_RCVTIMEO, (const char*)&timeout_chars, sizeof(timeout_chars));
To avoid blocking, you can setup the socket as a non-block socket and then use a select() or poll() to get more data. Both of these functions can use the timeout as presented above. However, with a non-blocking socket you must make sure that the read works as expected. In many cases you will get a partial read and have to wait (select() or poll()) again for more data. So the code would be a bit more complicated.
socket_conn = socket(COMM_DOMAIN, SOCK_STREAM | SOCK_NONBLOCK, 0);
If security is a potential issue, I would also set SOCK_CLOEXEC to prevent a child process from accessing the same socket.
std::vector<struct pollfd> fds;
struct pollfd fd;
fd.fd = socket_conn;
fd.events = POLLIN | POLLPRI | POLLRDHUP; // also POLLOUT for writing
fd.revents = 0; // probably useless... (kernel should clear those)
fds.push_back(fd);
int64_t timeout_chars = TIMEOUT_SECS * 1000 + TIMEOUT_USECS / 1000;
int const r = poll(&fds[0], fds.size(), timeout_chars);
if(r < 0) { ...handle error(s)... }
Another method, assuming the header size is well defined and never changes, is to read the header first, then use the header information to read the rest of the data. In that case you can keep the blocking socket without any timeout. From your structures I have no idea what that could be. So... let's first define such a structure:
struct header
{
char sync[4]; // four bytes indicated a synchronization point
uint32_t size; // size of packet
... // some other info
};
I put a "sync" field. In TCP it is often that people will add such a field so if you lose track of your position you can seek to the next sync by reading one byte at a time. Frankly, with TCP, you should never get a transmission error like that. You may lose the connection, but never lose data from the stream (i.e. TCP is like a perfect FIFO over your network.) That being said, if you are working on a mission critical software, a sync and also a checksum would be very welcome.
Next we read() just the header. Now we know the exact size of this packet, so we can use that specific size and read exactly that many bytes into our packet buffer:
struct header hdr;
read(socket_conn, &hdr, sizeof(hdr));
read(socket_conn, packet, hdr.size /* - sizeof(hdr) */);
Obviously, read() may return an error and the size in the header may be defined in big endian (so you need to swap the bytes on x86 processors). But that should get you going.
Also, if the size found in the header includes the number of bytes in the header, make sure to subtract that amount when reading the rest of the packet.
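Since read() on a stream socket can also return fewer bytes than requested even in the success case, a hedged helper along these lines can be used for both of the reads above; readExactly is an illustrative name, not part of the original answer.
#include <unistd.h>
#include <cerrno>

// Keep calling read() until exactly `len` bytes have been received, an error occurs,
// or the peer closes the connection. Returns the number of bytes read, or -1 on error.
ssize_t readExactly(int fd, void* buffer, size_t len)
{
    char* p = static_cast<char*>(buffer);
    size_t total = 0;
    while (total < len)
    {
        ssize_t n = read(fd, p + total, len - total);
        if (n < 0)
        {
            if (errno == EINTR)
                continue;   // interrupted by a signal, retry
            return -1;      // real error (including EAGAIN when SO_RCVTIMEO expires)
        }
        if (n == 0)
            break;          // peer closed the connection
        total += n;
    }
    return static_cast<ssize_t>(total);
}
// Usage following the header-then-payload pattern above:
//   readExactly(socket_conn, &hdr, sizeof(hdr));
//   readExactly(socket_conn, packet, hdr.size - sizeof(hdr));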
Also, the following is wrong:
memset(&socket_server_address, '0', sizeof(socket_server_address));
You meant to clear the structure with zeroes, not character zero. Although if it connects that means it probably doesn't matter much. Just use 0 instead of '0'.

Brain Computer Interface P300 Machine Learning

I am currently working on a P300 detection system (basically, there is a detectable increase in a brain wave when a user sees something they are interested in) in C++ using the Emotiv EPOC. The system works, but to improve accuracy I'm attempting to use Wekinator for machine learning, using a support vector machine (SVM).
So for my P300 system I have three stimuli (left, right and forward arrows). My program keeps track of the stimulus index, performs some filtering on the incoming "brain wave", and then calculates which index has the highest average area under the curve to determine which stimulus the user is looking at.
For my integration with Wekinator: I have set up Wekinator to receive a custom OSC message with 64 features (the length of the brain wave related to the P300) and set up three parameters with discrete values of 1 or 0. For training I have been sending the "brain wave" for each stimulus index in a trial and setting the relevant parameters to 0 or 1, then training it and running it. The issue is that when the OSC message from Wekinator is received by the program, it returns 4 messages rather than just the single most likely one.
Here is the code for the training (and input to Wekinator during run time):
for(int s=0; s < stimCount; s++){
for(int i=0; i < stimIndexes[s].size(); i++) {
int eegIdx = stimIndexes[s][i];
ofxOscMessage wek;
wek.setAddress("/oscCustomFeatures");
if (eegIdx + winStart + winLen < sig.size()) {
int winIdx = 0;
for(int e=eegIdx + winStart; e < eegIdx + winStart + winLen; e++) {
wek.addFloatArg(sig[e]);
//stimAvgWins[s][winIdx++] += sig[e];
}
validWindowCount[s]++;
}
std::cout << "Num args: " << wek.getNumArgs() << std::endl;
wekinator.sendMessage(wek);
}
}
Here is the receipt of messages from Wekinator:
if(receiver.hasWaitingMessages()){
ofxOscMessage msg;
while(receiver.getNextMessage(&msg)) {
std::cout << "Wek Args: " << msg.getNumArgs() << std::endl;
if (msg.getAddress() == "/OSCSynth/params"){
resultReceived = true;
if(msg.getArgAsFloat(0) == 1){
result = 0;
} else if(msg.getArgAsFloat(1) == 1){
result = 1;
} else if(msg.getArgAsFloat(2) == 1){
result = 2;
}
std::cout << "Wek Result: " << result << std::endl;
}
}
}
Full code for both is at the following Gist:
https://gist.github.com/cilliand/f716c92933a28b0bcfa4
My main query is basically whether something is wrong with the code: should I send the full "brain wave" for a trial to Wekinator, or should I train Wekinator on different features? Does the code look right or should it be amended? Is there a way to receive only one OSC message back from Wekinator based on smaller feature sizes, i.e. 64 rather than 4 x 64 per stimulus or 9 x 64 per stimulus index?

UDP datagram counting

Hi, I have a server communicating with clients over UDP. Basically, clients stream UDP packets to the server. Each packet consists of a header and a payload. In the header there is only one short int, which I call seqnum, running from 0 to SHORT_MAX. When a client reaches SHORT_MAX while sending, it starts again from 0.
On the server I need to reconstruct the stream in this way:
a) If a packet arrives with the expected seqnum, append it to the stream.
b) If a packet arrives with a lower seqnum than expected, drop it; it is a packet that arrived too late.
c) If a packet arrives with a higher seqnum than expected, consider the packets between the expected and actual seqnum as lost, try to reconstruct them, and after that append the actual packet.
I am now dealing with two problems connected to the overflow of the counter:
1) how to detect situation c) at the SHORT_MAX boundary (e.g. expected is SHORT_MAX-2, the actual seqnum in the packet is 2); in my scenario it would be wrongly detected as situation b)
2) the same problem with situation b) wrongly detected as c)
Thanks a lot
Assuming that SHORT_MAX actually means SHRT_MAX, you have some 30000 or so missing packets if you hit scenario 2, which probably means you can't reconstruct anyway and the link has been going wrong for quite some time. You may solve that by having a "timeout" (e.g. if no correct packet has been received in X seconds, give up and start over at some suitable point, or whatever you can do if LOTS of packets have gone missing; you can of course test this by unplugging the cable or something like that).
And you can detect "wraparound" by doing some modulo math.
#include <iostream>
#include <algorithm>
#include <cstdlib>   // rand()
using namespace std;
#define MAX_SEQ_NO 16
#define THRESHOLD 6 // Max number of "missing packets" that is acceptable
void check_seq_no(int seq_no)
{
static int expected = 0;
cout << "Got seq_no=" << seq_no << " expected=" << expected << endl;
if (seq_no == expected)
{
expected = (expected + 1) % MAX_SEQ_NO;
}
else
{
if ((seq_no + THRESHOLD) % MAX_SEQ_NO > (expected + THRESHOLD) % MAX_SEQ_NO)
{
int missing;
if (seq_no > expected)
{
missing = seq_no - expected;
}
else
{
missing = MAX_SEQ_NO + seq_no - expected;
}
cout << "Packets missing ..." << missing << endl;
expected = (seq_no+1) % MAX_SEQ_NO;
}
else
{
cout << "Old packet received ... " << endl;
}
}
}
int main()
{
int seq_no = 0;
bool in_sim = false;
int old_seq_no = 0;
for(;;)
{
int r = rand() % 50;
if (!in_sim)
{
old_seq_no = seq_no;
switch(r)
{
// Low number: Resend an older packet
case 4:
seq_no --;
case 3:
seq_no --;
case 2:
seq_no --;
case 1:
seq_no --;
in_sim = true;
break;
// High number: "lose" a packet or four.
case 46:
seq_no++;
case 47:
seq_no++;
case 48:
seq_no++;
case 49:
seq_no++;
in_sim = true;
break;
default:
break;
}
if (old_seq_no > seq_no)
{
cout << "Simulating resend of old packets: " << old_seq_no - seq_no << endl;
}
else if (old_seq_no < seq_no)
{
cout << "Simulating missing packets: " << seq_no - old_seq_no << endl;
old_seq_no = seq_no;
}
}
if (old_seq_no == seq_no)
{
in_sim = false;
}
check_seq_no(seq_no % MAX_SEQ_NO);
seq_no++;
}
}
I would suggest that any time the expected seqnum is within some threshold of SHORT_MAX and the received packet is within a similar threshold of 0, you consider it a candidate for reconstruction. You can also compensate somewhat for the loss of packets by building a sane acknowledgment system between the client and server. Think of this less as a "reconstruction" problem than as a prioritization problem, wherein you must discard "old" or possibly re-transmitted data.
Another strategy I've used in the past is to define channels with (potentially) different configurations in terms of thresholds, ACKing, and overall reliability. In the ideal world you'll have packets that are 100% reliable (TCP-style, guaranteed in-order delivery) and packets that are 100% unreliable, and then possibly streams that are somewhere in between -- a good UDP-based protocol will support these. This gives your application code more direct control over the protocol's algorithms, which is the real point of UDP and where it shines for applications like games and video.
You will likely find that often you are re-implementing bits of TCP, and you may even consider using TCP for your 100% reliable-channel -- it's worth noting that TCP is often given preference by backbones, because they know they will eventually have to re-transmit those packets, if they don't make it through on this trip.
Why not use an unsigned short? You get twice the range. For the situations you describe, you need to specify a threshold. It's a dilemma you need to face; I mean, if you're expecting packet 30 and receive packet 60, is it a correct packet or an old, late one? That's why you need a threshold.
For example:
const unsigned short WRAP_THRESHOLD = 10; // decide how large the threshold should be: 10, 20, 50 ...
if (NumberReceived < NumberExpected)
{
    unsigned short gap = (USHRT_MAX - NumberExpected) + NumberReceived;
    if (gap < WRAP_THRESHOLD)
        ; // correct packet, the counter has started over (wraparound)
    else
        ; // old, late packet
}
else if (NumberReceived > NumberExpected)
{
    unsigned short gap = NumberReceived - NumberExpected;
    if (gap < WRAP_THRESHOLD)
        ; // correct packet, and some packets in between were lost
    else
        ; // old, late packet
}
else
{
    // correct packet, exactly the expected one
}
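A different way to handle the wraparound (an alternative sketch, not from the answers above) is to let modular arithmetic do the work: with an unsigned 16-bit counter, casting the difference to a signed 16-bit value classifies packets within a fixed window of 32767 in either direction.
#include <cstdint>

// Returns > 0 if `received` is ahead of `expected` (some packets were lost in between),
// 0 if it is exactly the expected one, and < 0 if it is an old / late packet.
// The subtraction wraps modulo 65536, so e.g. seqDiff(2, 65534) == 4.
int16_t seqDiff(uint16_t received, uint16_t expected)
{
    return static_cast<int16_t>(received - expected);
}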

Logging Guard to limit semi-constant log messages

I'm using Boost.Log in my application for logging.
However, in some sections of my code I have log statements that could occur very often if something goes wrong. I'd like some kind of guard that can limit log messages when it detects that the same log message appears constantly.
e.g. (This is a simplified example, not actual implementation)
while(!framebuffer.try_pop(frame))
{
BOOST_LOG(trace) << "Buffer underrun.";
}
If for some reason framebuffer doesn't receive any frames for a long time, the logging will emit way too many log messages.
However, I'm unsure what strategy to use for limiting log messages without losing any important ones, and how to implement it.
How about something simple, you could encapsulate it if you wanted to:
int tooMany = 10;
int count = 0;
while(!framebuffer.try_pop(frame))
{
if(count < tooMany) {
BOOST_LOG(trace) << "Buffer underrun.";
}
count++;
}
if(count >= tooMany) {
BOOST_LOG(trace) << "Message repeated: " << count << " times.";
}
Just be careful of integer overflows on the 'count' variable if you get an absolute bucketload of increments.
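One possible way to encapsulate that counting, as the answer suggests (a hedged sketch; LogLimiter and its members are illustrative names, and the 64-bit counter also sidesteps the overflow caveat above):
#include <cstdint>

// Counts attempts and only allows the first `limit` of them to be logged;
// the caller can report the suppressed count afterwards.
class LogLimiter
{
public:
    explicit LogLimiter(std::uint64_t limit) : limit_(limit) {}

    // Returns true while the caller is still allowed to log this message.
    bool shouldLog() { return count_++ < limit_; }

    // Number of messages that were suppressed beyond the limit.
    std::uint64_t suppressed() const { return count_ > limit_ ? count_ - limit_ : 0; }

private:
    std::uint64_t limit_;
    std::uint64_t count_ = 0;
};

// Usage with the loop above:
//   LogLimiter underruns(10);
//   while (!framebuffer.try_pop(frame))
//   {
//       if (underruns.shouldLog())
//           BOOST_LOG(trace) << "Buffer underrun.";
//   }
//   if (underruns.suppressed() > 0)
//       BOOST_LOG(trace) << "Message repeated: " << underruns.suppressed() << " more times.";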