String is not null terminated error - c++

I'm having a string is not null terminated error, though I'm not entirely sure why. The usage of std::string in the second part of the code is one of my attempt to fix this problem, although it still doesn't work.
My initial codes was just using the buffer and copy everything into client_id[]. The error than occurred. If the error is correct, that means I've got either client_ id OR theBuffer does not have a null terminator. I'm pretty sure client_id is fine, since I can see it in debug mode. Strange thing is buffer also has a null terminator. No idea what is wrong.
char * next_token1 = NULL;
char * theWholeMessage = &(inStream[3]);
theTarget = strtok_s(theWholeMessage, " ",&next_token1);
sendTalkPackets(next_token1, sizeof(next_token1) + 1, id_clientUse, (unsigned int)std::stoi(theTarget));
Inside sendTalkPackets is. I'm getting a string is not null terminated at the last line.
void ServerGame::sendTalkPackets(char * buffer, unsigned int buffersize, unsigned int theSender, unsigned int theReceiver)
{
std::string theMessage(buffer);
theMessage += "0";
const unsigned int packet_size = sizeof(Packet);
char packet_data[packet_size];
Packet packet;
packet.packet_type = TALK;
char client_id[MAX_MESSAGE_SIZE];
char theBuffer[MAX_MESSAGE_SIZE];
strcpy_s(theBuffer, theMessage.c_str());
//Quick hot fix for error "string not null terminated"
const char * test = theMessage.c_str();
sprintf_s(client_id, "User %s whispered: ", Usernames.find(theSender)->second.c_str());
printf("This is it %s ", buffer);
strcat_s(client_id, buffersize , theBuffer);

Methinks that problem lies in this line:
sendTalkPackets(next_token1, sizeof(next_token1) + 1, id_clientUse, (unsigned int)std::stoi(theTarget));
sizeof(next_token1)+1 will always gives 5 (on 32 bit platform) because it return size of pointer not size of char array.

One thing which could be causing this (or other problems): As
buffersize, you pass sizeof(next_token1) + 1. next_token1 is
a pointer, which will have a constant size of (typically) 4 or 8. You
almost certainly want strlen(next_token1) + 1. (Or maybe without the
+ 1; conventions for passing sizes like this generally only include
the '\0' if it is an output buffer. There are a couple of other
places where you're using sizeof, which may have similar problems.
But it would probably be better to redo the whole logic to use
std::string everywhere, rather than all of these C routines. No
worries about buffer sizes and '\0' terminators. (For protocol
buffers, I've also found std::vector<char> or std::vector<unsigned char>
quite useful. This was before the memory in std::string was
guaranteed to be contiguous, but even today, it seems to correspond more
closely to the abstraction I'm dealing with.)

You can't just do
std::string theMessage(buffer);
theMessage += "0";
This fails on two fronts:
The std::string constructor doesn't know where buffer ends, if buffer is not 0-terminated. So theMessage will potentially be garbage and include random stuff until some zero byte was found in the memory beyond the buffer.
Appending string "0" to theMessage doesn't help. What you want is to put a zero byte somewhere, not value 0x30 (which is the ascii code for displaying a zero).
The right way to approach this, is to poke a literal zero byte buffersize slots beyond the start of the buffer. You can't do that in buffer itself, because buffer may not be large enough to accomodate that extra zero byte. A possibility is:
char *newbuffer = malloc(buffersize + 1);
strncpy(newbuffer, buffer, buffersize);
newbuffer[buffersize] = 0; // literal zero value
Or you can construct a std::string, whichever you prefer.

Related

How to parse captured packet from socket in cpp?

I'm using RAW socket to capture udp packets. After capturing I want to parse the packet and see what's inside.
The input I get from the socket is an unsigned char* buffer and it's length. I tried to put the buffer into a string but I guess I did it wrong because when I checked the string it was empty.
Any advice?
I don't know what you want to parse, but your have the buffer and it's length. So you can do everything you want with this memory. Look for pointer arithmetic. If you want to make an C-String out of the content, simply add an '\0' to the end of the memory block. But this assumes, that no other 0x00 are inside the buffer. So maybe you have to check that. Like πάντα ῥεῖ said.
Steps:
1: receive UDP package
2: cast like:
unsigned char* buffer;
char* cString = (char*) buffer;
3: check casted cString if an '\0' occurred before buffer size was reached. If it does, then create a new char* pointer to the byte after the '\0', but be aware of the buffer size. Save the pointer in an vector.
I made an code example, but haven't checked if it is runnable!
char* firstPtr = (char*) buffer;
size_t indexer = 0;
std::vector<char*> pointerVec;
pointerVec.push_back(firstPtr);
while(indexer < bufferSize) {
if(*(buffer + indexer) == '\0') {
if(indexer + 1 < bufferSize) {
char* cString = (char*) (buffer + indexer);
pointerVec.push_back(cString);
}
}
} // end while
After that you should have the positions of the different strings saved with the pointers inside of the vector. Now you can handle them to an copy mechanism which takes every C-String pointer and saves it's content to one C-String or String.
Hope you searched for something like that, because you question was unclear.

How is memset working in this snippet of code?

I think this snippet of code is enough to get the idea of what I'm doing.
I'm using getline to read input data from a text file that has lines that might look something like this: The cat is fat/And likes to sing
From searching around the internet I was able to get it working, but I'd like to better understand WHY it is working. My primary question is how the
memcpy(id, buffer, temp - buffer);
line is working. I read what memcpy() does but do not understand how the temp - buffer part is working.
So from my understanding I'm setting *temp to the '/' in that line. Then I'm copying the line up until the '/' into it. But how does the temp, which is at '/' minus the buffer (which is the whole line from getline) work out to just be The cat is fat?
Hopefully that made some sense.
#define MAX_SIZE 255
char buffer[MAX_SIZE + 1] = { 0 };
cin.getline(buffer, MAX_SIZE);
memset(id, 0, 256);
memset(title, 0, 256);
char* temp = strchr(buffer, '/');
memcpy(id, buffer, temp - buffer);
temp++;
strcpy(title, temp);
Also, if I can double dip, why would MAX_SIZE be defined at 255 but MAX_SIZE+1 is often used. Does this have to do with a delimiter or white space at the end of a line?
Thanks for the help.
In my opinion it is simply a bad code.:)
I would write it like
const size_t MAX_SIZE = 256
char buffer[MAX_SIZE] = {};
std::cin.getline( buffer, MAX_SIZE );
id[0] = '\0';
title[0] = '\0';
if ( char* temp = strchr( buffer, '/' ) )
{
std::memcpy( id, buffer, temp - buffer );
id[temp - buffer] = '\0';
std::strcpy( title, temp + 1 );
}
else
{
std::strcpy( id, buffer );
}
As for memcpy in this statement
memcpy(id, buffer, temp - buffer);
then it copies temp - buffer bytes from buffer to id. As id was previously set to zeroes then after memcpy it will contain a string with terminating zero.
You're question concerns pointer-difference calculation, part of the family of arithmetic operations that are done in pointer-arithmetic.
Most beginners don't have too much trouble grasping how pointer-addition works. Given this:
char buffer[256];
char *p = buffer + 10;
it is usually clear that p points to the 10th slot in the buffer char array. But you need to remember that the pointer type is important. The same construct you see above also works for more complicated data types:
struct Something
{
char name[128];
int ident;
int supervisor;
} people[64];
struct Something *p = people+10; // NOTE: same line, different types
Just as before, p points to the tenth element in the array, but note the arithmetic; the size of the underlying type is used to calculate the relevant memory offset. You don't need to do it yourself. No sizeof required here.
So why do you care? Because just like regular math, pointer math has certain properties, one of them being the following:
char buffer[256];
char *p = buffer+10; // p addresses the 10th slot in the array
size_t len = p-buffer // len is the typed-difference between p and buffer.
In this case, len will be 10, the same as the offset of p. So how does this relate to your question? Well...
char* temp = strchr(buffer, '/');
memcpy(id, buffer, temp - buffer);
The horrid nature of this code aside (if there is no '/' in the buffer array the result is temp being NULL, and the ensuing memcpy will all-but-guarantee a massive segfault). This code finds the location in the string where '/' resides. Once it has that, the calculation temp - buffer uses pointer arithmetic (specifically pointer differencing) to calculate the distance between the address in temp and the address as the base of the array. The result is the element count not including the slash itself. Therefore this code copies up-to, but not including, the discovered slash, into the id buffer. The rest of the id buffer retains all the 0 values populated with the memset and therefore the string is terminated (which is way more work than you need to do, btw).
After that line, the remainder:
temp++;
strcpy(title, temp);
post-increments the temp pointer, which says "move to the next element in the array". Then the strcpy copies the remaining chars of the null-terminated buffer string into title. Worth noting this could have simply been:
strcpy(title, ++temp);
And likewise:
strcpy(title, temp+1);
which retains temp at the '/' position. In all of the above, the result in title will be the same: all chars after the slash, but not including it.
I hope that explains what is going on. Best of luck.
MAX_SIZE+1 is reserving space for the null terminator at the end of the string ('\0')
memcpy(id, buffer, temp - buffer)
This is copying (temp-buffer) bytes from buffer to id. Since strchr finds the '/' character in the input, temp is pointing inside buffer (assumiing it's found). So for example assume buffer points to a location in memory:
buffer = 0x781230001
and the third byte is the '/', after strchr, you have
temp = 0x781230003
temp - buffer therefore is 2.
HOWEVER: If the '/' is not found, then temp will not work and the code will crash. You should check the result of strchr before doing the pointer arithmetic.
There you calculate position of first / in buffer.
char* temp = strchr(buffer, '/');
Now temp points to / in buffer. If you want to copy this part of buffer, its enough to get pointer to start and length of string. So temp - buffer evaluates to length.
=================================
The cat is fat/And likes to sing
=================================
^ ^
buffer temp
| length | = temp - buffer
End of null terminated string determinated by \0 (or simply 0). So if you need to store N chars you need to allocate N+1 buffer size.

C++ TCP socket garbage value

I have my socket comms pretty much working. The only thing that I'm not sure about is why I'm getting some garbage values at the end of my message. The first message I send contains some extra characters at the end, and every message after that is as expected...does anyone have any insight as to why this is happening?
Send:
CString string = "TEST STRING TO SEND";
char* szDest;
szDest = new char[string.GetLength()];
strcpy(szDest,string);
m_pClientSocket->Send(szDest,strlen(pMsg));
Receive: (this is using Qt)
char* temp;
int size = tcpSocket->bytesAvailable();
temp = new char[size];
tcpSocket->read(temp,size);
You will be missing the \0 in your temp after read, since it's not really transmitted (and probably shouldn't be)
You likely need to change the receive a little bit:
temp = new char[size + 1];
int realSize = tcpSocket->read(temp, size);
temp[realSize] = 0;
Btw, you would be better off with QTcpSocket::readAll() in this little snipped.
I don't know this CString class, but I see two bugs here:
Does GetLength() include the terminating NUL? If not, your char buffer is one byte smaller than it needs to be, and the strcpy is clobbering memory after the end of the buffer.
strlen(pMsg) is the length of something other than szDest. This is probably the immediate cause of your problem.
The char buffer appears to be unnecessary: why don't you just do
CString string = "TEST STRING TO SEND";
m_pClientSocket->Send(string, string.GetLength());
?

Using istringstream to process a memory block of variable length

I'm trying to use istringstream to recreate an encoded wstring from some memory. The memory is laid out as follows:
1 byte to indicate the start of the wstring encoding. Arbitrarily this is '!'.
n bytes to store the character length of the string in text format, e.g. 0x31, 0x32, 0x33 would be "123", i.e. a 123-character string
1 byte separator (the space character)
n bytes which are the wchars which make up the string, where wchar_t's are 2-bytes each.
For example, the byte sequence:
21 36 20 66 00 6f 00 6f 00
is "!6 f.o.o." (using dots to represent char 0)
All I've got is a char* pointer (let's call it pData) to the start of the memory block with this encoded data in it. What's the 'best' way to consume the data to reconstruct the wstring ("foo"), and also move the pointer to the next byte past the end of the encoded data?
I was toying with using an istringstream to allow me to consume the prefix byte, the length of the string, and the separator. After that I can calculate how many bytes to read and use the stream's read() function to insert into a suitably-resized wstring. The problem is, how do I get this memory into the istringstream in the first place? I could try constructing a string first and then pass that into the istringstream, e.g.
std::string s((const char*)pData);
but that doesn't work because the string is truncated at the first null byte. Or, I could use the string's other constructor to explicitly state how many bytes to use:
std::string s((const char*)pData, len);
which works, but only if I know what len is beforehand. That's tricky given that the data is variable length.
This seems like a really solvable problem. Does my rookie status with strings and streams mean I'm overlooking an easy solution? Or am I barking up the wrong tree with the whole string approach?
Try setting your stringstream's rdbuf:
char* buffer = something;
std::stringbuf *pbuf;
std::stringstream ss;
std::pbuf=ss.rdbuf();
std::pbuf->sputn(buffer, bufferlength);
// use your ss
Edit: I see that this solution will have a similar problem to your string(char*, len) situation. Can you tell us more about your buffer object? If you don't know the length, and it isn't null terminated, it's going to be very hard to deal with.
Is it possible to modify how you encode the length, and make that a fixed size?
unsigned long size = 6; // known string length
char* buffer = new char[1 + sizeof(unsigned long) + 1 + size];
buffer[0] = '!';
memcpy(buffer+1, &size, sizeof(unsigned long));
buffer should hold the start indicator (1 byte), the actual size (size of unsigned long), the delimiter (1 byte) and the text itself (size).
This way, you could get the size "pretty" easy, then set the pointer to point beyond the overhead, and then use the len variable in the string constructor.
unsigned long len;
memcpy(&len, pData+1, sizeof(unsigned long)); // +1 to avoid the start indicator
// len now contains 6
char* actualData = pData + 1 + sizeof(unsigned long) + 1;
std::string s(actualData, len);
It's low level and error prone :) (for instance if you read anything that isn't encoded the way that you expect it to be, the len can get pretty big), but you avoid dynamically reading the length of the string.
It seems like something on this order should work:
std::wstring make_string(char const *input) {
if (*input != '!')
return "";
char length = *++input;
return std::wstring(++input, length);
}
The difficult part is dealing with the variable length of the size. Without something to specify the length it's hard to guess when to stop treating the data as specifying the length of the string.
As for moving the pointer, if you're going to do it inside a function, you'll need to pass a reference to the pointer, but otherwise it's a simple matter of adding the size you found to the pointer you received.
It's tempting to (ab)use the (deprecated but nevertheless standard) std::istrstream here:
// Maximum size to read is
// 1 for the exclamation mark
// Digits for the character count (digits10() + 1)
// 1 for the space
const std::streamsize max_size = 3 + std::numeric_limits<std::size_t>::digits10;
std::istrstream s(buf, max_size);
if (std::istream::traits_type::to_char_type(s.get()) != '!'){
throw "missing exclamation";
}
std::size_t size;
s >> size;
if (std::istream::traits_type::to_char_type(s.get()) != ' '){
throw "missing space";
}
std::wstring(reinterpret_cast<wchar_t*>(s.rdbuf()->str()), size/sizeof(wchar_t));

Very strange char array behaviour

.
unsigned int fname_length = 0;
//fname length equals 30
file.read((char*)&fname_length,sizeof(unsigned int));
//fname contains random data as you would expect
char *fname = new char[fname_length];
//fname contains all the data 30 bytes long as you would expect, plus 18 bytes of random data on the end (intellisense display)
file.read((char*)fname,fname_length);
//m_material_file (std:string) contains all 48 characters
m_material_file = fname;
// count = 48
int count = m_material_file.length();
now when trying this way, intellisense still shows the 18 bytes of data after setting the char array to all ' ' and I get exactly the same results. even without the file read
char name[30];
for(int i = 0; i < 30; ++i)
{
name[i] = ' ';
}
file.read((char*)fname,30);
m_material_file = name;
int count = m_material_file.length();
any idea whats going wrong here, its probably something completely obvious but im stumped!
thanks
Sounds like the string in the file isn't null-terminated, and intellisense is assuming that it is. Or perhaps when you wrote the length of the string (30) into the file, you didn't include the null character in that count. Try adding:
fname[fname_length] = '\0';
after the file.read(). Oh yeah, you'll need to allocate an extra character too:
char * fname = new char[fname_length + 1];
I guess that intellisense is trying to interpret char* as C string and is looking for a '\0' byte.
fname is a char* so both the debugger display and m_material_file = fname will be expecting it to be terminated with a '\0'. You're never explicitly doing that, but it just happens that whatever data follows that memory buffer has a zero byte at some point, so instead of crashing (which is a likely scenario at some point), you get a string that's longer than you expect.
Use
m_material_file.assign(fname, fname + fname_length);
which removes the need for the zero terminator. Also, prefer std::vector to raw arrays.
std::string::operator=(char const*) is expecting a sequence of bytes terminated by a '\0'. You can solve this with any of the following:
extend fname by a character and add the '\0' explicitly as others have suggested or
use m_material_file.assign(&fname[0], &fname[fname_length]); instead or
use repeated calls to file.get(ch) and m_material_file.push_back(ch)
Personally, I would use the last option since it eliminates the explicitly allocated buffer altogether. One fewer explicit new is one fewer chance of leaking memory. The following snippet should do the job:
std::string read_name(std::istream& is) {
unsigned int name_length;
std::string file_name;
if (is.read((char*)&name_length, sizeof(name_length))) {
for (unsigned int i=0; i<name_length; ++i) {
char ch;
if (is.get(ch)) {
file_name.push_back(ch);
} else {
break;
}
}
}
return file_name;
}
Note:
You probably don't want to use sizeof(unsigned int) to determine how many bytes to write to a binary file. The number of bytes read/written is dependent on the compiler and platform. If you have a maximum length, then use it to determine the specific byte size to write out. If the length is guaranteed to fewer than 255 bytes, then only write a single byte for the length. Then your code will not depend on the byte size of intrinsic types.