char Array to string conversion results in strange characters - sockets - c++

I currently have the following code
char my_stream[800];
std::string my_string;
iResult = recv(clntSocket,my_stream,sizeof(my_stream),0);
my_string = std::string(my_stream);
Now when I convert the char array to a string, I get weird characters in the string. Any suggestions on what I might be doing wrong?

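You're getting weird characters because your string's length is not equal to the number of bytes received.
You should initialize the string like so:
char* buffer = new char[512];
ssize_t bytesRead = recv(clntSocket,buffer,512,0);
std::string msgStr;
if (bytesRead > 0) // recv returns -1 on error and 0 once the peer has closed the connection
    msgStr.assign(buffer, bytesRead);
delete[] buffer; // memory from new[] must be released with delete[]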
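Another common solution is to zero every byte of the buffer before reading anything. Note that this only terminates the data if fewer bytes arrive than the buffer can hold.
char buffer[512] = { 0 }; // arrays can be zero-initialized, but not assigned to afterwards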
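The length-based construction can be tried out without a real socket. This is only a minimal sketch: fake_recv is a hypothetical stand-in for recv (it copies a fixed payload and returns the byte count), and received_to_string is a helper name made up for illustration. The point is that the string is built from the byte count, not from a terminator.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for recv(): copies up to `cap` bytes of `payload`
// into `buf` and returns the number of bytes written. Like recv(), it does
// not append a terminator.
long fake_recv(char* buf, std::size_t cap, const std::string& payload) {
    assert(buf != nullptr);
    std::size_t n = std::min(cap, payload.size());
    std::memcpy(buf, payload.data(), n);
    return static_cast<long>(n);
}

// Builds a string from exactly the bytes that were received.
std::string received_to_string(const char* buf, long bytesRead) {
    if (bytesRead <= 0) return std::string();  // error or closed connection
    return std::string(buf, static_cast<std::size_t>(bytesRead));
}
```

Even if the buffer is full of leftover bytes, received_to_string(buf, n) only sees the first n of them.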

If you're reading in a zero-terminated string from your socket, there's no need for a conversion, it's already a char string. If it's not zero-terminated already, you'll need some other kind of terminator because sockets are streams (assuming this is TCP). In other words, you don't need my_string = std::string(my_stream);
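Because TCP is a stream, one message can arrive split over several reads, or two messages can arrive in one read. A rough sketch of collecting chunks until a terminator shows up is below. The names append_chunk and next_message are made up for this example, and the zero byte as delimiter is an assumption about the protocol.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Appends one received chunk to the bytes collected so far.
void append_chunk(std::string& pending, const char* data, std::size_t n) {
    assert(data != nullptr || n == 0);
    pending.append(data, n);
}

// If `pending` holds a complete message (ended by `delim`), moves it into
// `out` without the delimiter, removes it from `pending`, and returns true.
bool next_message(std::string& pending, std::string& out, char delim = '\0') {
    std::size_t pos = pending.find(delim);
    if (pos == std::string::npos) return false;  // message not complete yet
    out.assign(pending, 0, pos);
    pending.erase(0, pos + 1);
    return true;
}
```

In a receive loop you would call append_chunk with the buffer and the count returned by recv, then call next_message until it returns false.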

Have you tried printing my_stream directly, without converting it to a string?
It may be a mismatch between the format of the data sent and the format you expect on receipt.
The data on the other side may be in another encoding, such as Unicode, while you are trying to print it as a single-byte array.
If only part of the string shows weird characters, then the null terminator at the end of my_stream is almost certainly missing; in that case, increase the size of my_stream so there is room for it and write the terminator after the received bytes.

Related

What's the need of NULL terminator at end of decrypted string?

I am working on a google chrome NaCl extension that involves encryption and decryption of data using openssl library functions. The encryption works perfectly and the decryption also works fine as of now but for that I had to put in a kind of hack but I am not sure if that's the correct way to handle it.
else if(action == "decryption")
{
pp::Var content = dict_message.Get("content");
//aes decryption starts here
pp::VarArrayBuffer buffer(content);
const char *password = "password";
unsigned char key[EVP_MAX_KEY_LENGTH], iv[EVP_MAX_IV_LENGTH];
int cipherlen = buffer.ByteLength();
int len = cipherlen + EVP_MAX_BLOCK_LENGTH;
unsigned char *plaintext = (unsigned char*)malloc(len*sizeof(unsigned char));
unsigned char* ciphertext = static_cast<unsigned char*>(buffer.Map());
aes_init(password, (int)strlen(password), key, iv);
len = decrypt(ciphertext, cipherlen, key, iv, plaintext);
buffer.Unmap();
plaintext[len]='\0'; //fix a bug of overshooting plaintext sent
//aes decryption ends here
pp::VarDictionary reply;
reply.Set("action","decryption");
reply.Set("plaintext", (char*)plaintext);
reply.Set("fileSize", len);
PostMessage(reply);
free(plaintext);
}
Now this code decrypts the data and sends it back to the JavaScript on the extension page. Notice the line plaintext[len]='\0';. If I don't put it in, then sometimes I get garbage after the correctly decrypted text in plaintext, and that shows up as a null in my JavaScript. So is this the correct way to handle the bug?
As others have mentioned, you're using a C-string, which must be NULL-terminated. When you call pp::VarDictionary::Set, the parameters are pp::Vars, and you're taking advantage of an implicit conversion constructor from C-string to pp::Var.
If instead you make plaintext a std::string or use pp::VarArrayBuffer, this won't be necessary. PostMessage() is optimized to deal with large VarArrayBuffers, so you should probably prefer that anyway. So I'd suggest replacing this line:
unsigned char *plaintext = (unsigned char*)malloc(len*sizeof(unsigned char));
with something like:
pp::VarArrayBuffer plaintext(len);
...and change your decrypt line to something like:
len = decrypt(ciphertext, cipherlen, key, iv,
static_cast<unsigned char*>(plaintext.Map()));
Note, this will change the type JavaScript receives from a string to an ArrayBuffer. If you want it to remain a string, you can use a std::string instead:
std::string plaintext(len, '\0');
and access the string's buffer using operator[]. So the call to decrypt looks like this (I'm assuming len is >0):
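len = decrypt(ciphertext, cipherlen, key, iv, reinterpret_cast<unsigned char*>(&plaintext[0])); // &plaintext[0] is a char*, so it needs a cast
plaintext.resize(len); // keep only the bytes that were actually decrypted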
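Here is a small sketch of the std::string variant that can be compiled on its own. fake_decrypt is a hypothetical stand-in for the question's decrypt (it just copies the input and returns the length), and the slack parameter plays the role of EVP_MAX_BLOCK_LENGTH. The part worth noticing is the resize at the end, which trims the string to the real output length so no terminator bookkeeping is needed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for the question's decrypt(): copies the input
// unchanged and returns the number of bytes written to `out`.
int fake_decrypt(const unsigned char* in, int inlen, unsigned char* out) {
    std::memcpy(out, in, static_cast<std::size_t>(inlen));
    return inlen;
}

// Uses a std::string as the output buffer, then trims it to the real length.
std::string decrypt_to_string(const unsigned char* in, int inlen, int slack) {
    assert(inlen > 0 && slack >= 0);
    std::string plaintext(static_cast<std::size_t>(inlen + slack), '\0');
    int len = fake_decrypt(in, inlen,
                           reinterpret_cast<unsigned char*>(&plaintext[0]));
    plaintext.resize(static_cast<std::size_t>(len));  // drop the unused tail
    return plaintext;
}
```

Without the resize, the string would still report the full allocated size and carry trailing zero bytes.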
The '\0' is the terminator for all strings in C. Since your output is in a string, if the '\0' character is missing, your program won't know where the string ends, and so may move on to areas beyond the string when using it, which corresponds to the garbage value near the end.
Whenever a string literal is declared, '\0' is put at the end. However, you are first allocating the memory for the decrypted text, and then writing into it. In this case you have to take care to add the string terminating character at the end.
The reason for the existence of the string terminating character is that strings are stored in the form of a character array in C, and the size of the array is not known by just the pointer to the beginning of the string.
If the \0 is not there at the end of a c-string, the code will not know when the string ends and will walk off into unknown parts of memory. The null terminator lets the program know "this is the end of my string". Without this, you are inviting undefined behavior into your program.

String is not null terminated error

I'm getting a "string is not null terminated" error, though I'm not entirely sure why. The use of std::string in the second part of the code is one of my attempts to fix this problem, although it still doesn't work.
My initial code just used the buffer and copied everything into client_id[]. The error then occurred. If the error is correct, that means either client_id or theBuffer does not have a null terminator. I'm pretty sure client_id is fine, since I can see it in debug mode. The strange thing is that the buffer also has a null terminator. I have no idea what is wrong.
char * next_token1 = NULL;
char * theWholeMessage = &(inStream[3]);
theTarget = strtok_s(theWholeMessage, " ",&next_token1);
sendTalkPackets(next_token1, sizeof(next_token1) + 1, id_clientUse, (unsigned int)std::stoi(theTarget));
The body of sendTalkPackets is below. I'm getting the "string is not null terminated" error at the last line.
void ServerGame::sendTalkPackets(char * buffer, unsigned int buffersize, unsigned int theSender, unsigned int theReceiver)
{
std::string theMessage(buffer);
theMessage += "0";
const unsigned int packet_size = sizeof(Packet);
char packet_data[packet_size];
Packet packet;
packet.packet_type = TALK;
char client_id[MAX_MESSAGE_SIZE];
char theBuffer[MAX_MESSAGE_SIZE];
strcpy_s(theBuffer, theMessage.c_str());
//Quick hot fix for error "string not null terminated"
const char * test = theMessage.c_str();
sprintf_s(client_id, "User %s whispered: ", Usernames.find(theSender)->second.c_str());
printf("This is it %s ", buffer);
strcat_s(client_id, buffersize , theBuffer);
I think the problem lies in this line:
sendTalkPackets(next_token1, sizeof(next_token1) + 1, id_clientUse, (unsigned int)std::stoi(theTarget));
sizeof(next_token1) + 1 will always give 5 (on a 32-bit platform) because sizeof returns the size of the pointer, not the size of the char array.
One thing which could be causing this (or other problems): As
buffersize, you pass sizeof(next_token1) + 1. next_token1 is
a pointer, which will have a constant size of (typically) 4 or 8. You
almost certainly want strlen(next_token1) + 1. (Or maybe without the
+ 1; conventions for passing sizes like this generally only include
the '\0' if it is an output buffer.) There are a couple of other
places where you're using sizeof, which may have similar problems.
But it would probably be better to redo the whole logic to use
std::string everywhere, rather than all of these C routines. No
worries about buffer sizes and '\0' terminators. (For protocol
buffers, I've also found std::vector<char> or std::vector<unsigned char>
quite useful. This was before the memory in std::string was
guaranteed to be contiguous, but even today, it seems to correspond more
closely to the abstraction I'm dealing with.)
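The difference between the two sizes is easy to see in isolation. This is just an illustrative sketch; pointer_size and payload_size are names invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Size of the pointer variable itself: fixed by the platform
// (typically 4 or 8), whatever the text it points to.
std::size_t pointer_size(const char* text) {
    return sizeof(text);
}

// Number of bytes the text occupies including its '\0' terminator.
std::size_t payload_size(const char* text) {
    assert(text != nullptr);
    return std::strlen(text) + 1;
}
```

pointer_size gives the same answer for a two-character string and a two-hundred-character one, which is why passing it as a buffer size goes wrong.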
You can't just do
std::string theMessage(buffer);
theMessage += "0";
This fails on two fronts:
The std::string constructor doesn't know where buffer ends, if buffer is not 0-terminated. So theMessage will potentially be garbage and include random stuff until some zero byte was found in the memory beyond the buffer.
Appending string "0" to theMessage doesn't help. What you want is to put a zero byte somewhere, not value 0x30 (which is the ascii code for displaying a zero).
The right way to approach this is to poke a literal zero byte buffersize slots beyond the start of the buffer. You can't do that in buffer itself, because buffer may not be large enough to accommodate that extra zero byte. A possibility is:
char *newbuffer = static_cast<char*>(malloc(buffersize + 1)); // malloc returns void*, which C++ will not convert implicitly
strncpy(newbuffer, buffer, buffersize);
newbuffer[buffersize] = 0; // literal zero value
Or you can construct a std::string, whichever you prefer.
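For the std::string route, a minimal sketch might look like this, assuming buffersize really is the number of valid bytes in buffer. copy_exact and is_null_char are names made up for the example; the second one is only there to show that the digit '0' and the null character are different values.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Copies exactly `n` bytes; the source does not need a terminator.
std::string copy_exact(const char* buffer, std::size_t n) {
    assert(buffer != nullptr || n == 0);
    return std::string(buffer, n);
}

// The digit character '0' (0x30 in ASCII) is not the null character '\0'.
bool is_null_char(char c) {
    return c == '\0';
}
```

The resulting std::string manages its own terminator, so c_str() is always safe to hand to the C routines.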

Weird characters when trying to grab char * from fstream

I am trying to read 4 characters at a specific position from a file. The code is simple but the result is really confusing:
fstream dicomFile;
dicomFile.open(argv[1]);
dicomFile.seekg(128,ios::beg);
char * memblock = new char [4];
dicomFile.read(memblock,4);
cout<<"header is "<<memblock<<endl;
Ideally the result should be "DICM" but the actual result from the console was "DICM" plus weird characters, as shown in the picture. What's more, every time I run it, the characters are different. I suppose this may be something about ASCII and Unicode, I tried to change project property from Unicode to multibytes and then change back, no difference.
Does anyone know what's happening here and how do I solve it please? Thanks very much!
C style (char *) strings use the concept of null-terminators. This means strings are ended with a '\0' character in their last element. You are reading in exactly 4 characters into a 4 character buffer, which does not include a null character to end the string. C and C++ will happily run right off the end of your buffer in search for the null terminator that signifies the end of the string.
A quick fix is to create a block of length + 1, read in length bytes, then set str[length] = '\0'. In your case it would be as below.
char * memblock = new char [5];
dicomFile.read(memblock, 4); // populate memblock with 4 characters
memblock[4] = '\0';
A better solution is to use std::string instead of char * when working with strings in C++.
You could also initialize the buffer with zeros, putting null-terminators at every location.
char * memblock = new char [5](); // zeros, and one element longer
Fairly inefficient though.
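The std::string approach could look roughly like the sketch below. To keep it runnable without a real DICOM file, it reads from an in-memory stream; read_tag, sample_file_bytes and read_sample_tag are hypothetical helper names, and the sample content is invented. With a real file you would pass the opened fstream to read_tag instead.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Reads `n` bytes starting at `offset`; returns an empty string if the
// stream is too short. The std::string keeps track of its own length.
std::string read_tag(std::istream& in, std::streamoff offset, std::size_t n) {
    assert(offset >= 0);
    std::string tag(n, '\0');
    in.seekg(offset, std::ios::beg);
    in.read(&tag[0], static_cast<std::streamsize>(n));
    if (static_cast<std::size_t>(in.gcount()) != n) return std::string();
    return tag;
}

// In-memory stand-in for a file: 128 preamble bytes, then the 4-byte tag.
std::string sample_file_bytes() {
    return std::string(128, '\0') + "DICM" + "rest of file";
}

// Reads the tag from the in-memory sample.
std::string read_sample_tag() {
    std::istringstream in(sample_file_bytes());
    return read_tag(in, 128, 4);
}
```

Printing the returned string with cout stops after four characters because the length is stored in the string, not inferred from a terminator.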

How to read in only a particular number of characters

I have a small query regarding reading a set of characters from a structure. For example: A particular variable contains a value "3242C976*32" (char - type). How can I get only the first 8 bits of this variable. Kindly help.
Thanks.
Edit:
I'm trying to read in a signal:
For Ex: $ASWEER,2,X:3242C976*32
into this structure:
struct pg
{
char command[7]; // saves as $ASWEER,2,X:3242C976*32
char comma1[1]; // saves as ,2,X:3242C976*32
char groupID[1]; // saves as 2,X:3242C976*32
char comma2[1]; // etc
char handle[2]; // this is the problem, need it to save specifically each part, buts its not
char canID[8];
char checksum[3];
}m_pg;
...
When memcopying the buffer into the structure, it works, but because there are no carriage returns, each char member shows the rest of the signal. So there is always garbage at the end.
You could convert the hex value in canID to a float (depending on how you want to display it), e.g.
float value1 = HexToFloat(m_pg.canID); // find a conversion routine for HexToFloat
CString val;
val.Format("%0.3f", value1);
The garbage values aren't actually stored in the structure; they only appear that way when displayed, because no terminator follows each member. So format the message however you want and display it using the CString val.
If "3242C976*3F" is a c-string or std::string, you can just do:
const char* str = "3242C976*3F"; // string literals are const in C++
char first_byte = str[0];
Or with an arbitrary memory block you can do:
SomeStruct memoryBlock;
char firstByte;
memcpy(&firstByte, &memoryBlock, 1);
Both copy the first 8bits or 1 byte from the string or arbitrary memory block just as well.
After the edit (original answer below)
Just copy by parts. In C, something like this should work (could also work in C++ but may not be idiomatic)
strncpy(m_pg.command, value, 7); // m_pg.command[7] = 0; // oops
strncpy(m_pg.comma1, value+7, 1); // m_pg.comma1[1] = 0; // oops
strncpy(m_pg.groupID, value+8, 1); // m_pg.groupID[1] = 0; // oops
strncpy(m_pg.comma2, value+9, 1); // m_pg.comma2[1] = 0; // oops
// etc
Also, you don't have space for the string terminator in the members of the structure (therefore the oopses above). They are NOT strings. Do not printf them!
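In C++ the same copy-by-parts idea can be done with std::string members, which removes the terminator problem altogether. This is only a sketch: the field widths are taken from the struct in the question, while ParsedSignal, field and parse_signal are names invented here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Same fields as the struct in the question, but each one knows its length.
struct ParsedSignal {
    std::string command;   // 7 chars, e.g. "$ASWEER"
    std::string groupID;   // 1 char
    std::string handle;    // 2 chars
    std::string canID;     // 8 chars
    std::string checksum;  // 3 chars
};

// Returns `len` characters starting at `pos`; the caller guarantees the range.
std::string field(const std::string& line, std::size_t pos, std::size_t len) {
    assert(pos + len <= line.size());
    return line.substr(pos, len);
}

// Returns false if the line is shorter than the fixed layout (23 chars).
bool parse_signal(const std::string& line, ParsedSignal& out) {
    const std::size_t total = 7 + 1 + 1 + 1 + 2 + 8 + 3;
    if (line.size() < total) return false;
    out.command  = field(line, 0, 7);
    out.groupID  = field(line, 8, 1);    // skips the comma at index 7
    out.handle   = field(line, 10, 2);   // skips the comma at index 9
    out.canID    = field(line, 12, 8);
    out.checksum = field(line, 20, 3);
    return true;
}
```

Each member can then be printed or compared directly.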
Don't read more than 8 characters. In C, something like
char value[9]; /* 8 characters and a 0 terminator */
int ch;
scanf("%8s", value);
/* optionally ignore further input */
while (((ch = getchar()) != '\n') && (ch != EOF)) /* void */;
/* input terminated with ch (either '\n' or EOF) */
I believe the above code also "works" in C++, but it may not be idiomatic in that language
If you have a char pointer, you can just set str[8] = '\0'; Be careful though, because if the buffer is less than 8 (EDIT: 9) bytes, this could cause problems.
(I'm just assuming that the name of the variable that already is holding the string is called str. Substitute the name of your variable.)
It looks to me like you want to split at the comma and save the text up to there. This can be done with strtok(), which splits the string into tokens at each comma, or with strchr() to find the comma and strncpy() to copy the characters up to it.
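If std::string is an option, splitting at the commas can be sketched with find and substr instead of strtok. The function name split is made up for this example, and it assumes that empty fields should be kept.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Splits `text` at every occurrence of `sep`; empty fields are kept.
std::vector<std::string> split(const std::string& text, char sep) {
    std::vector<std::string> parts;
    std::size_t start = 0;
    while (true) {
        std::size_t pos = text.find(sep, start);
        if (pos == std::string::npos) {
            parts.push_back(text.substr(start));  // last field
            break;
        }
        parts.push_back(text.substr(start, pos - start));
        start = pos + 1;                          // continue after the separator
    }
    assert(!parts.empty());  // even an empty input yields one (empty) field
    return parts;
}
```

Unlike strtok, this leaves the input untouched and does not merge consecutive separators.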

C++ partially filling array using null

NoobQuestion:
I heard that filling a char array can be terminated early with the null char. How is this done?
I've searched every single google result out there but still have come up empty handed.
Do you mean something like this:
char test[11] = "helloworld";
std::cout << test << std::endl;
test[2] = 0;
std::cout << test;
This outputs
helloworld
he
?
That's a convention called "null-terminated string". If you have a block of memory which you treat as a char buffer and there's a null character within that buffer then the null-terminated string is whatever is contained starting with the beginning of the buffer and up to and including the null character.
const int bufferLength = 256;
char buffer[bufferLength] = "somestring"; //10 characters plus a null character put by the compiler - total 11 characters
Here the compiler will place a null character after "somestring" (it does so even if you don't ask it to). So even though the buffer is of length 256, all the functions that work with null-terminated strings (like strlen()) will not read beyond the null character at position 10.
That is the "early termination" - whatever data is in the buffer beyond the null character it is ignored by any code designed to work with null-terminated strings. The last part is important - code could easily ignore the null character and then no "termination" would happen on null character.
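A small sketch of that last point: the bytes after the null character are still in the buffer, and only code that honours the terminator stops early. visible_length and raw_bytes are names invented for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Length as seen by code that honours the null terminator.
std::size_t visible_length(const char* buffer) {
    assert(buffer != nullptr);
    return std::strlen(buffer);
}

// Copies `n` raw bytes, ignoring any null characters inside the range.
std::string raw_bytes(const char* buffer, std::size_t n) {
    assert(buffer != nullptr || n == 0);
    return std::string(buffer, n);
}
```

With the "helloworld" buffer from the earlier answer, setting test[2] = 0 makes visible_length report 2, while raw_bytes(test, 10) still returns all ten bytes, including the remaining letters after the zero.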