Can anyone explain to me why the following code is causing a segmentation fault? buff should be long enough to hold 128 characters.
int main () {
char buff[16384];
char buff2[128];
sprintf(buff2, "MinPer(PatternName_Equal27_xxxxxxx_MasterPatSetup.PatternName_Equal27_xxxxxxx__default_WFT___WFTRef.ActualT0Period.UserPeriod_2_1)" );
strcat(buff, buff2);
std::cout << buff2 << endl;
std::cout << buff << endl;
return 0;
}
You have two major problems:
1. Your sprintf is shoving 131 bytes (130 characters plus a NUL) into a 128-byte buffer, meaning three unrelated stack bytes past the end of buff2 are getting overwritten. You need a larger buffer, or a shorter initialization string.
2. You call strcat to append those 130 characters to a buffer with indeterminate contents (no NUL to indicate where the string being concatenated to ends). This is trivially fixable, either by zero-initializing all of buff (char buff[16384] = {0};) or by putting a NUL in the first byte (which is all you really need) with buff[0] = '\0'; just before the strcat. Equivalently, you could replace strcat (which assumes the destination already holds a string to append to) with strcpy (which ignores the existing contents of the destination) to avoid the problem.
Basically, your code is full of undefined behavior and buffer overruns. Given you're using C++, can I recommend just using std::string to avoid the hassle of C strings?
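A minimal sketch of the std::string route, assuming the goal is simply to append the message to an initially empty buffer. The function name build_message is made up for illustration and is not part of the original code.

```cpp
#include <string>

// Hypothetical helper: builds the same text with std::string, which grows
// as needed, so there is no fixed-size buffer to overrun.
std::string build_message() {
    std::string buff;  // starts out as a valid empty string
    const std::string buff2 =
        "MinPer(PatternName_Equal27_xxxxxxx_MasterPatSetup."
        "PatternName_Equal27_xxxxxxx__default_WFT___WFTRef."
        "ActualT0Period.UserPeriod_2_1)";
    buff += buff2;     // takes the place of strcat(buff, buff2)
    return buff;
}
```

Printing the result with std::cout << build_message() << std::endl; works the same way as before, without any manual sizing.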
buff is uninitialized. It needs to contain a null terminated string so that strcat knows where to begin the concatenation. One way to do this is with strcpy:
strcpy(buff, ""); // initialize with empty null terminated string
strcat(buff, buff2); // add to it
strcat needs 'dest' to be a string ending with '\0'. So buff should be initialized manually.
I have a string like,
string str="aaa\0bbb";
and I want to copy the value of this string to a char* variable. I tried the following methods but none of them worked.
char *c=new char[7];
memcpy(c,&str[0],7); // c="aaa"
memcpy(c,str.data(),7); // c="aaa"
strcpy(c,str.data()); // c="aaa"
str.copy(c,7); // c="aaa"
How can I copy that string to a char* variable without losing any data?
You can do it the following way
#include <iostream>
#include <string>
#include <cstring>
int main()
{
std::string s( "aaa\0bbb", 7 );
char *p = new char[s.size() + 1];
std::memcpy( p, s.c_str(), s.size() );
p[s.size()] = '\0';
size_t n = std::strlen( p );
std::cout << p << std::endl;
std::cout << p + n + 1 << std::endl;
}
The program output is
aaa
bbb
You need to keep the allocated size of the character array, s.size() + 1, somewhere in the program, because strlen cannot recover it.
If there is no need to treat the "second part" of the object as a string, then you may allocate only s.size() bytes and skip the terminating zero.
In fact these methods used by you
memcpy(c,&str[0],7); // c="aaa"
memcpy(c,str.data(),7); // c="aaa"
str.copy(c,7); // c="aaa"
are correct. They copy exactly 7 characters, provided that you are not going to append a terminating zero to the resulting array. The problem is that you are trying to output the resulting character array as a string, and the output operators you used print only the characters before the embedded zero character.
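One way to avoid tracking the length by hand is to let a container carry it. The following is only a sketch under that assumption; copy_all_bytes is a hypothetical name, and std::vector<char> is swapped in for the raw new[] buffer (its data() member still yields a char*).

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Copies every byte of `s`, embedded '\0' included, into a vector.
// The vector remembers its own size, so the length is not lost.
std::vector<char> copy_all_bytes(const std::string& s) {
    std::vector<char> out(s.size());
    if (!s.empty()) {
        std::memcpy(out.data(), s.data(), s.size());
    }
    return out;
}
```

The caller has to build the std::string with an explicit length, for example std::string("aaa\0bbb", 7), otherwise only the first three characters ever reach this function.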
Your string consists of 3 characters. You may try to use
using namespace std::literals;
string str="aaa\0bbb"s;
to create a string with \0 inside; it will consist of 7 characters.
It still won't help if you use it as a C string ((const) char*). C strings can't contain a zero character.
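The difference between the two initializations can be checked with a small sketch. The function names here are illustrative only, and the s suffix assumes a C++14 or later compiler.

```cpp
#include <cstddef>
#include <string>

// Plain literal: the const char* constructor stops at the first '\0'.
std::size_t plain_literal_size() {
    std::string str = "aaa\0bbb";
    return str.size();
}

// Literal with the `s` suffix: the full length of the literal is passed on,
// so the embedded '\0' and everything after it are kept.
std::size_t suffixed_literal_size() {
    using namespace std::literals;
    std::string str = "aaa\0bbb"s;
    return str.size();
}
```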
There are two things to consider: (1) make sure that str already contains the complete literal (the constructor taking only a char* parameter might truncate at the string terminator char). (2) Provided that str actually contains the complete literal, statement memcpy(c,str.data(),7) should work. The only thing then is how you "view" the result, because if you pass c to printf or cout, then they will stop printing once the first string terminating character is reached.
So: To make sure that your string literal "aaa\0bbb" gets completely copied into str, use std::string str("aaa\0bbb",7); Then, try to print the contents of c in a loop, for example:
std::string str("aaa\0bbb",7);
const char *c = str.data();
for (int i=0; i<7; i++) {
printf("%c", c[i] ? c[i] : '0');
}
You already did (not really, see edit below). The problem however, is that whatever you are using to print the string (printf?), is using the c string convention of ending strings with a '\0'. So it starts reading your data, but when it gets to the 0 it will assume it is done (because it has no other way).
If you want to simply write the buffer to the output, you will have to do this with something like
write(STDOUT_FILENO, c, 7); // POSIX write() from <unistd.h> takes a file descriptor, not a FILE*
Now write has information about where the data ends, so it can write all of it.
Note, however, that your terminal cannot really show a \0 character, so it might show some weird symbol or nothing at all. If you are on Linux, you can pipe the output into hexdump to see the raw bytes.
EDIT:
Just realized that your string is also initialized from a const char*, which reads only up to the zero. So you will also have to use a constructor that is given the length, so that it reads past the zero:
std::string("data\0afterzero", 14);
(there are prettier solutions probably)
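In C++ the same idea (pass the length explicitly instead of relying on a terminator) can be sketched with std::ostream::write. The helper names write_bytes and written_bytes are assumptions made for this example.

```cpp
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>

// Writes exactly `len` bytes, including any '\0', to the stream.
void write_bytes(std::ostream& os, const char* data, std::size_t len) {
    os.write(data, static_cast<std::streamsize>(len));
}

// Convenience wrapper: returns what write_bytes produced, for inspection.
std::string written_bytes(const std::string& s) {
    std::ostringstream oss;
    write_bytes(oss, s.data(), s.size());
    return oss.str();
}
```

Calling write_bytes(std::cout, c, 7) would send all seven bytes to standard output; whether the terminal displays the zero byte is a separate matter.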
Why does this code produce runtime issues:
char stuff[100];
strcat(stuff,"hi ");
strcat(stuff,"there");
but this doesn't?
char stuff[100];
strcpy(stuff,"hi ");
strcat(stuff,"there");
strcat will look for the null-terminator, interpret that as the end of the string, and append the new text there, overwriting the null-terminator in the process, and writing a new null-terminator at the end of the concatenation.
char stuff[100]; // 'stuff' is uninitialized
Where is the null terminator? stuff is uninitialized, so it might start with NUL, or it might not have NUL anywhere within it.
In C++, you can do this:
char stuff[100] = {}; // 'stuff' is initialized to all zeroes
Now you can do strcat, because the first character of 'stuff' is the null-terminator, so it will append to the right place.
In C, you still need to initialize 'stuff', which can be done a couple of ways:
char stuff[100]; // not initialized
stuff[0] = '\0'; // first character is now the null terminator,
// so 'stuff' is effectively ""
strcpy(stuff, "hi "); // this initializes 'stuff' if it's not already.
In the first case, stuff contains garbage. strcat requires both the destination and the source to contain proper null-terminated strings.
strcat(stuff, "hi ");
will scan stuff for a terminating '\0' character, where it will start copying "hi ". If it doesn't find it, it will run off the end of the array, and arbitrarily bad things can happen (i.e., the behavior is undefined).
One way to avoid the problem is like this:
char stuff[100];
stuff[0] = '\0'; /* ensures stuff contains a valid string */
strcat(stuff, "hi ");
strcat(stuff, "there");
Or you can initialize stuff to an empty string:
char stuff[100] = "";
which will fill all 100 bytes of stuff with zeros (the increased clarity is probably worth any minor performance issue).
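As a quick check of the zero-fill claim, here is a small sketch (the function name is invented for the example) that inspects every byte of an array initialized with = "".

```cpp
#include <cstddef>

// Returns true if every byte of an array initialized with = "" is zero.
bool empty_init_zero_fills() {
    char stuff[100] = "";  // remaining elements are zero-initialized
    for (std::size_t i = 0; i < sizeof stuff; ++i) {
        if (stuff[i] != '\0') {
            return false;
        }
    }
    return true;
}
```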
Because stuff is uninitialized before the call to strcat. After the declaration, stuff isn't an empty string; it is uninitialized data.
strcat appends data to the end of a string, that is, it finds the null terminator in the string and adds characters from there. An uninitialized array isn't guaranteed to have a null terminator, so strcat is likely to crash.
If you were to initialize stuff as below, you could perform the strcat calls:
char stuff[100] = "";
strcat(stuff,"hi ");
strcat(stuff,"there");
strcat appends a string to an existing string. If the array is uninitialized, strcat is not going to find the end of the string ('\0') where you expect it, and that will cause a runtime error.
According to the Linux man page, a simple strncat might be implemented as shown below; strcat works the same way, minus the n limit:
char*
strncat(char *dest, const char *src, size_t n)
{
size_t dest_len = strlen(dest);
size_t i;
for (i = 0 ; i < n && src[i] != '\0' ; i++)
dest[dest_len + i] = src[i];
dest[dest_len + i] = '\0';
return dest;
}
As you can see in this implementation, strlen(dest) will not return correct string length unless dest is initialized to correct c string values. You may get lucky to have an array with the first value of zero at char stuff[100]; , but you should not rely on it.
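Also, I would advise against using strcpy or strcat, as they perform no bounds checking.
Prefer strncpy and strncat, as they help prevent buffer overflows. Note, however, that strncpy does not NUL-terminate the destination when the source is at least n characters long, and that the n passed to strncat limits the number of characters appended, not the total size of the buffer.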
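For comparison, a bounded append can also be written with an explicit length check instead of strncat. This is only a sketch; append_if_fits is a hypothetical helper, and it assumes dest already holds a valid C string shorter than cap.

```cpp
#include <cstddef>
#include <cstring>

// Appends `src` to the string in `dest` only if the result, including the
// terminating NUL, fits in `cap` bytes. Returns false and leaves `dest`
// unchanged otherwise. `dest` must already hold a valid C string.
bool append_if_fits(char* dest, std::size_t cap, const char* src) {
    std::size_t used = std::strlen(dest);
    std::size_t extra = std::strlen(src);
    if (used + extra + 1 > cap) {
        return false;
    }
    std::memcpy(dest + used, src, extra + 1);  // copies the NUL as well
    return true;
}
```

With char stuff[100] = ""; the calls append_if_fits(stuff, sizeof stuff, "hi ") and append_if_fits(stuff, sizeof stuff, "there") build the same string as the strcat version, but refuse to write past the end.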
Here is the code:
int main()
{
char str[] = {'a','b','c',' ','d','e',' ',' ','f',' ',' ',' ','g','h','i',' ',' ',' ',' ','j','k'};
cout << "Len = " << strlen(str) << endl;
char* cstr = new char[strlen(str)];
strcpy(cstr, str);
cstr[5] = '\0';
cout << "Len= " << strlen(cstr) << endl;
return 0;
}
//---------------
Result console:
Len = 21
Len= 5
As you see, the Len of cstr changed. Does that mean the remaining memory area of cstr is free?
No. All strlen() does is look for the first null character ('\0') in the string. It doesn't free memory. It doesn't even care if the memory it examines is properly allocated. It will happily walk past the end of allocated memory in search of a null character if none is found starting from the pointer you give it.
The code is broken from the start. str is not a NUL-terminated string, and as such can't be used with functions expecting those strings, such as strlen or strcpy.
As you see, the Len of cstr changed. Does that mean the remaining memory area of cstr is free?
No, it's not. You allocated memory for the array on the heap and then inserted a \0 in the middle of it. Because of this, strlen reports a length of 5 (it computes the length of a char array by looking for the \0 character), but the memory past that index is still allocated on the heap. To free the memory, you need to call delete [] cstr.
No. new just allocates a chunk of memory of the size you specified. The only way to release it is to call delete[] on it.
strlen is a function that scans memory from a starting address and counts the bytes before the first NUL byte; such a NUL-terminated sequence is called a C string.
Putting a NUL byte somewhere in memory is no different from putting any other value for the memory management.
As you see, the Len of cstr changed. Does that mean the remaining memory area of cstr is free?
No. strlen only returns the length of the string stored within the array, not the size of the array itself. The length of the string may be anywhere from 0 to strlen(str) - 1, but the size of cstr is always going to be strlen(str).
The size of the array does not change just because you stored a smaller string in it, any more than a glass gets smaller if you only fill it halfway. The only way to release the memory pointed to by cstr is to use the delete[] operator.
No, it does not mean that the remaining memory area of cstr is free.
strlen(cstr) calculates the length of the string up to the point where a NUL \0 character is encountered.
In the beginning you allocated a char array of 21 chars (strlen(str) returned 21). Replacing a char in the middle with a NUL \0 only makes strlen believe that the string is 5 chars long. It does not free the 15 chars that follow the replaced char.
The memory for the char array str is released automatically once the function main() exits (since it is a local array); the heap array cstr has to be released with delete[].
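The point that truncating a string does not shrink its allocation can be sketched as follows. This is an illustrative variant, not the asker's code: it uses a NUL-terminated literal and std::unique_ptr<char[]> so that delete[] runs automatically, and the function name is made up.

```cpp
#include <cstddef>
#include <cstring>
#include <memory>

// Allocates room for "abc de  f" plus its NUL, truncates the string at
// index 5, and returns the new strlen. The allocation keeps its size
// until the unique_ptr releases it with delete[].
std::size_t truncated_length() {
    const char text[] = "abc de  f";  // string literal: NUL-terminated
    std::unique_ptr<char[]> cstr(new char[sizeof text]);  // strlen + 1 bytes
    std::strcpy(cstr.get(), text);
    cstr[5] = '\0';                   // shortens the string only
    return std::strlen(cstr.get());
}   // delete[] runs here and frees all sizeof(text) bytes
```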
NoobQuestion:
I heard that filling a char array can be terminated early with the null char. How is this done?
I've searched every single Google result out there but have still come up empty-handed.
Do you mean something like this:
char test[11] = "helloworld";
std::cout << test << std::endl;
test[2] = 0;
std::cout << test;
This outputs
helloworld
he
?
That's a convention called "null-terminated string". If you have a block of memory which you treat as a char buffer and there's a null character within that buffer then the null-terminated string is whatever is contained starting with the beginning of the buffer and up to and including the null character.
const int bufferLength = 256;
char buffer[bufferLength] = "somestring"; // 10 characters plus a null character added by the compiler: 11 characters in total
here the compiler will place a null character after the "somestring" (it does so even if you don't ask to). So even though the buffer is of length 256 all the functions that work with null-terminated strings (like strlen()) will not read beyond the null character at position 10.
That is the "early termination" - whatever data is in the buffer beyond the null character it is ignored by any code designed to work with null-terminated strings. The last part is important - code could easily ignore the null character and then no "termination" would happen on null character.
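A small sketch of that behavior, with names invented for the example: after a null character is written into the middle of the buffer, strlen stops there, yet the bytes behind it are still present.

```cpp
#include <cstddef>
#include <cstring>

struct TerminationResult {
    std::size_t length;  // what strlen reports after the early termination
    char next_byte;      // the byte right after the inserted null character
};

TerminationResult terminate_early() {
    char buffer[256] = "somestring";
    buffer[4] = '\0';    // the string is now "some"
    return {std::strlen(buffer), buffer[5]};
}
```

Code that honors the convention sees a four-character string; code that indexes the buffer directly can still read the 't' stored at position 5.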
I am passing an empty char array that I need to recursively fill using strcat(). However, in the VS debugger, the array is not empty, it's full of some weird junk characters that I don't recognise. strcat() then appends to the end of these junk characters rather than at the front of the array.
I have also tried encoded[0] = '\0' to clear the junk before passing the array, but then strcat() doesn't append anything on the recursive call.
This is the code that supplies the array and calls the recursive function:
char encoded[512];
text_to_binary("Some text", encoded);
This is the recursive function:
void text_to_binary(const char* str, char* encoded)
{
char bintemp[9];
bintemp[0] = '\0';
while(*str != '\0')
{
ascii_to_binary(*str, bintemp);
strcat(encoded, bintemp);
str++;
text_to_binary(str, encoded);
}
}
What is going on?
ps. I can't use std::string - I am stuck with the char*.
Edit: This is the junk character in the array:
ÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌ...
You are not initialising the array. Change:
char encoded[512];
to
char encoded[512] = "";
strcat appends to the end of the string. The end is marked by a \0; strcat then writes a new \0 at the new end position.
You should clear the destination encoded with either encoded[0]=0; or memset first.
char encoded[512];.. encoded is not initialized and will contain junk (or 0xCCCCCCCC in debug builds).
I think your problem was due to the initialization of encoded. A few comments on your program:
First, it's better to avoid a recursive function when you can do the same thing with a loop.
Second, you should pass the size of encoded to avoid a possible overflow (in case the encoded string is bigger than encoded).
void text_to_binary(const char* str, char* encoded)
{
char bintemp[9];
bintemp[0] = '\0';
encoded[0] = '\0';
for(const char *i = str; *i != '\0'; i++)
{
ascii_to_binary(*i, bintemp);
strcat(encoded, bintemp);
}
}
PS: I haven't tested this code, so if there is an error, add a comment and I will correct it.
Good luck with the rest of your project.
The solution to your immediate problem has been posted already, but your text_to_binary is still inefficient. You are essentially calling strcat in a loop with always the same string to concatenate to, and strcat needs to iterate through the string to find its end. This makes your algorithm quadratic. What you should do is to keep track of the end of encoded on your own and put the content of bintemp directly there. A better way to write the loop would be
while(*str != '\0')
{
ascii_to_binary(*str, bintemp);
strcpy(encoded, bintemp);
encoded += strlen(bintemp);
str++;
}
You don't need the recursion because you are already looping over str (I believe this to be correct, as your original code will fill encoded pretty weirdly). Also, in the modified version, encoded is always pointing to the end of the original encoded string, so you can just use strcpy instead of strcat.
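Putting the end-pointer idea together with a size limit might look like the sketch below. The body of ascii_to_binary is an assumption (the original helper was never posted), and the extra cap parameter is an addition for illustration, not part of the asker's signature.

```cpp
#include <cstddef>

// Assumed behavior (the real helper was not shown): writes the 8 bits of
// `c`, most significant first, as '0'/'1' characters followed by a NUL.
void ascii_to_binary(char c, char* out) {
    unsigned char u = static_cast<unsigned char>(c);
    for (int bit = 7; bit >= 0; --bit) {
        *out++ = ((u >> bit) & 1u) ? '1' : '0';
    }
    *out = '\0';
}

// Encodes `str` into `encoded` (capacity `cap` bytes), stopping early
// rather than overflowing. Returns the number of characters written,
// not counting the terminating NUL.
std::size_t text_to_binary(const char* str, char* encoded, std::size_t cap) {
    if (cap == 0) {
        return 0;
    }
    char* end = encoded;  // always points at the current end of the output
    while (*str != '\0' &&
           static_cast<std::size_t>(end - encoded) + 9 <= cap) {
        ascii_to_binary(*str, end);  // writes 8 characters plus NUL in place
        end += 8;
        ++str;
    }
    *end = '\0';          // also covers the empty-input case
    return static_cast<std::size_t>(end - encoded);
}
```

Because the helper writes directly at end, there is no strcat and no repeated scan of the output, so the work grows linearly with the input length.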
You didn't attach the source of ascii_to_binary. Let's assume that it fills the buffer with a hex dump of the char; if that is the case, it's easier to use sprintf(encoded + (i * 2), "%02x", (unsigned char)str[i]); for each index i.
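If the hex-dump assumption holds, a bounded version of that sprintf idea could be sketched like this. The name text_to_hex and the cap parameter are hypothetical; snprintf is used so each call writes at most two digits plus a NUL.

```cpp
#include <cstddef>
#include <cstdio>

// Writes two lowercase hex digits per input character into `out`.
// `cap` is the size of `out`; encoding stops when the next pair of digits
// plus the terminating NUL would no longer fit.
std::size_t text_to_hex(const char* str, char* out, std::size_t cap) {
    if (cap == 0) {
        return 0;
    }
    std::size_t i = 0;
    for (; str[i] != '\0' && i * 2 + 3 <= cap; ++i) {
        // Cast through unsigned char so bytes above 0x7f are not sign-extended.
        std::snprintf(out + i * 2, 3, "%02x",
                      static_cast<unsigned>(static_cast<unsigned char>(str[i])));
    }
    out[i * 2] = '\0';
    return i * 2;
}
```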
What's the point of recursively calling text_to_binary? I think this might be a problem.