StringCchCat does not append source string to destination string [duplicate] - c++

Why does this code produce runtime issues:
char stuff[100];
strcat(stuff,"hi ");
strcat(stuff,"there");
but this doesn't?
char stuff[100];
strcpy(stuff,"hi ");
strcat(stuff,"there");

strcat will look for the null-terminator, interpret that as the end of the string, and append the new text there, overwriting the null-terminator in the process, and writing a new null-terminator at the end of the concatenation.
char stuff[100]; // 'stuff' is uninitialized
Where is the null terminator? stuff is uninitialized, so it might start with NUL, or it might not have NUL anywhere within it.
In C++, you can do this:
char stuff[100] = {}; // 'stuff' is initialized to all zeroes
Now you can do strcat, because the first character of 'stuff' is the null-terminator, so it will append to the right place.
In C, you still need to initialize 'stuff', which can be done a couple of ways:
char stuff[100]; // not initialized
stuff[0] = '\0'; // first character is now the null terminator,
// so 'stuff' is effectively ""
strcpy(stuff, "hi "); // this initializes 'stuff' if it's not already.

In the first case, stuff contains garbage. strcat requires both the destination and the source to contain proper null-terminated strings.
strcat(stuff, "hi ");
will scan stuff for a terminating '\0' character, where it will start copying "hi ". If it doesn't find it, it will run off the end of the array, and arbitrarily bad things can happen (i.e., the behavior is undefined).
One way to avoid the problem is like this:
char stuff[100];
stuff[0] = '\0'; /* ensures stuff contains a valid string */
strcat(stuff, "hi ");
strcat(stuff, "there");
Or you can initialize stuff to an empty string:
char stuff[100] = "";
which will fill all 100 bytes of stuff with zeros (the increased clarity is probably worth any minor performance issue).
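The point of the answers above can be sketched as a small helper. This is only an illustration: the name make_greeting and its capacity parameter are made up for this example and do not appear in the question.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: writes "hi there" into buf, which may hold garbage on entry.
void make_greeting(char* buf, std::size_t cap) {
    assert(cap >= sizeof("hi there"));  // 9 bytes, including the terminator
    buf[0] = '\0';                      // buf is now a valid empty string
    std::strcat(buf, "hi ");            // finds the terminator at buf[0]
    std::strcat(buf, "there");          // finds the terminator after "hi "
}
```

Without the buf[0] = '\0'; line, the first strcat would scan whatever bytes happen to be in the array, which is the undefined behavior described above.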

Because in the first snippet stuff is uninitialized when strcat is called. After the declaration, stuff isn't an empty string; it is uninitialized data.
strcat appends data to the end of a string: it finds the null terminator in the destination and writes the new characters starting there. An uninitialized array isn't guaranteed to contain a null terminator, so strcat may read and write past the end of the array, and a crash is likely.
If you were to initialize stuff as below, you could perform the strcat calls:
char stuff[100] = "";
strcat(stuff,"hi ");
strcat(stuff,"there");

strcat appends a string to an existing string. If the destination array is uninitialized, strcat may not find the end of the string ('\0') where you expect it, which can cause a runtime error.
The Linux man page gives this simple implementation of strncat, the length-limited variant of strcat:
char*
strncat(char *dest, const char *src, size_t n)
{
size_t dest_len = strlen(dest);
size_t i;
for (i = 0 ; i < n && src[i] != '\0' ; i++)
dest[dest_len + i] = src[i];
dest[dest_len + i] = '\0';
return dest;
}
As this implementation shows, strlen(dest) will not return the correct length unless dest already holds a valid C string. You may get lucky and find a zero in the first element of char stuff[100];, but you should not rely on it.

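Also, I would advise against using strcpy or strcat, as they perform no bounds checking and can overflow the destination.
Prefer strncpy and strncat, which help prevent buffer overflows, but note two caveats: strncpy does not write a null terminator when the source is at least n characters long, and the n passed to strncat is the maximum number of characters to append, not the size of the destination.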
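A minimal sketch of a bounded append built on strncat, assuming the caller knows the total capacity of the destination. The helper name append_bounded is hypothetical; the key detail is that the count passed to strncat is the remaining room, not the buffer size.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: appends as much of src as fits; returns true if all of src fit.
bool append_bounded(char* dest, std::size_t cap, const char* src) {
    assert(cap > 0);
    std::size_t used = std::strlen(dest);   // dest must already be a valid string
    assert(used < cap);
    std::size_t room = cap - used - 1;      // leave one byte for the terminator
    std::strncat(dest, src, room);          // strncat always writes a terminator
    return std::strlen(src) <= room;
}
```

With char buf[8] = "";, appending "hi " succeeds, while a following append of "there!" is truncated and reported as such by the return value.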

Related

Why strcat causing segmentation fault

Can anyone explain to me why the following code is causing a segmentation fault? buff should be long enough to hold 128 characters.
int main () {
char buff[16384];
char buff2[128];
sprintf(buff2, "MinPer(PatternName_Equal27_xxxxxxx_MasterPatSetup.PatternName_Equal27_xxxxxxx__default_WFT___WFTRef.ActualT0Period.UserPeriod_2_1)" );
strcat(buff, buff2);
std::cout << buff2 << endl;
std::cout << buff << endl;
return 0;
}
You have two major problems:
Your sprintf is shoving 131 bytes (130 characters plus a NUL) into a 128 byte buffer, meaning three unrelated stack bytes are getting overwritten with garbage. You need a larger buffer, or a smaller initialization string.
You call strcat to append those 130 characters to a buffer with indeterminate contents (no NUL to indicate where the existing string ends). This is trivially fixable, either by zero-initializing all of buff (char buff[16384] = {0};) or by storing a NUL in the first byte, which is all you really need: add buff[0] = '\0'; just before the strcat call. Equivalently, you could replace strcat (which assumes the destination already holds a string to append to) with strcpy (which ignores the existing contents of the destination) to avoid the problem.
Basically, your code is full of undefined behavior and buffer overruns. Given you're using C++, can I recommend just using std::string to avoid the hassle of C strings?
buff is uninitialized. It needs to contain a null terminated string so that strcat knows where to begin the concatenation. One way to do this is with strcpy:
strcpy(buff, ""); // initialize with empty null terminated string
strcat(buff, buff2); // add to it
strcat needs 'dest' to be a string ending with '\0'. So buff should be initialized manually.
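The sprintf overflow in this question can also be caught at run time with snprintf, which takes the buffer size and reports how many characters the full output needed. The sketch below assumes a hypothetical helper named format_checked; it is one possible approach, not the only fix.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical helper: copies text into buf and returns true only if it fit completely.
bool format_checked(char* buf, std::size_t cap, const char* text) {
    assert(cap > 0);
    int needed = std::snprintf(buf, cap, "%s", text);  // never writes more than cap bytes
    return needed >= 0 && static_cast<std::size_t>(needed) < cap;
}
```

Called with a 128-byte buffer and the 130-character string from the question, this would return false and leave a truncated but properly terminated string in the buffer.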

How to convert a std::string which contains '\0' to a char* array?

I have a string like,
string str="aaa\0bbb";
and I want to copy the value of this string to a char* variable. I tried the following methods but none of them worked.
char *c=new char[7];
memcpy(c,&str[0],7); // c="aaa"
memcpy(c,str.data(),7); // c="aaa"
strcpy(c,str.data()); // c="aaa"
str.copy(c,7); // c="aaa"
How can I copy that string to a char* variable without losing any data?
You can do it the following way
#include <iostream>
#include <string>
#include <cstring>
int main()
{
std::string s( "aaa\0bbb", 7 );
char *p = new char[s.size() + 1];
std::memcpy( p, s.c_str(), s.size() );
p[s.size()] = '\0';
size_t n = std::strlen( p );
std::cout << p << std::endl;
std::cout << p + n + 1 << std::endl;
}
The program output is
aaa
bbb
You need to keep somewhere in the program the allocated memory size for the character array equal to s.size() + 1.
If there is no need to keep the "second part" of the object as a string then you may allocate memory of the size s.size() and not append it with the terminating zero.
In fact these methods used by you
memcpy(c,&str[0],7); // c="aaa"
memcpy(c,str.data(),7); // c="aaa"
str.copy(c,7); // c="aaa"
are correct. They copy exactly 7 characters, provided that you do not need the resulting array to end with a terminating zero. The problem is that you are outputting the resulting character array as a string, and the output operators print only the characters before the embedded zero character.
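As a variation on the program above, the copy can be held in a std::vector<char> so the size travels with the data. This is a sketch under that assumption; the function name copy_with_nuls is invented for the example.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: copies every byte of s, embedded '\0' included, plus a final terminator.
std::vector<char> copy_with_nuls(const std::string& s) {
    std::vector<char> out(s.size() + 1);           // zero-initialized, so the last byte is '\0'
    std::memcpy(out.data(), s.data(), s.size());   // byte count comes from size(), not strlen
    assert(out.back() == '\0');
    return out;
}
```

For a string built as std::string("aaa\0bbb", 7), the result holds 8 bytes, while strlen on its data still reports 3 because it stops at the embedded zero.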
Your string consists of only 3 characters, because the constructor stops reading at the first \0. You can use
using namespace std::literals;
string str="aaa\0bbb"s;
to create a string with \0 inside; it will consist of 7 characters.
It still won't help if you use it as a C string ((const) char*), because C strings can't contain a zero character.
There are two things to consider: (1) make sure that str already contains the complete literal (the constructor taking only a char* parameter might truncate at the string terminator char). (2) Provided that str actually contains the complete literal, statement memcpy(c,str.data(),7) should work. The only thing then is how you "view" the result, because if you pass c to printf or cout, then they will stop printing once the first string terminating character is reached.
So: To make sure that your string literal "aaa\0bbb" gets completely copied into str, use std::string str("aaa\0bbb",7); Then, try to print the contents of c in a loop, for example:
std::string str("aaa\0bbb",7);
const char *c = str.data();
for (int i=0; i<7; i++) {
printf("%c", c[i] ? c[i] : '0');
}
You already did (not really, see edit below). The problem however, is that whatever you are using to print the string (printf?), is using the c string convention of ending strings with a '\0'. So it starts reading your data, but when it gets to the 0 it will assume it is done (because it has no other way).
If you want to simply write the buffer to the output, you will have to do this with something like
write(STDOUT_FILENO, c, 7); // POSIX write() takes a file descriptor, not a FILE*
Now write has information about where the data ends, so it can write all of it.
Note however that your terminal cannot really show a \0 character, so it might show some weird symbol or nothing at all. If you are on linux you can pipe into hexdump to see what the binary output is.
EDIT:
Just realized that your string is also initialized from a const char* by reading until the first zero. So you will also have to use a constructor that takes a length, to tell it to read past the zero:
std::string("data\0afterzero", 14);
(there are prettier solutions probably)
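The difference between the constructors discussed in these answers can be checked directly. The sketch below also includes a hypothetical helper, show_nuls, that makes embedded zeros visible when printing; it is an illustration, not part of any answer above.

```cpp
#include <cassert>
#include <string>

using namespace std::string_literals;

// Hypothetical helper: replaces each embedded '\0' with the two visible characters "\0".
std::string show_nuls(const std::string& s) {
    std::string out;
    for (char ch : s) {
        if (ch == '\0') out += "\\0"; else out += ch;
    }
    assert(out.size() >= s.size());  // output never shrinks
    return out;
}
```

std::string("aaa\0bbb") has size 3, whereas std::string("aaa\0bbb", 7) and "aaa\0bbb"s both have size 7, and passing either of the latter to show_nuls yields printable text containing all seven characters.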

How to check the contents of a LPTSTR string?

I'm trying to understand why a segmentation fault (SIGSEGV) occurs during the execution of this piece of code. The error occurs when testing the condition in the while statement, not on the first iteration but on the second.
LPTSTR arrayStr[STR_COUNT];
LPTSTR inputStr;
LPTSTR str;
// calls a function from external library
// in order to set the inputStr string
set_input_str(param1, (char*)&inputStr, param3);
str = inputStr;
while( *str != '\0' )
{
if( debug )
printf("String[%d]: %s\n", i, (char*)str);
arrayStr[i] = str;
str = str + strlen((char*)str) + 1;
i++;
}
After reading this answer, I have done some research on the internet and found this article, so I tried to modify the above code, using this piece of code read in this article (see below). However, this change did not solve the problem.
for (LPTSTR pszz = pszzStart; *pszz; pszz += lstrlen(pszz) + 1) {
... do something with pszz ...
}
As assumed in this answer, it seems that the code expects double null terminated arrays of string. Therefore, I wonder how I could check the contents of the inputStr string, in order to check if it actually contains only one null terminator char.
NOTE: the number of characters in the string printed from printf instruction is twice the value returned by the lstrlen(str) function call at the first iteration.
OK, now that you've included the rest of the code it is clear that it is indeed meant to parse a set of consecutive strings. The problem is that you're mixing narrow and wide string types. All you need to do to fix it is change the variable definitions (and remove the casts):
char *arrayStr[STR_COUNT];
char *inputStr;
char *str;
// calls a function from external library
// in order to set the inputStr string
set_input_str(param1, &inputStr, param3);
str = inputStr;
while( *str != '\0' )
{
if( debug )
printf("String[%d]: %s\n", i, str);
arrayStr[i] = str;
str = str + strlen(str) + 1;
i++;
}
Specifically, the issue was occurring on this line:
while( *str != '\0' )
since you hadn't cast str to char * the comparison was looking for a wide nul rather than a narrow nul.
str = str + strlen(str) + 1;
You go out of bounds, change to
str = str + 1;
or simply:
str++;
Of course you are inconsistently using TSTR and strlen, the latter assuming TCHAR = char
In any case, strlen returns the length of the string, which is the number of characters it contains not including the nul character.
Your arithmetic is out by one but you know you have to add one to the length of the string when you allocate the buffer.
Here however you are starting at position 0 and adding the length which means you are at position len which is the length of the string. Now the string runs from offset 0 to offset len - 1 and offset len holds the null character. Offset len + 1 is out of bounds.
Sometimes you might get away with reading it, if there is extra padding, but it is undefined behaviour and here you got a segfault.
This looks to me like code that expects double null terminated arrays of strings. I suspect that you are passing a single null terminated string.
So you are using something like this:
const char* inputStr = "blah";
but the code expects two null terminators. Such as:
const char* inputStr = "blah\0";
or perhaps an input value with multiple strings:
const char* inputStr = "foo\0bar\0";
Note that these final two strings are indeed double null terminated. Although only one null terminator is written explicitly at the end of the string, the compiler adds another one implicitly.
Your question edit throws a new spanner in the works. The cast in
strlen((char*)str)
is massively dubious. If you need to cast then the cast must be wrong. One wonders what LPTSTR expands to for you. Presumably it expands to wchar_t* since you added that cast to make the code compile. And if so, then the cast does no good. You are lying to the compiler (str is not char*) and lying to the compiler never ends well.
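For reference, walking a double-null-terminated list of narrow strings can be sketched as follows. The function name split_multi_sz is made up, and the sketch assumes char data (not wchar_t) and a block that really ends with an empty string.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <vector>

// Splits a double-null-terminated list ("foo\0bar\0\0") into its strings.
// Assumes narrow characters and that the block ends with an empty string.
std::vector<std::string> split_multi_sz(const char* block) {
    assert(block != nullptr);
    std::vector<std::string> items;
    for (const char* p = block; *p != '\0'; p += std::strlen(p) + 1) {
        items.emplace_back(p);   // copies up to, not including, the next '\0'
    }
    return items;
}
```

Passing the literal "foo\0bar\0" works because the compiler adds the second terminator; passing a buffer that ends with a single terminator would make the loop read past the end, which matches the crash on the second iteration.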
The reason for the segmentation fault is already given by Alter's answer. However, I'd like to add that the usual style of parsing a C-style string is more elegant and less verbose
while (char ch = *str++)
{
// other instructions
// ...
}
The scope of ch is limited to the loop.
Aside: Either tag the question as C or C++ but not both, they're different languages.

strncpy() to get end of string

I am using C Style strings for a project, and I am confusing myself a bit. I am checking strings to see what they are prepended with (zone_, player_, etc) then getting the rest of the string after that.
else if(strncmp(info, "zone_", 5) == 0)
{
int len = strlen(info);
char *zoneName = new char[len];
strncpy(zoneName, &info[5], len-5);
Msg("Zone Selected: %s\n", zoneName);
delete zoneName;
}
When I print out the zoneName variable though, it is correct except it is followed by a bunch of gibberish. What am I doing wrong? (I realize that the gibberish is the rest of the char array being empty, but I don't know a better way to do this)
See the strncpy description:
No null-character is implicitly appended to the end of destination, so destination will only be null-terminated if the length of the C string in source is less than num.
You have to remember that C-style strings are terminated with a NUL character. You've allocated enough space in zoneName, but you only need len-5 plus one:
char *zoneName = new char[len - 5 + 1];
Then, you can actually use strcpy() to copy the tail of the string:
strcpy(zoneName, &info[5]);
You don't need to specify the length because the source string is NUL terminated.
C strings are zero terminated - so they occupy len bytes (chars to be precise) plus one more with value zero known as the 'zero terminator'. You need to allocate one more character, and either copy one more from the source (since it should be zero terminated) or just set the last char of the destination to 0.
int len = strlen(info);
char *zoneName = new char[len - 5 + 1];
strncpy(zoneName, &info[5], len - 5 + 1);
C-style strings have to end with a byte of value zero. You should modify your code like this:
char *zoneName = new char[len-5+1];
strncpy(zoneName, &info[5], len-5);
/* terminate the string: the copied text occupies indices 0..len-6 */
zoneName[len-5]=0;
/* Now, it's safe to print */
Msg("Zone Selected: %s\n", zoneName);
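The fixes in these answers can be combined into one helper. This is a sketch only: tail_after_prefix is a hypothetical name, and std::unique_ptr<char[]> is used here so that the array allocated with new[] is released with delete[] automatically.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>

// Hypothetical helper: returns a newly allocated copy of whatever follows prefix in info,
// or nullptr if info does not start with prefix.
std::unique_ptr<char[]> tail_after_prefix(const char* info, const char* prefix) {
    assert(info != nullptr && prefix != nullptr);
    std::size_t plen = std::strlen(prefix);
    if (std::strncmp(info, prefix, plen) != 0) return nullptr;
    std::size_t tail_len = std::strlen(info) - plen;      // characters after the prefix
    std::unique_ptr<char[]> out(new char[tail_len + 1]);  // +1 for the terminator
    std::memcpy(out.get(), info + plen, tail_len + 1);    // copies the source terminator too
    return out;
}
```

For example, tail_after_prefix("zone_forest", "zone_") would yield "forest", properly terminated, so it is safe to print with %s.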

Issue with char[] in VS2008 - why does strcat append to the end of an empty array?

I am passing an empty char array that I need to recursively fill using strcat(). However, in the VS debugger, the array is not empty, it's full of some weird junk characters that I don't recognise. strcat() then appends to the end of these junk characters rather than at the front of the array.
I have also tried encoded[0] = '\0' to clear the junk before passing the array, but then strcat() doesn't append anything on the recursive call.
This is the code that supplies the array and calls the recursive function:
char encoded[512];
text_to_binary("Some text", encoded);
This is the recursive function:
void text_to_binary(const char* str, char* encoded)
{
char bintemp[9];
bintemp[0] = '\0';
while(*str != '\0')
{
ascii_to_binary(*str, bintemp);
strcat(encoded, bintemp);
str++;
text_to_binary(str, encoded);
}
}
What is going on?
ps. I can't use std::string - I am stuck with the char*.
Edit: This is the junk character in the array:
ÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌ...
You are not initialising the array. Change:
char encoded[512];
to
char encoded[512] = "";
strcat appends to the end of the string, the end is marked by a \0, it then appends a \0 to the new end position.
You should clear the destination encoded with either encoded[0]=0; or memset first.
char encoded[512];.. encoded is not initialized and will contain junk (or 0xCCCCCCCC in debug builds).
I think your problem was due to the initialization of encoded. A few comments on your program:
First, it's better to avoid a recursive function when you can do the job with a loop.
Second, you should pass the size of encoded to avoid a possible overflow (in case the resulting string is bigger than encoded).
void text_to_binary(const char* str, char* encoded)
{
char bintemp[9];
bintemp[0] = '\0';
encoded[0] = '\0'; // start from an empty string so strcat appends at the front
for(const char *i = str; *i != '\0'; i++) // test the character, not the pointer
{
ascii_to_binary(*i, bintemp);
strcat(encoded, bintemp);
}
}
PS: I haven't tested this code, so if there is an error, add a comment and I will correct it.
Good luck with your project.
The solution to your immediate problem has been posted already, but your text_to_binary is still inefficient. You are essentially calling strcat in a loop with always the same string to concatenate to, and strcat needs to iterate through the string to find its end. This makes your algorithm quadratic. What you should do is to keep track of the end of encoded on your own and put the content of bintemp directly there. A better way to write the loop would be
while(*str != '\0')
{
ascii_to_binary(*str, bintemp);
strcpy(encoded, bintemp);
encoded += strlen(bintemp);
str++;
}
You don't need the recursion because you are already looping over str (I believe this to be correct, as your original code will fill encoded pretty weirdly). Also, in the modified version, encoded is always pointing to the end of the original encoded string, so you can just use strcpy instead of strcat.
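Putting the suggestions from these answers together, a size-checked iterative version might look like the sketch below. The question does not show ascii_to_binary, so the implementation here is an assumption (eight binary digits plus a terminator), and the cap parameter is an addition that is not in the original signature.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Assumed behavior of the asker's ascii_to_binary: 8 binary digits plus a terminator.
void ascii_to_binary(char ch, char* out) {
    unsigned char bits = static_cast<unsigned char>(ch);
    for (int i = 7; i >= 0; --i) {
        *out++ = ((bits >> i) & 1) ? '1' : '0';
    }
    *out = '\0';
}

// Iterative version that tracks the write position instead of rescanning with strcat.
void text_to_binary(const char* str, char* encoded, std::size_t cap) {
    assert(cap > 0);
    char* end = encoded;
    *end = '\0';                         // valid empty string even if str is empty
    for (; *str != '\0'; ++str) {
        // Stop when there is no room left for 8 digits plus the terminator.
        if (static_cast<std::size_t>(end - encoded) + 9 > cap) break;
        ascii_to_binary(*str, end);      // writes directly at the current end
        end += 8;
    }
}
```

Because each call writes at the tracked end position, the work is linear in the length of the input, and the buffer does not need to be cleared by the caller beforehand.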
You didn't attach the source of ascii_to_binary. Let's assume that it fills the buffer with a hex dump of the char; if that is the case, it's easier to use sprintf(encoded+(i*2),"%02x",*(str+i));
What's the point of recursively calling text_to_binary? I think this might be a problem.