Appending character arrays using strcat does not work - c++

Can someone tell me what's wrong with this code?
char sms[] = "gr8";
strcat (sms, " & :)");

sms is an array of size 4 [1]. You're appending more characters, which writes past the end of the array: it can hold at most 4 chars, and those are already occupied by g, r, 8, \0.
[1] Why exactly 4? Because there is a null character at the end of the string literal.
If you mention the size of array as shown below, then your code is valid and well-defined.
char sms[10] = "gr8"; //ensure that size of the array is 10
//so it can be appended few chars later.
strcat (sms, " & :)");
But then C++ provides you better solution: use std::string as:
#include <string> //must
std::string sms = "gr8";
sms += " & :)"; //string concatenation - easy and cute!

Yes, there is no room for the extra characters. sms[] only allocates enough space to store the string that it is initialized with.
Using C++, a much better solution is:
std::string sms = "gr8";
sms += " & :)";

You're copying data into unallocated memory.
When you do this: char sms[] = "gr8"; you create a char array with 4 characters, "gr8" plus the 0 character at the end of the string.
Then you try to copy extra characters to the array with the strcat call, beyond the end of the array. This leads to undefined behaviour, which means something unpredictable will happen (the program might crash, or you might see weird output).
To fix this, make sure that the array that you are copying the characters to is large enough to contain all the characters, and don't forget the 0 character at the end.
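The sizing rule above can be sketched as a small check before the call to strcat. This is only an illustration: append_checked is a hypothetical helper name, not a standard function, and it assumes dest already holds a valid null-terminated string.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: appends src to the C string in dest only if the
// result, including the terminating '\0', fits in dest_size bytes.
// Returns true on success and leaves dest untouched on failure.
bool append_checked(char* dest, std::size_t dest_size, const char* src) {
    const std::size_t used = std::strlen(dest);   // characters already stored
    const std::size_t extra = std::strlen(src);   // characters to append
    if (used + extra + 1 > dest_size) {
        return false;  // not enough room: strcat would write out of bounds
    }
    std::strcat(dest, src);
    return true;
}
```

With char sms[] = "gr8"; the call append_checked(sms, sizeof sms, " & :)") returns false, because 3 + 5 + 1 = 9 bytes are needed and only 4 are available. With char sms[9] = "gr8"; the same call succeeds.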

In C, arrays don't automatically grow.
sms has a specific length (4, in this case: three letters and the terminating null character). When you call strcat, you are trying to append characters to that array past its length.
This is undefined behavior, and will break your program.
If instead you had allocated an array with a large enough size to contain both strings, you would be okay:
char sms[9] = "gr8";
strcat (sms, " & :)");
C++ has (basically) the same restrictions on arrays that C does. However, it provides higher-level facilities, such as std::string, so that you don't have to deal with arrays most of the time:
#include <string>
// ...
std::string sms = "gr8";
sms += " & :)";
The reason this is nicer is that you don't have to know ahead of time exactly how long your string will be. C++ will grow the underlying storage in memory for you.
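As a rough illustration of not needing to know the final length in advance, the sketch below appends a piece of text an arbitrary number of times. The function name repeat_append is made up for this example.

```cpp
#include <string>

// Hypothetical helper: appends `piece` to an initially empty string
// `times` times. No size is declared anywhere; std::string enlarges its
// own storage whenever the appended text does not fit.
std::string repeat_append(const std::string& piece, int times) {
    std::string out;
    for (int i = 0; i < times; ++i) {
        out += piece;
    }
    return out;
}
```

The result's size() is simply the sum of the appended lengths, and the caller never has to compute it beforehand.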

Buffer overflow for character array followed by crash somewhere!

Your sms buffer is only 4 characters long. strcat will write 5 more characters plus a terminating '\0' past the end of it and corrupt the stack.

Related

Dynamically allocated C-style string has more characters than given length?

I'm using a dynamic C-style string to read in data from a file, but for some reason when I dynamically allocate the C-style string using the given length, it comes out with four extra characters that can be seen using strlen(). The junk in these empty spaces is added on to the end of the read-in string and is displayed on cout. What on earth could be causing this, and how can I fix it?
The C-style string is declared in the beginning of the code, and is used one time before this. The time it is used before this it is also too large, but in that case it does not add extra information to the end. After use, it is deleted and not used again until this point. I'm pretty confused as I have not had this happen or had a problem with it before.
// Length read as 14, which is correct
iFile.read(reinterpret_cast<char *>(&length), sizeof(int));
tempCstring = new char[length]; // Length still 14
cout << strlen(tempCstring); // Console output: 18
// In tempCstring: Powerful Blockýýýý
iFile.read(reinterpret_cast<char *>(tempCstring), length);
// Custom String class takes in value Powerful Blockýýýý and is
// initialized to that
tempString = String(tempCstring);
// Temp character value takes in messed up string
temp.setSpecial(tempString);
delete[] tempCstring; // Temp cString is deleted for next use
When written to file:
// Length set to the length of the cString, m_special
length = strlen(chars[i].getSpecial().getStr());
// Length written to file. (Should I add 1 for null terminator?)
cFile.write(reinterpret_cast<char *>(&length), sizeof(int));
// String written to file
cFile.write(reinterpret_cast<char *>(chars[i].getSpecial().getStr()), length);
Whenever you see junk at the end of a string, the problem is almost always the lack of a terminator. Every C-style string ends in a byte whose value is zero, spelled '\0'. If you did not place one yourself, the standard library keeps reading bytes until it happens to find a '\0' somewhere in memory. In other words, the array is read beyond its bounds.
Use memset(tempCstring,0,length) to zero out the memory right after your allocation. However, this is not the soundest solution, as it sweeps the real problem under the rug. Show us the context in which this code is used. Then I will be able to say where in your algorithm you need to insert the null terminator: tempCstring[i] = 0, or something like that. Nonetheless, from what you have posted, I can tell that you need to allocate one more character to make room for the terminator.
Also, since you are using C++, why not use std::string? It avoids these kinds of problems.

C++ Garbage at the end of file

I have a problem and I don't know how to solve it.
The issue is:
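char * ary = nullptr;                  // allocated below, once the size is known
ifstream fle;
fle.open("1.txt", ios_base::binary);
fle.seekg(0, fle.end);                 // jump to the end to measure the file
long count = fle.tellg();
fle.seekg(0, fle.beg);                 // rewind before reading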
here is the problem:
File 1.txt contains: Hello world!.
when I execute:
ary = new char(count);
fle.read(ary, count);
the ary filled like this: Hello world! #T#^#$#FF(garbage)
The file is okay; it contains nothing except what is shown above.
Platform: Win 7, VS 2012
Any idea how to solve this issue? (Solved)
(Problem 2)
Now I am facing another problem: fle.read sometimes seems to read more than the size I gave. For example, if I call fle.read(buffer, 1000), in some cases I end up with strlen(buffer) == 1500. How can I solve this?
Regards,
char[]-strings in C are usually null-terminated. The buffer is one byte longer than the text it holds, and the last byte is set to 0x00. That's necessary because a pointer to a char array carries no information about the array's length.
When you read binary data from a file, no terminating null-character is read into the string. That means a function like printf which operates on char-arrays of unknown length will output the array and any data which happens to come after it in memory until it encounters a null-character.
Solution: allocate the char[]-buffer one byte longer than the length of the data and set the last byte to 0 manually.
Better solution: Don't use C-style char-arrays. Do it the object-oriented way and use the class std::string to represent strings.
I think your problem is not that your array contains garbage, but that you forgot to put the null-terminator character at the end and your print statement doesn't know when to stop.
Also, you wrote new char(count) instead of new char[count]. In the first case, you only instantiate one char with value count while in the second case you create a buffer of count characters.
Try this:
ary = new char[count+1];
fle.read(ary, count);
ary[count] = '\0';
Most of the other answers miss a very important point:
When you do ary = new char(count); you allocate A SINGLE CHARACTER initialized with a symbol with ASCII code count.
You should write this: ary = new char[count + 1];
Well, the most obvious problem is that you are allocating using
new char(count), which allocates a single char, initialized
with count. What you were probably trying to do would be new
char[count]. What you really need is:
std::vector<char> arr( count );
fle.read( &arr[0], count );
Or maybe count + 1 in the allocation, if you want a trailing
'\0' in the buffer.
EDIT:
Since you're still having problems: fle.read will never read
more than requested. What does fle.gcount() return after the
read?
If you do:
std::vector<char> arr( count );
fle.read( &arr[0], count );
arr.resize( fle.gcount() );
you should have a vector with exactly the number of char that
you have read. If you want them as a string, you can construct
one from arr.begin(), arr.end(), or even use std::string
instead of std::vector<char> to begin with.
If you need a '\0' terminated string (for interface with
legacy software), then just create your vector with a size of
count + 1, instead of count, and &arr[0] will be your
'\0' string.
Do not try to use new char[count] here. It's very difficult
to do so correctly. (For example, it will require a try block
and a catch.)
We have to guess a little here, but most likely this comes down to an issue with your debugging. The buffer is filled correctly, but you inspect its contents incorrectly.
Now, ary is declared as char* and I suspect that when you attempt to inspect the contents of ary you use some printing method that expects a null-terminated array. But you did not null-terminate the array. And so you have a buffer overrun.
If you had only printed count characters, then you would not have overrun. Nor would you if you had null-terminated the array, not forgetting to allocate an extra character for the null terminator.
Instead of using raw arrays and new, it would make much more sense to read the buffer into std::string. You should be trying to avoid null-terminated strings as much as possible. You use those when performing interop with non-C++ libraries.
You're reading count characters from a file, so you have to allocate one extra character for the string terminator (\0).
ary = new char[count + 1];
ary[count] = '\0';
Try this
ary = new char[count + 1];
fle.read(ary,count);
ary[count] = '\0';
The terminating null character was missing: it's not in the file, so you have to add it afterwards.

Char pointer giving me some really strange characters

When I run the example code, the wordLength is 7 (hence the output 7). But my char array gets some really weird characters in the end of it.
wordLength = word.length();
cout << wordLength;
char * wordchar = new char[wordLength]; //new char[7]; ??
for (int i = 0; i < word.length(); i++) //0-6 = 7
{
wordchar[i] = 'a';
}
cout << wordchar;
The output: 7 aaaaaaa²²²²¦¦¦¦¦ÂD╩2¦♀
Desired output is: aaaaaaa... What is the garbage behind it?? And how did it end up there?
You should add \0 at the end of wordchar.
char * wordchar = new char[wordLength +1];
//add chars as you have done
wordchar[wordLength] = '\0';
The reason is that C-strings are null terminated.
C strings are terminated with a '\0' character that marks their end (in contrast, C++ std::string just stores the length separately).
In copying the characters to wordchar you didn't terminate the string, thus, when operator<< outputs wordchar, it goes on until it finds the first \0 character that happens to be after the memory location pointed to by wordchar, and in the process it prints all the garbage values that happen to be in memory in between.
To fix the problem, you should:
make the allocated string 1 char longer;
add the \0 character at the end.
Still, in C++ you'll normally just want to use std::string.
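For comparison, a sketch of the same task with std::string. The function name mask_word is invented for this example; it assumes the goal is a string of 'a' characters with the same length as word.

```cpp
#include <string>

// Hypothetical helper: returns word.length() copies of 'a'. The
// std::string stores its length itself, so there is no terminator to
// forget and nothing to delete[] afterwards.
std::string mask_word(const std::string& word) {
    return std::string(word.length(), 'a');
}
```

Passing the result to cout prints only the characters that belong to the string.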
Use:
char * wordchar = new char[wordLength+1]; // 1 extra for null character
before the for loop and
wordchar[wordLength] = '\0';
after the for loop, because C strings are null-terminated.
Without this, printing continues until it finds the first null character, emitting all the garbage values on the way.
You omit the trailing zero; that's the cause.
In C and C++, the whole ecosystem determines string length by assuming a trailing zero ('\0', or simply 0 numerically). This is different from, for example, Pascal strings, where the memory representation starts with a number that tells how many of the following characters make up the string.
So if you have string content that you want to store, you have to allocate one additional byte for the trailing zero. If you manipulate the memory content, you always have to keep the trailing zero in mind and preserve it. Otherwise strstr and other string functions run off the end and keep working on the memory that follows, and the ones that write can corrupt it. Without a trailing zero, strlen also gives a false result, because it too counts until it encounters the first zero.
You are not the only one making this mistake; it often plays an important role in security vulnerabilities and their exploits. The exploit takes advantage of the side effect that functions run off the end and manipulate things other than what was originally intended. This is a very important and dangerous part of C.
In C++ (as you tagged your question) you'd better use std::string and the standard library's facilities instead of C-style manipulation.

Confusion about zero-terminating character

I've always had a question about null-terminated strings in C++/C. For example, if you have a character array like so:
char a[10];
And then you wanted to read in characters like so:
for(int i = 0; i < 10; i++)
{
cin >> a[i];
}
And let's say you enter the following word as the input: questioner
Now my question is what happens to the '\0'? If I were to reverse the string, and make it print out
renoitseuq
Where does the null-terminating character go? I thought that good programming practice was to always leave one extra character for the zero-terminating character. But in this example, everything was printed correctly, so why care about the null-terminating character? Just curious. Thanks for your thoughts!
There are cases where you're given a null-terminator, and cases where you have to ask for one yourself.
const char* x = "bla";
is a null-terminated C-style string. It actually has 4 characters - the 3 + the null terminator.
Your string isn't null-terminated. In fact, treating it as a null-terminated string leads to undefined behavior. If you were to cout << it, you'd be attempting to read beyond the memory you're allowed to access, because the runtime will keep looking for a null-terminator and spit out characters until it reaches one. In your case, you were lucky there was one right at the end, but that's not a guarantee.
char a[10]; is just like any other array - un-initialized values, 10 characters - not 11 just because it's a char array. You wouldn't expect int b[10] to contain 10 values for you to play with and an extra 0 at the end just because, would you?
Well, reading that back, I don't see why you'd expect that from a C-string as well - it's not all intuitive.
You are reading 10 chars, not a string. I assume that you also output 10 chars in reverse, so the 0-char plays no role, because you don't use the array as a string but as an array of single chars.
char a[10] is ten characters, any of which can be a '\0'.
If you put "questioner" in there none of them are.
To get that you'd need a[11] and fill it with "questioner" and then '\0'.
If you were reversing it, you'd get the position of the first '\0' in a[?], reverse up to that and then add a null terminator.
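If the array is treated as exactly ten characters rather than as a C string, no terminator is involved at all. The sketch below passes the length explicitly; the name reversed_copy is made up for this example.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical helper: copies exactly n characters from data and
// returns them reversed. data does not need to be null-terminated,
// because the length is passed explicitly.
std::string reversed_copy(const char* data, std::size_t n) {
    std::string out(data, n);
    std::reverse(out.begin(), out.end());
    return out;
}
```

Called as reversed_copy(a, sizeof a) on the char a[10] from the question, it never looks at memory beyond the tenth character.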
This is a classic banana skin in C, unfortunately it still manages to get under your foot at the most inopportune of moments, even if you are all too familiar with it.

String going crazy if I don't give it a little extra room. Can anyone explain what is happening here?

First, I'd like to say that I'm new to C / C++, I'm originally a PHP developer so I am bred to abuse variables any way I like 'em.
C is a strict country, compilers don't like me here very much, I am used to breaking the rules to get things done.
Anyway, this is my simple piece of code:
char IP[15] = "192.168.2.1";
char separator[2] = "||";
puts( separator );
Output:
||192.168.2.1
But if I change the definition of separator to:
char separator[3] = "||";
I get the desired output:
||
So why did I need to give the man extra space, so he doesn't sleep with the man before him?
That's because you get a not null-terminated string when separator length is forced to 2.
Always remember to allocate an extra character for the null terminator. For a string of length N you need N+1 characters.
Once you violate this requirement any code that expects null-terminated strings (puts() function included) will run into undefined behavior.
Your best bet is to not force any specific length:
char separator[] = "||";
will allocate an array of exactly the right size.
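The deduced size can be checked at compile time. A small sketch (the static_assert is only there to make the point visible):

```cpp
#include <cstring>

// Size deduced by the compiler from the initializer.
const char separator[] = "||";

// The array holds the two visible characters plus the terminating '\0'.
static_assert(sizeof(separator) == 3, "two characters plus the terminator");
```

Here sizeof(separator) counts the terminator, while std::strlen(separator) counts only the characters before it.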
Strings in C are NUL-terminated. This means that a string of two characters requires three bytes (two for the characters and the third for the zero byte that denotes the end of the string).
In your example it is possible to omit the size of the array and the compiler will allocate the correct amount of storage:
char IP[] = "192.168.2.1";
char separator[] = "||";
Lastly, if you are coding in C++ rather than C, you're better off using std::string.
If you're using C++ anyway, I'd recommend using the std::string class instead of C strings - much easier and less error-prone IMHO, especially for people with a scripting language background.
There is a hidden nul character '\0' at the end of each string. You have to leave space for that.
If you do
char separator[] = "||";
you will get an array of size 3, not size 2.
Because in C strings are nul terminated (their end is marked with a 0 byte). If you declare separator to be an array of two characters, and give them both non-zero values, then there is no terminator! Therefore when you puts the array pretty much anything could be tacked on the end (whatever happens to sit in memory past the end of the array - in this case, it appears that it's the IP array).
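Edit: the following paragraph is incorrect. When an array is initialized from a shorter string literal, the remaining elements are zero-initialized, so the terminator in separator[3] is guaranteed rather than accidental.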
When you make the array length 3, the extra byte happens to have 0 in it, which terminates the string. However, you probably can't rely on that behavior - if the value is uninitialized it could really contain anything.
In C strings are ended with a special '\0' character, so your separator literal "||" is actually one character longer. puts function just prints every character until it encounters '\0' - in your case one after the IP string.
In C, strings include a (invisible) null byte at the end. You need to account for that null byte.
char ip[15] = "1.2.3.4";
In the code above, ip has enough space for 15 characters: 14 "regular characters" and the null byte. That is too short for the longest dotted-quad address, such as "255.255.255.255" (15 characters), so it should be char ip[16] = "1.2.3.4";
ip[0] == '1';
ip[1] == '.';
/* ... */
ip[6] == '4';
ip[7] == '\0';
Since no one pointed it out so far: If you declare your variable like this, the strings will be automagically null-terminated, and you don't have to mess around with the array sizes:
const char* IP = "192.168.2.1";
const char* separator = "||";
Note however, that I assume you don't intend to change these strings.
But as already mentioned, the safe way in C++ would be using the std::string class.
A C "string" always ends in a null character, but you leave no room for it if you write
char separator[2] = "||". puts expects this \0 at the end; in the first case it writes until it finds a \0, and here you can see that it is found at the end of the IP address. Interestingly enough, you can even see how the local variables are laid out on the stack.
The line char separator[2] = "||"; is the problem: the string literal, including the null at the end, needs 3 characters, so the terminator does not fit in the array, and anything that treats the array as a string has undefined behaviour.
Also, what compiler have you compiled the above code with? I compiled with g++ and it flagged the above line as an error.
Strings in C/C++ are null-terminated, i.e. they have a hidden zero at the end.
So your separator string would be:
{'|', '|', '\0'} = "||"