C++ char array move null terminator properly? - c++

Hi, my problem is kind of difficult to explain, so I'll just post the relevant code here and explain the problem with an example.
This code has a big array and a small array. The big array is split into small parts; each part is stored in the small array, and the small array's content is printed to the screen. Afterwards I free the allocated memory of the small array and initialize it again with the next part of the big array:
//this code is in a loop that runs until all of the big array has been copied
char* splitArray = new char[50];
strncpy(splitArray, bigArray+startPoint, 50); //startPoint is calculated with every loop run, it marks the next point in the array for copying
//output of splitArray on the screen here
delete splitArray;
//repeat loop here
Now my problem is that the output string always has some random symbols at the end, for example "some_characters_here...last_char_hereRANDOM_CHARS_HERE".
After looking deeper into it, I found out that splitArray actually doesn't have a size of 50 but of 64, with the null terminator at 64.
So when I copy from bigArray into splitArray, there are still 14 random characters left after the real string, and of course I don't want to output them.
A simple solution would be to manually set the null terminator in splitArray at [50], but then the program fails to delete the array again.
Can anybody help me find a solution for this? Preferably with some example code, thanks.

How does the program "fail to delete the array again" if you just set splitArray[49] = 0? Don't forget, an array of length 50 is indexed from 0 through 49. splitArray[50] = 0 is writing to memory outside that allocated for splitArray, with all the consequences that entails.

When you allocate memory for splitArray, the memory is not filled with null characters; you need to do that explicitly. Because of this, your string is not properly null-terminated. You can write char* splitArray = new char[51](); to initialize the buffer with null characters at allocation time (note that I am allocating 51 chars to leave room for the extra null character at the end). Also note that you need delete[] splitArray; and not delete splitArray;.
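As a rough sketch of the allocation described above: the helper name copyChunk and its parameters are made up for illustration and do not come from the question. It assumes src points to a null-terminated string.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper (not from the question): copies up to chunkSize
// characters from src into a zero-initialized heap buffer and returns the
// result as a std::string.
std::string copyChunk(const char* src, std::size_t chunkSize) {
    // The trailing () value-initializes every element to '\0'.
    char* buffer = new char[chunkSize + 1]();
    std::strncpy(buffer, src, chunkSize);  // never writes to buffer[chunkSize]
    std::string result(buffer);            // stops at the first '\0'
    delete[] buffer;                       // array form, matching new[]
    return result;
}
```

Because the buffer has chunkSize + 1 zeroed elements and strncpy writes at most chunkSize of them, the last element always stays '\0'.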

The function strncpy has the disadvantage that it doesn't terminate the destination string if the source string contains 50 or more characters. It seems that is what happens in your case!
If this really is C++, you can do it with std::string splitArray(bigArray+startPoint, 50).

I see a couple of problems with your code:
If you allocate by using new [], you need to free with delete [] (not delete)
Why are you using the free store anyway? From what I can see, you might as well use a local array.
If you want to store 50 characters in an array, you need 51 for the terminating null character.
You wanted some code:
while(/* condition */)
{
// your logic
char splitArray[51];
strncpy(splitArray, bigArray+startPoint, 50);
splitArray[50] = '\0';
// do stuff with splitArray
// no delete
}

Just doing this will be sufficient:
char* splitArray = new char[50 + 1];
strncpy(splitArray, bigArray+startPoint, 50);
splitArray[50] = '\0';
I'd really question why you're doing this anyway though. This is much cleaner:
std::string split(bigArray+startPoint, 50);
it still does the copy, but handles (de)allocation and termination for you. You can get the underlying character pointer like so:
char const *s = split.c_str();
it'll be correctly nul-terminated, and have the same lifetime as the string object (ie, you don't need to free or delete it).
NB. I haven't changed your original code, but losing the magic integer literals would also be a good idea.
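For what it's worth, here is one way the whole loop might look with std::string. This is only a sketch: the function name splitIntoChunks and the chunkSize parameter are hypothetical, and it assumes bigArray is null-terminated. Note that the (pointer, count) constructor copies exactly count characters, so the last piece is clamped to what is actually left.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: splits a null-terminated string into pieces of at
// most chunkSize characters. chunkSize must be greater than zero.
std::vector<std::string> splitIntoChunks(const char* bigArray, std::size_t chunkSize) {
    std::vector<std::string> chunks;
    const std::size_t total = std::strlen(bigArray);
    for (std::size_t startPoint = 0; startPoint < total; startPoint += chunkSize) {
        // Clamp the final piece so the constructor never reads past the end.
        const std::size_t count = std::min(chunkSize, total - startPoint);
        chunks.emplace_back(bigArray + startPoint, count);
    }
    return chunks;
}
```

Each element of the returned vector can be printed directly, and nothing has to be deleted by hand.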

Related

Dynamically allocated C-style string has more characters than given length?

I'm using a dynamic C-style string to read in data from a file, but for some reason when I dynamically allocate the C-style string using the given length, it comes out with four extra characters that can be seen using strlen(). The junk in these empty spaces is added on to the end of the read-in string and is displayed on cout. What on earth could be causing this, and how can I fix it?
The C-style string is declared in the beginning of the code, and is used one time before this. The time it is used before this it is also too large, but in that case it does not add extra information to the end. After use, it is deleted and not used again until this point. I'm pretty confused as I have not had this happen or had a problem with it before.
// Length read as 14, which is correct
iFile.read(reinterpret_cast<char *>(&length), sizeof(int));
tempCstring = new char[length]; // Length still 14
cout << strlen(tempCstring); // Console output: 18
// In tempCstring: Powerful Blockýýýý
iFile.read(reinterpret_cast<char *>(tempCstring), length);
// Custom String class takes in value Powerful Blockýýýý and is
// initialized to that
tempString = String(tempCstring);
// Temp character value takes in messed up string
temp.setSpecial(tempString);
delete[] tempCstring; // Temp cString is deleted for next use
When written to file:
// Length set to the length of the cString, m_special
length = strlen(chars[i].getSpecial().getStr());
// Length written to file. (Should I add 1 for null terminator?)
cFile.write(reinterpret_cast<char *>(&length), sizeof(int));
// String written to file
cFile.write(reinterpret_cast<char *>(chars[i].getSpecial().getStr()), length);
Whenever you see junk at the end of a string, the problem is almost always a missing terminator. Every C-style string ends in a byte whose value is zero, spelled '\0'. If you did not place one yourself, the standard library keeps reading bytes until it happens to find a '\0' somewhere in memory. In other words, the array is read beyond its bounds.
Use memset(tempCstring, 0, length) to zero out the memory right after your allocation. However, this is not the soundest solution, as it sweeps the real problem under the rug. Show us the context in which this code is used; then I will be able to say where in your algorithm you need to insert the null terminator: tempCstring[i] = 0, or something like that. Nonetheless, from what you have posted, I can tell that you need to allocate one more character to make room for the terminator.
Also, since you are using C++, why not use std::string? It avoids these kinds of problems.

C++ Garbage at the end of file

I have a problem and I don't know how to solve it.
The issue is:
char* ary = nullptr;
ifstream fle;
fle.open("1.txt", ios_base::binary);
fle.seekg(0, fle.end);
long count = fle.tellg();
fle.seekg(0, fle.beg);
here is the problem:
File 1.txt contains: Hello world!.
when I execute:
ary = new char(count);
fle.read(ary, count);
ary is filled like this: Hello world! #T#^#$#FF(garbage)
The file is okay; there is nothing in it except what is shown above.
Platform: Win 7, VS 2012
Any idea how to solve this issue? (Solved)
(Problem 2)
Now I am facing another problem: fle.read sometimes seems to read more than the size I gave. For example, if I call fle.read(buffer, 1000), in some cases I end up with strlen(buffer) == 1500. How can I solve this?
Regards,
char[] strings in C are usually null-terminated: the array is one byte longer than the text it holds, and that last byte is set to 0x00. This is necessary because a bare char pointer carries no information about the length of the array it points to.
When you read binary data from a file, no terminating null-character is read into the string. That means a function like printf which operates on char-arrays of unknown length will output the array and any data which happens to come after it in memory until it encounters a null-character.
Solution: allocate the char[]-buffer one byte longer than the length of the data and set the last byte to 0 manually.
Better solution: Don't use C-style char-arrays. Do it the object-oriented way and use the class std::string to represent strings.
I think your problem is not that your array contains garbage, but that you forgot to put the null-terminator character at the end and your print statement doesn't know when to stop.
Also, you wrote new char(count) instead of new char[count]. In the first case, you only instantiate one char with value count while in the second case you create a buffer of count characters.
Try this:
ary = new char[count+1];
fle.read(ary, count);
ary[count] = '\0';
Most of the other answers miss a very important point:
When you do ary = new char(count); you allocate A SINGLE CHARACTER initialized with a symbol with ASCII code count.
You should write this: ary = new char[count + 1];
Well, the most obvious problem is that you are allocating using
new char(count), which allocates a single char, initialized
with count. What you were probably trying to do would be new
char[count]. What you really need is:
std::vector<char> arr( count );
fle.read( &arr[0], count );
Or maybe count + 1 in the allocation, if you want a trailing
'\0' in the buffer.
EDIT:
Since you're still having problems: fle.read will never read
more than requested. What does fle.gcount() return after the
read?
If you do:
std::vector<char> arr( count );
fle.read( &arr[0], count );
arr.resize( fle.gcount() );
you should have a vector with exactly the number of char that
you have read. If you want them as a string, you can construct
one from arr.begin(), arr.end(), or even use std::string
instead of std::vector<char> to begin with.
If you need a '\0' terminated string (for interface with
legacy software), then just create your vector with a size of
count + 1, instead of count, and &arr[0] will be your
'\0' string.
Do not try to use new char[count] here. It's very difficult
to do so correctly. (For example, it will require a try block
and a catch.)
We have to guess a little here, but most likely this comes down to an issue with your debugging. The buffer is filled correctly, but you inspect its contents incorrectly.
Now, ary is declared as char* and I suspect that when you attempt to inspect the contents of ary you use some printing method that expects a null-terminated array. But you did not null-terminate the array. And so you have a buffer overrun.
If you had only printed count characters, then you would not have overrun. Nor would you if you had null-terminated the array, not forgetting to allocate an extra character for the null terminator.
Instead of using raw arrays and new, it would make much more sense to read the buffer into std::string. You should be trying to avoid null-terminated strings as much as possible. You use those when performing interop with non-C++ libraries.
You're reading count characters from a file, so you have to allocate one extra character to provide for the string terminator (\0).
ary = new char[count + 1];
ary[count] = '\0';
Try this
ary = new char[count + 1];
fle.read(ary,count);
ary[count] = '\0';
The terminating null character was missing. It's not in the file, so you have to add it afterwards.

Char pointer giving me some really strange characters

When I run the example code, the wordLength is 7 (hence the output 7). But my char array gets some really weird characters in the end of it.
wordLength = word.length();
cout << wordLength;
char * wordchar = new char[wordLength]; //new char[7]; ??
for (int i = 0; i < word.length(); i++) //0-6 = 7
{
wordchar[i] = 'a';
}
cout << wordchar;
The output: 7 aaaaaaa²²²²¦¦¦¦¦ÂD╩2¦♀
Desired output is: aaaaaaa... What is the garbage behind it?? And how did it end up there?
You should add \0 at the end of wordchar.
char * wordchar = new char[wordLength +1];
//add chars as you have done
wordchar[wordLength] = '\0';
The reason is that C-strings are null terminated.
C strings are terminated with a '\0' character that marks their end (in contrast, C++ std::string just stores the length separately).
In copying the characters to wordchar you didn't terminate the string, thus, when operator<< outputs wordchar, it goes on until it finds the first \0 character that happens to be after the memory location pointed to by wordchar, and in the process it prints all the garbage values that happen to be in memory in between.
To fix the problem, you should:
make the allocated string 1 char longer;
add the \0 character at the end.
Still, in C++ you'll normally just want to use std::string.
Use:
char * wordchar = new char[wordLength+1]; // 1 extra for null character
before the for loop, and
wordchar[wordLength] = '\0';
after the for loop. C strings are null terminated.
Without this, the output keeps going until it finds the first null character, printing all the garbage values along the way.
You omitted the trailing zero; that's the cause.
In C and C++, the whole ecosystem determines string length by assuming a trailing zero ('\0', or simply 0 numerically). This is different from, for example, Pascal strings, where the memory representation starts with a number that tells how many of the following characters make up the string.
So if you have string content that you want to store, you have to allocate one additional byte for the trailing zero. If you manipulate the memory, you always have to keep the trailing zero in mind and preserve it. Otherwise strstr and other string functions run past the end of the buffer and keep working on whatever memory follows, and functions that write, such as strcat, can corrupt it. Without a trailing zero, strlen also gives a wrong result, because it counts until it encounters the first zero.
You are not the only one making this mistake; it often plays an important role in security vulnerabilities and their exploits. An exploit takes advantage of the side effect that functions run off the end and manipulate things other than what was originally intended. This is a very important and dangerous part of C.
In C++ (as you tagged your question) you are better off using std::string and the standard library's facilities instead of C-style manipulation.
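To make the comparison concrete, here is a small sketch of both approaches side by side. The function names maskWord and maskWordManually are invented for this example; the loop body mirrors the one in the question.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: builds a string of word.length() copies of 'a' using
// the fill constructor, so no terminator handling is needed.
std::string maskWord(const std::string& word) {
    return std::string(word.length(), 'a');
}

// The same result with a manual buffer, for comparison.
std::string maskWordManually(const std::string& word) {
    const std::size_t wordLength = word.length();
    char* wordchar = new char[wordLength + 1];  // +1 for the terminator
    for (std::size_t i = 0; i < wordLength; ++i) {
        wordchar[i] = 'a';
    }
    wordchar[wordLength] = '\0';  // without this, readers run past the buffer
    std::string result(wordchar);
    delete[] wordchar;
    return result;
}
```

Both return the same text; the first simply leaves the bookkeeping to the library.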

C++ delete[] crashes

I'm writing a C++ program, that stores strings in a string array, when the array is full I resize the array to make space for more items using the code below. But sometimes (not always) it crashes at the "delete[] temp;" line and I don't know why and how to fix it. Please, help.
I have searched a lot but could not find an answer anywhere. When I debug it says "invalid pointer" but how can it be invalid when I stored data there before and did not free it yet?
This is my code:
if(item_cnt >= (arr_size - 1))
{
int oldsize = arr_size;
string * temp;
arr_size *= 2;
temp = arr;
arr = new string [arr_size];
memcpy(arr, temp, oldsize * sizeof(temp));
delete[] temp;
}
Unless you absolutely have to stick with your current approach, I would recommend using a vector to hold your strings. It will manage all the memory for you.
Here's an example:
#include <vector>
#include <string>
int main()
{
std::vector<std::string> arrayOfStrings;
arrayOfStrings.push_back("Hello World!"); // To Add Items
std::string value = arrayOfStrings.at(0); // To retrieve an item; operator[] also works, but at() checks bounds
return 0;
}
The memcpy is at the root of your problem. Everyone's said "don't use it", but let me explain exactly why it's a terminally bad idea.
First off, what is a C++ string, and how does it do its magic? It's basically a variable-length array of characters, and it achieves this feat by holding a pointer within each string object that points to the memory allocated to hold those characters. As the string grows or shrinks, that memory gets reallocated. Copying strings properly involves making a 'deep copy' of the contents.
Now, to your code:
arr = new string [arr_size];
This creates an array of empty string objects. Because they're empty, the internal pointers are typically null.
memcpy(arr, temp, oldsize * sizeof(temp));
Here, bad things happen. This isn't actually creating copies of the original strings; it's just overwriting the internal representation, so both the old and the new strings 'point' to the same character data. (The byte count is also wrong: temp is a pointer, so sizeof(temp) is the size of a pointer rather than sizeof(string).) Now, to really screw things up, this happens:
delete[] temp;
We delete the old strings, but this also frees the character memory that they were using. So our 'new' copies of these strings are pointing at memory that has actually been freed. We now have a car crash waiting to happen: the character data could be reused for anything, and when we try to delete the strings again, the runtime will hopefully spot that you're trying to free memory that isn't allocated.
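If the manual array really has to stay, the resize step could be sketched roughly as follows, with an element-wise copy in place of memcpy. The function name growArray is hypothetical, and the sketch is not exception-safe (if a copy throws, the new array leaks).

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical helper: allocates an array twice as large, deep-copies the
// existing strings into it, releases the old array, and returns the new one.
std::string* growArray(std::string* arr, std::size_t oldSize) {
    std::string* bigger = new std::string[oldSize * 2];
    // std::copy assigns element by element, so each new string gets its own
    // character data instead of sharing an internal pointer with the old one.
    std::copy(arr, arr + oldSize, bigger);
    delete[] arr;  // destroys the old strings; the copies are unaffected
    return bigger;
}
```

A caller would write something like arr = growArray(arr, oldsize); after which the old pointer must not be used again.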
Your array should be really
vector<string>
This is a recommended way of implementing arrays of dynamic size. By using vector you avoid necessity to reallocate/copy stuff manually and avoid problems like the one you have altogether.
Mixing old-style and new-style memory operations is always a bad idea. Here you use memcpy alongside new/delete. Be aware that delete[] also calls the destructor of each element of the array.
hth
Mario

Deleting dynamic array of char in C++

I have this class, with the attribute 'word':
class Node {
char *word;
Inside the Node constructor, I do this assignment:
word = new char[strlen(someword)];
In the destructor of the Node class, I try to delete the contents pointed by word:
delete []word;
I get the following message after running the program:
"Heap block at 003E4F48 modified at 003E4F51 past requested size of 1"
What am I not doing well?
You have a buffer overflow in your program, somewhere else in code you didn't post. The problem is that you're not allocating enough memory -- you don't leave room for the null terminator at the end of your string. You should change the allocation to this:
word = new char[strlen(someword) + 1]; // +1 for null terminator
...
strcpy(word, someword);
You should be thankful your C runtime caught your error. In most cases, a one byte buffer overflow will result in silent memory corruption and not be detected until much later, if ever.
You should also consider using the std::string class, which automatically manages the memory for you, so you don't have to deal with subtle issues like this.
strlen will return the length of the string but will not account for the null-terminating extra byte. My guess is that you are then copying in a string and appending a null-byte but you did not account for it when you originally allocated the memory. Try changing the code to read:
word = new char[strlen(someword) + 1];
You have a corrupt heap: somewhere else in your code you are writing outside the allocated memory or deleting something you shouldn't be. Are you sure you don't mean strlen(someword) + 1? The best solution to this problem is to use a std::string or a std::vector rather than a dynamic array.
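A std::string version of the class might look something like the sketch below. Only the member word comes from the question; the constructor signature and the getWord accessor are assumptions made for the example, and someword must not be a null pointer.

```cpp
#include <string>

// Hypothetical rewrite of the Node class from the question.
class Node {
public:
    // Copies the characters of someword; std::string sizes its own storage.
    explicit Node(const char* someword) : word(someword) {}

    const std::string& getWord() const { return word; }

private:
    std::string word;  // owns its memory, so no destructor or delete[] needed
};
```

With this layout the compiler-generated destructor, copy constructor, and assignment operator all do the right thing, which a raw char* member would not get for free.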