I have this simple piece of code which reverses a string:
# include <string>
string str = "abcd";
char *ch = new char[str.length()];
for (int i = 0, j = str.length() - 1; i < str.length(); i++, j--)
ch[j] = str[i];
str = string(ch);
cout << str;
This works fine; however, I was wondering whether the char array ch must be zero-terminated (perhaps it only works because, by chance, there happens to be a 0 at memory position ch + str.length()). Therefore I've written the following quick test:
string str = "abcd";
char *ch = new char[str.length()];
for (int i = 0, j = str.length() - 1; i < str.length(); i++, j--)
ch[j] = str[i];
// note: illegal memory access, just a quick test
ch[str.length()] = 'a';
str = string(ch);
cout << str;
In the above code it is ensured that ch is never zero-terminated. To my surprise the code still works OK, and I can't get my head around this. How can str = string(ch) result in "dbca" when there is an 'a' at ch[str.length()]? I would expect either a memory error or "dbcaa" as a result.
It doesn't matter what you did before this line:
str = string(ch);
The reason is that the line above may allocate memory, and the memory manager may have used the memory directly following your ch buffer as allocated space. So the 'a' character you wrote there previously has vanished. Or something else happened during the construction of str that assumed the space you wrote to previously was available.
If you want to know for sure, use your debugger. The std::string constructor and implementation will tell you what exactly occurred (that is, if your program even gets this far since you did introduce undefined behavior before the line of code above).
It's called undefined behaviour. It could be there is a zero after the last address of ch so it could appear to work. But you're overwriting memory allocated from the memory manager which will corrupt it, so you will run into trouble in a bigger application.
The memory manager could reserve a few more bytes in debug builds for debugging purposes. Try a release build and see what happens.
Your code is totally broken, having undefined behaviour. Specifically...
ch[j] = ch[i];
...reads from ch[i] which is uninitialised memory - as bgoldst commented it's probably meant to be str[i], then - even if that didn't invalidate any expectations of program behaviour...
str = string(ch);
...attempts construction from ch, which points to still uninitialised memory that could have any content at all. That memory will be scanned along until a NUL happens to be hit, some access violation crashes the program, or whatever other undefined behaviour manifests. If you fixed the loop to copy from str, then you'd probably want this to cope with the lack of NUL termination:
str = string(ch, str.length());
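As a minimal sketch of that fix, the function below copies from the source string into a temporary buffer and then builds the result with the (pointer, count) constructor, so no terminating NUL is needed. The name reverse_via_buffer and the use of std::vector instead of new[] are choices made for this illustration, not part of the original code.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: reverses `input` through a temporary char buffer,
// as the question does, but without relying on a terminating NUL.
std::string reverse_via_buffer(const std::string& input) {
    const std::size_t n = input.length();
    std::vector<char> buffer(n);           // exactly n chars, no slot for '\0'
    for (std::size_t i = 0; i < n; ++i) {
        buffer[n - 1 - i] = input[i];      // read from the source string
    }
    // The (pointer, count) constructor copies exactly n chars and never
    // scans for a terminator.
    return std::string(buffer.data(), n);
}
```

Calling reverse_via_buffer("abcd") yields "dcba". In everyday code std::reverse on the string itself would probably be simpler.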
Perhaps the vaguely worthwhile question is "isn't it almost impossible that I'd have observed (the claimed) dbca output despite the above errors?". To that I'd say:
dbca is not dcba - which did you actually see?
garbage characters in memory might not do anything on your terminal, and it's quite possible that attempting to print from where ch was allocated printed nothing visible, or e.g. printed some crap then a clear-back-to-the-start-of-line, delete-previous, backspace etc. character code, then happened to hit the memory allocated by the std::string object (seemingly lacking a short-string-optimisation buffer), and therefore displayed its contents too.
So - it's not so statistically amazing to constitute evidence for your program somehow having defined behaviour....
Related
I was looking at a project and came across the following code and am unable to figure out what the sprintf is doing in this context and was hoping someone might be able to help me figure it out.
char storage[64];
int loc = 0;
int size = 35;
sprintf(storage+(loc),"A"); //Don't know what this does
loc+=1;
sprintf(storage+(loc),"%i", size); //Don't know what this does
loc+=4;
sprintf(storage+(loc), "%i", start); //Don't know what this does
start += size;
loc += 3;
The code later does the following in another part
string value;
int actVal;
int index = 0;
for(int j = index+1; j < index+4; j++)
{
value += storage[j];
}
istringstream iss;
iss.str(value);
iss >> actVal; //Don't understand how this now contains size
The examples I have seen online regarding sprintf never covered that the above code was possible, but the program executes fine. I just can't figure out how the "+loc" affects storage in this instance and how the values would be saved/stored. Any help would be appreciated.
Ugly code! Regardless, for the first part, storage+(loc) == &storage[loc]. You end up with a string "A35\0<unknown_value>1234\0", assuming start = 1234, or in long form:
sprintf(&storage[0],"A");
sprintf(&storage[1],"%i", size);
sprintf(&storage[5], "%i", start);
For the second part, index is 0, so the loop runs for j = 1, 2 and 3. Assuming we have the "A35\0<unknown_value>1234\0" above, we get:
value += '3';
value += '5';
value += '\0';
So now value = "35\0": three characters, the last of which is a NUL. The loop stops just short of the <unknown_value> at storage[4]. [1]
iss.str(value);
iss >> actVal;
This turns the string into an input stream and reads out the leading run of characters representing an integer, "35", and converts it into an integer, giving us basically actVal = atoi(value.c_str());.
Finally, according to this page, yes, reading an uninitialised ("indeterminate value" is the official term) array element is undefined behaviour and thus should be avoided.
[1] Note that storage[4] is never written by any of the sprintf calls. The loop above does not reach it, but a loop bound just one larger would read an indeterminate value, which is why the gaps this code leaves in storage shouldn't be ignored.
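To make the layout easier to see, here is a small sketch that rebuilds the buffer with the same offsets (0, 1 and 5) after pre-filling it with '#', so any byte the calls never touch stays visible. The function name build_storage, the '#' filler and the swap from sprintf to the bounds-checked std::snprintf are assumptions made for this illustration.

```cpp
#include <cstdio>
#include <cstring>
#include <string>

// Hypothetical helper: writes into `storage` at the question's offsets
// and returns the first 10 bytes so the layout can be inspected.
std::string build_storage(int size, int start) {
    char storage[64];
    std::memset(storage, '#', sizeof storage);   // mark every byte as "untouched"

    // storage + n is the same address as &storage[n].
    std::snprintf(storage + 0, sizeof storage - 0, "A");          // 'A' then '\0'
    std::snprintf(storage + 1, sizeof storage - 1, "%i", size);   // overwrites that '\0'
    std::snprintf(storage + 5, sizeof storage - 5, "%i", start);  // begins at index 5

    return std::string(storage, 10);             // keep embedded NULs
}
```

With size = 35 and start = 1234 the first ten bytes come out as 'A', '3', '5', '\0', '#', '1', '2', '3', '4', '\0'. The '#' at index 4 is the byte that the original code leaves unwritten.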
The function sprintf() works just like printf(), except that the result is not printed to stdout; instead it is stored in a character buffer. I suggest you read the sprintf() man page carefully:
https://linux.die.net/man/3/sprintf
Even if you are not on Linux, the function behaves much the same across platforms, be it Windows, Mac or other animals. That said, the piece of code you have presented seems to be unnecessarily complicated.
The first part could be written as:
sprintf(storage,"A %i %i", size, start);
For a similar-but-not-equal result, but then again, it all depends on what exactly the original programmer intended this storage area to hold. As Ken pointed out, there are some undefined bytes and behaviors coming from this code as-is.
From the reference documentation for sprintf:
int sprintf ( char * str, const char * format, ... );
Write formatted data to string
Composes a string with the same text that would be printed if format was used on printf, but instead of being printed, the content is stored as a C string in the buffer pointed by str.
sprintf(storage+(loc),"A");
writes "A" into a buffer called storage. The storage+(loc) is pointer arithmetic. You're specifying which index of the char array you're writing into. So, storage = "A".
sprintf(storage+(loc),"%i", size);
Here you're writing size starting at storage[1], since loc = 1. Now storage = "A35\0", and so on.
Your final value of storage = "A35\0<garbage><value of start>\0"
actVal: Don't understand how this now contains size
The for loop goes through storage[1] to storage[3] and builds up value from their contents, so value contains the string "35\0". The <garbage> byte at storage[4] is not reached. iss.str(value) hands that string to the stream unchanged; it is the extraction below that stops at the '\0'.
iss >> actVal
If you have come across std::cin, it's the same concept. The first string containing an integer value is written into actVal.
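A short sketch of that extraction step, assuming a hypothetical helper called parse_leading_int: the stream reads digits until it meets a character that cannot belong to an integer, and an embedded '\0' is such a character.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: extracts the integer at the start of `text`.
// Returns 0 if the text does not begin with an integer.
int parse_leading_int(const std::string& text) {
    std::istringstream iss(text);
    int value = 0;
    iss >> value;   // stops at the first character that is not part of an int
    return value;
}
```

Passing the three-character string std::string("35\0", 3) gives 35, and so does a string with further characters after the NUL.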
I've just finished C++ The Complete Reference and I'm creating a few test classes to learn the language better. The first class I've made mimics the Java StringBuilder class and the method that returns the string is as follows:
char *copy = new char[index];
register int i;
for(i = 0; i <= index; i++) {
*(copy + i) = *(stringArray + i);
} //f
return copy;
stringArray is the array that holds the string that is being built, index represents the amount of characters that have been entered.
When the string returns there is some junk after it, such as if the string created is abcd the result is abcd with 10 random characters after it. Where is this junk coming from? If you need to see more of the code please ask.
You need to null-terminate the string. That null character tells the computer where the string ends.
char * copy = new char[ length + 1];
for(int i = 0; i < length; ++i) copy[i] = stringArray[i];
copy[length] = 0; //null terminate it
Just a few things. Declare the int variable in the tightest scope possible. It is good practice because it keeps the enclosing scope from being populated with unneeded names, and it makes debugging and keeping track of variables easier. Also drop the 'register' keyword and let the compiler determine what needs to be optimized. The keyword is only a hint anyway, so unless your code is really tight on performance, ignore stuff like that for now.
Does index contain the length of the string you're copying from including the terminating null character? If it doesn't then that's your problem right there.
If stringArray isn't null-terminated - which can be fine under some circumstances - you need to ensure that you append the null terminator to the string you return, otherwise you don't have a valid C string and as you already noticed, you get a "bunch of junk characters" after it. That's actually a buffer overflow, so it's not quite as harmless as it seems.
You'll have to amend your code as follows:
char *copy = new char[index + 1];
And after the copy loop, you need to add the following line of code to add the null terminator:
copy[index] = '\0';
In general I would recommend to copy the string out of stringArray using strncpy() instead of hand rolling the loop - in most cases strncpy is optimized by the library vendor for maximum performance. You'll still have to ensure that the resulting string is null terminated, though.
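Putting both changes together, a minimal sketch might look like the following. The name copy_as_c_string and its parameters are made up for this example, and std::memcpy is used here in place of the hand-written loop because the character count is already known.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: copies `count` characters from `source` and appends
// the terminator. The caller owns the result and must delete[] it.
char* copy_as_c_string(const char* source, std::size_t count) {
    char* copy = new char[count + 1];    // one extra slot for '\0'
    std::memcpy(copy, source, count);    // copies exactly `count` chars
    copy[count] = '\0';                  // make it a valid C string
    return copy;
}
```

In the question's terms, source would be stringArray and count would be index. Returning a std::string instead would remove the need for the caller to remember delete[].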
My goal with my constructor is to:
open a file
read into everything that exists between a particular string ("%%%%%")
put together each read row to a variable (history)
add the final variable to a double pointer of type char (_stories)
close the file.
However, the program crashes when I'm using strcat. But I can't understand why, I have tried for many hours without result. :/
Here is the constructor code:
Texthandler::Texthandler(string fileName, int number)
: _fileName(fileName), _number(number)
{
char* history = new char[50];
_stories = new char*[_number + 1]; // rows
for (int j = 0; j < _number + 1; j++)
{
_stories[j] = new char [50];
}
_readBuf = new char[10000];
ifstream file;
int controlIndex = 0, whileIndex = 0, charCounter = 0;
_storieIndex = 0;
file.open("Historier.txt"); // filename
while (file.getline(_readBuf, 10000))
{
// The "%%%%%" shouldnt be added to my variables
if (strcmp(_readBuf, "%%%%%") == 0)
{
controlIndex++;
if (controlIndex < 2)
{
continue;
}
}
if (controlIndex == 1)
{
// Concatenate every line (_readBuf) to a complete history
strcat(history, _readBuf);
whileIndex++;
}
if (controlIndex == 2)
{
strcpy(_stories[_storieIndex], history);
_storieIndex++;
controlIndex = 1;
whileIndex = 0;
// Reset history variable
history = new char[50];
}
}
file.close();
}
I have also tried with stringstream without results..
Edit: Forgot to post the error message:
"Unhandled exception at 0x6b6dd2e9 (msvcr100d.dll) in Step3_1.exe: 0xC00000005: Access violation writing location 0c20202d20."
Then a file named "strcat.asm" opens..
Best regards
Robert
You've had a buffer overflow somewhere on the stack, as evidenced by the fact one of your pointers is 0c20202d20 (a few spaces and a - sign).
It's probably because:
char* history = new char[50];
is not big enough for what you're trying to put in there (or it's otherwise not set up correctly as a C string, terminated with a \0 character).
I'm not entirely certain why you think multiple buffers of up to 10K each can be concatenated into a 50-byte string :-)
strcat operates on null terminated char arrays. In the line
strcat(history, _readBuf);
history is uninitialised so isn't guaranteed to have a null terminator. Your program may read beyond the memory allocated looking for a '\0' byte and will try to copy _readBuf at this point. Writing beyond the memory allocated for history invokes undefined behaviour and a crash is very possible.
Even if you added a null terminator, the history buffer is much shorter than _readBuf. This makes memory over-writes very likely - you need to make history at least as big as _readBuf.
Alternatively, since this is C++, why don't you use std::string instead of C-style char arrays?
Can anyone spot the reason why nothing gets printed to the console by the C++ code below?
string array[] = { "a", "b", "c", "d" };
int length = sizeof(array);
try
{
for (int i = 0; i < length; i++)
{
if (array[i] != "") cout << array[i];
}
}
catch (exception &e)
{
e.what();
}
You use the wrong length:
int length = sizeof(array)/sizeof(array[0]);
The actual reason you don't see anything on the console is that the output is buffered, and since you haven't written a newline it's not flushed. In the meantime your app crashes.
No end of line character.
Also as mentioned by Dave, sizeof is not the length of the array
This answer assumes that string == std::string.
let T be an arbitrary type, and n be an arbitrary positive integer - then:
sizeof(T[n]) == n * sizeof(T)
That is - sizeof(array) is not the length of the array, but the total amount of memory used by the array (in chars). Your std::string implementation could very well be using more than 1 char's worth of memory to store its structure. This leads to length holding a value much greater than 4.
This causes the program to read from past the end of array; an operation for which C++ imposes no requirements (it is Undefined Behaviour).
In terms of the C++ abstract machine, a program containing Undefined Behaviour can do absolutely anything, even before the point in the execution of the program at which the Undefined Behaviour was encountered. In your particular case your program exhibits this behaviour by not printing anything (even though you had made 4 well defined calls to operator<< before the erroneous array indexing).
You have tagged this eclipse-cdt, so I will assume that you are using GCC to compile your program, and are running it under a modern operating system with memory protection. In this case the actual reason for the behaviour that you are seeing is probably that std::cout is buffering the first few strings that you stream into it and so not immediately printing them to the console. After that you get to the buffer overrun and your operating system interrupts the process with an EXC_BAD_ACCESS signal or similar. This causes the immediate termination of your program, which does not give std::cout a chance to flush its buffered values. All up, this means that nothing gets printed.
As mentioned in another answer, you should replace the line:
length = sizeof(array);
with:
length = sizeof(array)/sizeof(array[0]);
This will guarantee that length holds the value 4, rather than the value 4 * sizeof(string), which could be many times the length of the array.
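An alternative sketch that avoids the sizeof arithmetic is to let the compiler deduce the element count from the array type. The helper names array_length and join_all are invented for this example; join_all returns the concatenated elements instead of printing them so the result is easy to check.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: N is deduced from the array's type, so it is always
// the element count, never the byte count.
template <typename T, std::size_t N>
constexpr std::size_t array_length(const T (&)[N]) {
    return N;
}

// Hypothetical helper: concatenates every element of a string array.
template <std::size_t N>
std::string join_all(const std::string (&items)[N]) {
    std::string out;
    for (std::size_t i = 0; i < array_length(items); ++i) {
        out += items[i];   // i never goes past the last element
    }
    return out;
}
```

For the array in the question, array_length(array) is 4 and join_all(array) is "abcd". When printing, streaming std::endl or std::flush afterwards makes sure the text leaves the buffer.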
This code is compiling clean. But when I run this, it gives exception "Access violation writing location" at line 9.
void reverse(char *word)
{
int len = strlen(word);
len = len-1;
char * temp= word;
int i =0;
while (len >=0)
{
word[i] = temp[len]; //line9
++i;--len;
}
word[i] = '\0';
}
Have you stepped through this code in a debugger?
If not, what happens when i (increasing from 0) passes len (decreasing towards 0)?
Note that your two pointers word and temp have the same value - they are pointing to the same string.
Be careful: not all strings in a C++ program are writable. Even if your code is good it can still crash when someone calls it with a string literal.
When len gets to 0, you access the location before the start of the string (temp[0-1]).
Try this:
void reverse(char *word)
{
size_t len = strlen(word);
size_t i;
for (i = 0; i < len / 2; i++)
{
char temp = word[i];
word[i] = word[len - i - 1];
word[len - i - 1] = temp;
}
}
The function looks like it would not crash, but it won't work correctly and it will read from word[-1], which is not likely to cause a crash, but it is a problem. Your crashing problem is probably that you passed in a string literal that the compiler had put into a read-only data segment.
Something like this would crash on many operating systems.
char * word = "test";
reverse(word); // this will crash if "test" isn't in writable memory
There are also several problems with your algorithm. You have len = len-1 and later temp[len-1], which means that the last character will never be read, and when len==0 you will be reading from the character before the start of the word. Also, temp and word are both pointers, so they both point to the same memory; I think you meant to make a copy of word rather than just a copy of the pointer to word. You can make a copy of word with strdup. If you do that, and fix your off-by-one problem with len, then your function should work.
But that still won't fix the write crash, which is caused by code that you have not shown us.
Oh, and if you do use strdup be sure to call free to free temp before you leave the function.
Well, for one, when len == 0 len-1 will be a negative number. And that's pretty illegal. Second, it's quite possible that your pointer is pointing at an unreserved area of memory.
If you called that function as follows:
reverse("this is a test");
then at least one compiler will pass in a read-only string, due to backwards compatibility with C, where you can pass string literals as non-const char*.
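A minimal sketch of a call that avoids the read-only literal: copy the text into a writable char array first, then reverse that. The function name reverse_in_place is made up here, and std::reverse stands in for the hand-written swap loop.

```cpp
#include <algorithm>
#include <cstring>

// Hypothetical helper: reverses a NUL-terminated string in place.
// `word` must point to writable memory, never to a string literal.
void reverse_in_place(char* word) {
    std::reverse(word, word + std::strlen(word));   // the terminator stays put
}
```

Declaring the argument as char word[] = "test"; creates a writable array initialised from the literal, so reverse_in_place(word) leaves it holding "tset".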