Please suggest what is wrong with this string reversal function? - c++

This code compiles cleanly, but when I run it I get the exception "Access violation writing location" at line 9.
void reverse(char *word)
{
int len = strlen(word);
len = len-1;
char * temp= word;
int i =0;
while (len >=0)
{
word[i] = temp[len]; //line9
++i;--len;
}
word[i] = '\0';
}

Have you stepped through this code in a debugger?
If not, what happens when i (increasing from 0) passes len (decreasing towards 0)?
Note that your two pointers word and temp have the same value - they are pointing to the same string.
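To see what the aliasing does in practice, here is a minimal sketch that runs the question's loop on a writable buffer. The function name reverse_via_alias and the use of std::string as the buffer are my own assumptions for illustration, not part of the original code.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Runs the loop from the question on a writable copy of the input.
// temp is only a second pointer to the same characters, so once i passes
// the middle the loop reads characters it has already overwritten.
std::string reverse_via_alias(std::string text)
{
    char *word = &text[0];   // writable storage owned by the std::string
    char *temp = word;       // same address: an alias, not a copy
    assert(temp == word);

    int len = static_cast<int>(std::strlen(word)) - 1;
    int i = 0;
    while (len >= 0)
    {
        word[i] = temp[len];
        ++i;
        --len;
    }
    return text;             // "abcd" comes back as "dccd", not "dcba"
}
```

So even with writable memory the result is a mirrored first half rather than a reversed string.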

Be careful: not all strings in a C++ program are writable. Even if your code is good it can still crash when someone calls it with a string literal.

When len gets to 0, you access the location before the start of the string (temp[0-1]).

Try this:
void reverse(char *word)
{
    size_t len = strlen(word);
    size_t i;
    for (i = 0; i < len / 2; i++)
    {
        char temp = word[i];
        word[i] = word[len - i - 1];
        word[len - i - 1] = temp;
    }
}

The function looks like it would not crash, but it won't work correctly and it will read from word[-1], which is not likely to cause a crash, but it is a problem. Your crashing problem is probably that you passed in a string literal that the compiler had put into a read-only data segment.
Something like this would crash on many operating systems.
char * word = "test";
reverse(word); // this will crash if "test" isn't in writable memory
There are also several problems with your algorithm. You have len = len-1 and later temp[len-1], which means that the last character will never be read, and when len==0 you will be reading from the character before the start of the word. Also, temp and word are both pointers, so they both point to the same memory. I think you meant to make a copy of word rather than just a copy of the pointer to word. You can make a copy of word with strdup. If you do that, and fix your off-by-one problem with len, then your function should work.
But that still won't fix the write crash, which is caused by code that you have not shown us.
Oh, and if you do use strdup be sure to call free to free temp before you leave the function.
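A rough sketch of that copy-based approach follows. strdup is POSIX rather than standard C++, so this version uses std::malloc and std::memcpy to the same effect; the name reverse_with_copy is made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Reverses word by reading from a private copy, so reads never see
// characters that have already been overwritten.
void reverse_with_copy(char *word)
{
    assert(word != nullptr);
    const std::size_t len = std::strlen(word);

    // Equivalent of strdup(word): allocate len + 1 bytes and copy the '\0' too.
    char *temp = static_cast<char *>(std::malloc(len + 1));
    if (temp == nullptr)
        return;                        // allocation failed; leave word untouched
    std::memcpy(temp, word, len + 1);

    for (std::size_t i = 0; i < len; ++i)
        word[i] = temp[len - 1 - i];   // last index is len - 1, never len or -1

    std::free(temp);                   // release the copy before returning
}
```

The terminator stays where it was, so nothing needs to be written at word[len]. The caller still has to pass writable memory.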

Well, for one, when len == 0 len-1 will be a negative number. And that's pretty illegal. Second, it's quite possible that your pointer is pointing at an unreserved area of memory.

If you called that function as follows:
reverse("this is a test");
then at least one compiler will pass in a read-only string. This is allowed for backwards compatibility with C, where you can pass string literals as non-const char*.
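One way to make such a call safe is to never write through the caller's pointer at all. The sketch below is an alternative design rather than a fix to the original function: it takes a const char*, copies it into a std::string and reverses the copy with std::reverse. The name reversed is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Accepts any C string, including a string literal, because it only reads
// from text and does all of its writing inside its own std::string copy.
std::string reversed(const char *text)
{
    assert(text != nullptr);
    std::string copy(text);                  // writable copy of the characters
    std::reverse(copy.begin(), copy.end());  // in-place reversal of the copy
    return copy;
}
```

With this signature, reversed("this is a test") compiles without any const-dropping conversion and cannot touch read-only memory.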

Related

Zero termination when casting a char array to string?

I have this simple piece of code which reverses a string:
# include <string>
string str = "abcd";
char *ch = new char[str.length()];
for (int i = 0, j = str.length() - 1; i < str.length(); i++, j--)
ch[j] = str[i];
str = string(ch);
cout << str;
This works fine; however, I was wondering if the char array *ch must be zero terminated (perhaps it just works because, by chance, there happens to be a 0 at memory position ch + str.length()). Therefore I've written the following quick test:
string str = "abcd";
char *ch = new char[str.length()];
for (int i = 0, j = str.length() - 1; i < str.length(); i++, j--)
ch[j] = str[i];
// note: illegal memory access, just a quick test
ch[str.length()] = 'a';
str = string(ch);
cout << str;
In the above code it is ensured that *ch is never zero terminated. To my surprise the code still works fine, and I can't get my head around this. How can str = string(ch) result in "dbca" when there is an 'a' at ch[str.length()]? I would expect either a memory error or "dbcaa" as a result.
It doesn't matter what you did before this line:
str = string(ch);
The reason is that the line above may allocate memory, and the memory manager may have used the memory directly following your ch buffer as allocated space. So the 'a' character you wrote there previously has vanished. Or something else happened during the construction of str that assumed that the space you wrote to previously is available.
If you want to know for sure, use your debugger. The std::string constructor and implementation will tell you what exactly occurred (that is, if your program even gets this far since you did introduce undefined behavior before the line of code above).
It's called undefined behaviour. It could be there is a zero after the last address of ch so it could appear to work. But you're overwriting memory allocated from the memory manager which will corrupt it, so you will run into trouble in a bigger application.
The memory manager could reserve a few more bytes in debug builds for debugging purposes. Try a release build and see what happens.
Your code is totally broken, having undefined behaviour. Specifically...
ch[j] = ch[i];
...reads from ch[i] which is uninitialised memory - as bgoldst commented it's probably meant to be str[i], then - even if that didn't invalidate any expectations of program behaviour...
str = string(ch);
...attempts construction using a ch, which points to still uninitialised memory that could have any content at all, and will be scanned along until a NUL happens to be hit, some access violation crashes the program, or whatever other undefined behaviour manifests. If you fixed the loop to copy from str, then you'd probably want this to cope with the lack of NUL termination:
str = string(ch, str.length());
Perhaps the vaguely worthwhile question is "isn't it almost impossible that I'd have observed (the claimed) dbca output despite the above errors?". To that I'd say:
dbca is not dcba - which did you actually see?
garbage characters in memory might not do anything on your terminal, and it's quite possible that attempting to print from where ch was allocated printed nothing visible, or e.g. printed some crap then a clear-back-to-the-start-of-line, delete-previous, backspace etc. character code, then happened to hit the memory allocated by the std::string object (seemingly lacking a short-string-optimisation buffer), and therefore displayed its contents too.
So - it's not so statistically amazing to constitute evidence for your program somehow having defined behaviour....
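For reference, here is a small sketch of the repaired version described above: the loop copies from str, and the std::string is built with an explicit length so no terminator is ever searched for. The function names are invented for the example, and a std::vector<char> stands in for the new[] buffer so nothing leaks. The second function shows the same result without any intermediate buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Reverse through a temporary buffer that is NOT zero terminated.
std::string reverse_via_buffer(const std::string &str)
{
    const std::size_t n = str.length();
    if (n == 0)
        return std::string();
    std::vector<char> ch(n);             // exactly n chars, no room for '\0'
    for (std::size_t i = 0; i < n; ++i)
        ch[n - 1 - i] = str[i];
    return std::string(ch.data(), n);    // length passed explicitly
}

// Same result using the string's own reverse iterators.
std::string reverse_via_iterators(const std::string &str)
{
    return std::string(str.rbegin(), str.rend());
}
```

Neither version reads or writes one element past the buffer, so the behaviour is defined.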

Segfaults on appending char* arrays

I'm making a lexical analyzer and this is a function out of the whole thing. This function takes as argument a char, c, and appends this char to the end of an already defined char* array (yytext). It then increments the length of the text (yylen).
I keep getting segfaults on the shown line when it enters this function. What am I doing wrong here? Thanks.
BTW: can't use the strncpy/strcat, etc. (although if you want you can show me that implementation too)
This is my code:
extern char *yytext;
extern int *yylen;
void consume(char c){
int s = *yylen + 1; //gets yylen (length of yytext) and adds 1
//now seg faults here
char* newArray = new char[s];
for (int i = 0;i < s - 1;i++){
newArray[i] = yytext[i]; //copy all chars from existing yytext into newArray
}
newArray[s-1] = c; //append c to the end of newArray
for (int i = 0;i < s;i++){ //copy all chars + c back to yytext
yytext[i] = newArray[i];
}
yylen++;
}
You have
extern int *yylen;
but try to use it like so:
int s = (int)yylen + 1;
If the variable is an int *, use it like an int * and dereference to get the int. If it is supposed to be an int, then declare it as such.
That can't work:
int s = (int)yylen + 1; //gets yylen (length of yytext) and adds 1
char newArray[s];
Use malloc or a big enough buffer:
char * newarray=(char*)(malloc(s));
Every C-style string should be null-terminated. From your description it seems you need to append the character c. So you need 2 extra locations (one for the appended character and the other for the null terminator).
Next, yylen is of type int *. You need to dereference it to get the length (assuming it is pointing to a valid memory location). So, try -
int s = *yylen + 2;
I don't see the need of temporary array but there might be a reason why you are doing it. Now,
yytext[i] = newArray[i]; //seg faults here
you have to check whether yytext points to a valid, writable memory location. If it does, is the buffer long enough to hold the appended character plus the null terminator?
But I would recommend using std::string rather than working with character arrays. With it, solving the problem would be a one-liner.
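Both routes can be sketched roughly as follows. This is an illustration under assumptions, not the lexer's real interface: the names append_raw and append_string are made up, the length is a plain int rather than an int *, and the raw version assumes the buffer was allocated with new[] (or is null).

```cpp
#include <cassert>
#include <string>

// Manual version: grow the buffer by one character and keep it terminated.
// Returns the new buffer; the old one is released here.
char *append_raw(char *text, int &len, char c)
{
    assert(len >= 0);
    char *grown = new char[len + 2];   // old chars + c + '\0'
    for (int i = 0; i < len; ++i)
        grown[i] = text[i];            // copy the existing characters
    grown[len] = c;                    // append c
    grown[len + 1] = '\0';             // terminate
    delete[] text;                     // assumes text came from new[] or is null
    ++len;                             // increments the length, not a pointer
    return grown;
}

// std::string version: the one-liner.
void append_string(std::string &text, char c)
{
    text.push_back(c);
}
```

Note that the raw version never copies back into the old buffer, which is where an append into a fixed-size yytext would overflow.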

Junk after C++ string when returned

I've just finished C++ The Complete Reference and I'm creating a few test classes to learn the language better. The first class I've made mimics the Java StringBuilder class and the method that returns the string is as follows:
char *copy = new char[index];
register int i;
for(i = 0; i <= index; i++) {
*(copy + i) = *(stringArray + i);
} //f
return copy;
stringArray is the array that holds the string that is being built, index represents the amount of characters that have been entered.
When the string returns there is some junk after it, such as if the string created is abcd the result is abcd with 10 random characters after it. Where is this junk coming from? If you need to see more of the code please ask.
You need to null terminate the string. That null character tells the computer where the string ends.
char * copy = new char[ length + 1];
for(int i = 0; i < length; ++i) copy[i] = stringArray[i];
copy[length] = 0; //null terminate it
Just a few things. Declare the int variable in the tightest scope possible; it is good practice because it keeps the enclosing scope uncluttered and makes the code easier to debug and keep track of. And drop the 'register' keyword and let the compiler determine what needs to be optimized. The register keyword is only a hint anyway, so unless your code is really tight on performance, ignore things like that for now.
Does index contain the length of the string you're copying from including the terminating null character? If it doesn't then that's your problem right there.
If stringArray isn't null-terminated - which can be fine under some circumstances - you need to ensure that you append the null terminator to the string you return, otherwise you don't have a valid C string and, as you already noticed, you get a "bunch of junk characters" after it. That's actually a buffer overflow, so it's not quite as harmless as it seems.
You'll have to amend your code as follows:
char *copy = new char[index + 1];
And after the copy loop, you need to add the following line of code to add the null terminator:
copy[index] = '\0';
In general I would recommend to copy the string out of stringArray using strncpy() instead of hand rolling the loop - in most cases strncpy is optimized by the library vendor for maximum performance. You'll still have to ensure that the resulting string is null terminated, though.
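Put together, the strncpy variant might look like the sketch below. The free function copy_as_c_string and its parameters are hypothetical stand-ins for the class method; stringArray and index keep the meaning they have in the question.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Returns a freshly allocated, null-terminated copy of the first index
// characters of stringArray. The caller owns the result and must delete[] it.
char *copy_as_c_string(const char *stringArray, std::size_t index)
{
    assert(stringArray != nullptr);
    char *copy = new char[index + 1];         // one extra slot for '\0'
    std::strncpy(copy, stringArray, index);   // copies at most index chars
    copy[index] = '\0';                       // strncpy does not guarantee this
    return copy;
}
```

The explicit terminator matters because strncpy only writes a '\0' itself when the source is shorter than the count.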

strcat error "Unhandled exception.."

My goal with my constructor is to:
open a file
read into everything that exists between a particular string ("%%%%%")
put together each read row to a variable (history)
add the final variable to a double pointer of type char (_stories)
close the file.
However, the program crashes when I'm using strcat. But I can't understand why, I have tried for many hours without result. :/
Here is the constructor code:
Texthandler::Texthandler(string fileName, int number)
: _fileName(fileName), _number(number)
{
char* history = new char[50];
_stories = new char*[_number + 1]; // rows
for (int j = 0; j < _number + 1; j++)
{
_stories[j] = new char [50];
}
_readBuf = new char[10000];
ifstream file;
int controlIndex = 0, whileIndex = 0, charCounter = 0;
_storieIndex = 0;
file.open("Historier.txt"); // filename
while (file.getline(_readBuf, 10000))
{
// The "%%%%%" shouldnt be added to my variables
if (strcmp(_readBuf, "%%%%%") == 0)
{
controlIndex++;
if (controlIndex < 2)
{
continue;
}
}
if (controlIndex == 1)
{
// Concatenate every line (_readBuf) to a complete history
strcat(history, _readBuf);
whileIndex++;
}
if (controlIndex == 2)
{
strcpy(_stories[_storieIndex], history);
_storieIndex++;
controlIndex = 1;
whileIndex = 0;
// Reset history variable
history = new char[50];
}
}
file.close();
}
I have also tried with stringstream without results..
Edit: Forgot to post the error message:
"Unhandled exception at 0x6b6dd2e9 (msvcr100d.dll) in Step3_1.exe: 0xC00000005: Access violation writing location 0c20202d20."
Then a file named "strcat.asm" opens..
Best regards
Robert
You've had a buffer overflow somewhere on the stack, as evidenced by the fact one of your pointers is 0c20202d20 (a few spaces and a - sign).
It's probably because:
char* history = new char[50];
is not big enough for what you're trying to put in there (or it's otherwise not set up correctly as a C string, terminated with a \0 character).
I'm not entirely certain why you think multiple buffers of up to 10K each can be concatenated into a 50-byte string :-)
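If you do stay with char arrays, two things have to hold before every append: the destination must already be a terminated string, and there must be room for the new text. A minimal sketch of a checked append, with a made-up name append_bounded and the buffer capacity passed in by the caller:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Appends src to dest only if the result, including '\0', fits in capacity.
// dest must already hold a terminated string (for example dest[0] = '\0').
bool append_bounded(char *dest, std::size_t capacity, const char *src)
{
    assert(dest != nullptr && src != nullptr);
    const std::size_t used = std::strlen(dest);
    const std::size_t extra = std::strlen(src);
    if (used + extra + 1 > capacity)
        return false;                          // would overflow: refuse
    std::memcpy(dest + used, src, extra + 1);  // copies the terminator too
    return true;
}
```

A buffer from new char[50] would need history[0] = '\0' before the first call, and a 10000-character line would simply be rejected instead of overwriting memory.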
strcat operates on null terminated char arrays. In the line
strcat(history, _readBuf);
history is uninitialised so isn't guaranteed to have a null terminator. Your program may read beyond the memory allocated looking for a '\0' byte and will try to copy _readBuf at this point. Writing beyond the memory allocated for history invokes undefined behaviour and a crash is very possible.
Even if you added a null terminator, the history buffer is much shorter than _readBuf. This makes memory over-writes very likely - you need to make history at least as big as _readBuf.
Alternatively, since this is C++, why don't you use std::string instead of C-style char arrays?
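As a rough idea of what the std::string route could look like, here is a sketch that collects everything between consecutive "%%%%%" lines into a vector of strings. It is my reading of the constructor's intent, not a drop-in replacement: the function names are invented, and it reads from any std::istream so it can be tried without the file.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Concatenates the lines found between consecutive delimiter lines.
// Lines before the first delimiter are ignored, as in the original loop.
std::vector<std::string> read_stories(std::istream &in,
                                      const std::string &delimiter = "%%%%%")
{
    assert(!delimiter.empty());
    std::vector<std::string> stories;
    std::string line;
    std::string current;
    bool inside = false;
    while (std::getline(in, line))
    {
        if (line == delimiter)
        {
            if (inside)
                stories.push_back(current);  // a delimiter closes the story
            current.clear();                 // and opens the next one
            inside = true;
            continue;
        }
        if (inside)
            current += line;                 // grows as needed, no fixed size
    }
    return stories;
}

// Convenience wrapper for trying the function on in-memory text.
std::vector<std::string> read_stories_from_text(const std::string &text)
{
    std::istringstream in(text);
    return read_stories(in);
}
```

With a file, the same function can be called on an std::ifstream; there is no buffer size to get wrong and nothing to delete.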

why memcpy alters the string array?

I have code which goes like this. Could you tell me why it is not behaving as I would expect it to be?
/*
* test.cpp
*
* Created on: Dec 6, 2012
* Author: sandeep
*/
#include<iostream>
#include<string.h>
using namespace std;
int main()
{
int i=0;
string s="hello A B:bye A B";
char *input;
input=new char(s.size());
for(i=0;i<=s.size();i++)
input[i]=s[i];
char *tokenized1[2],*tokenized2[3];
tokenized1[0]=strtok(input,":");
tokenized1[1]=strtok(NULL,":");
i=0;
char *lstring;
while(i<2)
{
lstring=new char(strlen(tokenized1[i]));
memcpy(lstring,tokenized1[i],strlen(tokenized1[i])+1);
cout<<tokenized1[0]<<" "<<tokenized1[1]<<endl;
tokenized2[0]=strtok(lstring," ");
tokenized2[1]=strtok(NULL," ");
tokenized2[2]=strtok(NULL," ");
char c=tokenized2[0][0];
cout<<c<<endl;
cout<<tokenized2[0]<<" "<<tokenized2[1]<<" "<<tokenized2[2]<<endl;
i++;
}
}
and the output is this.
hello A B by
h
hello A B
hello A B by
b
by
There are some junk values at the end of 1st, 4th and 6th line of output.
Why did tokenized1[1] get altered when I did a memcpy of tokenized1[0], and how can I solve this?
There's a couple of bugs in the following new call. You need to use square brackets; also, the argument is off by one.
lstring=new char[strlen(tokenized1[i]) + 1];
Without the square brackets, you are allocating space for one character. As a result, the memcpy() writes past the allocated memory.
edit: I just noticed the other new, which will also need to be fixed:
input=new char[s.size() + 1];
Finally, s[i] reads past the end of the string in:
for(i=0;i<=s.size();i++)
input[i]=s[i];
There could well be other bugs, not to mention memory leaks...
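To make the difference between the two forms of new concrete, here is a short sketch. The function names are hypothetical; the second function is one way to write the allocation and copy so that the terminator fits.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// new char(65) allocates ONE char and initialises it to the value 65.
// It does not allocate 65 chars.
char single_char_value()
{
    char *one = new char(65);
    const char value = *one;
    delete one;                  // scalar delete matches scalar new
    return value;
}

// new char[n] allocates an array of n chars. For a C string, n must
// include the terminator, hence strlen + 1.
char *duplicate(const char *text)
{
    assert(text != nullptr);
    const std::size_t len = std::strlen(text);
    char *copy = new char[len + 1];
    std::memcpy(copy, text, len + 1);   // copies the '\0' as well
    return copy;                        // caller must delete[] the result
}
```

Writing more than one character through the pointer from the first form is what corrupted the neighbouring tokens in the question.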
You don't appear to be zero terminating 'input'.
In addition to what NPE said, there's a couple of other small things:
char *input;
input=new char(s.size());
This might have something to do with it - you are allocating a single character. You then write that one character, and overwrite other memory that is used for who-knows-what. Try this instead:
char *input = new char[s.size() + 1];
Another issue is your loop, immediately below that:
for(i=0;i<=s.size();i++)
input[i]=s[i];
At least on my system, using std::string::operator[] with an offset equal to s.size() fails; I don't know about your particular implementation, but I'm pretty sure it fails as well. Be safe rather than sorry, and recode your loop thus:
for(i = 0; i < s.size(); i++)
input[i] = s[i];
input[i] = 0;
I hope this helps.
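If you would rather not manage the buffers by hand at all, something along these lines could work. It is only a sketch: split is a made-up helper that hands strtok a private, terminated copy held in a std::vector<char> and returns the tokens as std::string objects, so nothing is leaked and no token points into memory that a later copy could overwrite.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <vector>

// Splits text on any character in delims, using strtok on a private copy.
std::vector<std::string> split(const std::string &text, const char *delims)
{
    assert(delims != nullptr);
    std::vector<char> buffer(text.begin(), text.end());
    buffer.push_back('\0');                  // strtok needs a terminated string

    std::vector<std::string> tokens;
    for (char *tok = std::strtok(buffer.data(), delims); tok != nullptr;
         tok = std::strtok(nullptr, delims))
    {
        tokens.push_back(tok);               // copy the token out of the buffer
    }
    return tokens;
}
```

Calling split(s, ":") and then split on each half with " " gives the two levels of tokens from the question. Because strtok keeps hidden state, each call to split has to finish before the next one starts, which is the case here.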